• Title/Summary/Keyword: 병렬회로 (parallel circuit)

Design and Performance Evaluation of Vertically-Partitioned Parallel Signature File Method (수직 분할 병렬 요약화일 기법의 설계 및 성능평가)

  • Kim, Jeong-Gi;Yu, Gyeong-Min;Jang, Jae-U
    • Journal of KIISE:Software and Applications / v.26 no.1 / pp.66-79 / 1999
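  • The signature file method is known as an efficient indexing technique for large-scale database applications, and parallel signature file methods have recently been proposed for faster retrieval. This paper proposes the Vertically-partitioned Parallel Signature File (VPSF) method, which supports efficient parallel processing. VPSF uses extendible hashing to adapt well to dynamic environments and a frame-sliced technique for efficient retrieval. To eliminate execution skew, it partitions signatures vertically and stores the records across processing nodes, which then work in parallel. To demonstrate the efficiency of VPSF, the paper also presents a performance evaluation model and runs experiments on a real record set, measuring retrieval time, storage overhead, and insertion time. To evaluate performance under different record-set distributions, it further tests a half normal distribution, whose standard deviation is halved, and a normal distribution whose standard deviation is doubled. Compared with existing parallel signature file methods, VPSF improves retrieval performance by about 40% on average over the Hamming filter on the normal distribution of the real record set. On the half normal distribution, it improves retrieval performance by about 50% over the Hamming filter and by about 20% over HPSF. It also outperforms the existing methods in storage overhead and insertion time. In general, VPSF retrieves better when the records in the database are similar in size and as the database grows larger.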
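
The vertical-partitioning idea can be sketched roughly as follows. This is a bit-sliced simplification under stated assumptions, not the paper's VPSF: it omits frame slicing and extendible hashing, and the constants `SIG_BITS`, `BITS_PER_WORD`, and `NODES`, as well as all function names, are hypothetical.

```python
import hashlib

SIG_BITS = 16       # signature width in bits (assumed)
BITS_PER_WORD = 2   # hash positions taken per word (assumed)
NODES = 4           # number of processing nodes (assumed)

def word_bits(word):
    """Bit positions that a word sets, taken from a stable hash."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return {digest[i] % SIG_BITS for i in range(BITS_PER_WORD)}

def signature(words):
    """Superimposed coding: OR together the bits of every word."""
    bits = set()
    for word in words:
        bits |= word_bits(word)
    return bits

def build_slices(records):
    """Vertical partitioning: node k holds bit columns k, k + NODES, ...
    slices[node][bit] is the set of record ids whose signature has that bit."""
    slices = [dict() for _ in range(NODES)]
    for rid, words in enumerate(records):
        for bit in signature(words):
            slices[bit % NODES].setdefault(bit, set()).add(rid)
    return slices

def search(slices, n_records, query_words):
    """Each node inspects only its own columns; the results are intersected.
    The answer may contain false drops but never misses a matching record."""
    candidates = set(range(n_records))
    for bit in signature(query_words):
        candidates &= slices[bit % NODES].get(bit, set())
    return candidates

records = [["parallel", "join"], ["signature", "file"], ["parallel", "signature"]]
slices = build_slices(records)
hits = search(slices, len(records), ["parallel"])
```

Because every node owns a disjoint set of bit columns, a query touches only the columns for its set bits, and the per-node lookups are independent of one another.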

An Advanced Parallel Join Algorithm for Managing Data Skew on Hypercube Systems (하이퍼큐브 시스템에서 데이타 비대칭성을 고려한 향상된 병렬 결합 알고리즘)

  • 원영선;홍만표
    • Journal of KIISE:Computer Systems and Theory / v.30 no.3_4 / pp.117-129 / 2003
  • In this paper, we propose an advanced parallel join algorithm that processes join operations efficiently on hypercube systems. The algorithm processes relation R with a broadcasting method that is compatible with the hypercube structure, which yields a parallel join algorithm optimized for that structure. The proposed algorithm provides a complete solution to two essential problems in parallelizing the join operation: load balancing and data skew. To solve these problems, the algorithm exploits the clustering effect. As a result, overall system performance improves over existing algorithms. Moreover, the new algorithm can easily implement non-equijoin operations, which are difficult to implement in hash-based algorithms. Finally, according to the cost model analysis, this algorithm performs better than existing parallel join algorithms.
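
As a rough illustration of why broadcasting suits a hypercube, the textbook one-to-all broadcast reaches all 2^d nodes in d steps. The sketch below shows only that generic schedule; it is not the paper's join algorithm, and the function name is hypothetical.

```python
def hypercube_broadcast(dim, root=0):
    """Return, per step, the (sender, receiver) pairs of a one-to-all
    broadcast on a hypercube with 2**dim nodes.
    In step d, every node that already holds the data sends it to the
    neighbor whose label differs in bit d."""
    have = {root}
    steps = []
    for d in range(dim):
        sends = [(node, node ^ (1 << d)) for node in sorted(have)]
        have |= {receiver for _, receiver in sends}
        steps.append(sends)
    return steps

schedule = hypercube_broadcast(3)
```

For a 3-dimensional cube the schedule has three steps, and the number of nodes holding the data doubles at each step.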

A Direct Digital Frequency Synthesizer Using A Low Power Pipelined Parallel Accumulator (저전력 파이프라인 병렬 누적기를 사용한 직접 디지털 주파수 합성기)

  • 양병도;김이섭
    • Journal of the Institute of Electronics Engineers of Korea SD / v.40 no.5 / pp.361-368 / 2003
  • A new high-speed direct digital frequency synthesizer using a low-power pipelined parallel accumulator is proposed. The proposed accumulator uses both pipelining and parallelization to increase speed and reduce power consumption. At the same throughput, the 2-pipelined 2-parallel accumulator consumes only 66% and 69% of the power of the 4-pipelined accumulator and the 4-parallel accumulator, respectively. The proposed accumulator can achieve higher throughput with smaller area and less power consumption at a lower clock frequency. All circuit simulations and implementations are based on a 0.35 µm CMOS process with VCC = 3.3 V.
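
At the functional level, a parallel phase accumulator can be pictured as producing several phase samples per clock tick. The sketch below is a behavioral model under assumed names and an assumed 8-bit width; it says nothing about the pipelining, area, or power figures reported in the abstract.

```python
def serial_accumulator(fcw, n, bits=8):
    """Reference phase accumulator: one sample per clock tick."""
    mask = (1 << bits) - 1
    acc, out = 0, []
    for _ in range(n):
        acc = (acc + fcw) & mask
        out.append(acc)
    return out

def two_parallel_accumulator(fcw, n, bits=8):
    """Two samples per clock tick, so the clock can run at half the rate.
    n is assumed to be even."""
    mask = (1 << bits) - 1
    acc, out = 0, []
    for _ in range(n // 2):
        first = (acc + fcw) & mask        # phase at the next sample
        second = (acc + 2 * fcw) & mask   # phase one sample later
        out += [first, second]
        acc = second                      # state advances by 2 * fcw per tick
    return out

serial = serial_accumulator(37, 16)
parallel = two_parallel_accumulator(37, 16)
```

Both functions yield the same phase sequence; the parallel version simply needs half as many ticks to produce it.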

Performance Characteristics of Thermoelectric Generator Modules For Parallel and Serial Electrical Circuits (전기회로 구성 방법에 따른 열전발전 모듈 성능 특성)

  • Kim, Yun-Ho;Kim, Myung-Kee;Kim, Seo-Young;Rhee, Gwang-Hoon;Um, Suk-Kee
    • Korean Journal of Air-Conditioning and Refrigeration Engineering / v.22 no.5 / pp.259-267 / 2010
  • An experiment was performed to investigate the characteristics of multiple thermoelectric modules (TEMs) connected in electrical circuits. The open-circuit voltage of TEMs connected in a series circuit equals the sum of the individual TEM voltages. In contrast, for a parallel circuit the open-circuit voltage equals the average of the individual TEM voltages. For both parallel and series circuits, the power output and conversion efficiency increase as the operating temperature conditions of the individual TEMs become identical. The series circuit generates power better than the parallel circuit, a result attributed to the power loss from the TEM with better power generation performance.
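
Standard circuit theory gives a simple way to see the mismatch effect. The sketch below treats each TEM as an ideal Thevenin source (voltage, internal resistance), which is an assumption and not the paper's experimental model; the function names and the sample values are hypothetical. Series voltages add, while parallel sources combine as a conductance-weighted average (Millman's theorem).

```python
def series(sources):
    """Equivalent (voltage, resistance) of Thevenin sources in series."""
    return sum(v for v, _ in sources), sum(r for _, r in sources)

def parallel(sources):
    """Equivalent (voltage, resistance) of Thevenin sources in parallel,
    using Millman's theorem."""
    conductance = sum(1.0 / r for _, r in sources)
    voltage = sum(v / r for v, r in sources) / conductance
    return voltage, 1.0 / conductance

def max_power(voltage, resistance):
    """Power delivered to a matched load (load equals source resistance)."""
    return voltage ** 2 / (4.0 * resistance)

mismatched = [(2.0, 1.0), (4.0, 1.0)]   # (volts, ohms), illustrative values
matched = [(3.0, 1.0), (3.0, 1.0)]

individual = sum(max_power(v, r) for v, r in mismatched)
combined_series = max_power(*series(mismatched))
combined_parallel = max_power(*parallel(mismatched))
```

With mismatched modules, either connection delivers less than the sum of what each module could deliver alone; the gap closes when the modules operate identically. In this idealized model with equal internal resistances the two connections tie, so it does not by itself reproduce the measured advantage of the series circuit.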

Parallel Type Neural Network for Direct Control Method of Nonlinear System (비선형 시스템의 직접제어방식을 위한 병렬형 신경회로망)

  • 김주웅;정성부;서원호;엄기환
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2000.05a / pp.406-409 / 2000
  • We propose a modified neural network, connected in parallel, to control nonlinear systems. The proposed method is a direct control method that uses an inverse model of the plant. The nonlinear system is divided into a linear part and a nonlinear part, which are controlled by the RLS method and a recursive multi-layer neural network, respectively. We verify the performance of the proposed method by simulation and compare it with the conventional direct neural network control method. The proposed method improves control performance over the conventional method.

Massive Parallel Processing Algorithm for Semiconductor Process Simulation (반도체 공정 시뮬레이션을 위한 초고속 병렬 연산 알고리즘)

  • 이제희;반용찬;원태영
    • Journal of the Korean Institute of Telematics and Electronics D / v.36D no.3 / pp.48-58 / 1999
  • In this paper, we present a new parallel computation method that fully utilizes parallel processors in both mesh generation and FEM calculation for 2D/3D process simulation. High-performance parallel FEM and parallel linear algebra solving techniques showed that the excessive memory and CPU time requirements of three-dimensional simulation could be handled successfully. Our parallelized numerical solver simulated the transient enhanced diffusion (TED) of dopants and the irregular shape of R-LOCOS within 15 minutes. Because the Monte Carlo technique requires excessive CPU time, a high-performance parallel solving technique was applied to our cascade sputter simulation. Our sputter simulator achieved a calculation time of 520 sec and a speedup of 25 using 30 processors. We found that the optimal number of ion injections for our MC sputter simulation is 30,000.

A Reconfigurable Load and Performance Balancing Scheme for Parallel Loops in a Clustered Computing Environment (클러스터 컴퓨팅 환경에서 병렬루프 처리를 위한 재구성 가능한 부하 및 성능 균형 방법)

  • 김태형
    • Journal of KIISE:Computing Practices and Letters / v.10 no.1 / pp.49-56 / 2004
  • Load imbalance is a serious impediment to achieving good performance in parallel processing. Global load balancing schemes cannot adequately balance parallel tasks generated from a single application. Dynamic loop scheduling methods are known to be useful in balancing parallel loops on shared-memory multiprocessor machines. However, their centralized nature causes a bottleneck even for the relatively small number of processors in a network of workstations, because of order-of-magnitude differences in communication overheads. Moreover, improvements to basic loop scheduling methods have not dealt effectively with irregularly distributed workloads in parallel loops, which commonly occur in applications for a network of workstations. In this paper, we present a new reconfigurable and decentralized balancing method for parallel loops on a network of workstations. Because our method supplements traditional load balancing methods with performance balancing, it minimizes the overall execution time.
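
For readers unfamiliar with dynamic loop scheduling, guided self-scheduling is one well-known scheme of the centralized kind the abstract refers to. The sketch below shows only how that scheme sizes its chunks; it is not the decentralized method the paper proposes, and the function name is hypothetical.

```python
import math

def guided_chunks(iterations, processors):
    """Chunk sizes handed out by guided self-scheduling.
    Each request from an idle processor receives ceil(remaining / P)
    iterations, so chunks start large and shrink toward the end."""
    chunks = []
    remaining = iterations
    while remaining > 0:
        chunk = math.ceil(remaining / processors)
        chunks.append(chunk)
        remaining -= chunk
    return chunks

chunks = guided_chunks(100, 4)
```

Every chunk requires a round trip to the shared loop counter, which is cheap on a shared-memory machine but costly over a network of workstations.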

Tuning the Performance of Haskell Parallel Programs Using GC-Tune (GC-Tune을 이용한 Haskell 병렬 프로그램의 성능 조정)

  • Kim, Hwamok;An, Hyungjun;Byun, Sugwoo;Woo, Gyun
    • KIISE Transactions on Computing Practices / v.23 no.8 / pp.459-465 / 2017
  • Although the performance of computer hardware is increasing with the development of manycore technologies, software throughput has not increased proportionally. Functional languages can be a viable alternative for improving the performance of parallel programs, since such languages have inherent parallelism in evaluating pure expressions without side effects. Specifically, Haskell is notably popular for parallel programming because it provides easy-to-use parallel constructs based on monads. However, the scalability of parallel programs in Haskell tends to fluctuate as the number of cores increases, and the garbage collector is suspected to be the source of this fluctuation because it affects both the space and the time needed to execute the programs. This paper uses the tuning tool GC-Tune to improve the scalability of the performance. Our experiment was conducted with a parallel plagiarism detection program, and the scalability improved. Specifically, the fluctuation range of the speedup narrowed by 39% compared to the original execution of the program without any tuning.

Parallel Design and Implementation of Shot Boundary Detection Algorithm (샷 경계 탐지 알고리즘의 병렬 설계와 구현)

  • Lee, Joon-Goo;Kim, SeungHyun;You, Byoung-Moon;Hwang, DooSung
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.2 / pp.76-84 / 2014
  • As the number of high-density videos increases, parallel processing approaches are necessary to process large-scale video data. When a video processing method requires thousands of simple operations, GPU-based parallel processing is preferred to CPU-based parallel processing because it reduces the time and space complexities of the computation. This paper studies the parallel design and implementation of a shot-boundary detection algorithm. The proposed algorithm uses pixel brightness comparisons and global histogram data among the blocks of frames, and the computation of these data is highly parallel. To maximize the number of operations that run in parallel, the pixel brightness and histogram computations are designed in parallel and implemented on an NVIDIA GPU. The GPU-based shot detection method is tested with 10 videos from the collection of the National Archives of Korea. In the experiments, the detection rate is similar, but the computation is about 10 times faster than that of the CPU-based algorithm.
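
The histogram part of such a detector might look roughly like the sequential sketch below. It assumes frames are flat lists of 8-bit brightness values and uses an arbitrary threshold; the bin count, threshold, and function names are hypothetical, and the paper's block-wise comparison and GPU kernels are not reproduced.

```python
def histogram(frame, bins=8, max_value=256):
    """Global brightness histogram of a frame (flat list of pixel values)."""
    hist = [0] * bins
    width = max_value // bins
    for pixel in frame:
        hist[min(pixel // width, bins - 1)] += 1
    return hist

def histogram_distance(h1, h2):
    """Normalized L1 distance between two histograms, in [0, 1]."""
    total = sum(h1) + sum(h2)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(h1, h2)) / total

def detect_shot_boundaries(frames, threshold=0.5):
    """Indices i where frame i differs sharply from frame i - 1."""
    hists = [histogram(f) for f in frames]   # independent per frame
    return [i for i in range(1, len(hists))
            if histogram_distance(hists[i - 1], hists[i]) > threshold]

frames = [[10] * 16, [12] * 16, [200] * 16, [205] * 16]
boundaries = detect_shot_boundaries(frames)
```

Each frame's histogram depends only on that frame's pixels, and each pixel contributes one increment, which is the kind of independent, simple work that maps well onto a GPU.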

Backend of a Parallelizing Compiler for an Heterogeneous Parallel System (이기종 병렬 시스템을 위한 자동적 병렬화 컴파일러 후위)

  • Kwon, Dae-Suk;Kim, Hsung-Hwan;Han, Sang-Yong
    • Journal of KIISE:Computer Systems and Theory / v.27 no.8 / pp.710-718 / 2000
  • Many multiprocessing systems have been developed to exploit parallelism and improve performance. However, naive multiprocessing schemes were not as successful as many researchers expected, because of the heavy communication and synchronization costs that result from parallelization. In this paper, we identify the reasons for the poor performance and the compiler requirements for improving it. We found that multiprocessing decisions should be driven by overhead information, and we applied this idea to the automatic parallelizing compiler SUIF. We replaced the original backend of SUIF with our own MPI-based backend and gave it the ability to validate parallelization decisions based on overhead parameters. This backend converts intermediate code containing the specification of parallelizable regions into a distributed-memory parallel program with MPI function calls, without the excessive parallelization that may degrade performance.
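
The idea of validating a parallelization decision against overhead can be illustrated with a deliberately simple cost model. The model below is hypothetical: the abstract does not give the backend's actual overhead parameters or formula, so the names and the linear cost expression are assumptions.

```python
def parallel_time(seq_time, processors, comm_overhead, sync_overhead):
    """Estimated run time of a region when split evenly across processors,
    plus fixed communication and synchronization costs (assumed model)."""
    return seq_time / processors + comm_overhead + sync_overhead

def should_parallelize(seq_time, processors, comm_overhead, sync_overhead):
    """Parallelize a region only if the estimate beats sequential execution."""
    estimate = parallel_time(seq_time, processors, comm_overhead, sync_overhead)
    return estimate < seq_time

large_region = should_parallelize(100.0, 4, 10.0, 5.0)   # 40.0 < 100.0
small_region = should_parallelize(10.0, 4, 10.0, 5.0)    # 17.5 > 10.0
```

Under this model a large region is worth distributing, while a small one is left sequential because the fixed overheads outweigh the gain.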
