• Title/Summary/Keyword: parallel algorithm (병렬 알고리즘)

Performance Enhancement of a DVA-tree by the Independent Vector Approximation (독립적인 벡터 근사에 의한 분산 벡터 근사 트리의 성능 강화)

  • Choi, Hyun-Hwa;Lee, Kyu-Chul
    • The KIPS Transactions:PartD / v.19D no.2 / pp.151-160 / 2012
  • Most distributed high-dimensional indexing structures provide reasonable search performance when the dataset is uniformly distributed. However, when the dataset is clustered or skewed, search performance degrades gradually relative to the uniformly distributed case. We propose a method for improving the k-nearest neighbor search performance of the distributed vector approximation-tree on strongly clustered or skewed datasets. The basic idea is to compute the volumes of the leaf nodes of the top-tree of a distributed vector approximation-tree and to assign different numbers of bits to them, so as to preserve the identification performance of the vector approximation. In other words, more bits are assigned to high-density clusters. We conducted experiments comparing search performance against the distributed hybrid spill-tree and the distributed vector approximation-tree using synthetic and real datasets. The experimental results show that the proposed scheme gives consistent results and significantly improves on the performance of the distributed vector approximation-tree for strongly clustered or skewed datasets.
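
The bit-assignment idea in this abstract can be illustrated with a minimal sketch. It assumes a uniform per-dimension quantizer and a hypothetical rule that adds one bit for each doubling of a leaf node's density (points per unit volume); the function names, the default bit counts, and the doubling rule are illustrative assumptions, not the authors' scheme.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to one of 2**bits uniform cells."""
    cells = 1 << bits
    if hi <= lo:
        return 0
    idx = int((value - lo) / (hi - lo) * cells)
    return min(max(idx, 0), cells - 1)  # clamp values on or outside the bounds


def approximate(vector, bounds, bits):
    """Return the per-dimension cell indices that approximate a vector."""
    return tuple(quantize(v, lo, hi, bits) for v, (lo, hi) in zip(vector, bounds))


def bits_for_node(num_points, volume, base_bits=4, max_bits=8, ref_density=1.0):
    """Hypothetical rule: one extra bit per doubling of density above ref_density."""
    density = num_points / volume
    extra = 0
    while density > ref_density and base_bits + extra < max_bits:
        density /= 2.0
        extra += 1
    return base_bits + extra
```

With these assumed defaults, a leaf node holding 800 points in a volume of 100 would get 7 bits per dimension, while a node holding 100 points in the same volume would keep the base 4 bits.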

Motion Estimation Specific Instructions and Their Hardware Architecture for ASIP (ASIP을 위한 움직임 추정 전용 연산기 구조 및 명령어 설계)

  • Hwang, Sung-Jo;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.3 / pp.106-111 / 2011
  • This paper presents an ASIP (Application-specific Instruction Processor) for motion estimation that employs dedicated IME instructions, together with its programmable and reconfigurable hardware architecture for various video codecs such as H.264/AVC and MPEG4. With the proposed instructions and hardware accelerator, it meets the real-time processing requirement of High Definition (HD) video. Using parallel operations and SAD unit control based on pattern information, the proposed IME instruction supports not only the full search algorithm but also other fast search algorithms. The hardware size is 77K gates for each Processing Element Group (PEG), which has 256 SAD PEs. The proposed ASIP runs at 160 MHz with sixteen PEGs and can process 1080p video at 30 frames per second in real time.
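
For readers unfamiliar with the full search algorithm mentioned above, the following is a minimal software sketch of block matching with the sum of absolute differences (SAD). The frame layout (lists of rows of integer pixel values) and the function names are assumptions made for illustration. The loops here run serially, whereas the abstract describes hardware that evaluates many SAD terms in parallel.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of the current
    frame at (bx, by) and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Try every displacement in the search window; return (cost, dx, dy)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > h or bx + dx < 0 or bx + dx + n > w:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

A fast search algorithm would visit only a subset of the displacements that `full_search` enumerates, which is presumably where the pattern information mentioned in the abstract comes in.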

SW-HW Co-design of a High-performance Dehazing System Using OpenCL-based High-level Synthesis Technique (OpenCL 기반의 상위 수준 합성 기술을 이용한 고성능 안개 제거 시스템의 소프트웨어-하드웨어 통합 설계)

  • Park, Yongmin;Kim, Minsang;Kim, Byung-O;Kim, Tae-Hwan
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.8 / pp.45-52 / 2017
  • This paper presents a high-performance software-hardware dehazing system based on a dedicated hardware accelerator for haze removal. In the proposed system, the dedicated hardware accelerator performs the dark-channel-prior-based dehazing process, and the software performs the remaining control processes. To this end, the dehazing process is implemented as an OpenCL kernel that exploits the inherent parallelism of the algorithm, and it is synthesized into hardware using a high-level synthesis technique. The proposed system executes the dehazing process much faster than the previous software-only dehazing system: the performance improvement is up to 96.3% in terms of execution time.
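
The parallelism referred to here can be seen in a plain sketch of the dark channel computation, the first step of dark-channel-prior dehazing: each output pixel is the minimum over a local patch of the per-pixel minimum across the color channels. The image representation (rows of RGB tuples) and the `radius` parameter are assumptions for illustration; this is not the authors' OpenCL kernel.

```python
def dark_channel(image, radius):
    """Dark channel of an RGB image given as rows of (r, g, b) tuples."""
    h, w = len(image), len(image[0])
    # Step 1: minimum over the three color channels at every pixel.
    min_rgb = [[min(px) for px in row] for row in image]
    # Step 2: minimum over a (2*radius+1) square patch, clipped at the borders.
    dark = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            dark[y][x] = min(
                min_rgb[yy][xx] for yy in range(y0, y1) for xx in range(x0, x1)
            )
    return dark
```

Every iteration of the two outer loops in step 2 is independent of the others, so each output pixel could plausibly be mapped to one work-item when the computation is written as a kernel.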

FPGA Implementation of SVM Engine for Training and Classification (기계학습 및 분류를 위한 SVM 엔진의 FPGA 구현)

  • Na, Wonseob;Jeong, Yongjin
    • Journal of IKEEE / v.20 no.4 / pp.398-411 / 2016
  • SVM, a machine learning method, is widely used in image processing for its excellent generalization performance. However, adding new data to the pre-trained data of a system requires training the entire system again. This procedure takes a long time, especially in embedded environments, and results in low SVM performance. In this paper, we implement an SVM trainer and classifier on an FPGA to solve this problem. We parallelized the repeated operations inside SVM and modified the exponential operations of the kernel function to allow fixed-point modeling. We implemented the proposed hardware on a Xilinx ZC706 evaluation board and used a TSR algorithm to verify the FPGA result. At a clock frequency of 100 MHz, the proposed hardware takes about 5 seconds to train 2,000 data samples and 16.54 ms for classification at $1360{\times}800$ resolution.

Open Platform for Improvement of e-Health Accessibility (의료정보서비스 접근성 향상을 위한 개방형 플랫폼 구축방안)

  • Lee, Hyun-Jik;Kim, Yoon-Ho
    • Journal of Digital Contents Society / v.18 no.7 / pp.1341-1346 / 2017
  • In this paper, we design an open service platform that integrates individually customized services with intelligent information technology, taking into account individuals' complex attributes and requests. First, the data collection phase repeats extraction, transformation, and loading quickly and accurately. The data generated by the extraction-transformation-loading module is stored in a distributed data system. The data analysis phase generates a variety of patterns using analysis algorithms from the field. The data processing phase uses distributed parallel processing to improve performance. The data provision phase should operate independently on device-specific management platforms, and it is offered in the form of an Open API.

Design of Neuro-Fuzzy Controller using Relative Gain Matrix (상대 이득 행렬을 이용한 뉴로-퍼지 제어기의 설계)

  • Seo Sam-Jun;Kim Dongwon;Park Gwi-Tae
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.1 / pp.24-29 / 2005
  • In fuzzy control of multi-variable systems, it is difficult to obtain the fuzzy rules. Therefore, a parallel structure of independent single-input single-output fuzzy controllers, based on pairing input and output variables, is applied to the multi-variable system. However, the interactive effects among input/output variables that are not paired should also be taken into account, because this mutual coupling of variables affects the control performance. For a control system with strong coupling, the control performance is therefore sometimes degraded. In this paper, the effect of mutual coupling of variables is taken into account by introducing a neuro-fuzzy controller that uses the relative gain matrix. The proposed neuro-fuzzy controller automatically adjusts the mutual coupling weights between variables using a neural network realized with the back-propagation algorithm. The good performance of the proposed neuro-fuzzy controller is verified through computer simulations of a 200 MW boiler system.
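
As background for the coupling measure, the sketch below computes a relative gain matrix for a two-input, two-output process. It assumes the term refers to the standard relative gain array, whose entries are the element-wise product of the steady-state gain matrix G and the transpose of its inverse. The function name and the example gains are hypothetical, and the sketch does not reproduce the neuro-fuzzy controller itself.

```python
def relative_gain_array_2x2(g):
    """Relative gain array of a 2x2 steady-state gain matrix.

    Entry (i, j) is G[i][j] * (G^-1)[j][i]. Values near 1 on a pairing
    suggest weak interaction; values far from 1 suggest strong coupling.
    """
    (g11, g12), (g21, g22) = g
    det = g11 * g22 - g12 * g21
    if det == 0:
        raise ValueError("gain matrix is singular")
    inv = [[g22 / det, -g12 / det], [-g21 / det, g11 / det]]
    return [[g[i][j] * inv[j][i] for j in range(2)] for i in range(2)]
```

For a diagonal gain matrix the result is the identity matrix (no coupling), and each row and column of the result sums to 1.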

Image Pattern Classification and Recognition by Using the Associative Memory with Cellular Neural Networks (셀룰라 신경회로망의 연상메모리를 이용한 영상 패턴의 분류 및 인식방법)

  • Shin, Yoon-Cheol;Park, Yong-Hun;Kang, Hoon
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.2 / pp.154-162 / 2003
  • In this paper, an associative memory built on Cellular Neural Networks (CNN) classifies and recognizes image patterns, acting as an operator applied to image processing. Like neural networks, a CNN processes nonlinear data in real time, and, like Cellular Automata, it is made of cells that communicate directly with their neighboring cells. It is applied to optimization problems, associative memory, pattern recognition, and computer vision. Image processing with a CNN is well suited to 2-D images, because the cells, each corresponding to one pixel of the image, are processed simultaneously in parallel. This paper presents a method for designing the structure of a CNN-based associative memory and for obtaining the output image by choosing the most appropriate weight pattern among all the learned weight pattern memories. Each template represents the weight values between cells and updates them by learning. The Hebbian rule is used for learning the template weights, and the LMS algorithm is used for classification.
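
The Hebbian rule named above can be illustrated with a generic associative memory. The sketch assumes bipolar (+1/-1) patterns, fully connected weights, and synchronous updates; it is a textbook-style illustration, not the CNN template structure of the paper, and all names are hypothetical.

```python
def hebbian_weights(patterns):
    """Hebbian learning: strengthen w[i][j] when units i and j agree."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w


def recall(w, state, steps=5):
    """Update all units synchronously for a fixed number of steps."""
    state = list(state)
    n = len(state)
    for _ in range(steps):
        state = [
            1 if sum(w[i][j] * state[j] for j in range(n)) >= 0 else -1
            for i in range(n)
        ]
    return state
```

Storing two orthogonal 8-element patterns and presenting one of them with a single flipped element should return the stored pattern, which is the associative recall behavior the abstract relies on.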

Parallel Genetic Algorithm-Tabu Search Using PC Cluster System for Optimal Reconfiguration of Distribution Systems (배전계통 최적 재구성 문제에 PC 클러스터 시스템을 이용한 병렬 유전 알고리즘-타부 탐색법 구현)

  • Mun Kyeong-Jun;Song Myoung-Kee;Kim Hyung-Su;Kim Chul-Hong;Park June Ho;Lee Hwa-Seok
    • The Transactions of the Korean Institute of Electrical Engineers A / v.53 no.10 / pp.556-564 / 2004
  • This paper presents an application of a parallel Genetic Algorithm-Tabu Search (GA-TS) algorithm to search for an optimal solution to the reconfiguration of distribution systems. The aim of distribution system reconfiguration is to determine the switch positions to be opened for loss minimization in radial distribution systems, which is a discrete optimization problem. This problem has many constraints, and finding the optimal switch positions is very difficult because it has many local minima. This paper develops a parallel GA-TS algorithm for the reconfiguration of distribution systems. In parallel GA-TS, GA operators are executed on each processor. To prevent low-fitness solutions from appearing in the next generation, strings below the average fitness are saved in the tabu list. If the best fitness of the GA does not change for several generations, TS operators are executed for the top 10% of the population to enhance the local search capability. With the migration operation, the best string of each node is transferred to the neighboring node after a predetermined number of iterations. For parallel computing, we developed a PC cluster system consisting of 8 PCs. Each PC employs a 2 GHz Pentium Ⅳ CPU and is connected to the others through an Ethernet switch based on Fast Ethernet. To show the usefulness of the proposed method, the developed algorithm was tested and compared on a distribution system from the reference paper. The simulation results show that the proposed algorithm is efficient and robust for the reconfiguration of distribution systems in terms of solution quality, speedup, efficiency, and computation time.
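
The migration operation described above can be sketched as follows. The sketch assumes a ring topology (each node sends to the next one) and that the incoming string replaces the receiver's worst string; both are assumptions, since the abstract only states that the best string goes to the neighboring node. The function name and the `fitness` callback are hypothetical.

```python
def migrate_ring(islands, fitness):
    """Send the best individual of each island to the next island in a ring.

    islands: list of populations (lists of individuals), modified in place.
    fitness: function returning a score where higher is better.
    """
    # Pick all emigrants first so every island sends its pre-migration best.
    bests = [max(pop, key=fitness) for pop in islands]
    for i, pop in enumerate(islands):
        incoming = bests[(i - 1) % len(islands)]
        worst = min(range(len(pop)), key=lambda k: fitness(pop[k]))
        pop[worst] = incoming  # assumed policy: replace the worst string
    return islands
```

In a cluster setting, each island would live on its own PC and the exchange would be a message between neighbors; here all islands sit in one process to keep the sketch short.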

SHA-1 Pipeline Configuration According to the Maximum Critical Path Delay (최대 임계 지연 크기에 따른 SHA-1 파이프라인 구성)

  • Lee, Je-Hoon;Choi, Gyu-Man
    • Convergence Security Journal / v.16 no.7 / pp.113-120 / 2016
  • This paper presents a new high-speed SHA-1 pipeline architecture whose computation delay is close to the maximum critical path delay of the original SHA-1. Typical SHA-1 pipelines are based on either a single hash operation or unfolded hash operations. Their throughput is greatly enhanced by parallel processing in the pipeline, but the maximum critical path delay increases in comparison with the unfolding of all hash operations in each round. Each pipeline stage in the proposed SHA-1 has a latency close to the maximum critical path delay of a round divided by the number of iterations. Experimental results show that the proposed SHA-1 pipeline structure achieves ratios of operating speed to circuit size of 0.99 and 1.62, which is superior to the conventional structure. The proposed pipeline architecture is expected to be applicable to various cryptographic and signal processing circuits with iterative operations.
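
To make the "hash operation" concrete, here is a compact software sketch of SHA-1 in which one call to `sha1_round` corresponds to one of the 80 iterated operations. The function names are chosen for illustration. Each operation needs the five state words produced by the previous one, and that data dependency is what determines how a hardware design can split or unfold the operations across pipeline stages.

```python
import struct


def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def sha1_round(state, t, w_t):
    """One SHA-1 operation: consumes state (a, b, c, d, e) and word w_t."""
    a, b, c, d, e = state
    if t < 20:
        f, k = (b & c) | (~b & d), 0x5A827999
    elif t < 40:
        f, k = b ^ c ^ d, 0x6ED9EBA1
    elif t < 60:
        f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
    else:
        f, k = b ^ c ^ d, 0xCA62C1D6
    temp = (rotl(a, 5) + (f & 0xFFFFFFFF) + e + k + w_t) & 0xFFFFFFFF
    return (temp, a, rotl(b, 30), c, d)


def sha1(message):
    """Return the SHA-1 digest of a bytes object as a hex string."""
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Pad with 0x80, zeros, and the 64-bit message length in bits.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    for off in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        for t in range(16, 80):  # message schedule expansion
            w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
        state = tuple(h)
        for t in range(80):  # the serial chain of 80 operations
            state = sha1_round(state, t, w[t])
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, state)]
    return "".join("%08x" % x for x in h)
```

Unfolding, in these terms, means evaluating several consecutive `sha1_round` calls as one combinational step, which shortens the iteration count but lengthens the path a signal must travel within a clock cycle.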

A CPU and GPU Heterogeneous Computing Techniques for Fast Representation of Thin Features in Liquid Simulations (액체 시뮬레이션의 얇은 특징을 빠르게 표현하기 위한 CPU와 GPU 이기종 컴퓨팅 기술)

  • Kim, Jong-Hyun
    • Journal of the Korea Computer Graphics Society / v.24 no.2 / pp.11-20 / 2018
  • We propose a new particle-based method that explicitly preserves thin liquid sheets for animating liquids on a CPU-GPU heterogeneous computing framework. Our primary contribution is a particle-based framework that splits at thin points and collapses at dense points to prevent the breakup of liquid on the GPU. In contrast to existing surface tracking methods, our method does not suffer from numerical diffusion or tangles, and it robustly handles topology changes on the CPU-GPU framework. Thin features are detected by examining the stretch of the distribution of neighboring particles using PCA (principal component analysis), which is also used to reconstruct thin surfaces with anisotropic kernels. The efficiency of the candidate position extraction process, which calculates the positions of fluid particles, was greatly improved by the CPU-GPU heterogeneous computing techniques. The proposed algorithm is intuitive to implement, easy to parallelize, and capable of quickly producing detailed thin liquid animations.
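
The PCA-based detection step can be sketched in simplified form. The sketch assumes 2-D particle positions (the paper presumably works in 3-D) and a hypothetical threshold on the ratio of the smallest to the largest principal variance; the function names and the default threshold are illustrative only.

```python
import math


def covariance_2d(points):
    """Covariance entries (cxx, cxy, cyy) of a list of (x, y) positions."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return cxx, cxy, cyy


def principal_variances(points):
    """Eigenvalues of the 2x2 covariance matrix, largest first."""
    cxx, cxy, cyy = covariance_2d(points)
    mean = (cxx + cyy) / 2.0
    spread = math.sqrt(((cxx - cyy) / 2.0) ** 2 + cxy ** 2)
    return mean + spread, mean - spread


def is_thin(points, ratio_threshold=0.1):
    """Flag a neighborhood as thin when it is stretched along one axis."""
    major, minor = principal_variances(points)
    return major > 0 and minor / major < ratio_threshold
```

Neighbors lying along a line give a minor variance near zero and are flagged as thin, while neighbors spread evenly in all directions give a ratio near 1 and are not. Because each particle's neighborhood is analyzed independently, the test is easy to run in parallel.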