Search | Korea Science

Parallel LDPC Decoder for CMMB on CPU and GPU Using OpenCL

Park, Joo-Yul;Hong, Jung-Hyun;Chung, Ki-Seok
- IEMEK Journal of Embedded Systems and Applications, v.11 no.6, pp.325-334, 2016
Recently, Open Computing Language (OpenCL) has been proposed to provide a framework that supports heterogeneous computing platforms. By using an OpenCL framework, digital communication systems can support various protocols in a unified computing environment to achieve both high portability and high performance. This article introduces a parallel software decoder of Low Density Parity Check (LDPC) codes for China Multimedia Mobile Broadcasting (CMMB) on a heterogeneous platform. Each step of LDPC decoding has different parallelization characteristics. In this paper, steps suitable for task-level parallelization are executed on the CPU, and steps suitable for data-level parallelization are processed by the GPU. To improve the performance of the proposed OpenCL kernels for LDPC decoding operations, explicit thread scheduling, loop-unrolling, and effective data transfer techniques are applied. The proposed LDPC decoder achieves high performance by using heterogeneous multi-core processors on a unified computing framework.
https://doi.org/10.14372/IEMEK.2016.11.6.325

Parallelization of a Purely Functional Bisimulation Algorithm

Ahn, Ki Yung
- Journal of the Korea Society of Computer and Information, v.26 no.1, pp.11-17, 2021
In this paper, we demonstrate a performance boost by parallelizing a purely functional bisimulation algorithm on a multicore processor machine. The key idea of this parallelization is exploiting the referential transparency of purely functional programs to minimize refactoring of the original implementation without any parallel constructs. Both the original and parallel implementations are written in Haskell, a purely functional programming language. The change from the original program to the parallel program is minuscule and preserves almost all of the original program structure. Through benchmarks, we show that the proposed parallelization doubles the performance of the bisimulation test compared to the original non-parallel implementation. We also show that a similar performance boost is possible for a memoized version of the bisimulation implementation.
https://doi.org/10.9708/jksci.2021.26.01.011

Multi-Threaded Parallel H.264/AVC Decoder for Multi-Core Systems

Kim, Won-Jin;Cho, Keol;Chung, Ki-Seok
- Journal of the Institute of Electronics Engineers of Korea SD, v.47 no.11, pp.43-53, 2010
Wide deployment of high-resolution video services has led to active research on high-speed video processing. In particular, the prevalence of multi-core systems is accelerating research on high-resolution video processing based on the parallelization of multimedia software. In this paper, we propose a novel parallel H.264/AVC decoding scheme on a multi-core platform. Parallel H.264/AVC decoding is challenging not only because parallelization may incur significant synchronization overhead but also because the software may have complicated dependencies. To overcome these issues, we propose a novel approach called Multi-Threaded Parallelization (MTP). In MTP, a separate thread is allocated to each stage in the pipeline to reduce synchronization overhead. In addition, an efficient memory reuse technique is used to reduce the memory requirement. To verify the effectiveness of the proposed approach, we parallelized the FFmpeg H.264/AVC decoder with the proposed technique using OpenMP and carried out experiments on an Intel quad-core platform. The proposed design outperforms the unparallelized FFmpeg H.264/AVC decoder by 53%. We also reduced memory usage by 65% and 81% for high-definition (HD) and full high-definition (FHD) video, respectively, compared with a popular existing method called 2Dwave.

Research on parallelization mechanism of inductively coupled plasma for large area plasma source

Lee, Jang-Jae;Kim, Si-Jun;Kim, Gwang-Gi;Lee, Ba-Da;Lee, Yeong-Seok;Yeom, Hui-Jung;Kim, Dae-Ung;Yu, Sin-Jae
- Proceedings of the Korean Vacuum Society Conference, 2016.02a, pp.183-183, 2016
High-density inductively coupled plasma is often used to achieve high productivity in plasma processing. In large-area processing, the plasma can be generated using multi-poles connected in parallel. In this case, however, power is not transferred to the plasma uniformly. To address this problem, we studied the mechanism of inductively coupled plasma connected in parallel by using a transformer model. We also studied the change of the plasma parameters over time through the power balance equation and the particle balance equation.

A Data Dependency Elimination Algorithm for Extracting Maximum Parallelism

송월봉;박두순
- Journal of KIISE:Software and Applications, v.26 no.1, pp.139-139, 1999
In most application programs, loops account for most of the computation and are the most important source of parallelism. For cases where the data dependency relation is uniform in distance, several compile-time parallelization methods have been introduced. On the other hand, when the data dependency relation is non-uniform in distance, compile-time extraction of parallelism is much more complicated. In this paper, a general method for extracting parallelism in nested loops is presented. The algorithm is applicable whether the dependency relation is uniform or non-uniform in distance. Based on the repeated execution of the statements in nested loops, an algorithm that effectively removes these kinds of data dependencies is developed in order to achieve total parallelization of nested loops.

A Technique for Fast Process Creation Based on Creation Location

Kim, Byung-Jin;Ahn, Young-Ho;Chung, Ki-Seok
- Journal of Computing Science and Engineering, v.5 no.4, pp.283-287, 2011
Due to the proliferation of software parallelization on multi-core CPUs, the number of concurrently executing processes is rapidly increasing. Unlike processes running in a server environment, those executing in a multi-core desktop or a multi-core mobile platform have various correlations. Therefore, it is crucial to consider correlations among concurrently running processes. In this paper, we exploit the property that for a given created location in the binary image of the parent process, the average running time of child processes residing in the run-queue differs. We claim that this property can be exploited to improve the overall system performance by running processes that have a relatively short running time before those with a longer running time. Experimental results verified that the running time was actually improved by 11%.
https://doi.org/10.5626/JCSE.2011.5.4.283

High-Performance Computer-Generated Hologram by Optimized Implementation of Parallel GPGPUs

Lee, Yoon-Hyuk;Seo, Young-Ho;Yoo, Ji-Sang;Kim, Dong-Wook
- Journal of the Optical Society of Korea, v.18 no.6, pp.698-705, 2014
We propose a new development for calculating a computer-generated hologram (CGH) through the use of multiple general-purpose graphics processing units (GPGPUs). For optimization of the implementation, CGH parallelization, object point tiling, memory selection for object point, hologram tiling, CGMA (compute to global memory access) ratio by block size, and memory mapping were considered. The proposed CGH was equipped with a digital holographic video system consisting of a camera system for capturing images (object points) and CPU/GPGPU software (S/W) for various image processing activities. The proposed system can generate about 37 full HD holograms per second using about 6K object points.
https://doi.org/10.3807/JOSK.2014.18.6.698

Parallelization of an Unstructured Implicit Euler Solver

Kim J. S.;Kang H. J.;Park Y. M.;Kwon O. J.
- Journal of computational fluids engineering, v.5 no.2, pp.20-27, 2000
An unstructured implicit Euler solver is parallelized on a Cray T3E. Spatial discretization is accomplished by a cell-centered finite volume formulation using an upwind flux differencing. Time is advanced by the Gauss-Seidel implicit scheme. Domain decomposition is accomplished by using the k-way n-partitioning method developed by Karypis. In order to analyze the parallel performance of the solver, flows over a 2-D NACA 0012 airfoil and 3-D F-5 wing were investigated.

Lane Detection using Embedded Multi-core Platform

Lee, Kwang-Yeob;Kim, Dong-Han;Park, Tae-Ryoung
- Journal of IKEEE, v.15 no.3, pp.255-260, 2011
In this paper, we propose a parallelization technique for lane detection using the Hough transform. The Hough transform has the weakness of requiring a large amount of computation, because the $\rho$ value must be computed for every candidate $\Theta$ to be detected in an image. We propose a parallel processing architecture for this transform in a multi-core environment. The parallel processing is applied to the Hough transform as well as to noise reduction and edge detection. The proposed architecture achieves a 5.17-fold improvement in performance compared to the existing algorithm.
https://doi.org/10.7471/ikeee.2011.15.3.255

Parallelization of an Unstructured Implicit Euler Solver

Kim J. S.;Kang H. J.;Park Y. M.;Kwon O. J.
- Proceedings of the Korean Society of Computational Fluids Engineering Conference, 1999.11a, pp.193-200, 1999
An unstructured implicit Euler solver is parallelized on a Cray T3E. Spatial discretization is accomplished by a cell-centered finite volume formulation using upwind flux differencing. Time is advanced by the Gauss-Seidel implicit scheme. Domain decomposition is accomplished by using the k-way N-partitioning method developed by Karypis. In order to analyze the parallel performance of the solver, flows over a 2-D NACA 0012 airfoil and a 3-D F-5 wing were investigated.

Search Result 215, Processing Time 0.032 seconds
