• Title/Summary/Keyword: data Parallel

Efficient Parallel Block-layered Nonbinary Quasi-cyclic Low-density Parity-check Decoding on a GPU

  • Thi, Huyen Pham;Lee, Hanho
    • IEIE Transactions on Smart Processing and Computing / v.6 no.3 / pp.210-219 / 2017
  • This paper proposes a modified min-max algorithm (MMMA) for nonbinary quasi-cyclic low-density parity-check (NB-QC-LDPC) codes and an efficient parallel block-layered decoder architecture corresponding to the algorithm on a graphics processing unit (GPU) platform. The algorithm removes multiplications over the Galois field (GF) in the merger step to reduce decoding latency without any performance loss. The decoding implementation on a GPU for NB-QC-LDPC codes achieves improvements in both flexibility and scalability. To perform the decoding on the GPU, data and memory structures suitable for parallel computing are designed. The implementation results for NB-QC-LDPC codes over GF(32) and GF(64) demonstrate that the parallel block-layered decoding on a GPU accelerates the decoding process to provide a faster decoding runtime, and obtains a higher coding gain under a low $10^{-10}$ bit error rate and low $10^{-7}$ frame error rate, compared to existing methods.

Development of a CUBRID-Based Distributed Parallel Query Processing System

  • Kim, Hyeong-Il;Yang, HyeonSik;Yoon, Min;Chang, Jae-Woo
    • Journal of Information Processing Systems / v.13 no.3 / pp.518-532 / 2017
  • Owing to the rapid growth in data volume, research on big data processing has attracted attention. For big data processing, CUBRID Shard supports parallel query processing by dividing the database across a number of CUBRID servers. However, CUBRID Shard can answer a user's query only when the query needs to access a single CUBRID server rather than multiple servers. To solve this problem, we propose a CUBRID-based distributed parallel query processing system that can answer a user's query in a parallel and distributed manner. A performance evaluation shows that the proposed system achieves 2-3 times better query processing time than the existing CUBRID Shard.

Parallel Processing of 3D Rigid-Plastic FEM on a Cluster System (클러스터 시스템에서 3차원 강소성 유한요소법의 병렬처리)

  • Choi Young;Seo Yongwie
    • Journal of the Korean Society for Precision Engineering / v.22 no.1 / pp.122-129 / 2005
  • A parallel rigid-plastic FEM code has been developed for a cluster system. The cluster system, Simforge, has 15 processors and 4.5 GB of total memory. In the developed parallel code, the distributed data of the column-wise partitioned stiffness matrix are stored in compressed row storage format, and a diagonal preconditioned conjugate gradient solver is applied. An analysis of block upsetting is performed with the parallel code on the Simforge cluster system, and the analysis results are compared and discussed.
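
The storage scheme and solver named in this abstract can be illustrated with a minimal serial sketch. It assumes a small symmetric positive definite matrix with made-up entries; the function names (`csr_matvec`, `csr_diagonal`, `diagonal_pcg`) are hypothetical, and the column-wise partitioning and inter-processor communication of the authors' parallel code are not reproduced here.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A held in compressed row storage."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        # Nonzeros of row i occupy values[row_ptr[i]:row_ptr[i + 1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def csr_diagonal(values, col_idx, row_ptr):
    """Extract the main diagonal, which serves as the preconditioner."""
    diag = [0.0] * (len(row_ptr) - 1)
    for i in range(len(diag)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                diag[i] = values[k]
    return diag


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def diagonal_pcg(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (serial sketch)."""
    diag = csr_diagonal(values, col_idx, row_ptr)
    x = [0.0] * len(b)
    r = list(b)                                # residual b - A x for x = 0
    z = [ri / di for ri, di in zip(r, diag)]   # apply M^{-1} = diag(A)^{-1}
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0           # avoid a zero reference norm
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        Ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Toy system (illustrative numbers only):
# [[4, 1, 0], [1, 3, 1], [0, 1, 2]] x = [6, 10, 8]
values = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
solution = diagonal_pcg(values, col_idx, row_ptr, [6.0, 10.0, 8.0])
```

Only the nonzero entries and two index arrays are stored, so the matrix-vector product inside each iteration touches each nonzero once. This is the usual motivation for pairing compressed row storage with an iterative solver.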

A design of synchronous nonlinear and parallel for pipeline stage on IP-based H.264 decoder implementation (IP기반 H.264 디코더 설계를 위한 동기식 비선형 및 병렬화 파이프라인 설계)

  • Ko, Byung-Soo;Kong, Jin-Hyeung
    • Proceedings of the IEEK Conference / 2008.06a / pp.409-410 / 2008
  • This paper presents a nonlinear and parallel design for synchronous pipelining in an IP-based H.264 decoder implementation. Because the H.264 decoder dataflow includes a feedback loop, the data dependency requires one NOP stage per pipelining latency, which halves the throughput. Further, in terms of execution time, the stage scheduled for MC is found to be more heavily occupied than those for CAVLD/ITQ/DF. The less efficient stages can be improved by nonlinear scheduling, while the fully utilized stage can be accelerated by parallel scheduling of IPs. The optimization yields a 3 nonlinear {CAVLD&ITQ} | 3 parallel (MC/IP&Rec.) | 3 nonlinear {DF} pipelined architecture for the IP-based H.264 decoder. In experiments, the nonlinear and parallel pipelined H.264 decoder, including existing IPs, could process full HD video in real time at 41.86 MHz.

Numerical Investigation of the Flow Pulsation in the Gap connecting with Two Parallel Rectangular Channels with Different Cross-section Areas (크기가 다른 단면을 가진 평행한 사각 유로를 연결하는 협소유로의 맥동유동에 관한 수치해석)

  • Seo, Jeong-Sik;Shin, Jong-Kuen;Choi, Young-Don
    • Transactions of the Korean Society of Mechanical Engineers B / v.33 no.7 / pp.512-519 / 2009
  • Flow pulsation in the gap connecting two parallel channels is investigated using RANS and URANS approaches. The two parallel channels are connected by a small channel called a gap. The parallel channels are designed to have different cross-sectional areas, with a ratio of 0.5. Computations are conducted using the CFX 11.0 code. The bulk Reynolds number is 60,000. The predicted results are compared with previous experimental data; mean velocity profiles at the center of the gap region are compared with experiments for validation. Spectral analysis of the lateral velocity at the center of the gap is performed, and the autocorrelation of the axial-flow velocity pattern is presented. The unsteady structure of the flow pulsation is visualized in the gap region between the parallel channels.

Estimation of Relative Potency with the Parallel-Line Model

  • Lee, Tae-Won
    • The Korean Journal of Applied Statistics / v.25 no.4 / pp.633-640 / 2012
  • Biological methods are described for the assay of certain substances and preparations whose potency cannot be adequately assured by chemical or physical analysis. The principle applied in these assays is comparison with a standard preparation to determine how much of the examined substance produces the same biological effects as a given quantity (the Unit) of the standard preparation. In these dilution assays, estimating the potency of the unknown preparations relative to the standard preparations requires comparing the dose-response relationships of the standard and unknown preparations. The dose-response relationship in a dilution assay is non-linear and sigmoid when a wide range of doses is applied. The parallel line model (applied to the dose region with the steepest slope) is used to estimate the relative potency. In this paper, the statistical theory of the parallel line model is explained with an application to dilution assay data. The parallel line method is implemented in a SAS program and is available at the author's homepage (http://cafe.daum.net/go.analysis).
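
As a rough illustration of the parallel line idea, the sketch below fits two straight lines with a common slope to response versus log10(dose) and converts the horizontal shift between them into a relative potency. It is a minimal point estimate under textbook assumptions, not the author's SAS program; the function name `relative_potency` and the synthetic data are hypothetical, and validity tests (linearity, parallelism) and confidence limits are omitted.

```python
import math


def relative_potency(standard, unknown):
    """Estimate the potency of `unknown` relative to `standard`.

    Each argument is a list of (dose, response) pairs taken from the
    linear part of the dose-response curve. Returns (rho, slope).
    """
    def summarize(data):
        xs = [math.log10(dose) for dose, _ in data]
        ys = [resp for _, resp in data]
        x_bar = sum(xs) / len(xs)
        y_bar = sum(ys) / len(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        return x_bar, y_bar, sxx, sxy

    xs_bar, ys_bar, sxx_s, sxy_s = summarize(standard)
    xu_bar, yu_bar, sxx_u, sxy_u = summarize(unknown)

    # Common slope pooled over both preparations (the "parallel" constraint).
    slope = (sxy_s + sxy_u) / (sxx_s + sxx_u)

    # Horizontal distance between the two parallel lines on the log-dose axis.
    log_rho = (yu_bar - ys_bar) / slope - (xu_bar - xs_bar)
    return 10.0 ** log_rho, slope


# Synthetic, noise-free data in which the unknown is twice as potent:
doses = [1.0, 10.0, 100.0]
standard = [(d, 1.0 + 2.0 * math.log10(d)) for d in doses]
unknown = [(d, 1.0 + 2.0 * math.log10(2.0 * d)) for d in doses]
rho, slope = relative_potency(standard, unknown)
```

With this synthetic input the estimate recovers a common slope of 2 and a relative potency of 2, because the unknown's line is the standard's line shifted by log10(2) along the log-dose axis.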

Installation Error Calibration by Using Levenberg-Marquardt Method on a Cubic Parallel Manipulator (Levenberg-Marquardt 방법을 이용한 육면형 병렬기구의 설치 오차 보정)

  • 임승룡;임현규;최우천;송재복;홍대희
    • Journal of the Korean Society for Precision Engineering / v.20 no.2 / pp.184-191 / 2003
  • A parallel manipulator has high stiffness, and, unlike in a serial manipulator, the joint errors of the device do not accumulate at the end-effector. For these reasons, parallel manipulators have been widely used in many fields of industry. In a parallel manipulator, it is very important to predict the exact pose of the end-effector in order to control its motion. Installation errors have to be determined in order to predict and control the actual position and pose of the end-effector. This paper presents an algorithm that finds all 36 joint error components, with joint clearance errors and measurement errors taken into account, when a link length measurement sensor is used and data are acquired more than 36 times for 36 different configurations. A simulation test of this algorithm is performed with a Matlab program that uses the Levenberg-Marquardt method, which is known to be efficient for non-linear optimization.
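
The Levenberg-Marquardt iteration mentioned here can be sketched on a much smaller problem. The example below is not the 36-parameter calibration and not the authors' Matlab program: it fits a hypothetical two-parameter model y = a·exp(b·t) to synthetic data, and the function name `lm_fit_2param`, the damping schedule, and the tolerances are illustrative assumptions.

```python
import math


def lm_fit_2param(residual, jacobian, p0, lam=1e-3, tol=1e-18, max_iter=200):
    """Minimize sum(residual(p)**2) over two parameters (illustrative)."""
    p = list(p0)
    r = residual(p)
    cost = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if cost <= tol:
            break
        J = jacobian(p)
        # Entries of J^T J and of the gradient J^T r.
        a11 = sum(j[0] * j[0] for j in J)
        a12 = sum(j[0] * j[1] for j in J)
        a22 = sum(j[1] * j[1] for j in J)
        g1 = sum(j[0] * ri for j, ri in zip(J, r))
        g2 = sum(j[1] * ri for j, ri in zip(J, r))
        # Damped system (J^T J + lam * diag(J^T J)) s = -J^T r,
        # solved in closed form for the 2x2 case.
        d11 = a11 * (1.0 + lam)
        d22 = a22 * (1.0 + lam)
        det = d11 * d22 - a12 * a12
        s1 = (-g1 * d22 + a12 * g2) / det
        s2 = (-g2 * d11 + a12 * g1) / det
        trial = [p[0] + s1, p[1] + s2]
        r_trial = residual(trial)
        cost_trial = sum(ri * ri for ri in r_trial)
        if cost_trial < cost:
            # Step accepted: move toward Gauss-Newton behavior.
            p, r, cost = trial, r_trial, cost_trial
            lam /= 10.0
        else:
            # Step rejected: increase damping (closer to gradient descent).
            lam *= 10.0
    return p


# Synthetic, noise-free observations of y = 2 * exp(-0.5 * t).
ts = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * math.exp(-0.5 * t) for t in ts]


def residual(p):
    return [p[0] * math.exp(p[1] * t) - y for t, y in zip(ts, ys)]


def jacobian(p):
    return [[math.exp(p[1] * t), p[0] * t * math.exp(p[1] * t)] for t in ts]


a_hat, b_hat = lm_fit_2param(residual, jacobian, [1.0, -0.1])
```

The damping factor `lam` is what distinguishes the method from plain Gauss-Newton: a rejected step raises it and shortens the next step, while an accepted step lowers it. A calibration with 36 unknowns would need a general linear solver in place of the closed-form 2x2 solve used here.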

Interfacing the Visual Projector to PC using the Parallel Port (PC 병렬 포트를 이용한 실물화상기 인터페이스)

  • 이재혁
    • Proceedings of the IEEK Conference / 2000.06c / pp.173-176 / 2000
  • In this study, a new multimedia data converter is proposed, and a PC interfacing method using the parallel port is suggested. The image compression/decompression is based on the JPEG algorithm, which is widely used for effective compression in the image processing industry. The suggested interfacing method is based on the IEEE 1284 and IEEE 1284.3 protocols, which are standards for the PC's parallel port interface.

Design of Parallel Processor for Image Processing

  • No, Seok-Hwan;Park, Jong-Won
    • Proceedings of the IEEK Conference / 2006.06a / pp.743-744 / 2006
  • This paper presents the implementation of a parallel processing system for image processing. The proposed parallel processing system consists of 16 processing elements, a multi-access memory system, and interface modules. The multi-access memory system is made up of a memory module selection, a data routing module, and an address calculation and routing module.

The Analysis and Application of the Parallel Coupled Line with Open Stub (개방 스터브를 갖는 평행결합선로의 해석과 응용)

  • Lee, Won-Kyun;Lee, Hong-Seob;Hwang, Hee-Yong
    • Journal of Industrial Technology / v.27 no.B / pp.153-160 / 2007
  • In this paper, an exact analysis of the parallel coupled line with an open stub is presented. This structure shows LPF characteristics with a broad stopband and a sharp skirt. We derive the exact Z-matrix expression of the structure. To validate the expression, we design a third-order Chebyshev LPF using the structure. The simulated data agree closely with the values predicted by calculation using the derived expression.
