Search | Korea Science

The Cooperative Parallel X-Match Data Compression Algorithm (협동 병렬 X-Match 데이타 압축 알고리즘)

윤상균
- Journal of KIISE:Computer Systems and Theory / v.30 no.10 / pp.586-594 / 2003
The X-Match algorithm is a lossless compression algorithm whose simplicity makes it suitable for hardware implementation. It can compress 32 bits per clock cycle and is suitable for real-time compression. However, as the bus width increases to 64 bits, the compression unit also needs to grow. This paper proposes the cooperative parallel X-Match (X-MatchCP) algorithm, which improves compression speed by running two X-Match algorithms in parallel. It searches the entire dictionary for two words, combines the compression codes that the parallel X-Match compression generates for the two words, and outputs the combined code, whereas the previous parallel X-Match algorithm searches individual dictionaries. The compression ratio of X-MatchCP is almost the same as that of X-Match. The X-MatchCP algorithm is described and simulated in the Verilog hardware description language.
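
As a rough illustration of the dictionary-based word matching that this abstract refers to, the following Python sketch processes one 4-byte word per step against a small move-to-front dictionary and allows partial matches. It is a sequential software simplification, not the X-MatchCP hardware design: the token format, the dictionary size of 16, and the rule that at least two bytes must match are all assumptions made for this example.

```python
WORD = 4  # bytes per word (32 bits, as in the abstract)


def _best_match(dictionary, word):
    """Return (index, mask) of the entry sharing the most byte positions."""
    best_idx, best_mask, best_count = None, None, 0
    for idx, entry in enumerate(dictionary):
        mask = tuple(a == b for a, b in zip(entry, word))
        count = sum(mask)
        if count > best_count:
            best_idx, best_mask, best_count = idx, mask, count
    # Assumption: a match only counts if at least two bytes agree.
    return (best_idx, best_mask) if best_count >= 2 else (None, None)


def _update(dictionary, word, idx, size):
    """Move-to-front update shared by the encoder and the decoder."""
    if idx is not None:
        del dictionary[idx]
    dictionary.insert(0, word)
    del dictionary[size:]  # drop the oldest entry when the dictionary is full


def compress(data, size=16):
    assert len(data) % WORD == 0, "sketch handles whole words only"
    dictionary, tokens = [], []
    for i in range(0, len(data), WORD):
        word = data[i:i + WORD]
        idx, mask = _best_match(dictionary, word)
        if idx is None:
            tokens.append(("miss", word))  # word sent as literal bytes
        else:
            literals = bytes(b for b, m in zip(word, mask) if not m)
            tokens.append(("match", idx, mask, literals))
        _update(dictionary, word, idx, size)
    return tokens


def decompress(tokens, size=16):
    dictionary, out = [], bytearray()
    for tok in tokens:
        if tok[0] == "miss":
            word, idx = tok[1], None
        else:
            _, idx, mask, literals = tok
            lit = iter(literals)
            # Take matched bytes from the dictionary, the rest from literals.
            word = bytes(e if m else next(lit)
                         for e, m in zip(dictionary[idx], mask))
        out += word
        _update(dictionary, word, idx, size)
    return bytes(out)


sample = b"abcdabcdabxdzzzzabxd"
tokens = compress(sample)
restored = decompress(tokens)
```

A real implementation would pack each token into a variable-length bit code. The cooperative parallel scheme in the abstract additionally handles two words per step against the whole dictionary, which this sketch does not attempt.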

Proposal for Decoding-Compatible Parallel Deflate Algorithm by Inserting Control Header Composed of Non-Compressed Blocks (비 압축 블록으로 구성된 제어 헤더 삽입을 통한 압축 해제 호환성 있는 병렬 처리 Deflate 알고리즘 제안)

Kim Jung Hoon
- KIPS Transactions on Software and Data Engineering / v.12 no.5 / pp.207-216 / 2023
To obtain a decoding-compatible parallel Deflate algorithm, this study proposes a new method in which a control header is built by storing the information essential for parallel compression and decompression in the Disposed Bit Area (DBA) of non-compressed blocks, and this header is inserted among the compressed blocks. This enables parallel compression and decompression while maintaining full compatibility with existing decoders. With this method, compression time was reduced by up to 71.2% compared with sequential processing, and parallel decompression time was reduced by up to 65.7%. In particular, parallel decompression is generally considered impossible because of the structural limitations of the Deflate algorithm. A decoder equipped with the proposed method, however, enables high-speed parallel decompression at the algorithm level while maintaining compatibility, so that data compressed in parallel can still be decoded normally by existing decoder programs.
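
The compatibility claim can be illustrated with a small Python sketch that uses only the standard `zlib` module. It does not reproduce the paper's control header or DBA layout. It shows a related, well-known property: chunks compressed independently can be joined into one valid raw Deflate stream when each non-final chunk ends with an empty non-compressed (stored) block, which `Z_SYNC_FLUSH` emits. The function names and the chunk size are assumptions made for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def _compress_chunk(job):
    chunk, is_last = job
    # wbits=-15 selects raw Deflate (no zlib or gzip wrapper).
    co = zlib.compressobj(6, zlib.DEFLATED, -15)
    body = co.compress(chunk)
    # Z_SYNC_FLUSH ends the piece with an empty stored block (00 00 FF FF),
    # byte-aligning it without marking the stream as finished.
    mode = zlib.Z_FINISH if is_last else zlib.Z_SYNC_FLUSH
    return body + co.flush(mode)


def parallel_deflate(data, chunk_size=1 << 16, workers=4):
    """Compress fixed-size chunks independently; return the pieces in order."""
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    jobs = [(c, i == len(chunks) - 1) for i, c in enumerate(chunks)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(_compress_chunk, jobs))


payload = bytes(range(256)) * 1000
pieces = parallel_deflate(payload)
stream = b"".join(pieces)
# An ordinary sequential decoder accepts the concatenated stream.
roundtrip = zlib.decompress(stream, -15)
```

This sketch covers the compression side only. A decoder cannot find the piece boundaries without extra information, which is the kind of metadata the abstract says is stored in the control header.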
https://doi.org/10.3745/KTSDE.2023.12.5.207

H.264/AVC Fast Intra Mode Decision using GPGPU Parallel Programming (GPGPU 병렬 프로그래밍을 이용한 H.264/AVC 고속 화면내 예측 모드 결정)

Choi, Sung-Jun;Han, Ki-Hun;Yoo, Yeong-Soo
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2011.11a / pp.110-112 / 2011
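Research on GPGPU computing, which applies the parallelism and computational power of GPUs to general engineering problems, has recently been very active. Video compression involves many algorithms that repeat the same operations over large amounts of pixel data, which makes it well suited to high-speed parallel computation on GPGPUs. H.264/AVC is the most recent international video compression standard and is widely used in the market across many product lines and services. This paper proposes a method that applies GPGPU parallel programming to the intra prediction mode decision process of H.264/AVC in order to speed up mode decision. CUDA C was used for data-parallel processing on the GPU, and the CPU-side computation was implemented in C. Determining the intra prediction modes for an entire frame in parallel on the GPU reduced the time required. In experiments, parallel mode decision on the GPU yielded a speedup of about 2.8 times for Full-HD video. If GPGPU parallel programming is applied not only to intra prediction but also to other algorithms that perform repetitive operations, reducing the computational burden on the encoder, the development of fast real-time video encoders is expected to become easier.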
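
To make the mode decision step concrete, here is a minimal Python sketch for a single 4x4 block. It covers only three of the nine H.264 4x4 luma intra modes (vertical, horizontal, DC) and uses the sum of absolute differences (SAD) as the cost, which is an assumption since the abstract does not state the cost function. The paper's implementation uses CUDA C; the function names here are illustrative.

```python
MODES = ("vertical", "horizontal", "dc")


def predict(mode, top, left):
    """Build a 4x4 prediction from the 4 pixels above and the 4 to the left."""
    if mode == "vertical":
        return [list(top) for _ in range(4)]       # copy the top row downward
    if mode == "horizontal":
        return [[left[r]] * 4 for r in range(4)]   # copy the left column across
    dc = (sum(top) + sum(left) + 4) >> 3           # rounded mean of 8 neighbours
    return [[dc] * 4 for _ in range(4)]


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(a - b)
               for row_b, row_p in zip(block, pred)
               for a, b in zip(row_b, row_p))


def best_mode(block, top, left):
    """Return the cheapest mode and the cost of every candidate mode."""
    costs = {m: sad(block, predict(m, top, left)) for m in MODES}
    return min(costs, key=costs.get), costs


top = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [list(top) for _ in range(4)]  # every row repeats the top neighbours
mode, costs = best_mode(block, top, left)
```

When each block's decision depends only on its own pixels and neighbours, one such evaluation can be assigned to each GPU thread, which is the kind of data parallelism the abstract describes.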

Implementation and analysis of a parallel suffix tree construction algorithm using TBB and Cilk Plus (TBB, Cilk Plus를 이용한 병렬 접미사 트리 생성 알고리즘 구현 및 성능 분석)

Seo, Jun-Ho;Na, Joong-Chae
- Proceedings of the Korean Information Science Society Conference / 2012.06a / pp.403-405 / 2012
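A suffix tree is an index data structure used in various applications such as string compression, text processing, and bioinformatics. With the recent spread of 64-bit hardware and multicore CPUs, algorithms that construct suffix trees in parallel in memory have been actively studied. This paper presents an implementation that constructs suffix trees in parallel in memory, based on McCreight's linear-time algorithm and Chen's parallel algorithm, using parallel programming libraries such as TBB and Cilk Plus. In experiments, the parallel algorithm achieved up to about a 4-fold performance improvement over the serial one, and the overhead of using the parallel libraries was found to be very small.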

Compression-Based Volume Rendering on Distributed Memory Parallel Computers (분산 메모리 구조를 갖는 병렬 컴퓨터 상에서의 압축 기반 볼륨 렌더링)

Koo, Gee-Bum;Park, Sang-Hun;Song, Dong-Sub;Ihm, In-Sung
- Journal of KIISE:Computing Practices and Letters / v.6 no.5 / pp.457-467 / 2000
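This paper proposes a parallel ray-casting method for effectively visualizing very large volume data on a distributed-memory parallel computer. The method is based on data compression and aims to improve parallel rendering performance by quickly decompressing compressed data held in each processor's local memory rather than reading data from the memory of other processors. It improves performance by exploiting the advantages of both object-order and image-order traversal algorithms. Specifically, it exploits coherence in object space and in image space by traversing a block-based min-max octree and by applying a real-time quadtree that dynamically maintains the opacity value of each pixel. The proposed compression-based parallel volume rendering method is implemented to minimize interprocessor communication during rendering, a property that is very useful in more practical distributed environments such as clusters of PCs and workstations, where data communication between processors is quite costly. Experiments were performed on a Cray T3E parallel computer using the Visible Man data.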

The Mixed Finite Element Analysis for Nearly Incompressible and Impermeable Porous Media Using Parallel Algorithm (병렬알고리즘 이용한 비압축, 비투과성 포화 다공질매체의 혼합유한요소해석)

Tak, Moon-Ho;Kang, Yoon-Sik;Park, Tae-Hyo
- Journal of the Computational Structural Engineering Institute of Korea / v.23 no.4 / pp.361-368 / 2010
This paper introduces a parallel algorithm based on the MPI (Message-Passing Interface) library to improve the numerical efficiency of the staggered method for nearly incompressible and impermeable porous media introduced by Park and Tak (2010). The porous media theory and the staggered method are also briefly reviewed. We then describe the MPI library's blocking, non-blocking, and collective communication, and propose combining the staggered method with blocking and non-blocking MPI communication. Next, we present how to allocate CPUs for the staggered method and the MPI library, which affects the numerical efficiency of solving for the unknown variables in nearly incompressible and impermeable porous media. Finally, serial and parallel solutions are compared and verified on a two-dimensional saturated porous model for varying numbers of FEM meshes.

A Parallel Algorithm for 3D Geographic Information System (3차원 공간정보 시스템을 위한 병렬 알고리즘)

Jo, Jeong-U;Kim, Jin-Seok
- The KIPS Transactions:PartA / v.9A no.2 / pp.217-224 / 2002
Many systems that handle 3D images have relied on high-performance computer systems and image compression techniques. However, implementing a GIS on a high-performance system raises cost problems, and implementing it with image compression techniques causes some loss of image quality. Handling 3D images on a general-purpose PC takes a long time because 3D image files are very large. The parallel algorithm presented in this paper speeds up 3D image handling by using a parallel computer system. The system divides a 3D image into multiple sub-images across multiple nodes and displays the images from the nodes on screens. Experiments show that the presented algorithm improves speed.
https://doi.org/10.3745/KIPSTA.2002.9A.2.217

A Novel VLSI Architecture for Parallel Adaptive Dictionary-Base Text Compression (가변 적응형 사전을 이용한 텍스트 압축방식의 병렬 처리를 위한 VLSI 구조)

Lee, Yong-Doo;Kim, Hie-Cheol;Kim, Jung-Gyu
- The Transactions of the Korea Information Processing Society / v.4 no.6 / pp.1495-1507 / 1997
Among the many approaches to text compression, adaptive dictionary schemes based on a sliding window are used very frequently because of their high performance. The LZ77 algorithm is the most efficient algorithm implementing such adaptive schemes for practical text compression. This paper presents a VLSI architecture designed for processing the LZ77 algorithm in parallel. Compared with other VLSI architectures developed so far, the proposed architecture provides a more viable route to high performance with regard to throughput, efficient implementation of VLSI systolic arrays, and hardware scalability. Indeed, regardless of the size of the sliding window, our system has O(N) complexity for both compression and decompression and requires a small wafer area, where N is the size of the input text.
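
For readers unfamiliar with the sliding-window scheme, the following is a minimal sequential LZ77 sketch in Python. It is a plain software illustration, not the systolic-array architecture of the paper: the window size, maximum match length, and the (offset, length, next byte) token format are assumptions chosen for readability. The inner search over window positions is the loop that a parallel hardware design would evaluate concurrently.

```python
def lz77_compress(data, window=32, max_len=15):
    """Encode data as (offset, length, next_byte) triples."""
    i, out = 0, []
    while i < len(data):
        best_off, best_len = 0, 0
        # Try every start position inside the sliding window.
        for j in range(max(0, i - window), i):
            k = 0
            # Stop one byte early so a literal always follows the match.
            while (k < max_len and i + k < len(data) - 1
                   and data[j + k] == data[i + k]):
                k += 1
            if k > best_len:
                best_off, best_len = i - j, k
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out


def lz77_decompress(triples):
    out = bytearray()
    for off, length, nxt in triples:
        for _ in range(length):
            out.append(out[-off])  # byte-wise copy handles overlapping matches
        out.append(nxt)
    return bytes(out)


text = b"abababababx"
triples = lz77_compress(text)
decoded = lz77_decompress(triples)
```

The search cost of this naive encoder grows with the window size, which is the dependence the abstract says the proposed architecture avoids.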

Optimal Design of Multi-Fuzzy Controller and Its application to Air Conditioning System (다중 퍼지 제어기의 최적 설계와 에어컨 시스템으로의 적용)

Jang, Han-Jong;Choe, Jeong-Nae;O, Seong-Gwon
- Proceedings of the Korean Institute of Intelligent Systems Conference / 2008.04a / pp.313-316 / 2008
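An air conditioning system consists of a compressor, a condenser, an evaporator, and an expansion valve. In such a system, the degree of superheat and the low pressure (the evaporator pressure) have a decisive influence on efficiency, performance, and stability. To regulate the superheat and the low pressure, control of the inverter frequency of the compressor and of the opening of the expansion valve is therefore important, and we design a fuzzy controller that performs well for both linear and nonlinear systems and is robust against disturbances. In this paper, a multi-fuzzy controller is designed for an air conditioning system with three expansion valves and one compressor in order to control the superheat and the low pressure. In addition, three optimization algorithms are used to design the optimal fuzzy controller for each control plant: a serial genetic algorithm (SGA), the hierarchical fair competition genetic algorithm (HFCGA), which is a parallel genetic algorithm, and particle swarm optimization (PSO). The multi-fuzzy controller is optimized with each of them, and the simulation results are compared.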

A Parallel Implementation of JPEG2000 4K Ultra High Definition Image using OpenCL (OpenCL을 이용한 JPEG2000 4K 초고화질 영상처리의 병렬고속화 구현)

Park, Daeseung;Kim, Cheong Ghil
- Journal of Satellite, Information and Communications / v.10 no.1 / pp.1-5 / 2015
Driven by fast-growing multimedia technology and users' strong preference for large screens, the newest video coding standard, HEVC (High Efficiency Video Coding), has been introduced for high-quality video compression. As a result, high-definition image services that are four times sharper than conventional HD video are becoming popular. JPEG 2000 has also started to support 4K and 8K UHD, which requires fast processing technology for reading and writing UHD images. This paper presents a study on fast parallel processing technology for UHD images. To this end, JPEG 2000 is first reviewed, and then a GPU-based parallel implementation is proposed for the color conversion preprocessing stage. The parallelized algorithm is implemented with OpenCL (Open Computing Language). Simulation results show that the proposed method achieves a 5-fold improvement in processing speed for 4K UHD over a method using threads.
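
The color conversion stage parallelizes well because every pixel is transformed independently. The Python sketch below shows this with the reversible color transform (RCT) of JPEG 2000. The abstract does not say which transform the paper accelerates, so the choice of RCT is an assumption, any level shifting is omitted, and the function names are illustrative. The paper's implementation uses OpenCL rather than Python.

```python
def rct_forward(pixel):
    """Reversible color transform: integer RGB -> (Y, Cb, Cr)."""
    r, g, b = pixel
    return ((r + 2 * g + b) // 4, b - g, r - g)


def rct_inverse(components):
    """Exact integer inverse of rct_forward."""
    y, cb, cr = components
    g = y - (cb + cr) // 4  # floor division matches the forward rounding
    return (cr + g, g, cb + g)


def convert_image(pixels):
    # No pixel depends on any other, so on a GPU each call below
    # could be one independent work-item.
    return [rct_forward(p) for p in pixels]


image = [(10, 20, 30), (255, 0, 128), (0, 0, 0), (255, 255, 255)]
converted = convert_image(image)
recovered = [rct_inverse(c) for c in converted]
```

Because the inverse reproduces the input exactly, this transform suits lossless coding paths. A lossy path would use a floating-point transform instead.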

