Integrated Search | Korea Science

Feed-Forward Differential Architecture of Analog Parallel Processing Circuits for Analog PRML Decoder

마헤스워 샤퍄라;양창주;김형석
- 전기학회논문지 / Vol. 59, No. 8 / pp. 1489-1496 / 2010
A feed-forward differential architecture for an analog PRML decoder is investigated for implementation on analog parallel processing circuits. The conventional PRML decoder performs trellis processing by implementing a single stage digitally and reusing it repeatedly. Analog parallel processing-based PRML stems from the observation that PRML decoding is determined mainly by the information in the first several stages. Shortening the trellis processing stages and implementing them with analog parallel circuits yields several benefits, including higher speed, no memory requirement, and no A/D converter requirement. Most conventional analog parallel processing-based PRML decoders use a differential architecture with feedback of the previously decoded data. The architecture used in this paper has no feedback: error metric accumulation is allowed to start from all states of the decoding stage, which makes decoding possible without feedback. The circuit of the proposed architecture is simpler than that of the conventional analog parallel processing structure while offering similar decoding performance. The characteristics of the feed-forward differential architecture are investigated through various simulation studies.
https://doi.org/10.5370/KIEE.2010.59.8.1489 KSCI

A Study on Improvement of Low-power Memory Architecture in IoT/edge Computing

조두산
- 한국산업융합학회 논문집 / Vol. 24, No. 1 / pp. 69-77 / 2021
Low-cost design methodologies are widely used for IoT devices. In such networked devices, memory is composed of flash memory, SRAM, DRAM, and so on, and because the device processes a large amount of data, memory design is an important factor in system performance. Each device therefore selects optimized design factors such as function, performance, and cost according to market demand. The design space of a memory architecture for low-cost IoT devices is very limited, consisting of configurations of SRAM, flash memory, and DRAM. To process as much data as possible in the same space, an architecture that supports parallel processing units is usually provided. Such a parallel architecture provides high performance at low cost; however, it needs precise software techniques for mapping instructions and data onto the parallel architecture. This paper proposes an instruction/data mapping method to support optimized parallel processing performance. The proposed method optimizes system performance by actively exploiting hardware and software parallelism.
https://doi.org/10.21289/KSIC.2021.24.1.69 KSCI

A Parallel Search Algorithm and Its Implementation for Digital k-Winners-Take-All Circuit

Yoon, Myungchul
- JSTS:Journal of Semiconductor Technology and Science / Vol. 15, No. 4 / pp. 477-483 / 2015
The k-Winners-Take-All (kWTA) is an operation that finds the largest k (>1) inputs among N inputs. No parallel search algorithm for kWTA with digital inputs has been reported so far, so most digital kWTA architectures have O(N) time complexity. This paper presents a parallel search algorithm for the digital kWTA operation and the circuits for its VLSI implementation. The proposed kWTA architecture compares all inputs simultaneously in parallel. The time complexity of the new architecture is O(logN), so it scales to a large number of digital data items. The high-speed kWTA operation and the O(logN) dependency of the new architecture are verified by simulations. Searching for 5 winners among 1024 32-bit data items takes 290 ns, which is thousands of times faster than existing digital kWTA circuits as well as existing analog kWTA circuits.
https://doi.org/10.5573/JSTS.2015.15.4.477 KSCI
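
The kWTA operation defined in the abstract above can be illustrated with a small software reference model. This is only a sketch of the operation itself (flag the k largest of N inputs); it does not reproduce the paper's parallel search circuit or its O(logN) behavior. The function name `k_winners_take_all` and the tie-breaking rule (lower index wins) are assumptions made for this illustration.

```python
import heapq


def k_winners_take_all(inputs, k):
    """Return one 0/1 flag per input, with 1 marking the k largest values.

    Software reference model only: runs in O(N log k) time, unlike the
    parallel hardware search described in the abstract.
    """
    if not 0 < k <= len(inputs):
        raise ValueError("k must be between 1 and len(inputs)")
    # Rank indices by value; ties go to the lower index (an assumption).
    winners = set(
        heapq.nlargest(k, range(len(inputs)), key=lambda i: (inputs[i], -i))
    )
    return [1 if i in winners else 0 for i in range(len(inputs))]


print(k_winners_take_all([3, 9, 1, 7, 9, 4], 2))  # -> [0, 1, 0, 0, 1, 0]
```

A model like this is mainly useful as a golden reference when checking the output of a hardware simulation.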

Design of the New Parallel Processing Architecture for Commercial Applications

한우종;윤석한;임기욱
- 전자공학회논문지B / Vol. 33B, No. 5 / pp. 41-51 / 1996
In this paper, a new parallel processing system based on a cluster architecture is proposed, which provides the scalability of a parallel processing system while maintaining shared memory multiprocessor characteristics. In recent years, low-cost, high-performance microprocessors have led to the construction of large-scale parallel processing systems. Such parallel processing systems provide large scalability but are mainly used for scientific applications that have large data parallelism. A shared memory multiprocessor system like TICOM is currently used as a server for commercial applications; however, shared memory multiprocessor systems are known to have very limited scalability. The proposed architecture supports the scalability and performance of a parallel processing system while providing adaptability for commercial applications, and hence it can overcome the limitations of the shared memory multiprocessor. The architecture and characteristics of the proposed system are described. A proprietary hierarchical crossbar network is designed for this system, whose protocol, routing and switching technique, and signal transfer technique are optimized for the proposed architecture. The design trade-offs for the network are described, and simulation using SES/workbench shows that the network fits the proposed architecture.

PERFORMANCE OF A KNIGHT TOUR PARALLEL ALGORITHM ON MULTI-CORE SYSTEM USING OPENMP

VIJAYAKUMAR SANGAMESVARAPPA;VIDYAATHULASIRAMAN
- Journal of applied mathematics & informatics / Vol. 41, No. 6 / pp. 1317-1326 / 2023
Today's computers, desktops, and laptops are built with multi-core architectures. Developing and running serial programs on a multi-core architecture wastes resources and time. Parallel programming is the only solution for proper utilization of the resources available in modern computers. The major challenges in the multi-core environment are the design of parallel algorithms and their performance analysis. This paper describes the design and performance analysis of a parallel algorithm implemented with the OpenMP interface, taking the Knight Tour problem as an example. The performance of the serial and parallel algorithms is compared. The comparison shows that the proposed parallel algorithm achieves good performance compared with the serial algorithm.
https://doi.org/10.14317/jami.2023.1317

Parallel Processing Architecture for Parity Checksum Generator Complying with ITU-T J.83 ANNEX B

이종엽;홍언표;하동수;임회정
- 한국통신학회논문지 / Vol. 34, No. 6C / pp. 619-625 / 2009
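This paper proposes a parallel architecture for the parity checksum generator used for packet synchronization and error detection in ITU-T Recommendation J.83 Annex B. The proposed parallel processing architecture removes the bottleneck that occurs in the conventional serial processing architecture and significantly reduces the processing time required to generate the parity checksum. Experimental results show that the proposed parallel processing architecture reduces the processing time by 83.1% at the cost of a 16% increase in area.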
KSCI

Transputer-based Pyramidal Parallel Array Computer (TPPAC) Architecture (Preliminary Version)

정창성;정철환
- 대한전기학회:학술대회논문집 / 대한전기학회 1988년도 전기.전자공학 학술대회 논문집 / pp. 647-650 / 1988
This paper proposes and sketches out a new parallel architecture, the transputer-based pyramidal parallel array computer (TPPAC), for computationally intensive problems in geometric processing applications such as computer vision and image processing. It explores how efficiently the pyramid computer architecture can be designed using transputer chips, and proposes a new interconnection scheme for TPPAC that requires no additional transputers.

Design of a Parallel Pipelined Processor Architecture

이상정;김광준
- 전자공학회논문지B / Vol. 32B, No. 3 / pp. 11-23 / 1995
In this paper, a parallel pipelined processor model that acts as a small VLIW processor architecture, and a scheduling algorithm for extracting instruction-level parallelism on this architecture, are proposed. The proposed model has a dual-instruction mode in which a maximum of 4 basic operations are executed in parallel. By combining these basic operations, a variable instruction set can be designed for various applications. The scheduling algorithm schedules basic operations for parallel execution and removes pipeline hazards by examining data dependency and resource conflict relations. To examine the operation and evaluate the performance, a C compiler and a simulator are developed. By simulating various test programs with the compiler and the simulator, the characteristics and performance of the proposed architecture are measured.

Implementation of High-Speed Reed-Solomon Decoder Using the Modified Euclid's Algorithm

김동선;최종찬;정덕진
- 대한전기학회논문지:전력기술부문A / Vol. 48, No. 7 / pp. 909-915 / 1999
In this paper, we propose an efficient VLSI architecture for a Reed-Solomon (RS) decoder. To improve the speed, we develop an architecture featuring parallel and pipelined processing. To implement this architecture, we analyze the RS decoding algorithm and Horner's algorithm for parallel processing, and we also modify Euclid's algorithm so that an efficient parallel structure can be applied in the RS decoder. To evaluate the proposed architecture, the performance of the proposed RS decoder is compared with Shao's: it achieves a 10% gain in area efficiency and is three times faster than Shao's time domain decoder. In addition, we implemented the proposed RS decoder on an Altera FPGA Flex10K-50.
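
As background for the abstract above: Horner's rule (assuming that is what the original text's "honor's algorithm" refers to) is commonly used in RS decoders to evaluate the received polynomial at field elements, for example when computing syndromes. The sketch below shows only the rule itself. For readability it uses integer arithmetic modulo a prime instead of the GF(2^m) arithmetic of a practical RS decoder, and the function name `horner_eval` is hypothetical, not taken from the paper.

```python
def horner_eval(coeffs, x, modulus):
    """Evaluate a polynomial at x using Horner's rule, modulo `modulus`.

    `coeffs` lists coefficients from the highest degree down to the
    constant term. Assumption: a prime-modulus integer field stands in
    for the GF(2^m) arithmetic used by real RS decoders.
    """
    acc = 0
    for c in coeffs:
        # One multiply-accumulate step per coefficient.
        acc = (acc * x + c) % modulus
    return acc


# 2*x^2 + 3*x + 5 at x = 4 is 49, which is 0 modulo 7.
print(horner_eval([2, 3, 5], 4, 7))  # -> 0
```

Each loop iteration corresponds to one multiply-accumulate step, which is the kind of unit that hardware decoders replicate or pipeline.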

An Efficient Interpolation Hardware Architecture for HEVC Inter-Prediction Decoding

Jin, Xianzhe;Ryoo, Kwangki
- Journal of information and communication convergence engineering / Vol. 11, No. 2 / pp. 118-123 / 2013
This paper proposes an efficient hardware architecture for high efficiency video coding (HEVC), the next-generation video compression standard. HEVC adopts several new coding techniques to reduce the bit rate by about 50% compared with the previous standard. Unlike the 6-tap interpolation filter of the previous H.264/AVC, HEVC adopts one-dimensional seven-tap and eight-tap filters for luma interpolation, which also increases the complexity and gate area of a hardware implementation. In this paper, we propose a parallel architecture to boost the interpolation performance, achieving a luma $4 \times 4$ block interpolation in 2-4 cycles. The proposed architecture contains shared operations that reduce the gate count increase caused by the parallel architecture. As a result, the area efficiency is better than that of the previous design, and in the best case the performance improves by about 75.15%. It is synthesized with the MagnaChip $0.18\,\mu\text{m}$ library and can reach a maximum frequency of 200 MHz.
https://doi.org/10.6109/jicce.2013.11.2.118 KSCI
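
To make the eight-tap luma filter mentioned in the abstract above concrete, here is a minimal one-dimensional sketch of half-sample interpolation. The tap values are the half-sample luma coefficients from the HEVC standard, not from the paper, and the code says nothing about the paper's parallel hardware design. The function name `interpolate_half_pel`, the edge clamping, and the single-pass 8-bit rounding are assumptions chosen for illustration.

```python
# Half-sample luma taps from the HEVC standard (they sum to 64).
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def interpolate_half_pel(row, pos):
    """Estimate the sample halfway between row[pos] and row[pos + 1].

    Assumptions: 8-bit samples, a single 1-D pass, and out-of-range
    positions clamped to the nearest edge sample.
    """
    acc = 0
    for k, tap in enumerate(HALF_PEL_TAPS):
        # The filter window covers row[pos - 3] .. row[pos + 4].
        idx = min(max(pos - 3 + k, 0), len(row) - 1)
        acc += tap * row[idx]
    value = (acc + 32) >> 6  # round and divide by 64
    return min(max(value, 0), 255)  # clip to the 8-bit range


print(interpolate_half_pel([0, 0, 0, 0, 255, 255, 255, 255], 3))  # -> 128
```

The eight multiply-accumulate operations per output sample are what make a parallel or shared-operation datapath attractive in hardware.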

Search results: 885 items, processing time: 0.025 seconds
