Search | Korea Science

An Implementation of a Memory Operation System Architecture for Memory Latency Penalty Reduction in SIMT Based Stream Processor (Memory Latency Penalty를 개선한 SIMT 기반 Stream Processor의 Memory Operation System Architecture 설계)

Lee, Kwang-Yeob
- Journal of IKEEE / v.18 no.3 / pp.392-397 / 2014
In this paper, we propose a memory operation system architecture that reduces the memory latency penalty in a SIMT-based stream processor. The proposed architecture applies a non-blocking cache to reduce the cache miss penalty incurred by a blocking cache. We verified that the proposed memory operation architecture improves the performance of the stream processor by comparing the processing performance of various algorithms, and we measured how the improvement varies with the ratio of memory instructions in each algorithm. As a result, we confirmed that the performance of the stream processor improves by a minimum of 8.2% and a maximum of 46.5%.
https://doi.org/10.7471/ikeee.2014.18.3.392

Trends in AI Computing Processor Semiconductors Including ETRI's Autonomous Driving AI Processor (인공지능 컴퓨팅 프로세서 반도체 동향과 ETRI의 자율주행 인공지능 프로세서)

Yang, J.M.;Kwon, Y.S.;Kang, S.W.
- Electronics and Telecommunications Trends / v.32 no.6 / pp.57-65 / 2017
Neural-network-based AI computing is a promising technology that mirrors human recognition and decision making. Early AI computing processors were composed of GPUs and CPUs; however, the dramatic increase in floating-point operations calls for an energy-efficient AI processor with a highly parallelized architecture. In this paper, we analyze the trends in processor architectures for AI computing. Some architectures are still built from GPUs, but they reduce the size of each processing unit by allowing half-precision operations and thereby raise the processing unit density. Other architectures concentrate on matrix multiplication and require dedicated hardware for fast vector operations. Finally, we propose our own inAB processor architecture and introduce domestic cutting-edge processor design capabilities.
https://doi.org/10.22648/ETRI.2017.J.320607

On the Implementation of the Digital Neuron Processor (디지탈 뉴런프로세서의 구현에 관한 연구)

홍봉화;이지영
- Journal of the Korea Society of Computer and Information / v.4 no.2 / pp.27-38 / 1999
This paper proposes a high-speed digital neuron processor that uses the residue number system, which makes high-speed operation possible without carry propagation. Consisting of a MAC (multiplier and accumulator) operation unit, a quotient operation unit, and a sigmoid function operation unit, the neuron processor is designed in a 0.8 $\mu$m CMOS process. The results show that the newly implemented neuron processor can run at a speed of 19.2 ns and that its size can be reduced to 1/2 of that of a neuron processor implemented with a real number operation unit.
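The carry-free property that the abstract attributes to the residue number system (RNS) can be illustrated with a minimal Python sketch. The moduli (7, 11, 13) and all function names here are assumptions chosen for illustration; the abstract does not state which moduli or unit interfaces the processor uses.

```python
from math import prod

# Hypothetical pairwise-coprime moduli (not taken from the paper).
# Values are representable uniquely in the range 0 .. 7*11*13 - 1 = 1000.
MODULI = (7, 11, 13)


def to_rns(x):
    """Encode a non-negative integer as a tuple of residues."""
    return tuple(x % m for m in MODULI)


def rns_add(a, b):
    """Add digit-wise; no carry moves between residue channels."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))


def rns_mul(a, b):
    """Multiply digit-wise; each channel is again independent."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))


def rns_mac(acc, a, b):
    """Multiply-accumulate: acc + a * b, computed per channel."""
    return rns_add(acc, rns_mul(a, b))


def from_rns(r):
    """Decode residues back to an integer with the Chinese remainder theorem."""
    big_m = prod(MODULI)
    total = 0
    for ri, m in zip(r, MODULI):
        mi = big_m // m
        total += ri * mi * pow(mi, -1, m)  # pow(..., -1, m) is the modular inverse
    return total % big_m


# Accumulate 3*4 + 5*6 + 7*8 entirely in residue form.
acc = to_rns(0)
for w, x in [(3, 4), (5, 6), (7, 8)]:
    acc = rns_mac(acc, to_rns(w), to_rns(x))
result = from_rns(acc)
```

Because each residue channel is updated independently, the channels can be evaluated in parallel with narrow arithmetic, which is the general reason RNS avoids long carry chains. The result is only meaningful while the true value stays below the product of the moduli.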

VLSI Design of Processor IP for TCP/IP Protocol Stack (TCP/IP프로토콜 스택 프로세서 IP의 VLSI설계)

최병윤;박성일;하창수
- Proceedings of the IEEK Conference / 2003.07b / pp.927-930 / 2003
In this paper, the design of a processor IP for the TCP/IP protocol stack is described. The processor consists of input and output buffer memories with a dual-bank structure, a 32-bit RISC microprocessor core, and a DMA unit with on-the-fly checksum capability. To handle the various modes of the TCP/IP protocol, a hardware and software co-design approach is used rather than the conventional state-machine-based design. To eliminate the delay due to data transfer and checksum operations, a DMA module that can execute the checksum operation on the fly along with the data transfer is adopted. By programming the on-chip code ROM of the RISC processor differently, the designed stack processor can support the packet format conversion operations required by the various TCP/IP protocols.
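The idea of computing the checksum on the fly during the data transfer can be sketched in software as follows. This is only an illustration under assumptions: it uses the standard 16-bit one's-complement Internet checksum, and the function name `copy_with_checksum` and the byte-buffer interface are hypothetical, not the paper's DMA design.

```python
def copy_with_checksum(src, dst):
    """Copy src (bytes) into dst (bytearray) and return the 16-bit
    one's-complement Internet checksum, accumulated during the copy."""
    total = 0
    for i in range(0, len(src), 2):
        chunk = src[i:i + 2]
        dst.extend(chunk)                 # the "transfer" step
        word = chunk[0] << 8              # big-endian 16-bit word
        if len(chunk) == 2:
            word |= chunk[1]              # odd trailing byte is padded with zero
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


# Sample bytes used for illustration.
data = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
buffer = bytearray()
checksum = copy_with_checksum(data, buffer)
```

Since the sum is updated as each word passes through, no second pass over the buffer is needed once the copy finishes, which is the delay the abstract says the DMA module removes.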

Implementation of GA Processor for Efficient Sequence Generation (효율적인 DNA 서열 생성을 위한 진화연산 프로세서 구현)

Jeon, Sung-Mo;Kim, Tae-Seon;Lee, Chong-Ho
- Proceedings of the KIEE Conference / 2003.11c / pp.376-379 / 2003
In DNA computing, operations on DNA sequences are carried out through biological experiments. Because biological experiments serve as the operators, they can cause illegal reactions through shifted, mismatched, or undesired hybridization of the DNA sequences. It is therefore essential to design DNA sequences that minimize these potential errors. This paper proposes a DNA sequence generation method based on an evolutionary operation processor. A genetic algorithm is used for the evolutionary operation, and dedicated hardware, namely a genetic algorithm processor, is implemented to handle the repeated evolutionary process, which takes much computation time. To show the efficiency of the proposed processor, we compare the fitness of randomly formed DNA sequences with that of DNA sequences formed by the genetic algorithm processor, and confirm an excellent result. The proposed genetic algorithm processor can reduce the time and expense of preparing the DNA sequences that are essential in DNA computing. It can also be applied to oligomer design for the development of DNA chips or oligo chips.

A Prototype of Three Dimensional Operations for GIS

Chi, Jeong-Hee;Lee, Jin-Yul;Kim, Dae-Jung;Ryu, Keun-Ho;Kim, Kyong-Ho
- Proceedings of the KSRS Conference / 2002.10a / pp.880-884 / 2002
With the development of computer technology, especially in 3D graphics and visualization, interest in 3D GIS has been increasing. Several commercial GIS software packages are ready to provide 3D functions in their traditional 2D GIS. However, most of these systems focus on the visualization of 3D objects and support few analysis functions. Therefore, in this paper, we design a spatial operation processor that supports spatial analysis functions as well as 3D visualization, and implement a prototype to operate them. To support interoperability with existing models, the proposed spatial operation processor supports 3D spatial operations based on a 3D geometry object model, designed as an extension of the 2D geometry model of the OGIS consortium, and supports an index based on the R$^*$-Tree. The proposed spatial operation processor can be applied in 3D GIS to support 3D analysis functions.

Implementation of an ASIP for acceleration SAD operation (SAD 연산의 가속을 위한 멀티미디어 코프로세서 구현)

Jo, Jung-Hyun;Jeong, Ha-Young
- Proceedings of the IEEK Conference / 2006.06a / pp.809-810 / 2006
The H.264 algorithm is commonly used for video compression applications. This algorithm requires a large amount of computation, for example, the sum of absolute differences (SAD) operation. We analyzed H.264 reference encoding workloads and found that SAD operations make up 8.78% of the H.264 encoding program. The SAD operation sums 16 difference values in an H.264 $4{\times}4$ sub-block. To accelerate SAD operations, we implemented an application-specific instruction-set processor (ASIP) that can execute SAD and data transfer instructions. The proposed coprocessor has an absolute value generator and a carry save adder (CSA) unit that sum 8 difference values per clock cycle, so a SAD operation completes in 2 clock cycles. Experimental results show that total execution time is improved by 34%.
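As a minimal sketch of the SAD operation described above, the following Python function sums the 16 absolute differences of a 4x4 sub-block in two passes of 8 values, loosely mirroring the two clock cycles mentioned in the abstract. The function name `sad_4x4` and the list-of-rows block layout are assumptions for illustration; the coprocessor's actual datapath is not modeled.

```python
def sad_4x4(cur, ref):
    """Sum of absolute differences between two 4x4 blocks (lists of 4 rows)."""
    a = [p for row in cur for p in row]
    b = [p for row in ref for p in row]
    if len(a) != 16 or len(b) != 16:
        raise ValueError("both blocks must contain exactly 16 pixels")

    total = 0
    # Two passes of 8 differences each, analogous to 8 values per cycle.
    for start in (0, 8):
        total += sum(abs(x - y)
                     for x, y in zip(a[start:start + 8], b[start:start + 8]))
    return total


current_block = [[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12],
                 [13, 14, 15, 16]]
reference_block = [[8] * 4 for _ in range(4)]
sad = sad_4x4(current_block, reference_block)
```

In motion estimation, a lower SAD indicates a better match between the current block and a candidate reference block, which is why an encoder evaluates this operation so often.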

Simulation on a test vector Implementation of a pipeline processor using a HDL (HDL을 이용한 파이프라인 프로세서의 테스트 벡터 구현에 의한 시뮬레이션)

박두열
- Journal of the Korea Society of Computer and Information / v.5 no.3 / pp.16-28 / 2000
In this paper, we describe a pipeline processor at the functional level using an HDL, and simulate and verify its operation. To simulate the implemented processor, we first specify the assembly instructions to be executed on the processor and enter them, as a program written with the instruction set, into the experimental framework. The procedure presented in this paper thus makes it easy to identify and verify the purpose and operation of a system implementation by using test vectors. It also made accurate simulation of the system possible, and the method was convenient for documenting the operation of the system to be implemented.

Implementation of a 3D Graphics Hardwired T&L Accelerator based on a SoC Platform for a Mobile System (SoC 플랫폼 기반 모바일용 3차원 그래픽 Hardwired T&L Accelerator 구현)

Lee, Kwang-Yeob;Koo, Yong-Seo
- Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.9 / pp.59-70 / 2007
In this paper, we propose an effective T&L (Transform & Lighting) processor architecture for a real-time 3D graphics acceleration SoC (System on a Chip) in a mobile system. We designed floating-point arithmetic IPs for the T&L processor and verified them using a SoC platform. The designed T&L processor uses a 24-bit floating-point data format and a 16-bit fixed-point data format, and supports a pipeline that keeps the Transform and Lighting processes balanced by exploiting the parallel computation of 3D graphics. The delay of the pipeline when processing only the Transform operation is almost the same as the delay when processing both the Transform and Lighting operations. The designed T&L processor was implemented and verified on a SoC platform. It operates at 80 MHz on a Xilinx Virtex-4 FPGA, and its measured processing speed is 20M vertices/sec.

An Implementation of Digital Neural Network Using Systolic Array Processor (영어 수계를 이용한 디지털 신경망회로의 실현)

윤현식;조원경
- Journal of the Korean Institute of Telematics and Electronics B / v.30B no.2 / pp.44-50 / 1993
In this paper, we present an array processor for the implementation of digital neural networks. The back-propagation model can be formulated as a consecutive matrix-vector multiplication problem with a prespecified thresholding operation. This procedure is well suited to an array processor because it can be executed recursively and repeatedly. A systolic array circuit architecture using the residue number system is suggested to realize an efficient arithmetic circuit for matrix-vector multiplication and to compute the sigmoid function. The proposed design method is expected to be adopted in neural network applications, because it can be realized with currently available VLSI technology.
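The formulation of a network as consecutive matrix-vector multiplications followed by a thresholding operation can be sketched in Python as below. This shows only a forward pass in ordinary floating point, with the sigmoid as the thresholding function; the function names and the nested-list weight layout are assumptions, and neither the systolic array nor the residue number system arithmetic of the paper is modeled.

```python
import math


def sigmoid(z):
    """Smooth thresholding function applied to each neuron's weighted sum."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, inputs):
    """One layer: matrix-vector product, then element-wise thresholding.

    weights is a list of rows, one row of input weights per output neuron.
    """
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)))
            for row in weights]


def network_forward(layers, inputs):
    """Apply the layers one after another (consecutive matrix-vector products)."""
    activations = inputs
    for weights in layers:
        activations = layer_forward(weights, activations)
    return activations


# Hypothetical 2-input, 2-hidden, 1-output network with made-up weights.
hidden = [[1.0, -1.0],
          [0.5, 0.5]]
output = [[2.0, -2.0]]
y = network_forward([hidden, output], [2.0, 2.0])
```

Every layer repeats the same multiply-accumulate pattern over a row of weights, which is the regularity that makes this computation a natural fit for an array of identical processing elements.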
