Integrated Search | Korea Science

Branch Predictor Design and Its Performance Evaluation for a High-Performance Embedded Microprocessor

이상혁;김일관;최린
- The Institute of Electronics Engineers of Korea: Conference Proceedings / Proceedings of the 2002 IEEK Summer Conference (2) / pp.129-132 / 2002
AE64000 is a 64-bit high-performance microprocessor that ADC Co. Ltd. is developing for embedded environments. It has a 5-stage pipeline and uses a Harvard architecture with separate instruction and data caches. It also provides SIMD-like DSP and FP operations by enabling 8/16/32/64-bit MAC operations on 64-bit registers. The AE64000 processor implements the EISC ISA and uses an instruction folding mechanism (Instruction Folding Unit) that effectively handles the LERI instruction in the EISC ISA. However, this unit complicates branch prediction. In this paper, we design a branch predictor optimized for the AE64000 pipeline and develop an AE64000 simulator with cycle-level precision to validate the performance of the designed branch predictor. We separate the TAC (target address cache) from the BPT (branch prediction table) for effective branch prediction and use a BPT (removed indexed) that has no address tags.
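
The split between a tagged target address cache and a tagless prediction table described in the abstract above can be illustrated with a minimal sketch. It is not the AE64000 design: the 2-bit saturating counters, the table sizes, the direct-mapped organization, the index function (which assumes 16-bit instruction alignment), and all class and function names are assumptions made for illustration.

```python
class TaglessBPT:
    """Branch prediction table without address tags (assumed 2-bit counters)."""

    def __init__(self, entries=256):
        assert entries > 0 and entries & (entries - 1) == 0, "power of two"
        self.mask = entries - 1
        self.counters = [1] * entries  # 1 = weakly not-taken

    def _index(self, pc):
        # Assumption: instructions are 16-bit aligned, so drop the lowest bit.
        return (pc >> 1) & self.mask

    def predict(self, pc):
        # No tag check: branches that share an index share a counter.
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


class TargetAddressCache:
    """Direct-mapped, tagged cache of branch target addresses (assumed layout)."""

    def __init__(self, entries=64):
        assert entries > 0 and entries & (entries - 1) == 0, "power of two"
        self.mask = entries - 1
        self.tags = [None] * entries
        self.targets = [0] * entries

    def lookup(self, pc):
        i = (pc >> 1) & self.mask
        return self.targets[i] if self.tags[i] == pc else None

    def insert(self, pc, target):
        i = (pc >> 1) & self.mask
        self.tags[i] = pc
        self.targets[i] = target


def predict_next_pc(bpt, tac, pc, fallthrough):
    """Redirect fetch only if the TAC knows the target and the BPT says taken."""
    target = tac.lookup(pc)
    if target is not None and bpt.predict(pc):
        return target
    return fallthrough
```

In this sketch, dropping the tags from the prediction table saves storage at the cost of aliasing between branches that map to the same entry, while the target cache keeps its tags so that a wrong target is never supplied for an unrelated address.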

A Design and Analysis of Web-Based Courseware for Word Processor

강윤희;이주홍;한선관
- Journal of The Korean Association of Information Education / Vol. 7, No. 2 / pp.189-197 / 2003
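Web-based instruction (WBI) has been confined to particular subjects because of the burden of developing teaching and learning materials. In this paper, we apply WBI to word processor classes and implement an Internet-based individualized teaching-learning system. Compared with traditional instruction, the WBI-based word processor class engaged students' interest more and enabled student-centered learning by letting students choose lessons by level, ability, and stage. In addition, individual learning assignments allowed learning content to be evaluated in real time with feedback, which maximized the learning effect.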

The design of a 32-bit Microprocessor for a Sequence Control using an Application Specification Integrated Circuit(ASIC) (ICEIC'04)

Oh Yang
- The Institute of Electronics Engineers of Korea: Conference Proceedings / Proceedings of the 2004 IEEK Conference / pp.486-490 / 2004
Programmable logic controllers (PLCs) are widely used in manufacturing systems and process control. This paper presents the design of a 32-bit microprocessor for sequence control using an application-specific integrated circuit (ASIC). The microprocessor was designed in VHDL with a top-down method; the program memory was separated from the data memory for high-speed execution of 274 specified sequence instructions, so that sequence instructions could be executed during the instruction fetch cycle. To reduce instruction decoding time and data memory interface time, the instruction code size was fixed at 32 bits. Real-time debugging functions such as single-step run and breakpoint run were implemented. Pulse instructions, step controllers, master controllers, BIN and BCD type arithmetic instructions, and barrel shift instructions were implemented because they are frequently used in PLC systems. The designed microprocessor was synthesized with the S1L50000 series from SEIKO EPSON, which contains 70,000 gates in 0.65 µm technology. Finally, a benchmark was performed to show that the designed 32-bit microprocessor outperforms the Q4A PLC from Mitsubishi Corporation.

A Study on Improvement of Low-power Memory Architecture in IoT/edge Computing

조두산
- Journal of the Korean Society of Industry Convergence / Vol. 24, No. 1 / pp.69-77 / 2021
Low-cost design methodologies are widely used for IoT devices. In such a networked device, memory is composed of flash memory, SRAM, DRAM, etc., and because the device processes a large amount of data, memory design is an important factor in system performance. Each device therefore selects optimized design factors such as function, performance, and cost according to market demand. The memory architectures available to low-cost IoT devices are very limited, being restricted to configurations of SRAM, flash memory, and DRAM. To process as much data as possible in the same space, an architecture that supports parallel processing units is usually provided. Such a parallel architecture provides high performance at low cost, but it requires precise software techniques for mapping instructions and data onto it. This paper proposes an instruction/data mapping method to support optimized parallel processing performance. The proposed method optimizes system performance by actively using hardware and software parallelism.
https://doi.org/10.21289/KSIC.2021.24.1.69

Fast NAND Flash Memory System for Instruction Code Execution

Jung, Bo-Sung;Kim, Cheong-Ghil;Lee, Jung-Hoon
- ETRI Journal / Vol. 34, No. 5 / pp.787-790 / 2012
The objective of this research is to design a high-performance NAND flash memory system containing a buffer system. The proposed instruction buffer in the NAND flash memory consists of two parts, that is, a fully associative temporal buffer for temporal locality and a fully associative spatial buffer for spatial locality. A spatial buffer with a large fetching size turns out to be effective for serial instructions, and a temporal buffer with a small fetching size is devised for branch instructions. Simulation shows that the average memory access time of the proposed system is better than that of other buffer systems with four times more space. The average miss ratio is improved by about 70% compared with that of other buffer systems.
https://doi.org/10.4218/etrij.12.0211.0472

Design of Vector Register Architecture in DSP Processor for Efficient Multimedia Processing

Wu, Chou-Pin;Wu, Jen-Ming
- JSTS: Journal of Semiconductor Technology and Science / Vol. 7, No. 4 / pp.229-234 / 2007
In this paper, we present an efficient instruction set architecture that uses vector register file hardware to accelerate general matrix-vector operations in a DSP microprocessor. The technique enables in-situ row access as well as column access to the register files, which can significantly reduce the number of memory accesses. It is especially useful for block-based video signal processing kernels such as FFT/IFFT, DCT/IDCT, and two-dimensional filtering. We have applied the new instruction set architecture to in-loop deblocking filter processing in an H.264 decoder. Performance comparisons show that the load/store operations required for the in-loop deblocking filter can be reduced by about 42%. The architecture would substantially improve processing speed and code density in DSP microprocessors, especially for video signal processing.
https://doi.org/10.5573/JSTS.2007.7.4.229

Development of a PBL Model and Analysis of Its Effect in Engineering Design Instruction

김성봉;홍효정
- Journal of the Korea Academia-Industrial cooperation Society / Vol. 11, No. 11 / pp.4310-4319 / 2010
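In the 21st century, as the transition to a knowledge and information society accelerates, creative problem-solving ability is emphasized above all for engineering students, and there are various approaches to developing it. Recognizing problem-based learning (PBL) as one effective alternative among these approaches, this study mainly aims to introduce the 'JPBL' model, which is based on existing PBL models but was developed independently to suit university engineering design courses and whose effect was verified in the classroom. The effect analysis showed statistically significant differences in class satisfaction and organizational commitment before and after the JPBL class, and these quantitative results were confirmed by the qualitative data analysis. Based on these findings, the significance and limitations of developing the JPBL model are discussed.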
https://doi.org/10.5762/KAIS.2010.11.11.4310

Design of a DI model-based Content Addressable Memory for Asynchronous Cache

Battogtokh, Jigjidsuren;Cho, Kyoung-Rok
- International Journal of Contents / Vol. 5, No. 2 / pp.53-58 / 2009
This paper presents a novel approach to the design of a CAM for an asynchronous cache. The architecture of the cache mainly consists of four units: control logic, content addressable memory, completion signal logic units, and instruction memory. The pseudo-DCVSL is useful for generating a completion signal, which serves as a reference for handshake control. The proposed CAM is a very simple extension of the basic circuitry that generates a completion signal based on the DI model. The cache has a 2.75KB CAM for 8KB of instruction memory. We designed and simulated the proposed asynchronous cache including the CAM. The results show that the cache hit ratio is up to 95% with a pseudo-LRU replacement policy.
https://doi.org/10.5392/IJoC.2009.5.2.053
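
The abstract above names a pseudo-LRU replacement policy without saying which variant is used. As a rough illustration only, the sketch below shows the common tree-based form for a single cache set; the tree variant, the 4-way default, and the names `TreePLRU`, `touch`, and `victim` are all assumptions rather than details from the paper.

```python
class TreePLRU:
    """Tree pseudo-LRU state for one set with a power-of-two number of ways."""

    def __init__(self, ways=4):
        assert ways >= 2 and ways & (ways - 1) == 0, "power of two"
        self.ways = ways
        # One bit per internal tree node, stored in heap order.
        # 0 means the next victim is in the left half, 1 means the right half.
        self.bits = [0] * (ways - 1)

    def touch(self, way):
        """Record an access: point every node on the path away from `way`."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if way < mid:
                self.bits[node] = 1          # accessed left, evict from right
                node, hi = 2 * node + 1, mid
            else:
                self.bits[node] = 0          # accessed right, evict from left
                node, lo = 2 * node + 2, mid

    def victim(self):
        """Follow the bits from the root to choose the way to replace."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.bits[node] == 0:
                node, hi = 2 * node + 1, mid
            else:
                node, lo = 2 * node + 2, mid
        return lo
```

This form needs only `ways - 1` bits per set instead of a full access ordering, which is why it is a frequent choice when true LRU bookkeeping is too costly.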

The Design of High Speed Bit and Word Processor

허재동;양오
- The Korean Institute of Electrical Engineers: Conference Proceedings / Proceedings of the 2002 KIEE Summer Conference D / pp.2534-2536 / 2002
This paper presents the design of a high-speed bit and word processor for sequence logic control using an FPGA. The FPGA can execute a sequence instruction during the program fetch cycle because the program memory is separated from the data memory, enabling high-speed execution at a 40 MHz clock. The processor has a set of 274 instructions with a fixed 32-bit width, which reduces instruction decoding time and data memory interface time. The design was synthesized for the V600EHQ240 using the Xilinx Foundation tool, and the final simulation was performed successfully in the Foundation simulation environment. The FPGA was programmed in VHDL for a 240-pin HQFP package. Finally, a benchmark was performed to show that the designed bit and word processor outperforms the Mitsubishi Q4A in sequence logic control.

Efficient Loop Accelerator for Motion Estimation Specific Instruction-set Processor

하재명;정호선;선우명훈
- Journal of the Institute of Electronics and Information Engineers / Vol. 50, No. 7 / pp.159-166 / 2013
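This paper proposes an efficient loop accelerator for a motion estimation specific instruction-set processor. In practice, motion estimation algorithms contain complex and varied loop instructions. To support efficient hardware loop instructions, this paper introduces four loop instructions and the corresponding hardware architecture. Verification results show that, for motion estimation using early termination, the proposed loop accelerator reduces the average number of instruction cycles by about 29% compared with a typical implementation that uses compare and conditional jump instructions. The proposed loop accelerator can significantly reduce the access frequency of program memory and save considerable power. Therefore, the proposed loop accelerator is suitable for low-power, flexible motion estimation.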
https://doi.org/10.5573/ieek.2013.50.7.159
