Search | Korea Science

Parallel Branch Instruction Extension for Thumb-2 Instruction Set Architecture (Thumb-2 명령어 집합 구조의 병렬 분기 명령어 확장)

Kim, Dae-Hwan
- Journal of the Korea Society of Computer and Information
- /
- v.18 no.7
- /
- pp.1-10
- /
- 2013
In this paper, the parallel branch instruction is proposed which executes a branch instruction and the frequently used instruction simultaneously to improve the performance of Thumb-2 instruction set architecture. In the proposed approach, new 32-bit parallel branch instructions are introduced which combine 16-bit branch instruction with each of the frequently used 16-bit LOAD, ADD, MOV, STORE, and SUB instructions, respectively. To provide the encoding space of the new instructions, the register field in less frequently executed instructions is reduced, and the new instructions are encoded by using the saved bits. Experiments show that the proposed approach improves performance by an average of 8.0% when compared to the conventional approach.
https://doi.org/10.9708/jksci.2013.18.7.001 인용 PDF KSCI

Design of an Instruction Fetch Unit for RAPTOR, a On-Chip Multiprocessor (RAPTOR의 명령어 페치 유닛 설계)

이성권;오형철이상원한우종
- Proceedings of the IEEK Conference
- /
- 1998.10a
- /
- pp.767-770
- /
- 1998
This paper introduces an instruction fetch unit which is designed for RAPTOR, an on-chip multiprocessor. In order to reduce control hazards, the proposed fetch unit supports a hybrid branch prediction scheme which consists of a static scheme and the 2bC branch prediction scheme. The fetch unit also utilizes the branch folding technique with two instruction buffers to avoid the branch penalty caused by imspredictions. Instructions are predecoded in the fetch unit to achieve extra performance gain.
PDF

Analytical Models of Instruction Fetch on Superscalar Processors

Kim, Sun-Mo;Jung, Jin-Ha;Park, Sang-Bang
- Proceedings of the IEEK Conference
- /
- 2000.07b
- /
- pp.619-622
- /
- 2000
This research presents an analytical model to predict the instruction fetch rate on superscalar Processors. The proposed model is also able to analyze the performance relationship between cache miss and branch prediction miss. The proposed model takes into account various kind of architectural parameters such as branch instruction probability, cache miss rate, branch prediction miss rate, and etc.. To prove the correctness of the proposed model, we performed extensive simulations and compared the results with those of the analytical models. Simulation results showed that the pro-posed model can estimate the instruction fetch rate accurately within 10% error in most cases. The model is also able to show the effects of the cache miss and branch prediction miss on the performance of instruction fetch rate, which can provide an valuable information in designing a balanced system.
PDF

Accurate Prediction of Polymorphic Indirect Branch Target (간접 분기의 타형태 타겟 주소의 정확한 예측)

백경호;김은성
- Journal of the Institute of Electronics Engineers of Korea CI
- /
- v.41 no.6
- /
- pp.1-11
- /
- 2004
Modern processors achieve high performance exploiting avaliable Instruction Level Parallelism(ILP) by using speculative technique such as branch prediction. Traditionally, branch direction can be predicted at very high accuracy by 2-level predictor, and branch target address is predicted by Branch Target Buffer(BTB). Except for indirect branch, each of the branch has the unique target, so its prediction is very accurate via BTB. But because indirect branch has dynamically polymorphic target, indirect branch target prediction is very difficult. In general, the technique of branch direction prediction is applied to indirect branch target prediction, and much better accuracy than traditional BTB is obtained for indirect branch. We present a new indirect branch target prediction scheme which combines a indirect branch instruction with its data dependent register of the instruction executed earlier than the branch. The result of SPEC benchmark simulation which are obtained on SimpleScalar simulator shows that the proposed predictor obtains the most perfect prediction accuracy than any other existing scheme.
PDF KSCI

Early Start Branch Prediction to Resolve Prediction Delay (분기 명령어의 조기 예측을 통한 예측지연시간 문제 해결)

Kwak, Jong-Wook;Kim, Ju-Hwan
- The KIPS Transactions:PartA
- /
- v.16A no.5
- /
- pp.347-356
- /
- 2009
Precise branch prediction is a critical factor in the IPC Improvement of modern microprocessor architectures. In addition to the branch prediction accuracy, branch prediction delay have a profound impact on overall system performance as well. However, it tends to be overlooked when the architects design the branch predictor. To tolerate branch prediction delay, this paper proposes Early Start Prediction (ESP) technique. The proposed solution dynamically identifies the start instruction of basic block, called as Basic Block Start Address (BB_SA), and the solution uses BB_SA when predicting the branch direction, instead of branch instruction address itself. The performance of the proposed scheme can be further improved by combining short interval hiding technique between BB_SA and branch instruction. The simulation result shows that the proposed solution hides prediction latency, with providing same level of prediction accuracy compared to the conventional predictors. Furthermore, the combination with short interval hiding technique provides a substantial IPC improvement of up to 10.1%, and the IPC is actually same with ideal branch predictor, regardless of branch predictor configurations, such as clock frequency, delay model, and PHT size.
https://doi.org/10.3745/KIPSTA.2009.16A.5.347 인용 PDF KSCI

Branch Predictor Design and Its Performance Evaluation for A High Performance Embedded Microprocessor (고성능 내장형 마이크로프로세서를 위한 분기예측기의 설계 및 성능평가)

Lee, Sang-Hyuk;Kim, Il-Kwan;Choi, Lynn
- Proceedings of the IEEK Conference
- /
- 2002.06b
- /
- pp.129-132
- /
- 2002
AE64000 is the 64-bit high-performance microprocessor that ADC Co. Ltd. is developing for an embedded environment. It has a 5-stage pipeline and uses Havard architecture with a separated instruction and data caches. It also provides SIMD-like DSP and FP operation by enabling the 8/16/32/64-bit MAC operation on 64-bit registers. AE64000 processor implements the EISC ISA and uses the instruction folding mechanism (Instruction Folding Unit) that effectively deals with LERI instruction in EISC ISA. But this unit makes branch prediction behavior difficult. In this paper, we designs a branch predictor optimized for AE64000 Pipeline and develops a AES4000 simulator that has cycle-level precision to validate the performance of the designed branch predictor. We makes TAC(Target address cache) and BPT(branch prediction table) seperated for effective branch prediction and uses the BPT(removed indexed) that has no address tags.
PDF

A Wide-Window Superscalar Microprocessor Profiling Performance Model Using Multiple Branch Prediction (대형 윈도우에서 다중 분기 예측법을 이용하는 수퍼스칼라 프로세서의 프로화일링 성능 모델)

Lee, Jong-Bok
- The Transactions of The Korean Institute of Electrical Engineers
- /
- v.58 no.7
- /
- pp.1443-1449
- /
- 2009
This paper presents a profiling model of a wide-window superscalar microprocessor using multiple branch prediction. The key idea is to apply statistical profiling technique to the superscalar microprocessor with a wide instruction window and a multiple branch predictor. The statistical profiling data are used to obtain a synthetical instruction trace, and the consecutive multiple branch prediction rates are utilized for running trace-driven simulation on the synthesized instruction trace. We describe our design and evaluate it with the SPEC 2000 integer benchmarks. Our performance model can achieve accuracy of 8.5 % on the average.
PDF KSCI

Design of a G-Share Branch Predictor for EISC Processor

Kim, InSik;Jun, JaeYung;Na, Yeoul;Kim, Seon Wook
- IEIE Transactions on Smart Processing and Computing
- /
- v.4 no.5
- /
- pp.366-370
- /
- 2015
This paper proposes a method for improving a branch predictor for the extendable instruction set computer (EISC) processor. The original EISC branch predictor has several shortcomings: a small branch target buffer, absence of a global history, a one-bit local branch history, and unsupported prediction of branches following LERI, which is a special instruction to extend an immediate value. We adopt a G-share branch predictor and eliminate the existing shortcomings. We verified the new branch predictor on a field-programmable gate array with the Dhrystone benchmark. The newly proposed EISC branch predictor also accomplishes higher branch prediction accuracy than a conventional branch predictor.
https://doi.org/10.5573/IEIESPC.2015.4.5.366 인용 PDF KSCI

A Branch Target Buffer Using Shared Tag Memory with TLB (TLB 태그 공유 구조의 분기 타겟 버퍼)

Lee, Yong-Hwan
- Proceedings of the Korean Institute of Information and Commucation Sciences Conference
- /
- v.9 no.2
- /
- pp.899-902
- /
- 2005
Pipeline hazard due to branch instructions is the major factor of the degradation on the performance of microprocessors. Branch target buffer predicts whether a branch will be taken or not and supplies the address of the next instruction on the basis of that prediction. If the branch target buffer predicts correctly, the instruction flow will not be stalled. This leads to the better performance of microprocessor. In this paper, the architecture of a tag memory that branch target buffer and TLB can share is presented. Because the two tag memories used for branch target buffer and TLB each is replaced by single shared tag memory, we can expect the smaller ship size and the faster prediction. This hared tag architecture is more advantageous for the microprocessors that uses more bits of address and exploits much more instruction level parallelism.
PDF

A Combined BTB Architecture for effective branch prediction (효율적인 분기 예측을 위한 공유 구조의 BTB)

Lee Yong-hwan
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.9 no.7
- /
- pp.1497-1501
- /
- 2005
Branch instructions which make the sequential instruction flow changed cause pipeline stalls in microprocessor. The pipeline hazard due to branch instructions are the most serious problem that degrades the performance of microprocessors. Branch target buffer predicts whether a branch will be taken or not and supplies the address of the next instruction on the basis of that prediction. If the hanch target buffer predicts correctly, the instruction flow will not be stalled. This leads to the better performance of microprocessor. In this paper, the architecture of a ta8 memory that branch target buffer and TLB can share is presented. Because the two tag memories used for branch target buffer and TLB each is replaced by single combined tag memory, we can expect the smaller chip size and the faster prediction. This shared tag architecture is more advantageous for the microprocessors that uses more bits of address and exploits much more instruction level parallelism.
PDF KSCI

Search Result 73, Processing Time 0.031 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)