Search | Korea Science

Design of an ALU for SMT Microprocessors (SMT 마이크로프로세서에 적합한 ALU의 설계)

김상철;홍인표;이용석
- Proceedings of the IEEK Conference / 2003.07d / pp.1383-1386 / 2003
In this paper, an ALU for Simultaneous Multi-Threading (SMT) microprocessors is designed. Compared with conventional superscalar architectures, the SMT architecture notably improves performance and processor utilization by executing instructions from multiple threads at the same time. The ALU adopts a data bypassing method to process multiple threads. It can also flush the instructions of a thread that raises an exception, such as a branch misprediction or an interrupt. With data bypassing and exception handling, the performance of SMT microprocessors can be improved.

Analysis and Application of Performance Improvement of a Real-time Simulation Visualization based on Multi-thread Pipelining Parallel Processing (다중 스레드 파이프라인 병렬처리를 통한 실시간 시뮬레이션 시각화의 성능 향상 해석 및 적용)

Lee, Jun Hee;Song, Hee Kang;Kim, Tag Gon
- Journal of the Korea Society for Simulation / v.26 no.3 / pp.13-22 / 2017
This research proposes and applies a pipelined parallel processing technique to speed up the visualization of real-time simulation results. In general, a simulation with real-time visualization consists of three processes: executing the simulation model, transmitting the simulation results, and visualizing the simulation results. If these processes run serially, the latency from simulation to visualization becomes very long, which slows the visualization of data from the real-time simulation. The main purpose of this research is therefore to maximize performance by applying pipelined parallel processing to real-time simulation visualization. We also show that performance improves further when multi-threading is added to each process. This paper proposes a theoretical performance model, presents simulation results for the techniques, and applies them to an air combat simulation model as a case study. The results show that performance is greatly improved relative to the original model's execution time.
https://doi.org/10.9709/JKSS.2017.26.3.013

Back-Office Process Agents and Reference Construction Framework for Internet Shopping Malls (인터넷 쇼핑몰 운영을 위한 후방 프로세스 에이전트와 참조 구축 프레임웍)

박광호
- Journal of Intelligence and Information Systems / v.5 no.1 / pp.167-186 / 1999
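Internet retailing fundamentally aims to generate a large volume of transactions. This paper defines internal process agents for operating Internet shopping malls, a representative form of Internet retailing, and presents a reference construction framework for them. We analyzed the back-office processes of Internet shopping malls and, on that basis, defined various types and characteristics of operational-level process agents. We also present the organization and operating principles of a process agent team composed of multiple agents. Multi-threading was used to implement the agents. Research on operational-level process agents, which handle simple data processing, is expected to develop into research on strategic-level process agents with more complex intelligence.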

Performance Evaluation of a Simultaneous MultiThreading (동시 다중 쓰레딩을 이용하는 마이크로 프로세서 성능평가)

이정훈;오영은;박형우;김진석
- Proceedings of the Korean Information Science Society Conference / 2003.04a / pp.1-3 / 2003
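As a way to increase processor efficiency, much research has been conducted on SMT technology, which can execute independent threads simultaneously within a single processor cycle [1, 2, 3, 4]. Because many of these studies measured SMT performance at the simulation level, SMT performance also needs to be measured in a real environment. In this paper, we measured performance in a real environment by running various benchmarks directly on a processor that implements SMT, and we showed experimentally, through comparison with conventional SMP, how much performance SMT can actually deliver.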

A Processor Architecture for 802.11 Wireless LAN Environment (802.11 Wireless LAN 환경에 적합한 프로세서 구조)

전성재;홍인표;이용주;이용석;정진우
- Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.550-552 / 2004
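With the recent popularity of mobile products such as mobile phones, PDAs, and notebooks, consumer interest in mobile devices is growing, and small personal portable products are growing more noticeably than large network equipment. In line with this trend, interest in wireless LAN is also increasing. In this paper, we studied a network processor architecture suited to the 802.11 wireless LAN environment, based on an existing ARM processor. The results show that, in a wireless LAN environment where transmission and reception frequently occur at the same time, a processor architecture capable of multi-threading (SMT) delivers a larger performance improvement than a superscalar architecture.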

Efficient Parallel Processing for Depth-Map Estimation in Real-Time (실시간 깊이 지도 획득을 위한 효율적인 병렬 처리)

Cho, Chil-Suk;Jun, Ji-In;Choo, Hyun-Gon;Park, Jong-Il
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2012.07a / pp.44-46 / 2012
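One commonly used method of obtaining a depth map relies on stripe patterns. This method uses a projector-camera (Pro-Cam) system: a pattern cast by the projector is captured by the camera, and the depth map is derived from the geometric relationship between the original pattern and the captured pattern. In this paper, we propose processing such a structured-light depth map acquisition system in real time through effective use of multi-threading. Commonly used multi-threading techniques include OpenMP, which uses CPU threads, and CUDA, which uses GPU threads. Because the two techniques differ in how they execute, OpenMP is more efficient for some parts and CUDA for others, depending on the situation. We therefore used, for each part, whichever multi-threading technique was more efficient for its characteristics. As a result, we obtained depth maps at 25 fps or more for $1280{\times}800$ images.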

GPU-Based Acceleration of Quantum-Inspired Evolutionary Algorithm (GPU를 이용한 Quantum-Inspired Evolutionary Algorithm 가속)

Ryoo, Ji-Hyun;Park, Han-Min;Choi, Ki-Young
- Journal of the Institute of Electronics Engineers of Korea SD / v.49 no.8 / pp.1-9 / 2012
Quantum-Inspired Evolutionary Algorithm (QEA) contains sufficient data-level parallelism to be naturally accelerated on GPUs. For an efficient reduction of execution time, however, careful task mapping is needed to properly reflect the characteristics of the CPU and the GPU. Furthermore, when deciding which part of the application should run on the GPU, we need to consider the data transfer between CPU and GPU memory spaces as well as the data-level parallelism. In addition, the use of zero-copy host memory, a proper choice of execution configuration, and a thread organization that takes memory coalescing into account are important for further reducing the execution time. With all these techniques, we could run QEA 3.69 times faster on average than a multi-threaded CPU implementation for the 0-1 knapsack problem with 30,000 items.

Development of the software for high speed data transfer of the high-speed, large capacity data archive system for the storage of the correlation data from Korea-Japan Joint VLBI Correlator (KJJVC)

Park, Sun-Youp;Kang, Yong-Woo;Roh, Duk-Gyoo;Oh, Se-Jin;Yeom, Jae-Hwan;Sohn, Bong-Won;Yukitoshi, Kanya;Byun, Do-Young
- Bulletin of the Korean Space Science Society / 2008.10a / pp.37.2-37.2 / 2008
Korea-Japan Joint VLBI Correlator (KJJVC), to be used for the Korean VLBI Network (KVN) at the Korea Astronomy & Space Science Institute (KASI), is a high-speed calculator that outputs correlation results at a maximum rate of 1.4 GB/sec. To receive and record this data at that rate without loss, the design of the software that runs on the data archive system and receives and records the correlator output is very important. However, a simple single-threaded program that alternately receives data from the network and records it can cause a bottleneck when processing high-speed data, may lose data, and cannot take advantage of hardware that supports multiple cores or hyper-threading, or of operating systems that support such hardware. In this talk we summarize the design of the data transfer software for KJJVC and the high-speed, large-capacity data archive system, which uses general socket programming and multi-threading techniques, and the pre-BMT (Bench Marking Test) results from tests of the storage product providers' proposals using this software.

Implementation of a Scoreboard Array and a Port Arbiter for In-order SMT Processors (순차적 SMT Processor를 위한 Scoreboard Array와 포트 중재 모듈의 구현)

Heo, Chang-Yong;Hong, In-Pyo;Lee, Yong-Surk
- Journal of the Institute of Electronics Engineers of Korea SD / v.41 no.6 / pp.59-70 / 2004
SMT (Simultaneous Multi Threading) architecture uses TLP (Thread Level Parallelism) to increase processor throughput by filling issue slots with instructions from multiple independent threads. Having multiple ready threads reduces the probability that a functional unit is left idle, which increases processor efficiency. To exploit these advantages, the issue unit of an SMT processor must control the flow of instructions from different threads without creating conflicts among them, which makes the SMT issue logic extremely complex. The SMT architecture modeled in this paper therefore uses an in-order issue and completion scheme, so it can use a simple issue mechanism with a scoreboard array instead of register renaming or a reorder buffer. However, an SMT scoreboarding mechanism is still more complex and costlier than that of a conventional single-threaded processor. This paper proposes an optimal implementation of a scoreboarding mechanism for an ARM-based SMT architecture.

Real-time H.264/AVC High 4:4:4 Predictive Decoder Using Multi-Thread and SIMD Instructions (멀티쓰레드와 SIMD 명령어를 이용한 실시간 H.264/AVC High 4:4:4 Predictive 디코더의 구현)

Kim, Yong-Hwan;Kim, Je-Woo;Choi, Byeong-Ho;Lee, Seok-Pil;Paik, Joon-Ki
- 한국정보통신설비학회:학술대회논문집 / 2007.08a / pp.350-353 / 2007
This paper presents a real-time implementation of an H.264/AVC High 4:4:4 Predictive profile decoder on general-purpose processors that exploits multi-threading and Single Instruction Multiple Data (SIMD) instructions without any quality degradation. We analyze the differences between the existing High profile decoder and the High 4:4:4 Predictive profile decoder, and show various optimization techniques for decoding high-fidelity, high-definition (HD) video in real time. Simulation results show that the proposed decoder can play high-fidelity HD video at an average of 40 frames per second (fps) for the IBBrBP bitstream and about 50 fps for the Intra-only bitstream.

Search Result 80, Processing Time 0.025 seconds
