• Title/Summary/Keyword: Processor

Search Result 4,827, Processing Time 0.025 seconds

Learning Module Design for Neural Network Processor(ERNIE) (신경회로망칩(ERNIE)을 위한 학습모듈 설계)

  • Jung, Je-Kyo;Kim, Yung-Joo;Dong, Sung-Soo;Lee, Chong-Ho
    • Proceedings of the KIEE Conference
    • /
    • 2003.11b
    • /
    • pp.171-174
    • /
    • 2003
  • In this paper, a Learning module for a reconfigurable neural network processor(ERNIE) was proposed for an On-chip learning. The existing reconfigurable neural network processor(ERNIE) has a much better performance than the software program but it doesn't support On-chip learning function. A learning module which is based on Back Propagation algorithm was designed for a help of this weak point. A pipeline structure let the learning module be able to update the weights rapidly and continuously. It was tested with five types of alphabet font to evaluate learning module. It compared with C programed neural network model on PC in calculation speed and correctness of recognition. As a result of this experiment, it can be found that the neural network processor(ERNIE) with learning module decrease the neural network training time efficiently at the same recognition rate compared with software computing based neural network model. This On-chip learning module showed that the reconfigurable neural network processor(ERNIE) could be a evolvable neural network processor which can fine the optimal configuration of network by itself.

  • PDF

A Parallel Processor System for Cultural Assets Image Retrieval (문화재 검색을 위한 병렬처리기 구조)

  • Yoon, Hee-Jun;Lee, Hyung;Han, Ki-Sun;Partk, Jong-Won
    • Journal of Korea Multimedia Society
    • /
    • v.1 no.2
    • /
    • pp.154-161
    • /
    • 1998
  • This paper proposes a parallel processor system which processes cultural assets image recognition and retrieval algorithm in real time. A serial algorithm which is developed for the parallel processor system is parallellized. The parallel processor system consists of a control unit, 100 PE(Processing Elements), and 10 Park's multi-access memory systems which has 11 memory modules per each one. The parallel processor system is simulated by CADENCE Verilog-XL which is a package for the hardware simulation. With the same simulated results as that of the serial algorithm, the speed ratio of the parallel algorithm to the serial one is 81. The parallel processor system we proposed is quite effective for cultural assets image processing.

  • PDF

MXTM-CFAR Processor and Its Performance Analysis (MXTM-CFAR 처리기와 그 성능분석)

  • 김재곤;김응태;송익호;김형명
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.17 no.7
    • /
    • pp.719-729
    • /
    • 1992
  • An improved MXTM (maximum trimmed mean) -CFAR (constant false alarm rate) processor is proposed to reduce false alarm rates In detecting radar targets and Its performance character is ticsare analyzed to be compared with those of other CFAR processors. The proposed MXTM-CFAR processor is obtained by combining the GO (greatest of ) -CFAR processor reducing excessive falsealarm rate at riutter edges with the TM-CFAR processor showing good performances In homo-geneous Jnonhornog eneous background. Performance analyses have been done by computing detection probability, constant false alarm rate and detection thresholds under the homogeneous or multiple target environments and at the clutter edges. Analysis results how that the proposed CFAR processor maintains its performance as good as those of,05(order statistics) and TM-CFAR inhomogeneous and multiple target environments and Can reduce the false alarm rate at clutter edges. Overall computing time hfs been also reduced.

  • PDF

Design and Implementation of Content Switching Network Processor and Scalable Switch Fabric

  • Chang, You-Sung;Yi, Ju-Hwan;Oh, Hun-Seung;Lee, Seung-Wang;Kang, Moo-Kyung;Chun, Jung-Bum;Lee, Jun-Hee;Kim, Jin-Seok;Kim, Sang-Ho;Jung, Hee-Jae;Hong, Il-Sung;Kim, Yong-Hwan;Lee, Yu-Sik;Kyung, Chong-Min
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.3 no.4
    • /
    • pp.167-174
    • /
    • 2003
  • This paper proposes a network processor especially optimized for content switching. With 2Gbps port capability, it integrates packet processor cluster, content-based classification engine and traffic manager on a single chip. A switch fabric architecture is also designed for scale-up of the network processor's capability over hundreds gigabit bandwidth. Applied in real network systems, the network processor shows wire-speed network address translator (NAT) and content-based switching performance.

An Implementation of SoC FPGA-based Real-time Object Recognition and Tracking System (SoC FPGA 기반 실시간 객체 인식 및 추적 시스템 구현)

  • Kim, Dong-Jin;Ju, Yeon-Jeong;Park, Young-Seak
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.10 no.6
    • /
    • pp.363-372
    • /
    • 2015
  • Recent some SoC FPGA Releases that integrate ARM processor and FPGA fabric show better performance compared to the ASIC SoC used in typical embedded image processing system. In this study, using the above advantages, we implement a SoC FPGA-based Real-Time Object Recognition and Tracking System. In our system, the video input and output, image preprocessing process, and background subtraction processing were implemented in FPGA logics. And the object recognition and tracking processes were implemented in ARM processor-based programs. Our system provides the processing performance of 5.3 fps for the SVGA video input. This is about 79 times faster processing power than software approach based on the Nios II Soft-core processor, and about 4 times faster than approach based the HPS processor. Consequently, if the object recognition and tracking system takes a design structure combined with the FPGA logic and HPS processor-based processes of recent SoC FPGA Releases, then the real-time processing is possible because the processing speed is improved than the system that be handled only by the software approach.

Easily Adaptable On-Chip Debug Architecture for Multicore Processors

  • Xu, Jing-Zhe;Park, Hyeongbae;Jung, Seungpyo;Park, Ju Sung
    • ETRI Journal
    • /
    • v.35 no.2
    • /
    • pp.301-310
    • /
    • 2013
  • Nowadays, the multicore processor is watched with interest by people all over the world. As the design technology of system on chip has developed, observing and controlling the processor core's internal state has not been easy. Therefore, multicore processor debugging is very difficult and time-consuming. Thus, we need a reliable and efficient debugger to find the bugs. In this paper, we propose an on-chip debug architecture for multicore processors that is easily adaptable and flexible. It is based on the JTAG standard and supports monitoring mode debugging, which is different from run-stop mode debugging. Compared with the debug architecture that supports the run-stop mode debugging, the proposed architecture is easily applied to a debugger and has the advantage of having a desirable gate count and execution cycle. To verify the on-chip debug architecture, it is applied to the debugger of the prototype multicore processor and is tested by interconnecting it with a software debugger based on GDB and configured for the target processor.

An Industrial Case Study of the ARM926EJ-S Power Modeling

  • Kim, Hyun-Suk;Kim, Seok-Hoon;Lee, Ik-Hwan;Yoo, Sung-Joo;Chung, Eui-Young;Choi, Kyu-Myung;Kong, Jeong-Taek;Eo, Soo-Kwan
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.5 no.4
    • /
    • pp.221-228
    • /
    • 2005
  • In this work, our goal is to develop a fast and accurate power model of the ARM926EJ-S processor in the industrial design environment. Compared with existing work on processor power modeling which focuses on the power states of processor core, our model mostly focuses on the cache power model. It gives more than 93% accuracy and 1600 times speedup compared with post-layout gate-level power estimation. We also address two practical issues in applying the processor power model to the real design environment. One is to incorporate the power model into an existing commercial instruction set simulator. The other is the re-characterization of power model parameters to cope with different gate-level netlists of the processor obtained from different design teams and different fabrication technology.

80μW/MHz 0.68V Ultra Low-Power Variation-Tolerant Superscalar Dual-Core Application Processor

  • Kwon, Youngsu;Lee, Jae-Jin;Shin, Kyoung-Seon;Han, Jin-Ho;Byun, Kyung-Jin;Eum, Nak-Woong
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.4 no.2
    • /
    • pp.71-77
    • /
    • 2015
  • Upcoming ground-breaking applications for always-on tiny interconnected devices steadily demand two-fold features of processor cores: aggressively low power consumption and enhanced performance. We propose implementation of a novel superscalar low-power processor core with a low supply voltage. The core implements intra-core low-power microarchitecture with minimal performance degradation in instruction fetch, branch prediction, scheduling, and execution units. The inter-core lockstep not only detects malfunctions during low-voltage operation but also carries out software-based recovery. The chip incorporates a pair of cores, high-speed memory, and peripheral interfaces to be implemented with a 65nm node. The processor core consumes only 24mW at 350MHz and 0.68V, resulting in power efficiency of $80{\mu}W/MHz$. The operating frequency of the core reaches 850MHz at 1.2V.

A 95% accurate EEG-connectome Processor for a Mental Health Monitoring System

  • Kim, Hyunki;Song, Kiseok;Roh, Taehwan;Yoo, Hoi-Jun
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.16 no.4
    • /
    • pp.436-442
    • /
    • 2016
  • An electroencephalogram (EEG)-connectome processor to monitor and diagnose mental health is proposed. From 19-channel EEG signals, the proposed processor determines whether the mental state is healthy or unhealthy by extracting significant features from EEG signals and classifying them. Connectome approach is adopted for the best diagnosis accuracy, and synchronization likelihood (SL) is chosen as the connectome feature. Before computing SL, reconstruction optimizer (ReOpt) block compensates some parameters, resulting in improved accuracy. During SL calculation, a sparse matrix inscription (SMI) scheme is proposed to reduce the memory size to 1/24. From the calculated SL information, a small world feature extractor (SWFE) reduces the memory size to 1/29. Finally, using SLs or small word features, radial basis function (RBF) kernel-based support vector machine (SVM) diagnoses user's mental health condition. For RBF kernels, look-up-tables (LUTs) are used to replace the floating-point operations, decreasing the required operation by 54%. Consequently, The EEG-connectome processor improves the diagnosis accuracy from 89% to 95% in Alzheimer's disease case. The proposed processor occupies $3.8mm^2$ and consumes 1.71 mW with $0.18{\mu}m$ CMOS technology.

Effects of DRAM in The Embedded Processor Performance (DRAM이 임베디드 프로세서의 성능에 끼치는 영향)

  • Lee, Jong-Bok
    • Journal of Digital Contents Society
    • /
    • v.18 no.5
    • /
    • pp.943-948
    • /
    • 2017
  • Currently, embedded systems designed for specific applications are used extensively in consumer electronics, smart phones, autonomous vehicles, robots, and plant control, etc. In addition, the importance of DRAM, which has a great influence on the performance of an embedded processor constituting an embedded system, has been increasing day by day, and research on DRAM has been actively conducted in industry and academia. Therefore, it is important to have a more accurate DRAM model in order to obtain reliable results when evaluating the performance of an embedded processor through simulation. In this paper, we developed an embedded processor simulator capable of interworking with a DRAM simulator. We also analyzed the influence of the DRAM model, which operates correctly on a cycle-by-cycle basis, on the performance of the embedded processor by using the MiBench embedded benchmark.