• Title/Summary/Keyword: On-Chip Memory

Implementation of an AMBA-Based IP for H.264 Transform and Quantization (H.264 변환 및 양자화 기능을 갖는 AMBA 기반 IP 구현)

  • Lee, Seon-Young;Cho, Kyeong-Soon
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.43 no.10 s.352
    • /
    • pp.126-133
    • /
    • 2006
  • This paper describes an AMBA-based IP that performs the forward and inverse transform and quantization required by the H.264 video compression standard. The transform and quantization circuit was optimized for area and performance. An AHB wrapper was added to the circuit for AMBA-based operation. The user of the IP can specify how long the IP may occupy the bus and where the video data are stored in external memory. The function of the proposed IP, based on the AMBA Specification, was verified on a platform board with a Xilinx FPGA and an ARM9 processor. We fabricated an MPW chip using $0.25{\mu}m$ standard cells and observed its correct operation on silicon.

SoC Virtual Platform with Secure Key Generation Module for Embedded Secure Devices

  • Seung-Ho Lim;Hyeok-Jin Lim;Seong-Cheon Park
    • Journal of Information Processing Systems
    • /
    • v.20 no.1
    • /
    • pp.116-130
    • /
    • 2024
  • In the Internet-of-Things (IoT) or blockchain-based network systems, secure keys may be stored in individual devices; thus, individual devices should protect data by performing secure operations on the data transmitted and received over networks. Typically, secure functions, such as a physical unclonable function (PUF) and fully homomorphic encryption (FHE), are useful for generating safe keys and distributing data in a network. However, to provide these functions in embedded devices for IoT or blockchain systems, proper inspection is required for designing and implementing embedded system-on-chip (SoC) modules through overhead and performance analysis. In this paper, a virtual platform (SoC VP) was developed that includes a secure key generation module with a PUF and FHE. The SoC VP platform was implemented using SystemC, which enables the execution and verification of various aspects of the secure key generation module at the electronic system level and analyzes the system-level execution time, memory footprint, and performance, such as randomness and uniqueness. We experimentally verified the secure key generation module, and estimated the execution of the PUF key and FHE encryption based on the unit time of each module.

A Combined BTB Architecture for effective branch prediction (효율적인 분기 예측을 위한 공유 구조의 BTB)

  • Lee Yong-hwan
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.9 no.7
    • /
    • pp.1497-1501
    • /
    • 2005
  • Branch instructions, which change the sequential instruction flow, cause pipeline stalls in a microprocessor. The pipeline hazards caused by branch instructions are the most serious problem degrading microprocessor performance. A branch target buffer predicts whether a branch will be taken and supplies the address of the next instruction on the basis of that prediction. If the branch target buffer predicts correctly, the instruction flow is not stalled, which improves microprocessor performance. This paper presents the architecture of a tag memory that the branch target buffer and the TLB can share. Because the two tag memories used separately for the branch target buffer and the TLB are replaced by a single combined tag memory, we can expect a smaller chip size and faster prediction. This shared tag architecture is more advantageous for microprocessors that use more address bits and exploit more instruction-level parallelism.

The Design of MPI Hardware Unit for Enhanced Broadcast Communication (효율적인 브로드캐스트 통신을 지원하는 MPI 하드웨어 유닛 설계)

  • Yun, Hee-Jun;Chung, Won-Young;Lee, Yong-Surk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.36 no.11B
    • /
    • pp.1329-1338
    • /
    • 2011
  • This paper proposes an algorithm and hardware architecture for broadcast communication, which is the worst bottleneck in multiprocessors using distributed memory architectures. In conventional systems, an MPI library call converts collective communication into point-to-point communications without considering the state of each processing node's communication port, which indicates whether the node is busy or free. If a conflicting point-to-point communication occurs during broadcast communication, the transmission speed of the broadcast decreases. This paper therefore proposes an algorithm that determines the order of the point-to-point communications for a broadcast according to the state of each processing node. The proposed algorithm decreases total broadcast communication time by transmitting the message preferentially to processing nodes whose communication port is free. The proposed MPI unit for broadcast communication was evaluated by modeling it in SystemC. It improved broadcast communication performance by up to 78% with 16 nodes. This result shows that the proposed algorithm is useful for improving the total performance of an MPSoC.
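
The ordering idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm or hardware: the function name `broadcast_order` and the `port_busy` mapping are hypothetical, and the sketch assumes a single static snapshot of port states, whereas the abstract does not say how or when the states are sampled.

```python
def broadcast_order(port_busy):
    """Return destination node ids with free ports first, busy ports last.

    port_busy is a hypothetical mapping: node id -> True if the node's
    communication port is busy, False if it is free.
    """
    # False sorts before True, so free nodes come first. sorted() is
    # stable, so nodes keep their original order within each group.
    return sorted(port_busy, key=lambda node: port_busy[node])


# Nodes 2 and 4 are free, so they are served before busy nodes 1 and 3.
ports = {1: True, 2: False, 3: True, 4: False}
order = broadcast_order(ports)
```

A real unit would presumably re-check port states as transfers complete; the sketch only shows the "free port first" priority rule.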

Design of Smart Frame SoC to support the IoT Services (IoT 서비스를 지원하는 Smart Frame SoC 설계)

  • Yang, Dong-hun;Hwang, In-han;Kim, A-ra;Guard, Kanda;Ryoo, Kwang-ki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2015.10a
    • /
    • pp.503-506
    • /
    • 2015
  • With the commercialization of the IoT (Internet of Things), the need to design SoC-based hardware platforms with wireless communication is increasing. This paper therefore proposes an SoC platform architecture for a Smart Frame System that communicates between devices. Wireless communication functions and a high-performance real-time image processing hardware structure were applied to an existing digital photo frame. We developed a smartphone application to control the smart frame through Bluetooth communication. The SoC platform hardware consists of a CIS controller, a memory controller, an ISP (Image Signal Processing) module for image scaling, a Bluetooth interface for communication between devices, and a VGA/TFT-LCD controller for displaying video. The Smart Frame System supporting IoT services was implemented and verified using an HBE-SoC-IPD test board equipped with a Virtex4 XC4VLX80 FPGA. The operating frequency is 54MHz.

Analysis of Tensor Processing Unit and Simulation Using Python (텐서 처리부의 분석 및 파이썬을 이용한 모의실행)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.19 no.3
    • /
    • pp.165-171
    • /
    • 2019
  • The study of computer architecture has shown that major improvements in cost-energy-performance stem from domain-specific hardware development. This paper analyzes the tensor processing unit (TPU) ASIC, which can accelerate the inference of artificial neural networks (NN). The core of the TPU is a MAC matrix multiplier capable of high-speed operation, together with software-managed on-chip memory. The execution model of the TPU can meet the response time requirements of artificial neural networks better than the existing CPU and GPU execution models, with a small area and low power consumption even though it has many MAC units and a large memory. On benchmarks using the TensorFlow framework, the TPU can achieve higher performance and better power efficiency than the CPU or GPU. In this paper, we analyze the TPU, simulate OpenTPU, which is modeled in Python, and synthesize the matrix multiplication unit, which is the key hardware.
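
As a rough illustration of what a MAC-based matrix multiplier computes, the following pure-Python sketch forms each output element through repeated multiply-accumulate steps. It is an assumption-laden teaching example, not OpenTPU code: the function name `mac_matmul` is hypothetical, and the sketch does not model the parallelism, data flow, or timing of the hardware unit.

```python
def mac_matmul(a, b):
    """Multiply matrices a (rows x inner) and b (inner x cols) using
    explicit multiply-accumulate (MAC) steps."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0  # accumulator for one output element
            for k in range(inner):
                acc += a[i][k] * b[k][j]  # one MAC operation
            out[i][j] = acc
    return out


result = mac_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

In hardware, many of these MAC operations run in parallel across an array of units; the sequential loops here only show the arithmetic each unit contributes.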

Design of a Portable Activity Monitoring System (휴대용 활동 상태 모니터링 시스템의 설계)

  • Lee, Seung-Hyung;Park, Ho-Dong;Yoon, Hyung-Ro;Lee, Kyung-Joung
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.51 no.1
    • /
    • pp.32-38
    • /
    • 2002
  • This paper describes the development of a portable physical activity monitoring system that uses two accelerometers to quantify physical activity. The system hardware consists of two piezoresistive accelerometers, amplifiers with a gain of 30, lowpass filters with a cut-off frequency of 15Hz, offset control circuits, a one-chip microcontroller, and a flash memory card. To evaluate the performance of the system, we acquired 3-channel data at 32 samples/s from body-fixed accelerometers on the chest and the right upper leg. The acquired data were then processed with MatLab on a personal computer. We tried to distinguish not only steady-state activities such as standing, sitting, and lying, but also dynamic activities such as walking, going up stairs, going down stairs, and running. Five subjects participated in the evaluation process, which compared video data with the measured data. As a result, an average activity classification rate of 90.6% was obtained. Overall, the results showed that the steady-state activities could be classified from the low-frequency component of the 3-axis acceleration signal, and that dynamic activities could be distinguished by frequency analysis using the wavelet transform and FFT. Finally, we found that this system can be applied to acquire and analyze static and dynamic physical activity data.

Improving Data Accuracy Using Proactive Correlated Fuzzy System in Wireless Sensor Networks

  • Barakkath Nisha, U;Uma Maheswari, N;Venkatesh, R;Yasir Abdullah, R
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.9 no.9
    • /
    • pp.3515-3538
    • /
    • 2015
  • Data accuracy can be increased by detecting and removing the incorrect data generated in wireless sensor networks. Increasing data accuracy can increase network lifetime in parallel. Network lifetime, or operational time, is the time during which a WSN is able to fulfill its tasks using microcontrollers with on-chip memory and radio transceivers; distributed sensor nodes send summaries of their data to their cluster heads, which gradually reduces energy consumption. This paper proposes a powerful algorithm using a proactive fuzzy system, a mixture of fuzzy logic and comparative correlation techniques that ensures high data accuracy by detecting incorrect data in distributed wireless sensor networks. The proposed system is implemented in two phases: the first phase creates the input space partitioning using robust fuzzy c-means clustering, and the second phase detects incorrect data and removes it completely. Experimental results show that the combined correlated fuzzy system (CCFS) detects faulty readings with greater accuracy (99.21%) than the existing one (98.33%), along with a low false alarm rate.

80μW/MHz 0.68V Ultra Low-Power Variation-Tolerant Superscalar Dual-Core Application Processor

  • Kwon, Youngsu;Lee, Jae-Jin;Shin, Kyoung-Seon;Han, Jin-Ho;Byun, Kyung-Jin;Eum, Nak-Woong
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.4 no.2
    • /
    • pp.71-77
    • /
    • 2015
  • Upcoming ground-breaking applications for always-on tiny interconnected devices steadily demand two-fold features of processor cores: aggressively low power consumption and enhanced performance. We propose implementation of a novel superscalar low-power processor core with a low supply voltage. The core implements intra-core low-power microarchitecture with minimal performance degradation in instruction fetch, branch prediction, scheduling, and execution units. The inter-core lockstep not only detects malfunctions during low-voltage operation but also carries out software-based recovery. The chip incorporates a pair of cores, high-speed memory, and peripheral interfaces to be implemented with a 65nm node. The processor core consumes only 24mW at 350MHz and 0.68V, resulting in power efficiency of $80{\mu}W/MHz$. The operating frequency of the core reaches 850MHz at 1.2V.

Research on the Waveform Generator Technology for the SAR Payload

  • Won, Young-Jin;Youn, Young-Su;Kim, Jin-Hee
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.37 no.2
    • /
    • pp.228.1-228.1
    • /
    • 2012
  • Digital waveform generation technology for SAR payloads can be divided into the DDS (Direct Digital Synthesizer) method and the Memory Mapped (M/M) method. A DDS is a single chip that consists of a sine table, an NCO (Numerically Controlled Oscillator), a DAC, and so on. The DDS method is very simple because its circuit configuration is not complex, but it has the disadvantage that phase and amplitude cannot be controlled easily using the NCO. The M/M method has a complex circuit configuration because it requires memories that store the waveforms, control circuits, and a DAC. This method must also apply high-speed interface technology to be compatible with the wide bandwidth of the digital signal, and its PCB design is difficult because the number of signal lines increases with the number of data bits of the DAC. Despite these disadvantages, the method supports a pre-distortion function that can compensate for the phase and amplitude characteristics of the system and has the major advantage of being able to generate arbitrary waveforms, so it is considered an important technology alongside the DDS method. This research describes the technological trends of waveform generators for SAR payloads and analyzes the characteristics of the technology.
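
The DDS principle named in this abstract (an NCO driving a sine table) can be sketched in Python as a phase accumulator that indexes a lookup table. This is a generic textbook model, not the design discussed in the paper: the names `make_sine_table` and `dds_samples`, the 16-bit accumulator, and the 256-entry table are all illustrative assumptions, and the DAC stage is omitted.

```python
import math


def make_sine_table(size):
    """Precompute one full sine period with `size` entries."""
    return [math.sin(2 * math.pi * i / size) for i in range(size)]


def dds_samples(tuning_word, count, acc_bits=16, table_bits=8):
    """Generate `count` samples from a phase-accumulator NCO.

    The output frequency is tuning_word * f_clk / 2**acc_bits,
    where f_clk is the (assumed) sample clock.
    """
    table = make_sine_table(1 << table_bits)
    mask = (1 << acc_bits) - 1        # accumulator wraps at 2**acc_bits
    shift = acc_bits - table_bits     # top bits select the table entry
    phase = 0
    out = []
    for _ in range(count):
        out.append(table[phase >> shift])
        phase = (phase + tuning_word) & mask
    return out


# A tuning word of 2**14 advances the phase by a quarter period per sample.
samples = dds_samples(1 << 14, 4)
```

An M/M generator, by contrast, would read arbitrary stored sample values from memory instead of deriving them from a fixed sine table, which is what allows pre-distortion and arbitrary waveforms.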
