• Title/Summary/Keyword: Hardware sharing


Exploiting Thread-Level Parallelism in Lockstep Execution by Partially Duplicating a Single Pipeline

  • Oh, Jaeg-Eun;Hwang, Seok-Joong;Nguyen, Huong Giang;Kim, A-Reum;Kim, Seon-Wook;Kim, Chul-Woo;Kim, Jong-Kook
    • ETRI Journal / v.30 no.4 / pp.576-586 / 2008
  • In most parallel loops of embedded applications, every iteration executes the exact same sequence of instructions while manipulating different data. This fact motivates a new compiler-hardware orchestrated execution framework in which all parallel threads share one fetch unit and one decode unit but have their own execution, memory, and write-back units. This resource sharing enables parallel threads to execute in lockstep with minimal hardware extension and compiler support. Our proposed architecture, called the multithreaded lockstep execution processor (MLEP), is a compromise between the single-instruction multiple-data (SIMD) and simultaneous multithreading/chip multiprocessor (SMT/CMP) solutions. The proposed approach is more favorable than typical SIMD execution in terms of degree of parallelism, range of applicability, and code generation, and it can save more power and chip area than the SMT/CMP approach without significant performance degradation. To verify the architecture, we extend a commercial 32-bit embedded core, the AE32000C, and synthesize it on a Xilinx FPGA. Compared to the original architecture, our approach is 13.5% faster with a 2-way MLEP and 33.7% faster with a 4-way MLEP on EEMBC benchmarks that are automatically parallelized by the Intel compiler.


Design and Implementation of Multi-mode Sensor Signal Processor on FPGA Device (다중모드 센서 신호 처리 프로세서의 FPGA 기반 설계 및 구현)

  • Soongyu Kang;Yunho Jung
    • Journal of Sensor Science and Technology / v.32 no.4 / pp.246-251 / 2023
  • Internet of Things (IoT) systems process signals from various sensors using signal processing algorithms suited to the signal characteristics. To analyze complex signals, these systems usually use frequency-domain signal processing algorithms, such as the fast Fourier transform (FFT), filtering, and the short-time Fourier transform (STFT). In this study, we propose a multi-mode sensor signal processor (SSP) accelerator with an FFT-based hardware design. The FFT processor in the proposed SSP is designed with a radix-2 single-path delay feedback (R2SDF) pipeline architecture for high-speed operation. Moreover, based on this FFT processor, the proposed SSP can perform filtering and STFT operations. By sharing the FFT processor among the algorithms, the required hardware resources are significantly reduced. The proposed SSP is implemented and verified on Xilinx's Zynq UltraScale+ MPSoC ZCU104, a field-programmable gate array (FPGA) platform, with 53,591 look-up tables (LUTs), 71,451 flip-flops (FFs), and 44 digital signal processors (DSPs). The FFT, filtering, and STFT algorithm implementations on the proposed SSP achieve 185x average acceleration.
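
The resource-sharing idea in this abstract, one FFT block reused for plain FFT, filtering, and STFT, can be illustrated in software. The sketch below is only an analogy under stated assumptions: it uses a recursive radix-2 FFT rather than the paper's R2SDF hardware pipeline, and the function names (`fft`, `ifft`, `fft_filter`, `stft`) are hypothetical, not taken from the paper. The filter is a circular convolution and the STFT applies no window, which keeps the example short.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT. Assumption: len(x) is a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(spectrum):
    """Inverse FFT that reuses the forward routine via conjugation."""
    n = len(spectrum)
    return [v.conjugate() / n for v in fft([v.conjugate() for v in spectrum])]

def fft_filter(x, h):
    """Circular convolution of real x with kernel h through the shared FFT."""
    padded = list(h) + [0] * (len(x) - len(h))
    product = [a * b for a, b in zip(fft(x), fft(padded))]
    return [v.real for v in ifft(product)]

def stft(x, frame_len, hop):
    """Unwindowed STFT: the shared FFT applied to overlapping frames."""
    return [fft(x[i:i + frame_len])
            for i in range(0, len(x) - frame_len + 1, hop)]
```

All three operations call the same `fft` routine, which is the software counterpart of sharing one hardware FFT block instead of instantiating one per algorithm.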

An Efficient VM-Level Scaling Scheme in an IaaS Cloud Computing System: A Queueing Theory Approach

  • Lee, Doo Ho
    • International Journal of Contents / v.13 no.2 / pp.29-34 / 2017
  • Cloud computing is becoming an effective and efficient way to integrate computing resources and computing services. Through centralized management of resources and services, cloud computing delivers hosted services over the internet, such that access to shared hardware, software, applications, information, and all resources is elastically provided to the consumer on demand. The main enabling technology for cloud computing is virtualization. Virtualization software creates a temporarily simulated or extended version of computing and network resources. The objectives of virtualization are as follows: first, to fully utilize the shared resources by applying partitioning and time-sharing; second, to centralize resource management; third, to enhance cloud data center agility and provide the required scalability and elasticity for on-demand capabilities; fourth, to improve testing and running software diagnostics on different operating platforms; and fifth, to improve the portability of applications and workload migration capabilities. One of the key features of cloud computing is elasticity. It enables users to create and remove virtual computing resources dynamically according to changing demand, but deciding on the right amount of resources is not easy. Indeed, proper provisioning of resources to applications is an important issue in IaaS cloud computing. Most web applications encounter large and fluctuating task requests. In predictable situations, the resources can be provisioned in advance through capacity planning techniques. In the case of unplanned request spikes, however, it is desirable to scale the resources automatically. This approach, called auto-scaling, adjusts the resources allocated to applications based on their needs at any given time and frees the user from the burden of deciding how many resources are necessary each time. In this work, we propose an analytical and efficient VM-level scaling scheme by modeling each VM in a data center as an M/M/1 processor-sharing queue. Our proposed VM-level scaling scheme is validated via a numerical experiment.
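
To make the queueing model in this abstract concrete, the following is a minimal sketch of how an M/M/1 processor-sharing (PS) model could drive a scaling decision. It is not the paper's scheme. It relies on the standard result that the mean response time of an M/M/1-PS queue is 1 / (μ − λ), and it assumes, purely for illustration, that the total arrival rate is split evenly across identical VMs. The function names and the response-time target are hypothetical.

```python
import math

def ps_mean_response_time(arrival_rate, service_rate):
    """Mean response time of an M/M/1 processor-sharing queue: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        return math.inf  # unstable queue: the backlog grows without bound
    return 1.0 / (service_rate - arrival_rate)

def min_vms(total_arrival_rate, service_rate, target_response_time):
    """Smallest number of identical VMs that meets the target.

    Assumption: requests are split evenly, so each VM sees
    total_arrival_rate / n.
    """
    if target_response_time <= 1.0 / service_rate:
        raise ValueError("target must exceed the mean service time 1/mu")
    n = 1
    while ps_mean_response_time(total_arrival_rate / n, service_rate) > target_response_time:
        n += 1
    return n
```

For example, with 30 requests per second in total, a per-VM service rate of 10 requests per second, and a 0.5-second target, `min_vms(30, 10, 0.5)` returns 4, since three VMs would each run at full utilization.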

Cluster-based Energy-aware Data Sharing Scheme to Support a Mobile Sink in Solar-Powered Wireless Sensor Networks (태양 에너지 수집형 센서 네트워크에서 모바일 싱크를 지원하기 위한 클러스터 기반 에너지 인지 데이터 공유 기법)

  • Lee, Hong Seob;Yi, Jun Min;Kim, Jaeung;Noh, Dong Kun
    • Journal of KIISE / v.42 no.11 / pp.1430-1440 / 2015
  • In contrast with battery-based wireless sensor networks (WSNs), solar-powered WSNs can operate for a long time, assuming that there is no hardware fault. Meanwhile, a mobile sink can reduce the energy consumption of a WSN, but inefficient movement may waste a great deal of energy, both in the sink itself and across the entire network. To solve this problem, many approaches in which a mobile sink visits only cluster-head nodes have been proposed. However, clustering schemes have their own problems, such as energy imbalance and data instability. In this study, therefore, a cluster-based energy-aware data-sharing scheme (CE-DSS) is proposed to effectively support a mobile sink in a solar-powered WSN. By utilizing the redundant energy efficiently, CE-DSS shares the gathered data among cluster heads while minimizing unexpected blackout time. The simulation results show that CE-DSS increases data reliability and conserves the energy of the mobile sink.

Treatment of Transverse Patella Fracture with Minimally Invasive Load-Sharing Patellar Tendon Suture and Cannulated Screws (최소 침습 기법 슬개건 부하 분산 봉합술과 유관 나사못을 이용한 슬개골 횡골절의 치료)

  • Lee, Beom-Seok;Park, Byeong-Mun;Yang, Bong-Seok;Kim, Kyu-Wan
    • Journal of the Korean Orthopaedic Association / v.56 no.6 / pp.540-545 / 2021
  • A transverse fracture is the most common type of displaced patella fracture requiring surgery. These fractures are commonly fixed with parallel Kirschner wires or screws that cross the fracture line, often with an additional tension band. Nevertheless, conventional fixation methods for patella fractures have prevalent complications caused by the protrusion of wires or pins. These complications necessitate additional surgery for hardware removal, increase medical costs, and can limit the function of the knee joint. This paper reports cases treated with a minimally invasive load-sharing percutaneous suture of the patellar tendon. The procedure provides reliable fixation for transverse patella fractures, minimizes soft tissue injuries, preserves blood flow, and reduces postoperative pain. In addition, the procedure reduces the irritation and pain caused by the internal fixture, thereby reducing the risk of restricted knee joint movement.

An Efficient Hardware Implementation of ARIA Block Cipher Algorithm Supporting Four Modes of Operation and Three Master Key Lengths (4가지 운영모드와 3가지 마스터 키 길이를 지원하는 블록암호 알고리듬 ARIA의 효율적인 하드웨어 구현)

  • Kim, Dong-Hyeon;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.11 / pp.2517-2524 / 2012
  • This paper describes an efficient implementation of the KS (Korea Standards) block cipher algorithm ARIA. The ARIA crypto-processor supports three master key lengths of 128, 192, and 256 bits and four modes of operation: ECB, CBC, OFB, and CTR. A hardware sharing technique, which shares the round function between encryption/decryption and key initialization, is employed to reduce hardware complexity. It reduces the gate count by about 20% compared with a straightforward implementation. The ARIA crypto-processor is verified by FPGA implementation and synthesized with a $0.13\,\mu\mathrm{m}$ CMOS cell library. It has 46,100 gates on an area of $684\,\mu\mathrm{m} \times 684\,\mu\mathrm{m}$, and the estimated throughput is about 1.28 Gbps at 200 MHz and 1.2 V.

Hardware Design of Arccosine Function for Mobile Vector Graphics Processor (모바일 벡터 그래픽 프로세서용 역코사인 함수의 하드웨어 설계)

  • Choi, Byeong-Yoon;Lee, Jong-Hyoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.4 / pp.727-736 / 2009
  • In this paper, an arccosine ($\cos^{-1}$) arithmetic unit for a mobile graphics accelerator is designed. Mobile vector graphics applications face tighter area, execution time, power dissipation, and accuracy constraints than desktop PC applications. The designed processor adopts a 2nd-order polynomial approximation scheme based on the IEEE floating-point data format to satisfy the speed and accuracy conditions, and it reduces area via a hardware sharing structure. The arccosine processor consists of 15,280 gates, and its estimated operating frequency is about 125 MHz in a $0.35\,\mu\mathrm{m}$ CMOS technology. Because the processor can execute the arccosine function within 7 clock cycles, it has an execution rate of about 17 MOPS (million arccos operations per second) and is applicable to a mobile OpenVG processor. Because of its flexible architecture, it can also be applied to various transcendental functions, such as exponential, trigonometric, and logarithmic functions, by replacing the ROM and making minor hardware modifications.
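
The execution rate quoted in this abstract follows from the clock frequency and the cycle count. The short sketch below reproduces that arithmetic under the assumption (not stated in the abstract) that the unit is not pipelined, so one arccosine result completes every 7 cycles. The function name is hypothetical.

```python
def throughput_mops(clock_hz, cycles_per_op):
    """Millions of operations per second for a unit that finishes
    one operation every cycles_per_op clock cycles (no pipelining assumed)."""
    return clock_hz / cycles_per_op / 1e6

# 125 MHz clock and 7 cycles per arccosine, the figures given in the abstract.
rate = throughput_mops(125e6, 7)
```

Here `rate` is roughly 17.9, which is consistent with the abstract's "about 17 MOPS" if the figure is rounded down.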

An HEVC intra encoder sharing DCT with RDO for a low complex hardware (하드웨어 복잡도를 줄이기 위한 RDO내 DCT 공유구조의 HEVC 화면내 예측부호화기)

  • Lee, Sukho;Jang, Juneyoung;Byun, Kyungjun;Eum, Nakwoong
    • Smart Media Journal / v.3 no.4 / pp.16-21 / 2014
  • HEVC is the latest video coding standard, developed jointly by ITU-T SG16 WP and ISO/IEC JTC1/SC29/WG11. Its coding efficiency is about twice that of the H.264 high profile. Intra prediction has 35 modes, including DC and planar. However, an accurate SSE-based mode decision over so many modes is too costly to implement in hardware. The key idea of this paper is a DCT-shared architecture that reduces the complexity of the HEVC intra encoder: the same DCT block is used both for quantization and for calculating SSE in RDO. The proposed intra encoder uses a two-step mode decision with simplified RDO blocks to reduce complexity, and it shares the transform resources. Its BD-rate increase is negligible at 20% on hardware aspect and the operating clock frequency is 300 MHz at 60 fps for FHD ($1920 \times 1080$) images.

A Cortex-M0 based Security System-on-Chip Embedded with Block Ciphers and Hash Function IP (블록암호와 해시 함수 IP가 내장된 Cortex-M0 기반의 보안 시스템 온 칩)

  • Choe, Jun-Yeong;Choi, Jun-Baek;Shin, Kyung-Wook
    • Journal of IKEEE / v.23 no.2 / pp.388-394 / 2019
  • This paper describes the design of a security system-on-chip (SoC) that integrates a Cortex-M0 CPU with an AAW (ARIA-AES-Whirlpool) crypto-core, which implements the two block cipher algorithms ARIA and AES and the hash function Whirlpool in a unified hardware architecture. The AAW crypto-core was implemented in a small area through hardware sharing based on the algorithmic characteristics of ARIA, AES, and Whirlpool, and it supports key sizes of 128 bits and 256 bits. The designed security SoC was implemented on an FPGA device and verified by hardware-software co-operation. The AAW crypto-core occupied 5,911 slices, and the AHB_Slave including the AAW crypto-core was implemented with 6,366 slices. The maximum clock frequency of the AHB_Slave was estimated at 36 MHz, the estimated throughputs of ARIA-128 and AES-128 were 83 Mbps and 78 Mbps, respectively, and the throughput of the Whirlpool hash function with a 512-bit block was 156 Mbps.

Intents of Acquisitions in Information Technology Industries (정보기술 산업에서의 인수 유형별 인수 의도 분석)

  • Cho, Wooje;Chang, Young Bong;Kwon, Youngok
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.123-138 / 2016
  • This study investigates the intents of acquisitions in information technology industries. Mergers and acquisitions (M&As) are strategic decisions at the corporate level and have been an important tool for firm growth. Many firms in information technology industries have acquired startups over the last decades to increase production efficiency, expand their customer base, or improve quality. For example, Google has made about 200 acquisitions since 2001, Cisco has acquired about 210 firms since 1993, Oracle has made about 125 acquisitions since 1994, and Microsoft has acquired about 200 firms since 1987. Although many papers study the intents or motivations of acquisitions theoretically, few investigate them empirically, mainly because the intents of M&As are difficult to measure and quantify. This study examines the intent of acquisitions by measuring specific intents for M&A transactions. Using our measures of acquisition intents, we compare the intents across four acquisition types: (1) a hardware firm acquires a hardware firm, (2) a hardware firm acquires a software/IT service firm, (3) a software/IT service firm acquires a hardware firm, and (4) a software/IT service firm acquires a software/IT service firm. We presume that the reasons for acquisition differ among these four types. Using data on M&As in US IT industries, we identified the major intents of the M&As. The acquisition intents are identified from the press releases of M&A announcements and measured in four categories. First, an acquirer may intend to save operating costs by sharing common resources between the acquirer and the target. The cost saving can accrue from economies of scope and scale. Second, an acquirer may intend to enhance or develop products. Knowledge and skills transferred from the target may enable the acquirer to enhance product quality or to expand product lines. Third, an acquirer may intend to gain an additional customer base in order to expand the market, penetrate the market, or enter a foreign market. Fourth, a firm may acquire a target with the intent of expanding customer channels. By complementing its existing channels to customers, the firm can increase its revenue. Our results show that acquirers have had cost-saving intents more often in acquisitions between hardware companies than in acquisitions between software companies. Hardware firms are more likely than software firms to acquire with the intent of product enhancement or development. Overall, product enhancement/development is the most frequent intent in all four acquisition types, and customer base expansion is the second. We also analyze our data by classifying intents as production-side or customer-side, based on the activities of a firm's value chain. Intents of cost saving in operations and of product enhancement/development can be viewed as production-side intents, and intents of customer base expansion and of expanding customer channels can be viewed as customer-side intents. Our analysis shows that the ratio of customer-side intents to production-side intents is higher in acquisitions where a software firm is the acquirer than in acquisitions where a hardware firm is the acquirer. This study can contribute to the IS literature. First, it provides insights into M&As in IT industries by answering the question of why an IT firm intends to acquire another IT firm. Second, it provides the distribution of acquisition intents by acquisition type.