• Title/Summary/Keyword: Bit operation

Bit Operation Optimization and DNN Application using GPU Acceleration (GPU 가속기를 통한 비트 연산 최적화 및 DNN 응용)

  • Kim, Sang Hyeok;Lee, Jae Heung
    • Journal of IKEEE / v.23 no.4 / pp.1314-1320 / 2019
  • In this paper, we propose a new method for optimizing bit operations and applying them to a DNN (Deep Neural Network) in a software environment. The method consists of a packing function for bitwise optimization and a masking matrix multiplication operation for application to the DNN. The packing function converts a 32-bit real value to a 2-bit quantized value through a threshold comparison operation. When this step is complete, four 32-bit real values have been converted to one 8-bit value. The masking matrix multiplication operation is a special operation that multiplies the packed weight values with ordinary input values. Each operation was then processed in parallel using a GPU accelerator. In the experiment, memory usage was reduced by about 16 times compared with the 32-bit DNN model, while the accuracy stayed within 1% of the 32-bit model.
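
The packing step described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the threshold or the 2-bit encoding, so the `threshold` parameter, the codes `0b01` / `0b10` / `0b00`, and the function names `quantize_2bit`, `pack4`, and `unpack4` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def quantize_2bit(w: float, threshold: float) -> int:
    """Map one real weight to a 2-bit code by threshold comparison (assumed encoding)."""
    if w > threshold:
        return 0b01  # assumed code for +1
    if w < -threshold:
        return 0b10  # assumed code for -1
    return 0b00      # assumed code for 0


def pack4(weights: Sequence[float], threshold: float) -> int:
    """Pack exactly four real weights into one 8-bit value, first weight in the low bits."""
    if len(weights) != 4:
        raise ValueError("pack4 expects exactly four weights")
    packed = 0
    for i, w in enumerate(weights):
        packed |= quantize_2bit(w, threshold) << (2 * i)
    return packed


def unpack4(packed: int) -> List[int]:
    """Recover the four quantized values (-1, 0, or +1) from one packed byte."""
    decode = {0b00: 0, 0b01: 1, 0b10: -1}
    return [decode[(packed >> (2 * i)) & 0b11] for i in range(4)]
```

With a hypothetical threshold of 0.5, `pack4([0.9, -0.7, 0.1, 0.6], 0.5)` gives `0b01001001`, and `unpack4` of that byte gives `[1, -1, 0, 1]`. Storing 2 bits in place of 32 is consistent with the roughly 16-fold memory reduction the abstract reports.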

A study on the programming conditions suppressing the lateral diffusion of charges for the SONOS two-bit memory (SONOS two-bit 메모리의 측면확산에 영향을 주는 programming 조건 연구)

  • Lee, Myung-Shik;An, Ho-Myung;Seo, Kwang-Yell;Koh, Jung-Hyuk;Kim, Byung-Cheul;Kim, Joo-Yeon
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2005.11a / pp.117-120 / 2005
  • The SONOS devices were fabricated by a conventional $0.35{\mu}m$ complementary metal-oxide-semiconductor (CMOS) process with a NOR array. Two-bit operation using a conventional process achieves high-density memory compared with other two-bit memories. The lateral diffusion phenomenon in two-bit operation causes soft errors in the memory. In this study, the programming conditions are investigated in order to reduce lateral diffusion for two-bit operation of a CSL-NOR type SONOS flash cell.

Programming Characteristics of the Multi-bit Devices Based on SONOS Structure (SONOS 구조를 갖는 멀티 비트 소자의 프로그래밍 특성)

  • 김주연
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.16 no.9 / pp.771-774 / 2003
  • In this paper, the programming characteristics of multi-bit devices based on the SONOS structure are investigated. Our devices were fabricated by a 0.35 $\mu\textrm{m}$ complementary metal-oxide-semiconductor (CMOS) process with LOCOS isolation. To achieve multi-bit operation per cell, charges must be locally trapped in the nitride layer above the channel near the source-drain junction. Channel hot electron (CHE) injection, which allows localized trapping in the nitride film, is selected as the programming method. To demonstrate CHE injection, the substrate current (Isub) and the one-shot programming curve are investigated. Multi-bit operation that stores two bits per cell is investigated. Hot hole (HH) injection is also used for fast erasing. The fabricated SONOS devices have ultra-thin gate dielectrics and therefore offer a lower programming voltage, a simpler process, and better scalability than other multi-bit storage flash memories. The programming characteristics are shown to be the most promising for multi-bit flash memory.

Realization of Two-bit Operation by Bulk-biased Programming Technique in SONOS NOR Array with Common Source Lines

  • An, Ho-Myoung;Seo, Kwang-Yell;Kim, Joo-Yeon;Kim, Byung-Cheul
    • Transactions on Electrical and Electronic Materials / v.7 no.4 / pp.180-183 / 2006
  • We report for the first time the two-bit operational characteristics of a high-density NOR-type polysilicon-oxide-nitride-oxide-silicon (SONOS) array with a common source line (CSL). An undesired disturbance, especially drain disturbance, arises in the NOR array with CSL from the two-bit-per-cell operation. To solve this problem, we propose an efficient bulk-biased programming technique. In this technique, a bulk bias is additionally applied to the substrate of the memory cell to decrease the electric field between the nitride layer and the drain region. The proposed programming technique shows characteristics free of drain disturbance. As a result, we have achieved a reliable two-bit SONOS array by employing the proposed programming technique.

Distributed Bit Loading and Power Control Algorithm to Increase System Throughput of Ad-hoc Network (Ad-hoc 네트워크의 Throughput 향상을 위한 적응적 MCS 레벨 기반의 분산형 전력 제어 알고리즘)

  • Kim, Young-Bum;Wang, Yu-Peng;Chang, Kyung-Hi;Yun, Chang-Ho;Park, Jong-Won;Lim, Yong-Kon
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.4A / pp.315-321 / 2010
  • In ad-hoc networks, centralized power control is not suitable because there are no base stations, which would otherwise perform the power control operation to optimize system performance. Therefore, each node should run a distributed power control algorithm instead of a centralized one. The conventional distributed power control algorithm does not consider the adaptive bit loading operation that changes the MCS (modulation and coding scheme) according to the received SINR (signal to interference and noise ratio), which limits the system throughput. In this paper, we propose a novel distributed bit loading and power control algorithm, which considers the adaptive bit loading operation to increase total system throughput and decrease outage probability. Simulation results show that the proposed algorithm performs much better than the conventional algorithm.
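
To make the terms in this abstract concrete, here is a minimal Python sketch of a target-SINR power update that each node can compute from its own measured SINR, followed by an MCS lookup. This is a generic textbook-style baseline and not the algorithm proposed in the paper. The gain matrix, noise level, target SINR, and the MCS threshold table used with it are hypothetical values chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def sinr(gains: Sequence[Sequence[float]], powers: Sequence[float],
         noise: float, i: int) -> float:
    """SINR of link i: own received power over interference plus noise."""
    interference = sum(gains[i][j] * powers[j]
                       for j in range(len(powers)) if j != i)
    return gains[i][i] * powers[i] / (interference + noise)


def select_mcs(sinr_value: float, mcs_table: Sequence[Tuple[float, int]]) -> int:
    """Return the largest bits-per-symbol whose SINR threshold is met (0 means outage)."""
    bits = 0
    for threshold, bits_per_symbol in mcs_table:
        if sinr_value >= threshold:
            bits = max(bits, bits_per_symbol)
    return bits


def power_control(gains: Sequence[Sequence[float]], noise: float, target: float,
                  iterations: int = 100, p_max: float = 1.0) -> List[float]:
    """Iterate the per-node update p_i <- p_i * target / SINR_i, capped at p_max."""
    powers = [p_max] * len(gains)
    for _ in range(iterations):
        current = [sinr(gains, powers, noise, i) for i in range(len(powers))]
        powers = [min(p_max, powers[i] * target / current[i])
                  for i in range(len(powers))]
    return powers
```

For two symmetric links with direct gain 1.0, cross gain 0.1, noise 0.1, and target SINR 2.0 (all hypothetical), the powers settle near 0.25 each. A bit loading scheme would then pick the MCS from the achieved SINR rather than holding one fixed target.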

Accuracy Improvement Method for 1-Bit Convolutional Neural Network (1-Bit 합성곱 신경망을 위한 정확도 향상 기법)

  • Im, Sung-Hoon;Lee, Jae-Heung
    • Journal of IKEEE / v.22 no.4 / pp.1115-1122 / 2018
  • In this paper, we analyze the performance degradation of the previous 1-bit convolutional neural network method and introduce ways to mitigate it. Previous work applies 32-bit operations to the first and last layers; our method applies 32-bit operations to the second layer as well. We also show that the nonlinear activation function can be removed after binarizing the inputs and weights. To verify the proposed method, we experiment with an object detection neural network for Korean license plate detection. Our method achieves 96.1% accuracy, whereas the existing method achieves 74% accuracy.
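
For readers unfamiliar with 1-bit networks, the following Python sketch shows one common way that binarized inputs and weights are multiplied: signs are packed into bit masks, and the dot product is computed from an XOR and a population count. The abstract does not describe the authors' kernels, so the sign convention (`>= 0` maps to +1) and the names `binarize` and `binary_dot` are assumptions for illustration.

```python
from typing import Sequence


def binarize(values: Sequence[float]) -> int:
    """Pack the signs of the values into an int: bit i is 1 when values[i] >= 0."""
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two length-n vectors of +1/-1 values stored as bit masks.

    Positions where the bits differ contribute -1 and matching positions
    contribute +1, so the result is n - 2 * (number of differing bits).
    """
    mask = (1 << n) - 1
    mismatches = bin((a_bits ^ b_bits) & mask).count("1")
    return n - 2 * mismatches
```

The output of `binary_dot` is already an integer in the range `-n` to `n`, which illustrates why the choice of what follows a binarized layer is a design question, as the abstract discusses.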

A design of Software 2D BitBLT Engine based on RTOS (RTOS 기반의 소프트웨어 2D BitBLT 엔진의 설계)

  • Kim, Bong-Joo;Hong, Jiman
    • Journal of the Korea Society of Computer and Information / v.19 no.4 / pp.35-41 / 2014
  • In this paper, we propose an implementation of a software-based 2D BitBLT engine on the pSOS operating system and verify its operation on a patient monitoring device. To verify the proposed method on the patient monitoring device, we designed a prototype PCB and a motherboard based on an ARM9 CPU. Because the hardware-based BitBLT module was replaced with a software-based one, the CPU load increased. To solve this problem, we used a 400 MHz processor instead of a 200 MHz processor. We implemented the 2D BitBLT kernel module as a device driver, which is one of the key elements of the graphics controller GUI in the patient monitoring device.
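
A BitBLT (bit block transfer) combines a rectangular block of source pixels into a destination bitmap using a raster operation. The Python sketch below shows the general idea on lists of integers. It is not the pSOS device driver from the paper, and the function names, the argument order, and the clipping behavior are assumptions for illustration.

```python
from typing import Callable, List

Bitmap = List[List[int]]  # rows of integer pixel values


def rop_copy(src: int, dst: int) -> int:
    """Raster operation: overwrite the destination with the source."""
    return src


def rop_xor(src: int, dst: int) -> int:
    """Raster operation: XOR the source into the destination."""
    return src ^ dst


def bitblt(dst: Bitmap, dx: int, dy: int, src: Bitmap, sx: int, sy: int,
           width: int, height: int,
           rop: Callable[[int, int], int] = rop_copy) -> None:
    """Combine a width x height block of src into dst in place.

    Pixels that fall outside either bitmap are skipped (simple clipping).
    """
    for row in range(height):
        y_src, y_dst = sy + row, dy + row
        if not (0 <= y_src < len(src) and 0 <= y_dst < len(dst)):
            continue
        for col in range(width):
            x_src, x_dst = sx + col, dx + col
            if 0 <= x_src < len(src[y_src]) and 0 <= x_dst < len(dst[y_dst]):
                dst[y_dst][x_dst] = rop(src[y_src][x_src], dst[y_dst][x_dst])
```

A per-pixel loop like this runs entirely on the CPU, which gives some intuition for the load increase the abstract reports after the hardware module was replaced.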

Implementation of LEA Lightweight Block Cipher GCM Operation Mode on 32-Bit RISC-V (32-Bit RISC-V상에서의 LEA 경량 블록 암호 GCM 운용 모드 구현)

  • Eum, Si-Woo;Kwon, Hyeok-Dong;Kim, Hyun-Ji;Yang, Yu-Jin;Seo, Hwa-Jeong
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.2 / pp.163-170 / 2022
  • LEA is a lightweight block cipher developed in Korea in 2013. In this paper, among block cipher modes of operation, we implement the CTR mode and the GCM mode, which provides both confidentiality and integrity. For the LEA-CTR mode, we propose an optimized implementation that exploits the fixed nonce value of CTR mode: fixing part of the state removes operations between states, and pre-computation removes further operations. We also show that the proposed method is applicable to the GCM mode, and we implement GCM by implementing the GHASH function using multiplication in the Galois field GF($2^{128}$). As a result, LEA-CTR with the proposed technique on 32-bit RISC-V showed a 2% performance improvement over the previous study. In addition, the performance of the GCM mode is presented so that it can serve as a performance reference for future studies.
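
The GHASH function mentioned in this abstract is built on multiplication in GF($2^{128}$). The Python sketch below follows the bit-by-bit multiplication as specified for GCM in NIST SP 800-38D, with blocks held as 128-bit integers whose most significant bit is the first bit of the block. It is a reference-style illustration only. It is neither constant-time nor the optimized RISC-V implementation from the paper, and the names `gf128_mul` and `ghash` are chosen here for illustration.

```python
from typing import Iterable

# Reduction constant for x^128 + x^7 + x^2 + x + 1 in GCM's bit ordering.
R = 0xE1 << 120


def gf128_mul(x: int, y: int) -> int:
    """Multiply two 128-bit blocks in GF(2^128) using GCM's bit ordering."""
    z = 0
    v = y
    for i in range(128):
        if (x >> (127 - i)) & 1:  # bit i of x, counted from the most significant bit
            z ^= v
        if v & 1:                 # multiply v by x and reduce modulo the field polynomial
            v = (v >> 1) ^ R
        else:
            v >>= 1
    return z


def ghash(h: int, blocks: Iterable[int]) -> int:
    """Fold 128-bit blocks with the hash subkey h: Y_i = (Y_{i-1} XOR X_i) * H."""
    y = 0
    for block in blocks:
        y = gf128_mul(y ^ block, h)
    return y
```

In this bit ordering the multiplicative identity is the block `1 << 127`, so `gf128_mul(a, 1 << 127)` returns `a`. The loop performs 128 conditional XORs and shifts per block.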

Efficient Implementation of Single Error Correction and Double Error Detection Code with Check Bit Pre-computation for Memories

  • Cha, Sanguhn;Yoon, Hongil
    • JSTS: Journal of Semiconductor Technology and Science / v.12 no.4 / pp.418-425 / 2012
  • In this paper, an efficient implementation of error correction code (ECC) processing circuits based on a single error correction and double error detection (SEC-DED) code with check bit pre-computation is proposed for memories. During the write operation of the memory, check bit pre-computation eliminates the overall-bits computation required to detect a double error, thereby reducing the complexity of the ECC processing circuits. To implement the ECC processing circuits with check bit pre-computation more efficiently, suitable SEC-DED codes are proposed. The H-matrix of the proposed SEC-DED code is the same as that of the odd-weight-column code during the write operation, and during the read operation it is obtained by replacing 0's with 1's in the last row of the H-matrix of the odd-weight-column code. Compared with a conventional implementation using the odd-weight-column code, the implementation based on the proposed SEC-DED code with check bit pre-computation reduces the number of gates, latency, and power consumption of the ECC processing circuits by up to 9.3%, 18.4%, and 14.1%, respectively, for 64 data bits in a word.
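
As background for the odd-weight-column code that this abstract uses as its baseline, the Python sketch below implements a small (8,4) SEC-DED code whose H-matrix columns all have odd weight. A single-bit error then yields an odd-weight syndrome equal to one column, while a double-bit error yields a nonzero even-weight syndrome. This toy code is an assumption for illustration. It is not the 64-bit code from the paper and does not include the check bit pre-computation.

```python
from typing import Optional, Tuple

DATA_COLUMNS = [0b0111, 0b1011, 0b1101, 0b1110]   # weight-3 columns for data bits 0-3
CHECK_COLUMNS = [0b0001, 0b0010, 0b0100, 0b1000]  # weight-1 columns for check bits 4-7
COLUMNS = DATA_COLUMNS + CHECK_COLUMNS            # codeword bit i uses COLUMNS[i]


def syndrome(word: int) -> int:
    """XOR together the H-matrix columns selected by the 1 bits of the word."""
    s = 0
    for i, column in enumerate(COLUMNS):
        if (word >> i) & 1:
            s ^= column
    return s


def encode(data: int) -> int:
    """Append four check bits to a 4-bit value (data in bits 0-3, checks in bits 4-7)."""
    data &= 0xF
    checks = syndrome(data)  # only data columns are selected here
    return data | (checks << 4)


def decode(word: int) -> Tuple[Optional[int], str]:
    """Return (data, status); status is 'ok', 'corrected', or 'double_error'."""
    s = syndrome(word)
    if s == 0:
        return word & 0xF, "ok"
    if bin(s).count("1") % 2 == 1:   # odd weight: single-bit error at the matching column
        word ^= 1 << COLUMNS.index(s)
        return word & 0xF, "corrected"
    return None, "double_error"      # nonzero even weight: two bits flipped
```

Because the parity of the syndrome weight separates single from double errors, no separate overall parity bit is needed in this construction.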

Reusable Design of 32-Bit RISC Processor for System-on-a-Chip (SOC 설계를 위한 저전력 32-비트 RISC 프로세서의 재사용 가능한 설계)

  • 이세환;곽승호;양훈모;이문기
    • Proceedings of the IEEK Conference / 2001.06b / pp.105-108 / 2001
  • A 32-bit RISC core is designed for embedded applications and DSP. This processor offers low power consumption through fully static operation and compact code size through an efficient instruction set. Processor performance is improved by using conditional instruction execution, block data transfer instructions, multiplication instructions, and a banked register file structure. To support the compact code size of embedded applications, it is capable of executing both 16-bit and 32-bit instructions through mixed-mode instruction conversion. Furthermore, for fast MAC operation in DSP applications, the processor has a dedicated hardware multiplier, which can complete a 32-bit by 32-bit integer multiplication within seven clock cycles. These features result in high instruction throughput and real-time interrupt response. The chip is implemented in 0.35 $\mu\textrm{m}$, 4-metal CMOS technology and consists of about 50K gate equivalents.
