Search | Korea Science

Bit Operation Optimization and DNN Application using GPU Acceleration (GPU 가속기를 통한 비트 연산 최적화 및 DNN 응용)

Kim, Sang Hyeok;Lee, Jae Heung
- Journal of IKEEE / v.23 no.4 / pp.1314-1320 / 2019
In this paper, we propose a new method for optimizing bit operations and applying them to a deep neural network (DNN) in a software environment. The method consists of a packing function for bitwise optimization and a masking matrix multiplication operation for applying the packed values to the DNN. The packing function converts a 32-bit real value to a 2-bit quantized value through a threshold comparison. Once packing is complete, four 32-bit real values are stored as one 8-bit value. The masking matrix multiplication is a special operation that multiplies the packed weight values by the ordinary input values. Each operation is processed in parallel on a GPU accelerator. In the experiment, the method used about 16 times less memory than the 32-bit DNN model, while accuracy stayed within 1% of the 32-bit model.
https://doi.org/10.7471/ikeee.2019.23.4.1314
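
The packing step described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the threshold values or the bit order inside the byte, so the three thresholds and the "first value in the highest bits" layout below are assumptions, and the function names are hypothetical. The paper's GPU parallelization and its masking matrix multiplication are not shown.

```python
def quantize_2bit(x, t_low=-0.5, t_mid=0.0, t_high=0.5):
    """Map a real value to a 2-bit code (0..3) by threshold comparison.

    The three thresholds are illustrative assumptions, not values from the paper.
    """
    return int(x > t_low) + int(x > t_mid) + int(x > t_high)


def pack4(values):
    """Pack four real values into one 8-bit integer.

    Assumed layout: the first value occupies the two highest bits.
    """
    if len(values) != 4:
        raise ValueError("pack4 expects exactly four values")
    byte = 0
    for v in values:
        byte = (byte << 2) | quantize_2bit(v)
    return byte


def unpack4(byte):
    """Recover the four 2-bit codes from a packed byte."""
    return [(byte >> shift) & 0b11 for shift in (6, 4, 2, 0)]
```

Four 32-bit values occupy 16 bytes and pack into 1 byte, which matches the roughly 16-fold memory reduction reported in the abstract.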

A study on the programming conditions suppressing the lateral diffusion of charges for the SONOS two-bit memory (SONOS two-bit 메모리의 측면확산에 영향을 주는 programming 조건 연구)

Lee, Myung-Shik;An, Ho-Myung;Seo, Kwang-Yell;Koh, Jung-Hyuk;Kim, Byung-Cheul;Kim, Joo-Yeon
- Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2005.11a / pp.117-120 / 2005
The SONOS devices were fabricated in a conventional $0.35\,\mu\mathrm{m}$ complementary metal-oxide-semiconductor (CMOS) process with a NOR array. Two-bit operation using a conventional process achieves higher memory density than other two-bit memories. Lateral diffusion of charges during two-bit operation causes soft errors in the memory. In this study, the programming conditions are investigated in order to reduce lateral diffusion in the two-bit operation of a CSL-NOR type SONOS flash cell.

Programming Characteristics of the Multi-bit Devices Based on SONOS Structure (SONOS 구조를 갖는 멀티 비트 소자의 프로그래밍 특성)

Kim, Joo-Yeon
- Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.16 no.9 / pp.771-774 / 2003
In this paper, the programming characteristics of multi-bit devices based on the SONOS structure are investigated. The devices were fabricated in a $0.35\,\mu\mathrm{m}$ complementary metal-oxide-semiconductor (CMOS) process with LOCOS isolation. To achieve multi-bit operation per cell, charges must be locally trapped in the nitride layer above the channel near the source-drain junction. Channel hot electron (CHE) injection, which allows localized trapping in the nitride film, is selected as the programming method. To demonstrate CHE injection, the substrate current (Isub) and the one-shot programming curve are investigated. Multi-bit operation storing two bits per cell is investigated. Hot hole (HH) injection is used for fast erasing. The fabricated SONOS devices have ultra-thin gate dielectrics and therefore have a lower programming voltage, a simpler process, and better scalability than other multi-bit storage flash memories. The programming characteristics are shown to be the most promising for multi-bit flash memory.
https://doi.org/10.4313/JKEM.2003.16.9.771

Realization of Two-bit Operation by Bulk-biased Programming Technique in SONOS NOR Array with Common Source Lines

An, Ho-Myoung;Seo, Kwang-Yell;Kim, Joo-Yeon;Kim, Byung-Cheul
- Transactions on Electrical and Electronic Materials / v.7 no.4 / pp.180-183 / 2006
We report for the first time the two-bit operational characteristics of a high-density NOR-type polysilicon-oxide-nitride-oxide-silicon (SONOS) array with a common source line (CSL). Two-bit-per-cell operation causes undesired disturbance, especially drain disturbance, in the NOR array with CSL. To solve this problem, we propose an efficient bulk-biased programming technique, in which a bulk bias is additionally applied to the substrate of the memory cell to decrease the electric field between the nitride layer and the drain region. The proposed programming technique shows characteristics free of drain disturbance. As a result, we achieve a reliable two-bit SONOS array by employing the proposed technique.
https://doi.org/10.4313/TEEM.2006.7.4.180

Distributed Bit Loading and Power Control Algorithm to Increase System Throughput of Ad-hoc Network (Ad-hoc 네트워크의 Throughput 향상을 위한 적응적 MCS 레벨 기반의 분산형 전력 제어 알고리즘)

Kim, Young-Bum;Wang, Yu-Peng;Chang, Kyung-Hi;Yun, Chang-Ho;Park, Jong-Won;Lim, Yong-Kon
- The Journal of Korean Institute of Communications and Information Sciences / v.35 no.4A / pp.315-321 / 2010
In ad-hoc networks, centralized power control is not suitable because there are no base stations to perform the power control operation and optimize system performance. Each node must therefore run the power control algorithm in a distributed manner. The conventional distributed power control algorithm does not consider adaptive bit loading, which changes the MCS (modulation and coding scheme) according to the received SINR (signal to interference and noise ratio), and this limits system throughput. In this paper, we propose a novel distributed bit loading and power control algorithm that takes adaptive bit loading into account to increase total system throughput and decrease outage probability. Simulation results show that the proposed algorithm performs much better than the conventional algorithm.
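
To make the two ingredients named in this abstract concrete, the following Python sketch pairs a threshold-based MCS lookup (adaptive bit loading) with a single per-node power update. It is not the paper's algorithm: the MCS table values are invented for illustration, the update rule is the classic target-SINR-tracking step used as a stand-in, and all names are hypothetical.

```python
# Hypothetical MCS table: (minimum SINR in dB, bits per symbol).
# The values are illustrative assumptions, not taken from the paper.
MCS_TABLE = [(3.0, 1), (9.0, 2), (15.0, 4), (21.0, 6)]


def select_mcs(sinr_db):
    """Return the bits per symbol of the highest MCS whose threshold is met.

    A return value of 0 means no MCS is supported (outage).
    """
    bits = 0
    for threshold_db, bits_per_symbol in MCS_TABLE:
        if sinr_db >= threshold_db:
            bits = bits_per_symbol
    return bits


def update_power(power_mw, sinr_db, target_sinr_db, max_power_mw):
    """One distributed power update using only locally measured SINR.

    Scales the transmit power by target/measured SINR (linear scale)
    and caps it at the node's maximum power.
    """
    ratio = 10 ** ((target_sinr_db - sinr_db) / 10)
    return min(max_power_mw, power_mw * ratio)
```

Each node would call `update_power` with its own measured SINR, so no base station is needed; choosing `target_sinr_db` from the MCS table is one plausible way to couple the two steps.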

Accuracy Improvement Method for 1-Bit Convolutional Neural Network (1-Bit 합성곱 신경망을 위한 정확도 향상 기법)

Im, Sung-Hoon;Lee, Jae-Heung
- Journal of IKEEE / v.22 no.4 / pp.1115-1122 / 2018
In this paper, we analyze the performance degradation of the previous 1-bit convolutional neural network method and introduce ways to mitigate it. Previous work applies 32-bit operations to the first and last layers; our method applies 32-bit operations to the second layer as well. We also show that the nonlinear activation function can be removed after binarizing inputs and weights. To verify the proposed method, we experiment with an object detection neural network for Korean license plate detection. Our method achieves 96.1% accuracy, whereas the existing method achieves 74% accuracy.
https://doi.org/10.7471/ikeee.2018.22.4.1115
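
Once inputs and weights are binarized, a dot product can be computed with bitwise operations alone. The Python sketch below shows the widely used XNOR-and-popcount formulation; the abstract does not say which formulation the authors use, so this is an assumption, and the function names and the "value >= 0 maps to +1" convention are hypothetical.

```python
def binarize(values):
    """Pack the signs of real values into an integer.

    Bit i is 1 when values[i] >= 0 (interpreted as +1) and 0 otherwise (-1).
    """
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n vectors over {-1, +1} from their packed bits.

    XNOR marks positions where the signs agree; each agreement contributes +1
    and each disagreement -1, so the result is 2 * agreements - n.
    """
    mask = (1 << n) - 1
    agreements = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * agreements - n
```

For example, `binary_dot(binarize([1, -1, 1, 1]), binarize([1, 1, -1, 1]), 4)` evaluates the sign product sum 1 - 1 - 1 + 1.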

A design of Software 2D BitBLT Engine based on RTOS (RTOS 기반의 소프트웨어 2D BitBLT 엔진의 설계)

Kim, Bong-Joo;Hong, Jiman
- Journal of the Korea Society of Computer and Information / v.19 no.4 / pp.35-41 / 2014
In this paper, we propose an implementation of a software-based 2D BitBLT engine on the pSOS operating system and verify its operation on a patient monitoring device. To verify the proposed method on the patient monitoring device, we designed a prototype PCB and a motherboard based on an ARM9 CPU. Because the hardware-based BitBLT module was replaced with a software-based one, the CPU load increased. To solve this problem, we replaced the 200 MHz processor with a 400 MHz processor. We implemented the 2D BitBLT kernel module as a device driver, which is one of the key elements of the graphics controller for the GUI of the patient monitoring device.
https://doi.org/10.9708/jksci.2014.19.4.035
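
For readers unfamiliar with BitBLT (bit block transfer), the core of a software engine is a rectangle copy that combines source and destination pixels with a raster operation. The Python sketch below assumes 8-bit pixels in row-major buffers and performs no clipping; these choices and all names are hypothetical and do not describe the paper's pSOS driver.

```python
def rop_copy(src_pixel, dst_pixel):
    """Raster operation: overwrite the destination with the source."""
    return src_pixel


def rop_xor(src_pixel, dst_pixel):
    """Raster operation: XOR the source into the destination."""
    return src_pixel ^ dst_pixel


def bitblt(dst, dst_width, dx, dy, src, src_width, sx, sy, w, h, rop=rop_copy):
    """Transfer a w x h block from src at (sx, sy) to dst at (dx, dy).

    Both buffers are bytearrays of 8-bit pixels in row-major order.
    The caller must ensure the rectangle fits inside both buffers.
    """
    for row in range(h):
        s_off = (sy + row) * src_width + sx
        d_off = (dy + row) * dst_width + dx
        for col in range(w):
            dst[d_off + col] = rop(src[s_off + col], dst[d_off + col]) & 0xFF
```

Running this inner loop on the CPU for every screen update is the kind of work that a hardware BitBLT module would otherwise absorb, which is consistent with the increased CPU load the abstract reports.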

Implementation of LEA Lightweight Block Cipher GCM Operation Mode on 32-Bit RISC-V (32-Bit RISC-V상에서의 LEA 경량 블록 암호 GCM 운용 모드 구현)

Eum, Si-Woo;Kwon, Hyeok-Dong;Kim, Hyun-Ji;Yang, Yu-Jin;Seo, Hwa-Jeong
- Journal of the Korea Institute of Information Security & Cryptology / v.32 no.2 / pp.163-170 / 2022
LEA is a lightweight block cipher developed in Korea in 2013. In this paper, among the block cipher modes of operation, we implement the CTR mode and the GCM mode, which provides confidentiality and integrity. For the LEA-CTR mode, we propose an optimized implementation that exploits the fixed nonce value of the CTR mode: fixing the state lets operations between states be omitted, and pre-computation lets further operations be omitted. We also show that the proposed method is applicable to the GCM mode, and implement GCM by implementing the GHASH function using multiplication in the Galois field GF(2¹²⁸). As a result, LEA-CTR with the proposed technique on 32-bit RISC-V performs 2% better than the previous study. In addition, the performance of the GCM mode is presented so that it can serve as a performance reference for future studies.
https://doi.org/10.13089/JKIISC.2022.32.2.163
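
The GF(2¹²⁸) multiplication behind GHASH can be sketched in Python with the bit-by-bit method from the GCM specification. This is a plain reference-style sketch, not the optimized RISC-V implementation from the paper; blocks are assumed to be 128-bit integers obtained with `int.from_bytes(block, "big")`, and the function names are hypothetical.

```python
# Reduction constant for GCM's field polynomial x^128 + x^7 + x^2 + x + 1,
# written in GCM's bit order (the leftmost bit is the coefficient of x^0).
R = 0xE1 << 120


def gf128_mul(x, y):
    """Multiply two 128-bit field elements in GCM's bit ordering."""
    z = 0
    v = y
    for i in range(128):
        # Bit i counted from the left of x, i.e. the coefficient of x^i.
        if (x >> (127 - i)) & 1:
            z ^= v
        # Multiply v by the field variable: shift right, reduce on carry-out.
        if v & 1:
            v = (v >> 1) ^ R
        else:
            v >>= 1
    return z


def ghash(h, blocks):
    """Fold 128-bit blocks into a hash: Y = (Y xor block) * H for each block."""
    y = 0
    for block in blocks:
        y = gf128_mul(y ^ block, h)
    return y
```

In this representation the multiplicative identity is `1 << 127`, because the leftmost bit holds the constant term of the polynomial.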

Efficient Implementation of Single Error Correction and Double Error Detection Code with Check Bit Pre-computation for Memories

Cha, Sanguhn;Yoon, Hongil
- JSTS: Journal of Semiconductor Technology and Science / v.12 no.4 / pp.418-425 / 2012
In this paper, an efficient implementation of error correction code (ECC) processing circuits based on a single error correction and double error detection (SEC-DED) code with check bit pre-computation is proposed for memories. During the write operation of the memory, check bit pre-computation eliminates the overall-bits computation required to detect a double error, thereby reducing the complexity of the ECC processing circuits. To implement the ECC processing circuits using check bit pre-computation more efficiently, suitable SEC-DED codes are proposed. The H-matrix of the proposed SEC-DED code is the same as that of the odd-weight-column code during the write operation, and during the read operation it is obtained by replacing 0's with 1's in the last row of the H-matrix of the odd-weight-column code. Compared with a conventional implementation using the odd-weight-column code, the implementation based on the proposed SEC-DED code with check bit pre-computation reduces the number of gates, latency, and power consumption of the ECC processing circuits by up to 9.3%, 18.4%, and 14.1%, respectively, for 64 data bits in a word.
https://doi.org/10.5573/JSTS.2012.12.4.418
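
As background for the odd-weight-column code that this abstract uses as its baseline, the Python sketch below builds a small SEC-DED code with 4 data bits and 4 check bits. It illustrates only the baseline idea (every H-matrix column has odd weight, so an odd-weight syndrome means a single error and a nonzero even-weight syndrome means a double error). It does not implement the paper's modified H-matrix or check bit pre-computation, and the code size and names are assumptions chosen for brevity.

```python
# H-matrix columns as 4-bit integers. Data columns have weight 3 and
# check columns have weight 1, so every column has odd weight.
DATA_COLS = [0b0111, 0b1011, 0b1101, 0b1110]
CHECK_COLS = [0b0001, 0b0010, 0b0100, 0b1000]
COLS = DATA_COLS + CHECK_COLS  # column i belongs to codeword bit i


def encode(data_bits):
    """Append 4 check bits to 4 data bits (lists of 0/1)."""
    check = 0
    for bit, col in zip(data_bits, DATA_COLS):
        if bit:
            check ^= col
    return list(data_bits) + [(check >> k) & 1 for k in range(4)]


def decode(word):
    """Return (data_bits or None, status) for an 8-bit received word."""
    syndrome = 0
    for bit, col in zip(word, COLS):
        if bit:
            syndrome ^= col
    if syndrome == 0:
        return list(word[:4]), "ok"
    if bin(syndrome).count("1") % 2 == 1 and syndrome in COLS:
        # Odd-weight syndrome equals the column of the single flipped bit.
        fixed = list(word)
        fixed[COLS.index(syndrome)] ^= 1
        return fixed[:4], "corrected"
    # The XOR of two distinct odd-weight columns has nonzero even weight.
    return None, "double_error"
```

Because the single/double decision depends only on the parity of the syndrome weight, this family of codes avoids a separate overall-parity bit.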

Reusable Design of 32-Bit RISC Processor for System-on-a-Chip (SOC 설계를 위한 저전력 32-비트 RISC 프로세서의 재사용 가능한 설계)

이세환;곽승호;양훈모;이문기
- Proceedings of the IEEK Conference / 2001.06b / pp.105-108 / 2001
A 32-bit RISC core is designed for embedded applications and DSP. The processor offers low power consumption through fully static operation and compact code size through an efficient instruction set. Processor performance is improved by using conditional instruction execution, block data transfer instructions, multiplication instructions, and a banked register file structure. To support the compact code size of embedded applications, it is capable of executing both 16-bit and 32-bit instructions through mixed-mode instruction conversion. Furthermore, for fast MAC operation in DSP applications, the processor has a dedicated hardware multiplier, which can complete a 32-bit by 32-bit integer multiplication within seven clock cycles. These features result in high instruction throughput and real-time interrupt response. The chip is implemented in a 0.35 $\mu\mathrm{m}$, 4-metal CMOS technology and consists of about 50K gate equivalents.

