Search | Korea Science

Analysis of Deep learning Quantization Technology for Micro-sized IoT devices (초소형 IoT 장치에 구현 가능한 딥러닝 양자화 기술 분석)

YoungMin KIM;KyungHyun Han;Seong Oun Hwang
- Journal of Internet of Things and Convergence / v.9 no.1 / pp.9-17 / 2023
Deep learning, which requires a large amount of computation, is difficult to implement on micro-sized IoT devices or mobile devices. Recently, lightweight deep learning technologies have been introduced so that deep learning can be implemented even on small devices by reducing the amount of computation of the model. Quantization is a lightweight technique that efficiently reduces the memory footprint and size of the model by expressing continuously distributed parameter values as discrete values with a fixed number of bits. However, the discrete representation reduces the accuracy of the model. In this paper, we introduce various quantization techniques that compensate for this accuracy loss. We selected APoT and EWGS from existing quantization techniques and comparatively analyzed them through experiments. The selected techniques were trained and tested with the CIFAR-10 or CIFAR-100 datasets on the ResNet model. Through analysis of the experimental results, we identified problems with these techniques and presented directions for future research.
https://doi.org/10.20465/KIOTS.2023.9.1.009
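
The idea of expressing continuous parameter values as fixed-bit discrete values can be illustrated with a minimal sketch. This is plain uniform quantization, not APoT or EWGS, and the function name `quantize_uniform` and the sample weights are assumptions made for illustration only.

```python
def quantize_uniform(weights, num_bits):
    """Map each float weight to one of 2**num_bits evenly spaced levels.

    Illustrative uniform quantizer (an assumption, not the APoT or EWGS
    scheme analyzed in the paper). Returns the dequantized weights and
    the step size between adjacent levels.
    """
    lo, hi = min(weights), max(weights)
    steps = 2 ** num_bits - 1              # intervals between lo and hi
    if hi == lo:
        return list(weights), 0.0
    step = (hi - lo) / steps
    # Integer codes in [0, steps]; these are what would be stored.
    codes = [round((w - lo) / step) for w in weights]
    dequantized = [lo + c * step for c in codes]
    return dequantized, step


# Hypothetical weights; fewer bits give a coarser grid and a larger error.
weights = [-0.8, -0.31, 0.0, 0.12, 0.47, 0.9]
for bits in (2, 4, 8):
    q, step = quantize_uniform(weights, bits)
    max_err = max(abs(a - b) for a, b in zip(weights, q))
    print(bits, round(step, 5), round(max_err, 5))
```

The rounding error per weight is bounded by half the step size, which is the accuracy loss that quantization-aware techniques try to compensate for.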

A Time-Domain Parameter Extraction Method for Speech Recognition using the Local Peak-to-Peak Interval Information (국소 극대-극소점 간의 간격정보를 이용한 시간영역에서의 음성인식을 위한 파라미터 추출 방법)

임재열;김형일;안수길
- Journal of the Korean Institute of Telematics and Electronics B / v.31B no.2 / pp.28-34 / 1994
In this paper, a new time-domain parameter extraction method for speech recognition is proposed. The suggested method is based on the fact that the local peak-to-peak interval, i.e., the interval between maxima and minima of the speech waveform, is closely related to the frequency components of the speech signal. The parameterization is achieved by a kind of filter bank technique in the time domain. To test the proposed parameter extraction method, an isolated word recognizer based on Vector Quantization and a Hidden Markov Model was constructed. As test material, 22 words spoken by ten males were used, and a recognition rate of 92.9% was obtained. This result leads to the conclusion that the new parameter extraction method can be used in speech recognition systems. Since the proposed method operates in the time domain, real-time parameter extraction can be implemented on a personal-computer-class machine equipped only with an A/D converter, without any DSP board.
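
The relation between the local peak-to-peak interval and frequency can be sketched as follows. This only illustrates the underlying observation on a pure tone; it is not the paper's filter-bank-style parameterization, and the function names and the 8 kHz sample rate are assumptions.

```python
import math


def local_extrema(samples):
    """Indices of strict local maxima and minima of a sampled waveform."""
    idx = []
    for i in range(1, len(samples) - 1):
        prev, cur, nxt = samples[i - 1], samples[i], samples[i + 1]
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            idx.append(i)
    return idx


def peak_to_peak_intervals(samples):
    """Sample counts between consecutive extrema (maximum to minimum or back)."""
    ext = local_extrema(samples)
    return [b - a for a, b in zip(ext, ext[1:])]


def estimate_frequency(samples, sample_rate):
    """One maximum-to-minimum interval spans half a period of a pure tone."""
    intervals = peak_to_peak_intervals(samples)
    if not intervals:
        return None
    mean_interval = sum(intervals) / len(intervals)
    return sample_rate / (2 * mean_interval)


# Hypothetical test tone: 500 Hz sampled at an assumed 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 500 * n / sample_rate) for n in range(400)]
print(estimate_frequency(tone, sample_rate))
```

Because the intervals are found by comparing neighbouring samples only, no transform or DSP hardware is needed, which matches the abstract's point about running on a machine with just an A/D converter.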

A Study on Isolated Word Recognition using Improved Multisection Vector Quantization Recognition System (개선된 MSVQ 인식 시스템을 이용한 단독어 인식에 관한 연구)

An, Tae-Ok;Kim, Nam-Joong;Song, Chul;Kim, Soon-Hyeob
- The Journal of Korean Institute of Communications and Information Sciences / v.16 no.2 / pp.196-205 / 1991
This paper studies speaker-independent isolated word recognition and proposes an improved MSVQ (multisection vector quantization) recognition system that improves on the classical MSVQ recognition system. The difference is that the test pattern has one more section than the reference pattern in the recognition system. 146 DDD area names were selected as the recognition vocabulary. 12th-order LPC cepstral coefficients are used as the feature parameters, and MINSUM and MINMAX are used to find the centroid when the codebook is generated. The experimental results show that this method is better than VQ (vector quantization) recognition methods, DTW (dynamic time warping) pattern matching methods, and classical MSVQ methods in terms of recognition rate and recognition time.

Adaptive Watermarking Using Successive Subband Quantization and Perceptual Model Based on Multiwavelet Transform Domain (멀티웨이브릿 변환 영역 기반의 연속 부대역 양자화 및 지각 모델을 이용한 적응 워터마킹)

권기룡;이준재
- Journal of Korea Multimedia Society / v.6 no.7 / pp.1149-1158 / 2003
A content-adaptive watermark embedding algorithm using a stochastic image model in the multiwavelet transform domain is proposed in this paper. A watermark is embedded into the perceptually significant coefficients (PSCs) of each subband using the multiwavelet transform. The PSCs in the high-frequency subbands are selected by SSQ (successive subband quantization), that is, by setting the threshold to one half of the largest coefficient in each subband. The perceptual model is applied with a stochastic approach based on the noise visibility function (NVF), which captures local image properties, for watermark embedding. This model uses the characteristics of a stationary generalized Gaussian model because the watermark has noise-like properties. The watermark estimation uses the shape parameter and variance of each subband region, from which content-adaptive criteria are derived for edge, texture, and flat regions. Experimental results show that the proposed watermark embedding method based on the multiwavelet transform provides excellent invisibility and robustness.
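
The threshold rule described for selecting PSCs can be sketched for a single pass over one subband. The function name `select_pscs`, the strict `>` comparison, and the coefficient values are assumptions; the abstract gives only the half-of-largest threshold, and any successive refinement is not shown.

```python
def select_pscs(subband):
    """Pick coefficients whose magnitude exceeds half the largest magnitude.

    Single-pass illustration of the threshold rule in the abstract.
    Whether the comparison is strict is an assumption.
    """
    largest = max(abs(c) for c in subband)
    threshold = largest / 2
    selected = [i for i, c in enumerate(subband) if abs(c) > threshold]
    return threshold, selected


# Hypothetical coefficients of one high-frequency subband.
subband = [0.5, -8.0, 3.9, 4.1, -6.0, 0.0]
threshold, indices = select_pscs(subband)
print(threshold, indices)
```

Because the threshold is computed per subband, each subband contributes its own strongest coefficients regardless of its overall energy.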

Design of the Vector-Scalar Quantizer of LSP Parameters for Wideband Speech Coder (광대역 음성부호화기를 위한 백터-스칼라 LSP 파라미터 양자화기 설계)

신재현;이인성;지덕구;윤병식;최송인
- Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.4 / pp.286-291 / 2003
In this paper, we design an LSP (Line Spectral Pairs) parameter quantizer with a cascaded structure of a vector quantizer and a scalar quantizer for a wideband speech coder. We chose 16th-order LP coefficients. These coefficients are then transformed into LSP parameters, which have excellent quantization properties and allow easy stability checking of the synthesis filter. In the first stage of quantization, the input LSP parameters are split-vector-quantized using two 8th-order codebooks. In the second stage, the components of the residual vector are individually quantized by the scalar quantizer, utilizing the ordering property of LSP parameters. The designed adaptive VQ-SQ quantizer, using 35 bits/frame, achieves wideband transparency, meaning that the average spectral distortion (SD) is less than 1.6 dB and fewer than 4% of the frames have an SD above 3 dB. The simulation results show that the designed quantizer saves 2-3 bits/frame over the typical vector-scalar quantizer.
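
The cascaded structure (split vector quantization followed by scalar quantization of the residual) can be sketched as below. This is a simplified illustration under stated assumptions: the codebooks, the uniform step size, and all function names are hypothetical, and the sketch omits the adaptive part and the use of the LSP ordering property mentioned in the abstract.

```python
def nearest_codeword(vector, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    def sqdist(cw):
        return sum((v - c) ** 2 for v, c in zip(vector, cw))
    return min(range(len(codebook)), key=lambda i: sqdist(codebook[i]))


def vq_sq_encode(lsp, codebook_low, codebook_high, step):
    """Stage 1: split VQ on the two halves. Stage 2: per-component scalar
    quantization of the residual with a uniform step (an assumption)."""
    half = len(lsp) // 2
    i_low = nearest_codeword(lsp[:half], codebook_low)
    i_high = nearest_codeword(lsp[half:], codebook_high)
    coarse = codebook_low[i_low] + codebook_high[i_high]   # list concatenation
    residual_codes = [round((x - c) / step) for x, c in zip(lsp, coarse)]
    return i_low, i_high, residual_codes


def vq_sq_decode(i_low, i_high, residual_codes, codebook_low, codebook_high, step):
    """Rebuild the vector from both codeword indices and the residual codes."""
    coarse = codebook_low[i_low] + codebook_high[i_high]
    return [c + r * step for c, r in zip(coarse, residual_codes)]


# Toy 4-dimensional example (the paper uses 16 dimensions split into 8 + 8).
cb_low = [[0.10, 0.20], [0.15, 0.30]]
cb_high = [[0.50, 0.70], [0.60, 0.80]]
lsp = [0.14, 0.29, 0.61, 0.78]
code = vq_sq_encode(lsp, cb_low, cb_high, step=0.01)
print(code, vq_sq_decode(*code, cb_low, cb_high, 0.01))
```

Splitting the vector keeps each codebook search small, while the scalar stage refines whatever error the coarse codewords leave behind.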

An Adaptive Rate Control Using Piecewise Linear Approximation Model (부분 선형 근사 모델을 이용한 적응적 비트율 제어)

조창형;정제창;최병욱
- Journal of Broadcast Engineering / v.2 no.2 / pp.194-205 / 1997
In video compression standards such as MPEG and H.263, rate control is one of the key components for good coding performance. This paper presents a simple adaptive rate control scheme using a piecewise linear approximation model. While the conventional buffer control approach adjusts the quantization parameter linearly according to the buffer fullness, the proposed approach uses a piecewise linear approximation model derived from the logarithmic relation between the quantization parameter and bitrate in data compression. In addition, a forward analyzer operating in the spatial domain is used to improve image quality. Simulation results demonstrate that the proposed method performs better than the conventional one and reduces the fluctuation of the PSNR per frame while maintaining the quality of the reconstructed frames at a relatively stable level.
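
As a rough illustration of approximating a logarithmic relation with linear segments, the sketch below interpolates ln(x) between a few breakpoints. The breakpoints, the function names, and the choice of ln(x) itself are assumptions; the abstract does not give the paper's actual model or its parameters.

```python
import math
from bisect import bisect_right


def make_piecewise_log(breakpoints):
    """Build a piecewise linear approximation of ln(x) through the breakpoints."""
    xs = sorted(breakpoints)
    ys = [math.log(x) for x in xs]

    def approx(x):
        if not xs[0] <= x <= xs[-1]:
            raise ValueError("x outside the approximated range")
        # Segment containing x; clamp so the last breakpoint uses the last segment.
        i = min(bisect_right(xs, x) - 1, len(xs) - 2)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return approx


# Hypothetical breakpoints over the quantizer range 1..31.
approx_ln = make_piecewise_log([1, 2, 4, 8, 16, 31])
worst = max(math.log(q) - approx_ln(q) for q in range(1, 32))
print(round(worst, 4))
```

Each evaluation needs only one multiplication and one addition once the segment is found, which is the practical appeal of a piecewise linear model over evaluating a logarithm per frame.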

Quantization Parameter Selection Method For H.264-based Multi-view Video Coding (H.264 기반 다시점 비디오 부호화를 위한 양자화 계수 결정 방법)

Park, Pil-Kyu;Ho, Yo-Sung
- The Journal of Korean Institute of Communications and Information Sciences / v.32 no.6C / pp.579-584 / 2007
Recently, various prediction structures have been proposed to exploit inter-view correlation among multi-view video sequences. In this paper, we propose a QP (quantization parameter) selection method for the B frame inserted in the first frames of each GOP (group of pictures), where we change the QP for the B frame adaptively to achieve uniform picture quality and overall coding gain. Each B frame is coded with reference to two frames in its adjacent views. We calculate the QP for the B frame based on the correlation between the two reference frames, calculated using their rate-distortion costs. By applying the proposed method to the MVC reference prediction structure, we improved the coding gain by 0.09 to 0.16 dB.

PSNR-based Initial QP Determination for Low Bit Rate Video Coding

Park, Sang-Hyun
- Journal of information and communication convergence engineering / v.10 no.3 / pp.315-320 / 2012
In H.264/AVC, the first frame of a group of pictures (GOP) is encoded in intra mode which generates a large number of bits. The number of bits for the I-frame affects the qualities of the following frames of a GOP since they are encoded using the bits remaining among the bits allocated to the GOP. In addition, the first frame is used for the inter mode encoding of the following frames. Thus, the initial quantization parameter (QP) affects the following frames as well as the first frame. In this paper, an adaptive peak signal to noise ratio (PSNR)-based initial QP determination algorithm is presented. In the proposed algorithm, a novel linear model is established based on the observation of the relation between the initial QPs and PSNRs of frames. Using the linear model and PSNR results of the encoded GOPs, the proposed algorithm accurately estimates the optimal initial QP which maximizes the PSNR of the current GOP. It is shown by experimental results that the proposed algorithm predicts the optimal initial QP accurately and thus achieves better PSNR performance than that of the existing algorithm.
https://doi.org/10.6109/jicce.2012.10.3.315
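
For reference, the PSNR quantity that the algorithm relates to the initial QP can be computed as in the sketch below. It uses the standard definition for 8-bit samples; the function name and the sample values are assumptions, and the paper's linear QP-PSNR model is not reproduced here.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf        # identical signals
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical pixel rows: a coarser quantizer leaves larger errors, lower PSNR.
original = [100, 100, 100, 100]
fine = [101, 99, 101, 99]      # error of 1 per sample
coarse = [104, 96, 104, 96]    # error of 4 per sample
print(round(psnr(original, fine), 2), round(psnr(original, coarse), 2))
```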

Speaker Identification Based on Vowel Classification and Vector Quantization (모음 인식과 벡터 양자화를 이용한 화자 인식)

Lim, Chang-Heon;Lee, Hwang-Soo;Un, Chong-Kwan
- The Journal of the Acoustical Society of Korea / v.8 no.4 / pp.65-73 / 1989
In this paper, we propose a text-independent speaker identification algorithm based on VQ (vector quantization) and vowel classification, and its performance is studied and compared with that of a conventional speaker identification algorithm using VQ. The proposed speaker identification algorithm is composed of three processes: vowel segmentation, vowel recognition, and average distortion calculation. The vowel segmentation is performed automatically using RMS energy, BTR (Back-to-Total cavity volume Ratio), and SFBR (Signed Front-to-Back maximum area Ratio) extracted from the input speech signal. If the input speech signal is noisy, particularly when the SNR is around 20 dB, the proposed speaker identification algorithm performs better than the reference speaker identification algorithm when the vowel segmentation is done correctly. The same result is obtained when noisy telephone speech is used as the input.
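
The average distortion calculation used in VQ-based speaker identification can be sketched as follows. This shows only the generic VQ decision rule (pick the speaker whose codebook yields the lowest average distortion); the vowel segmentation and vowel recognition steps are not modeled, and the function names, feature vectors, and codebooks are hypothetical.

```python
def average_distortion(features, codebook):
    """Mean squared distance from each feature vector to its nearest codeword."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    total = sum(min(sqdist(f, cw) for cw in codebook) for f in features)
    return total / len(features)


def identify_speaker(features, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda name: average_distortion(features, codebooks[name]))


# Hypothetical 2-D feature vectors and per-speaker codebooks.
codebooks = {
    "speaker_a": [[0.0, 0.0], [1.0, 1.0]],
    "speaker_b": [[5.0, 5.0], [6.0, 4.0]],
}
utterance = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8]]
print(identify_speaker(utterance, codebooks))
```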

Analysis of Quantization Parameter of Key Pictures in Distributed Video Coding (분산비디오 기술의 율 왜곡 성능 개선을 위한 키 픽처의 양자화 계수 분석)

Eun, Hyun;Shim, Hiuk Jae;Jeon, Byeungwoo
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2010.11a / pp.239-241 / 2010
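One of the representative techniques in distributed video coding is Wyner-Ziv coding. In the Wyner-Ziv coding architecture, side information is generated from intra-decoded key pictures. The objective quality of the key pictures strongly affects the quality of the side information, and when noisy side information is used for decoding, many parity bits are requested from the encoder, which degrades rate-distortion performance. Conventional coding techniques use quantization parameters predefined according to the quantization matrix when coding key pictures. This paper proposes coding the key pictures with quantization parameter values lower than the predefined ones. The proposed method improves the quality of the side information by raising the objective quality of the key pictures. Side information with less noise improves rate-distortion performance in Wyner-Ziv decoding. Experimental results show a performance gain of up to 0.7 dB over the conventional method.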

Search Result 145, Processing Time 0.029 seconds
