• Title/Summary/Keyword: model quantization

A Study on the Speech Recognition Module's Design Using HMM Speech Recognition Algorithm (HMM(Hidden Markov Model) 음성인식 알고리즘을 이용한 효율적인 음성인식 모듈 개발 설계에 관한 연구)

  • 김정훈;류홍석;강재명;강성인;이상배
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.12a / pp.337-340 / 2002
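  • This paper describes the design of an embedded system for speaker-independent isolated-word recognition in a wheelchair system. Because real environments contain noise that degrades the recognition rate, noise was removed with spectral subtraction, the simplest of the noise-removal methods. The preprocessing stage used 12th-order LPC and cepstrum analysis, and a DHMM (Discrete Hidden Markov Model) served as the front-end recognizer. Applying this algorithm requires vector quantization (VQ) beforehand to simplify the data. In addition, a neural network (MLP: Multi-Layer Perceptron) was used as a post-processing recognizer to improve the recognition rate. The vocabulary for the speaker-independent system consisted of 7 words, recorded from a total of 25 male and female speakers. The hardware is built around the TMS320C32, a 32-bit floating-point processor, with 4 Mbyte of memory, and the main board design is currently nearing completion.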
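
The vector quantization step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the codebook, the feature frames, and the function names are hypothetical, and 2-dimensional vectors stand in for the 12th-order cepstral features. Each frame is mapped to the index of its nearest codeword, which yields the discrete symbol sequence a DHMM consumes.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames, codebook):
    """Map each feature vector to the index of its nearest codeword."""
    return [
        min(range(len(codebook)),
            key=lambda k: squared_distance(frame, codebook[k]))
        for frame in frames
    ]


# Hypothetical toy data: a trained codebook would normally come from
# a clustering step (e.g. k-means / LBG) over training features.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
frames = [(0.1, -0.2), (0.9, 1.2), (3.5, 0.3), (0.6, 0.7)]

symbols = quantize(frames, codebook)
print(symbols)  # discrete observation sequence for a DHMM
```

The resulting list of integer indices replaces the continuous feature vectors, which is what reduces the data volume before discrete HMM scoring.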

Robust pattern watermarking using wavelet transform and multi-weights (웨이브렛 변환과 다중 가중치를 이용한 강인한 패턴 워터마킹)

  • 김현환;김용민;김두영
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.3B / pp.557-564 / 2000
  • This paper presents a watermarking algorithm for embedding a visually recognizable pattern (mark, logo, symbol, stamp or signature) into an image. First, the color image (RGB model) is converted to the YCbCr model, and the Y component is decomposed with a 3-level wavelet transform. Next, the values to embed are assembled from the pattern watermark, a PN (pseudo-noise) code as used in spread-spectrum communication, and multilevel watermark weights. These values are inserted into the discrete wavelet domain. In our scheme, a new method computes the wavelet transform with integer values while taking the quantization error into account, and the color conversion uses fixed-point arithmetic to ease a later hardware implementation. We also devised a new solution based on multilevel thresholds that is robust to common signal distortions and malicious attacks and that improves image quality by taking the human visual system into account. The experimental results showed that the proposed watermarking algorithm was superior to other similar watermarking algorithms. It was robust to common signal processing and geometric transforms such as brightness and contrast changes, filtering, scaling, JPEG lossy compression and geometric deformation.
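
The abstract above mentions color conversion with fixed-point arithmetic. A minimal sketch of that idea for the Y (luma) component follows. The coefficients are an assumption: they are the common BT.601 weights (0.299, 0.587, 0.114) scaled by 256, not values taken from the paper, and the function names are hypothetical.

```python
def rgb_to_y_fixed(r, g, b):
    """Integer-only luma: 77/256, 150/256 and 29/256 approximate
    0.299, 0.587 and 0.114; adding 128 before the shift rounds."""
    return (77 * r + 150 * g + 29 * b + 128) >> 8


def rgb_to_y_float(r, g, b):
    """Floating-point reference using the assumed BT.601 weights."""
    return 0.299 * r + 0.587 * g + 0.114 * b


# Compare the two versions on a coarse grid of 8-bit RGB values.
max_error = max(
    abs(rgb_to_y_fixed(r, g, b) - rgb_to_y_float(r, g, b))
    for r in range(0, 256, 51)
    for g in range(0, 256, 51)
    for b in range(0, 256, 51)
)
print(max_error)  # stays below one gray level
```

Because the three integer weights sum to 256, white maps to 255 and black to 0, and only multiplications, additions and a shift are needed, which is the property that makes such a conversion convenient for hardware.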

Combination of Classifiers Decisions for Multilingual Speaker Identification

  • Nagaraja, B.G.;Jayanna, H.S.
    • Journal of Information Processing Systems / v.13 no.4 / pp.928-940 / 2017
  • State-of-the-art speaker recognition systems may work better for the English language. However, if the same system is used to recognize speakers of other languages, it may perform poorly. In this work, the decisions of a Gaussian mixture model-universal background model (GMM-UBM) and a learning vector quantization (LVQ) classifier are combined to improve the recognition performance of a multilingual speaker identification system. The two classifiers differ in their modeling techniques: the former is based on a probabilistic approach and the latter on the fine-tuning of neurons. Since the approaches are different, each modeling technique identifies a different set of speakers for the same database. Therefore, the decisions of the classifiers may be combined to improve the performance. In this study, multitaper mel-frequency cepstral coefficients (MFCCs) are used as the features, and the monolingual and cross-lingual speaker identification studies are conducted using NIST-2003 and our own database. The experimental results show that the combined system improves the performance by nearly 10% compared with that of the individual classifiers.

Multi-view Rate Control based on HEVC for 3D Video Services

  • Lim, Woong;Lee, Sooyoun
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.8 / pp.245-249 / 2013
  • In this paper, we propose two rate control algorithms for the multi-view extension of HEVC, built on the two rate control algorithms adopted in HEVC, and analyze their multi-view rate control performance. The proposed algorithms are designed on the HEVC-based multi-view video coding (MV-HEVC) platform and take high-level syntax, inter-view prediction, etc. into account, for the extended views as well as the base view, using the rate control algorithms based on the URQ (Unified Rate-Quantization) model and the R-lambda model adopted in HEVC. They also include view-wise target bit allocation to preserve compatibility with the base view. With target bitrates allocated to each view, the proposed multi-view rate control based on the URQ model achieved an average bitrate accuracy of about 1.83% and an average PSNR degradation of 1.73 dB. The proposed multi-view rate control based on the R-lambda model achieved an average bitrate accuracy of about 2.97% and an average PSNR degradation of 0.31 dB.

Analytical Model for the Threshold Voltage of Long-Channel Asymmetric Double-Gate MOSFET based on Potential Linearity (전압분포의 선형특성을 이용한 Long-Channel Asymmetric Double-Gate MOSFET의 문턱전압 모델)

  • Yang, Hee-Jung;Kim, Ji-Hyun;Son, Ae-Ri;Kang, Dae-Gwan;Shin, Hyung-Soon
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.2 / pp.1-6 / 2008
  • A compact analytical model of the threshold voltage for the long-channel Asymmetric Double-Gate (ADG) MOSFET is presented. In contrast to previous models, channel doping and carrier quantization are taken into account. A more compact model is derived by exploiting the linearity of the potential distribution in the silicon film at threshold. The accuracy of the model is verified by comparison with numerical simulations for various silicon film thicknesses, channel doping concentrations and oxide thicknesses.

Proposal of Parameter Range that Offered Optimal Performance in the Coastal Morphodynamic Model (XBeach) Through GLUE

  • Bae, Hyunwoo;Do, Kideok;Kim, Inho;Chang, Sungyeol
    • Journal of Ocean Engineering and Technology / v.36 no.4 / pp.251-269 / 2022
  • The process-based XBeach model has numerous empirical parameters because hydrodynamics and sediment transport in the nearshore are insufficiently understood; hence, the parameters must be calibrated before the model is applied to different study areas and wave conditions, and this calibration is essential for improving model performance. The trial-and-error method is widely used; however, it is passive and poorly suited to exploring wide and comprehensive parameter ranges. In this study, the Generalized Likelihood Uncertainty Estimation (GLUE) method was used to estimate the optimal range of three parameters (gamma, facua, and gamma2) using morphological field data collected at Maengbang beach during the four typhoons that struck from September to October 2019. The model performance and the optimal range of the empirical parameters were evaluated using the Brier Skill Score (BSS) along with the baseline profiles, sensitivity, and likelihood density analysis of BSS in the GLUE tools. The optimal parameter combinations were obtained when facua was less than 0.15, and they reproduced well the change in shape from a crescentic sand bar to alongshore-uniform sand bars in the surf zone of Maengbang beach after storm impact. However, the erosion and accretion patterns near the surf zone and shoreline remain a challenge for the XBeach model.
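
As a rough illustration of the evaluation described in the abstract above, the sketch below computes a Brier Skill Score in its common morphodynamic form, BSS = 1 - Σ(model - observed)² / Σ(baseline - observed)², and keeps the runs whose score reaches a threshold, which is the "behavioural" filtering idea in GLUE. All profile values, run names and the threshold are made-up toy inputs, not data or settings from the study, and a full GLUE analysis would also sample parameters and weight the retained runs by likelihood.

```python
def brier_skill_score(observed, modelled, baseline):
    """BSS = 1 - sum((model - obs)^2) / sum((baseline - obs)^2).
    1 is a perfect prediction; 0 is no better than the baseline."""
    num = sum((m - o) ** 2 for m, o in zip(modelled, observed))
    den = sum((b - o) ** 2 for b, o in zip(baseline, observed))
    if den == 0:
        raise ValueError("baseline equals observation; BSS is undefined")
    return 1.0 - num / den


# Hypothetical bed-level profiles (m) at four cross-shore points.
baseline = [0.0, -1.0, -2.0, -3.0]   # pre-storm profile
observed = [0.0, -1.5, -2.5, -3.0]   # post-storm survey
runs = {
    "run_a": [0.0, -1.4, -2.4, -3.0],
    "run_b": [0.0, -1.0, -2.1, -3.1],
}

threshold = 0.5  # hypothetical acceptance threshold
scores = {name: brier_skill_score(observed, profile, baseline)
          for name, profile in runs.items()}
behavioural = [name for name, score in scores.items() if score >= threshold]
print(scores, behavioural)
```

The parameter values attached to the retained runs would then outline the range in which the model performs acceptably.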

Robust Watermarking Algorithm for 3D Mesh Models (3차원 메쉬 모델을 위한 강인한 워터마킹 기법)

  • 송한새;조남익;김종원
    • Journal of Broadcast Engineering / v.9 no.1 / pp.64-73 / 2004
  • A robust watermarking algorithm is proposed for 3D mesh models. The watermark is inserted into a 2D image extracted from the target 3D model. Each pixel value of the extracted 2D image represents the distance from a predefined reference point to the face of the given 3D model. This extracted image is called a “range image” in this paper. The watermark is embedded into the range image, and the watermarked 3D mesh is then obtained by modifying vertices using the watermarked range image. The extraction procedure requires the original model. After registration between the original and the watermarked models, a range image is extracted from each of the two 3D models. From these images, the embedded watermark is extracted. Experimental results show that the proposed algorithm is robust against attacks such as rotation, translation, uniform scaling, mesh simplification, AWGN and quantization of vertex coordinates.

Fault Diagnosis of Rotating System Mass Unbalance Using Hidden Markov Model (HMM을 이용한 회전체 시스템의 질량편심 결함진단)

  • Ko, Jungmin;Choi, Chankyu;Kang, To;Han, Soonwoo;Park, Jinho;Yoo, Honghee
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.25 no.9 / pp.637-643 / 2015
  • In recent years, pattern recognition methods have been widely used by many researchers for fault diagnosis of mechanical systems. The soundness of a mechanical system can be checked by analyzing the variation of the system's vibration characteristics with a pattern recognition method. Recently, the hidden Markov model has been widely used as a pattern recognition method in various fields. In this paper, the hidden Markov model is employed for the fault diagnosis of the mass unbalance of a rotating system. Mass unbalance is one of the critical faults in a rotating system. A procedure to identify the location and size of the mass unbalance is proposed, and the accuracy of the procedure is validated through experiment.

3D Model Compression For Collaborative Design

  • Liu, Jun;Wang, Qifu;Huang, Zhengdong;Chen, Liping;Liu, Yunhua
    • International Journal of CAD/CAM / v.7 no.1 / pp.1-10 / 2007
  • The compression of CAD models is a key technology for realizing Internet-based collaborative product development, because large model sizes often prevent rapid transmission of product information. Although some algorithms exist for compressing discrete CAD models, this paper focuses on original precise CAD models. The hierarchical structure of CAD models and the distribution of their redundant data are exploited to develop a novel data encoding method in which different encoding rules are applied to different types of data. Geometric data is the major concern in reducing model sizes. For geometric data, the control points of B-spline curves and surfaces are compressed with second-order predictions in a local coordinate system. Based on an analysis of the distortion induced by quantization, an efficient method for computing the distortion is provided. The results indicate that the data size of CAD models can be reduced efficiently by compressing them with the proposed method.

Determination and Performance Evaluation of a Codebook for MIMO Systems Utilizing Statistical Properties of The Spatial Channel Model (공간 채널 모델의 통계적 특성을 활용하는 MIMO 시스템의 코드북 결정 및 성능 평가)

  • Suh, Junyeub;Kang, Hosik;Sung, Wonjin
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.7 / pp.22-30 / 2015
  • For long-term evolution (LTE) MIMO transmission, codebooks are used to exploit the estimated channel information in a limited-feedback environment, and related studies have been actively pursued. Existing codebooks include codevectors constructed from vector quantization (VQ) and the discrete Fourier transform (DFT), and the LTE standard specifies codebooks modified from these examples to support up to 8 transmit antennas. As the number of antennas increases and as the spatial channel model is used as a standard environment to evaluate LTE transmission performance, new beamforming methods as well as new codebook designs are needed. In this paper, we implement the 3-dimensional spatial channel model (3D-SCM) to analyze the key statistical characteristics of the generated channel, and present efficient ways of determining the corresponding codebooks. In particular, we propose a nonuniform-phase DFT-based codebook to improve on the existing uniform-phase DFT-based codebook, and evaluate its performance in the given SCM transmission environment. For the SCM, the statistical distribution of the phase difference between adjacent antenna elements shows a strong tendency, which can be exploited in codebook design to produce a performance gain over the existing design.
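
For orientation, the uniform-phase DFT codebook that the abstract above treats as the existing baseline can be sketched as follows. This is a generic textbook construction for a uniform linear array, not the LTE-specified codebook and not the nonuniform-phase design proposed in the paper. The function names, the array sizes and the selection rule (pick the codevector with the largest beamforming gain |Σ h_n w_n|) are assumptions made for illustration.

```python
import cmath
import math


def dft_codebook(num_antennas, num_codevectors):
    """Unit-norm codevectors whose phase advances uniformly across
    antennas: w_k[n] = exp(j*2*pi*n*k/K) / sqrt(N)."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [
        [scale * cmath.exp(2j * math.pi * n * k / num_codevectors)
         for n in range(num_antennas)]
        for k in range(num_codevectors)
    ]


def select_codevector(channel, codebook):
    """Index of the codevector maximizing the gain |sum_n h_n * w_n|."""
    def gain(w):
        return abs(sum(h * x for h, x in zip(channel, w)))
    return max(range(len(codebook)), key=lambda k: gain(codebook[k]))


codebook = dft_codebook(num_antennas=4, num_codevectors=8)

# Hypothetical channel perfectly matched to codevector 3.
channel = [x.conjugate() for x in codebook[3]]
best = select_codevector(channel, codebook)
print(best)
```

In a limited-feedback system only the selected index is reported to the transmitter; a nonuniform-phase design would change how the per-antenna phase steps are spaced while keeping this selection step.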