Search | Korea Science

HMM-based Speech Recognition using DMS Model and Double Spectral Feature (DMS 모델과 이중 스펙트럼 특징을 이용한 HMM에 의한 음성 인식)

Ann Tae-Ock
- Journal of the Korea Academia-Industrial cooperation Society / v.7 no.4 / pp.649-655 / 2006
This paper proposes an HMM-based method for speaker-independent speech recognition that uses a DMSVQ (Dynamic Multi-Section Vector Quantization) codebook built from the DMS model together with a double spectral feature. The LPC cepstrum is used as the instantaneous spectral feature, and the regression coefficient of the LPC cepstrum is used as the dynamic spectral feature. Each of the two spectral features is quantized with its own VQ codebook. The HMM based on the DMS model takes both the instantaneous and the dynamic spectral features as input. For comparison, recognition experiments with various conventional methods were run on the same data and under the same conditions. The experimental results show that the proposed method outperforms the conventional recognition methods.
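The dynamic spectral feature mentioned in this abstract can be illustrated with a minimal sketch of regression (delta) coefficients over a sequence of cepstral vectors. The abstract does not give the window length or the edge handling, so the `window=2` default and the edge replication below are assumptions, and `delta_coefficients` is a hypothetical helper name rather than the author's implementation.

```python
def delta_coefficients(frames, window=2):
    """Regression (delta) coefficients for a sequence of cepstral vectors.

    Uses the common linear-regression formula
        d_t = sum_k k * (c_{t+k} - c_{t-k}) / (2 * sum_k k^2),  k = 1..window.
    Frames beyond the sequence edges are replaced by the first/last frame
    (an assumption; the abstract does not specify edge handling).
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, window + 1))
    deltas = []
    for t in range(n):
        row = []
        for d in range(len(frames[t])):
            num = 0.0
            for k in range(1, window + 1):
                ahead = frames[min(t + k, n - 1)][d]
                behind = frames[max(t - k, 0)][d]
                num += k * (ahead - behind)
            row.append(num / denom)
        deltas.append(row)
    return deltas


# A one-dimensional feature that rises by 1 per frame has slope 1 mid-sequence.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
ramp_deltas = delta_coefficients(ramp)
```

In the interior of the ramp the coefficient equals the true slope; near the edges the replicated frames pull it toward zero.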

A Study on Channel Mis-match Compensation Technique for Robust Speaker Verification System (강인한 화자확인 시스템을 위한 채널 불일치 보상 기법에 관한 연구)

강철호;정희석
- The Journal of the Acoustical Society of Korea / v.23 no.3 / pp.228-234 / 2004
In this paper, we propose a compensation technique that overcomes the limitations of conventional approaches by summing the bias terms between the world codebook and the individual codebook vectors of the feature parameters. However, unconditional mean compensation can increase false acceptance. The proposed technique therefore compensates for channel mismatch with a weighted bias sum, using a nonlinear function of the distortion between speech and silence. Simulation results show that the FRR (false rejection rate) decreases by 14.95% when the proposed algorithm is applied.

Voice Personality Transformation Using a Multiple Response Classification and Regression Tree (다중 응답 분류회귀트리를 이용한 음성 개성 변환)

이기승
- The Journal of the Acoustical Society of Korea / v.23 no.3 / pp.253-261 / 2004
In this paper, a new voice personality transformation method is proposed, which modifies speaker-dependent feature variables in the speech signal. The proposed method takes the cepstrum vectors and pitch as the transformation parameters, which represent the vocal tract transfer function and the excitation signal, respectively. To transform these parameters, a multiple response classification and regression tree (MR-CART) is employed. MR-CART is the vector extension of a conventional CART, in which the response is given in vector form. We evaluated the performance of the proposed method by comparing it with a previously proposed codebook mapping method. We also quantitatively analyzed the voice transformation performance and the complexity under various conditions. In experiments with 4 speakers, the proposed method objectively outperforms the conventional codebook mapping method, and we also observed that the transformed speech sounds closer to the target speech.

Limited Feedback Precoding for Correlated Massive MIMO Systems (공간 상관도를 가지는 거대배열 다중안테나 시스템에서 압축채널 제한적 피드백 알고리즘)

Lim, Yeon-Geun;Chae, Chan-Byoung
- The Journal of Korean Institute of Communications and Information Sciences / v.39A no.7 / pp.431-436 / 2014
In this paper, we propose a compressive sensing-based channel quantization feedback mechanism that is appropriate for practical massive multiple-input multiple-output (MIMO) systems. We assume that the base station (BS) has a compact uniform square array with a highly correlated channel. To serve multiple users, the BS uses a zero-forcing precoder. The proposed channel feedback algorithm reduces both the feedback overhead and the codebook search complexity. Numerical simulations confirm our analytical results.
https://doi.org/10.7840/kics.2014.39A.7.431
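The zero-forcing precoder named in this abstract can be sketched as follows, assuming the BS knows the channel matrix `H` (users by antennas) perfectly. The i.i.d. Gaussian channel, the dimensions, and the unit-norm column scaling are illustrative assumptions; the paper's correlated channel model and its compressed feedback step are not reproduced here.

```python
import numpy as np


def zero_forcing_precoder(H):
    """Return W = H^H (H H^H)^{-1} with each column scaled to unit norm.

    H has shape (num_users, num_antennas) with num_users <= num_antennas.
    """
    W = H.conj().T @ np.linalg.inv(H @ H.conj().T)
    return W / np.linalg.norm(W, axis=0, keepdims=True)


rng = np.random.default_rng(0)
num_users, num_antennas = 3, 8  # hypothetical sizes
H = (rng.standard_normal((num_users, num_antennas))
     + 1j * rng.standard_normal((num_users, num_antennas))) / np.sqrt(2)

W = zero_forcing_precoder(H)
effective = H @ W  # effective channel seen by the users
```

The off-diagonal entries of `effective` vanish up to numerical error, which is the defining property of zero forcing: each user receives its own stream without inter-user interference.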

Random beamforming applying codebook rotation (다중 코드북을 이용한 랜덤 빔 형성 기법)

Kang, Ji-Won;Yoo, Byung-Wook;Seo, Jeong-Tae;Lee, Chung-Yong
- Journal of the Institute of Electronics Engineers of Korea TC / v.46 no.7 / pp.1-5 / 2009
Random beamforming exploits multiuser diversity gain in static channels. Since the gain is restricted by the user population, several extensions have been proposed. Among them, a codebook-based opportunistic beamforming technique forms multiple random beams with small pilots. That technique, however, has difficulty designing beams flexibly according to the channel statistics. In this paper, we propose a technique that forms multiple random beams by rotating codebooks. The proposed technique enables flexible beam design, so that multiuser diversity and beam selection diversity are exploited simultaneously with small pilots, robustly with respect to the channel statistics.

PCA-Based MPEG Video Retrieval in Compressed Domain (PCA에 기반한 압축영역에서의 MPEG Video 검색기법)

이경화;강대성
- Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.1 / pp.28-33 / 2003
This paper proposes a database indexing and retrieval method using PCA (Principal Component Analysis). We perform scene change detection and key frame extraction on the DC image constructed from the DCT DC coefficients of a video stream compressed with a video compression standard such as MPEG. We apply PCA to the extracted key frames and build a codebook whose codewords are statistical data, which is saved as the database index. We also retrieve images that are similar to the user's query image from a video database. Experiments confirmed that the proposed method performs better in video retrieval and reduces computation time and memory space.
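The PCA indexing step described in this abstract can be sketched roughly as below. The random arrays stand in for flattened DC images of key frames, and the image size, number of components, and nearest-neighbor retrieval by Euclidean distance are assumptions made for illustration; the abstract does not state these details, and `pca_fit` and `pca_project` are hypothetical helper names.

```python
import numpy as np


def pca_fit(X, num_components):
    """Return the data mean and the top principal directions (as rows)."""
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal principal directions, strongest first.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:num_components]


def pca_project(X, mean, components):
    """Project rows of X onto the principal directions."""
    return (X - mean) @ components.T


rng = np.random.default_rng(0)
key_frames = rng.random((20, 64))  # 20 hypothetical 8x8 DC images, flattened

mean, components = pca_fit(key_frames, num_components=4)
index_vectors = pca_project(key_frames, mean, components)  # stored as the index

# Retrieval: project the query and pick the closest stored index vector.
query = key_frames[5]
q = pca_project(query[None, :], mean, components)
best = int(np.argmin(((index_vectors - q) ** 2).sum(axis=1)))
```

Storing 4 numbers per key frame instead of 64 is what reduces memory and comparison time in a scheme of this kind.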

System-Level Performance of Limited Feedback Schemes for Massive MIMO

Choi, Yongin;Lee, Jaewon;Rim, Minjoong;Kang, Chung Gu;Nam, Junyoung;Ko, Young-Jo
- ETRI Journal / v.38 no.2 / pp.280-290 / 2016
To implement high-order multiuser multiple input and multiple output (MU-MIMO) for massive MIMO systems, there must be a feedback scheme that can warrant its performance with a limited signaling overhead. The interference-to-noise ratio can be a basis for a novel form of Codebook (CB)-based MU-MIMO feedback scheme. The objective of this paper is to verify such a scheme's performance under a practical system configuration with a 3D channel model in various radio environments. We evaluate the performance of various CB-based feedback schemes with different types of overhead reduction approaches, providing an experimental ground with which to optimize a CB-based MU-MIMO feedback scheme while identifying the design constraints for a massive MIMO system.
https://doi.org/10.4218/etrij.16.0115.0064

Korean Speech Recognition using DHMM (DHMM을 이용한 한국어 음성 인식)

Ann, T.O.;Lee, K.S.;Yoo, H.K.;Lee, H.J.;Cho, H.J.;Byun, Y.G.;Kim, S.H.
- The Journal of the Acoustical Society of Korea / v.10 no.1 / pp.52-60 / 1991
This paper describes a study on isolated word recognition using a DHMM (Dynamic Hidden Markov Model), which takes the dynamic feature of the spectrum as a parameter. It discusses a speech recognition experiment based on an HMM that can evaluate not only instantaneous spectral features but also dynamic spectral features. The LPC cepstrum is used as the static feature, and the regression coefficient of the LPC cepstrum is used as the dynamic feature. Each of the two features is quantized with its own VQ codebook. The DHMM takes both the static vector and the dynamic vector as input. Over the whole experiment, recognition with the DHMM achieved a recognition rate of 92.7%, while recognition with a conventional HMM achieved 88.8%, so the DHMM proved to be a useful model.
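The quantization step in this abstract, where the static and dynamic features each have their own VQ codebook, can be sketched as a nearest-codeword search. The codebooks and frames below are made-up toy values, and the squared Euclidean distance is an assumption; the abstract does not state the distortion measure or the codebook sizes.

```python
def quantize(vector, codebook):
    """Return the index of the codeword closest to `vector`.

    Closeness is measured by squared Euclidean distance (an assumption).
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist(vector, codebook[i]))


# Hypothetical two-dimensional codebooks, one per feature stream.
static_codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]]
dynamic_codebook = [[-0.5, 0.0], [0.0, 0.0], [0.5, 0.0]]

static_frames = [[0.1, -0.1], [0.9, 1.2], [1.8, 0.4]]
dynamic_frames = [[0.4, 0.1], [0.45, 0.0], [-0.1, 0.05]]

# Each frame becomes a pair of discrete symbols, one from each codebook.
symbols = [(quantize(s, static_codebook), quantize(d, dynamic_codebook))
           for s, d in zip(static_frames, dynamic_frames)]
```

A discrete HMM of this kind would then be trained on the resulting symbol pairs rather than on the raw feature vectors.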

Enhanced VLAD

Wei, Benchang;Guan, Tao;Luo, Yawei;Duan, Liya;Yu, Junqing
- KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.7 / pp.3272-3285 / 2016
Recently, the Vector of Locally Aggregated Descriptors (VLAD) has been proposed to index images with compact representations. It encodes powerful local descriptors and significantly improves search performance with less memory than the state of the art. However, its performance relies heavily on the size of the codebook used to generate the VLAD representation: better accuracy requires a higher-dimensional representation and therefore more memory overhead. In this paper, we enhance the VLAD image representation by using two-level hierarchical codebooks. This provides more accurate search performance while keeping the VLAD size unchanged. In addition, the hierarchical codebooks are used to construct multiple inverted files for more accurate non-exhaustive search. Experimental results show that our method significantly improves both the VLAD image representation and non-exhaustive search.
https://doi.org/10.3837/tiis.2016.07.022
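For reference, the baseline VLAD encoding that this abstract builds on can be sketched as follows. This is plain single-level VLAD with global L2 normalization, not the paper's two-level hierarchical variant, and the toy codebook and descriptors are assumptions for illustration.

```python
import numpy as np


def vlad_encode(descriptors, codebook):
    """Aggregate residuals of local descriptors to their nearest codeword.

    descriptors: (n, d) array of local descriptors for one image.
    codebook:    (k, d) array of codewords.
    Returns an L2-normalized vector of length k * d.
    """
    k, d = codebook.shape
    # Squared distances between every descriptor and every codeword.
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    vlad = np.zeros((k, d))
    for i, c in enumerate(nearest):
        vlad[c] += descriptors[i] - codebook[c]
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]])
v = vlad_encode(descriptors, codebook)
```

The output length is `k * d`, so enlarging the codebook directly enlarges the representation, which is the memory trade-off the abstract describes.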

Content-Based Retrieval System Design over the Internet (인터넷에 기반한 내용기반 검색 시스템 설계)

Kim Young Ho;Kang Dae-Seong
- Journal of Institute of Control, Robotics and Systems / v.11 no.5 / pp.471-475 / 2005
With the recent development of digital technology, multimedia information such as text, voice, images, and video has come to occupy a large share of information. Within video research, video indexing and retrieval are especially active topics. This paper proposes a novel notation for retrieving MPEG video, the international standard for moving picture encoding. To realize the retrieval system, we extract the DCT DC coefficients and then obtain shots by applying the MVC (Mean Value Comparative) notation to the image constructed from the DC coefficients. We choose the start frame of each shot as its key frame, and we generate the codebook index from the features of the DC image by applying PCA (Principal Component Analysis) to the key frame. We then realize the retrieval system through similarity matching after indexing. Compared with conventional shot detection algorithms, this reduces shot detection errors. In addition, indexing with PCA is faster because it is performed in the compressed domain, and it has the advantage that the codebook is generated from statistical features. Finally, we realize an efficient retrieval system by using MVC and PCA for shot detection and indexing, which are important steps of a retrieval system, and we use the retrieval system over the internet.
https://doi.org/10.5302/J.ICROS.2005.11.5.471

