• Title/Abstract/Keywords: Dictionary

1,125 search results (processing time: 0.026 s)

한국어 자동 발음열 생성을 위한 예외발음사전 구축 (Building an Exceptional Pronunciation Dictionary For Korean Automatic Pronunciation Generator)

  • 김선희
    • 음성과학 / Vol. 10, No. 4 / pp. 167-177 / 2003
  • This paper presents a method of building an exceptional pronunciation dictionary for a Korean automatic pronunciation generator. An automatic pronunciation generator is an essential component of both speech recognition systems and TTS (Text-To-Speech) systems. It consists of a set of regular rules and an exceptional pronunciation dictionary. The dictionary is built by extracting words with exceptional pronunciations from a text corpus, using the characteristics of such words identified through phonological research and text analysis. The method thus contributes to improving the performance of Korean automatic pronunciation generators and, in turn, of speech recognition and TTS systems.


Vehicle Image Recognition Using Deep Convolution Neural Network and Compressed Dictionary Learning

  • Zhou, Yanyan
    • Journal of Information Processing Systems / Vol. 17, No. 2 / pp. 411-425 / 2021
  • In this paper, a vehicle recognition algorithm based on a deep convolutional neural network and compressed dictionary learning is proposed. First, the network structure for fine-grained vehicle recognition based on a convolutional neural network is introduced. Then, a vehicle recognition system based on a multi-scale pyramid convolutional neural network is constructed. The contribution of each network to the recognition result is adjusted by an adaptive fusion method, which sets the proportion of each network's output in the output of the entire multi-scale network according to that network's individual recognition accuracy. Next, compressed dictionary learning and data dimension reduction are carried out using an effective block structure method combined with a very sparse random projection matrix, which reduces the computational complexity caused by high-dimensional features and shortens the dictionary learning time. Finally, the sparse representation classification method is used to recognize the vehicle type. The experimental results show that the detection performance of the proposed algorithm is stable in sunny, cloudy and rainy weather, and that it adapts well to typical application scenarios such as occlusion and blurring, with an average recognition rate of more than 95%.
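The dimension-reduction step mentioned in this abstract can be illustrated with a minimal sketch of a very sparse random projection. The abstract does not give the construction, so the sketch assumes the common formulation in which each entry is ±√s with probability 1/(2s) each and 0 otherwise, with s = √D for D input features. The function names and the 1/√k output scaling are choices made for this example, not details from the paper.

```python
import math
import random


def very_sparse_projection(n_features, n_components, seed=0):
    """Build a very sparse random projection matrix (assumed s = sqrt(n_features))."""
    rng = random.Random(seed)
    s = math.sqrt(n_features)
    scale = math.sqrt(s)
    matrix = []
    for _ in range(n_components):
        row = []
        for _ in range(n_features):
            u = rng.random()
            if u < 1.0 / (2.0 * s):
                row.append(scale)       # +sqrt(s) with probability 1/(2s)
            elif u < 1.0 / s:
                row.append(-scale)      # -sqrt(s) with probability 1/(2s)
            else:
                row.append(0.0)         # zero with probability 1 - 1/s
        matrix.append(row)
    return matrix


def project(matrix, x):
    """Project a feature vector into the lower-dimensional space."""
    k = len(matrix)
    return [sum(m * v for m, v in zip(row, x)) / math.sqrt(k) for row in matrix]


R = very_sparse_projection(n_features=100, n_components=8, seed=1)
reduced = project(R, [1.0] * 100)   # 100-dimensional feature -> 8 dimensions
```

Because most entries are zero, the projection touches only about a 1/s fraction of the features per output component, which is the source of the savings in dictionary learning time.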

Modal parameter identification with compressed samples by sparse decomposition using the free vibration function as dictionary

  • Kang, Jie;Duan, Zhongdong
    • Smart Structures and Systems / Vol. 25, No. 2 / pp. 123-133 / 2020
  • Compressive sensing (CS) is a newly developed data acquisition and processing technique that takes advantage of the sparse structure in signals. Normally, signals are reconstructed in their original space or format from their compressed measurements before further processing, such as modal analysis of vibration data. This approach causes problems such as leakage and loss of fidelity, and the reconstruction itself is computationally costly. It is therefore appealing to work directly on the compressed data without first reconstructing the original data. In this paper, a direct approach for modal analysis of damped systems is proposed that decomposes the compressed measurements with an appropriate dictionary. The damped free vibration function is adopted to form the atoms of the dictionary for the subsequent sparse decomposition. Compared with the commonly used Fourier bases, the damped free vibration function spans a space with both frequency and damping as control variables. To search the enormous two-dimensional dictionary efficiently, a two-step strategy is combined with Orthogonal Matching Pursuit (OMP) to determine the optimal atom in the dictionary, which greatly reduces the computation of the sparse decomposition. The performance of the proposed method is demonstrated with a numerical and an experimental example, and its advantages are shown by comparison with another method of this kind based on the POD technique.
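As a rough illustration of decomposing compressed measurements with damped free vibration atoms, the sketch below performs only the first greedy selection of a matching pursuit: an exhaustive search over a small (frequency, damping) grid for the atom most correlated with the compressed signal. It does not implement the paper's two-step search strategy or the full OMP residual update. The Gaussian measurement matrix, the cosine-only atom form, and all grid values are assumptions made for the example.

```python
import math
import random


def atom(freq_hz, zeta, times):
    """Damped free vibration atom: exp(-zeta*wn*t) * cos(wd*t)."""
    wn = 2.0 * math.pi * freq_hz
    wd = wn * math.sqrt(1.0 - zeta ** 2)
    return [math.exp(-zeta * wn * t) * math.cos(wd * t) for t in times]


def compress(phi, x):
    """Apply the measurement matrix phi to a full-length signal x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def best_atom(y, phi, times, freqs, zetas):
    """Return the (frequency, damping) pair whose compressed atom best matches y."""
    best, best_score = None, -1.0
    for f in freqs:
        for z in zetas:
            d = compress(phi, atom(f, z, times))
            norm = math.sqrt(sum(v * v for v in d))
            score = abs(sum(a * b for a, b in zip(d, y))) / norm
            if score > best_score:
                best, best_score = (f, z), score
    return best


rng = random.Random(0)
times = [i / 100.0 for i in range(200)]                            # 2 s at 100 Hz
phi = [[rng.gauss(0.0, 1.0) for _ in times] for _ in range(40)]    # 40 compressed samples
y = compress(phi, atom(3.0, 0.02, times))                          # single-mode test signal

found = best_atom(y, phi, times,
                  freqs=[1.0, 2.0, 3.0, 4.0, 5.0],
                  zetas=[0.01, 0.02, 0.05])
```

The search never reconstructs the 200-sample signal; every comparison happens in the 40-dimensional compressed space, which is the point the abstract makes about working directly on compressed data.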

타언어권 화자 음성 인식을 위한 혼잡도에 기반한 다중발음사전의 최적화 기법 (Optimizing Multiple Pronunciation Dictionary Based on a Confusability Measure for Non-native Speech Recognition)

  • 김민아;오유리;김홍국;이연우;조성의;이성로
    • 대한음성학회지:말소리 / No. 65 / pp. 93-103 / 2008
  • In this paper, we propose a method for optimizing a multiple pronunciation dictionary used for modeling pronunciation variations of non-native speech. The proposed method removes some confusable pronunciation variants in the dictionary, resulting in a reduced dictionary size and less decoding time for automatic speech recognition (ASR). To this end, a confusability measure is first defined based on the Levenshtein distance between two different pronunciation variants. Then, the number of phonemes for each pronunciation variant is incorporated into the confusability measure to compensate for ASR errors due to words of a shorter length. We investigate the effect of the proposed method on ASR performance, where Korean is selected as the target language and Korean utterances spoken by Chinese native speakers are considered as non-native speech. It is shown from the experiments that an ASR system using the multiple pronunciation dictionary optimized by the proposed method can provide a relative average word error rate reduction of 6.25%, with 11.67% less ASR decoding time, as compared with that using a multiple pronunciation dictionary without the optimization.
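A minimal sketch of a Levenshtein-based confusability score between two pronunciation variants follows. The abstract does not give the exact formula, so the normalization by the longer variant's phoneme count is an assumption standing in for the paper's length compensation, and the phoneme sequences are toy examples.

```python
def levenshtein(a, b):
    """Edit distance between two phoneme sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def confusability(a, b):
    """Assumed score in [0, 1]: 1 means identical variants, 0 means nothing shared."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Toy variants: one substituted phoneme out of three.
score = confusability(["k", "a", "t"], ["k", "a", "p"])
```

Dividing by the phoneme count keeps a one-phoneme difference in a short word from looking as harmless as the same difference in a long word. A pruning pass could then drop variants whose score against another word's variant exceeds a chosen threshold.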


3차원 형태 특징의 사전 학습을 이용한 기하 복원 (Geometry Reconstruction Using Dictionary Learning of 3D Shape Features)

  • 황정민;윤여진;최수미
    • 한국컴퓨터그래픽스학회논문지 / Vol. 23, No. 1 / pp. 57-65 / 2017
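  • This paper presents a dictionary learning method for reducing errors in models composed of point clouds and reconstructing their geometric shape. To this end, 3D feature information is extracted from models whose shape features are similar to those of the target model to build a dictionary, which is then used to perform geometry reconstruction. The proposed method consists of three steps: first, constructing geometric patches from the similar models; second, learning the 3D shape features of the acquired patches; and third, reconstructing the geometry using the learned dictionary. Finally, the error between the original model and the reconstruction result is computed to verify the accuracy of the reconstruction.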

Phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary

  • Yang, Byunggon
    • 말소리와 음성과학 / Vol. 8, No. 2 / pp. 11-16 / 2016
  • This study explores the phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary to provide phoneticians and linguists with fundamental phonetic data on English word components. Entry words in the dictionary file were syllabified using an R script and examined to obtain the following results: First, English words preferred consonants to vowels in their word components. In addition, monophthongs occurred much more frequently than diphthongs. When all consonants were categorized by manner and place, the distribution indicated the frequency order of stops, fricatives, and nasals according to manner and that of alveolars, bilabials and velars according to place. These results were comparable to the results obtained from the Buckeye Corpus (Yang, 2012). Second, from the analysis of syllable structure, two-syllable words were most favored, followed by three- and one-syllable words. Of the words in the dictionary, 92.7% consisted of one, two or three syllables. This result may be related to human memory or decoding time. Third, the English words tended to exhibit discord between onset and coda consonants and between adjacent vowels. Dissimilarity between the last onset and the first coda was found in 93.3% of the syllables, while 91.6% of the adjacent vowels were different. From the results above, the author concludes that an analysis of the phonetic symbols in a dictionary may lead to a deeper understanding of English word structures and components.
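The kind of tally described in this abstract can be sketched in a few lines. The study used an R script; the Python below is only an illustration and relies on one property of the dictionary's ARPAbet notation, namely that vowel phonemes carry a stress digit (0, 1, or 2), so the number of vowels equals the number of syllables. The sample lines are written in the dictionary's format for this example and may not match its actual entries exactly. Variant markers such as `WORD(1)` and comment lines are not handled.

```python
from collections import Counter


def parse_entry(line):
    """Split a dictionary line into the word and its phoneme list."""
    word, *phones = line.split()
    return word, phones


def is_vowel(phone):
    """Vowels end in a stress digit; consonants do not."""
    return phone[-1].isdigit()


def syllable_count(phones):
    """One syllable per vowel phoneme."""
    return sum(is_vowel(p) for p in phones)


# Illustrative lines in the dictionary's "WORD  PH PH ..." layout.
lines = [
    "CAT  K AE1 T",
    "DICTIONARY  D IH1 K SH AH0 N EH2 R IY0",
    "PHONEME  F OW1 N IY0 M",
]

syllables = Counter()   # syllable count -> number of words
classes = Counter()     # "vowel" / "consonant" -> number of phonemes
for line in lines:
    _, phones = parse_entry(line)
    syllables[syllable_count(phones)] += 1
    for p in phones:
        classes["vowel" if is_vowel(p) else "consonant"] += 1
```

Run over the full dictionary file, the two counters would give the syllable-length distribution and the consonant-to-vowel ratio that the abstract reports.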

Laser Spot Detection Using Robust Dictionary Construction and Update

  • Wang, Zhihua;Piao, Yongri;Jin, Minglu
    • Journal of information and communication convergence engineering / Vol. 13, No. 1 / pp. 42-49 / 2015
  • In laser pointer interaction systems, laser spot detection is one of the most important technologies, and most of the challenges in this area relate to varying backgrounds and to the real-time performance of the interaction system. In this paper, we present a robust dictionary construction and update algorithm based on a sparse model of background subtraction. To handle dynamic backgrounds, we first determine whether the background has changed; if it has, the new background is added directly to the dictionary; otherwise, we run an online cumulative average over the backgrounds to update the dictionary. The proposed dictionary construction and update algorithm for laser spot detection is robust to varying backgrounds and noise, and can be implemented in real time. A large number of experimental results confirm the superior performance of the proposed method in terms of detection error and real-time implementation.
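The update rule in this abstract (add a new atom when the background changes, otherwise fold the frame into a running average) can be sketched as follows. The change test used here, a mean absolute difference against the nearest atom compared with a fixed threshold, is an assumption; the abstract does not say how a background change is detected. Frames are flattened lists of pixel intensities.

```python
def mean_abs_diff(a, b):
    """Average per-pixel absolute difference between two frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def update_dictionary(atoms, counts, frame, threshold=10.0):
    """Update the background dictionary in place; return the action taken."""
    if atoms:
        dists = [mean_abs_diff(atom, frame) for atom in atoms]
        k = min(range(len(atoms)), key=dists.__getitem__)
    if not atoms or dists[k] > threshold:
        # Background changed: add the frame directly as a new atom.
        atoms.append(list(frame))
        counts.append(1)
        return "added"
    # Background unchanged: online cumulative average of the nearest atom.
    counts[k] += 1
    n = counts[k]
    atoms[k] = [a + (x - a) / n for a, x in zip(atoms[k], frame)]
    return "averaged"


atoms, counts = [], []
actions = [
    update_dictionary(atoms, counts, [0.0, 0.0, 0.0, 0.0]),          # first background
    update_dictionary(atoms, counts, [2.0, 2.0, 2.0, 2.0]),          # small drift
    update_dictionary(atoms, counts, [100.0, 100.0, 100.0, 100.0]),  # scene change
]
```

The cumulative average needs only the atom and its frame count, so each update costs one pass over the pixels, which fits the real-time requirement the abstract emphasizes.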

한국어 형태소 분석을 위한 효율적 기분석 사전의 구성 방법 (Construction of an Efficient Pre-analyzed Dictionary for Korean Morphological Analysis)

  • 곽수정;김보겸;이재성
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 2, No. 12 / pp. 881-888 / 2013
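  • A pre-analyzed dictionary is used to improve the speed and accuracy of a morphological analyzer and to reduce over-analysis. However, if the dictionary contains eojeols (word phrases) whose stored morphological analyses are incomplete, that is, insufficiently analyzed eojeols, it can instead lower the analyzer's accuracy. In this paper, we use the Sejong morphologically analyzed corpus (written language, 2011) to measure how the dictionary's correct-answer rate changes with corpus size and eojeol frequency. We also built an integrated system that combines SMA, a statistical morphological analyzer, with the pre-analyzed dictionary, and confirmed that overall system performance improves when the dictionary's sufficient-analysis rate is 99.82% or higher. Furthermore, the integrated system performed best when the dictionary was built from eojeols occurring at least 32 times for a corpus of 1.6 million eojeols, and at least 64 times for a corpus of 6.3 million eojeols.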
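The frequency-threshold construction described in this abstract can be sketched as below. The corpus is a toy list of (eojeol, analysis) pairs written for this example, with analyses in a Sejong-style tag notation; the function names and the fallback analyzer are hypothetical and do not represent SMA.

```python
from collections import Counter, defaultdict


def build_preanalyzed(corpus, min_count):
    """Keep every analysis seen for eojeols that occur at least min_count times."""
    freq = Counter(eojeol for eojeol, _ in corpus)
    table = defaultdict(set)
    for eojeol, analysis in corpus:
        if freq[eojeol] >= min_count:
            table[eojeol].add(analysis)
    return dict(table)


def analyze(eojeol, table, fallback):
    """Use the pre-analyzed dictionary when possible, else the fallback analyzer."""
    if eojeol in table:
        return sorted(table[eojeol])
    return fallback(eojeol)


# Toy corpus: "나는" appears three times with two different analyses.
corpus = [
    ("나는", "나/NP+는/JX"),
    ("나는", "나/NP+는/JX"),
    ("나는", "날/VV+는/ETM"),
    ("학교에", "학교/NNG+에/JKB"),
]
table = build_preanalyzed(corpus, min_count=2)
result = analyze("학교에", table, fallback=lambda e: ["<statistical analysis>"])
```

A higher `min_count` makes it more likely that every valid analysis of a stored eojeol has been observed, which is the trade-off behind the 32 and 64 occurrence thresholds reported in the abstract.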

K-SVD 기반 사전 훈련과 비음수 행렬 분해 기법을 이용한 중첩음향이벤트 검출 (Overlapping Sound Event Detection Using NMF with K-SVD Based Dictionary Learning)

  • 최현식;금민석;고한석
    • 한국음향학회지 / Vol. 34, No. 3 / pp. 234-239 / 2015
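  • Nonnegative Matrix Factorization (NMF) obtains a dictionary matrix and gain components by updating them alternately, and because of its intuitive interpretation and ease of implementation it has been widely used for separating and detecting overlapping sound events. However, the part-based representation inherent to NMF fragments the dictionary that makes up a single sound event and produces dictionary entries that overlap with those of other sound events, which degrades separation and detection performance. To resolve the problems caused by part-based representation in the dictionary acquisition stage, this paper obtains the dictionary using K-Singular Value Decomposition (K-SVD), while in the sound event detection stage the gains are obtained with the conventional NMF technique. We confirmed that the proposed approach improves overlapping sound event detection performance compared with using an NMF-based dictionary.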
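The detection stage described in this abstract, estimating gains with NMF while the dictionary stays fixed, can be sketched with the standard multiplicative update for the Euclidean cost. The dictionary `w` below is a small hand-written matrix standing in for one learned beforehand; the K-SVD training step is not shown, and the iteration count is an arbitrary choice for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf_activations(v, w, n_iter=200, eps=1e-12):
    """Estimate nonnegative gains h with v ~= w @ h, keeping the dictionary w fixed."""
    wt = transpose(w)
    wtv = matmul(wt, v)     # constant numerator term W^T V
    wtw = matmul(wt, w)     # constant part of the denominator W^T W
    h = [[1.0] * len(v[0]) for _ in range(len(w[0]))]
    for _ in range(n_iter):
        denom = matmul(wtw, h)
        # Multiplicative update: H <- H * (W^T V) / (W^T W H)
        h = [[h[k][t] * wtv[k][t] / (denom[k][t] + eps)
              for t in range(len(h[0]))]
             for k in range(len(h))]
    return h


w = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]             # two dictionary atoms over three frequency bins
v = [[2.0], [3.0], [5.0]]    # one spectral frame: 2 * atom0 + 3 * atom1
h = nmf_activations(v, w)
```

Thresholding each row of `h` over time would then indicate when the corresponding sound event is active, including frames in which several events overlap.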

4차원 Light Field 영상에서 Dictionary Learning 기반 초해상도 알고리즘 (Dictionary Learning based Superresolution on 4D Light Field Images)

  • 이승재;박인규
    • 방송공학회논문지 / Vol. 20, No. 5 / pp. 676-686 / 2015
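  • A 4D light field image, which is captured with a light field camera and can then be extended to a variety of applications, consists of the usual 2D spatial domain plus an additional 2D angular domain. However, because such a 4D light field image is captured by a 2D CMOS sensor of finite resolution, it is constrained to low resolution. To overcome this resolution constraint, this paper proposes a dictionary learning-based superresolution algorithm suited to 4D light field images. The proposed algorithm builds and trains a dictionary from a large number of 4D patches extracted from 4D light field images, and then enhances the resolution of a low-resolution input image using the learned dictionary. It doubles the resolution in both the spatial and the angular domain at the same time. The images used in the experiments were captured with Lytro, a commercial light field camera, and the superiority of the proposed algorithm is verified through comparison with existing algorithms.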