• Title/Summary/Keyword: subword

Search Result 38, Processing Time 0.023 seconds

A Study on the Development of Korea Telecom Automatic Voice Recognition System (음성인식에 의한 연구센타 부서안내 시스팀 개발에 관한 연구)

  • Koo, Myoung-Wan;Sohn, Il-Hyun;Doh, Sam-Joo;Lee, Jong-Rak
    • Annual Conference on Human and Language Technology
    • /
    • 1992.10a
    • /
    • pp.185-192
    • /
    • 1992
  • 이 논문에서는 음성인식기술을 이용한 연구센타 부서안내 시스팀(KARS:Korea Telecom Automatic voice Recognition system)에 대하여 기술하였다. 이 시스팀은 기본적으로 음성응답 시스팀과 유사하지만 명령입력을 위해 푸시버튼 대신 음성을 이용한다는 점이 다르다. 사용자가 마이크로폰을 통해 음성명령을 입력하면, 이 시스팀은 사용자의 음성명령을 인식하여 연구센타내 각 부서의 간략한 소개, 전화번호 및 위치를 안내해 준다. 이 시스팀은 HMM(Hidden Markov Model)을 이용하는 화자독립 격리단어 인식시스팀으로서 116개의 부서이름과 7개의 제어용 단어로 구성되어 있는 123개 단어를 인식할 수 있다. 이 시스팀은 음소와 유사한 한국어 서브워드(subword)를 HMM의 기본단위로 사용하며 인식 실험결과 98.6%의 인식율을 얻을 수 있었다.

  • PDF

Automatic proficiency assessment of Korean speech read aloud by non-natives using bidirectional LSTM-based speech recognition

  • Oh, Yoo Rhee;Park, Kiyoung;Jeon, Hyung-Bae;Park, Jeon Gue
    • ETRI Journal
    • /
    • v.42 no.5
    • /
    • pp.761-772
    • /
    • 2020
  • This paper presents an automatic proficiency assessment method for a non-native Korean read utterance using bidirectional long short-term memory (BLSTM)-based acoustic models (AMs) and speech data augmentation techniques. Specifically, the proposed method considers two scenarios, with and without prompted text. The proposed method with the prompted text performs (a) a speech feature extraction step, (b) a forced-alignment step using a native AM and non-native AM, and (c) a linear regression-based proficiency scoring step for the five proficiency scores. Meanwhile, the proposed method without the prompted text additionally performs Korean speech recognition and a subword un-segmentation for the missing text. The experimental results indicate that the proposed method with prompted text improves the performance for all scores when compared to a method employing conventional AMs. In addition, the proposed method without the prompted text has a fluency score performance comparable to that of the method with prompted text.

Subword Modeling of Vocabulary Independent Speech Recognition Using Phoneme Clustering (음소 군집화 기법을 이용한 어휘독립음성인식의 음소모델링)

  • Koo Dong-Ook;Choi Joon Ki;Yun Young-Sun;Oh Yung-Hwan
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.33-36
    • /
    • 2000
  • 어휘독립 고립단어인식은 미리 훈련된 부단어(sub-word) 단위의 음향모델을 이용하여 수시로 변하는 인식대상어휘를 인식하는 것이다. 본 논문에서는 소용량 음성 데이터베이스를 이용하여 어휘독립음성인식 시스템을 구성하였다. 소용량 음성 데이터베이스에서 미관측문맥 종속형 부단어에 대한 처리에 효과적인 백오프 기법을 이용한 음소 군집화 방법으로 문턱값을 변화시키며 인식실험을 수행하였다. 그리고 훈련용 데이터의 부족으로 인하여 문맥 종속형 부단어 모델이 훈련용 데이터베이스로 편중되는 문제를 deleted interpolation 방법을 이용하여 문맥 종속형 부단어 모델과 문맥 독립형 부단어 모델을 병합함으로써 해결하였다. 그 결과 음성인식의 성능이 향상되었다.

  • PDF

Automatic Evaluation of Elementary School English Writing Based on Recurrent Neural Network Language Model (순환 신경망 기반 언어 모델을 활용한 초등 영어 글쓰기 자동 평가)

  • Park, Youngki
    • Journal of The Korean Association of Information Education
    • /
    • v.21 no.2
    • /
    • pp.161-169
    • /
    • 2017
  • We often use spellcheckers in order to correct the syntactic errors in our documents. However, these computer programs are not enough for elementary school students, because their sentences are not smooth even after correcting the syntactic errors in many cases. In this paper, we introduce an automated method for evaluating the smoothness of two synonymous sentences. This method uses a recurrent neural network to solve the problem of long-term dependencies and exploits subwords to cope with the rare word problem. We trained the recurrent neural network language model based on a monolingual corpus of about two million English sentences. In our experiments, the trained model successfully selected the more smooth sentences for all of nine types of test set. We expect that our approach will help in elementary school writing after being implemented as an application for smart devices.

Automatic Selection of Similar Sentences for Teaching Writing in Elementary School (초등 글쓰기 교육을 위한 유사 문장 자동 선별)

  • Park, Youngki
    • Journal of The Korean Association of Information Education
    • /
    • v.20 no.4
    • /
    • pp.333-340
    • /
    • 2016
  • When elementary students write their own sentences, it is often educationally beneficial to compare them with other people's similar sentences. However, it is impractical for use in most classrooms, because it is burdensome for teachers to look up all of the sentences written by students. To cope with this problem, we propose a novel approach for automatic selection of similar sentences based on a three-step process: 1) extracting the subword units from the word-level sentences, 2) training the model with the encoder-decoder architecture, and 3) using the approximate k-nearest neighbor search algorithm to find the similar sentences. Experimental results show that the proposed approach achieves the accuracy of 75% for our test data.

A Study on the Instruction Set Architecture of Multimedia Extension Processor (멀티미디어 확장 프로세서의 명령어 집합 구조에 관한 연구)

  • O, Myeong-Hun;Lee, Dong-Ik;Park, Seong-Mo
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.38 no.6
    • /
    • pp.420-435
    • /
    • 2001
  • As multimedia technology has rapidly grown recently, many researches to process multimedia data efficiently using general-purpose processors have been studied. In this paper, we proposed multimedia instructions which can process multimedia data effectively, and suggested a processor architecture for those instructions. The processor was described with Verilog-HDL in the behavioral level and simulated with CADENCE$^{TM}$ tool. Proposed multimedia instructions are total 48 instructions which can be classified into 7 groups. Multimedia data have 64-bit format and are processed as parallel subwords of 8-bit 8 bytes, 16-bit 4 half words or 32-bit 2 words. Modeled processor is developed based on the Integer Unit of SPARC V.9. It has five-stage pipeline RISC architecture with Harvard principle.e.

  • PDF

The Unsupervised Learning-based Language Modeling of Word Comprehension in Korean

  • Kim, Euhee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.11
    • /
    • pp.41-49
    • /
    • 2019
  • We are to build an unsupervised machine learning-based language model which can estimate the amount of information that are in need to process words consisting of subword-level morphemes and syllables. We are then to investigate whether the reading times of words reflecting their morphemic and syllabic structures are predicted by an information-theoretic measure such as surprisal. Specifically, the proposed Morfessor-based unsupervised machine learning model is first to be trained on the large dataset of sentences on Sejong Corpus and is then to be applied to estimate the information-theoretic measure on each word in the test data of Korean words. The reading times of the words in the test data are to be recruited from Korean Lexicon Project (KLP) Database. A comparison between the information-theoretic measures of the words in point and the corresponding reading times by using a linear mixed effect model reveals a reliable correlation between surprisal and reading time. We conclude that surprisal is positively related to the processing effort (i.e. reading time), confirming the surprisal hypothesis.

Simulation of YUV-Aware Instructions for High-Performance, Low-Power Embedded Video Processors (고성능, 저전력 임베디드 비디오 프로세서를 위한 YUV 인식 명령어의 시뮬레이션)

  • Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.13 no.5
    • /
    • pp.252-259
    • /
    • 2007
  • With the rapid development of multimedia applications and wireless communication networks, consumer demand for video-over-wireless capability on mobile computing systems is growing rapidly. In this regard, this paper introduces YUV-aware instructions that enhance the performance and efficiency in the processing of color image and video. Traditional multimedia extensions (e.g., MMX, SSE, VIS, and AltiVec) depend solely on generic subword parallelism whereas the proposed YUV-aware instructions support parallel operations on two-packed 16-bit YUV (6-bit Y, 5-bits U, V) values in a 32-bit datapath architecture, providing greater concurrency and efficiency for color image and video processing. Moreover, the ability to reduce data format size reduces system cost. Experiment results on a representative dynamically scheduled embedded superscalar processor show that YUV-aware instructions achieve an average speedup of 3.9x over the baseline superscalar performance. This is in contrast to MMX (a representative Intel#s multimedia extension), which achieves a speedup of only 2.1x over the same baseline superscalar processor. In addition, YUV-aware instructions outperform MMX instructions in energy reduction (75.8% reduction with YUV-aware instructions, but only 54.8% reduction with MMX instructions over the baseline).