Search | Korea Science

Speech Recognition for the Korean Vowel 'ㅣ' based on Waveform-feature Extraction and Neural-network Learning (파형 특징 추출과 신경망 학습 기반 모음 'ㅣ' 음성 인식)

Rho, Wonbin;Lee, Jongwoo;Lee, Jaewon
- KIISE Transactions on Computing Practices, v.22 no.2, pp.69-76, 2016
With growing interest in the IoT across almost all areas of industry, computing technologies are increasingly being applied in human environments such as houses, buildings, cars, and streets; in these IoT environments, speech recognition is widely accepted as a means of HCI. Existing server-based speech recognition techniques are typically fast and achieve quite high recognition rates; however, they need an internet connection and complicated server computing, because speech is recognized in units of words stored in server databases. This paper, a follow-up to our earlier work on speech recognition algorithms for the Korean vowels 'ㅏ' and 'ㅓ', presents an implementation of speech recognition algorithms for the Korean vowel 'ㅣ'. We observed that almost all of the vocal waveform patterns for 'ㅣ' are unique and differ from the patterns of the 'ㅏ' and 'ㅓ' waveforms. We propose specific waveform patterns for the Korean vowel 'ㅣ' and the corresponding recognition algorithms. We also present experimental results showing that adding neural-network learning to our algorithm increases the recognition success rate for the vowel 'ㅣ'. With our algorithms, 90% or more of the utterances of the vowel 'ㅣ' were successfully recognized.
https://doi.org/10.5626/KTCP.2016.22.2.69

The Trend of Integrated Solution Service Based on VoIP and Voice Recognition (VoIP와 음석인식에 기반한 통합솔루션 서비스 동향)

Oh, Jae-Sam;Yoon, Young-Keun
- Journal of Information Technology Services, v.1 no.1, pp.57-66, 2002
This paper looks at two kinds of IT: VoIP and voice recognition (VR). We are particularly interested in the evolving techniques and services produced by combining the two. Recently, many services and products using voice recognition have appeared on the market. The technique has now progressed to the level where it can even replace user interfaces based on a GUI or general DTMF. We therefore expect a variety of new services combining VoIP and VR to appear on the market. So far, three models are available in the field: wired telephone, wireless telephone, and wireless internet. The effectiveness of VoIP is greatest when it is combined with other techniques rather than used alone.

A Study on the Implementation of an Integrated Multimedia System for PSTN (PSTN용 멀티미디어 통합 시스템 구현에 대한 연구)

백원석;신성효;이교식;장인철;남현정;박병수;장혜진;박재현;박규식
- Proceedings of the Acoustical Society of Korea Conference, autumn, pp.201-204, 1999
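The aim of this study is to develop a multimedia terminal system that can send and receive not only voice calls but also data, fax, video, and other multimedia information over the existing PSTN telephone network. The system consists of a DSP modem that complies with the ITU-T V.34 standard, a DSVD function that transmits and receives voice and data simultaneously, and a system controller that implements and controls the various multimedia functions. Users simply select the desired information service from an LCD/touch-screen menu to use services such as voice, data communication, and fax.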

An Experimental Field Trial of Stock Information Retrieval System Based on Speech Recognition (음성인식기술을 이용한 증권정보 안내 시스템의 실험적 실용시험)

도삼주
- Proceedings of the Acoustical Society of Korea Conference, 1994.06c, pp.241-244, 1994
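This paper describes KT-STOCK, a large-vocabulary, speaker-independent speech recognition system, and an experimental field trial of the system over the telephone network. KT-STOCK lets users retrieve by voice the current stock prices of the 712 companies currently listed on the stock market. It is an isolated-word recognition system based on hidden Markov model technology and uses phoneme-like units as its basic recognition unit. KT-STOCK has been undergoing an experimental field trial since June 24, 1994. Interim results show that the simulation results differ from those obtained in the real environment. In the real environment, the system's recognition rate is currently 61.9%.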

WiBro Trends and Business Strategy (WiBro동향과 사업전략)

서종렬
- The Proceeding of the Korean Institute of Electromagnetic Engineering and Science, v.15 no.3, pp.36-49, 2004
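In most countries around the world, the mobile phone market has recently faced sluggish third-generation services and saturated or stagnant voice services. In the case of third-generation IMT2000 services, the market has been slow to take off because user needs have not been adequately reflected, owing to rapid performance improvements in existing networks and delays in equipment and handset development. Wireless internet service, which together with voice service forms the two pillars of mobile phone service, is struggling to spread even in Korea, where the mobile phone market is saturated, because mobile phone systems designed around voice offer low transmission speeds and high charges. (remainder omitted)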

Implementation of User Interface Specification for Digital Speech Communication Model System (디지틀 음성통화 모델 시스템의 사용자 접속 규격의 구현)

홍진우
- Proceedings of the Acoustical Society of Korea Conference, 1992.06a, pp.155-159, 1992
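This paper implements the user access specification of an experimental call model system intended for setting quality standards (speech quality and connection quality) for digital voice communication services. So that the access protocol intended for digital voice networks could be applied unchanged and real conditions reproduced, the specification was implemented using the user-network interface specification that is the standard of the Integrated Services Digital Network (ISDN). The processing and performance of the user access specification in the implemented model system were verified through interworking tests with a Protocol Simulator that conforms to the CCITT I-series Recommendations.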

A Study for the Voice channel extension method using Code Division Multiplexing (부호분할 다중화 기법을 이용한 음성 회선 확대 방안 연구)

권기형;신용조
- Journal of the Korea Society of Computer and Information, v.3 no.4, pp.103-109, 1998
Domestic telephony transmission networks mainly use E1 at 2.048 Mbps, which is composed of 30 channels, each assigned a 64 kbps voice coding rate. E1 always uses TDM, so its channels are fixed. This paper shows that using CDM increases the number of subscribers and voice channels.
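The figures quoted in this abstract can be checked with a short calculation. The sketch below is not from the paper; it assumes the standard E1 frame layout (32 eight-bit timeslots per frame, 8000 frames per second, two timeslots reserved for framing and signalling), and all names in it are illustrative.

```python
# Illustrative E1 arithmetic. The frame layout is the standard E1 structure,
# assumed here as background; it is not stated in the abstract itself.
TIMESLOTS_PER_FRAME = 32   # timeslots in one E1 frame
BITS_PER_TIMESLOT = 8      # one 8-bit sample per timeslot
FRAMES_PER_SECOND = 8000   # one frame every 125 microseconds
OVERHEAD_TIMESLOTS = 2     # framing and signalling timeslots


def channel_rate_bps() -> int:
    """Bit rate carried by a single timeslot (one voice channel)."""
    return BITS_PER_TIMESLOT * FRAMES_PER_SECOND


def line_rate_bps() -> int:
    """Aggregate E1 line rate across all timeslots."""
    return TIMESLOTS_PER_FRAME * channel_rate_bps()


def voice_channels() -> int:
    """Timeslots left for voice once the overhead timeslots are removed."""
    return TIMESLOTS_PER_FRAME - OVERHEAD_TIMESLOTS


if __name__ == "__main__":
    print(channel_rate_bps(), line_rate_bps(), voice_channels())
```

Because TDM fixes the timeslot count, the number of voice channels per E1 cannot grow without changing the multiplexing scheme, which is the limitation the abstract says CDM is meant to address.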

Creation and labeling of multiple phonotopic maps using a hierarchical self-organizing classifier (계층적 자기조직화 분류기를 이용한 다수 음성자판의 생성과 레이블링)

Chung, Dam;Lee, Kee-Cheol;Byun, Young-Tai
- The Journal of Korean Institute of Communications and Information Sciences, v.21 no.3, pp.600-611, 1996
Recently, neural network-based speech recognition has been studied to exploit the adaptivity and learnability of neural network models. However, conventional neural network models have difficulty with co-articulation processing and with detecting the boundaries of similar phonemes in Korean speech. Also, when a single phonotopic map is used, learning time may increase dramatically and inaccuracies may arise, because a homogeneous learning and recognition method must be applied to heterogeneous data. Hence, in this paper, a neural net typewriter is designed using a hierarchical self-organizing classifier (HSOC), and the related algorithms are presented. During its learning stage, the HSOC distributes phoneme data over hierarchically structured multiple phonotopic maps, using Kohonen's self-organizing feature maps (SOFM). The paper presents and tests algorithms for deciding the number of maps, the map sizes, the selection of phonemes and their placement per map, and an appropriate learning and preprocessing method per map. If maps are divided according to a priori linguistic knowledge, it is difficult to acquire that knowledge and to decide how to apply it (e.g., in processing extended phonemes). In contrast, our HSOC has the advantage that multiple phonotopic maps suited to the given input data are self-organized. The resulting three Korean phonotopic maps are optimally labelled, have their own optimal preprocessing schemes, and conform to conventional linguistic knowledge.
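For readers unfamiliar with Kohonen's SOFM, the building block mentioned in this abstract, the following is a minimal sketch of the basic single-map update rule. It is not the authors' hierarchical HSOC; the function names, the Gaussian neighbourhood, and the linear decay schedule are assumptions chosen for illustration.

```python
import math
import random


def make_map(rows, cols, dim, seed=0):
    """Create a rows x cols grid of random weight vectors (hypothetical helper)."""
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]


def best_matching_unit(weights, x):
    """Return (row, col) of the node whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_step(weights, x, lr, sigma):
    """Pull every node toward x, weighted by its grid distance to the BMU."""
    br, bc = best_matching_unit(weights, x)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2 * sigma ** 2))  # Gaussian neighbourhood
            for i in range(len(w)):
                w[i] += lr * h * (x[i] - w[i])
    return br, bc


def train(weights, samples, epochs=20, lr0=0.5, sigma0=1.5):
    """Run several epochs with linearly decaying learning rate and radius."""
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)
        sigma = max(sigma0 * (1 - frac), 0.3)
        for x in samples:
            train_step(weights, x, lr, sigma)
```

In a phonotopic map the input vectors would be acoustic feature frames, and after training each node is labelled with the phoneme it responds to most often; the feature extraction and labelling steps are outside this sketch.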

Universal Personal Telecommunications using Specialized Resource Functions in the Intelligent Peripheral (Intelligent Peripheral의 특수 음성 자원을 이용한 Universal Personal Telecommunications 서비스)

Kim, Gi-Ryeong;Kim, Tae-Il;Choe, Go-Bong
- The Transactions of the Korea Information Processing Society, v.3 no.6, pp.1506-1514, 1996
This paper proposes enhanced features for Universal Personal Telecommunications (UPT), namely voice authentication and voice synthesis, using the specialized resource functions in the Intelligent Peripheral (IP). The proposed voice authentication provides a simple and user-friendly security mechanism and prevents unauthorized users from fraudulently using the UPT number. Also, whereas the traditional UPT service delivers only fixed messages to the UPT user, the proposed UPT service supports flexible message transfer by means of voice synthesis.

Motion Study of Treatment Robot for Autistic Children Using Speech Data Classification Based on Artificial Neural Network (음성 분류 인공신경망을 활용한 자폐아 치료용 로봇의 지능화 동작 연구)

Lee, Jin-Gyu;Lee, Bo-Hee
- Journal of IKEEE, v.23 no.4, pp.1440-1447, 2019
The prevalence of autism spectrum disorders in children is currently reported to be increasing, and the disorders take various forms. In particular, affected children have difficulty with social communication and need training to improve it. This study therefore proposes a method of acquiring voice information through a microphone mounted on a robot designed in preliminary research and using this information to produce intelligent motions. An ANN (Artificial Neural Network) was used to classify the speech data into robot motions, and we tried to improve accuracy by combining a Recurrent Neural Network with a Convolutional Neural Network base. The input speech data were preprocessed using MFCC (Mel-Frequency Cepstral Coefficients), and the motion of the robot was estimated using various data normalization and neural network optimization techniques. In experiments comparing its accuracy with an existing architecture and with a method involving human intervention, the designed ANN showed high accuracy. Future work is to design robot motions with higher accuracy and to apply them in treatment and education settings for children with autism.
https://doi.org/10.7471/ikeee.2019.23.4.1440

Search Result 877, Processing Time 0.027 seconds
