Search | Korea Science

Lee, Mun-hak;Chang, Joon-Hyuk
- The Journal of the Acoustical Society of Korea / v.38 no.5 / pp.601-606 / 2019
This paper studies Korean speech recognition using grapheme units (Cho-sung [onset], Jung-sung [nucleus], Jong-sung [coda]). We build an ASR (automatic speech recognition) system without a G2P (grapheme-to-phoneme) process and show that deep-learning-based ASR systems can learn Korean pronunciation rules without G2P. The proposed model is shown to reduce the word error rate when sufficient training data is available.
https://doi.org/10.7776/ASK.2019.38.5.601
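As an illustration of the grapheme units named in the abstract above, the following sketch splits precomposed Hangul syllables into onset, nucleus, and coda using the standard Unicode syllable arithmetic. It is not the authors' preprocessing: the abstract does not describe their label set, and the function names `decompose` and `to_graphemes` are hypothetical.

```python
# Standard Unicode arithmetic for precomposed Hangul syllables (U+AC00..U+D7A3).
# Assumption: graphemes are represented as conjoining jamo code points.
CHOSEONG = [chr(0x1100 + i) for i in range(19)]           # onsets (Cho-sung)
JUNGSEONG = [chr(0x1161 + i) for i in range(21)]          # nuclei (Jung-sung)
JONGSEONG = [""] + [chr(0x11A8 + i) for i in range(27)]   # codas (Jong-sung); "" means no coda

SYLLABLE_BASE = 0xAC00
SYLLABLE_COUNT = 19 * 21 * 28  # 11172 precomposed syllables


def decompose(char):
    """Return the (onset, nucleus[, coda]) jamo of one Hangul syllable.

    Characters outside the precomposed syllable block are returned unchanged.
    """
    index = ord(char) - SYLLABLE_BASE
    if not 0 <= index < SYLLABLE_COUNT:
        return (char,)
    onset, rest = divmod(index, 21 * 28)
    nucleus, coda = divmod(rest, 28)
    parts = (CHOSEONG[onset], JUNGSEONG[nucleus], JONGSEONG[coda])
    return tuple(p for p in parts if p)


def to_graphemes(text):
    """Flatten a string into a list of grapheme-unit labels."""
    units = []
    for char in text:
        units.extend(decompose(char))
    return units
```

For example, `decompose("한")` yields the three jamo U+1112, U+1161, and U+11AB, while a syllable without a coda such as "가" yields only two units.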

Yong-Seok Choi;Jeong-Uk Bang;Seung Hi Kim
- ETRI Journal / v.46 no.1 / pp.118-126 / 2024
In human conversations, listeners often utilize brief backchannels such as "uh-huh" or "yeah." Timely backchannels are crucial to understanding and increasing trust among conversational partners. In human-machine conversation systems, users can engage in natural conversations when a conversational agent generates backchannels like a human listener. We propose a method that simultaneously predicts backchannels and recognizes speech in real time. We use a streaming transformer and adopt multitask learning for concurrent backchannel prediction and speech recognition. The experimental results demonstrate the superior performance of our method compared with previous works while maintaining a similar single-task speech recognition performance. Owing to the extremely imbalanced training data distribution, the single-task backchannel prediction model fails to predict any of the backchannel categories, and the proposed multitask approach substantially enhances the backchannel prediction performance. Notably, in the streaming prediction scenario, the performance of backchannel prediction improves by up to 18.7% compared with existing methods.
https://doi.org/10.4218/etrij.2023-0358

Hyo Hyun Choi;Min-Ho Cho
- Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.303-304 / 2024
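This paper implements an automatic login service based on face recognition for a website. We design and implement a web application through which users can apply to job postings of interest and can search for, save, and collect companies of interest. The front end is built with React, and the back end is implemented with SpringBoot and Flask. The face_recognition library is used for the face recognition behind the automatic login.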

Hong Soo Ki
- MALSORI / no.43 / pp.45-55 / 2002
High reliability of automatic speaker recognition regardless of voice conditions is necessary for forensic applications. Audio recordings in real cases are not consistent in voice conditions such as duration, time interval between recordings, given text versus conversational speech, and transmission channel. This study investigated how the verification error rate of an ASR system varies with these voice conditions. The results indicate that, to decrease both the false rejection rate and the false acceptance rate, varied voices should be used for training and the training voice samples should be longer in duration than the test samples.
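The two error rates discussed in this abstract can be computed as in the sketch below. It assumes a verifier that accepts a trial when its score is at or above a threshold. The abstract does not describe the study's scoring scheme, so the function name `error_rates` and the decision rule are illustrative only.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_rejection_rate, false_acceptance_rate) at one threshold.

    Assumed decision rule: a trial is accepted when score >= threshold.
    genuine_scores  : scores of trials where the claimed speaker is correct
    impostor_scores : scores of trials where the claimed speaker is wrong
    """
    false_rejections = sum(1 for s in genuine_scores if s < threshold)
    false_acceptances = sum(1 for s in impostor_scores if s >= threshold)
    frr = false_rejections / len(genuine_scores)
    far = false_acceptances / len(impostor_scores)
    return frr, far
```

Raising the threshold lowers the false acceptance rate at the cost of more false rejections, which is why the two rates are normally reported together.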

오종훈;이경순;최기선
- Proceedings of the Korean Society for Cognitive Science Conference / 2000.05a / pp.131-136 / 2000
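Existing statistics-based automatic term recognition techniques have shown relatively good performance. However, there is still room to improve performance by using information such as terminology dictionaries. On this basis, this paper proposes a method that builds hierarchical information among technical terms from a terminology dictionary and uses it to extract technical terms. The proposed technique showed better performance than existing methods.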

Chung Minhwa;Seo Jungyun;Lee Yong-Jo;Han Myungsoo
- Proceedings of the KSPS conference / 2003.05a / pp.32-35 / 2003
This paper describes our recent work on developing a baseline platform for Korean spoken dialog recognition. We have collected a speech corpus of about 65 hours with auditory transcriptions. Linguistic information at various levels, such as morphology, syntax, semantics, and discourse, is attached to the speech database using automatic or semi-automatic tagging tools.

Kim, Dae-Hak;Cheong, Hyoung-Chul
- Journal of the Korean Data and Information Science Society / v.14 no.3 / pp.525-533 / 2003
In this paper we introduce some existing preprocessing algorithms for character recognition and consider feature extraction methods for the recognition of low-resolution documents. Characters in low-resolution document images, including fax images, are frequently misclassified due to the blurring effect, slope effect, noise, and so on. To overcome these difficulties, we considered a mesh feature and a contour direction code feature. A system for automatic character recognition is suggested.
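A mesh feature is commonly computed by overlaying a grid on a binarized character image and recording the foreground density of each cell. The sketch below follows that common definition. The abstract does not give the paper's exact formulation, so the function name `mesh_features`, the grid parameters, and the density measure are assumptions.

```python
def mesh_features(image, rows, cols):
    """Return the foreground density of each grid cell, in row-major order.

    image : list of equal-length rows of 0/1 pixels (1 = foreground ink)
    rows, cols : number of grid cells along each axis (assumed parameters)
    """
    height, width = len(image), len(image[0])
    features = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            cell = [image[y][x]
                    for y in range(top, bottom)
                    for x in range(left, right)]
            # An empty cell (grid finer than the image) contributes zero density.
            features.append(sum(cell) / len(cell) if cell else 0.0)
    return features
```

Because each value averages over a block of pixels, the feature vector is less sensitive to isolated noise pixels than the raw bitmap, which may be one reason such features are considered for low-resolution input.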

Yun, Weon-Hee
- Proceedings of the KSPS conference / 2003.10a / pp.3-16 / 2003
ㆍAcoustic characteristics of stops in speech with contextual variability
ㆍPossibility of stop recognition by a post-processing technique
ㆍFurther work: speech database; modification of the decoder; automatic segmentation of acoustic parameters

Sakai, Y.;Kitazawa, M.;Yokota, T.
- 제어로봇시스템학회:학술대회논문집 / 1997.10a / pp.581-584 / 1997
This paper discusses a stroke identification technique for the automatic recognition of kanji characters that does not rely on the order in which a character's strokes are drawn.