• Title/Summary/Keyword: automatic recognition

Search Result 1,066, Processing Time 0.031 seconds

Korean speech recognition based on grapheme (문자소 기반의 한국어 음성인식)

  • Lee, Mun-hak;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.38 no.5
    • /
    • pp.601-606
    • /
    • 2019
  • This paper is a study on speech recognition in the Korean using grapheme unit (Cho-sumg [onset], Jung-sung [nucleus], Jong-sung [coda]). Here we make ASR (Automatic speech recognition) system without G2P (Grapheme to Phoneme) process and show that Deep learning based ASR systems can learn Korean pronunciation rules without G2P process. The proposed model is shown to reduce the word error rate in the presence of sufficient training data.

Joint streaming model for backchannel prediction and automatic speech recognition

  • Yong-Seok Choi;Jeong-Uk Bang;Seung Hi Kim
    • ETRI Journal
    • /
    • v.46 no.1
    • /
    • pp.118-126
    • /
    • 2024
  • In human conversations, listeners often utilize brief backchannels such as "uh-huh" or "yeah." Timely backchannels are crucial to understanding and increasing trust among conversational partners. In human-machine conversation systems, users can engage in natural conversations when a conversational agent generates backchannels like a human listener. We propose a method that simultaneously predicts backchannels and recognizes speech in real time. We use a streaming transformer and adopt multitask learning for concurrent backchannel prediction and speech recognition. The experimental results demonstrate the superior performance of our method compared with previous works while maintaining a similar single-task speech recognition performance. Owing to the extremely imbalanced training data distribution, the single-task backchannel prediction model fails to predict any of the backchannel categories, and the proposed multitask approach substantially enhances the backchannel prediction performance. Notably, in the streaming prediction scenario, the performance of backchannel prediction improves by up to 18.7% compared with existing methods.

Development of Facial Recognition-enabled Automatic Login for Recruitment Websites (얼굴인식을 통한 자동로그인이 가능한 채용 웹사이트 개발)

  • Hyo Hyun Choi;Min-Ho Cho
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.303-304
    • /
    • 2024
  • 본 논문에서는 해당 웹 사이트에 얼굴 인식을 통한 자동로그인 시스템 서비스를 구현한다. 얼굴 인식 라이브러리로 face_recognition을 사용한다. 웹 사이트에 접근 시 원하는 채용공고에 지원할 수 있으며. 원하는 기업을 검색하고 저장하여 모아 볼 수 있는 서비스를 제공하는 웹 애플리케이션을 설계하고 구현한다. React를 사용하여 프론트엔드를 구성하고 SpringBoot와 Flask를 사용하여 벡엔드를 구현하였다. 자동로그인을 위한 얼굴 인식 라이브러리로 face_recognition을 사용한다.

  • PDF

Variation of the Verification Error Rate of Automatic Speaker Recognition System With Voice Conditions (다양한 음성을 이용한 자동화자식별 시스템 성능 확인에 관한 연구)

  • Hong Soo Ki
    • MALSORI
    • /
    • no.43
    • /
    • pp.45-55
    • /
    • 2002
  • High reliability of automatic speaker recognition regardless of voice conditions is necessary for forensic application. Audio recordings in real cases are not consistent in voice conditions, such as duration, time interval of recording, given text or conversational speech, transmission channel, etc. In this study the variation of verification error rate of ASR system with the voice conditions was investigated. As a result in order to decrease both false rejection rate and false acception rate, the various voices should be used for training and the duration of train voices should be longer than the test voices.

  • PDF

Automatic Terminology Recognition using the Dictionary Hierarchy (사전간 계층관계를 이용한 전문용어 자동 추출 기법)

  • 오종훈;이경순;최기선
    • Proceedings of the Korean Society for Cognitive Science Conference
    • /
    • 2000.05a
    • /
    • pp.131-136
    • /
    • 2000
  • 기존의 통계에 기반한 용어 자동 추출 기법(Automatic Term Recognition)은 비교적 좋은 성능의 결과를 보여왔다. 하지만 전문용어 사전 등의 정보를 이용하여 성능의 향상을 이룰 수 있는 여지는 여전히 남아있다. 본 논문에서는 이러한 근거에 기반하여 전문용어간의 계층 정보를 전문용어 사전을 통하여 구축하고 이를 이용하여 전문용어를 추출하는 방법을 제안하고자 한다. 본 논문이 제안하는 기법은 기존의 방법에 비해 좋은 성능을 나타내었다.

  • PDF

Development of a Baseline Platform for Spoken Dialog Recognition System (대화음성인식 시스템 구현을 위한 기본 플랫폼 개발)

  • Chung Minhwa;Seo Jungyun;Lee Yong-Jo;Han Myungsoo
    • Proceedings of the KSPS conference
    • /
    • 2003.05a
    • /
    • pp.32-35
    • /
    • 2003
  • This paper describes our recent work for developing a baseline platform for Korean spoken dialog recognition. In our work, We have collected about 65 hour speech corpus with auditory transcriptions. Linguistic information on various levels such as mophology, syntax, semantics, and discourse is attached to the speech database by using automatic or semi-automatic tools for tagging linguistic information.

  • PDF

Feature Extraction Method for the Character Recognition of the Low Resolution Document

  • Kim, Dae-Hak;Cheong, Hyoung-Chul
    • Journal of the Korean Data and Information Science Society
    • /
    • v.14 no.3
    • /
    • pp.525-533
    • /
    • 2003
  • In this paper we introduce some existing preprocessing algorithm for character recognition and consider feature extraction method for the recognition of low resolution document. Image recognition of low resolution document including fax images can be frequently misclassified due to the blurring effect, slope effect, noise and so on. In order to overcome these difficulties in the character recognition we considered a mesh feature extraction and contour direction code feature. System for automatic character recognition were suggested.

  • PDF

Multiple Acoustic Cues for Stop Recognition

  • Yun, Weon-Hee
    • Proceedings of the KSPS conference
    • /
    • 2003.10a
    • /
    • pp.3-16
    • /
    • 2003
  • ㆍAcoustic characteristics of stops in speech with contextual variability ㆍPosibility of stop recognition by post processing technique ㆍFurther work - Speech database - Modification of decoder - automatic segmentation of acoustic parameters

  • PDF