Integrated Search | Korea Science

Speech Recognition Accuracy Measure using Deep Neural Network for Effective Evaluation of Speech Recognition Performance

지승은;김우일
- 한국정보통신학회논문지 / Vol. 21, No. 12 / pp.2291-2297 / 2017
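This paper describes several algorithms for extracting speech characteristic measures used to evaluate speech databases, and proposes a new method, based on a deep neural network (DNN), for generating a speech recognition performance measure. In a previous study, a new performance measure was generated by combining several speech characteristic measures that correlate highly with the Word Error Rate (WER), the representative measure of speech recognition performance. Across a variety of noise environments, the combined measure showed a higher correlation with WER than any single speech characteristic measure used alone, demonstrating that it is effective for predicting speech recognition performance. This paper describes a DNN-based method for extracting a speech characteristic measure and confirms that replacing the Gaussian Mixture Model (GMM) acoustic model probability value used in the previous combination with a probability value obtained through DNN training yields an even higher correlation with WER.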
https://doi.org/10.6109/jkiice.2017.21.12.2291
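The abstract above relies on two quantities: the Word Error Rate and the correlation between a candidate measure and WER. The following is a minimal sketch of both, assuming whitespace-tokenized transcripts and the Pearson correlation coefficient; the abstract does not say which correlation statistic the authors used, and the function names `word_error_rate` and `pearson` are illustrative only.

```python
from math import sqrt


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            curr[j] = min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # match (cost 0) or substitution (cost 1)
            )
        prev = curr
    return prev[-1] / len(ref)


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)
```

In use, one would compute WER per noise condition, compute the candidate measure for the same conditions, and pass the two lists to `pearson`; a value close to 1 or -1 indicates that the measure tracks WER closely.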

Performance Comparison of Recurrent Neural Networks and Conditional Random Fields in Biomedical Named Entity Recognition

조병철;김유섭
- 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리) / 한국정보과학회언어공학연구회 2016년도 제28회 한글 및 한국어 정보처리 학술대회 / pp.321-323 / 2016
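Recent work performs named entity recognition with supervised machine learning. Supervised learning, however, requires considerable cost and time to build data. This study uses an annotated corpus and takes a supervised approach. Biomedical named entity recognition is an important foundational task for processing text that contains entities such as Protein, RNA, DNA, Cell type, and Cell line, and it is one of the most basic and essential tasks in biomedical knowledge retrieval. This study compares the performance of recurrent neural networks (RNNs) with that of conditional random fields (CRFs) that use word embeddings as features. A CRF using only N_Gram features serves as the baseline, with a 70.09% F1 Score. The Jordan-type RNN achieves a 60.75% F1 Score and the Elman-type RNN a 58.80% F1 Score. CRFs using CCA, GLOVE, and WORD2VEC achieve 72.73%, 72.74%, and 72.82% F1 Score, respectively.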
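The F1 Scores quoted above are commonly computed at the entity level for named entity recognition. The sketch below shows one way to do that, assuming BIO-style tags (`B-protein`, `I-protein`, `O`) and exact span matching; the abstract does not state the tagging scheme or matching criterion, so both are assumptions, as are the function names.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans encoded by a BIO tag sequence."""
    entities, start, etype = set(), None, None
    # A trailing "O" sentinel closes any span that runs to the end of the sequence.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and start is not None and tag[2:] == etype
        if not continues:
            if start is not None:
                entities.add((start, i, etype))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return entities


def f1_score(gold_tags, pred_tags) -> float:
    """Entity-level F1: harmonic mean of precision and recall over exact span matches."""
    gold, pred = extract_entities(gold_tags), extract_entities(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With this definition, a prediction that finds one of two gold entities and nothing else has precision 1.0, recall 0.5, and F1 of about 0.667.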

Effects of reverberation time on binaural Korean monosyllabic word recognition in normal hearing subjects

임덕환
- 한국음향학회지 / Vol. 40, No. 6 / pp.678-682 / 2021
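Reverberation time, which coexists with noise indoors, affects speech recognition. The extent of the effect may vary with the listening conditions and the characteristics of the language used. This study examined the effect of reverberation time on standardized Korean monosyllabic word discrimination in 10 normal-hearing adults under binaural listening conditions. The binaural listening effect was measured under a diotic condition (noise in phase between the two ears) and a dichotic condition (interaural noise phase difference of π), with the signal-to-noise ratio fixed at 0 dB (55 dB HL). Word Recognition Scores (WRS), the measure of the subjects' monosyllabic discrimination, were analyzed at a reverberation time of 3.4 s, at which the effect of reverberation can be observed. In the dichotic condition, discrimination improved significantly compared with monaural listening (p < 0.05), whereas the diotic condition showed no significant difference from monaural listening. These results may serve as a reference for analyzing noisy acoustic environments in which reverberation time is a factor.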
https://doi.org/10.7776/ASK.2021.40.6.678

Weibo Disaster Rumor Recognition Method Based on Adversarial Training and Stacked Structure

Diao, Lei;Tang, Zhan;Guo, Xuchao;Bai, Zhao;Lu, Shuhan;Li, Lin
- KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 10 / pp.3211-3229 / 2022
To address problems in Weibo disaster rumor recognition, such as the lack of a corpus, poor text standardization, the difficulty of learning semantic information, and the simple semantic features of disaster rumor text, this paper takes Sina Weibo as the data source, constructs a dataset for Weibo disaster rumor recognition, and proposes a deep learning model, BERT_AT_Stacked LSTM. First, an adversarial disturbance is added to the embedding vector of each word to generate adversarial samples that enhance the features of rumor text, and adversarial training is carried out to address the relatively homogeneous text features of disaster rumors. Second, the BERT part obtains the word-level semantic information of each Weibo text and generates a hidden vector containing sentence-level feature information. Finally, the hidden complex semantic information of poorly regulated Weibo texts is learned using a Stacked Long Short-Term Memory (Stacked LSTM) structure. The experimental results show that the proposed model outperforms the comparison models in recognizing disaster rumors on Weibo, with an F1_Score of 97.48%. On an open general-domain dataset it achieves an F1_Score of 94.59%, indicating that the model generalizes well.
https://doi.org/10.3837/tiis.2022.10.001
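The first step in the abstract above, adding an adversarial disturbance to each word's embedding vector, can be illustrated with a small sketch. The abstract does not name the perturbation method; the version below assumes the Fast Gradient Method, in which the perturbation is the loss gradient rescaled to a fixed L2 norm, r = ε · g / ‖g‖. The function names, the `epsilon` parameter, and the plain-list representation of vectors are all illustrative assumptions.

```python
from math import sqrt


def fgm_perturbation(gradient, epsilon=1.0):
    """Rescale the loss gradient to L2 norm epsilon: r = epsilon * g / ||g||."""
    norm = sqrt(sum(g * g for g in gradient))
    if norm == 0.0:
        # A zero gradient gives no direction to perturb in.
        return [0.0] * len(gradient)
    return [epsilon * g / norm for g in gradient]


def adversarial_embedding(embedding, gradient, epsilon=1.0):
    """Return the embedding shifted in the direction that increases the loss."""
    perturbation = fgm_perturbation(gradient, epsilon)
    return [e + r for e, r in zip(embedding, perturbation)]
```

In adversarial training, the model's loss on the perturbed embeddings is typically added to the loss on the clean embeddings before the parameter update, so the classifier learns to be stable under small shifts of the input representation.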

Integrated Char-Word Embedding on Chinese NER using Transformer

김춘광;조인휘
- 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2021년도 춘계학술발표대회 / pp.415-417 / 2021
Because words in Chinese sentences are written continuously without delimiters and the vocabulary is very large, Chinese NER (Named Entity Recognition) has typically been based on character representations. In recent years, many studies have reconsidered how to integrate word information into Chinese NER models. However, traditional sequence models have a complex structure and slow inference speed, and they require additional dictionary information, which makes them difficult to deploy in industry. The approach in this paper is state-of-the-art and parallelizable; it integrates char-word embeddings so that the model learns word information. The proposed model is easy to implement, outperforms traditional models in speed and efficiency, and improves the f1-score on two datasets.
https://doi.org/10.3745/PKIPS.y2021m05a.415

Feature Generation of Dictionary for Named-Entity Recognition based on Machine Learning

김재훈;김형철;최윤수
- 정보관리연구 / Vol. 41, No. 2 / pp.31-46 / 2010
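Today, named-entity recognition, as one step of information extraction, is widely used not only in information retrieval but also in question answering and summarization. Unlike ordinary words, named entities are continually created and changed across a wide range of documents. Because of this property, many application systems suffer from the unknown-word problem. To address this problem, this paper proposes a new feature generation method for machine-learning-based named-entity recognition systems. Such systems generally use word-level features, so phrase-level named entities cannot be used directly as features. The proposed method converts phrase-level information into word-level features, which made it possible to use a named-entity dictionary and WordNet as features for named-entity recognition. As a result, the F1 score of the English named-entity recognition system improved by about 6% and errors decreased by about 38%.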
https://doi.org/10.1633/JIM.2010.41.2.031
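The idea of converting phrase-level dictionary entries into word-level features can be sketched as follows. This is not the authors' method, whose details the abstract does not give; it is one common encoding, assuming greedy longest-match lookup and B-/I- position prefixes. The dictionary contents, the `max_len` limit, and the function name are hypothetical.

```python
def dictionary_features(tokens, dictionary, max_len=5):
    """Tag each token covered by a dictionary phrase with B-<label> or I-<label>.

    `dictionary` maps lower-cased, space-joined phrases to entity labels.
    Matching is greedy: at each position the longest matching phrase wins.
    """
    features = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n]).lower()
            if phrase in dictionary:
                label = dictionary[phrase]
                features[i] = "B-" + label          # first word of the phrase
                for k in range(i + 1, i + n):
                    features[k] = "I-" + label      # remaining words of the phrase
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return features
```

Each token then carries a single categorical feature that a word-level learner can consume alongside its other features, even though the dictionary entry itself spans several words.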

Development of Autonomous Mobile Robot with Speech Teaching Command Recognition System Based on Hidden Markov Model

조현수;박민규;이현정;이민철
- 제어로봇시스템학회논문지 / Vol. 13, No. 8 / pp.726-734 / 2007
Generally, a mobile robot moves according to pre-programmed input. However, it is very hard for a non-expert to change the program that generates the robot's moving path, because he or she knows little about the teaching commands and operating methods for driving the robot. Therefore, a teaching method based on speech commands is increasingly required for people without the use of their hands and for non-experts who lack the expert knowledge needed to generate a path. In this study, an autonomous mobile robot with a speech recognition function is developed so that its moving path can be taught easily. Using the human voice as the teaching method provides a more convenient user interface for the mobile robot. To implement the teaching function, the designed robot system is composed of three separate control modules: a speech preprocessing module, a DC servo motor control module, and a main control module. We design and implement a speaker-dependent isolated word recognition system for creating the moving path of an autonomous mobile robot in an unknown environment. The system uses word-level Hidden Markov Models (HMMs) for the designated command vocabulary, with neural network postprocessing applied conditionally based on a confidence score. For spectral analysis, we use a filter-bank analysis model to extract voice features. The proposed word recognition system is tested using 33 Korean words for controlling mobile robot navigation, and we also evaluate the navigation performance of the mobile robot using only voice commands.
https://doi.org/10.5302/J.ICROS.2007.13.8.726
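Isolated word recognition with word-level HMMs, as described above, amounts to scoring the utterance against one model per command word and choosing the best. The sketch below is a simplified illustration, not the paper's system: it assumes discrete observation symbols rather than filter-bank feature vectors, uses the normalized likelihood of the best word as a stand-in confidence score, and returns `None` for low-confidence results where the paper applies neural network postprocessing. All names and the threshold are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


def recognize(obs, models, threshold):
    """Return (word or None, confidence) for a non-empty observation sequence.

    `models` maps each command word to a (start, trans, emit) tuple.
    Confidence is the best word's likelihood divided by the sum over all words.
    """
    scores = {word: forward_likelihood(obs, *m) for word, m in models.items()}
    best = max(scores, key=scores.get)
    total = sum(scores.values())
    confidence = scores[best] / total if total > 0 else 0.0
    return (best if confidence >= threshold else None), confidence
```

For long utterances a practical implementation would work with log probabilities or scaled forward variables to avoid numerical underflow.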

Hangul Segmentation and Word Verification System for Automatic Address Processing

이충식;김인중;신종탁;김진형
- 대한전자공학회:학술대회논문집 / 대한전자공학회 2000년도 추계종합학술대회 논문집(3) / pp.37-40 / 2000
A fast method of Hangul address word verification is presented in this paper. It adopts pre-segmentation and recognition by DP matching. An address line image is over-segmented by analyzing the topology of connected components and the projection profile. A fast individual Hangul character verifier was developed by applying an SVM (Support Vector Machine). The segmentation hypotheses are represented by a lattice structure, and a best-path search by dynamic programming generates the most probable segmentation path and the final verification score. The word verifier was tested on a DB of 310 address images, and the results indicate the possibility of improving this method.
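The best-path search over a segmentation lattice can be sketched with a small dynamic program. The sketch assumes the line image has already been over-segmented into `n_pieces` consecutive pieces and that a character verifier returns a score for the hypothesis that pieces `i..j` form a given character. Here `char_score` is a hypothetical stand-in for the SVM verifier, and `max_merge` (the most pieces one character may span) is an assumed parameter.

```python
def verify_word(n_pieces, word, char_score, max_merge=3):
    """Find the best split of n_pieces pieces into len(word) consecutive characters.

    char_score(i, j, ch) scores the hypothesis that pieces i..j-1 form character ch.
    Returns (total_score, [(start, end), ...]) or (-inf, []) if no split exists.
    """
    NEG = float("-inf")
    m = len(word)
    # best[k][j]: best score for matching the first k characters to the first j pieces.
    best = [[NEG] * (n_pieces + 1) for _ in range(m + 1)]
    back = [[None] * (n_pieces + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for k in range(1, m + 1):
        for j in range(k, n_pieces + 1):
            for i in range(max(k - 1, j - max_merge), j):
                if best[k - 1][i] == NEG:
                    continue
                score = best[k - 1][i] + char_score(i, j, word[k - 1])
                if score > best[k][j]:
                    best[k][j], back[k][j] = score, i
    if best[m][n_pieces] == NEG:
        return NEG, []
    # Trace the back-pointers to recover the segmentation path.
    cuts, j = [], n_pieces
    for k in range(m, 0, -1):
        i = back[k][j]
        cuts.append((i, j))
        j = i
    return best[m][n_pieces], cuts[::-1]
```

The returned total plays the role of the final verification score: a word hypothesis would be accepted when the score, possibly normalized by word length, exceeds a chosen threshold.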

Status Report on the Korean Speech Recognition Platform

권오욱;권석봉;장규철;윤성락;김용래;장광동;김희린;유창동;김봉완;이용주
- 대한음성학회:학술대회논문집 / 대한음성학회 2005년도 추계 학술대회 발표논문집 / pp.215-218 / 2005
This paper reports the current status of development of the Korean speech recognition platform (ECHOS). We implement new modules including ETSI feature extraction, backward search with trigram, and utterance verification. The ETSI feature extraction module is implemented by converting the public software to an object-oriented program. We show that trigram language modeling in the backward search pass reduces the word error rate from 23.5% to 22% on a large vocabulary continuous speech recognition task. We confirm the utterance verification module by examining word graphs with confidence score.

New Postprocessing Methods for Rejecting Out-of-Vocabulary Words

Song, Myung-Gyu
- The Journal of the Acoustical Society of Korea / Vol. 16, No. 3E / pp.19-23 / 1997
The goal of postprocessing in automatic speech recognition is to improve recognition performance by utterance verification at the output of the recognition stage. It focuses on the effective rejection of out-of-vocabulary words based on the confidence score of a hypothesized candidate word. We present two methods for computing confidence scores. Both methods are based on the distance between each observation vector and the representative code vector, which is defined as the most likely code vector at each state. While the first method employs simple time normalization, the second uses a normalization technique based on the concept of the on-line garbage model[1]. In a speaker-independent isolated word recognition experiment with discrete-density HMMs, the second method outperforms both the first one and the conventional likelihood ratio scoring method[2].
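The first method described above, a distance-based confidence score with simple time normalization, can be sketched as follows. The abstract does not specify the distance measure or how the state sequence is obtained; the sketch assumes Euclidean distance, a state path supplied by a prior alignment (for example Viterbi), and a fixed rejection threshold. The on-line garbage normalization of the second method is not shown. All names are illustrative.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def time_normalized_distance(observations, state_path, representative):
    """Average per-frame distance to the representative code vector of the aligned state.

    `representative` maps a state index to that state's most likely code vector.
    Dividing by the number of frames makes scores comparable across utterance lengths.
    """
    if not observations or len(observations) != len(state_path):
        raise ValueError("observations and state_path must be non-empty and equal in length")
    total = sum(
        euclidean(obs, representative[state])
        for obs, state in zip(observations, state_path)
    )
    return total / len(observations)


def accept(observations, state_path, representative, threshold):
    """Accept the hypothesized word when the normalized distance is small enough."""
    return time_normalized_distance(observations, state_path, representative) <= threshold
```

A smaller normalized distance means the utterance lies closer to the hypothesized word's model, so out-of-vocabulary words would tend to produce larger values and be rejected.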

50 search results (processing time: 0.031 s)
