Integrated Search | Korea Science

Automatic Speech Database Verification Method Based on Confidence Measure

Kang Jeomja;Jung Hoyoung;Kim Sanghun
- 대한음성학회지:말소리 / No. 51 / pp.71-84 / 2004
In this paper, we propose an automatic speech database verification method (automatic verification for short) based on a confidence measure for large speech databases. The method uses the confidence measure to verify the consistency between a given transcription and the corresponding speech. The automatic verification process consists of two stages: a word-level likelihood computation stage and a multi-level likelihood ratio computation stage. In the word-level likelihood computation stage, we calculate the word-level likelihood using the Viterbi decoding algorithm and generate the segment information. In the multi-level likelihood ratio computation stage, we calculate the word-level and phone-level likelihood ratios based on a confidence measure with an anti-phone model. With automatic verification, we achieved about 61% error reduction and reduced the verification time from 1 month of manual work to 1-2 days.
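The second stage described in this abstract can be illustrated with a minimal sketch. It assumes that a Viterbi alignment has already produced, for each phone segment, a summed log-likelihood under the phone model and under an anti-phone model. The `PhoneSegment` class, the frame normalisation, the averaging over phones, and the threshold of 0.0 are all illustrative assumptions, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PhoneSegment:
    """One aligned phone segment (hypothetical structure for illustration)."""
    label: str
    n_frames: int
    loglik_phone: float  # log p(O | phone model), summed over the segment
    loglik_anti: float   # log p(O | anti-phone model), summed over the segment


def phone_llr(seg: PhoneSegment) -> float:
    """Frame-normalised log-likelihood ratio for one phone segment."""
    return (seg.loglik_phone - seg.loglik_anti) / seg.n_frames


def word_llr(segments: List[PhoneSegment]) -> float:
    """Word-level score, taken here as the mean of the phone-level ratios."""
    return sum(phone_llr(s) for s in segments) / len(segments)


def verify_word(segments: List[PhoneSegment], threshold: float = 0.0) -> bool:
    """Accept the transcription of a word if its score reaches the threshold."""
    return word_llr(segments) >= threshold


# Made-up scores: the first word fits its transcription, the second does not.
good = [PhoneSegment("k", 10, -50.0, -65.0), PhoneSegment("a", 20, -90.0, -130.0)]
bad = [PhoneSegment("k", 10, -80.0, -60.0), PhoneSegment("a", 20, -150.0, -120.0)]

print(word_llr(good), verify_word(good))
print(word_llr(bad), verify_word(bad))
```

In practice the threshold would be tuned on held-out data with known transcription errors; the value used here is only a placeholder.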

대용량 운율 음성데이타를 이용한 자동합성방식 (Automatic Synthesis Method Using Prosody-Rich Database)

김상훈
- 한국음향학회:학술대회논문집 / 한국음향학회 1998년도 제15회 음성통신 및 신호처리 워크샵(KSCSP 98 15권1호) / pp.87-92 / 1998
In general, synthesis unit databases have been constructed by recording isolated words. In that case, each word boundary has a typical prosodic pattern such as falling intonation or preboundary lengthening. To get natural synthetic speech from this kind of database, the original speech must be artificially distorted. However, that artificial process instead resulted in unnatural, unintelligible synthetic speech because of excessive prosodic modification of the speech signal. To overcome these problems, we gathered thousands of sentences for the synthesis database. To make phone-level synthesis units, we trained a speech recognizer on the recorded speech and then segmented phone boundaries automatically. In addition, we used a laryngograph for epoch detection. From the automatically generated synthesis database, we chose the best phone and concatenated it directly without any prosody processing. To select the best phone among multiple candidates, we used prosodic information such as the break strength of word boundaries, phonetic context, cepstrum, pitch, energy, and phone duration. The pilot test gave some positive results.

Large scale word recognizer를 위한 음성 database - POW (The Speech Database for Large Scale Word Recognizer)

임연자
- 한국음향학회:학술대회논문집 / 한국음향학회 1995년도 제12회 음성통신 및 신호처리 워크샵 논문집 (SCAS 12권 1호) / pp.291-294 / 1995
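This paper describes the POW algorithm and the POW set for a large scale word recognizer that results from running it. To build a speech database for a large scale word recognizer, the POW set must include every possible phonological phenomenon. In addition, the distribution of phonological phenomena in the POW set should be similar to that of the population from which it is drawn. To this end, we propose a new algorithm for extracting a POW set with the following three properties: 1. It must include every phonological phenomenon that occurs in the population. 2. It must consist of a minimal set of words. 3. The distributions of phonological phenomena in the POW set and in the population must be similar. We extracted 5,000 high-frequency eojeol (word phrases) from a Korean text corpus of about 3 million eojeol, and extracted the Korean POW set from them.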
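The three properties listed in this abstract resemble a set-cover problem with an added distribution check. The sketch below is not the POW algorithm itself, which the abstract does not spell out. It substitutes a standard greedy set-cover heuristic for properties 1 and 2 and uses total variation distance as one possible similarity measure for property 3. The lexicon, the phenomenon labels, and all function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def greedy_cover(word_phenomena: Dict[str, Set[str]]) -> List[str]:
    """Pick words until every phenomenon in the lexicon is covered.

    Greedy heuristic: always take the word covering the most phenomena
    that are still uncovered (ties broken alphabetically).
    """
    uncovered: Set[str] = set().union(*word_phenomena.values())
    chosen: List[str] = []
    while uncovered:
        best = max(sorted(word_phenomena),
                   key=lambda w: len(word_phenomena[w] & uncovered))
        chosen.append(best)
        uncovered -= word_phenomena[best]
    return chosen


def distribution(words: List[str],
                 word_phenomena: Dict[str, Set[str]]) -> Dict[str, float]:
    """Relative frequency of each phenomenon over the given words."""
    counts = Counter(p for w in words for p in word_phenomena[w])
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical lexicon: each word maps to the phenomena it exhibits.
lexicon = {
    "w1": {"A", "B"},
    "w2": {"B", "C"},
    "w3": {"C", "D"},
    "w4": {"D"},
}

subset = greedy_cover(lexicon)
gap = total_variation(distribution(subset, lexicon),
                      distribution(list(lexicon), lexicon))
print(subset, gap)
```

A greedy cover is not guaranteed to be minimal, and it does not optimise the distribution gap; a fuller treatment would trade off the two goals, for example by adding words while the gap keeps shrinking.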

음성 데이터베이스로부터의 효율적인 색인데이터베이스 구축과 정보검색 (The Extraction of Effective Index Database from Voice Database and Information Retrieval)

박미성
- 한국도서관정보학회지 / Vol. 35, No. 3 / pp.271-291 / 2004
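Information providers such as digital libraries face demand for services over unstructured multimedia data such as images, speech, and video. This study therefore proposes an eojeol (word phrase) generator, a syllable restorer, a morphological analyzer, and a corrector for speech processing. Using the proposed speech processing techniques, the voice database was converted into a text database, and an index database was then extracted from the text database. An information retrieval model is also proposed to show that the extracted index database can be used for content-based retrieval of both text and speech.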

훈련용 단어 음성DB 검증 (A Validation of the Isolated Word Speech Database)

이수종;김상훈;이영직
- 대한음성학회:학술대회논문집 / 대한음성학회 2003년도 5월 학술대회지 / pp.36-39 / 2003
The purpose of this paper is to correct the errors in an isolated word speech database in a PC environment and to analyze the various errors. The importance and procedures of error detection are also described.

대용량 비정형 데이터 자료 입력 및 출력 (Data Input and Output of Unstructured Data of Large Capacity)

심규철;강병준;김경환;정회경
- 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2013년도 춘계학술대회 / pp.613-615 / 2013
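Recently, demand for converting word processor files to XML for service delivery has been growing. This paper proposes a system that converts data entered in word processor files (아래한글, MS-Office) into XML files, lets the user create an XML mapping file, and directly extracts the data entered in the word processor and stores it in a database. Conversely, with a form prepared in advance in the word processor, an application program can query the required data from the database and generate a word processor file.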

자동차 소음 환경에서 음성 인식 (Speech Recognition in the Car Noise Environment)

김완구;차일환;윤대희
- 전자공학회논문지B / Vol. 30B, No. 2 / pp.51-58 / 1993
This paper describes the development of a speaker-dependent isolated word recognizer applied to voice dialing in a car noise environment. For this purpose, several methods to improve performance under such conditions are evaluated using a database collected in a small car moving at 100 km/h. The main features of the recognizer are as follows. The endpoint detection error is reduced by using the magnitude of the signal inverse filtered by the AR model of the background noise, and the remaining error is compensated by using variants of the DTW algorithm. To remove the noise, an autocorrelation subtraction method is used with the constraint that the residual energy obtainable by linear predictive analysis should be positive. By using a noise-robust distance measure, distortion of the feature vector is minimized. The speech recognizer is implemented on the Motorola DSP56001 (a 24-bit general purpose digital signal processor). The recognition database is composed of 50 Korean names spoken by 3 male speakers. The recognition error rate of the system is reduced to 4.3% using a single reference pattern for each word and 1.5% using 2 reference patterns for each word.
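Template matching with one or more reference patterns per word, as mentioned in this abstract, can be sketched with textbook dynamic time warping. The code below is plain DTW over one-dimensional features with an absolute-difference local cost. It is not the DTW variant or the noise-robust distance measure used in the paper, and the function names and toy data are assumptions for illustration only.

```python
from typing import Dict, List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW cost between two 1-D sequences (absolute difference)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def recognise(utterance: Sequence[float],
              references: Dict[str, List[Sequence[float]]]) -> str:
    """Return the word whose closest reference pattern has the lowest cost."""
    return min(references,
               key=lambda word: min(dtw_distance(utterance, ref)
                                    for ref in references[word]))


# Toy data: two words, with two reference patterns for the first one.
references = {
    "name_a": [[1.0, 2.0, 2.0, 3.0], [1.0, 1.5, 3.0]],
    "name_b": [[5.0, 5.0, 5.0]],
}

print(dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0]))
print(recognise([1.0, 2.0, 3.0], references))
```

Keeping several reference patterns per word and scoring against the closest one is a simple way to absorb pronunciation variability, at the cost of more comparisons per utterance.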

학위논문 전문데이터베이스 구축 및 서비스환경 구현 (Construction of Full-Text Database and Implementation of Service Environment for Electronic Theses and Dissertations)

이기호;김진숙;윤화묵
- 한국정보처리학회논문지 / Vol. 7, No. 1 / pp.41-49 / 2000
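Since the mid-1990s, with the spread of diverse and powerful document editors, universities in Korea and abroad have made submission of theses in electronic form mandatory alongside the printed copy. However, the vast number of submitted electronic theses were written with various document editors such as Hangul, MS-Word, and LaTeX, and because document formats have not been standardized, they are not being used effectively. In this paper, we convert theses that exist in various formats into a single unified intermediate format, build a full-text database from the converted theses, and implement a full-text thesis retrieval system for efficiently searching and serving them over the Internet.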

멀티미디어 데이터의 활용 (Utilization of Multimedia Data)

황희정
- 디지털콘텐츠 / No. 11 (cumulative No. 42) / pp.64-68 / 1996
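Up to last month, we looked at the various HTML tags needed to build a home page and several points about using them. This month, we move away from dry HTML tags and cover ways of using more entertaining and more useful multimedia data on a home page. Next month, we plan to introduce an easy way to turn data created in Word into a home page, without typing HTML tags, using Microsoft Internet Assistant For Word.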

Sub-word Based Offline Handwritten Farsi Word Recognition Using Recurrent Neural Network

Ghadikolaie, Mohammad Fazel Younessy;Kabir, Ehsanolah;Razzazi, Farbod
- ETRI Journal / Vol. 38, No. 4 / pp.703-713 / 2016
In this paper, we present a segmentation-based method for offline Farsi handwritten word recognition. Although most segmentation-based systems suffer from segmentation errors within the first stages of recognition, using the inherent features of the Farsi writing script, we have segmented the words into sub-words. Instead of using a single complex classifier with many (N) output classes, we have created N simple recurrent neural network classifiers, each having only true/false outputs with the ability to recognize sub-words. Through the extraction of the number of sub-words in each word, and labeling the position of each sub-word (beginning/middle/end), many of the sub-word classifiers can be pruned, and a few remaining sub-word classifiers can be evaluated during the sub-word recognition stage. The candidate sub-words are then joined together and the closest word from the lexicon is chosen. The proposed method was evaluated using the Iranshahr database, which consists of 17,000 samples of Iranian handwritten city names. The results show the high recognition accuracy of the proposed method.
https://doi.org/10.4218/etrij.16.0115.0542
