Search | Korea Science

A study on the improvement of speech recognition for similar place names (유사지명 인식시의 성능 개선 연구)

백승권;양희식;한민수
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2000.11b / pp.49-53 / 2000
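In this study, we improved the recognition rate for 49 place names where tollgates on the Gyeongbu and Honam lines are located, for use in the traffic information retrieval service of a DAB (Digital Audio Broadcasting) system. An analysis of the place-name vocabulary showed that 81.6% of the names were two syllables long and that names sharing a syllable with another name made up 61% of the vocabulary. To improve the recognition rate, the target vocabulary was regrouped into three sets, and a weighting window was applied to the candidate words according to the position of the syllable that is key to successful recognition. As a result, speaker-independent recognition tests showed recognition rate improvements of 7.2% for male speakers and 5.1% for female speakers.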

Korean plain plosive produced by Chinese female speakers: Sentence vs. Paragraph (중국인 여성 화자의 한국어 평음 파열음 발음: 독립 문장과 문단의 비교)

Jiang, Pan;Kim, Ji-Eun;Lee, Choong-Woo
- Phonetics and Speech Sciences / v.7 no.2 / pp.111-117 / 2015
The purpose of this study is to investigate whether Chinese learners of Korean produce Korean plain plosives differently in a reading passage and in isolated sentences. There are several studies on Korean plosives produced by Chinese speakers, but studies comparing production in a reading passage with production in isolated sentences are rare. For this purpose, the VOT values of Korean plain plosives produced by ten Chinese speakers were measured using Speech Analyzer. The results show that there is no significant difference between plain plosive production in a reading passage and in isolated sentences. In further studies, pitch needs to be measured along with VOT.
https://doi.org/10.13064/KSSS.2015.7.2.111 KSCI

Development of a Lipsync Algorithm Based on Audio-visual Corpus (시청각 코퍼스 기반의 립싱크 알고리듬 개발)

김진영;하영민;이화숙
- The Journal of the Acoustical Society of Korea / v.20 no.3 / pp.63-69 / 2001
This paper proposes a corpus-based lip sync algorithm for synthesizing natural face animation. To obtain the lip parameters, marks were attached to the speaker's face and their positions were extracted with image processing methods. The spoken utterances were labeled with HTK, and prosodic information (duration, pitch, and intensity) was analyzed. An audio-visual corpus was constructed by combining the speech and image information. The basic unit used in our approach is the syllable. Based on this audio-visual corpus, lip information represented by the marks' positions was synthesized. That is, the best syllable units are selected from the audio-visual corpus, and the visual information of the selected units is concatenated. Two processes are used to obtain the best units. One selects the N-best candidates for each syllable. The other selects the smoothest unit sequence, which is done with the Viterbi decoding algorithm. For these processes, two distance measures between syllable units are proposed: a phonetic environment distance measure and a prosody distance measure. Computer simulation results showed that the proposed algorithm performed well. In particular, pitch and intensity information proved to be as important as duration information in lip sync.
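The two-stage selection described in this abstract (N-best candidates per syllable, then a Viterbi search for the smoothest sequence) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `select_units`, `target_cost`, and `join_cost` are hypothetical names, and the two cost functions merely stand in for the phonetic environment and prosody distance measures, which the abstract does not define.

```python
from typing import Callable, List, Sequence, TypeVar

U = TypeVar("U")  # a candidate unit (hypothetical; e.g. a corpus syllable record)


def select_units(
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[int, U], float],
    join_cost: Callable[[U, U], float],
) -> List[U]:
    """Pick one candidate per syllable position so that the summed
    target and join costs are minimal (Viterbi dynamic programming).

    candidates[t] is the N-best list for syllable position t.
    target_cost and join_cost are placeholder distance functions.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every syllable position needs at least one candidate")

    # best[i] = minimal cost of any path ending in candidates[t][i]
    best = [target_cost(0, u) for u in candidates[0]]
    backpointers: List[List[int]] = []

    for t in range(1, len(candidates)):
        new_best: List[float] = []
        pointers: List[int] = []
        for unit in candidates[t]:
            # cost of reaching this unit from each predecessor
            costs = [
                best[i] + join_cost(prev, unit)
                for i, prev in enumerate(candidates[t - 1])
            ]
            i_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[i_min] + target_cost(t, unit))
            pointers.append(i_min)
        best = new_best
        backpointers.append(pointers)

    # trace the cheapest path back from the final position
    idx = min(range(len(best)), key=best.__getitem__)
    path = [idx]
    for pointers in reversed(backpointers):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)]
```

With a zero target cost and an absolute-difference join cost on toy numeric "units", the search picks the sequence whose neighbours differ least, which mirrors the smoothness criterion the abstract mentions.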

Improvement of Speech Recognition System Using the Trained Model of Speech Feature (음성특성 학습 모델을 이용한 음성인식 시스템의 성능 향상)

송점동
- The Journal of Information Technology / v.3 no.4 / pp.1-12 / 2000
Speech can be divided into high-frequency and low-frequency speech according to its characteristics. So far, however, recognizers have been built without regard to this property, which leads to relatively low recognition rates and a need for large amounts of data in speech recognition research. In this paper, we propose a method that classifies a speaker's speech by this property using formant frequencies, and a method that recognizes speech after constructing a recognizer model reflecting the high- and low-frequency characteristics of the speaker's speech. For the experiment, we constructed the recognizer model using 47 Korean monophones and trained it using the speech of 20 women and 20 men, respectively. We classified the speech using a formant frequency table, consisting of formant frequencies and pitch values, and then performed recognition using the trained model that matched the speech characteristics. As a result, the proposed system outperformed the existing method in recognition rate.

A Phenomenological Study on the Meaning of Economic Life of Marriage Immigrant Women (결혼이주여성의 경제생활 의미에 관한 현상학적 연구)

Lee, Hyoung-Ha
- Journal of the Korea Society of Computer and Information / v.18 no.12 / pp.149-157 / 2013
The purpose of this study is to hear the vivid stories of marriage immigrant women's economic lives using a phenomenological approach, one of the qualitative research methods, and to analyze the meaning of the dynamics of their experiences through in-depth interviews. The research question is "What is the meaning of economic life that marriage immigrant women experience?" From the research, 67 meaningful statements were extracted and 15 core meanings were organized. The 15 core meanings were grouped into 5 theme categories: 'Tough Life', 'Unstable Income such as Children Education Expense and Insurance Premium', 'Search for Changes in Life Style for Adaptation', 'Pursuit of Economic Stability through Employment', and 'Expectation of Supports and Return to Married Woman's Parents' Home.' The researcher wrote a structural description in the first person in order to apply hermeneutical writing. In other words, the meaning of economic life for marriage immigrant women in Korea is 'a difficult process of coping with a family-oriented culture while pursuing changes in life style to adapt to a difficult reality.' Various practical implications were proposed on the basis of these statements, such as a policy to expand opportunities to receive an old-age pension by applying a 'Joint Scheme for Couples' (tentative name) to the national pension, for the stable economic life of marriage immigrant women in old age.
https://doi.org/10.9708/jksci.2013.18.12.149 KSCI

The Error Pattern Analysis of the HMM-Based Automatic Phoneme Segmentation (HMM기반 자동음소분할기의 음소분할 오류 유형 분석)

Kim Min-Je;Lee Jung-Chul;Kim Jong-Jin
- The Journal of the Acoustical Society of Korea / v.25 no.5 / pp.213-221 / 2006
Phone segmentation of the speech waveform is especially important for concatenative text-to-speech synthesis, which uses segmented corpora to construct synthesis units, because the quality of synthesized speech depends critically on the accuracy of the segmentation. Initially, phone segmentation was performed manually, but this requires huge effort and causes long delays. HMM-based approaches adopted from automatic speech recognition are most widely used for automatic segmentation in speech synthesis, providing a consistent and accurate phone labeling scheme. Even though the HMM-based approach has been successful, it may locate a phone boundary at a different position than expected. In this paper, we categorized adjacent phoneme pairs and analyzed the mismatches between hand-labeled transcriptions and HMM-based labels. We then described the dominant error patterns that must be improved for speech synthesis. For the experiment, a hand-labeled standard Korean speech DB from ETRI was used as the reference DB. A time difference larger than 20 ms between a hand-labeled phoneme boundary and the auto-aligned boundary is treated as an automatic segmentation error. Our experimental results for the female speaker revealed that plosive-vowel, affricate-vowel, and vowel-liquid pairs showed high accuracies of 99%, 99.5%, and 99%, respectively, whereas stop-nasal, stop-liquid, and nasal-liquid pairs showed very low accuracies of 45%, 50%, and 55%. The results for the male speaker revealed a similar tendency.
https://doi.org/10.7776/ASK.2006.25.5.213 KSCI
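The 20 ms error criterion stated in this abstract can be illustrated with a short sketch. The function name `boundary_accuracy` and the example boundary times are hypothetical, and the sketch assumes the manual and automatic boundaries are already paired one to one, which the abstract does not spell out.

```python
from typing import Sequence


def boundary_accuracy(
    manual_ms: Sequence[float],
    auto_ms: Sequence[float],
    tolerance_ms: float = 20.0,
) -> float:
    """Fraction of auto-aligned boundaries within tolerance of the
    hand-labeled ones. A difference larger than tolerance_ms counts
    as a segmentation error, following the criterion in the abstract.
    """
    if len(manual_ms) != len(auto_ms):
        raise ValueError("boundary lists must be paired one to one")
    if not manual_ms:
        raise ValueError("no boundaries to compare")
    correct = sum(
        1 for m, a in zip(manual_ms, auto_ms) if abs(m - a) <= tolerance_ms
    )
    return correct / len(manual_ms)


# Hypothetical boundary times in milliseconds for one phoneme-pair class.
manual = [100, 250, 400, 610]
auto = [105, 275, 400, 630]
accuracy = boundary_accuracy(manual, auto)  # differences: 5, 25, 0, 20 ms
```

Grouping boundaries by the phoneme pair on either side before calling such a function would give per-pair accuracies comparable in form to those reported above.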

Characteristics of Vowel Formants, Voice Intensity, and Fundamental Frequency of Female with Amyotrophic Lateral Sclerosis using Spectrograms (스펙트로그램을 이용한 근위축성측삭경화증 여성 화자의 모음 포먼트, 음성강도, 기본주파수의 변화)

Byeon, Haewon
- Journal of the Korea Convergence Society / v.10 no.9 / pp.193-198 / 2019
This study analyzed changes in the vowel formants, voice intensity, and fundamental frequency of women diagnosed with amyotrophic lateral sclerosis (ALS) over 11 months using acoustochemical spectrogram analysis. The test items were the vowels /a, i, u/ and the diphthongs in /h + ja + da/, /h + wi + da/, and /h +ɰi+ da/. Speech data were collected through a word reading task presented on a monitor using the 'Alvin' program, and the recording environment was set to 5,500 Hz for the Nyquist frequency and 11,000 Hz for the sampling rate. The recordings were analyzed using spectrograms to obtain vowel formants, voice intensity, and fundamental frequency. The analysis showed that fundamental frequency and intensity decreased as ALS progressed, and that the formant slope of the diphthongs decreased, rather than the formants of the vowels changing. This result suggests that the vowel distortion that accompanies disease progression in ALS is due to the decrease of tongue and jaw co morbidity.
https://doi.org/10.15207/JKCS.2019.10.9.193 KSCI
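The recording settings quoted in this abstract are consistent with the standard relation that the Nyquist frequency is half the sampling rate. A tiny check, with the hypothetical helper name `nyquist_frequency`:

```python
def nyquist_frequency(sampling_rate_hz: float) -> float:
    """Return the highest frequency representable at a given sampling rate."""
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return sampling_rate_hz / 2.0


# The abstract reports an 11,000 Hz sampling rate and a 5,500 Hz Nyquist frequency.
reported_nyquist = nyquist_frequency(11_000)
```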

A Study on Giving Verbs 'kureru' and 'kudasaru': by Analyzing Dialogues of Female Speakers in Novels of the Edo Period, Meiji Period and the Taisho Period- (수수동사 'くれる·くださる'에 관한 고찰 - 에도기부터 다이쇼기의 작품속의 여성화자의 사용례를 중심으로-)

Yang, JungSoon
- Cross-Cultural Studies / v.31 / pp.371-394 / 2013
This study aims to identify the word forms and usages of 'Kureru' and 'Kudasaru' according to personal relationships by analyzing dialogues of female speakers. Novels of the Meiji Period, when there were attempts at a language revolution, were mainly used for this study, along with novels of the Edo Period and the Taisho Period. First, the number of examples according to gender differences in the novels was as follows. In the case of 'Kureru', female speakers showed a high usage rate in the novels of the Edo Period. 'Kureru' was mostly connected with female language such as 'Naharu', 'Namasu', and 'Nansu'. These expressions were not used in the novels of the Meiji Period and the Taisho Period. Although 'Okureru' and 'Okurenasaru' were used in the novels of the Meiji Period, the number of examples of 'Kureru' by female speakers decreased in the novels of the Meiji Period and the Taisho Period. 'Kudasaru' was predominantly used by female speakers. In particular, female speakers used it to clearly show vertical relationships in the novels of the Edo Period and in "Doseishoseikatagi" of the Meiji 10s. After "Ukigumo", the usage rate of female speakers decreased while that of male speakers increased. Gender differences gradually became smaller. The range of female speakers in the novels expanded from geisha and relatives such as wives, sisters, mothers, and children to young women, teachers, and students. Aspects of the usage of benefactive verbs can be summarized as follows. Female speakers in the licensed quarters used clearer and more typical expressions according to vertical relationships and gender differences in the novels of the Edo Period than in the novels of the Meiji Period and the Taisho Period. In the novels of the Meiji Period, female speakers in a sophisticated social group used benefactive verbs to show strong respect and concern for the other person.
In the novels of the Taisho Period, female speakers used benefactive verbs to show respect and concern for the other person according to their areas of outside activity. In the novels of the Meiji Period, female speakers used 'Okureru' when the other person was younger than them and was socially and psychologically close to them. 'O~Nasaru', one of the respect expressions, was also used by female speakers. Female speakers used it toward older people in the Edo Period, but they also used it toward younger people in the Meiji Period. No examples were found in the novels of the Taisho Period. The usages of 'Kureru' and 'Kudasaru' according to vertical relationships were as follows. If the giver was an older person, 'Kureru' with the respect expressions 'Nasaru', 'Nansu', or 'Namasu' was used more than 'Kudasaru' in the novels of the Edo Period. However, many examples of 'Kudasaru' appeared in the novels of the Meiji Period and the Taisho Period. In the novels of the Meiji Period, 'Okureru' and 'Okurenasaru', which are expressions belonging to 'Kureru', appeared. Female speakers used them toward older people who were socially and psychologically close to them, such as family. There were not many examples in which the giver and the receiver were around the same age. However, 'Kureru' and 'Okureru' were used in the younger group and 'Kudasaru' was used in the older group in the novels of the Meiji Period. If the giver was a younger person, 'Kureru' was mainly used in the novels of the Edo Period and in "Doseishoseikatagi" of the Meiji 10s. However, 'Kudasaru' was used many times in the novels of the latter Meiji Period and the Taisho Period.

A Study on the Aspects of the Relationships and Hardships on a 'Sijipsali' Narratives in Korean Women's Married Life (여성 화자의 시집살이담에 나타난 관계와 고난의 양상)

Kim, Kyung-Seop;Kim, Jeong-Lae
- The Journal of the Convergence on Culture Technology / v.6 no.2 / pp.409-417 / 2020
Oral performance in itself, when it successfully narrates one's life, constitutes a kind of verbal art. The term 'Sijipsali-Narrative' refers to oral narratives portraying a series of events in the course of a woman's life story that arise from family life and socio-cultural issues through marriage. Sijipsali-Narrative therefore belongs to a subcategory of Women's Life-Story. Sijipsali-Narrative can be divided into two categories as follows. One type is the 'Family-Connection Sijipsali-narrative,' which results from the relationship between a daughter-in-law and the other members of the family. Among the 'Family-Connection Sijipsali-narratives,' which include several forms of Sijipsali such as that of the father-in-law, the husband, and the children, Sijipsali of the mother-in-law is the most distinctive. The other type is the 'Sociocultural-Connection Sijipsali-narrative', which comes not from human relationships but from general issues the narrator suffers from as a daughter-in-law in a family. The most universal narratives come from Sijipsali connected with poverty and historical events, and family history, appearance, the attitude of the daughter-in-law, and so on can also be material for the narratives. In practice, the two types of Sijipsali narrative are not so much distinct from each other as intermingled. Sijipsali arising from family relationships is inevitably related to poverty and other events, which result in conflicts among family members and so harass daughters-in-law. This paper aims to give an overview of the aspects of relationships and hardships in 'Sijipsali' narratives of Korean women's married life.
https://doi.org/10.17703/JCCT.2020.6.2.409 KSCI

Perceptual cues for /o/ and /u/ in Seoul Korean (서울말 /ㅗ/와 /ㅜ/의 지각특성)

Byun, Hi-Gyung
- Phonetics and Speech Sciences / v.12 no.3 / pp.1-14 / 2020
Previous studies have confirmed that /o/ and /u/ in Seoul Korean are undergoing a merger in the F1/F2 space, especially for female speakers. It has been reported that female speakers use phonation (H1-H2) differences as a substitute for formants to distinguish /o/ from /u/. This study aimed to explore whether H1-H2 values are being used as perceptual cues for /o/-/u/. A perception test was conducted with 35 college students using /o/ and /u/ spoken by 41 females, which overlap considerably in the vowel space. An acoustic analysis of 182 stimuli was also conducted to see if there is any correspondence between production and perception. The identification rate was 89% on average, 86% for /o/, and 91% for /u/. The results confirmed that, in production, when /o/ and /u/ cannot be distinguished in the F1/F2 space because they are too close, H1-H2 differences contribute significantly to the separation of the two vowels. In perception, however, this was not the case. H1-H2 values were not significantly involved in the identification process, and the formants (especially F2) were still the dominant cues. The study also showed that even though H1-H2 differences are apparent in females' production, males do not use H1-H2 in their production, and neither females nor males use H1-H2 in their perception. It is presumed that H1-H2 has not yet developed as a perceptual cue for /o/ and /u/.
https://doi.org/10.13064/KSSS.2020.12.3.001 KSCI

