Search | Korea Science

Introduction of ETRI Broadcast News Speech Recognition System (ETRI 방송뉴스음성인식시스템 소개)

Park Jun
- Proceedings of the KSPS conference / 2006.05a / pp.89-93 / 2006
This paper presents the ETRI broadcast news speech recognition system. There are two major issues in broadcast news speech recognition: 1) real-time processing and 2) out-of-vocabulary handling. For real-time processing, we devised a dual decoder architecture. The input speech signal is segmented at the long pauses between utterances, and the two decoders process the speech segments alternately. One decoder can start to recognize the current speech segment without waiting for the other decoder to finish recognizing the previous segment. Thus, processing delay does not accumulate. For out-of-vocabulary handling, we updated both the vocabulary and the language model based on recent news articles on the internet. By updating the language model as well as the vocabulary, we can improve performance by up to 17.2% ERR.
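The effect of alternating two decoders can be illustrated with a small timing simulation. This is only a sketch under stated assumptions, not the ETRI implementation: the function name `finish_delays`, the fixed 4-second segments, and the decoder speed of 1.5 times real time are all hypothetical, and each decoder is assumed to start on a segment only once that segment has fully arrived.

```python
def finish_delays(durations, rtf, n_decoders):
    """Return, per segment, the lag between the segment ending and its result.

    durations:  segment lengths in seconds (hypothetical values)
    rtf:        assumed decoding time per second of speech
    n_decoders: 1 for a single decoder, 2 for the dual arrangement
    """
    free_at = [0.0] * n_decoders  # time at which each decoder becomes idle
    t = 0.0                       # running clock of the incoming audio
    delays = []
    for i, d in enumerate(durations):
        t += d                    # segment i is fully available at time t
        k = i % n_decoders        # hand segments to the decoders in turn
        start = max(t, free_at[k])
        free_at[k] = start + d * rtf
        delays.append(free_at[k] - t)
    return delays


segments = [4.0] * 6
single = finish_delays(segments, rtf=1.5, n_decoders=1)
dual = finish_delays(segments, rtf=1.5, n_decoders=2)
print("single decoder:", single)
print("dual decoder:  ", dual)
```

Under these assumptions the single decoder falls further behind with every segment, while the two alternating decoders keep the lag constant, which is the behaviour the abstract describes as delay not accumulating.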

Effects of phonological awareness and phonological processing on language skills in 4- to 6-year old children with and without language delay (4~6세 일반아동 및 언어발달지연 아동의 음운인식 및 음운처리 능력이 언어 능력에 미치는 영향)

Kim, Shinyoung;Son, Jinkyeong;Yim, Dongsun
- Phonetics and Speech Sciences / v.12 no.1 / pp.51-63 / 2020
Phonological awareness is a metalinguistic awareness of phonology and is known to predict language skills such as reading and vocabulary. The purpose of this study was to investigate the relationship between phonological awareness, phonological processing, and language skills in 4- to 6-year-old typically developing (TD) children and children with language delay (LD). A total of 32 children (TD=18, LD=15) participated in this study. They performed a phonological awareness task consisting of counting, deletion, and discrimination at the syllable level. Nonword Repetition, Digit Backward, the Receptive & Expressive Vocabulary Test, and the Grammaticality Judgment Task were administered to analyze the correlation between phonological awareness, phonological processing, and language ability. A multiple stepwise regression analysis was performed to identify the phonological awareness subtasks that predict language ability. In the TD group, the syllable categorization task significantly predicted receptive vocabulary and performance on the Grammaticality Judgment Task. In the LD group, the syllable counting task significantly predicted receptive vocabulary, expressive vocabulary, and performance on the Grammaticality Judgment Task. Phonological awareness performance differed significantly between the two groups. Further, correlation analysis and regression analysis showed different results for each group. Phonological awareness performance significantly predicted the language ability of each group, suggesting the importance of the metalinguistic awareness of phonology.
https://doi.org/10.13064/KSSS.2020.12.1.051

An Energy-Efficient Matching Accelerator Using Matching Prediction for Mobile Object Recognition

Choi, Seongrim;Lee, Hwanyong;Nam, Byeong-Gyu
- JSTS: Journal of Semiconductor Technology and Science / v.16 no.2 / pp.251-254 / 2016
An energy-efficient object matching accelerator based on a matching prediction scheme is proposed for mobile object recognition. Conventionally, a vocabulary tree has been used to save external memory bandwidth in the object matching process, but it involves massive internal memory transactions to examine each object in a database. In this paper, a novel object matching accelerator is proposed that uses matching predictions to reduce unnecessary internal memory transactions by mitigating non-target object examinations, thereby improving energy efficiency. Experimental results show a 26% reduction in power-delay product compared to the prior art.
https://doi.org/10.5573/JSTS.2016.16.2.251

A Study the effect of Cooking Activity as a Language Intervention on the Language Development of Language Delayed Infants. (요리활동을 통한 언어중재가 언어발달지연을 보이는 유아의 언어능력 향상에 대한 연구)

Seo, Eui-Jung;Kim, Yun-Hee
- Journal of the Korea Academia-Industrial cooperation Society / v.17 no.10 / pp.109-118 / 2016
Language intervention through cooking activity programs is designed to provide an efficient teaching method and an improved educational environment in the field of teaching. This study addresses the effects of such a program on the language development of three three-year-old infants (M: 2, F: 1) at a center in Seoul. A cooking topic suitable for the age of this group was selected. The language intervention was conducted for 50 minutes per week for a total of 25 sessions, and made use of cooking-related vocabulary in which verbs and nouns were evenly distributed. In this study, the Peabody Picture Vocabulary Test-Revised (PPVT-R), receptive language age (RLA) and expressive language age (ELA), and the Preschool Receptive-Expressive Language Scale (PRES) were used to analyze the collected data. After the intervention, normal developmental outcomes appeared in vocabulary, receptive language, expressive language, and integrated language abilities. There is now a solid evidence base supporting the efficacy of cooking activity in producing positive outcomes in the language development of language-delayed infants. Consequently, cooking can induce their active participation and interest and extend their language abilities through various experiences.
https://doi.org/10.5762/KAIS.2016.17.10.109

Implementation of HMM-Based Speech Recognizer Using TMS320C6711 DSP

Bae Hyojoon;Jung Sungyun;Bae Keunsung
- MALSORI / no.52 / pp.111-120 / 2004
This paper focuses on the DSP implementation of an HMM-based speech recognizer that can handle a vocabulary of several hundred words as well as speaker independence. First, we develop an HMM-based speech recognition system on the PC that operates on a frame basis, with parallel processing of feature extraction and Viterbi decoding to make the processing delay as small as possible. Many techniques, such as linear discriminant analysis, state-based Gaussian selection, and the phonetic tied mixture model, are employed to reduce the computational burden and memory size. The system is then optimized and compiled on the TMS320C6711 DSP for real-time operation. The implemented system uses 486 kbytes of memory for data and acoustic models, and 24.5 kbytes for program code. A maximum required time of 29.2 ms for processing a 32 ms frame of speech validates real-time operation of the implemented system.
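The real-time claim can be checked with simple arithmetic on the two figures quoted in the abstract. The sketch below assumes that a new frame arrives every 32 ms (that is, frames do not overlap), which the abstract does not state explicitly; the helper name `real_time_factor` is hypothetical.

```python
def real_time_factor(processing_ms, frame_ms):
    """Ratio of worst-case processing time to the audio time it covers."""
    return processing_ms / frame_ms


worst_case_ms = 29.2  # maximum processing time per frame, from the abstract
frame_ms = 32.0       # frame length, from the abstract

rtf = real_time_factor(worst_case_ms, frame_ms)
headroom_ms = frame_ms - worst_case_ms
print(f"RTF = {rtf:.4f}, headroom = {headroom_ms:.1f} ms per frame")
```

A real-time factor below 1 means that even the slowest frame is finished before the next one is due, so no backlog builds up.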

Development and effects of Nanta program using speech rhythm for children with limited speech sound production (말소리가 제한된 아동을 위한 말리듬을 이용한 난타 프로그램의 개발과 효과)

Park, Yeong Hye;Choi, Seong Hee
- Phonetics and Speech Sciences / v.13 no.2 / pp.67-76 / 2021
Nanta means "tapping" on percussion instruments such as drums, using the rhythm of Samulnori, a traditional Korean music. A Nanta speech rhythm intervention program was developed and applied to children with limited speech sound production, and its effect was investigated. The Nanta program provided audible stimulation with various levels of loudness, beats, and rhythms. The program consists of three stages: respiration, phonation, and articulation with the rhythm. Six children with language development delay participated in this study. Children were encouraged to explore sounds and beats and to express them freely. Along with the rhythm, children were also encouraged to produce speech sounds by increasing the length of syllables in mimetic and imitative words. A total of 15 sessions were conducted twice a week for 40 minutes per session. To examine effectiveness, raw scores from the preschool receptive-expressive scales (PRES) and the receptive-expressive vocabulary test (REVT) were obtained and compared before and after therapy. The Wilcoxon signed-rank test showed significantly improved receptive (p=.027) and expressive (p=.024) language scores on the PRES and receptive (p=.028) and expressive (p=.028) vocabulary scores following the intervention. These findings suggest that the Nanta rhythm program can be useful for improving language development and vocabulary in children with limited speech sound production.
https://doi.org/10.13064/KSSS.2021.13.2.067

A Study on the Diphone Recognition of Korean Connected Words and Eojeol Reconstruction (한국어 연결단어의 이음소 인식과 어절 형성에 관한 연구)

;Jeong, Hong
- The Journal of the Acoustical Society of Korea / v.14 no.4 / pp.46-63 / 1995
This thesis describes an unlimited-vocabulary connected speech recognition system using a Time Delay Neural Network (TDNN). The recognition unit is the diphone, which includes the transition section between two phonemes, and the number of diphone units is 329. Recognition of Korean connected speech consists of three parts: feature extraction from the input speech signal, diphone recognition, and post-processing. In the feature extraction stage, diphone intervals are extracted from the input speech signal, and then feature vectors of 16th-order filter-bank coefficients are calculated for each frame in the diphone interval. The diphone recognition stage has a three-stage hierarchical structure and is carried out using 30 Time Delay Neural Networks. In particular, the structure of the TDNN is modified to increase the recognition rate. In the post-processing stage, misrecognized diphone strings are corrected using the phoneme transition probability and the phoneme confusion probability, and then the eojeols (Korean words or phrases) are formed by combining the recognized diphones.
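For readers unfamiliar with the building block named here, a single time-delay layer can be sketched as follows. This is a generic illustration and not the modified TDNN structure from the thesis: the layer sizes, the random weights, the sigmoid activation, and the reading of the features as 16 values per frame are all assumptions, and `tdnn_layer` is a hypothetical name.

```python
import math
import random


def tdnn_layer(frames, weights, bias):
    """Apply one time-delay layer to a sequence of feature frames.

    frames:  list of T feature vectors, each of length F
    weights: weights[u][d][f] for output unit u, delay d, input feature f
    bias:    one bias value per output unit
    Returns T - D + 1 output vectors, where D is the number of delays.
    """
    n_delays = len(weights[0])
    outputs = []
    for t in range(len(frames) - n_delays + 1):
        row = []
        for u, unit_w in enumerate(weights):
            s = bias[u]
            # The same weights are reused at every time position t.
            for d in range(n_delays):
                s += sum(w * x for w, x in zip(unit_w[d], frames[t + d]))
            row.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid activation
        outputs.append(row)
    return outputs


random.seed(0)
T, F, D, U = 10, 16, 3, 4  # frames, features per frame, delays, output units
frames = [[random.random() for _ in range(F)] for _ in range(T)]
weights = [[[random.uniform(-0.1, 0.1) for _ in range(F)]
            for _ in range(D)] for _ in range(U)]
bias = [0.0] * U

out = tdnn_layer(frames, weights, bias)
print(len(out), len(out[0]))  # 8 4
```

Because each unit sees a short window of consecutive frames with shared weights, the layer responds to a pattern regardless of where in the interval it occurs, which is why this kind of network suits short transitional units such as diphones.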

Korean Speech Recognition Based on Syllable (음절을 기반으로한 한국어 음성인식)

Lee, Young-Ho;Jeong, Hong
- Journal of the Korean Institute of Telematics and Electronics B / v.31B no.1 / pp.11-22 / 1994
For a conventional word-based system, it is very difficult to enlarge the vocabulary. To cope with this problem, we must use more fundamental units of speech, such as syllables and phonemes. Korean speech consists of initial consonants, middle vowels, and final consonants, and has the characteristic that syllables can be obtained from speech easily. In this paper, we present a speech recognition system that takes advantage of the syllable characteristics peculiar to Korean speech. The recognition algorithm is the Time Delay Neural Network. To handle many recognition units, the system consists of separate recognition neural networks for initial consonants, middle vowels, and final consonants. The system first recognizes initial consonants, middle vowels, and final consonants, and then uses these results to recognize isolated words. In experiments, we obtained recognition rates of 85.12% on 2735 initial-consonant samples, 86.95% on 3110 middle-vowel samples, 90.58% on 1615 final-consonant samples, and 71.2% on 250 isolated-word samples.
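The three sub-syllable results can be combined into a single sample-weighted figure for comparison with the word-level rate. The pooled value computed below is derived here from the numbers in the abstract and is not reported by the authors; the variable names are hypothetical.

```python
# (recognition rate in %, number of test samples), taken from the abstract
results = {
    "initial consonants": (85.12, 2735),
    "middle vowels": (86.95, 3110),
    "final consonants": (90.58, 1615),
}

total = sum(n for _, n in results.values())
correct = sum(rate / 100.0 * n for rate, n in results.values())
pooled = 100.0 * correct / total

print(f"{total} samples, pooled recognition rate = {pooled:.2f}%")
```

The pooled sub-syllable rate sits well above the 71.2% reported for isolated words, which is consistent with word recognition depending on several unit decisions at once.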

Implementation of HMM-Based Speech Recognizer Using TMS320C6711 DSP

Bae Hyojoon;Jung Sungyun;Son Jongmok;Kwon Hongseok;Kim Siho;Bae Keunsung
- Proceedings of the IEEK Conference / summer / pp.391-394 / 2004
This paper focuses on the DSP implementation of an HMM-based speech recognizer that can handle a vocabulary of several hundred words as well as speaker independence. First, we develop an HMM-based speech recognition system on the PC that operates on a frame basis, with parallel processing of feature extraction and Viterbi decoding to make the processing delay as small as possible. Many techniques, such as linear discriminant analysis, state-based Gaussian selection, and the phonetic tied mixture model, are employed to reduce the computational burden and memory size. The system is then optimized and compiled on the TMS320C6711 DSP for real-time operation. The implemented system uses 486 kbytes of memory for data and acoustic models, and 24.5 kbytes for program code. A maximum required time of 29.2 ms for processing a 32 ms frame of speech validates real-time operation of the implemented system.

