Search | Korea Science

Automatic segmentation for continuous spoken Korean language recognition based on phonemic TDNN (음소단위 TDNN에 기반한 한국어 연속 음성 인식을 위한 데이타 자동분할)

Baac, Coo-Phong;Lee, Geun-Bae;Lee, Jong-Hyeok
- Annual Conference on Human and Language Technology / 1995.10a / pp.30-34 / 1995
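In continuous speech recognition using neural networks, training has predominantly relied on manually segmented speech data. Preparing segmented speech data, however, demands considerable time, effort, and skill, and is itself one of the factors that make it difficult to change or extend the recognition domain. Consequently, neural network training methods are emerging that avoid segmented speech data as far as possible without degrading performance. This paper introduces an iterative process in which a trained recognizer automatically segments Korean speech data and the segmented data is then used to retrain the recognizer. A TDNN serves as the recognizer, and the recognition unit is the phoneme. Training is controlled using cross-validation.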
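
The segment-then-retrain loop described in this abstract can be illustrated with a small toy, offered only as a sketch under stated assumptions. The abstract gives no implementation details, so everything here is hypothetical: a per-phoneme mean over one-dimensional frames stands in for the TDNN, a dynamic-programming forced alignment against a known phoneme sequence stands in for the automatic segmentation, and a held-out alignment cost stands in for the cross-validation control.

```python
import random

def align(frames, phones, means):
    """Forced alignment: split frames into len(phones) contiguous segments,
    minimizing squared distance to each phone's mean. Requires len(frames) >= len(phones)."""
    n, k = len(frames), len(phones)
    inf = float("inf")
    cost = lambda t, j: (frames[t] - means[phones[j]]) ** 2
    best = [[inf] * k for _ in range(n)]   # best[t][j]: min cost with frame t in segment j
    back = [[0] * k for _ in range(n)]
    best[0][0] = cost(0, 0)
    for t in range(1, n):
        for j in range(min(t + 1, k)):
            stay = best[t - 1][j]
            move = best[t - 1][j - 1] if j > 0 else inf
            if move < stay:
                best[t][j], back[t][j] = move + cost(t, j), j - 1
            else:
                best[t][j], back[t][j] = stay + cost(t, j), j
    labels, j = [None] * n, k - 1
    for t in range(n - 1, -1, -1):         # trace the best path backwards
        labels[t] = phones[j]
        j = back[t][j]
    return labels, best[n - 1][k - 1]

def retrain(utts, means):
    """Segment every utterance with the current model, then re-estimate the model."""
    sums = {p: [0.0, 0] for p in means}
    for frames, phones in utts:
        labels, _ = align(frames, phones, means)
        for x, p in zip(frames, labels):
            sums[p][0] += x
            sums[p][1] += 1
    return {p: (s / c if c else means[p]) for p, (s, c) in sums.items()}

def val_cost(utts, means):
    """Held-out alignment cost (lower is better)."""
    return sum(align(f, p, means)[1] for f, p in utts)

def make_utt(rng, phones, true_means):
    """Synthetic utterance: each phone lasts 3 to 6 noisy frames."""
    frames = []
    for p in phones:
        frames += [rng.gauss(true_means[p], 0.5) for _ in range(rng.randint(3, 6))]
    return frames, phones

def run(max_iters=20, seed=0):
    rng = random.Random(seed)
    true_means = {"a": 0.0, "b": 3.0, "c": 6.0}          # hypothetical toy phonemes
    seqs = [["a", "b", "c"], ["c", "a", "b"], ["b", "c", "a"]]
    train = [make_utt(rng, seqs[i % 3], true_means) for i in range(30)]
    held = [make_utt(rng, seqs[i % 3], true_means) for i in range(9)]
    means = {"a": 1.0, "b": 2.0, "c": 4.0}               # rough initial recognizer
    best = val_cost(held, means)
    history = [best]
    for _ in range(max_iters):
        candidate = retrain(train, means)
        c = val_cost(held, candidate)
        if c >= best:                                    # stop when held-out cost stalls
            break
        means, best = candidate, c
        history.append(c)
    return means, history

if __name__ == "__main__":
    means, history = run()
    print(means, history)
```

The loop stops as soon as the held-out cost no longer improves, which is one simple way to let validation data control training; the paper's actual stopping criterion is not stated in the abstract.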

A Study of Voice Service Architecture Using MPLS Technology Based on ATM (ATM기반 MPLS 기술을 이용한 음성서비스 제공 구조 연구)

Yoon, Hyeon-Sik;Yang, Sun-Hee
- Proceedings of the Korea Information Processing Society Conference / 2002.11b / pp.1301-1304 / 2002
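As the telecommunications environment changes, network architecture, previously divided by service into voice networks and packet networks, is evolving toward a structure that provides all services over a single converged network. VoIP (Voice over IP) has continued to attract attention as the technology that enables such services. Despite considerable effort, however, practical deployment of VoIP has been delayed by the problem of guaranteeing the strict quality requirements of real-time services such as voice. This paper therefore proposes a network architecture that guarantees voice service quality by building the packet transport network of the converged network as an MPLS network based on the ACE2000 MPLS system. It also presents a call setup procedure that uses a TE Server to establish the ER-LSP (Explicit Routed Label Switched Path) that carries voice calls.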

An Implementation of Speech DB Gathering System Using VoiceXML (VoiceXML을 이용한 음성 DB 수집 시스템 구현)

Kim Dong-Hyun;Roh Yong-Wan;Hong Kwang-Seok
- Journal of Internet Computing and Services / v.6 no.1 / pp.39-50 / 2005
A speech DB is a basic requirement for research in phonetics, speech recognition, speech synthesis, and related fields. The quantity and quality of the speech DB determine the performance of the system being developed, so the speech DB is an extremely important factor. With the recent development of various telephone service technologies such as voice portals, the need to collect telephone speech DBs has grown. Existing IVR-based telephone speech DB collection systems were built in C/C++ or with proprietary development tools. As a result, resources are difficult to reuse across application services, and development requires a great deal of labor and time. VoiceXML, by contrast, is a tag-based language predicated on XML, with a simple and easy grammar. It can be written with little effort, which reduces labor and time. VoiceXML also makes it easy to gather varied telephone speech DBs, because the DB contents can be changed readily. In this paper, we introduce a telephone speech DB gathering system, which is the most important factor in developing speech information processing technology.

Design and Implementation of Korean Voice Web Browser (한국어 음성 웹브라우저 설계 및 구현)

Jang, Young-Gun;Jo, Kyoung-Hwan
- Journal of KIISE:Computing Practices and Letters / v.7 no.5 / pp.458-466 / 2001
This paper presents the design and implementation of a Korean voice web browser that uses voice technologies to control the browser, select content in a web document, and convert that content to speech after HTML analysis. The main feature of this browser is a universal design that considers both sighted and visually impaired users and allows a multi-modal interface. As a voice interface for visually impaired users, it supports a tree structure that makes the structure of a web document easy to recognize through voice guidance alone, regardless of frame usage. It can handle all elements described as tags in the web document and identifies them with different predefined voice properties according to element type. This method eliminates the need for additional guidance speech describing element properties, without requiring an audio style sheet or additional programming effort.
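
The idea of parsing HTML into a tree and attaching a predefined voice property to each element type can be sketched as follows. This is only an illustration, not the authors' browser: the `VOICE_BY_TAG` table, the voice attributes, and the `speech_plan` function are all hypothetical names and values, and Python's standard `html.parser` module stands in for whatever HTML analysis the paper used.

```python
from html.parser import HTMLParser

# Hypothetical mapping from element type to voice properties.
VOICE_BY_TAG = {
    "h1": {"pitch": "high", "rate": "slow"},
    "a": {"pitch": "high", "rate": "normal"},
    "p": {"pitch": "medium", "rate": "normal"},
}
DEFAULT_VOICE = {"pitch": "medium", "rate": "normal"}
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # elements with no end tag

class TreeBuilder(HTMLParser):
    """Builds a nested dict tree: {"tag", "text", "children"}."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "document", "text": "", "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "text": "", "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1]["text"] += data.strip()

def speech_plan(node, depth=0):
    """Depth-first list of (depth, tag, voice, text) for elements that carry text."""
    plan = []
    for child in node["children"]:
        voice = VOICE_BY_TAG.get(child["tag"], DEFAULT_VOICE)
        if child["text"]:
            plan.append((depth, child["tag"], voice, child["text"]))
        plan += speech_plan(child, depth + 1)
    return plan

def plan_for(html):
    builder = TreeBuilder()
    builder.feed(html)
    return speech_plan(builder.root)

if __name__ == "__main__":
    page = "<html><body><h1>News</h1><p>Hello <a href='x'>link</a></p></body></html>"
    for depth, tag, voice, text in plan_for(page):
        print("  " * depth + f"[{tag} {voice['pitch']}/{voice['rate']}] {text}")
```

Because the element type selects the voice, a listener can tell a heading from a link by sound alone, with no spoken announcement such as "link:" inserted before the text.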

Comparison of Male/Female Speech Features and Improvement of Recognition Performance by Gender-Specific Speech Recognition (남성과 여성의 음성 특징 비교 및 성별 음성인식에 의한 인식 성능의 향상)

Lee, Chang-Young
- The Journal of the Korea institute of electronic communication sciences / v.5 no.6 / pp.568-574 / 2010
In an effort to improve the speech recognition rate, we compared the performance of speaker-independent and gender-specific speech recognition. For this purpose, 20 male and 20 female speakers each pronounced 300 isolated Korean words, and the utterances were divided into 4 groups: female, male, and two mixed-gender groups. To examine the validity of gender-specific speech recognition, we examined Fourier spectra and MFCC feature vectors averaged separately over male and female speakers. The results showed a distinction between the two genders, which supports the motivation for gender-specific speech recognition. In the recognition experiments, the error rate for the gender-specific case was less than 50% of that for the speaker-independent case. These results suggest that hierarchical recognition, identifying gender first and then recognizing speech, might yield better performance than the current method of speech recognition.

Medical Image Processing Board and Application Software (메디칼 영상처리 보드 및 응용 Software)

지영선
- Proceedings of the KSLP Conference / 1995.11a / pp.181-184 / 1995
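In medicine, lesions have generally been located on radiographs to determine the presence of disease and reach a diagnosis, but problems inherent in unclear radiographs and problems arising during film development can cause confusion in diagnosis. With the development of computers, there have also long been attempts to input radiographs for diagnosis. Despite much effort, however, the images to be input were noisy and had quite poor contrast, and the resulting resolution problems led practitioners to avoid this approach and diagnose from the developed film itself. (abridged)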

Analysis of IETF IP Telephony Protocols (IETF IP 텔레포니 프로토콜 분석)

최선완;하은용;전경재;최경수;김환철
- Proceedings of the Korea Multimedia Society Conference / 2000.04a / pp.397-400 / 2000
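IP telephony, or VoIP (Voice over IP), which provides voice service over the Internet, is mostly delivered on the basis of ITU-T H.323. However, because H.323 has a complex structure, understanding it takes considerable effort and development takes a long time. In particular, considerable effort is needed to achieve interoperability between products developed to the standard. The IETF is standardizing IP telephony protocols that overcome these problems and work well in the Internet environment, and this paper analyzes those protocols.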

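A Study on the Effect of the Service Value of Senior-Friendly AI Voice O2O Services on Attitude and Intention to Use (고령친화 AI음성 O2O 서비스의 서비스가치가 태도와 이용의도에 미치는 영향에 관한 연구)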

Lee, Myeong-Suk;Go, In-Gon
- Korean Society of Business Venturing: Conference Proceedings (한국벤처창업학회:학술대회논문집) / 2021.11a / pp.125-128 / 2021
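Korea is projected to become a super-aged society in 2025, with people aged 65 and over exceeding 20% of the total population, which calls for a supply of senior-friendly products and services suited to the level of aging. In particular, services with interfaces that are easy for senior consumers to use are needed. Seniors are willing to pay to address their concerns about aging and show consumption patterns similar to those of younger consumers. At the same time, social problems such as health maintenance and health anxiety at each level of aging, gaps in care, and growing social isolation are intensifying in combination, so a supply of smart, senior-friendly aging services is required. Against this backdrop, in the "with COVID-19" era, efforts in AI (Artificial Intelligence), which is central to the Fourth Industrial Revolution, and in information and communication technology are becoming visible in products and services with interfaces that are easy for senior consumers to use. Products and services that build on IT and incorporate AI speech recognition functions matching seniors' needs are therefore expected to lead the growth of the senior-friendly industry. To analyze whether the service value of the "senior-friendly AI voice O2O service" affects attitude and intention to use, this study derives a definition of the service through an expert Delphi method grounded in prior theory, and then empirically investigates the causal relationships among its service value (context-based provision, immediate connectivity, location accuracy), attitude, and intention to use.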

A Study of Fundamental Frequency for Focused Word Spotting in Spoken Korean (한국어 발화음성에서 중점단어 탐색을 위한 기본주파수에 대한 연구)

Kwon, Soon-Il;Park, Ji-Hyung;Park, Neung-Soo
- The KIPS Transactions:PartB / v.15B no.6 / pp.595-602 / 2008
The focused word of each sentence helps in recognizing and understanding spoken Korean. To find a method for spotting focused words in speech signals, we analyzed the mean and variance of the fundamental frequency (F0) and the average energy extracted from the focused word and from the other words in a sentence, in experiments on speech data from 100 spoken sentences. The results showed that focused words have either a higher relative average F0 or a higher relative variance of F0 than other words. Our findings contribute to characterizing the prosody of spoken Korean and to keyword extraction based on natural language processing.
https://doi.org/10.3745/KIPSTB.2008.15-B.6.595
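
The reported finding, that a focused word tends to show a higher relative average F0 or a higher relative F0 variance than the other words, suggests a simple spotting rule. The sketch below is an assumption-laden illustration rather than the paper's method: the abstract does not define "relative", so here each word's statistic is divided by the average of the same statistic over the other words in the sentence, and the decision rule (pick the word whose larger ratio is greatest) and the sample F0 values are invented for the example.

```python
from statistics import mean, pvariance

def relative_f0_stats(words):
    """words: list of (label, [F0 values in Hz for the word's voiced frames]).

    Returns (label, relative mean, relative variance) per word, where each
    ratio compares the word with the average over the other words.
    Assumes at least two words and non-constant F0 in the other words.
    """
    word_means = [mean(f0) for _, f0 in words]
    word_vars = [pvariance(f0) for _, f0 in words]
    stats = []
    for i, (label, _) in enumerate(words):
        other_means = [m for j, m in enumerate(word_means) if j != i]
        other_vars = [v for j, v in enumerate(word_vars) if j != i]
        rel_mean = word_means[i] / mean(other_means)
        rel_var = word_vars[i] / mean(other_vars)
        stats.append((label, rel_mean, rel_var))
    return stats

def spot_focused(words):
    """Hypothetical rule: the word with the largest of its two ratios is the focus."""
    stats = relative_f0_stats(words)
    return max(stats, key=lambda s: max(s[1], s[2]))[0]

if __name__ == "__main__":
    # Invented F0 tracks (Hz) for a three-word sentence; labels are placeholders.
    sentence = [
        ("word1", [118, 120, 122, 119]),
        ("word2", [150, 165, 180, 160]),
        ("word3", [115, 117, 114, 116]),
    ]
    for label, rel_mean, rel_var in relative_f0_stats(sentence):
        print(f"{label}: relative mean {rel_mean:.2f}, relative variance {rel_var:.2f}")
    print("focused:", spot_focused(sentence))
```

A real system would also need F0 extraction and word boundaries, and the abstract mentions average energy as a further feature that this sketch leaves out.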

An Agent Based IP Transcript System in VoIP Network (VoIP망에서 Agent 기반 IP 녹취 시스템)

Lim Jae-Jin;Kim Soo-Hee;Jung In-Sang;Jung In-Hwan
- Proceedings of the Korea Information Processing Society Conference / 2006.05a / pp.1243-1246 / 2006
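With the spread of high-speed communication networks and the rapid growth of the Internet, efforts are being made to integrate voice, video, and data. VoIP (Voice over IP) is a technology that uses IP to integrate voice and data in packet form and transmit them in real time [1]. Using VoIP signaling technology in a packet network enables efficient use of network resources, voice quality close to that of the PSTN, and support for a variety of voice services linked to the Internet. As call centers also adopt VoIP, recording systems for VoIP networks are needed. A VoIP recording system automatically records and stores conversations between agents and customers, so that customer requirements can be clearly identified. It enables efficient management by providing statistics on the recorded data, and it supports qualitative improvement in customer management by providing selective recording, scheduled recording, and agent evaluation data. This paper proposes an agent-based VoIP recording system that resolves the problems of existing VoIP recording systems without significantly affecting performance.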

