Search | Korea Science

Speech Animation by Visualizing the Organs of Articulation (조음 기관의 시각화를 이용한 음성 동기화 애니메이션)

Lee, Sung-Jin;Kim, Ig-Jae;Ko, Hyeong-Seok
- Proceedings of the HCI Society of Korea Conference (한국HCI학회 학술대회논문집) / 2006.02a / pp.843-851 / 2006
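This paper presents a method for visualizing the movement of the articulatory organs (tongue, vocal cords, etc.) in order to render speech-driven facial animation realistically. To this end, we build a corpus for speech-driven facial animation, perform phoneme alignment on the corpus, and then generate the articulator movement for each phoneme. To generate articulator movement we use blend shape interpolation, a basis-model-based technique widely used in facial animation, and on top of it we build a frame/keyframe-based user interface for creating motion. Through this interface, speech therapists directly create accurate articulator motion data for each phoneme. From the acquired motion data we model a 3D basis shape of the articulators for each phoneme and, for a newly input phoneme sequence, generate synchronized 3D articulator movement. Applied to natural 3D facial animation, this yields articulator movement synchronized with the face.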

User Adjustment Post-Process Using Neural Network In Isolated Word Speech Recognition (고립단어 음성인식에서 신경망을 이용한 사용자 적응형 후처리)

Kim, Young-Jin;Kim, Eun-Ju;Kim, Myoung-Won
- Proceedings of the Korean Information Science Society Conference / 2005.11b / pp.736-738 / 2005
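Speech recognition techniques robust to noisy environments have recently been studied as interfaces for personal mobile devices such as PDAs and PMPs. Approaches studied so far perform post-processing using recognizer-independent information such as error patterns, sequential patterns, semantic information, and contextual information, or using heterogeneous information of a different nature from language, such as visual information. However, with methods that post-process using recognizer-independent information, when the recognizer's initial recognition rate drops because of ambient noise, the recognition rate after post-processing drops with it. This paper therefore proposes user-adaptive post-processing as a way to reduce the degradation of the recognizer's initial recognition rate caused by ambient noise. The data used for user-adaptive post-processing are the recognizer's output values for the user's utterances, and these output values are the similarity scores of each word computed by a speaker-independent model. Applying the results of the speaker-independent model to user-adaptive post-processing reduced recognizer errors by 58.7%.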

A Design and Development of Augmented Reality Based Video Guestbook System (증강현실 기반의 비디오 방명록 시스템 설계 및 개발)

Kim, Namkil;Park, Heechan;Park, Kyoung Shin
- Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.415-416 / 2009
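With recent advances in augmented reality technology, augmented-reality-based user interfaces for interaction in diverse environments are being actively studied. This paper proposes a video guestbook system that uses an augmented-reality-based user interface to encourage visitor participation at exhibitions and provide a variety of interactions. Unlike existing exhibition guide systems centered on text or voice services, the system presents an interaction method that combines augmented reality technology with encouragement for users to participate and leave a record. We describe the design and implementation of the augmented-reality-based video guestbook system and discuss directions for future research.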
https://doi.org/10.3745/PKIPS.y2009m11a.415

AI Kiosk with User Interface Application (사용자 인터페이스를 적용한 AI 키오스크)

Yun-Jin Park;Da-Yeon Choi;Su-Yeong Kim;Ji-Won Jang
- Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.842-843 / 2023
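The spread of "untact" (contactless) culture caused by COVID-19 has increased contactless services such as kiosk ordering. To address the information gap and accessibility problems arising from contactless services, this study introduces a personalized kiosk that combines AI technology with a user interface. The AI kiosk developed in this study improves convenience through voice ordering based on natural language processing, and provides tailored service through deep-learning-based age-group recognition and menu recommendations that take the user's allergy information into account. The developed kiosk can improve personalized service and can also close the information gap among information-vulnerable groups.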
https://doi.org/10.3745/PKIPS.y2023m11a.842

Post-Processing of Voice Recognition Using Phonologic Rules and Morphologic analysis (음절 복원 규칙과 형태소 분석을 이용한 음성인식 후처리)

Seo, Sang-Hyun;Kim, Jae-Hong;Kim, Hae-Jin;Kim, Mi-Jin;Lee, Sang-Jo
- Annual Conference on Human and Language Technology / 1997.10a / pp.495-499 / 1997
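As computer use becomes widespread, research on natural language interfaces for easy and natural communication between computers and users is actively under way. Speech recognition in particular is drawing attention as a field that can meet the needs of general computer users, for example through voice commands and dictation systems. However, recognition alone is limited in accuracy, and a post-processing stage is needed to improve recognition results. To improve speech recognition performance, this paper implements a system that restores the continuous Korean speech produced as recognition output to the correct syllables. The system takes continuous Korean speech in eojeol (word-phrase) units as input, applies Korean pronunciation rules in reverse to restore the original syllables, and uses a morphological analyzer to check and correct the restored syllables. In an experiment on sentences from elementary school textbooks, the system achieved a restoration rate of 90.42%. Homonyms account for a large share of the cases that are currently not restored correctly, and this problem could be improved to some extent using syntactic or semantic analysis.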

A Study on the In-Vehicle Voice Interaction Structure Considering Implicit context with Persistence of Conversation (대화 지속성 암묵적 단서를 고려한 차량 내 음성 인터랙션 구조 연구)

Namkung, Kiechan
- Journal of the Korea Convergence Society / v.12 no.2 / pp.179-184 / 2021
This study investigates users' conversational behavior with an in-vehicle voice interaction system. Its purpose is to identify the elements of conversation that users expect in voice interactions with systems and to present structural improvements that enable voice interactions similar to those between people. To observe users' voice interaction behavior in the vehicle, data were collected through contextual inquiry and the interview contents were analyzed using open coding. We were able to explore the usefulness of voice interaction features, which matter greatly because they increase users' satisfaction with the features and their continued use. This study is meaningful in that it analyzes users' empirical needs for interpersonal-model technology from the perspective of conversation.
https://doi.org/10.15207/JKCS.2021.12.2.179

Trends of Hardware Accelerator for the Embedded Speech Recognition (내장형 음성인식기를 위한 전용 하드웨어가속기 기술개발 동향)

Kim, J.Y.;Kim, T.J.;Lee, J.H.;Eum, N.W.
- Electronics and Telecommunications Trends / v.29 no.4 / pp.91-100 / 2014
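Speech recognition is the technology of converting human speech into text and using it as control commands for devices. Demand for speech recognition technology has existed for decades, and the field has been steadily commercialized. The algorithms and data-processing frameworks that can be commercialized as products have been formalized through a mathematical model called the HMM (Hidden Markov Model), and the general view is that large-scale, repeated data collection and the construction of a refined training database are the core elements of speech recognition technology. For this reason, organizations and companies with the infrastructure to collect and process large speech recognition databases are able to dominate the speech recognition market. However, this service delivery model may show its limits as speech recognition user interfaces expand to the Internet of Things and wearable devices, and when communication or network connectivity is unavailable. This article reviews hardware accelerator technology development for embedded speech recognizers that addresses this problem, along with the state of the field in Korea and abroad.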

An Experimental Study on Hindrance Factors of Usability of Menu Structure in ARS (ARS 메뉴체계 사용성 저해요소에 대한 실험연구)

Kim, Ho-Won;Kim, Hee-Cheol
- Journal of the Korea Institute of Information and Communication Engineering / v.15 no.2 / pp.462-470 / 2011
ARS (Automatic Response Systems) based on VUI (Voice User Interface) and TTI (Touch Tone Interface) are among the most widely used communication systems. Despite their common use, however, the inconvenience of ARS is continually pointed out. This may stem from a lack of human-centered studies, as opposed to technological development. In this paper, we provide guidelines for designing ARS by analyzing the factors that hinder the usability of ARS menu structures. We selected two call centers using ARS and carried out an experimental study in which subjects performed the task of "returning books." Afterwards, they completed questionnaires and interviews. We identified four problems: complex menu structure, menu names that lack representativeness, users' awareness of their location within the menu, and difficulty moving among menus. We also partially discuss ways of avoiding these problems.
https://doi.org/10.6109/jkiice.2011.15.2.462

Interactive Game Designed for Early Child using Multimedia Interface : Physical Activities (멀티미디어 인터페이스 기술을 이용한 유아 대상의 체감형 게임 설계 : 신체 놀이 활동 중심)

Won, Hye-Min;Lee, Kyoung-Mi
- The Journal of the Korea Contents Association / v.11 no.3 / pp.116-127 / 2011
This paper proposes interactive game elements for children: content, design, sound, gesture recognition, and speech recognition. Interactive games for young children must use content that reflects educational needs and design elements that are bright, friendly, and simple to use. The games should also use background music familiar to children and narration that makes the games easy to play. For gesture recognition and speech recognition, the games must use gesture and voice data suited to the age of the player. The paper also introduces the development process for an interactive skipping game and applies child-oriented content, gestures, and voices to the game.
https://doi.org/10.5392/JKCA.2011.11.3.116

A Voice Annotation Browsing Technique in Digital Talking Book for Reading-disabled People (독서장애인을 위한 음성 도서 어노테이션 검색 기법)

Park, Joo Hyun;Lim, Soon-Bum;Lee, Jongwoo
- Journal of Korea Multimedia Society / v.16 no.4 / pp.510-519 / 2013
In this paper, we propose a voice-annotation browsing system that enables reading-disabled people to find and play existing voice annotations. The proposed system consists of four steps: input, ranking and recommendation, search, and output. Because reading-disabled people depend only on hearing, all steps accept voice commands. To evaluate the effectiveness of our system, we designed and implemented an Android-based mobile e-book application that supports voice-annotation browsing. The implemented system was tested by a number of blindfolded users. The results indicate that almost all reading-disabled people can successfully and easily reach the voice annotations they want to find.
https://doi.org/10.9717/kmms.2013.16.4.510
