• Title/Abstract/Keywords: Visual information

5,281 results found (processing time 0.039 s)

A Novel Integration Scheme for Audio Visual Speech Recognition

  • Pham, Than Trung;Kim, Jin-Young;Na, Seung-You
    • 한국음향학회지 / Vol. 28, No. 8 / pp.832-842 / 2009
  • Automatic speech recognition (ASR) has been successfully applied to many real human-computer interaction (HCI) applications; however, its performance degrades significantly in noisy environments. Audio-visual speech recognition (AVSR), which combines an acoustic signal with lip motion, has recently attracted attention for its noise robustness. In this paper, we describe a novel integration scheme for AVSR based on a late-integration approach. First, we introduce a robust reliability measurement for the audio and visual modalities using model-based and signal-based information: the model-based sources measure the confusability of the vocabulary, while the signal-based information is used to estimate the noise level. Second, the output probabilities of the audio and visual speech recognizers are each normalized before the final integration step, which combines the normalized output spaces with the estimated weights. We evaluate the proposed method on a Korean isolated-word recognition system. The experimental results demonstrate its effectiveness and feasibility compared to conventional systems.
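The weighted late-integration step summarized in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the single weight `lam` stands in for the estimated modality reliabilities, and all scores are made up.

```python
import math

def normalize(scores):
    """Softmax-normalize raw recognizer scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def late_fusion(audio_scores, visual_scores, lam):
    """Combine per-word scores from the two recognizers after normalizing each
    modality; lam weights the audio stream and (1 - lam) the visual stream."""
    pa = normalize(audio_scores)
    pv = normalize(visual_scores)
    return [lam * a + (1.0 - lam) * v for a, v in zip(pa, pv)]

def recognize(audio_scores, visual_scores, lam):
    """Return the index of the vocabulary word with the highest fused score."""
    fused = late_fusion(audio_scores, visual_scores, lam)
    return max(range(len(fused)), key=lambda i: fused[i])
```

In clean audio the weight would sit near 1, while a high estimated noise level would shift weight toward the visual recognizer, which is what makes the late-integration scheme noise-robust.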

영상표식 기반의 로봇 매니퓰레이터 끝점 위치 제어 (Tip Position Control of a Robot Manipulator using Visual Markers)

  • 임세준;임현;이영삼
    • 제어로봇시스템학회논문지 / Vol. 16, No. 9 / pp.883-890 / 2010
  • This paper proposes a tip position control system that uses a visual marker to determine the tip position of a robot manipulator. The main idea is to introduce visual markers for the tracking control of a robot manipulator. Existing research uses stationary markers only to obtain pattern information from them; in contrast, we use visual markers to obtain their coordinates in addition to their pattern information. The markers need not be stationary, and the extracted marker coordinates are used as a reference trajectory for the tracking control of the manipulator. To build the proposed control scheme, we first obtain the intrinsic parameters through camera calibration and evaluate their validity. Second, we present a procedure for obtaining the relative coordinates of a visual marker with respect to the camera. Third, we derive the kinematics equations of the SCORBOT-ER 4pc manipulator used in the control experiments. We also provide a flow diagram of the entire visual marker tracking system. The feasibility of the proposed scheme is demonstrated through real experiments.
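As a rough illustration of how calibrated intrinsics relate a marker's pixel location to camera-frame coordinates, the standard pinhole model can be sketched as below. This is an assumption-laden sketch, not the paper's code: `fx, fy, cx, cy` are the calibrated focal lengths and principal point, and the marker's depth is assumed to be known (in practice it would come from the marker's known size or a second view).

```python
def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at a known depth into camera-frame
    coordinates using the pinhole model."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def camera_to_pixel(x, y, z, fx, fy, cx, cy):
    """Forward projection: camera-frame point (x, y, z) to pixel coordinates."""
    return (fx * x / z + cx, fy * y / z + cy)
```

The back-projected camera-frame coordinates of the marker would then feed the manipulator's reference trajectory, after transforming them into the robot base frame.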

C++를 위한 대화식 다중 뷰 시각 프로그래밍 환경 (An Interactive Multi-View Visual Programming Environment for C++)

  • 류천열;정근호;유재우;송후봉
    • 한국정보처리학회논문지 / Vol. 2, No. 5 / pp.746-756 / 1995
  • This paper presents a study of an interactive visual programming environment using multiple views; it provides views that visualize classes for C++ programming and visualize the flow of invoked member functions. We define new visual symbols for classes and construct an interactive visual programming environment with various views based on these symbols. Because the interactive multi-view visual programming environment visually represents the classes of an object-oriented language and the execution relationships between objects, the overall structure of an object-oriented program is easy to grasp, program development becomes easier, and the environment can also be useful for educating and training beginners.

A 3D Audio-Visual Animated Agent for Expressive Conversational Question Answering

  • Martin, J.C.;Jacquemin, C.;Pointal, L.;Katz, B.
    • 한국정보컨버전스학회:학술대회논문집 / 한국정보컨버전스학회 2008년도 International conference on information convergence / pp.53-56 / 2008
  • This paper reports on the ACQA (Animated agent for Conversational Question Answering) project conducted at LIMSI. The aim is to design an expressive animated conversational agent (ACA) for conducting research along two main lines: 1/ perceptual experiments (e.g. perception of expressivity and 3D movements in both the audio and visual channels); 2/ design of human-computer interfaces requiring head models at different resolutions and the integration of the talking head into virtual scenes. The target application of this expressive ACA is RITEL, a real-time speech-based question-answering system developed at LIMSI. The architecture of the system is based on distributed modules exchanging messages through a network protocol. The main components of the system are: RITEL, a question-answering system that searches raw text and produces a text (the answer) together with attitudinal information; the attitudinal information is then processed to deliver expressive tags; the text is converted into phoneme, viseme, and prosodic descriptions. Audio speech is generated by the LIMSI selection-concatenation text-to-speech engine. Visual speech uses MPEG-4 keypoint-based animation and is rendered in real time by Virtual Choreographer (VirChor), a GPU-based 3D engine. Finally, visual and audio speech is played in a 3D audio-visual scene. The project also devotes considerable effort to realistic 3D visual and audio rendering. A new model of phoneme-dependent human radiation patterns is included in the speech synthesis system, so that the ACA can move in the virtual scene with realistic 3D visual and audio rendering.

일정도표 정보의 지도기반 가시화 기법 (Visual Mapping from Time-Table Information to Map)

  • 이석준;정기숙;정승대;정순기
    • 한국HCI학회:학술대회논문집 / 한국HCI학회 2006년도 학술대회 1부 / pp.1155-1160 / 2006
  • Many scientific and engineering fields use various information visualization techniques to convey information on their particular topics to users more quickly and clearly. Visualizing information basically involves three steps: raw data is transformed into a data model, the data model is mapped onto a visual structure, and the result is transformed into an information model. Based on a table metaphor that organizes the temporal and spatial information arising inside a building where a particular event is in progress, this paper proposes a method for reflecting the various kinds of information extracted from the corresponding data model onto an information model consisting of a 3D map. Furthermore, rather than simply placing the information in space, we express it in 3D space with emphasis on the spatial meaning of the information according to the user's area of interest.

STUDY ON THE VISUAL COGNITIVE CHARACTERISTICS BY THE FIXATION POINT ANALYSIS USING THE EYE MARK RECORDER

  • Yamanoto, Satoshi;Yamaoka, Toshiki;Matsunobe, Takuo
    • 한국감성과학회:학술대회논문집 / 한국감성과학회 2001년도 춘계학술대회 논문집 / pp.20-25 / 2001
  • In recent years, concern about user-centered design has been increasing, and it is necessary to grasp users' visual cognitive characteristics for information presentation. This study therefore aims to grasp users' cognitive characteristics regarding information presentation by analyzing fixation points. In the experiment, subjects actually operated a copy machine, and their fixation point movements over the operation panel were recorded with an eye mark recorder. The analysis examined the screen interface of the operation panel from the viewpoint of fixation point traces. As a result, a top-down fixation order driven by experience or context became clear. Furthermore, differences in fixation order by skill level were also examined. In this study, it was assumed that grasping visual cognitive characteristics is the key to efficient information presentation.

쾌적성 평가지표로서 시각 및 청각정보의 영향에 관한 연구 (A study on the Visual and Aural Information Effect as the Amenity Evaluation Index)

  • 신훈;송민정;김선우;장길수
    • 한국소음진동공학회:학술대회논문집 / 한국소음진동공학회 2007년도 춘계학술대회논문집 / pp.511-514 / 2007
  • This study aims to derive the effect of visual and aural information on the perception of road traffic noise through a laboratory experiment. To verify the result more precisely, ME (magnitude estimation) and SD (semantic differential) evaluations of the visual and aural effects were carried out with 43 university students. As a result, a psychological reduction effect of up to 10% was observed below 65 dB(A). In terms of noise level, it was found that vision contributed a reduction of about 7 dB(A) and sound about 5 dB(A). However, when the two are presented simultaneously, it is mainly sound that reduces the annoyance of the noise, with vision next. Compared with urban central circumstances, this effect (2 dB(A) under 65 dB(A) noise) was smaller than in the field test.

A Design and Implementation of Tangible Educational Contents

  • Kim, So-Young;Kim, Heesun
    • International Journal of Advanced Culture Technology / Vol. 4, No. 4 / pp.64-69 / 2016
  • Currently, at school education sites, various multimedia contents are used to deliver knowledge to students effectively and to increase interest in class. The majority of the multimedia contents currently used in classes consist of visual and auditory information. This paper aims to maximize realism and immersion by adding olfactory information to the existing visual and auditory data. Tangible contents were developed based on the aromatic plants covered in the fifth grade of elementary school. The shape and explanation of the aromatic plants are displayed as visual and auditory information, and an aroma-spraying application allows the students to smell the aromatic plants. After conducting a class using the developed contents, the students' satisfaction with the class, as well as their overall academic understanding, was investigated. It was found that the students' academic understanding and satisfaction increased in comparison with classes comprising only visual and auditory contents.

Client-Server 모델에 의한 시각처리시스템 (Visual Processing System based on Client-Server Model)

  • 문용선;허형팔;임승우;박경숙
    • 전자공학회논문지T / Vol. 36T, No. 2 / pp.42-47 / 1999
  • This paper proposes a model for applying visual information to factory automation. In the proposed model, processing is distributed, using a client-server model and RPC (remote procedure calls), across a system that acquires the visual information, a main server that supervises the entire factory automation, and a processing server that handles only the visual information. Its effectiveness is demonstrated with a banknote recognition system.
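The acquisition/processing split via RPC can be sketched with Python's standard-library XML-RPC modules. This is an illustrative toy, not the paper's system: the banknote classifier here is a stand-in that simply picks the class with the largest feature value.

```python
from xmlrpc.server import SimpleXMLRPCServer
import threading
import xmlrpc.client

def classify_banknote(pixel_sums):
    """Stand-in vision routine on the processing server: return the index
    of the class with the largest feature sum."""
    return max(range(len(pixel_sums)), key=lambda i: pixel_sums[i])

# Processing server: exposes the vision routine over RPC.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(classify_banknote)
host, port = server.server_address
threading.Thread(target=server.serve_forever, daemon=True).start()

# Acquisition-side client: image features travel over RPC to the
# processing server, and only the classification result comes back.
proxy = xmlrpc.client.ServerProxy(f"http://{host}:{port}")
result = proxy.classify_banknote([0.2, 0.9, 0.1])
```

The same pattern extends to the paper's three-way split: the main server would act as another RPC client coordinating the acquisition system and the processing server.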

시각 및 청각 정보가 환경음의 쾌적성 평가에 미치는 영향에 관한 연구 (A Study on the Effects of Visual and Aural Information on Environmental Sound Amenity Evaluation)

  • 신훈;백건종;송민정;장길수
    • 한국소음진동공학회논문집 / Vol. 17, No. 9 / pp.813-818 / 2007
  • This study aims to determine, through a laboratory experiment, the effect on road traffic noise perception when visual and aural information is added. ME (magnitude estimation) and SD (semantic differential) evaluations of the visual and aural effects were carried out with 43 university students. As a result, a psychological reduction effect of up to 10% was observed below 65 dB(A). In terms of noise level, it was found that vision contributed a reduction of about 7 dB(A) and sound about 5 dB(A). However, when the two are presented simultaneously, it is mainly sound that reduces the annoyance of the noise, with vision next. Compared with urban central circumstances, this effect (2 dB(A) under 65 dB(A) noise) was smaller than in the field test.