• Title/Summary/Keyword: visual-audio

A Study of the spatial perception by audio-visual information (시각과 청각에 의한 공간적 지각에 관한 연구)

  • Lee, Chai-Bong;Kang, Dae-Gee
    • Journal of the Institute of Convergence Signal Processing / v.11 no.2 / pp.132-136 / 2010
  • A psychophysical experiment was performed to investigate how audio-visual spatial disparity affects perceived space in peripheral vision. In the experiment, participants were exposed to a visual stimulus and a sound stimulus that arrived simultaneously from different directions. The visual stimulus was implemented with 7 white LEDs located at an equal distance and at 7 different angles of $-70^{\circ}$, $-40^{\circ}$, $-20^{\circ}$, $0^{\circ}$, $20^{\circ}$, $40^{\circ}$, and $70^{\circ}$ relative to straight ahead. The auditory stimuli were implemented with loudspeakers placed in 9 different directions, equally spaced by $5^{\circ}$ over the range from $-20^{\circ}$ to $20^{\circ}$. Each participant then rated the spatial disparity between the visual and auditory stimuli on a 5-level response scale, in which a higher level indicates a larger gap. When the visual stimulus was presented from the right, the response level increased with the angle between the visual and auditory stimuli. A similar tendency was also observed for the visual stimulus at $0^{\circ}$. On the other hand, when the visual stimulus was presented from the left, the response level decreased as the angle increased.

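A Study of Interactive Media Art Based on Voice Recognition: Focusing on the Sound-Visual Interactive Installation "Water Music" (음성인식 기반 인터렉티브 미디어아트의 연구 - 소리-시각 인터렉티브 설치미술 "Water Music" 을 중심으로-)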

  • Lee, Myung-Hak;Jiang, Cheng-Ri;Kim, Bong-Hwa;Kim, Kyu-Jung
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회 학술대회논문집) / 2008.02a / pp.354-359 / 2008
  • This audio-visual interactive installation combines video projection and digital interface technology with recognition of the viewer's voice. The viewer can interact with the computer-generated moving images growing on the screen by blowing or making sounds. This symbiotic audio and visual installation environment allows viewers to experience an illusionistic space physically as well as psychologically. The main programming technologies used to generate the moving water waves that interact with the viewer in this installation are Visual C++ and the DirectX SDK. To create the water waves, full 3D rendering technology and a particle system were used.

Design of Music Learning Assistant Based on Audio Music and Music Score Recognition

  • Mulyadi, Ahmad Wisnu;Machbub, Carmadi;Prihatmanto, Ary S.;Sin, Bong-Kee
    • Journal of Korea Multimedia Society / v.19 no.5 / pp.826-836 / 2016
  • Mastering a musical instrument is not an easy task for an unskilled beginner. It requires playing every note correctly and maintaining the tempo accurately. Any piece of music comes in two forms: a music score and its rendition as audio. The proposed method assists beginning players in both aspects by employing two popular pattern recognition methods for audio-visual analysis: a support vector machine (SVM) for music score recognition and a hidden Markov model (HMM) for tracking the audio performance. With proper synchronization of the two results, the proposed music learning assistant system can give useful feedback to self-training beginners.

The Implementation of Real-Time Speaker Localization Using Multi-Modality (멀티모달러티를 이용한 실시간 음원추적 시스템 구현)

  • Park, Jeong-Ok;Na, Seung-You;Kim, Jin-Young
    • Proceedings of the KIEE Conference / 2004.11c / pp.459-461 / 2004
  • This paper presents an implementation of real-time speaker localization using audio-visual information. Four channels of microphone signals are processed to detect the vertical as well as horizontal position of the speaker. First, short-time average magnitude difference function (AMDF) signals are used to determine whether the microphone signals are human voices. Then the orientation and distance of the sound sources are obtained from interaural time differences and interaural level differences. Finally, visual information from a camera is used to fine-tune the speaker orientation. Experimental results for the real-time localization system show that performance improves to 99.6%, compared with 88.8% when only audio information is used.
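
The two audio steps named in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' formulas, so everything below is an assumption: the function names `amdf` and `itd_to_azimuth`, the far-field two-microphone geometry, and the default speed of sound are all hypothetical choices for illustration, not the paper's implementation.

```python
import math


def amdf(frame, max_lag):
    """Short-time average magnitude difference function for lags 1..max_lag.

    A voiced (periodic) frame produces a deep valley at the lag equal to
    its pitch period, which is one common way to tell voice from noise.
    """
    n = len(frame)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must be between 1 and len(frame) - 1")
    return [
        sum(abs(frame[i + lag] - frame[i]) for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


def itd_to_azimuth(itd_seconds, mic_spacing_m, speed_of_sound=343.0):
    """Azimuth in degrees from a time difference between two microphones.

    Assumes a far-field source, so sin(angle) = itd * c / spacing.
    The ratio is clipped to [-1, 1] to tolerate measurement noise.
    """
    ratio = itd_seconds * speed_of_sound / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

For a synthetic tone with a period of 50 samples, the minimum of `amdf` falls at lag 50, and a time difference of half the maximum possible delay between the microphones maps to an azimuth of 30 degrees.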

The Effective Education of the Standard Pronunciations (효과적인 표준 발음 교육)

  • Lee Dong-Seok
    • MALSORI / no.51 / pp.17-37 / 2004
  • The purpose of this study is to help general Korean speakers learn the standard pronunciation. However, there are obstacles to mastering the standard pronunciation: errors in the Korean pronunciation curriculum, the limited capability of teachers, and the duplicity of the mass media. To overcome these obstacles, we must concentrate our efforts on propagating the standard pronunciation. Several methods can be used to propagate it: presenting pronunciation mistakes, audio-visual teaching, presenting the principles of pronunciation, and using the Korean dictionary. The standard pronunciation differs from the pronunciation of general Korean speakers in many respects, so we cannot accurately predict how pronunciation will change. No one knows what will happen to Korean pronunciation in the future. Nevertheless, we must teach the standard pronunciation to general Korean speakers, because it is the officially valid pronunciation at present.

Estimation of speech feature vectors and enhancement of speech recognition performance using lip information (입술정보를 이용한 음성 특징 파라미터 추정 및 음성인식 성능향상)

  • Min So-Hee;Kim Jin-Young;Choi Seung-Ho
    • MALSORI / no.44 / pp.83-92 / 2002
  • Speech recognition performance is severely degraded in noisy environments. One approach to coping with this problem is audio-visual speech recognition. In this paper, we discuss the experimental results of bimodal speech recognition based on speech feature vectors enhanced with lip information. We try various kinds of speech features, such as linear prediction coefficients, cepstrum, and log area ratio, for transforming lip information into speech parameters. The experimental results show that the cepstrum parameter is the best feature in terms of recognition rate. We also present the desirable weighting values for audio and visual information as a function of the signal-to-noise ratio.
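
The idea of weighting audio and visual information according to the signal-to-noise ratio can be sketched as a simple score fusion. The abstract does not state the weighting values or the fusion rule, so the linear ramp in `audio_weight`, its 0 dB and 30 dB endpoints, and the function names are purely hypothetical assumptions chosen for illustration.

```python
def audio_weight(snr_db, low_db=0.0, high_db=30.0):
    """Hypothetical audio weight in [0, 1] that grows linearly with SNR.

    The endpoints are illustrative assumptions, not values from the paper.
    """
    if high_db <= low_db:
        raise ValueError("high_db must be greater than low_db")
    weight = (snr_db - low_db) / (high_db - low_db)
    return min(1.0, max(0.0, weight))


def fuse_scores(audio_scores, visual_scores, snr_db):
    """Combine per-label scores (e.g. log-likelihoods) from both modalities."""
    w = audio_weight(snr_db)
    return {
        label: w * audio_scores[label] + (1.0 - w) * visual_scores[label]
        for label in audio_scores
    }


def recognize(audio_scores, visual_scores, snr_db):
    """Return the label with the highest fused score."""
    fused = fuse_scores(audio_scores, visual_scores, snr_db)
    return max(fused, key=fused.get)
```

With this ramp, clean audio lets the acoustic scores decide the result, while at low SNR the decision shifts toward the lip-derived scores.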

The use of audio-visual aids and hyper-pronunciation method in teaching English consonants to Japanese college students

  • Todaka, Yuichi
    • Proceedings of the KSPS conference / 1996.10a / pp.149-154 / 1996
  • Since the 1980s, a number of professionals in the ESL/EFL field have investigated the role of pronunciation in the ESL/EFL curriculum. Applying insights gained from second language acquisition research, these efforts have focused on integrating pronunciation teaching and learning into the communicative curriculum, with a shift towards overall intelligibility as the primary goal of pronunciation teaching and learning. The present study reports on the efficacy of audio-visual aids and the hyper-pronunciation training method in teaching the production of English consonants to Japanese college students. The talk will focus on the implications of the present study, and the presenter will make suggestions for teaching pronunciation to Japanese learners.

Hierarchical Treatment of Aphasic Perseveration Program: A Case Study (위계적 고착현상 치료 프로그램의 적용: 사례 연구)

  • Jeong, Ok-Ran;Shim, Hong-Im;Ko, Do-Heung
    • Speech Sciences / v.8 no.4 / pp.75-86 / 2001
  • This study explored the effectiveness of a hierarchical treatment of aphasic perseveration (TAP) program in a Korean client with transcortical sensory aphasia. The subject, a 44-year-old woman with a middle cerebral artery (MCA) infarction, had a perseveration score of 52% on the Korean version of the Boston Naming Test (K-BNT). The experimental design was an alternating treatment design comparing the hierarchical TAP with conventional audio-visual stimulation. The frequency of perseverative behaviors and the number of correct responses in naming performance were analyzed and compared. The results suggested that the hierarchical TAP was more effective than conventional audio-visual stimulation in terms of correct naming responses. The frequency of perseverative behaviors was lower with the hierarchical TAP, but the difference was relatively small. Among the specific strategies of the TAP program, and unlike in English, the sentence completion task was not stimulable in Korean, whereas unison speech was highly stimulable. Therefore, it could be said that TAP is language-dependent.

A study on Metadata Modeling using Structure Information of Video Document (비디오 문서의 구조 정보를 이용한 메타데이터 모델링에 관한 연구)

  • 권재길
    • Journal of the Korea Society of Computer and Information / v.3 no.4 / pp.10-18 / 1998
  • Video information is an important component of multimedia systems such as digital libraries, the World-Wide Web (WWW), and Video-On-Demand (VOD) services. It can support various types of information because it includes audio-visual, spatio-temporal, and semantic information. In addition, it requires the ability to retrieve a specific scene of a video rather than the entire video document. Therefore, to support a variety of retrieval types, this paper models metadata using the hierarchical structure information of a video document and designs a database schema that can manipulate video documents.
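
A hierarchical metadata model that supports retrieving a specific scene rather than a whole video can be sketched as a small tree of segments. The abstract does not describe the actual schema, so the `Segment` class, its fields, and the keyword search in `find_segments` are hypothetical assumptions used only to illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """One node of a hypothetical video hierarchy (video, scene, or shot)."""
    title: str
    start_s: float
    end_s: float
    keywords: List[str] = field(default_factory=list)
    children: List["Segment"] = field(default_factory=list)


def find_segments(root: Segment, keyword: str) -> List[Segment]:
    """Depth-first search for every segment annotated with the keyword."""
    matches = [root] if keyword in root.keywords else []
    for child in root.children:
        matches.extend(find_segments(child, keyword))
    return matches


# Example: a video document with two scenes, one of which has a shot.
video = Segment("lecture", 0.0, 600.0, ["lecture"], [
    Segment("introduction", 0.0, 120.0, ["overview"]),
    Segment("demo", 120.0, 600.0, ["demo"], [
        Segment("live coding", 200.0, 400.0, ["demo", "code"]),
    ]),
])
```

A query such as `find_segments(video, "demo")` returns the matching scene and shot with their time ranges, so playback can start at the relevant segment instead of at the beginning of the video.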

The Plan for the Effective Method of Dental Laboratory Technology (치과기공과 교수방법의 효율화를 위한 방안)

  • Lee, Do-Kyeng
    • Journal of Technologic Dentistry / v.8 no.1 / pp.31-36 / 1986
  • This treatise suggests an effective method for planning the teaching of dental laboratory technology. It presents concrete, practical steps for an audio-visual approach to dental laboratory technology education. It is also intended to help students understand the dental laboratory theory and practice learned in class and apply it in field work. The suggestions are as follows: 1. The instructor should teach basic dental laboratory technology theory in an engaging way, using illustrations and figures. 2. In practical training classes, the instructor should teach every step using audio-visual materials such as slides and video tapes, and the instructor and his assistant should demonstrate an example to the students. 3. The instructor should set a standard and train the students repeatedly until they reach it. 4. Students should become skilled in every case through field work during their spare time and vacations. 5. The instructor should also teach professional ethics and manners so that students can adapt to social activities and become successful dental laboratory technicians after graduation.
