• Title/Summary/Keyword: Visual Information Processing

Implementation of an Immersive Game System Using Facial Feature Tracking for People with Physical Disabilities (신체 장애우를 위한 얼굴 특징 추적을 이용한 실감형 게임 시스템 구현)

  • Ju, Jin-Sun;Shin, Yun-Hee;Kim, Eun-Yi
    • Proceedings of the Korean Information Science Society Conference / 2006.10a / pp.475-478 / 2006
  • Immersive games are specialized games that pursue realism by reflecting the player's body movements and five senses as fully as possible. Existing immersive games were designed for non-disabled users and therefore require a great deal of movement, which makes them difficult for people with physical disabilities to use. This paper therefore proposes an immersive game system that can be operated on a PC with minimal facial movement. The proposed system extracts the eye region from webcam images using a neural-network-based texture classifier. The extracted eye region is then tracked in real time with the Mean-shift algorithm, and the tracking result drives the mouse pointer, so that a linked Flash game can be controlled entirely by eye movement. To verify the effectiveness of the proposed system, its performance was evaluated separately with disabled and non-disabled users. The results show that the proposed system can serve physically disabled users conveniently and familiarly, and that reliable face tracking allows the immersive game system to run even in cluttered environments.

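The tracking loop described above hinges on a mean-shift step: a search window is repeatedly shifted to the weighted centroid of a likelihood map until it converges on a mode. A minimal sketch in Python, assuming a precomputed toy weight map; the names, parameters, and values are illustrative, not taken from the paper:

```python
# Minimal mean-shift sketch: shift a window centre toward the weighted
# centroid of a per-pixel likelihood map (e.g. eye-region likelihoods)
# until it converges. The toy weight map below is purely illustrative.

def mean_shift(weights, start, radius=2, max_iter=20, eps=1e-3):
    """Iterate the window centre toward the weighted centroid."""
    cy, cx = float(start[0]), float(start[1])
    h, w = len(weights), len(weights[0])
    for _ in range(max_iter):
        num_y = num_x = total = 0.0
        for y in range(max(0, int(cy) - radius), min(h, int(cy) + radius + 1)):
            for x in range(max(0, int(cx) - radius), min(w, int(cx) + radius + 1)):
                wgt = weights[y][x]
                num_y += wgt * y
                num_x += wgt * x
                total += wgt
        if total == 0:
            break  # empty window: nothing to track
        ny, nx = num_y / total, num_x / total
        converged = abs(ny - cy) < eps and abs(nx - cx) < eps
        cy, cx = ny, nx
        if converged:
            break
    return round(cy), round(cx)

# Toy likelihood map with a single bright blob centred at (4, 5)
weights = [[0.0] * 8 for _ in range(8)]
for dy in (-1, 0, 1):
    for dx in (-1, 0, 1):
        weights[4 + dy][5 + dx] = 1.0
weights[4][5] = 3.0

print(mean_shift(weights, start=(2, 3)))  # converges onto the blob centre
```

Each converged centre would then be mapped to a mouse-pointer position; in a real system the weight map is re-estimated from the classifier output every frame.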

Constructing a Noise-Robust Speech Recognition System using Acoustic and Visual Information (청각 및 시각 정보를 이용한 강인한 음성 인식 시스템의 구현)

  • Lee, Jong-Seok;Park, Cheol-Hoon
    • Journal of Institute of Control, Robotics and Systems / v.13 no.8 / pp.719-725 / 2007
  • In this paper, we present an audio-visual speech recognition system for noise-robust human-computer interaction. Unlike usual speech recognition systems, our system utilizes the visual signal containing speakers' lip movements along with the acoustic signal to obtain robust recognition performance against environmental noise. The procedures of acoustic speech processing, visual speech processing, and audio-visual integration are described in detail. Experimental results demonstrate that the constructed system significantly enhances recognition performance in noisy circumstances compared to acoustic-only recognition, by exploiting the complementary nature of the two signals.
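Audio-visual integration of this kind is often realised at the decision level as a weighted combination of per-word log-likelihoods, with a reliability weight that shifts toward the visual stream as acoustic noise increases. A hedged sketch of that idea; the words, probabilities, and weighting scheme are assumptions for illustration, not the paper's exact integration method:

```python
import math

# Decision-level audio-visual fusion sketch: combine per-word scores from
# an acoustic model and a lip (visual) model with a reliability weight lam.
# lam near 1 trusts the audio; lam near 0 trusts the lips.

def fuse(acoustic, visual, lam):
    """Return the word maximising lam*log P_audio + (1-lam)*log P_visual."""
    best, best_score = None, -math.inf
    for word in acoustic:
        score = lam * math.log(acoustic[word]) + (1 - lam) * math.log(visual[word])
        if score > best_score:
            best, best_score = word, score
    return best

acoustic = {"yes": 0.2, "no": 0.8}   # noisy audio favours "no"
visual   = {"yes": 0.9, "no": 0.1}   # lip shape clearly says "yes"

print(fuse(acoustic, visual, lam=0.9))  # audio trusted -> "no"
print(fuse(acoustic, visual, lam=0.2))  # audio downweighted -> "yes"
```

In practice the weight is estimated from a noise measure such as SNR rather than fixed by hand.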

Image Denoising via Fast and Fuzzy Non-local Means Algorithm

  • Lv, Junrui;Luo, Xuegang
    • Journal of Information Processing Systems / v.15 no.5 / pp.1108-1118 / 2019
  • Non-local means (NLM) algorithm is an effective and successful denoising method, but it is computationally heavy. To deal with this obstacle, we propose a novel NLM algorithm with fuzzy metric (FM-NLM) for image denoising in this paper. A new feature metric of visual features with fuzzy metric is utilized to measure the similarity between image pixels in the presence of Gaussian noise. Similarity measures of luminance and structure information are calculated using a fuzzy metric. A smooth kernel is constructed with the proposed fuzzy metric instead of the Gaussian weighted L2 norm kernel. The fuzzy metric and smooth kernel computationally simplify the NLM algorithm and avoid the filter parameters. Meanwhile, the proposed FM-NLM using visual structure preferably preserves the original undistorted image structures. The performance of the improved method is visually and quantitatively comparable with or better than that of the current state-of-the-art NLM-based denoising algorithms.
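For context, baseline non-local means replaces each sample with a weighted average of all samples whose surrounding patches look similar, with weights from the Gaussian-weighted L2 kernel that FM-NLM replaces with a fuzzy metric. A minimal 1-D sketch of that baseline; the signal and parameters are invented for illustration:

```python
import math

# Baseline 1-D non-local means: each sample becomes a weighted average of
# all samples, weighted by patch similarity under a Gaussian-weighted L2
# kernel with smoothing parameter h. This is the costly kernel the paper's
# fuzzy metric is designed to replace.

def nlm_1d(signal, patch=1, h=0.5):
    n = len(signal)
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(n):
            # squared L2 distance between patches centred at i and j
            d2 = 0.0
            for k in range(-patch, patch + 1):
                a = signal[min(max(i + k, 0), n - 1)]  # clamp at borders
                b = signal[min(max(j + k, 0), n - 1)]
                d2 += (a - b) ** 2
            w = math.exp(-d2 / (h * h))
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out

noisy = [0.0, 0.1, -0.1, 0.05, 1.0, 0.9, 1.1, 0.95]  # step edge plus noise
print([round(v, 2) for v in nlm_1d(noisy)])
```

Note how the patch comparison keeps the step edge sharp: patches on opposite sides of the step get near-zero weight, so the two flat regions are smoothed independently.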

A Study on Visual Web Service Framework with Web UI Integration Base (웹 화면통합 기반의 Visual Web Service 프레임워크에 대한 연구)

  • Kim, Tae-Hoon
    • Proceedings of the Korea Information Processing Society Conference / 2008.05a / pp.413-416 / 2008
  • This study examines the problems that arise in typical system-integration projects and proposes a new UI-based approach to system integration. System-integration projects consume considerable cost and time, and various issues arise in verifying the integration of business systems and their interface deliverables. In particular, the emergence of integration requirements and the diverse demands around EAI (Enterprise Application Integration) and B2Bi (Business-to-Business Integration) call for a discussion of how typical application-integration projects are organized and for a new system-integration methodology. To address these problems, this paper proposes a VISUAL SOA framework built on SOA (Service Oriented Architecture) deliverables with visual, web-based UI integration, an evolution of existing approaches.

A Study on the Relation of Visual Information Character and Design Alternatives (시각적 정보의 특성이 디자인대안에 미치는 영향에 관한 연구)

  • 오해춘
    • Archives of design research / v.15 no.2 / pp.81-90 / 2002
  • Designers create new design alternatives by acquiring visual information during the design process. Which is more effective to acquire: direct visual information or indirect visual information? This research investigates the relation between the character of visual information and the design alternatives it produces. Group A was shown direct visual information to form visual mental imagery, and group B was shown indirect visual information; both groups then had to create telephone designs. Group C evaluated the resulting design alternatives with questionnaires using scales for distinctiveness and elegance. The experiment confirmed the hypothesis that the two groups differ in distinctiveness, while the result for elegance was opposite to the hypothesis; producing elegant designs thus appears to require concentrated cognitive ability. Accordingly, the study finds that indirect visual information is effective for the new-type design stage of the design process, while direct visual information is effective for the new-style design stage.


Visual Programming Environment for Effective Teaching and Research in Image Processing (영상처리에서 효율적인 교육과 연구를 위한 비주얼 프로그래밍 환경 개발)

  • Lee Jeong Heon;Heo Hoon;Chae Oksam
    • Journal of KIISE:Software and Applications / v.32 no.1 / pp.50-61 / 2005
  • With the widespread use of multimedia devices, demand for image processing engineers is increasing in various fields. However, few engineers can develop practical applications in the image processing area. Teaching practical image processing techniques requires a visual programming environment that can efficiently present image processing theory and, at the same time, provide interactive experiments for the theory presented. In this paper, we propose an integrated visual programming environment for image processing, consisting of a theory presentation system and an experiment system built on the visual programming environment. The theory presentation system supports multimedia data, web documents, and PowerPoint files. The proposed system provides an integrated environment for application development as well as education, and by accumulating teaching materials and exercise data it offers students and instructors an effective environment for image processing education and research.

Investigating the Effects of Hearing Loss and Hearing Aid Digital Delay on Sound-Induced Flash Illusion

  • Moradi, Vahid;Kheirkhah, Kiana;Farahani, Saeid;Kavianpour, Iman
    • Journal of Audiology & Otology / v.24 no.4 / pp.174-179 / 2020
  • Background and Objectives: Integrating auditory and visual speech information improves speech perception; however, if the auditory input is disrupted by hearing loss, auditory and visual inputs cannot be fully integrated. Temporal coincidence of the auditory and visual inputs is also a critically important factor in integrating the two senses, and the acoustic pathway is delayed when the signal passes through a hearing aid's digital signal processing. This study therefore investigated the effects of hearing loss and hearing aid digital delay on the sound-induced flash illusion. Subjects and Methods: A total of 13 adults with normal hearing, 13 with mild to moderate hearing loss, and 13 with moderate to severe hearing loss were enrolled. The sound-induced flash illusion test was then conducted and the results analyzed. Results: Hearing aid digital delay and hearing loss had no detrimental effect on the sound-induced flash illusion. Conclusions: The transmission velocity and neural transduction rate of auditory input decrease in patients with hearing loss, so the auditory and visual senses cannot be integrated completely, although the transmission rate of the auditory input was approximately normal when a hearing aid was fitted. It can thus be concluded that the processing delay in the hearing aid circuit is insufficient to disrupt the integration of auditory and visual information.

Lip and Voice Synchronization Using Visual Attention (시각적 어텐션을 활용한 입술과 목소리의 동기화 연구)

  • Dongryun Yoon;Hyeonjoong Cho
    • The Transactions of the Korea Information Processing Society / v.13 no.4 / pp.166-173 / 2024
  • This study explores lip-sync detection, focusing on the synchronization between lip movements and voices in videos. Typically, lip-sync detection techniques involve cropping the facial area of a given video, utilizing the lower half of the cropped box as input for the visual encoder to extract visual features. To enhance the emphasis on the articulatory region of lips for more accurate lip-sync detection, we propose utilizing a pre-trained visual attention-based encoder. The Visual Transformer Pooling (VTP) module is employed as the visual encoder, originally designed for the lip-reading task, predicting the script based solely on visual information without audio. Our experimental results demonstrate that, despite having fewer learning parameters, our proposed method outperforms the latest model, VocaList, on the LRS2 dataset, achieving a lip-sync detection accuracy of 94.5% based on five context frames. Moreover, our approach exhibits an approximately 8% superiority over VocaList in lip-sync detection accuracy, even on an untrained dataset, Acappella.
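A simplified way to see what such a sync detector computes: given per-frame audio and visual embeddings, the predicted offset is the shift at which the two streams agree most. The sketch below scores agreement with cosine similarity over toy 2-D embeddings; the real model learns both the embeddings and the scoring, so everything here is an illustrative assumption:

```python
import math

# Toy lip-sync offset search: slide the visual stream against the audio
# stream and pick the shift maximising mean cosine similarity between
# aligned frame embeddings. Embeddings here are hand-made 2-D vectors.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def best_offset(audio, visual, max_shift=3):
    """Offset (in frames) maximising mean similarity of aligned pairs."""
    best, best_score = 0, -2.0
    for shift in range(-max_shift, max_shift + 1):
        pairs = [(audio[i], visual[i + shift])
                 for i in range(len(audio))
                 if 0 <= i + shift < len(visual)]
        score = sum(cosine(a, v) for a, v in pairs) / len(pairs)
        if score > best_score:
            best, best_score = shift, score
    return best

# The visual stream is the audio stream delayed by exactly 2 frames.
audio  = [(math.cos(t / 2), math.sin(t / 2)) for t in range(10)]
visual = [(math.cos((t - 2) / 2), math.sin((t - 2) / 2)) for t in range(10)]
print(best_offset(audio, visual))  # recovers the 2-frame lag
```

A sync *detector* then reduces to checking whether the best-scoring offset is zero; attention-based encoders like VTP improve the embeddings this comparison runs on.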

Aural-visual two-stream based infant cry recognition (Aural-visual two-stream 기반의 아기 울음소리 식별)

  • Bo, Zhao;Lee, Jonguk;Atif, Othmane;Park, Daihee;Chung, Yongwha
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.354-357 / 2021
  • Infants communicate their feelings and needs to the outside world through non-verbal methods such as crying and displaying diverse facial expressions. However, inexperienced parents tend to decode these non-verbal messages incorrectly and take inappropriate actions, which might affect the bonding they build with their babies and the cognitive development of the newborns. In this paper, we propose an aural-visual two-stream based infant cry recognition system to help parents comprehend the feelings and needs of crying babies. The proposed system first extracts features from the pre-processed audio and video data using the VGGish model and a 3D-CNN model respectively, fuses the extracted features using a fully connected layer, and finally applies a softmax function to classify the fused features and recognize the corresponding type of cry. The experimental results show that the proposed system exceeds 0.92 in F1-score, which is 0.08 and 0.10 higher than the single-stream aural model and the single-stream visual model, respectively.
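The fusion stage described above (concatenate the two feature vectors, pass them through a fully connected layer, and softmax-normalise into cry-type probabilities) can be sketched as follows. The feature sizes, weights, and the three cry classes are invented for illustration:

```python
import math

# Feature-level fusion sketch: concatenate pooled audio and visual feature
# vectors, apply one fully connected layer, and softmax into class
# probabilities. All numbers below are toy values, not trained weights.

def softmax(z):
    m = max(z)                       # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def fuse_and_classify(audio_feat, visual_feat, weights, bias):
    fused = audio_feat + visual_feat               # concatenation
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

audio_feat  = [0.4, 0.1]            # stand-in for a pooled VGGish embedding
visual_feat = [0.9, 0.2]            # stand-in for a pooled 3D-CNN embedding
weights = [[1.0, 0.0, 1.0, 0.0],    # hypothetical class "hungry"
           [0.0, 1.0, 0.0, 1.0],    # hypothetical class "uncomfortable"
           [0.5, 0.5, 0.5, 0.5]]    # hypothetical class "sleepy"
bias = [0.0, 0.0, 0.0]

probs = fuse_and_classify(audio_feat, visual_feat, weights, bias)
print([round(p, 3) for p in probs])
```

In the actual system the fully connected layer is trained end-to-end with the two feature extractors; this sketch only shows the data flow of the fusion step.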

A Study on Image Recognition based on the Characteristics of Retinal Cells (망막 세포 특성에 의한 영상인식에 관한 연구)

  • Cho, Jae-Hyun;Kim, Do-Hyeon;Kim, Kwang-Baek
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.11 / pp.2143-2149 / 2007
  • A visual cortex stimulator, one type of artificial retina prosthesis for the blind, stimulates brain cells directly without processing the information passed from the retina to the visual cortex. In this paper, we propose an image construction and recognition model similar to human visual processing, which recognizes feature data carrying orientation information, i.e., the characteristic of the visual cortex. Image features are extracted with the Kirsch edge detector and then recognized by a backpropagation algorithm based on delta-bar-delta. Various numerical patterns are used to analyze the performance of the proposed method. In experiments, the proposed recognition model, which extracts image characteristics with the orientation information passed from retinal cells to the visual cortex, shows only small differences in recognition rate and, like the human visual system, is not sensitive to the choice of learning rate.
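The Kirsch detector used here for feature extraction is a compass operator: eight rotations of one 3x3 mask probe eight edge orientations, and the maximum response is kept at each pixel. A self-contained sketch; the tiny step-edge image is illustrative:

```python
# Kirsch compass edge detector sketch: rotate the outer ring of the base
# 3x3 mask into eight orientations and keep the maximum response.

# Outer-ring positions of a 3x3 mask, walked clockwise from top-left.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def kirsch_masks():
    base = [5, 5, 5, -3, -3, -3, -3, -3]   # three 5s, five -3s, centre 0
    masks = []
    for r in range(8):
        ring = base[-r:] + base[:-r] if r else base
        mask = [[0] * 3 for _ in range(3)]
        for (y, x), v in zip(RING, ring):
            mask[y][x] = v
        masks.append(mask)
    return masks

def kirsch_edges(img):
    """Maximum of the eight directional responses at each interior pixel."""
    h, w = len(img), len(img[0])
    masks = kirsch_masks()
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = max(
                sum(mask[dy][dx] * img[y + dy - 1][x + dx - 1]
                    for dy in range(3) for dx in range(3))
                for mask in masks)
    return out

# Vertical step edge: left half 0, right half 1
img = [[0, 0, 1, 1] for _ in range(4)]
edges = kirsch_edges(img)
print(edges[1][1], edges[1][2])  # prints 15 9: strong responses at the step
```

In the paper's pipeline, responses like these form the orientation-sensitive feature maps that the delta-bar-delta backpropagation network is then trained to recognize.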