Speech Emotion Recognition with SVM, KNN and DSVM

  • Hadhami Aouani;Yassine Ben Ayed
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.8
    • /
    • pp.40-48
    • /
    • 2023
  • Speech emotion recognition has become an active research theme in speech processing and in applications based on human-machine interaction. Our system is a two-stage approach consisting of feature extraction and a classification engine. First, two feature sets are investigated: the first uses only 13 Mel-frequency Cepstral Coefficients (MFCC) extracted from emotional speech samples, and the second fuses the MFCC features with three further features: Zero Crossing Rate (ZCR), Teager Energy Operator (TEO), and Harmonics-to-Noise Ratio (HNR). Second, we use two classification techniques, Support Vector Machines (SVM) and k-Nearest Neighbor (k-NN), and compare their performance. In addition, we investigate the importance of recent advances in machine learning, including deep kernel learning. A large set of experiments is conducted on the Surrey Audio-Visual Expressed Emotion (SAVEE) dataset for seven emotions. The experimental results show good accuracy compared with previous studies.
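
The feature fusion described in this abstract can be illustrated with a minimal sketch. It assumes the textbook definitions of ZCR and the discrete Teager Energy Operator; the paper's exact framing, windowing, and fusion scheme are not given in the abstract. The function names are hypothetical, HNR is omitted, and the MFCC vector is assumed to come from a separate extractor.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def teager_energy(frame):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [
        frame[n] ** 2 - frame[n - 1] * frame[n + 1]
        for n in range(1, len(frame) - 1)
    ]


def fuse_features(mfcc, frame):
    """Concatenate an MFCC vector with the frame's ZCR and mean TEO.

    Assumption: simple concatenation; the paper may fuse differently.
    """
    teo = teager_energy(frame)
    return list(mfcc) + [zero_crossing_rate(frame), sum(teo) / len(teo)]


# Illustrative frame: one period of a unit sinusoid sampled four times.
frame = [0.0, 1.0, 0.0, -1.0, 0.0]
features = fuse_features([0.0] * 13, frame)  # placeholder MFCC values
```

The fused vector has 15 entries here (13 MFCC plus ZCR and mean TEO) and could be passed to any SVM or k-NN implementation.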

LED Board Optimization Design for User-Friendly System Configuration (사용자 친화적 시스템 구성을 위한 LED 보드 최적화 설계)

  • Ju-An Park;Chang-Woo Han;Hui-Sang Yoo;Boong-Joo Lee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.5
    • /
    • pp.859-866
    • /
    • 2023
  • This paper focuses on configuring a user-friendly LED system by applying improvement measures such as gamma correction, flicker elimination, and driving-noise removal using MCUs and LED drivers. In the experiments, the 22kHz PWM mode of the LED driver generated noise outside the audible frequency range, making it practically imperceptible to users. The appropriate pull-up resistor values within the normal operating delay ratio of 5% were found to be 1kΩ to 10kΩ for the 3kHz PWM mode and 1kΩ to 2kΩ for the 22kHz PWM mode. In addition, gamma correction can be optimized for the nonlinear human visual system to express accurate contrast. As a result, the work is expected to lead to an LED system that renders more naturally and accurately than conventional LED systems and improves users' visual experience.
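
Gamma correction of the kind mentioned in this abstract is often implemented on an MCU as a lookup table. The sketch below is illustrative only: the gamma value of 2.2, the 8-bit input range, and the function name are assumptions, not values taken from the paper.

```python
def build_gamma_table(gamma=2.2, in_max=255, out_max=255):
    """Map linear input levels to gamma-corrected PWM duty values.

    gamma, in_max and out_max are assumed values for illustration.
    """
    return [
        round(((level / in_max) ** gamma) * out_max)
        for level in range(in_max + 1)
    ]


# One table entry per input brightness level.
table = build_gamma_table()
```

Precomputing the table avoids floating-point power calculations at run time. A driver with finer PWM resolution would use a larger `out_max`.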

Analysis and performance evaluation of the parallel typed for a vehicle driving simulator (병렬구조형 차량운전 모사장치의 성능평가 및 분석)

  • 박일경;박경균;김정하;이운성
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 1997.10a
    • /
    • pp.1481-1484
    • /
    • 1997
  • The vehicle driving simulator predicts vehicle motion through real-time simulation of the driver's steering, accelerating, and stopping, and reproduces that motion with visual, audio, and washout algorithms, giving the driver a vivid, realistic feeling. A driving simulator with an integrated vehicle control system is used to analyze vehicle controllability, steering capacity, and safety in various simulated environments. Besides, it analyzes vehicle safety factors and the driver's reactions, and promotes traffic safety without putting drivers at risk. The main procedure for developing the vehicle driving simulator is divided into three parts. First, a motion base system, which is driven by motion cues, should be developed. Second, real-time vehicle software that handles the vehicle dynamics must be constructed. Third, the driving simulator is integrated by interconnecting the visual system with the motion base. In this study, we examine the design of the motion base for a vehicle driving simulator and its real-time control, using an additional gyro sensor and accelerometers to find the position and orientation of the moving platform rather than calculating forward kinematics. To drive the motion base, we use National Instruments' LabVIEW software. Furthermore, we use an analysis module for vehicle motion and a washout algorithm module to complete a driving simulator that a human can actually drive, and we are experimenting with various vehicle motion conditions.
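
The abstract does not say which washout algorithm is used. As a rough illustration of the general idea, the sketch below assumes the simplest classical form: a first-order high-pass filter that passes the onset of an acceleration to the motion base and then lets the cue decay so the platform can return toward neutral. The function name and all parameter values are hypothetical.

```python
def washout_highpass(accel, dt, tau):
    """First-order high-pass filter over a sampled acceleration signal.

    Passes onset cues and washes out sustained input.
    dt is the sample period and tau the filter time constant (assumed values).
    """
    alpha = tau / (tau + dt)
    out = [accel[0]]
    for i in range(1, len(accel)):
        out.append(alpha * (out[-1] + accel[i] - accel[i - 1]))
    return out


# Sustained 1 m/s^2 acceleration sampled at 100 Hz for 10 s (illustrative).
step = [1.0] * 1000
cue = washout_highpass(step, dt=0.01, tau=1.0)
```

For this sustained input the cue starts at the full acceleration and decays toward zero, which is the behavior a washout stage needs to keep the platform within its limited travel.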

Development and Assessment of Multi-sensory Effector System to Improve the Realistic of Virtual Underwater Simulation (가상 해저 시뮬레이션의 현실감 향상을 위한 다감각 효과 재현 시스템 개발 및 평가)

  • Kim, Cheol-Min;Youn, Jae-Hong;Kang, Im-Chul;Kim, Byung-Ki
    • Journal of Korea Multimedia Society
    • /
    • v.17 no.1
    • /
    • pp.104-112
    • /
    • 2014
  • With the recent development of virtual reality technology, coupled with the growth of the marine industry, virtual underwater simulation systems are under development in various studies, for educational purposes and to simulate virtual reality experiences. The current literature indicates that many underwater simulation systems to date have focused on the quality of the visual stimulus delivered through a three-dimensional graphical user interface, which limits the realism of the experience. To improve the realism delivered by such virtual simulations, it is crucial to develop multi-sensory technology rather than focus on conventional audio-visual interaction, which limits the user's sense of underwater immersion and presence within the simulation. This work proposes an immersive multi-sensory effector system that gives users a more realistic underwater experience. The perceived sense of reality, the main factor of a virtual reality system, was evaluated.

Implementation and Design of Objective Quality Assurance System for Multimedia Service Video (멀티미디어 서비스 영상의 객관적 품질측정 시스템 설계 및 구현)

  • Joo, Hae-Jong;Hong, Bong-Hwa;On, Jin-Oh;Hong, Suk-Ju
    • 전자공학회논문지 IE
    • /
    • v.45 no.1
    • /
    • pp.58-64
    • /
    • 2008
  • This paper provides perceptual metrics for video quality based on properties of the human visual system, and for audio quality based on human audition. All metrics work without reference signals, allowing non-intrusive, in-service measurements. A simple and easy-to-learn user interface displays the metrics and saves them in popular file formats such as CSV. The proposed method enables varied and accurate measurement of multimedia service video quality. It can therefore be applied to establishing service guidelines and to standardizing measurement methods and systems for high-quality video services.

Speech Interactive Agent on Car Navigation System Using Embedded ASR/DSR/TTS

  • Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
    • Speech Sciences
    • /
    • v.11 no.2
    • /
    • pp.181-192
    • /
    • 2004
  • This paper presents an efficient speech interactive agent that renders smooth car navigation and Telematics services by employing embedded automatic speech recognition (ASR), distributed speech recognition (DSR), and text-to-speech (TTS) modules, all while enabling safe driving. A speech interactive agent is essentially a conversational tool that provides command and control functions to drivers, such as navigation tasks, audio/video manipulation, and E-commerce services, through natural voice/response interactions between user and interface. While the benefits of automatic speech recognition and speech synthesis have become well known, the hardware resources involved are often limited and the internal communication protocols are too complex to achieve real-time responses. As a result, performance degradation always exists in the embedded H/W system. To implement a speech interactive agent that accommodates user commands in real time, we propose to optimize the hardware-dependent architectural code for speed-up. In particular, we propose a composite solution combining memory reconfiguration and efficient arithmetic operation conversion with an effective out-of-vocabulary rejection algorithm, all made suitable for system operation under limited resources.

Story-based Information Retrieval (스토리 기반의 정보 검색 연구)

  • You, Eun-Soon;Park, Seung-Bo
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.4
    • /
    • pp.81-96
    • /
    • 2013
  • Video information retrieval has become a very important issue because of the explosive increase in video data from Web content development. Meanwhile, content-based video analysis using visual features has been the main source for video information retrieval and browsing. Content in video can be represented with content-based analysis techniques, which can extract various features from audio-visual data such as frames, shots, colors, texture, or shape. Moreover, similarity between videos can be measured through content-based analysis. However, a movie, which is a typical type of video data, is organized by story as well as by audio-visual data. This causes a semantic gap between the significant information recognized by people and the information resulting from content-based analysis when content-based video analysis using only low-level audio-visual data is applied to movie information retrieval. The reason for this semantic gap is that the story line of a movie is high-level information, with relationships in the content that change as the movie progresses. Information retrieval related to the story line of a movie cannot be executed by content-based analysis techniques alone. A formal model is needed that can determine relationships among movie contents, or track changes in meaning, in order to accurately retrieve the story information. Recently, story-based video analysis techniques have emerged that use a social network concept for story information retrieval. These approaches represent a story by using the relationships between characters in a movie, but they have problems. First, they do not express dynamic changes in relationships between characters according to story development. Second, they miss profound information, such as emotions indicating the identities and psychological states of the characters. Emotion is essential to understanding a character's motivation, conflict, and resolution. Third, they do not take account of events and background that contribute to the story. Accordingly, this paper reviews the importance and weaknesses of previous video analysis methods, ranging from content-based approaches to story analysis based on social networks. We also suggest necessary elements, such as character, background, and events, based on narrative structures introduced in the literature. First, we extract characters' emotional words from the script of the movie Pretty Woman by using the hierarchical attribute of WordNet, which is an extensive English thesaurus. WordNet offers relationships between words (e.g., synonyms, hypernyms, hyponyms, antonyms). We present a method to visualize the emotional pattern of a character over time. Second, a character's inner nature must be predetermined in order to model a character arc that can depict the character's growth and development. To this end, we analyze the amount of the character's dialogue in the script and track the character's inner nature using social network concepts, such as in-degree (incoming links) and out-degree (outgoing links). Additionally, we propose a method that can track a character's inner nature by tracing indices such as degree, in-degree, and out-degree of the character network in a movie through its progression. Finally, the spatial background where characters meet and where events take place is an important element in the story. We take advantage of the movie script to extract significant spatial backgrounds and suggest a scene map describing spatial arrangements and distances in the movie. Important places where main characters first meet or where they stay for long periods of time can be extracted through this scene map. In view of the aforementioned three elements (character, event, background), we extract a variety of information related to the story and evaluate the performance of the proposed method. We can track story information extracted over time and detect a change in the character's emotion or inner nature, spatial movement, and conflicts and resolutions in the story.
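
The degree indices mentioned in this abstract can be sketched as follows. The sketch assumes that each line of dialogue is recorded as a (speaker, listener) pair, which is only one plausible way to build a character network. The function name and the placeholder character labels are hypothetical and do not come from the paper.

```python
from collections import Counter


def degree_indices(dialogues):
    """Compute in-degree, out-degree, and degree for each character.

    dialogues: iterable of (speaker, listener) pairs (assumed format).
    """
    out_deg = Counter()
    in_deg = Counter()
    for speaker, listener in dialogues:
        out_deg[speaker] += 1   # outgoing link: the character speaks
        in_deg[listener] += 1   # incoming link: the character is addressed
    names = set(out_deg) | set(in_deg)
    return {
        name: {
            "in": in_deg[name],
            "out": out_deg[name],
            "degree": in_deg[name] + out_deg[name],
        }
        for name in names
    }


# Placeholder characters A, B, C; not taken from any real script.
indices = degree_indices([("A", "B"), ("A", "C"), ("B", "A")])
```

Running the same function on successive slices of the script would give the indices at each point in the movie's progression.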

Design and Implementation of Video Documents Management System (비디오 문서 관리시스템의 설계 및 구현)

  • Kweon, Jae-Gil;Bae, Jong-Min
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.8
    • /
    • pp.2287-2297
    • /
    • 2000
  • Video documents, which carry audio-visual and other semantic information, have complex relationships among media. While user requests for topic retrieval or specific-region retrieval are increasing, it is difficult to meet these requests with existing design methodologies. To support systematic management and varied retrieval capabilities for video documents, we must formulate a structural and systematic metadata model using semantic and structural information that is abstracted automatically or manually. This paper suggests a generic metadata model with which we analyze the characteristics of video documents, and which supports various query types and serves as a generic framework for video applications. We propose the generic integrated management model (GIMM) for generic metadata, and design a video documents management system (VDMS) and implement it using GIMM.

A Study on Modeling of Bibliographic Framework Based on FRBR for Television Program Materials (방송영상자료의 FRBR기반 서지구조모형에 관한 연구)

  • Chung, Jin-Gyoo
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.41 no.1
    • /
    • pp.185-214
    • /
    • 2007
  • This study designs a bibliographic framework based on the IFLA-FRBR model for television program materials and evaluates it in terms of retrieval effectiveness and system usability. The FRBR model supplies a more suitable bibliographic framework for audio-visual material, which has sufficient hierarchical relations and related bibliographic records. The research methods of this study are as follows: (1) An experimental FRBR-based metadata system, named FbCS, was developed using an entity-related database and has a multi-layered, hierarchical structure. FbCS was developed by benchmarking a case study of the FRBR-based iMMix model in the Netherlands. (2) To evaluate the retrieval effectiveness and usability of FbCS, this study conducted an experiment and a survey with user groups of professionals.

Design and Implement of Terrestrial & Satellite integrated DMB receiver for Personalized Broadcasting Services (개인 휴대형 방송 서비스를 위한 지상파/위성 통합 DMB 수신기 설계 및 구현)

  • Cho, Yong-Hoon;Kim, Won-Yong;Choi, Soon-Pil;Oh, Se-In;Choi, Jeong-Hoon
    • Proceedings of the KIEE Conference
    • /
    • 2007.04a
    • /
    • pp.289-291
    • /
    • 2007
  • The Digital Multimedia Broadcasting (DMB) system was developed to offer high-quality audio-visual multimedia content to users through various portable terminals in mobile environments. An integrated complex reception platform is required to receive multimedia broadcasting services transmitted over various transmission media. In this paper, we present a design and implementation technique for providing both terrestrial and satellite DMB services simultaneously on the same hardware platform. The implemented complex receiving terminal that accommodates these DMB services simultaneously is composed of an RF module, a baseband module, a complex control module, and a complex de-multiplexer module. The complex control module is designed using the uClinux operating system. The complex de-multiplexer, which performs the functions of the address decoder and each DMB stream de-multiplexer, is implemented with an FPGA device. The implemented platform was tested in a real environment, and its performance satisfies the required performance criteria.