• Title/Summary/Keyword: visual-audio

A Fundamental Study on the Marine Leisure - focus on the Psychology of Emotion for Seashore Relaxation - (해양레저에 관한 기초적인 연구 - 해변휴양의 정서심리를 중심으로 -)

  • Yoon, Soon-Dong
    • Proceedings of KOSOMES biannual meeting
    • /
    • 2008.05a
    • /
    • pp.75-80
    • /
    • 2008
  • There is considerable interest in and research on the practical aspects of marine leisure, but little research on its fundamentals. A theoretical basis for the merits of marine leisure is needed. The author analyzed the visual and audio information of the seashore environment aesthetically and musically, based on the psychology of emotion. As a result, people could gain positive emotions by participating in seashore relaxation and change their negative emotions into positive ones.

Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients

  • Eom, Youngsik;Bang, Junseong
    • Journal of information and communication convergence engineering
    • /
    • v.19 no.3
    • /
    • pp.148-154
    • /
    • 2021
  • With the advent of context-aware computing, many attempts have been made to understand emotions. Among these, Speech Emotion Recognition (SER) is a method of recognizing a speaker's emotions from speech information. The success of SER depends on selecting distinctive features and classifying them appropriately. In this paper, the performance of SER using neural network models (e.g., fully connected network (FCN), convolutional neural network (CNN)) with Mel-Frequency Cepstral Coefficients (MFCC) is examined in terms of the accuracy and distribution of emotion recognition. On the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, with tuned model parameters, a two-dimensional Convolutional Neural Network (2D-CNN) model with MFCC showed the best performance, with an average accuracy of 88.54% for five emotions (anger, happiness, calm, fear, and sadness) across male and female speakers. In addition, examining the distribution of emotion recognition accuracies for the neural network models indicates that the 2D-CNN with MFCC can be expected to achieve an overall accuracy of 75% or more.

Analysis on the Possibility of Electronic Surveillance Society in the Intelligence Information age

  • Chung, Choong-Sik
    • Journal of Platform Technology
    • /
    • v.6 no.4
    • /
    • pp.11-17
    • /
    • 2018
  • In the smart intelligent information society, social dysfunctions such as personal information protection issues and the risk of an electronic surveillance society may come to the fore. In this paper, we refer to various categories and classify electronic surveillance into audio surveillance, visual surveillance, location surveillance, biometric information surveillance, and data surveillance. Responding to new forms of electronic surveillance in the intelligent information society requires a change of perception from that of the past. This starts with recognizing the importance of digital privacy and leads to the right to self-determination of personal information. Therefore, in order to respond preemptively to the dysfunctions that may arise in the intelligent information society, it is necessary to further raise civil society's awareness of protecting information human rights.

Screen Performance of the Korean Actress Kim Hye-Soo (영화배우 김혜수의 스크린 퍼포먼스)

  • Kim, Jong-Guk
    • Journal of Information Technology Applications and Management
    • /
    • v.28 no.1
    • /
    • pp.43-51
    • /
    • 2021
  • This article explores Kim Hye-soo's film acting from the perspective of performance, understood as a socio-cultural action planned and intended for a certain purpose. By examining screen performance, in which the identity of an era (the concern of performance studies) is expressed through acting and reproduced in a system of verbal and non-verbal symbols, the article aims to enhance the academic value of Korean film acting. First, Kim Hye-soo's acting performance transforms through the repetition of genre acting. The sensuality and sexual attractiveness by which Kim Hye-soo is evaluated are repeated through the typical image that genre films require, but her acting performance is not consumed or subordinated as a tool for visual pleasure. Second, Kim Hye-soo's body, face, emotion, and audio are engraved with memories of the times, and the sociocultural identity of the performance is expressed through dynamic interaction between actions and reactions. Third, Kim Hye-soo's restored and recreated performance is sensitive to the changes of the times and is still in progress.

Speech Emotion Recognition with SVM, KNN and DSVM

  • Hadhami Aouani ;Yassine Ben Ayed
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.8
    • /
    • pp.40-48
    • /
    • 2023
  • Speech emotion recognition has become an active research theme in speech processing and in applications based on human-machine interaction. In this work, our system follows a two-stage approach consisting of feature extraction and a classification engine. First, two feature sets are investigated: the first consists of only 13 Mel-Frequency Cepstral Coefficients (MFCC) extracted from emotional speech samples, and the second fuses the MFCC features with three further features, namely Zero Crossing Rate (ZCR), Teager Energy Operator (TEO), and Harmonic to Noise Rate (HNR). Second, we use two classification techniques, Support Vector Machines (SVM) and k-Nearest Neighbor (k-NN), and compare their performance. In addition, we investigate the importance of recent advances in machine learning, including deep kernel learning. A large set of experiments is conducted on the Surrey Audio-Visual Expressed Emotion (SAVEE) dataset for seven emotions. The results of our experiments showed good accuracy compared with previous studies.
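
As a rough illustration of the feature-fusion step described in this abstract, the sketch below computes two of the named features for a single frame of samples and appends them to an MFCC vector. It is not the authors' implementation: the function names are hypothetical, the MFCCs are assumed to be computed elsewhere, HNR is omitted, and summarizing TEO by its mean is an assumption.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def teager_energy(frame):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [
        frame[n] ** 2 - frame[n - 1] * frame[n + 1]
        for n in range(1, len(frame) - 1)
    ]


def fuse_features(frame, mfcc):
    """Concatenate an MFCC vector (computed elsewhere) with ZCR and mean TEO.

    Assumption: TEO is reduced to its mean over the frame; the abstract
    does not say how the features are summarized before fusion.
    """
    teo = teager_energy(frame)
    mean_teo = sum(teo) / len(teo) if teo else 0.0
    return list(mfcc) + [zero_crossing_rate(frame), mean_teo]
```

The fused vector would then be passed to a classifier such as an SVM or k-NN, as the abstract describes.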

A Case Study on the Healing Forest Development Plan of Kangwon Province (강원도 치유의 숲 조성 기본계획 수립에 관한 연구)

  • Kim, Myeong-Jun;Lee, Joon-Woo;Cha, Du-Song
    • Journal of Forest and Environmental Science
    • /
    • v.26 no.1
    • /
    • pp.53-63
    • /
    • 2010
  • This study was carried out to establish a master plan for a healing forest in Gangwon-do, focusing on healing trails and a visitor center. The study site was approximately 721 ha of mountain in Imgye-myeon, Gangwon-do, and the master plan was established through analysis of the humanities-social and natural environments. Six healing trails (10.5 km), divided into three steps, were planned for the healing forest, and each trail was designed to include rest areas, a wooden bridge, and open space. The visitor center, the core facility of the healing forest, was divided into several spaces such as a health measurement room and an AV room, and was planned to include an audio-visual education room for visitors.

Design of Emergency Evacuation Guiding System with Serially Connected Multi-channel Speakers (직렬 스피커 연결을 이용한 비상 대피 유도 시스템의 설계)

  • Chung, Han-Vit;Kim, Tea-Wan;Chung, Yun-Mo
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.48 no.4
    • /
    • pp.142-152
    • /
    • 2011
  • In general, existing emergency evacuation guiding systems depend on visual techniques such as emergency lights or LEDs. In a fire emergency, however, people may not have a clear field of view because of smoke. This paper introduces a technique for designing an emergency guiding system that uses directional sound to cope with this problem. All speakers are serially connected so that the audio signal is transmitted in a serial fashion, which makes speaker installation convenient. The Floyd algorithm is used to find the shortest evacuation paths. Because serially connected multi-channel speakers are vulnerable to disconnection, this paper uses a technique to solve the diagnostic problem. In the proposed system, a PC communicating over the USB protocol is used for control and observation. The system's benefits include an increased evacuation rate under emergency conditions and serial transmission of the audio signal for easy maintenance and low installation cost.
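
The shortest-path step mentioned in this abstract can be sketched with the Floyd (Floyd-Warshall) all-pairs algorithm. The graph model here is an assumption made for illustration: nodes are speaker or junction positions, edges are undirected corridors weighted by length, and the helper names are hypothetical rather than taken from the paper.

```python
import math


def floyd_warshall(n, edges):
    """All-pairs shortest paths over n nodes; edges are (u, v, length)."""
    dist = [[math.inf] * n for _ in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        nxt[i][i] = i
    for u, v, w in edges:
        # Corridors are assumed to be walkable in both directions.
        if w < dist[u][v]:
            dist[u][v] = dist[v][u] = w
            nxt[u][v] = v
            nxt[v][u] = u
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def path(nxt, src, dst):
    """Rebuild the node sequence from src to dst; empty if unreachable."""
    if nxt[src][dst] is None:
        return []
    route = [src]
    while src != dst:
        src = nxt[src][dst]
        route.append(src)
    return route


def nearest_exit_route(dist, nxt, start, exits):
    """Route from start to whichever exit node is closest."""
    best = min(exits, key=lambda e: dist[start][e])
    return path(nxt, start, best)
```

Because the algorithm computes every pair of distances at once, the routes can be precomputed and reused for every speaker position; the cost is cubic in the number of nodes, which is usually acceptable for a single building floor plan.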

Design of a Three Dimensional Audio System for Multicast Conferencing (멀티캐스트 화상회의를 위한 3-D 음향시스템 설계)

  • 김영오;고대식
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.25 no.1B
    • /
    • pp.71-76
    • /
    • 2000
  • In a multimedia teleconferencing system with a number of participants, the participants' faces can be perceived through the visual image. However, differentiating each participant's voice and conveying a sense of spaciousness are very hard, since the voices of all participants are processed as one-dimensional data. In this paper, we implemented a three-dimensional audio rendering system using the HRTF (Head Related Transfer Function) and a distance sense reproduction method, and determined the optimal location of the participants for a teleconferencing system. In the results of the listening test using elevation and azimuth angles, we showed that directional perception of the azimuth angles was better than that of the elevation angles. In particular, we showed that locating participants using the HRTFs of the azimuth angles 10°, 90°, 270°, and 350° was efficient in a teleconferencing system with four participants. We also proposed that a distance cue be used to enhance realism and to locate participants when there are more than five.

Timeline Synchronization of Multiple Videos Based on Waveform (소리 파형을 이용한 다수 동영상간 시간축 동기화 기법)

  • Kim, Shin;Yoon, Kyoungro
    • Journal of Broadcast Engineering
    • /
    • v.23 no.2
    • /
    • pp.197-205
    • /
    • 2018
  • Panoramic imaging is one of the technologies commonly used today. However, technical difficulties still exist in panoramic video production. Without a special camera such as a 360-degree camera, making a panoramic video becomes more difficult. To make a panoramic video, it is necessary to synchronize the timelines of multiple videos shot at multiple locations. However, timeline synchronization using the cameras' internal clocks may introduce errors because of differences in the internal hardware. To solve this problem, timeline synchronization between multiple videos using visual or auditory information has been studied. However, methods using video information have problems with accuracy and processing time, and methods using audio information are sensitive to noise and fail to synchronize when there is no melody. Therefore, in this paper, we propose a timeline synchronization method between multiple videos using the audio waveform. It shows higher synchronization accuracy and temporal efficiency than the video information based time synchronization method.
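
The abstract does not say how the audio waveforms are compared. One common way to estimate a time offset between two recordings of the same scene is brute-force cross-correlation, sketched below. This is an assumed technique for illustration, not the authors' method, and all function names are hypothetical.

```python
def cross_correlation(ref, other, lag):
    """Sum of ref[n + lag] * other[n] over the overlapping samples."""
    total = 0.0
    for n, sample in enumerate(other):
        m = n + lag
        if 0 <= m < len(ref):
            total += ref[m] * sample
    return total


def estimate_offset(ref, other, max_lag):
    """Lag in samples that best aligns `other` with `ref`.

    A positive result means `other` started recording that many
    samples after `ref` did.
    """
    return max(
        range(-max_lag, max_lag + 1),
        key=lambda lag: cross_correlation(ref, other, lag),
    )


def offset_seconds(ref, other, sample_rate, max_lag):
    """Same offset expressed in seconds for a given sample rate."""
    return estimate_offset(ref, other, max_lag) / sample_rate
```

Trimming the earlier video by the estimated offset puts both videos on a shared timeline. For long recordings, an FFT-based correlation would be much faster than this quadratic search.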

Implementation of SMIL Editor for Multimedia Broadcasting (멀티미디어 방송을 위한 SMIL 편집 시스템 구현)

  • 장대영;김창수;정회경
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.8 no.3
    • /
    • pp.622-629
    • /
    • 2004
  • Recently, as digital broadcasting and the internet have spread around the world, we can easily use information with fewer restrictions of time and space. In line with this trend, interest in ways of representing multimedia data has rapidly increased, and users demand services with integrated documents that include not only simple text and images but also time-varying audio-visual data. Therefore, in 1998, the W3C presented an international standard, SMIL, to solve multimedia object representation and synchronization problems. Using SMIL, various multimedia elements can be integrated into a multimedia document with a proper arrangement in space and time. With such SMIL documents, we can create a new internet radio broadcasting service that delivers not only audio data but also various text, images, and video. In this paper, we describe a SMIL document editor that lets ordinary users represent time-varying multimedia data with spatial layout and synchronization in time and space.
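
To make the kind of document such an editor produces more concrete, the sketch below builds a minimal SMIL presentation in which an audio stream plays in parallel with a sequence of timed images. It uses only basic SMIL elements (`layout`, `region`, `par`, `seq`, `audio`, `img`). The function name, file names, and region name are hypothetical, and this is not the editor described in the paper.

```python
import xml.etree.ElementTree as ET


def build_smil(audio_src, slides, width=320, height=240):
    """Return a SMIL document as a string.

    slides is a list of (image_src, seconds) pairs shown one after another
    while the audio plays in parallel.
    """
    smil = ET.Element("smil")
    head = ET.SubElement(smil, "head")
    layout = ET.SubElement(head, "layout")
    ET.SubElement(layout, "root-layout", width=str(width), height=str(height))
    # A single region (hypothetical name "slide") covering the whole window.
    ET.SubElement(
        layout, "region", id="slide",
        left="0", top="0", width=str(width), height=str(height),
    )
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")      # children play at the same time
    ET.SubElement(par, "audio", src=audio_src)
    seq = ET.SubElement(par, "seq")       # children play one after another
    for src, seconds in slides:
        ET.SubElement(seq, "img", src=src, region="slide", dur=f"{seconds}s")
    return ET.tostring(smil, encoding="unicode")
```

Calling `build_smil("radio.mp3", [("a.png", 5), ("b.png", 10)])` yields a document in which `radio.mp3` plays while `a.png` is shown for 5 seconds and then `b.png` for 10 seconds.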