• Title/Summary/Keyword: Sound recognition

Effect of Digital Noise Reduction of Hearing Aids on Music and Speech Perception

  • Kim, Hyo Jeong;Lee, Jae Hee;Shim, Hyun Joon
    • Korean Journal of Audiology, v.24 no.4, pp.180-190, 2020
  • Background and Objectives: Although many studies have evaluated the effect of the digital noise reduction (DNR) algorithm of hearing aids (HAs) on speech recognition, there are few studies on the effect of DNR on music perception. Therefore, we aimed to evaluate the effect of DNR on music, in addition to speech perception, using objective and subjective measurements. Subjects and Methods: Sixteen HA users participated in this study (58.00±10.44 years; 3 males and 13 females). The objective assessment of speech and music perception was based on the Korean version of the Clinical Assessment of Music Perception test and word and sentence recognition scores. Meanwhile, for the subjective assessment, the quality rating of speech and music as well as self-reported HA benefits were evaluated. Results: There was no improvement conferred with DNR of HAs on the objective assessment tests of speech and music perception. The pitch discrimination at 262 Hz in the DNR-off condition was better than that in the unaided condition (p=0.024); however, the unaided condition and the DNR-on conditions did not differ. In the Korean music background questionnaire, responses regarding ease of communication were better in the DNR-on condition than in the DNR-off condition (p=0.029). Conclusions: Speech and music perception or sound quality did not improve with the activation of DNR. However, DNR positively influenced the listener's subjective listening comfort. The DNR-off condition in HAs may be beneficial for pitch discrimination at some frequencies.

Multi-Emotion Regression Model for Recognizing Inherent Emotions in Speech Data (음성 데이터의 내재된 감정인식을 위한 다중 감정 회귀 모델)

  • Moung Ho Yi;Myung Jin Lim;Ju Hyun Shin
    • Smart Media Journal, v.12 no.9, pp.81-88, 2023
  • Recently, online communication has increased with the spread of contactless services during the COVID-19 pandemic. In non-face-to-face settings, the other person's opinions and emotions are recognized through modalities such as text, speech, and images, and research on multimodal emotion recognition that combines these modalities is actively underway. Among these modalities, emotion recognition from speech data is attracting attention as a means of understanding emotion through both acoustic and linguistic information, but most existing work recognizes only a single emotion from a single speech feature value. Because multiple emotions coexist in complex ways within a conversation, a method for recognizing several emotions at once is needed. Therefore, in this paper we propose a multi-emotion regression model that preprocesses speech data, extracts feature vectors, and recognizes the complex emotions inherent in speech while taking the passage of time into account.
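As a rough illustration of the multi-emotion regression idea, each emotion dimension can be fitted as a separate linear regressor over a shared speech-feature vector. This is a minimal sketch with toy data; the function names, model form, and data are illustrative assumptions, not the paper's method.

```python
def train_multi_emotion(X, Y, lr=0.1, epochs=3000):
    """Fit one linear regressor per emotion dimension by stochastic
    gradient descent. X: list of feature vectors; Y: list of emotion
    score vectors (one score per emotion dimension)."""
    d, k = len(X[0]), len(Y[0])
    W = [[0.0] * k for _ in range(d)]  # one weight column per emotion
    b = [0.0] * k
    for _ in range(epochs):
        for xi, yi in zip(X, Y):
            pred = [sum(xi[j] * W[j][e] for j in range(d)) + b[e]
                    for e in range(k)]
            for e in range(k):
                err = pred[e] - yi[e]
                for j in range(d):
                    W[j][e] -= lr * err * xi[j]
                b[e] -= lr * err
    return W, b

def predict(W, b, x):
    """Return one regressed score per emotion dimension."""
    return [sum(x[j] * W[j][e] for j in range(len(x))) + b[e]
            for e in range(len(b))]
```

A real system would replace the toy scalar feature with frame-level acoustic features (e.g. MFCCs) aggregated over time, as the abstract's mention of "the passage of time" suggests.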

Volume Control using Gesture Recognition System

  • Shreyansh Gupta;Samyak Barnwal
    • International Journal of Computer Science & Network Security, v.24 no.6, pp.161-170, 2024
  • With recent technological advances, humans have made great progress in ease of living, incorporating sight, motion, sound, and speech into application and software controls. In this paper we explore a project in which gestures play the central role: the much-researched and still-evolving topic of gesture control, approached with computer vision. The main objective achieved is controlling computer settings with hand gestures: we create a module that acts as a volume-control program, using hand gestures to control the system volume, implemented with OpenCV. The module uses the computer's web camera to record images or video, processes them to extract the needed information, and then, based on that input, acts on the volume settings of the computer. The program can both increase and decrease the volume, so the only setup needed is a web camera to capture the user's input. Gesture recognition is performed with OpenCV, Python, and their libraries, which identify the specified hand gestures and translate them into changes to the device settings. The objective is to adjust the volume of a computer without physical interaction through a mouse or keyboard. OpenCV, a tool widely utilized for image processing and computer vision, enjoys extensive popularity: its community consists of over 47,000 individuals, and as of a survey conducted in 2020, its estimated number of downloads exceeds 18 million.
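The core fingertip-to-volume mapping in a module like this can be sketched as follows. This is a minimal illustration: the fingertip coordinates are assumed to come from a hand-landmark detector running on the webcam frames, and the function name and calibration range are assumptions, not taken from the paper.

```python
import math

def pinch_to_volume(thumb, index, min_dist=20.0, max_dist=200.0):
    """Map the pixel distance between thumb and index fingertips
    (as (x, y) points from a hand-landmark detector) to a 0-100
    volume level, clamped to a calibrated pinch range."""
    dist = math.hypot(index[0] - thumb[0], index[1] - thumb[1])
    dist = max(min_dist, min(max_dist, dist))  # clamp to range
    return round((dist - min_dist) / (max_dist - min_dist) * 100)
```

In the full pipeline, each webcam frame would be processed to locate the two fingertips, and the returned level would be passed to the operating system's volume API.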

A Research on Object Detection Technology for the Visually Impaired (시각장애인을 위한 사물 감지 기술 연구)

  • Jeong, Yeon-Kyu;Kim, Byung-Gyu;Lee, Jeong-Bae
    • The KIPS Transactions:PartB, v.19B no.4, pp.225-230, 2012
  • In this paper, an object-sensing technology that can assist a blind person using a white cane is implemented. The system uses ultrasonic sensors and a webcam, and the captured data are processed on a server computer. The ultrasonic sensors detect objects within 4 meters, the vision component distinguishes people from other objects, and the system emits a sound alert based on the combined result. By introducing ultrasonic sensing together with object recognition and person detection, the developed technology is expected to be highly usable in the daily lives of the visually impaired.
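The distance logic behind an ultrasonic detector of this kind can be sketched as below, assuming a standard time-of-flight sensor; the 4-meter threshold follows the abstract, while the function names and the fixed speed of sound are illustrative assumptions.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees Celsius

def echo_to_distance(echo_time_s):
    """Convert a round-trip ultrasonic echo time (seconds) to a
    one-way distance in meters."""
    return echo_time_s * SPEED_OF_SOUND / 2

def should_alert(echo_time_s, threshold_m=4.0):
    """Sound the alert when the detected object is within range."""
    return echo_to_distance(echo_time_s) <= threshold_m
```

The division by two accounts for the pulse traveling to the obstacle and back before the echo is registered.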

The Influence of Comedic Elements of the Game on the Gaming Choice by the Game Users (게임속의 코미디 요소가 사용자들의 게임 선택에 미치는 영향)

  • Maeng Jae-Hee;Hwang Ji-Yeon;Park Jin-Wan;Park Jin-Wan
    • The Journal of the Korea Contents Association, v.6 no.9, pp.108-115, 2006
  • This paper studies the influence of comedic elements in games on users' choice of games, with the goal of establishing a foundation for more varied game production environments. Within a game, comedic elements are categorized as actively expressive elements and enhance the game's entertainment value together with its graphics, scenario, sound, and level design. Although comedic elements are generally acknowledged as necessary, research on how users actually perceive them has been insufficient. Therefore, this paper investigates the characteristics, compositions, and techniques of comedy used in games and analyzes the influence those comedic elements have on users' recognition of, satisfaction with, and loyalty to a game.

A Study on the Development of the Interactive Emotional Contents Player Platform (인터랙티브 감성 콘텐츠 플레이어 플랫폼 개발에 관한 연구)

  • Kim, Min-Young;Kim, Dong-Keun;Cho, Yong-Joo
    • Journal of the Korea Institute of Information and Communication Engineering, v.14 no.7, pp.1572-1580, 2010
  • This thesis presents an emotion-based contents player platform that can change its visual and aural components according to the user's emotions. It classifies the emotion as pleasant, unpleasant, aroused, or relaxed based on physiological signals and the user's active responses. Accordingly, the system reorganizes graphical and aural stimuli, such as light, color, and sound, in real time. The platform can be used to develop and present emotional contents, and it also supports systematic analysis of how those components affect emotion. This paper describes the overall system architecture, the implementation of the sub-systems, and the actual contents built on top of the platform.
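The four states named in the abstract can be read as the poles of a valence/arousal plane; under that assumption, a classification step like the platform's could be sketched as below. The thresholds, function name, and normalized input scale are illustrative, not from the paper.

```python
def classify_emotion(valence, arousal):
    """Pick the dominant of the four states (pleasant, unpleasant,
    aroused, relaxed) from normalized valence/arousal scores in
    [-1, 1], derived upstream from physiological signals."""
    # Whichever axis deviates more from neutral decides the label.
    if abs(valence) >= abs(arousal):
        return "pleasant" if valence >= 0 else "unpleasant"
    return "aroused" if arousal >= 0 else "relaxed"
```

The player would then reselect light, color, and sound presets keyed to the returned label.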

Phonetic Value Changes of Consonants across Phonological Environments within Words (자음의 단어내 음운환경별로 본 음가변화)

  • 김종미
    • The Journal of the Acoustical Society of Korea, v.13 no.5, pp.69-76, 1994
  • Acoustic cues for aspects of consonantal phonology were tested in Korean words. All Korean consonants were recorded and acoustically analyzed in controlled phonological environments: ⅰ) word-initial, ⅱ) inter-vocalic, and ⅲ) word-final positions. The observed acoustic regularities are: ⅰ) the lengths of obstruents are longer word-initially than word-finally; ⅱ) the lengths of sonorants are longer word-finally than in word-initial or inter-vocalic position; ⅲ) the formants of the lateral /l/ are higher word-finally than inter-vocalically. Phonological explanations for these regularities can be found in the rules of ⅰ) inter-vocalic voicing of plain stops, ⅱ) syllable-final unreleasing of obstruents, ⅲ) word-initial aspiration of stops, and ⅳ) liquid alternation between [r] and [l]. Numerical data for all of these regularities are reported in order to facilitate their application toward improving naturalness in speech synthesis and accuracy in speech recognition.

A Study on the Natural Language Generation by Machine Translation (영한 기계번역의 자연어 생성 연구)

  • Hong Sung-Ryong
    • Journal of Digital Contents Society, v.6 no.1, pp.89-94, 2005
  • In machine translation, the goal of natural language generation is to produce a target sentence that conveys the meaning of the source sentence by using the parse tree of the source sentence and target-language expressions. It provides the generator with linguistic structures, word mappings, parts of speech, and lexical information. The purpose of this study is to investigate the characteristics of Korean that could be used to establish algorithms for speech recognition and speech synthesis, as part of a plan to realize automatic machine translation. The MT process is divided into the levels of morphemic analysis, semantic analysis, and syntactic construction.

A study on the subway-station improvement of metropolitan subway (도시철도 역사개선에 관한연구)

  • Kim, Dong-Won;Park, Soo-Choong;Lee, Hi-Sung;Moon, Dae-Sup
    • Proceedings of the KSR Conference, 2007.11a, pp.1641-1646, 2007
  • Subway stations were built with deficient ventilation and air-purification equipment, which users need in daily operation, because the planners did not recognize the importance of disaster-prevention equipment in subway stations; if an accident like the one in Daegu were to occur again, it could not be prevented. It is also urgently necessary to improve aging subway stations for the safety and service of citizens, because ridership has grown substantially. Station improvement should modernize backward, aging stations and maintain and develop them as transportation hubs that make a social contribution through the management of public facilities, and stations should be made convenient so that everyone can use them as one-stop systems including sales, business, and service facilities. This study presents an improvement plan for the problems of underground space and the current condition of metropolitan subway stations. Selecting target stations and examining the problems of underground stations in Seoul, we study passenger circulation and the relocation of equipment in underground space with safety in mind, and then study improvements in air purification and sound insulation to upgrade service to citizens.

An Analysis of Acoustic Features Caused by Articulatory Changes for Korean Distant-Talking Speech

  • Kim Sunhee;Park Soyoung;Yoo Chang D.
    • The Journal of the Acoustical Society of Korea, v.24 no.2E, pp.71-76, 2005
  • Compared to normal speech, distant-talking speech is characterized by the acoustic effect due to interfering sound and echoes as well as articulatory changes resulting from the speaker's effort to be more intelligible. In this paper, the acoustic features for distant-talking speech due to the articulatory changes will be analyzed and compared with those of the Lombard effect. In order to examine the effect of different distances and articulatory changes, speech recognition experiments were conducted for normal speech as well as distant-talking speech at different distances using HTK. The speech data used in this study consist of 4500 distant-talking utterances and 4500 normal utterances of 90 speakers (56 males and 34 females). Acoustic features selected for the analysis were duration, formants (F1 and F2), fundamental frequency, total energy and energy distribution. The results show that the acoustic-phonetic features for distant-talking speech correspond mostly to those of Lombard speech, in that the main resulting acoustic changes between normal and distant-talking speech are the increase in vowel duration, the shift in first and second formant, the increase in fundamental frequency, the increase in total energy and the shift in energy from low frequency band to middle or high bands.
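One of the features analyzed above, fundamental frequency, can be estimated from a single voiced speech frame with a naive autocorrelation method; this sketch is purely illustrative and is not the HTK-based experimental setup used in the paper.

```python
import math

def estimate_f0(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Naive autocorrelation pitch estimator for one voiced frame:
    pick the lag (candidate period) with the highest correlation
    inside the plausible pitch range and convert it to Hz."""
    lo = int(sample_rate / fmax)  # shortest candidate period
    hi = int(sample_rate / fmin)  # longest candidate period
    best_lag, best_corr = 0, 0.0
    for lag in range(lo, min(hi, len(samples) - 1) + 1):
        corr = sum(samples[i] * samples[i + lag]
                   for i in range(len(samples) - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag if best_lag else 0.0
```

Tracking this estimate frame by frame is one way an increase in fundamental frequency, such as the one reported for distant-talking speech, could be measured.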