• Title/Summary/Keyword: Audio Event Detection

Intelligent Abnormal Event Detection Algorithm for Single Households at Home via Daily Audio and Vision Patterns (지능형 오디오 및 비전 패턴 기반 1인 가구 이상 징후 탐지 알고리즘)

  • Jung, Juho;Ahn, Junho
    • Journal of Internet Computing and Services / v.20 no.1 / pp.77-86 / 2019
  • As the number of single-person households increases, a person living alone who is severely injured at home cannot easily call for help. This paper detects abnormal events in which a member of a single-person household is seriously injured at home. It proposes a vision detection algorithm that analyzes and recognizes patterns in video collected from home CCTV, and audio detection algorithms that analyze and recognize patterns of household sounds captured by smartphones. Each algorithm used alone has shortcomings and has difficulty detecting situations such as serious injuries over a wide area, so the paper proposes a fusion method that effectively combines the two. The detection performance of each algorithm and of the proposed fusion method was evaluated.
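The abstract does not say how the two detectors are combined. As one possible illustration, the sketch below uses a weighted late fusion of per-modality scores; the function name `fuse_scores`, the weight, and the threshold are all assumptions made for this example, not the authors' method. A score of `None` stands for a modality with no coverage, in which case the other modality decides alone.

```python
from typing import Optional


def fuse_scores(audio: Optional[float], vision: Optional[float],
                w_audio: float = 0.5, threshold: float = 0.5) -> bool:
    """Return True if the fused abnormal-event score reaches the threshold.

    Scores are assumed to lie in [0, 1]. None means the modality observed
    nothing (for example, the person is outside the camera's view).
    """
    if audio is None and vision is None:
        return False  # nothing observed, so nothing to flag
    if audio is None:
        fused = vision
    elif vision is None:
        fused = audio
    else:
        # Weighted average of the two detector scores (hypothetical rule).
        fused = w_audio * audio + (1.0 - w_audio) * vision
    return fused >= threshold


print(fuse_scores(0.9, None))  # audio alone decides
print(fuse_scores(0.2, 0.4))   # both modalities report low scores
```

Falling back to the available modality is what lets a fused detector cover a wider area than either sensor alone.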

A study on the waveform-based end-to-end deep convolutional neural network for weakly supervised sound event detection (약지도 음향 이벤트 검출을 위한 파형 기반의 종단간 심층 콘볼루션 신경망에 대한 연구)

  • Lee, Seokjin;Kim, Minhan;Jeong, Youngho
    • The Journal of the Acoustical Society of Korea / v.39 no.1 / pp.24-31 / 2020
  • In this paper, a deep convolutional neural network for sound event detection is studied. In particular, an end-to-end neural network, which generates detection results directly from the input audio waveform, is studied for the weakly supervised problem, which involves weakly labeled and unlabeled datasets. The proposed system is based on a network structure of deeply stacked 1-dimensional convolutional neural networks, enhanced by skip connections and a gating mechanism. Additionally, the proposed system is enhanced by sound event detection and post-processing steps, and a training step using the mean-teacher model is added to handle the weakly supervised data. The proposed system was evaluated on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 Task 4 dataset, and the results show F1-scores of 54 % (segment-based) and 32 % (event-based).
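In mean-teacher training, the teacher network's weights are an exponential moving average (EMA) of the student's weights, and a consistency loss between the two lets unlabeled clips contribute to training. The sketch below shows only the EMA step on a plain dictionary of scalar weights; the name `ema_update` and the `alpha` values are illustrative assumptions, not taken from the paper.

```python
def ema_update(teacher, student, alpha=0.999):
    """Move each teacher weight toward the matching student weight, in place.

    teacher, student: dicts mapping a parameter name to a float.
    alpha: smoothing factor; values close to 1 make the teacher change slowly.
    """
    for name, s_val in student.items():
        teacher[name] = alpha * teacher[name] + (1.0 - alpha) * s_val
    return teacher


teacher = {"w": 0.0}
student = {"w": 1.0}
for _ in range(3):
    ema_update(teacher, student, alpha=0.5)
print(teacher["w"])  # 0.875
```

In a real system the same update would be applied to every tensor of the network after each optimizer step on the student.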

Sound event detection based on multi-channel multi-scale neural networks for home monitoring system used by the hard-of-hearing (청각 장애인용 홈 모니터링 시스템을 위한 다채널 다중 스케일 신경망 기반의 사운드 이벤트 검출)

  • Lee, Gi Yong;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.39 no.6 / pp.600-605 / 2020
  • In this paper, we propose a sound event detection method using a multi-channel multi-scale neural network for a sound-sensing home monitoring system for the hearing impaired. In the proposed system, two channels with high signal quality are selected from several wireless microphone sensors in the home. Three features (time difference of arrival, pitch range, and the outputs obtained by applying a multi-scale convolutional neural network to the log mel spectrogram) extracted from the sensor signals are fed to a classifier based on a bidirectional gated recurrent neural network to further improve sound event detection performance. The detected sound event is converted into text along with the sensor position of the selected channel and provided to the hearing impaired. The experimental results show that the proposed sound event detection method is superior to the existing method and can effectively deliver sound information to the hearing impaired.
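Time difference of arrival (TDOA) between two channels is commonly estimated as the lag that maximizes their cross-correlation. The sketch below is a minimal pure-Python version of that idea; the abstract does not state which estimator the authors use, and `estimate_tdoa`, `max_lag`, and the sample rate are names and values assumed for this example.

```python
def estimate_tdoa(x, y, sample_rate, max_lag):
    """Return the delay of y relative to x in seconds (positive if y lags x).

    x, y: equal-rate sample sequences from the two selected channels.
    max_lag: largest lag, in samples, that the search considers.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation at this lag: sum of x[n] * y[n + lag].
        score = 0.0
        for n, x_n in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                score += x_n * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


x = [0, 0, 1, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 1, 0, 0, 0]  # same pulse, arriving two samples later
print(estimate_tdoa(x, y, sample_rate=16000, max_lag=3))
```

Production systems usually compute the correlation with an FFT and often apply a weighting such as GCC-PHAT; the brute-force loop here is kept for readability.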

Comparison of Audio Event Detection Performance using DNN (DNN을 이용한 오디오 이벤트 검출 성능 비교)

  • Chung, Suk-Hwan;Chung, Yong-Joo
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.13 no.3 / pp.571-578 / 2018
  • Recently, deep learning techniques have shown superior performance in various kinds of pattern recognition. However, there has been debate over whether the DNN performs better than conventional machine learning techniques when classification experiments use a small amount of training data. In this study, we compared the performance of the conventional GMM and SVM with that of the DNN, a deep learning technique, in audio event detection. When tested on the same data, the DNN showed superior overall performance, but the SVM was better than the DNN in segment-based F-score.
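A segment-based F-score divides the recording into fixed-length segments and compares, per segment, which event labels are active in the reference and in the system output. The sketch below assumes that segmentation has already been done and that each segment is given as a set of active labels; the function name and the input layout are assumptions for illustration, not the paper's evaluation code.

```python
def segment_f_score(reference, estimated):
    """Micro-averaged F-score over segments.

    reference, estimated: lists of sets, one set of active labels per segment.
    """
    tp = fp = fn = 0
    for ref, est in zip(reference, estimated):
        tp += len(ref & est)  # labels active in both
        fp += len(est - ref)  # labels reported but not present
        fn += len(ref - est)  # labels present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


reference = [{"speech"}, {"speech", "dog"}, set()]
estimated = [{"speech"}, {"dog"}, {"dog"}]
print(segment_f_score(reference, estimated))  # 2 TP, 1 FP, 1 FN
```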

A study on training DenseNet-Recurrent Neural Network for sound event detection (음향 이벤트 검출을 위한 DenseNet-Recurrent Neural Network 학습 방법에 관한 연구)

  • Hyeonjin Cha;Sangwook Park
    • The Journal of the Acoustical Society of Korea / v.42 no.5 / pp.395-401 / 2023
  • Sound Event Detection (SED) aims to identify not only the sound category but also the time interval of target sounds in an audio waveform. It is a critical technique in the fields of acoustic surveillance and monitoring systems. Recently, various models have been introduced through Detection and Classification of Acoustic Scenes and Events (DCASE) Task 4. This paper explores how to choose optimal parameters for a DenseNet-based model, an architecture that has led to outstanding performance in other recognition systems. In the experiments, DenseRNN, the SED model, consists of DenseNet-BC and bi-directional Gated Recurrent Units (GRU), and is trained with the mean-teacher model. Using an event-based F-score, the evaluation is performed over parameters related to both model architecture and model training, under the assessment protocol of DCASE Task 4. Experimental results show that performance rises and then saturates near its best value. Also, DenseRNN is trained more effectively without dropout.
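An event-based F-score counts a detected event as correct only if its label matches a reference event and its onset and offset fall within a tolerance (collar) of the reference times. The sketch below is a simplified version with greedy one-to-one matching; the 0.2 s collar and the offset tolerance of 20 % of the event length are stated here as assumptions about the usual DCASE Task 4 setting, and the function is not the official evaluation tool.

```python
def event_f_score(reference, estimated, collar=0.2, offset_ratio=0.2):
    """F-score over events given as (label, onset_seconds, offset_seconds)."""
    unmatched = list(estimated)
    tp = 0
    for label, onset, offset in reference:
        # Offset tolerance grows with the length of the reference event.
        offset_tol = max(collar, offset_ratio * (offset - onset))
        for est in unmatched:
            e_label, e_onset, e_offset = est
            if (e_label == label
                    and abs(e_onset - onset) <= collar
                    and abs(e_offset - offset) <= offset_tol):
                unmatched.remove(est)  # each estimate matches at most once
                tp += 1
                break
    fp = len(unmatched)
    fn = len(reference) - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


reference = [("dog", 1.0, 2.0), ("speech", 3.0, 8.0)]
estimated = [("dog", 1.1, 2.1), ("speech", 3.5, 8.0)]
print(event_f_score(reference, estimated))  # 0.5
```

The second estimate has the right label but starts 0.5 s late, so it counts as both a false positive and a missed event, which is why event-based scores tend to be lower than segment-based ones.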

A Personal Video Event Classification Method based on Multi-Modalities by DNN-Learning (DNN 학습을 이용한 퍼스널 비디오 시퀀스의 멀티 모달 기반 이벤트 분류 방법)

  • Lee, Yu Jin;Nang, Jongho
    • Journal of KIISE / v.43 no.11 / pp.1281-1297 / 2016
  • In recent years, personal videos have grown tremendously in number due to the substantial increase in the use of smart devices and networking services, with which users create and share video content easily and without many restrictions. Because videos generally have multiple modalities and their frame data varies over time, taking both properties into account would significantly improve event detection performance. This paper proposes an event detection method in which high-level features are first extracted from the multiple modalities in the videos and rearranged according to time sequence. The association between the modalities is then learned by means of a DNN to produce a personal video event detector. In the proposed method, audio and image data are first synchronized and then extracted. The result is then input into GoogLeNet and a Multi-Layer Perceptron (MLP) to extract high-level features. These features are rearranged in time sequence, and every video is processed to extract one feature each for training by means of the DNN.

Abnormal Behavior Pattern Identifications of One-person Households using Audio, Vision, and Dust Sensors (음성, 영상, 먼지 센서를 활용한 1인 가구 이상 행동 패턴 탐지)

  • Kim, Si-won;Ahn, Jun-ho
    • Journal of Internet Computing and Services / v.20 no.6 / pp.95-103 / 2019
  • The number of one-person households has grown steadily in recent years, and lonely, unnoticed deaths are increasingly observed. On the dark side of society, a considerable number of such deaths are reported across different age groups. We propose an unusual event detection method that may help reduce the number of people who die alone and remain undiscovered for a long period of time. The method identifies abnormal user behavior in daily life using vision pattern, audio pattern, and dust pattern algorithms. Each pattern algorithm on its own has the disadvantage of being unable to detect the user once they leave its coverage area. We used a fusion method to improve on the accuracy of each individual pattern algorithm and evaluated the technique with multiple user behavior patterns in indoor areas.

Dual CNN Structured Sound Event Detection Algorithm Based on Real Life Acoustic Dataset (실생활 음향 데이터 기반 이중 CNN 구조를 특징으로 하는 음향 이벤트 인식 알고리즘)

  • Suh, Sangwon;Lim, Wootaek;Jeong, Youngho;Lee, Taejin;Kim, Hui Yong
    • Journal of Broadcast Engineering / v.23 no.6 / pp.855-865 / 2018
  • Sound event detection is a research area that models human auditory cognition by recognizing events in an environment with multiple acoustic events and determining the onset and offset time of each event. DCASE, a research group on acoustic scene classification and sound event detection, runs challenges to encourage researchers to participate and to stimulate sound event detection research. However, the dataset provided by the DCASE Challenge is relatively small compared to ImageNet, a representative dataset for visual object recognition, and few open acoustic datasets are available. In this study, sound events that can occur indoors and outdoors are collected on a larger scale and annotated to construct a dataset. Furthermore, to improve performance on the sound event detection task, we developed a dual CNN structured sound event detection system by adding a supplementary neural network to a convolutional neural network to determine the presence of sound events. Finally, we conducted a comparative experiment against the baseline systems of both DCASE 2016 and DCASE 2017.
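Determining onset and offset times is often done by thresholding frame-level probabilities for one event class and merging consecutive active frames. The sketch below shows that generic post-processing step only; it is not the dual CNN described in the abstract, and `frames_to_events`, the threshold, and the hop size are assumptions chosen for illustration.

```python
def frames_to_events(probs, threshold=0.5, hop_seconds=0.02):
    """Turn per-frame probabilities into a list of (onset, offset) in seconds."""
    events, start = [], None
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i  # first active frame marks the onset
        elif not active and start is not None:
            events.append((start * hop_seconds, i * hop_seconds))
            start = None
    if start is not None:
        # The event is still active when the recording ends.
        events.append((start * hop_seconds, len(probs) * hop_seconds))
    return events


print(frames_to_events([0.1, 0.8, 0.9, 0.2, 0.7], hop_seconds=1.0))
# [(1.0, 3.0), (4.0, 5.0)]
```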

Acoustic Monitoring and Localization for Social Care

  • Goetze, Stefan;Schroder, Jens;Gerlach, Stephan;Hollosi, Danilo;Appell, Jens-E.;Wallhoff, Frank
    • Journal of Computing Science and Engineering / v.6 no.1 / pp.40-50 / 2012
  • The increase in the number of older people due to demographic change poses great challenges to social healthcare systems in both Western and Eastern countries. Support for older people by formal caregivers requires enormous time and personnel effort. Therefore, one of the most important goals is to increase the efficiency and effectiveness of today's care. This can be achieved through assistive technologies, which can increase the safety of patients or reduce the time needed for tasks that do not involve direct interaction between the caregiver and the patient. Motivated by this goal, this contribution focuses on applications of acoustic technologies to support users and caregivers in ambient assisted living (AAL) scenarios. Acoustic sensors are small and unobtrusive and can easily be added to existing care or living environments. The information gathered by the acoustic sensors can be analyzed to calculate the position of the user by localization and the context by detecting and classifying acoustic events in the captured signal. In this way, possibly dangerous situations such as falls, screams, or an increased amount of coughing can be detected, and appropriate actions can be initiated by an intelligent autonomous system for the acoustic monitoring of older persons. The proposed system reduces the false alarm rate compared to existing and commercially available approaches that rely essentially only on the acoustic level. This is because it explicitly distinguishes between the various acoustic events and provides information on the type of emergency that has taken place. Furthermore, the system can determine the position of the acoustic event as contextual information using only the acoustic signal. The position of the user is therefore known even if she or he does not wear a localization device such as a radio-frequency identification (RFID) tag.

A Real-Time Sound Recognition System with a Decision Logic of Random Forest for Robots (Random Forest를 결정로직으로 활용한 로봇의 실시간 음향인식 시스템 개발)

  • Song, Ju-man;Kim, Changmin;Kim, Minook;Park, Yongjin;Lee, Seoyoung;Son, Jungkwan
    • The Journal of Korea Robotics Society / v.17 no.3 / pp.273-281 / 2022
  • In this paper, we propose a robot sound recognition system that detects various sound events in real time using a microphone on a robot. To achieve real-time performance, we use a VGG11 model, which includes several convolutional neural network layers, with a real-time normalization scheme. The VGG11 model is trained on a database augmented across 24 environments (12 reverberation times and 2 signal-to-noise ratios). Additionally, a decision logic based on the random forest algorithm is designed to generate event signals for robot applications. For specific classes of acoustic events, this logic performs better than using the outputs of the network model alone. Experimental results show the performance of the proposed sound recognition system on a real-time device for robots.
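The 24 augmentation environments are the combinations of 12 reverberation times and 2 signal-to-noise ratios (SNRs). The sketch below enumerates such a grid and shows how a noise signal can be scaled to reach a target SNR. The concrete reverberation times and SNR values are hypothetical placeholders, since the abstract gives only the counts, and `noise_gain_for_snr` is a name invented for this example.

```python
import itertools
import math

# Hypothetical values: the abstract states only how many there are (12 and 2).
REVERB_TIMES_S = [round(0.1 * k, 1) for k in range(1, 13)]
SNRS_DB = [5.0, 15.0]


def noise_gain_for_snr(signal, noise, snr_db):
    """Gain to apply to noise so that signal power / noise power equals snr_db."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    # SNR in dB is 10 * log10(p_signal / (gain**2 * p_noise)); solve for gain.
    return math.sqrt(p_signal / (p_noise * 10.0 ** (snr_db / 10.0)))


conditions = list(itertools.product(REVERB_TIMES_S, SNRS_DB))
print(len(conditions))  # 24
```

Each training clip would then be convolved with a room response for the chosen reverberation time and mixed with noise scaled by the computed gain; that step is omitted here.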