Search | Korea Science

Hyeonjin Cha;Sangwook Park
- The Journal of the Acoustical Society of Korea, v.42 no.5, pp.395-401, 2023
Sound Event Detection (SED) aims to identify not only the category of each target sound but also its time interval in an audio waveform. It is a critical technique for acoustic surveillance and monitoring systems. Recently, various models have been introduced through Task 4 of Detection and Classification of Acoustic Scenes and Events (DCASE). This paper explores how to choose optimal parameters for a DenseNet-based model, an architecture that has led to outstanding performance in other recognition systems. In the experiments, DenseRNN, the SED model, consists of DenseNet-BC and bi-directional Gated Recurrent Units (GRU), and it is trained with the Mean Teacher method. Evaluation uses an event-based F-score under the assessment protocol of DCASE Task 4, varying parameters related to both model architecture and model training. The experimental results show that performance rises and then saturates near its best value. In addition, DenseRNN is trained more effectively without dropout.
https://doi.org/10.7776/ASK.2023.42.5.395
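
The abstract above states that DenseRNN is trained with Mean Teacher. As a rough illustration of the core idea only, the sketch below updates "teacher" weights as an exponential moving average (EMA) of "student" weights. The function name `ema_update`, the dictionary representation of weights, and the value of `alpha` are assumptions made for this example and are not taken from the paper.

```python
def ema_update(teacher, student, alpha=0.999):
    """Return new teacher weights as an EMA of the student weights.

    teacher, student: dicts mapping a parameter name to a float
    (a stand-in for real tensors; hypothetical representation).
    alpha: EMA decay; the value used in the paper is not given here.
    """
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


# Toy usage with made-up numbers.
teacher = {"w": 0.0, "b": 1.0}
student = {"w": 1.0, "b": 0.0}
updated = ema_update(teacher, student, alpha=0.9)
```

In practice the same update would be applied to every tensor of the teacher network after each student optimization step; the student is trained by gradient descent while the teacher is not.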

Kim, Tae-Ho;Chang, Joon-Hyuk
- Proceedings of the Korea Information Processing Society Conference, 2018.05a, pp.344-345, 2018
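This paper proposes an environment classification technique that uses deep learning on audio and radar data. The proposed technique classifies environments by combining an audio-based deep learning model for environment classification and a radar-based deep learning model into an ensemble. In particular, the proposed method, for which separate models are designed to improve the performance of audio and radar individually, classified five indoor environments and outperformed environment classification using audio or radar data alone.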
https://doi.org/10.3745/PKIPS.y2018m05a.344
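
The abstract above describes combining an audio model and a radar model as an ensemble, without saying how the outputs are fused. One common option is late fusion by a weighted average of the class probabilities, sketched below. The fusion rule, the function name `ensemble_predict`, the weight, and the probability values are all assumptions for illustration, not the authors' method.

```python
def ensemble_predict(audio_probs, radar_probs, audio_weight=0.5):
    """Fuse two per-class probability lists by weighted averaging.

    Returns (index of the most likely class, fused probabilities).
    audio_weight is a hypothetical tuning parameter.
    """
    if len(audio_probs) != len(radar_probs):
        raise ValueError("both models must score the same classes")
    fused = [
        audio_weight * a + (1.0 - audio_weight) * r
        for a, r in zip(audio_probs, radar_probs)
    ]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


# Made-up scores for five indoor environment classes.
audio = [0.1, 0.6, 0.1, 0.1, 0.1]
radar = [0.1, 0.2, 0.5, 0.1, 0.1]
best_class, fused_probs = ensemble_predict(audio, radar)
```

With equal weights, the fused score follows whichever class the two models jointly support most, which is one way a combined system can outperform either single-sensor model.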

Kim, Jungmin;Lee, Younglo;Kim, Donghyeon;Ko, Hanseok
- The Journal of the Acoustical Society of Korea, v.39 no.5, pp.406-413, 2020
In this paper, to improve the classification accuracy of bird and amphibian sounds, we use a GLU (Gated Linear Unit) and self-attention, which encourage the network to extract important features from the data and to discriminate the relevant frames from the whole input sequence. To use the acoustic data, we convert the 1-D acoustic signal to a log-Mel spectrogram. Undesirable components such as background noise in the log-Mel spectrogram are then reduced by the GLU, and the proposed temporal self-attention is applied to improve classification accuracy. The data consist of 6 species of birds and 8 species of amphibians, including endangered species, in the natural environment. The proposed method achieves an accuracy of 91 % on the bird data and 93 % on the amphibian data, an improvement of about 6 % to 7 % over existing algorithms.
https://doi.org/10.7776/ASK.2020.39.5.406
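
The GLU mentioned in the abstract above gates one half of its input with the sigmoid of the other half, so that values whose gate is near zero are suppressed. The sketch below shows only that gating step on a flat list of numbers. The function name `glu` and the input values are made up for this example, and the network in the paper presumably applies the gate to learned feature maps rather than raw numbers.

```python
import math


def glu(values):
    """Gated Linear Unit on a flat list: first half times sigmoid(second half)."""
    if len(values) % 2 != 0:
        raise ValueError("input length must be even")
    half = len(values) // 2
    content, gate = values[:half], values[half:]
    return [a * (1.0 / (1.0 + math.exp(-b))) for a, b in zip(content, gate)]


# A gate input of 0 gives sigmoid(0) = 0.5, so each content value is halved.
gated = glu([2.0, -1.0, 0.0, 0.0])
```

A strongly negative gate input drives the output toward zero, which is the mechanism by which a trained gate can attenuate background components.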

Suh, Sangwon;Lim, Wootaek;Jeong, Youngho;Lee, Taejin;Kim, Hui Yong
- Journal of Broadcast Engineering, v.23 no.6, pp.855-865, 2018
Sound event detection is a research area that models human auditory cognitive characteristics by recognizing events in an environment with multiple acoustic events and determining the onset and offset time of each event. DCASE, a research group on acoustic scene classification and sound event detection, holds challenges to encourage the participation of researchers and to stimulate sound event detection research. However, the dataset provided by the DCASE Challenge is relatively small compared to ImageNet, a representative dataset for visual object recognition, and few open acoustic datasets are available. In this study, sound events that can occur indoors and outdoors are collected on a larger scale and annotated to construct a dataset. Furthermore, to improve sound event detection performance, we developed a dual-CNN sound event detection system that adds a supplementary neural network to a convolutional neural network to determine the presence of sound events. Finally, we conducted a comparative experiment against the baseline systems of both DCASE 2016 and 2017.
https://doi.org/10.5909/JBE.2018.23.6.855
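
The abstract above describes sound event detection as determining the onset and offset time of each event. A generic way to obtain such times from frame-level detector outputs is to threshold the per-frame probabilities and merge consecutive active frames, as sketched below. The function name `frames_to_events`, the threshold, and the hop size are assumptions for illustration and do not describe the post-processing used in the paper.

```python
def frames_to_events(frame_probs, threshold=0.5, hop_seconds=0.02):
    """Convert per-frame probabilities for one class into (onset, offset) pairs.

    A frame is active when its probability is >= threshold; each run of
    consecutive active frames becomes one event, with times in seconds.
    """
    events = []
    onset = None
    for i, p in enumerate(frame_probs):
        active = p >= threshold
        if active and onset is None:
            onset = i  # a new event starts at this frame
        elif not active and onset is not None:
            events.append((onset * hop_seconds, i * hop_seconds))
            onset = None
    if onset is not None:  # event still active at the end of the clip
        events.append((onset * hop_seconds, len(frame_probs) * hop_seconds))
    return events


# Made-up probabilities, one frame per second for readability.
detected = frames_to_events([0.1, 0.8, 0.9, 0.2, 0.7], hop_seconds=1.0)
```

Real systems often smooth the probabilities (for example with a median filter) before thresholding to avoid fragmenting one event into several short ones.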