• Title/Summary/Keyword: 합성 순환 신경망

Search Result 46, Processing Time 0.028 seconds

Residual Convolutional Recurrent Neural Network-Based Sound Event Classification Applicable to Broadcast Captioning Services (자막방송을 위한 잔차 합성곱 순환 신경망 기반 음향 사건 분류)

  • Kim, Nam Kyun;Kim, Hong Kook;Ahn, Chung Hyun
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2021.06a
    • /
    • pp.26-27
    • /
    • 2021
  • 본 논문에서는 자막방송 제공을 위해 방송콘텐츠를 이해하는 방법으로 잔차 합성곱 순환신경망 기반 음향 사건 분류 기법을 제안한다. 제안된 기법은 잔차 합성곱 신경망과 순환 신경망을 연결한 구조를 갖는다. 신경망의 입력 특징으로는 멜-필터벵크 특징을 활용하고, 잔차 합성곱 신경망은 하나의 스템 블록과 5개의 잔차 합성곱 신경망으로 구성된다. 잔차 합성곱 신경망은 잔차 학습으로 구성된 합성곱 신경망과 기존의 합성곱 신경망 대비 특징맵의 표현 능력 향상을 위해 합성곱 블록 주의 모듈로 구성한다. 추출된 특징맵은 순환 신경망에 연결되고, 최종적으로 음향 사건 종류와 시간정보를 추출하는 완전연결층으로 연결되는 구조를 활용한다. 제안된 모델 훈련을 위해 라벨링되지 않는 데이터 활용이 가능한 평균 교사 모델을 기반으로 훈련하였다. 제안된 모델의 성능평가를 위해 DCASE 2020 챌린지 Task 4 데이터 셋을 활용하였으며, 성능 평가 결과 46.8%의 이벤트 단위의 F1-score를 얻을 수 있었다.

  • PDF

Graph Convolutional - Network Architecture Search : Network architecture search Using Graph Convolution Neural Networks (그래프 합성곱-신경망 구조 탐색 : 그래프 합성곱 신경망을 이용한 신경망 구조 탐색)

  • Su-Youn Choi;Jong-Youel Park
    • The Journal of the Convergence on Culture Technology
    • /
    • v.9 no.1
    • /
    • pp.649-654
    • /
    • 2023
  • This paper proposes the design of a neural network structure search model using graph convolutional neural networks. Deep learning has a problem of not being able to verify whether the designed model has a structure with optimized performance due to the nature of learning as a black box. The neural network structure search model is composed of a recurrent neural network that creates a model and a convolutional neural network that is the generated network. Conventional neural network structure search models use recurrent neural networks, but in this paper, we propose GC-NAS, which uses graph convolutional neural networks instead of recurrent neural networks to create convolutional neural network models. The proposed GC-NAS uses the Layer Extraction Block to explore depth, and the Hyper Parameter Prediction Block to explore spatial and temporal information (hyper parameters) based on depth information in parallel. Therefore, since the depth information is reflected, the search area is wider, and the purpose of the search area of the model is clear by conducting a parallel search with depth information, so it is judged to be superior in theoretical structure compared to GC-NAS. GC-NAS is expected to solve the problem of the high-dimensional time axis and the range of spatial search of recurrent neural networks in the existing neural network structure search model through the graph convolutional neural network block and graph generation algorithm. In addition, we hope that the GC-NAS proposed in this paper will serve as an opportunity for active research on the application of graph convolutional neural networks to neural network structure search.

Artificial neural network for classifying with epilepsy MEG data (뇌전증 환자의 MEG 데이터에 대한 분류를 위한 인공신경망 적용 연구)

  • Yujin Han;Junsik Kim;Jaehee Kim
    • The Korean Journal of Applied Statistics
    • /
    • v.37 no.2
    • /
    • pp.139-155
    • /
    • 2024
  • This study performed a multi-classification task to classify mesial temporal lobe epilepsy with left hippocampal sclerosis patients (left mTLE), mesial temporal lobe epilepsy with right hippocampal sclerosis (right mTLE), and healthy controls (HC) using magnetoencephalography (MEG) data. We applied various artificial neural networks and compared the results. As a result of modeling with convolutional neural networks (CNN), recurrent neural networks (RNN), and graph neural networks (GNN), the average k-fold accuracy was excellent in the order of CNN-based model, GNN-based model, and RNN-based model. The wall time was excellent in the order of RNN-based model, GNN-based model, and CNN-based model. The graph neural network, which shows good figures in accuracy, performance, and time, and has excellent scalability of network data, is the most suitable model for brain research in the future.

Earthquake events classification using convolutional recurrent neural network (합성곱 순환 신경망 구조를 이용한 지진 이벤트 분류 기법)

  • Ku, Bonhwa;Kim, Gwantae;Jang, Su;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.6
    • /
    • pp.592-599
    • /
    • 2020
  • This paper proposes a Convolutional Recurrent Neural Net (CRNN) structure that can simultaneously reflect both static and dynamic characteristics of seismic waveforms for various earthquake events classification. Addressing various earthquake events, including not only micro-earthquakes and artificial-earthquakes but also macro-earthquakes, requires both effective feature extraction and a classifier that can discriminate seismic waveform under noisy environment. First, we extract the static characteristics of seismic waveform through an attention-based convolution layer. Then, the extracted feature-map is sequentially injected as input to a multi-input single-output Long Short-Term Memory (LSTM) network structure to extract the dynamic characteristic for various seismic event classifications. Subsequently, we perform earthquake events classification through two fully connected layers and softmax function. Representative experimental results using domestic and foreign earthquake database show that the proposed model provides an effective structure for various earthquake events classification.

Multi-channel EEG classification method according to music tempo stimuli using 3D convolutional bidirectional gated recurrent neural network (3차원 합성곱 양방향 게이트 순환 신경망을 이용한 음악 템포 자극에 따른 다채널 뇌파 분류 방식)

  • Kim, Min-Soo;Lee, Gi Yong;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.40 no.3
    • /
    • pp.228-233
    • /
    • 2021
  • In this paper, we propose a method to extract and classify features of multi-channel ElectroEncephalo Graphy (EEG) that change according to various musical tempo stimuli. In the proposed method, a 3D convolutional bidirectional gated recurrent neural network extracts spatio-temporal and long time-dependent features from the 3D EEG input representation transformed through the preprocessing. The experimental results show that the proposed tempo stimuli classification method is superior to the existing method and the possibility of constructing a music-based brain-computer interface.

Performance comparison of various deep neural network architectures using Merlin toolkit for a Korean TTS system (Merlin 툴킷을 이용한 한국어 TTS 시스템의 심층 신경망 구조 성능 비교)

  • Hong, Junyoung;Kwon, Chulhong
    • Phonetics and Speech Sciences
    • /
    • v.11 no.2
    • /
    • pp.57-64
    • /
    • 2019
  • In this paper, we construct a Korean text-to-speech system using the Merlin toolkit which is an open source system for speech synthesis. In the text-to-speech system, the HMM-based statistical parametric speech synthesis method is widely used, but it is known that the quality of synthesized speech is degraded due to limitations of the acoustic modeling scheme that includes context factors. In this paper, we propose an acoustic modeling architecture that uses deep neural network technique, which shows excellent performance in various fields. Fully connected deep feedforward neural network (DNN), recurrent neural network (RNN), gated recurrent unit (GRU), long short-term memory (LSTM), bidirectional LSTM (BLSTM) are included in the architecture. Experimental results have shown that the performance is improved by including sequence modeling in the architecture, and the architecture with LSTM or BLSTM shows the best performance. It has been also found that inclusion of delta and delta-delta components in the acoustic feature parameters is advantageous for performance improvement.

Recurrent Neural Network Based Spectrum Sensing Technique for Cognitive Radio Communications (인지 무선 통신을 위한 순환 신경망 기반 스펙트럼 센싱 기법)

  • Jung, Tae-Yun;Jeong, Eui-Rim
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.6
    • /
    • pp.759-767
    • /
    • 2020
  • This paper proposes a new Recurrent neural network (RNN) based spectrum sensing technique for cognitive radio communications. The proposed technique determines the existence of primary user's signal without any prior information of the primary users. The method performs high-speed sampling by considering the whole sensing bandwidth and then converts the signal into frequency spectrum via fast Fourier transform (FFT). This spectrum signal is cut in sensing channel bandwidth and entered into the RNN to determine the channel vacancy. The performance of the proposed technique is verified through computer simulations. According to the results, the proposed one is superior to more than 2 [dB] than the existing threshold-based technique and has similar performance to that of the existing Convolutional neural network (CNN) based method. In addition, experiments are carried out in indoor environments and the results show that the proposed technique performs more than 4 [dB] better than both the conventional threshold-based and the CNN based methods.

Learning Recurrent Neural Networks for Activity Detection from Untrimmed Videos (비분할 비디오로부터 행동 탐지를 위한 순환 신경망 학습)

  • Song, YeongTaek;Suh, Junbae;Kim, Incheol
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2017.04a
    • /
    • pp.892-895
    • /
    • 2017
  • 본 논문에서는 비분할 비디오로부터 이 비디오에 담긴 사람의 행동을 효과적으로 탐지해내기 위한 심층 신경망 모델을 제안한다. 일반적으로 비디오에서 사람의 행동을 탐지해내는 작업은 크게 비디오에서 행동 탐지에 효과적인 특징들을 추출해내는 과정과 이 특징들을 토대로 비디오에 담긴 행동을 탐지해내는 과정을 포함한다. 본 논문에서는 특징 추출 과정과 행동 탐지 과정에 이용할 심층 신경망 모델을 제시한다. 특히 비디오로부터 각 행동별 시간적, 공간적 패턴을 잘 표현할 수 있는 특징들을 추출해내기 위해서는 C3D 및 I-ResNet 합성곱 신경망 모델을 이용하고, 시계열 특징 벡터들로부터 행동을 자동 판별해내기 위해서는 양방향 BI-LSTM 순환 신경망 모델을 이용한다. 대용량의 공개 벤치 마크 데이터 집합인 ActivityNet 비디오 데이터를 이용한 실험을 통해, 본 논문에서 제안하는 심층 신경망 모델의 성능과 효과를 확인할 수 있었다.

A Study on Learning Performance Improvement by Using Hidden States in Deep Reinforcement Learning (심층강화학습에 은닉 상태 정보 활용을 통한 학습 성능 개선에 대한 고찰)

  • Choi, Yohan;Seok, Yeong-Jun;Kim, Ju-Bong;Han, Youn-Hee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.05a
    • /
    • pp.528-530
    • /
    • 2022
  • 심층강화학습에 완전 연결 신경망과 합성곱 신경망은 잘 활용되는 것에 반해 순환 신경망은 잘 활용되지 않는다. 이는 강화학습이 마르코프 속성을 전제로 하기 때문이다. 지금까지의 강화학습은 환경이 마르코프 속성을 만족하도록 사전 작업이 필요했다, 본 논문에서는 마르코프 속성을 따르지 않는 환경에서 이러한 사전 작업 없이도 순환 신경망의 은닉 상태를 통해 마르코프 속성을 학습함으로써 학습 성능을 개선할 수 있다는 것을 소개한다.

Shooting sound analysis using convolutional neural networks and long short-term memory (합성곱 신경망과 장단기 메모리를 이용한 사격음 분석 기법)

  • Kang, Se Hyeok;Cho, Ji Woong
    • The Journal of the Acoustical Society of Korea
    • /
    • v.41 no.3
    • /
    • pp.312-318
    • /
    • 2022
  • This paper proposes a model which classifies the type of guns and information about sound source location using deep neural network. The proposed classification model is composed of convolutional neural networks (CNN) and long short-term memory (LSTM). For training and test the model, we use the Gunshot Audio Forensic Dataset generated by the project supported by the National Institute of Justice (NIJ). The acoustic signals are transformed to Mel-Spectrogram and they are provided as learning and test data for the proposed model. The model is compared with the control model consisting of convolutional neural networks only. The proposed model shows high accuracy more than 90 %.