• Title/Summary/Keyword: convolutional recurrent neural network (합성 순환 신경망)

Search Result 46, Processing Time 0.019 seconds

New Hybrid Approach of CNN and RNN based on Encoder and Decoder (인코더와 디코더에 기반한 합성곱 신경망과 순환 신경망의 새로운 하이브리드 접근법)

  • Jongwoo Woo;Gunwoo Kim;Keunho Choi
    • Information Systems Review / v.25 no.1 / pp.129-143 / 2023
  • In the era of big data, artificial intelligence is growing rapidly, and deep learning methods for image classification in particular have become an important area. Many studies have sought to further improve the performance of CNNs, which are widely used in image classification; a representative approach is the Convolutional Recurrent Neural Network (CRNN) algorithm. CRNN combines a CNN for image classification with an RNN for recognizing time-series elements. However, because the inputs to the RNN part of a CRNN are the flattened values obtained by applying convolution and pooling to the image, pixel values at the same position in the image appear in a different order. This makes it difficult for the RNN to properly learn the sequential arrangement in the image. Therefore, this study aims to improve image classification performance by proposing a novel hybrid method of CNN and RNN that applies the concepts of encoder and decoder. The effectiveness of the new hybrid method was verified through various experiments. This study has academic implications in that it broadens the applicability of the encoder and decoder concepts, and the proposed method has advantages in model training time and infrastructure costs because it does not significantly increase complexity compared with conventional hybrid methods. In addition, this study has practical implications in that it shows the possibility of improving the quality of services in various fields that require accurate image classification.
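The ordering problem described in this abstract can be illustrated with a small sketch. It is not the authors' method; it only shows, under the assumption of a plain row-major flatten, how vertically adjacent values end up far apart in the flattened vector. The function names and the 3x4 feature map are hypothetical.

```python
def flatten_row_major(feature_map):
    """Flatten a 2-D feature map (a list of rows) row by row."""
    return [value for row in feature_map for value in row]


def columns_as_sequence(feature_map):
    """Read the same map as a left-to-right sequence of column vectors."""
    return [list(column) for column in zip(*feature_map)]


# Hypothetical 3x4 feature map; each value encodes its position as row*10 + col.
fmap = [[r * 10 + c for c in range(4)] for r in range(3)]

flat = flatten_row_major(fmap)
seq = columns_as_sequence(fmap)

# Values 0 and 10 are vertical neighbours in the map, but after flattening
# they are separated by a full row width (4 positions).
gap = flat.index(10) - flat.index(0)
```

Reading the map column by column keeps each column's values together in one time step, which is one common way to present image features to a recurrent layer.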

Korean speech recognition using deep learning (딥러닝 모형을 사용한 한국어 음성인식)

  • Lee, Suji;Han, Seokjin;Park, Sewon;Lee, Kyeongwon;Lee, Jaeyong
    • The Korean Journal of Applied Statistics / v.32 no.2 / pp.213-227 / 2019
  • In this paper, we propose an end-to-end deep learning model that combines a Bayesian neural network with Korean speech recognition. In the past, Korean speech recognition was a complicated task because of the excessive parameters of its many intermediate steps and the need for expert knowledge of Korean. Korean speech recognition has become manageable thanks to recent breakthroughs in end-to-end models. An end-to-end model decodes mel-frequency cepstral coefficients directly into text without any intermediate processes. In particular, models trained with the Connectionist Temporal Classification loss and attention-based models are kinds of end-to-end model. In addition, we incorporate a Bayesian neural network into the end-to-end model and obtain Monte Carlo estimates. Finally, we carry out experiments on the "WorimalSam" online dictionary dataset. We obtain a 4.58% Word Error Rate, an improvement over the Google and Naver APIs.
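The Word Error Rate reported above is conventionally the word-level edit distance between the reference and the recognized text, divided by the number of reference words. The sketch below assumes that standard definition and whitespace tokenization; the abstract does not state how the authors tokenized Korean text, and the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing the reference `"a b c d"` with the hypothesis `"a x c"` needs one substitution and one deletion, giving a rate of 2/4.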

Analyzing the Impact of Multivariate Inputs on Deep Learning-Based Reservoir Level Prediction and Approaches for Mid to Long-Term Forecasting (다변량 입력이 딥러닝 기반 저수율 예측에 미치는 영향 분석과 중장기 예측 방안)

  • Hyeseung Park;Jongwook Yoon;Hojun Lee;Hyunho Yang
    • The Transactions of the Korea Information Processing Society / v.13 no.4 / pp.199-207 / 2024
  • Local reservoirs are crucial sources of agricultural water, so stable water level management is needed to prepare for extreme climate conditions such as droughts. Water level prediction is strongly influenced by local climate characteristics, such as localized rainfall, and by seasonal factors such as cropping times, so understanding the correlation between input and output data is as important as selecting an appropriate prediction model. In this study, extensive multivariate data from over 400 reservoirs in Jeollabuk-do from 1991 to 2022 were used to train and validate a water level prediction model that reflects the complex hydrological and climatological factors of each reservoir, and to analyze the impact of each input feature on prediction performance. Instead of pursuing performance gains through neural network structures, the study adopts a basic Feedforward Neural Network composed of fully connected layers, batch normalization, dropout, and activation functions, and focuses on the correlation between multivariate input data and prediction performance. Additionally, most existing studies present only short-term, daily prediction performance, which is not suitable for practical environments that require medium- to long-term predictions, such as 10 days or a month ahead. Therefore, this study measured water level prediction performance up to one month ahead using a recursive method that feeds each daily prediction back in as the next input. The experiments identified how performance changes with the prediction period and analyzed the impact of each input feature on overall performance through an ablation study.
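The recursive method mentioned in this abstract can be sketched as follows. This is a simplified, univariate illustration: the paper's model also takes multivariate climate inputs, which are omitted here. `recursive_forecast`, `predict_next`, `window`, and `horizon` are hypothetical names, and `predict_next` stands in for any trained one-step-ahead model.

```python
def recursive_forecast(predict_next, history, window, horizon):
    """Roll a one-step model forward by feeding predictions back as inputs.

    predict_next: callable taking the last `window` values and returning
                  the next value (a stand-in for a trained daily model).
    history:      observed values, oldest first.
    horizon:      number of future steps to predict.
    """
    if len(history) < window:
        raise ValueError("history is shorter than the input window")

    buffer = list(history[-window:])
    forecasts = []
    for _ in range(horizon):
        next_value = predict_next(buffer)
        forecasts.append(next_value)
        # Drop the oldest value and append the prediction as if it were observed.
        buffer = buffer[1:] + [next_value]
    return forecasts


# Toy stand-in model: continue the last value with a fixed increment.
example = recursive_forecast(lambda w: w[-1] + 1, [1, 2, 3], window=2, horizon=3)
```

Because each step consumes earlier predictions, errors can accumulate as the horizon grows, which is consistent with the abstract's interest in how performance changes with the prediction period.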

Comparison of CNN and GAN-based Deep Learning Models for Ground Roll Suppression (그라운드-롤 제거를 위한 CNN과 GAN 기반 딥러닝 모델 비교 분석)

  • Sangin Cho;Sukjoon Pyun
    • Geophysics and Geophysical Exploration / v.26 no.2 / pp.37-51 / 2023
  • Ground roll is the most common coherent noise in land seismic data and has a much larger amplitude than the reflection events we usually want to recover. Therefore, ground roll suppression is a crucial step in seismic data processing. Several techniques, such as f-k filtering and the curvelet transform, have been developed to suppress ground roll. However, the existing methods still need improvement in suppression performance and efficiency. Recently, various studies have applied deep learning methods developed for image processing to ground roll suppression in seismic data. In this paper, we introduce three models (DnCNN (De-noiseCNN), pix2pix, and CycleGAN), based on a convolutional neural network (CNN) or a conditional generative adversarial network (cGAN), for ground roll suppression and explain them in detail through numerical examples. Common shot gathers from the same field were divided into training and test datasets to compare the algorithms. We trained the models on the training data and evaluated their performance on the test data. Training these models with field data requires ground-roll-removed data; therefore, data in which the ground roll was suppressed by f-k filtering were used as the ground truth. To evaluate the deep learning models and compare the training results, we used quantitative indicators such as the correlation coefficient and the structural similarity index measure (SSIM), based on similarity to the ground-truth data. The DnCNN model exhibited the best performance, and we confirmed that the other models could also be applied to suppress ground roll.
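One of the indicators named above, the correlation coefficient, can be sketched as a Pearson correlation between a model's output and the ground-truth data. The abstract does not specify the exact variant used, so the Pearson form and the function name below are assumptions; the inputs are treated as flat lists of samples.

```python
import math


def correlation_coefficient(predicted, ground_truth):
    """Pearson correlation between two equally long sample sequences."""
    if len(predicted) != len(ground_truth) or len(predicted) < 2:
        raise ValueError("inputs must have the same length of at least 2")

    mean_p = sum(predicted) / len(predicted)
    mean_g = sum(ground_truth) / len(ground_truth)

    covariance = sum((p - mean_p) * (g - mean_g)
                     for p, g in zip(predicted, ground_truth))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in ground_truth)

    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined for a constant input")
    return covariance / math.sqrt(var_p * var_g)
```

A value near 1 indicates that the suppressed output follows the ground truth closely; a 2-D gather can be compared by flattening it to one list first.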

CNN Model-based Arrhythmia Classification using Image-typed ECG Data (이미지 타입의 ECG 데이터를 사용한 CNN 모델 기반 부정맥 분류)

  • Yeon-Suk Bang;Myung-Soo Jang;Yousik Hong;Sang-Suk Lee;Jun-Sang Yu;Woo-Beom Lee
    • Journal of the Institute of Convergence Signal Processing / v.24 no.4 / pp.205-212 / 2023
  • Among cardiac diseases, arrhythmias can lead to serious complications such as stroke, heart attack, and heart failure if left untreated, so continuous and accurate ECG monitoring is crucial for clinical care. However, accurate interpretation of electrocardiogram (ECG) data depends entirely on physicians, which requires additional time and cost. Therefore, this paper proposes an arrhythmia recognition module, intended for a medical platform, that analyzes abnormal pulse waveforms based on lifelogs. The proposed method converts ECG data from time series into image format, applies visual pattern recognition, and then detects arrhythmia with a CNN model. To validate the proposed CNN-based arrhythmia classification on image-converted ECG data, the MIT-BIH arrhythmia dataset was used, and the result showed an accuracy of 97%.
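The abstract does not say how the ECG time series is turned into an image, so the following is only one possible sketch: each sample is plotted as a single pixel in a binary grid, with larger amplitudes nearer the top row. The function name and the grid height are hypothetical.

```python
def signal_to_image(signal, height):
    """Rasterize a 1-D signal into a binary image (a list of pixel rows).

    Column k holds sample k; row 0 is the top of the image, so the largest
    amplitude maps to row 0 and the smallest to row height - 1.
    """
    if height < 2 or not signal:
        raise ValueError("need a non-empty signal and height >= 2")

    low, high = min(signal), max(signal)
    span = (high - low) or 1.0  # avoid dividing by zero for a flat signal

    image = [[0] * len(signal) for _ in range(height)]
    for col, value in enumerate(signal):
        row = round((high - value) / span * (height - 1))
        image[row][col] = 1
    return image


# Toy three-sample "beat": baseline, peak, baseline.
example_image = signal_to_image([0.0, 1.0, 0.0], height=3)
```

An image built this way can be passed to a 2-D CNN like any other single-channel picture; a real pipeline would likely also segment the recording into beats and connect neighbouring samples with lines.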

Sound event detection model using self-training based on noisy student model (잡음 학생 모델 기반의 자가 학습을 활용한 음향 사건 검지)

  • Kim, Nam Kyun;Park, Chang-Soo;Kim, Hong Kook;Hur, Jin Ook;Lim, Jeong Eun
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.479-487 / 2021
  • In this paper, we propose a Sound Event Detection (SED) model using self-training based on a noisy student model. The proposed SED model consists of two stages. In the first stage, a mean-teacher model based on a Residual Convolutional Recurrent Neural Network (RCRNN) is constructed to provide target labels for weakly labeled or unlabeled data. In the second stage, a self-training-based noisy student model is constructed by applying different noise types: feature noise, such as time-frequency shift, mixup, and SpecAugment, and dropout-based model noise. In addition, a semi-supervised loss function is applied to train the noisy student model, which acts as label noise injection. The performance of the proposed SED model is evaluated on the validation set of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 Challenge Task 4. The experiments show that the single model and the ensemble model of the proposed SED based on the noisy student model improve the F1-score by 4.6% and 3.4%, respectively, compared to the top-ranked model in DCASE 2020 Challenge Task 4.
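Of the feature noises listed above, mixup is simple enough to sketch: two examples and their labels are blended with the same weight. The code assumes the usual formulation, in which the weight is drawn from a Beta(alpha, alpha) distribution; the abstract does not give the authors' settings, and the function names and the `alpha` default are hypothetical.

```python
import random


def sample_mixing_weight(alpha=0.2, rng=random):
    """Draw a mixing weight in [0, 1] from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


def mixup(features_a, labels_a, features_b, labels_b, weight):
    """Blend two examples and their label vectors with the same weight."""
    if len(features_a) != len(features_b) or len(labels_a) != len(labels_b):
        raise ValueError("both examples must have matching shapes")

    mixed_features = [weight * a + (1 - weight) * b
                      for a, b in zip(features_a, features_b)]
    mixed_labels = [weight * a + (1 - weight) * b
                    for a, b in zip(labels_a, labels_b)]
    return mixed_features, mixed_labels


# Fixed weight for a reproducible illustration.
mixed_x, mixed_y = mixup([0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0], weight=0.25)
```

In training, `sample_mixing_weight` would supply a fresh weight for each pair, and the blended label vector serves as a soft target.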