• Title/Summary/Keyword: CNN-LSTM Neural Network

Intelligent Activity Recognition based on Improved Convolutional Neural Network

  • Park, Jin-Ho;Lee, Eung-Joo
    • Journal of Korea Multimedia Society / v.25 no.6 / pp.807-818 / 2022
  • To further improve the accuracy and time efficiency of behavior recognition in intelligent monitoring scenarios, a human behavior recognition algorithm based on YOLO combined with LSTM and CNN is proposed. First, exploiting the real-time nature of YOLO target detection, the specific behavior in the surveillance video is detected in real time, and deep features are extracted once the target's size, location, and other information have been obtained. Then, noise data from irrelevant areas of the image are removed. Finally, LSTM is used to model the time series, and the behavior action sequence in the surveillance video is classified. Experiments on the MSR and KTH datasets show average recognition rates of 98.42% and 96.6% and average recognition times of 210 ms and 220 ms, respectively. The proposed method performs well for intelligent behavior recognition.

A Study on the Epileptic Seizure Prediction using CNN (CNN을 이용한 뇌전증 발작예측에 관한 연구)

  • Ryu, Sanguk;Lee, Namhwa;Lee, Yeonsu;Joe, Inwhee;Min, Kyeongyuk;Kim, Taeksoo
    • Journal of the Semiconductor & Display Technology / v.19 no.2 / pp.92-95 / 2020
  • This paper presents a new architecture for seizure prediction using CNN, LSTM, and the discrete wavelet transform (DWT). In the proposed architecture, EEG data are labeled as preictal or interictal sections, and DWT is applied during preprocessing to capture the time- and frequency-domain characteristics of the EEG signal. A CNN extracts the spatial characteristics of each electrode used for EEG measurement, and an LSTM network verifies the logical order of the preictal section. The architecture is trained on the CHB-MIT Scalp EEG dataset, and a sliding window technique is applied to balance the number of interictal and preictal sections. In simulation, the proposed architecture achieved a sensitivity of 81.22% and an FPR of 0.174.
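The abstract does not say how the sliding window balances the two classes. One common reading is that the scarcer preictal recording is cut with overlapping windows while the interictal recording is cut without overlap. The sketch below illustrates that idea only; the recording lengths, the 30-second window, and the helper names `sliding_windows` and `balanced_stride` are all hypothetical, not taken from the paper.

```python
def sliding_windows(n_samples, window, stride):
    """Return (start, end) index pairs for windows over a signal of n_samples."""
    return [(s, s + window) for s in range(0, n_samples - window + 1, stride)]


def balanced_stride(n_samples, window, target_count):
    """Largest stride that still yields at least target_count windows,
    when that is achievable with a stride of 1 or more."""
    if target_count < 2:
        return window
    return max(1, (n_samples - window) // (target_count - 1))


# Hypothetical recording lengths in seconds: 2 h interictal, 30 min preictal.
interictal_len, preictal_len = 2 * 3600, 30 * 60
window = 30  # hypothetical window length in seconds

# Majority class: non-overlapping windows.
interictal = sliding_windows(interictal_len, window, stride=window)

# Minority class: shrink the stride so the window count roughly matches.
stride = balanced_stride(preictal_len, window, target_count=len(interictal))
preictal = sliding_windows(preictal_len, window, stride)

print(len(interictal), len(preictal), stride)
```

With these made-up numbers, overlapping windows raise the preictal count from 60 to roughly the interictal count, at the cost of correlated samples within the minority class.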

State of Health Estimation for Lithium-Ion Batteries Using Long-term Recurrent Convolutional Network (LRCN을 이용한 리튬 이온 배터리의 건강 상태 추정)

  • Hong, Seon-Ri;Kang, Moses;Jeong, Hak-Geun;Baek, Jong-Bok;Kim, Jong-Hoon
    • The Transactions of the Korean Institute of Power Electronics / v.26 no.3 / pp.183-191 / 2021
  • A battery management system (BMS) provides functions for ensuring safety and reliability, including algorithms that estimate battery states. Because of the changes caused by varying operating conditions, the state of health (SOH), a figure of merit for the battery's ability to store and deliver energy, is challenging to estimate. Machine learning methods can be applied to estimate SOH accurately. In this study, we propose a Long-term Recurrent Convolutional Network (LRCN) that combines a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) to extract aging characteristics and learn temporal mechanisms. The models are trained on the dataset collected in the NASA PCoE battery aging experiments, using part of the charging profile as input. The accuracy of the proposed model is compared with that of the CNN and LSTM models using k-fold cross-validation. The proposed model achieves an RMSE of 2.21%, a higher accuracy than the other models in SOH estimation.
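As a rough illustration of the evaluation described in this abstract, the sketch below computes RMSE and generates k-fold train/test index splits. It is not the authors' code: the contiguous (unshuffled) folds, the helper names, and the SOH values are assumptions made for the example.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) lists for k contiguous folds over n items."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# Hypothetical SOH values in percent (not from the paper).
soh_true = [100.0, 95.0, 90.0]
soh_pred = [98.0, 96.0, 91.0]
print(f"RMSE = {rmse(soh_true, soh_pred):.3f} percentage points")

# Each of the 10 hypothetical samples lands in exactly one test fold.
for train_idx, test_idx in k_fold_indices(10, 3):
    print(len(train_idx), test_idx)
```

When SOH is expressed in percent, the RMSE is in percentage points, which is presumably how the 2.21% figure in the abstract should be read.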

Development of radar-based nowcasting method using Generative Adversarial Network (적대적 생성 신경망을 이용한 레이더 기반 초단시간 강우예측 기법 개발)

  • Yoon, Seong Sim;Shin, Hongjoon
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.64-64 / 2022
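  • As abnormal climate increases the frequency of sudden, localized heavy rainfall, very-short-term rainfall forecasts (nowcasts) that are more accurate than numerical weather prediction at short lead times (up to about 3 hours) have become useful for early warning of flash floods and urban floods. Nowcasting information is generally produced from radar data using extrapolation and motion-vector-based prediction techniques. Recently, with long-term radar observation records and sufficient computing resources available, deep-learning-based rainfall prediction from radar data (RNN (Recurrent Neural Network), CNN (Convolutional Neural Network), Conv-LSTM, etc.) has been expanding abroad, and studies using ConvLSTM and similar models have also been carried out in Korea. Nowcasting models based on CNN deep neural networks generally tended to outperform extrapolation-based prediction, but spatial smoothing becomes more pronounced as the lead time increases. This makes distinct high-intensity precipitation features hard to predict and distorts the small-scale weather phenomena that are important for improving forecast accuracy. To address this limitation, this study applies a nowcasting technique based on a Generative Adversarial Network (GAN). A GAN is a neural network in which two networks, a generator and a discriminator, learn through adversarial competition with each other; it learns the probability distribution of the data and can easily generate samples from the learned distribution. In this study, we collect the Ministry of Environment's large-scale rainfall radar composite fields from 2017 to 2021 and train on rainfall events to optimize the network. The trained network is then used to produce rainfall forecasts, whose accuracy is to be compared quantitatively with the radar nowcasts produced by the Korea Meteorological Administration and the Ministry of Environment.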

Electroencephalography-based imagined speech recognition using deep long short-term memory network

  • Agarwal, Prabhakar;Kumar, Sandeep
    • ETRI Journal / v.44 no.4 / pp.672-685 / 2022
  • This article proposes a subject-independent application of brain-computer interfacing (BCI). A 32-channel electroencephalography (EEG) device is used to measure imagined speech, or speech imagery (SI), of four words (sos, stop, medicine, washroom) and one phrase (come-here) across 13 subjects. A deep long short-term memory (LSTM) network is adopted to recognize these signals in seven EEG frequency bands individually in nine major regions of the brain. The results show a maximum accuracy of 73.56% and a network prediction time (NPT) of 0.14 s, which are superior to other state-of-the-art techniques in the literature. Our analysis reveals that the alpha band can recognize SI better than other EEG frequencies. To reinforce our findings, the work is compared with models based on the gated recurrent unit (GRU), the convolutional neural network (CNN), and six conventional classifiers. The results show that the LSTM model has 46.86% higher average accuracy in the alpha band and 74.54% lower average NPT than the CNN. The maximum accuracy of the GRU was 8.34% lower than that of the LSTM network. Deep networks performed better than traditional classifiers.

A SE Approach for Real-Time NPP Response Prediction under CEA Withdrawal Accident Conditions

  • Felix Isuwa, Wapachi;Aya, Diab
    • Journal of the Korean Society of Systems Engineering / v.18 no.2 / pp.75-93 / 2022
  • A data-driven machine learning (ML) meta-model is proposed as a surrogate model to reduce the excessive computational cost of the physics-based model and to facilitate real-time prediction of a nuclear power plant's (NPP's) transient response. To forecast the transient response, three ML meta-models based on recurrent neural networks (RNNs) are developed: Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and a sequential combination of a Convolutional Neural Network (CNN) and LSTM. The chosen accident scenario is a control element assembly (CEA) withdrawal at power concurrent with a Loss Of Offsite Power (LOOP). The transient response was obtained using the best-estimate thermal hydraulics code MARS-KS and cross-validated against the Design Control Document (DCD). The DAKOTA software is loosely coupled with the MARS-KS code via a Python interface to perform the Best Estimate Plus Uncertainty Quantification (BEPU) analysis and to generate a time series database of the system response for training, testing, and validating the ML meta-models. Key uncertain parameters, identified as required by the CASU methodology, were propagated using non-parametric Monte Carlo (MC) random propagation and the Latin Hypercube Sampling technique until a statistically significant database (181 samples), as required by Wilks' fifth-order formula, was achieved with 95% probability and a 95% confidence level. The three ML RNN models were built and optimized with the help of the Talos tool and demonstrated excellent performance in forecasting the most probable NPP transient response. This research was guided by the Systems Engineering (SE) approach for the systematic and efficient planning and execution of the research.
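The figure of 181 samples is consistent with the one-sided form of Wilks' formula, which asks for the smallest sample size N such that, with the stated confidence, at least p of the N samples fall above the chosen coverage quantile. The sketch below evaluates that binomial condition directly. Reading the abstract's "95% probability" as the coverage is an assumption, and the function name `wilks_min_samples` is hypothetical.

```python
from math import comb


def wilks_min_samples(order, coverage=0.95, confidence=0.95):
    """Smallest N for a one-sided tolerance limit of the given order.

    Finds the smallest N such that the probability of at least `order`
    samples exceeding the `coverage` quantile is >= `confidence`.
    """
    n = order
    while True:
        # Probability that fewer than `order` of n samples exceed the quantile.
        miss = sum(
            comb(n, k) * (1 - coverage) ** k * coverage ** (n - k)
            for k in range(order)
        )
        if 1 - miss >= confidence:
            return n
        n += 1


for p in range(1, 6):
    print(p, wilks_min_samples(p))
```

For the 95%/95% case the first-order result is the familiar 59 runs, and the fifth-order result matches the 181 samples quoted in the abstract.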

Restoring Motion Capture Data for Pose Estimation (자세 추정을 위한 모션 캡처 데이터 복원)

  • Youn, Yeo-su;Park, Hyun-jun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.5-7 / 2021
  • Motion capture data files used for pose estimation may contain inaccurate data depending on the surrounding environment and the degree of movement, so correction is necessary. In the past, inaccurate data were restored manually in post-processing, but recently various neural networks such as LSTM and R-CNN have been used as automated methods. However, because neural-network-based restoration requires substantial computing resources, this paper proposes a method that uses fewer computing resources while maintaining the data restoration rate relative to neural-network-based methods. The proposed method automatically restores inaccurate motion capture data using posture measurement data (c3d). In the experiment, data restoration rates ranged from 89% to 99% depending on the degree of inaccuracy of the data.

CRNN-Based Korean Phoneme Recognition Model with CTC Algorithm (CTC를 적용한 CRNN 기반 한국어 음소인식 모델 연구)

  • Hong, Yoonseok;Ki, Kyungseo;Gweon, Gahgene
    • KIPS Transactions on Software and Data Engineering / v.8 no.3 / pp.115-122 / 2019
  • For Korean phoneme recognition, the hidden Markov model-Gaussian mixture model (HMM-GMM) or hybrid models that combine an artificial neural network with an HMM have mainly been used. However, these approaches are limited in that they require force-aligned corpus training data manually annotated by experts. Recently, researchers have used neural-network-based phoneme recognition models that combine a recurrent neural network (RNN)-based structure with the connectionist temporal classification (CTC) algorithm to overcome the problem of obtaining manually annotated training data. Yet, in terms of implementation, these RNN-based models pose another difficulty: the amount of data required grows as the structure becomes more sophisticated. This need for large amounts of data is particularly problematic for the Korean language, which lacks refined corpora. In this study, we apply the CTC algorithm, which does not require force alignment, to create a Korean phoneme recognition model. Specifically, the phoneme recognition model is based on a convolutional neural network (CNN), which requires a relatively small amount of data and can be trained faster than RNN-based models. We present the results of two experiments and the resulting best-performing phoneme recognition model, which distinguishes 49 Korean phonemes. The best-performing model combines a CNN with a 3hop Bidirectional LSTM and reaches a final Phoneme Error Rate (PER) of 3.26. This PER is a considerable improvement over existing Korean phoneme recognition models, which report PERs ranging from 10 to 12.
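The abstract does not define how PER is computed. A common definition, assumed here, is the Levenshtein edit distance between the reference and recognized phoneme sequences (substitutions, deletions, and insertions) divided by the reference length, expressed in percent. The sketch below follows that definition; the phoneme sequences and function names are hypothetical examples, not data from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference phoneme
                cur[j - 1] + 1,          # insertion of a spurious phoneme
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """PER in percent: edit distance divided by reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substitution (g -> k) and one deletion (l).
reference = ["h", "a", "n", "g", "u", "l"]
recognized = ["h", "a", "n", "k", "u"]
print(f"PER = {phoneme_error_rate(reference, recognized):.2f}%")
```

Because insertions count as errors, a PER computed this way can exceed 100% when the recognizer emits many spurious phonemes.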

A Study on Image Generation from Sentence Embedding Applying Self-Attention (Self-Attention을 적용한 문장 임베딩으로부터 이미지 생성 연구)

  • Yu, Kyungho;No, Juhyeon;Hong, Taekeun;Kim, Hyeong-Ju;Kim, Pankoo
    • Smart Media Journal / v.10 no.1 / pp.63-69 / 2021
  • When a person reads and understands a sentence, the person does so by recalling the main words of the sentence as images. Text-to-image is what allows computers to carry out this associative process. Previous deep-learning-based text-to-image models extract text features using a Convolutional Neural Network (CNN)-Long Short-Term Memory (LSTM) and a bi-directional LSTM, and generate an image by feeding these features to a GAN. These models use basic embeddings for text feature extraction and take a long time to train because images are generated using several modules. Therefore, in this research, we propose a method that extracts features for sentence embedding using the attention mechanism, which has improved performance in the natural language processing field, and generates an image by feeding the extracted features into the GAN. In the experiment, the inception score was higher than that of the model used in the previous study, and on visual inspection the generated images expressed the features of the input sentence well. In addition, even when a long sentence was input, the generated image expressed the sentence well.

A Deep Neural Network for Activity Recognition of Multi-object (다중 객체의 행동 인식을 위한 심층신경망)

  • Kim, Seunghyun;Kim, Do-Yeon
    • Proceedings of the Korea Information Processing Society Conference / 2016.10a / pp.597-598 / 2016
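  • Existing deep neural networks for activity recognition have contributed greatly to modeling activity patterns and improving recognition performance. However, because these networks treat the entire video as a single recognition target, they are limited in recognizing the individual activities of multiple objects. This paper therefore proposes a method for multi-object activity recognition using RC-LSTM, a deep neural network that fuses R-CNN and LSTM.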