• Title/Summary/Keyword: Encoder Model

Abnormal Flight Detection Technique of UAV based on U-Net (U-Net을 이용한 무인항공기 비정상 비행 탐지 기법 연구)

  • Myeong Jae Song;Eun Ju Choi;Byoung Soo Kim;Yong Ho Moon
    • Journal of Aerospace System Engineering, v.18 no.3, pp.41-47, 2024
  • As the practical application and commercialization of unmanned aerial vehicles (UAVs) are pursued, interest in ensuring UAV safety is increasing. Because UAV accidents can result in property damage and loss of life, it is important to develop technology to prevent them. For this reason, a technique to detect the abnormal flight state of UAVs has been developed based on the AutoEncoder model. However, the existing detection technique is limited in terms of performance and real-time processing. In this paper, we propose a U-Net based abnormal flight detection technique. In the proposed technique, abnormal flight is detected based on the rate of increase of the Mahalanobis distance of the reconstruction error obtained from the U-Net model. Simulation experiments show that the proposed technique has superior detection performance compared to the existing technique and can operate in real time in an on-board environment.
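
The detection rule described in this abstract (Mahalanobis distance of the reconstruction error, monitored for how fast it grows) can be illustrated with a minimal sketch. This is not the authors' implementation: the diagonal-covariance simplification, the function names, the toy error vectors, and the `rate_threshold` value are all assumptions made for illustration, and the U-Net that would produce the reconstruction errors is omitted.

```python
import math

def fit_diagonal_stats(errors):
    """Per-feature mean and variance of reconstruction errors from normal flights."""
    n = len(errors)
    dims = len(errors[0])
    means = [sum(e[d] for e in errors) / n for d in range(dims)]
    variances = [sum((e[d] - means[d]) ** 2 for e in errors) / n for d in range(dims)]
    return means, variances

def mahalanobis_diag(error, means, variances, eps=1e-12):
    """Mahalanobis distance assuming a diagonal covariance (a simplification)."""
    return math.sqrt(
        sum((x - m) ** 2 / (v + eps) for x, m, v in zip(error, means, variances))
    )

def detect_abnormal(errors, means, variances, rate_threshold):
    """Return the first index where the distance rises by more than rate_threshold."""
    prev = None
    for i, e in enumerate(errors):
        d = mahalanobis_diag(e, means, variances)
        if prev is not None and d - prev > rate_threshold:
            return i
        prev = d
    return None

# Hypothetical reconstruction errors (two features per time step).
normal = [[0.10, 0.20], [0.12, 0.18], [0.08, 0.22], [0.11, 0.19]]
means, variances = fit_diagonal_stats(normal)

stream = [[0.10, 0.20], [0.11, 0.21], [0.40, 0.60], [0.90, 1.20]]
first_alarm = detect_abnormal(stream, means, variances, rate_threshold=5.0)
no_alarm = detect_abnormal(normal, means, variances, rate_threshold=5.0)
```

Watching the step-to-step increase rather than the raw distance means a slowly drifting but stable error level does not trigger an alarm, while a sudden jump does.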

Abnormal State Detection using Memory-augmented Autoencoder technique in Frequency-Time Domain

  • Haoyi Zhong;Yongjiang Zhao;Chang Gyoon Lim
    • KSII Transactions on Internet and Information Systems (TIIS), v.18 no.2, pp.348-369, 2024
  • With the advancement of Industry 4.0 and Industrial Internet of Things (IIoT), manufacturing increasingly seeks automation and intelligence. Temperature and vibration monitoring are essential for machinery health. Traditional abnormal state detection methodologies often overlook the intricate frequency characteristics inherent in vibration time series and are susceptible to erroneously reconstructing temperature abnormalities due to the highly similar waveforms. To address these limitations, we introduce synergistic, end-to-end, unsupervised Frequency-Time Domain Memory-Enhanced Autoencoders (FTD-MAE) capable of identifying abnormalities in both temperature and vibration datasets. This model is adept at accommodating time series with variable frequency complexities and mitigates the risk of overgeneralization. Initially, the frequency domain encoder processes the spectrogram generated through Short-Time Fourier Transform (STFT), while the time domain encoder interprets the raw time series. This results in two disparate sets of latent representations. Subsequently, these are subjected to a memory mechanism and a limiting function, which numerically constrain each memory term. These processed terms are then amalgamated to create two unified, novel representations that the decoder leverages to produce reconstructed samples. Furthermore, the model employs Spectral Entropy to dynamically assess the frequency complexity of the time series, which, in turn, calibrates the weightage attributed to the loss functions of the individual branches, thereby generating definitive abnormal scores. Through extensive experiments, FTD-MAE achieved an average ACC and F1 of 0.9826 and 0.9808 on the CMHS and CWRU datasets, respectively. Compared to the best representative model, the ACC increased by 0.2114 and the F1 by 0.1876.
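
Spectral entropy, which this abstract uses to gauge the frequency complexity of a time series, can be sketched as the Shannon entropy of the normalized power spectrum. The sketch below is an illustration only: it uses a naive DFT over the whole signal rather than an STFT, the function names and test signals are assumptions, and the abstract does not state how the entropy is mapped to the loss weights of the two branches, so that step is left out.

```python
import cmath
import math

def power_spectrum(signal):
    """One-sided power spectrum via a naive DFT (bins 1..N/2, DC excluded)."""
    n = len(signal)
    powers = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        powers.append(abs(coeff) ** 2)
    return powers

def spectral_entropy(signal):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    powers = power_spectrum(signal)
    total = sum(powers)
    if total == 0:
        return 0.0
    probs = [p / total for p in powers]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))

n = 64
# A single tone concentrates power in one bin, so its entropy is near 0.
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
# A unit impulse has a flat spectrum, so its entropy is near 1.
impulse = [1.0] + [0.0] * (n - 1)

tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```

A value near 0 indicates a spectrally simple signal and a value near 1 a spectrally complex one, which is the kind of scalar that could steer weight between a time-domain branch and a frequency-domain branch.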

Seismic Data Processing Using BERT-Based Pretraining: Comparison of Shotgather Arrays (BERT 기반 사전학습을 이용한 탄성파 자료처리: 송신원 모음 배열 비교)

  • Youngjae Shin
    • Geophysics and Geophysical Exploration, v.27 no.3, pp.171-180, 2024
  • The processing of seismic data involves analyzing earthquake wave data to understand the internal structure and characteristics of the Earth, which requires high computational power. Recently, machine learning (ML) techniques have been introduced to address these challenges and have been utilized in various tasks such as noise reduction and velocity model construction. However, most studies have focused on specific seismic data processing tasks, limiting the full utilization of similar features and structures inherent in the datasets. In this study, we compared the efficacy of using receiver-wise time-series data ("receiver array") and synchronized receiver signals ("time array") from shotgathers for pretraining a Bidirectional Encoder Representations from Transformers (BERT) model. To this end, shotgather data generated from a synthetic model containing faults was used to perform noise reduction, velocity prediction, and fault detection tasks. In the task of random noise reduction, both the receiver and time arrays showed good performance. However, for tasks requiring the identification of spatial distributions, such as velocity estimation and fault detection, the results from the time array were superior.
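
One plausible reading of the two input layouts compared in this abstract is that they are transposes of the same shot gather: the "receiver array" feeds one token per receiver (its full time series), while the "time array" feeds one token per time step (all receivers' samples at that instant). The sketch below illustrates that reading only; the `gather[receiver][time]` indexing, the function names, and the toy values are assumptions, and the paper's actual tokenization may differ.

```python
def receiver_array(gather):
    """One token per receiver: each token is that receiver's full time series."""
    return [list(trace) for trace in gather]

def time_array(gather):
    """One token per time step: each token holds every receiver's sample at that instant."""
    return [list(samples) for samples in zip(*gather)]

# Toy shot gather: 3 receivers x 4 time samples, indexed as gather[receiver][time].
gather = [
    [0.0, 0.1, 0.2, 0.3],
    [1.0, 1.1, 1.2, 1.3],
    [2.0, 2.1, 2.2, 2.3],
]

receiver_tokens = receiver_array(gather)  # 3 tokens of length 4
time_tokens = time_array(gather)          # 4 tokens of length 3
```

Under this reading, each time-array token spans the whole receiver line, which is consistent with the abstract's finding that the time array does better on tasks that depend on spatial distribution.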

Visual analysis of attention-based end-to-end speech recognition (어텐션 기반 엔드투엔드 음성인식 시각화 분석)

  • Lim, Seongmin;Goo, Jahyun;Kim, Hoirin
    • Phonetics and Speech Sciences, v.11 no.1, pp.41-49, 2019
  • An end-to-end speech recognition model consisting of a single integrated neural network was recently proposed. The end-to-end model does not need several training steps, and its structure is easy to understand. However, it is difficult to understand how the model recognizes speech internally. In this paper, we visualized and analyzed the attention-based end-to-end model to elucidate its internal mechanisms. We compared the acoustic model of the BLSTM-HMM hybrid model with the encoder of the end-to-end model, and visualized them using t-SNE to examine the differences between neural network layers. As a result, we were able to delineate the difference between the acoustic model and the end-to-end model encoder. Additionally, we analyzed the decoder of the end-to-end model from a language model perspective. Finally, we found that improving the decoder of the end-to-end model is necessary to achieve higher performance.

Neural Networks Based Identification and Control of a Large Flexible Antenna

  • Sasaki, Minoru;Murase, Takuya;Ukita, Nobuharu
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집), 2004.08a, pp.1711-1716, 2004
  • This paper presents the identification and control of a 10-m antenna using accelerometer and angle encoder data. Artificial neural networks (ANNs) can be used effectively for the identification and control of nonlinear dynamical systems such as a large flexible antenna. Some identification results are shown and compared with the results of the conventional prediction error method. We also use a neural network inverse model to control the large flexible antenna. In the neural network inverse model, a neural network is trained, using supervised learning, to develop an inverse model of the antenna. The network input is the process output, and the network output is the corresponding process input. The control results validate the ANN approach for identification and control of the 10-m flexible antenna.

Sensorless Speed Control of Direct Current Motor using Current Error Compensation (전류오차보상에 의한 직류전동기의 센서리스 속도제어)

  • 함형철;오세진;김종수
    • Journal of Advanced Marine Engineering and Technology, v.27 no.7, pp.930-936, 2003
  • A new method of direct current motor drive, which requires neither a shaft encoder nor a speed estimator, is presented. The proposed scheme is based on reducing the current gap between a numerical model and the actual motor. By supplying the identical instantaneous voltage to both the model and the motor in the direction that reduces the current difference, the rotor speed approaches the model speed, that is, the reference value. The performance of direct current motor drives without a speed sensor is generally poor at very low speed. However, in this system, it is possible to obtain good speed performance in the low speed range.

GRAYSCALE IMAGE COLORIZATION USING A CONVOLUTIONAL NEURAL NETWORK

  • JWA, MINJE;KANG, MYUNGJOO
    • Journal of the Korean Society for Industrial and Applied Mathematics, v.25 no.2, pp.26-38, 2021
  • Image colorization refers to adding plausible colors to a grayscale image or video. It has been used in many modern fields, including restoring old photographs and reducing the time spent painting cartoons. In this paper, a method is proposed for colorizing grayscale images using a convolutional neural network. We propose an encoder-decoder model, adapting FusionNet to our purpose. A loss function suited to colorization is defined in place of the MSE loss function. The proposed model was verified using the ImageNet dataset. We quantitatively compared several colorization models with ours, using the peak signal-to-noise ratio (PSNR) metric. In addition, to qualitatively evaluate the results, our model was applied to images in the test dataset and compared with the outputs of various other models. Finally, we applied our model to a selection of old black and white photographs.
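
The PSNR metric used for the quantitative comparison in this abstract follows the standard definition PSNR = 10 · log10(MAX² / MSE). The sketch below is a generic illustration of that formula, not the authors' evaluation code: the function name, the single-channel nested-list image format, and the 8-bit peak value of 255 are assumptions.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_ref = [p for row in reference for p in row]
    flat_test = [p for row in test for p in row]
    if len(flat_ref) != len(flat_test):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_test)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy 2x2 single-channel images; every pixel differs by 1, so MSE = 1.
original = [[10, 20], [30, 40]]
colorized = [[11, 21], [31, 41]]

score = psnr(original, colorized)
identical = psnr(original, original)
```

Higher PSNR means the output is numerically closer to the reference. For colorization, a plausible but different color choice can score poorly, which is one reason the abstract pairs PSNR with a qualitative comparison.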

Single Image-based Depth Estimation Network using Attention Model (Attention Model 을 이용한 단안 영상 기반 깊이 추정 네트워크)

  • Jung, Geunho;Yoon, Sang Min
    • Proceedings of the Korean Society of Broadcast Engineers Conference, 2020.07a, pp.14-17, 2020
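  • Depth estimation from a monocular image estimates the 3D distance to objects from a 2D image captured at a given viewpoint. Recently, deep learning-based models that extract feature maps useful for depth estimation from monocular RGB images, and then use them to estimate depth, have surpassed the performance of existing methods, and related research is actively under way. In addition, studies have been introduced that improve overall network performance by emphasizing specific channels or spatial regions of the feature maps, as in the Attention Model. In this paper, we propose an AutoEncoder-based depth estimation network with an added Attention Model to emphasize the feature maps used for depth estimation, and we evaluate and analyze the network's depth estimation performance according to where the Attention Model is applied.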

BERT-Based Logits Ensemble Model for Gender Bias and Hate Speech Detection

  • Sanggeon Yun;Seungshik Kang;Hyeokman Kim
    • Journal of Information Processing Systems, v.19 no.5, pp.641-651, 2023
  • Malicious hate speech and gender bias comments are common in online communities, causing social problems in our society. Gender bias and hate speech detection has been investigated. However, it is difficult because there are diverse ways to express them in words. To solve this problem, we attempted to detect malicious comments in a Korean hate speech dataset constructed in 2020. We explored bidirectional encoder representations from transformers (BERT)-based deep learning models utilizing hyperparameter tuning, data sampling, and logits ensembles with a label distribution. We evaluated our model in Kaggle competitions for gender bias, general bias, and hate speech detection. For gender bias detection, an F1-score of 0.7711 was achieved using an ensemble of the Soongsil-BERT and KcELECTRA models. The general bias task included the gender bias task, and the ensemble model achieved the best F1-score of 0.7166.
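
A logits ensemble of the kind mentioned in this abstract can be sketched as a weighted average of the raw per-class logits from several models, followed by a softmax. The sketch below is an assumption-laden illustration: the equal weights, the two-class toy logits, and the function names are hypothetical, and the abstract's use of a label distribution is not specified in enough detail to reproduce, so it is omitted here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_logits(per_model_logits, weights=None):
    """Weighted average of per-class logits across models (equal weights by default)."""
    n_models = len(per_model_logits)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    n_classes = len(per_model_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, per_model_logits))
        for c in range(n_classes)
    ]

# Hypothetical logits from two classifiers for one comment (classes: 0 = none, 1 = biased).
model_a = [2.0, 0.5]
model_b = [0.2, 1.1]

combined = ensemble_logits([model_a, model_b])
probs = softmax(combined)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

Averaging logits instead of hard labels lets a confident model outweigh an uncertain one, as in this toy case where the two models disagree on the class.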

A Study on Fine-Tuning and Transfer Learning to Construct Binary Sentiment Classification Model in Korean Text (한글 텍스트 감정 이진 분류 모델 생성을 위한 미세 조정과 전이학습에 관한 연구)

  • JongSoo Kim
    • Journal of Korea Society of Industrial Information Systems, v.28 no.5, pp.15-30, 2023
  • Recently, generative models based on the Transformer architecture, such as ChatGPT, have been gaining significant attention. The Transformer architecture has been applied to various neural network models, including Google's BERT (Bidirectional Encoder Representations from Transformers) sentence generation model. In this paper, a method is proposed to create a binary text classification model that determines whether a comment in a Korean movie review is positive or negative. To accomplish this, a pre-trained multilingual BERT sentence generation model is fine-tuned and transfer learned using a new Korean training dataset. The pre-trained model is a BERT-Base model for multilingual sentence generation with 104 languages, 12 layers, 768 hidden units, 12 attention heads, and 110M parameters. To change the pre-trained BERT-Base model into a text classification model, the input and output layers were fine-tuned, resulting in a new model with 178 million parameters. Using the fine-tuned model, with a maximum word count of 128, a batch size of 16, and 5 epochs, transfer learning is conducted with 10,000 training samples and 5,000 testing samples. The resulting binary sentiment classification model for Korean movie reviews has an accuracy of 0.9582, a loss of 0.1177, and an F1 score of 0.81. Transfer learning with a dataset five times larger produced a model with an accuracy of 0.9562, a loss of 0.1202, and an F1 score of 0.86.