• Title/Summary/Keyword: Encoder Model

Search Result 354, Processing Time 0.022 seconds

Anomaly Detection In Real Power Plant Vibration Data by MSCRED Base Model Improved By Subset Sampling Validation (Subset 샘플링 검증 기법을 활용한 MSCRED 모델 기반 발전소 진동 데이터의 이상 진단)

  • Hong, Su-Woong;Kwon, Jang-Woo
    • Journal of Convergence for Information Technology
    • /
    • v.12 no.1
    • /
    • pp.31-38
    • /
    • 2022
  • This paper applies an expert independent unsupervised neural network learning-based multivariate time series data analysis model, MSCRED(Multi-Scale Convolutional Recurrent Encoder-Decoder), and to overcome the limitation, because the MCRED is based on Auto-encoder model, that train data must not to be contaminated, by using learning data sampling technique, called Subset Sampling Validation. By using the vibration data of power plant equipment that has been labeled, the classification performance of MSCRED is evaluated with the Anomaly Score in many cases, 1) the abnormal data is mixed with the training data 2) when the abnormal data is removed from the training data in case 1. Through this, this paper presents an expert-independent anomaly diagnosis framework that is strong against error data, and presents a concise and accurate solution in various fields of multivariate time series data.

Unsupervised Learning-Based Pipe Leak Detection using Deep Auto-Encoder

  • Yeo, Doyeob;Bae, Ji-Hoon;Lee, Jae-Cheol
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.9
    • /
    • pp.21-27
    • /
    • 2019
  • In this paper, we propose a deep auto-encoder-based pipe leak detection (PLD) technique from time-series acoustic data collected by microphone sensor nodes. The key idea of the proposed technique is to learn representative features of the leak-free state using leak-free time-series acoustic data and the deep auto-encoder. The proposed technique can be used to create a PLD model that detects leaks in the pipeline in an unsupervised learning manner. This means that we only use leak-free data without labeling while training the deep auto-encoder. In addition, when compared to the previous supervised learning-based PLD method that uses image features, this technique does not require complex preprocessing of time-series acoustic data owing to the unsupervised feature extraction scheme. The experimental results show that the proposed PLD method using the deep auto-encoder can provide reliable PLD accuracy even considering unsupervised learning-based feature extraction.

dynamic localization of a mobile robot using a rotating sonar and a map (회전 초음파 센서와 지도를 이용한 이동 로보트의 동적 절대 위치 추정)

  • 양해용;정학영;이장규
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 1997.10a
    • /
    • pp.544-547
    • /
    • 1997
  • In this paper, we propose a dynamic localization method using a rotating sonar and a map. The proposed method is implemented by using extended Kalman filter. The state equation is based on the encoder propagation model and the encoder error model, and the measurement equation is a map-based measurement equation using a rotating sonar sensor. By utilizing sonar beam characteristics, map-based measurements are updated while AMR is moving continuously. By modeling and estimating systematic errors of a differential encoder, the position is successfully estimated even the interval of the map-based measurement. Monte-Carlo simulation shows that the proposed global position estimator has the performance of a few millimeter order in position error and of a few tenth degrees in heading error and of compensating systematic errors of the differential encoder well.

  • PDF

Research on Open Source Encoding Technology for MPEG Unified Speech and Audio Coding (MPEG 통합 음성/오디오 코덱을 위한 오픈 소스 부호화 기술에 관한 연구)

  • Song, Jeongook;Lee, Joonil;Kang, Hong-Goo
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.1
    • /
    • pp.86-96
    • /
    • 2013
  • Unified Speech and Audio Coding (USAC) is the speech/audio codec with the best quality, approved on Final Draft International Standard (FDIS) at MPEG meeting in 2011. Since MPEG conventionally standardizes only the decoder, it is not easy to study on the encoder technologies. Furthermore, Reference Model(RM) shows extremely poor performance. To solve these problems, the open source project(JAME) proposes the methods to make the improved performance of main encoder technologies in USAC. Especially, this paper introduces the encoder modules: the signal classifier for selective operation between two coders, the psychoacoustic model in frequency domain, and window transition technology. Finally, the results of verification test for FDIS and the performance of Common Encoder are appended.

A Korean Multi-speaker Text-to-Speech System Using d-vector (d-vector를 이용한 한국어 다화자 TTS 시스템)

  • Kim, Kwang Hyeon;Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology
    • /
    • v.8 no.3
    • /
    • pp.469-475
    • /
    • 2022
  • To train the model of the deep learning-based single-speaker TTS system, a speech DB of tens of hours and a lot of training time are required. This is an inefficient method in terms of time and cost to train multi-speaker or personalized TTS models. The voice cloning method uses a speaker encoder model to make the TTS model of a new speaker. Through the trained speaker encoder model, a speaker embedding vector representing the timbre of the new speaker is created from the small speech data of the new speaker that is not used for training. In this paper, we propose a multi-speaker TTS system to which voice cloning is applied. The proposed TTS system consists of a speaker encoder, synthesizer and vocoder. The speaker encoder applies the d-vector technique used in the speaker recognition field. The timbre of the new speaker is expressed by adding the d-vector derived from the trained speaker encoder as an input to the synthesizer. It can be seen that the performance of the proposed TTS system is excellent from the experimental results derived by the MOS and timbre similarity listening tests.

Anomaly-based Alzheimer's disease detection using entropy-based probability Positron Emission Tomography images

  • Husnu Baris Baydargil;Jangsik Park;Ibrahim Furkan Ince
    • ETRI Journal
    • /
    • v.46 no.3
    • /
    • pp.513-525
    • /
    • 2024
  • Deep neural networks trained on labeled medical data face major challenges owing to the economic costs of data acquisition through expensive medical imaging devices, expert labor for data annotation, and large datasets to achieve optimal model performance. The heterogeneity of diseases, such as Alzheimer's disease, further complicates deep learning because the test cases may substantially differ from the training data, possibly increasing the rate of false positives. We propose a reconstruction-based self-supervised anomaly detection model to overcome these challenges. It has a dual-subnetwork encoder that enhances feature encoding augmented by skip connections to the decoder for improving the gradient flow. The novel encoder captures local and global features to improve image reconstruction. In addition, we introduce an entropy-based image conversion method. Extensive evaluations show that the proposed model outperforms benchmark models in anomaly detection and classification using an encoder. The supervised and unsupervised models show improved performances when trained with data preprocessed using the proposed image conversion method.

Fixed-point Processing Optimization of MPEG Psychoacoustic Model-II Algorithm for ASIC Implementation (MPEG 심리음향 모델-ll 알고리듬의 ASIC 구현을 위한 고정 소수점 연산 최적화)

  • Lee Keun-Sup;Park Young-Cheol;Youn Dae Hee
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.11C
    • /
    • pp.1491-1497
    • /
    • 2004
  • The psychoacoustic model in MPEG audio layer-III (MP3) encoder is optimized for the fixed-point processing. The optimization process consists of determining the data word length of arithmetic unit and the algorithm for transcendental functions that are often used in the psychoacoustic model. In order to determine the data word length, we defined a statistical model expressing the relation between the fixed-point operation errors of the psychoacoustic model and the probability of alteration of the allocated bits doe to these errors. Based on the simulations using this model, we chose a 24-bit data path and constructed a 24-bit fixed-point MP3 encoder. Sound quality tests using the constructed fixed-point encoder showed a mean degradation of -0.2 on ITU-R 5-point audio impairment scale.

Natural Language Generation Using SC-GRU Encoder-Decoder Model (SC-GRU encoder-decoder 모델을 이용한 자연어생성)

  • Kim, Geonyeong;Lee, Changki
    • Annual Conference on Human and Language Technology
    • /
    • 2017.10a
    • /
    • pp.167-171
    • /
    • 2017
  • 자연어 생성은 특정한 조건들을 만족하는 문장을 생성하는 연구로, 이러한 조건들은 주로 표와 같은 축약되고 구조화된 의미 표현으로 주어지며 사용자가 자연어로 생성된 문장을 받아야 하는 어떤 분야에서든 응용이 가능하다. 본 논문에서는 SC(Semantically Conditioned)-GRU기반 encoder-decoder모델을 이용한 자연어 생성 모델을 제안한다. 본 논문에서 제안한 모델이 SF Hotel 데이터에서는 0.8645 BLEU의 성능을, SF Restaurant 데이터에서는 0.7570 BLEU의 성능을 보였다.

  • PDF

Natural Language Generation Using SC-GRU Encoder-Decoder Model (SC-GRU encoder-decoder 모델을 이용한 자연어생성)

  • Kim, Geonyeong;Lee, Changki
    • 한국어정보학회:학술대회논문집
    • /
    • 2017.10a
    • /
    • pp.167-171
    • /
    • 2017
  • 자연어 생성은 특정한 조건들을 만족하는 문장을 생성하는 연구로, 이러한 조건들은 주로 표와 같은 축약되고 구조화된 의미 표현으로 주어지며 사용자가 자연어로 생성된 문장을 받아야 하는 어떤 분야에서든 응용이 가능하다. 본 논문에서는 SC(Semantically Conditioned)-GRU기반 encoder-decoder모델을 이용한 자연어 생성 모델을 제안한다. 본 논문에서 제안한 모델이 SF Hotel 데이터에서는 0.8645 BLEU의 성능을, SF Restaurant 데이터에서는 0.7570 BLEU의 성능을 보였다.

  • PDF

Forecasting Crop Yield Using Encoder-Decoder Model with Attention (Attention 기반 Encoder-Decoder 모델을 활용한작물의 생산량 예측)

  • Kang, Sooram;Cho, Kyungchul;Na, MyungHwan
    • Journal of Korean Society for Quality Management
    • /
    • v.49 no.4
    • /
    • pp.569-579
    • /
    • 2021
  • Purpose: The purpose of this study is the time series analysis for predicting the yield of crops applicable to each farm using environmental variables measured by smart farms cultivating tomato. In addition, it is intended to confirm the influence of environmental variables using a deep learning model that can be explained to some extent. Methods: A time series analysis was performed to predict production using environmental variables measured at 75 smart farms cultivating tomato in two periods. An LSTM-based encoder-decoder model was used for cases of several farms with similar length. In particular, Dual Attention Mechanism was applied to use environmental variables as exogenous variables and to confirm their influence. Results: As a result of the analysis, Dual Attention LSTM with a window size of 12 weeks showed the best predictive power. It was verified that the environmental variables has a similar effect on prediction through wieghtss extracted from the prediction model, and it was also verified that the previous time point has a greater effect than the time point close to the prediction point. Conclusion: It is expected that it will be possible to attempt various crops as a model that can be explained by supplementing the shortcomings of general deep learning model.