• Title/Summary/Keyword: Auto-encoder model

Side Information Extrapolation Using Motion-aligned Auto Regressive Model for Compressed Sensing based Wyner-Ziv Codec

  • Li, Ran;Gan, Zongliang;Cui, Ziguan;Wu, Minghu;Zhu, Xiuchang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.2 / pp.366-385 / 2013
  • In this paper, we propose a compressed sensing (CS) based Wyner-Ziv (WZ) codec that uses motion-aligned auto-regressive (MAAR) model based side information (SI) extrapolation to improve the compression performance of low-delay distributed video coding (DVC). In the CS based WZ codec, the WZ frame is divided into small blocks, CS measurements of each block are acquired at the encoder, and a specific CS reconstruction algorithm is proposed that uses these measurements at the decoder to correct errors in the SI. To generate high-quality SI, the MAAR model is introduced to correct the inaccurate motion field of the auto-regressive (AR) model, and Tikhonov regularization of the MAAR coefficients together with overlapped-block interpolation is applied to reduce blocking artifacts and over-fitting errors. Simulation experiments show that the proposed CS based WZ codec with MAAR based SI generation achieves better results than other SI extrapolation methods.
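The role of Tikhonov regularization in fitting predictor coefficients can be sketched with a toy example. This is not the paper's MAAR model: it assumes a hypothetical linear predictor with only two coefficients, and the function name `ridge_fit_2`, the sample data, and the value of `lam` are all illustrative. It solves the regularized normal equations (XᵀX + λI)a = Xᵀy in closed form for the 2x2 case.

```python
def ridge_fit_2(rows, targets, lam):
    """Fit y ~ a1*x1 + a2*x2 with a Tikhonov (ridge) penalty lam*(a1^2 + a2^2).

    rows    : list of (x1, x2) predictor pairs (hypothetical neighbor pixels)
    targets : list of y values to predict
    lam     : regularization weight; lam > 0 keeps the system well conditioned
    """
    # Entries of X^T X + lam*I (a symmetric 2x2 matrix)
    s11 = sum(x1 * x1 for x1, _ in rows) + lam
    s22 = sum(x2 * x2 for _, x2 in rows) + lam
    s12 = sum(x1 * x2 for x1, x2 in rows)
    # Entries of X^T y
    b1 = sum(x1 * y for (x1, _), y in zip(rows, targets))
    b2 = sum(x2 * y for (_, x2), y in zip(rows, targets))
    # Solve the 2x2 system with Cramer's rule
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("singular system; use lam > 0")
    return ((b1 * s22 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det)


# Toy data generated by y = 2*x1 + 3*x2
rows = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
targets = [2.0, 3.0, 5.0]

plain = ridge_fit_2(rows, targets, lam=0.0)   # ordinary least squares
shrunk = ridge_fit_2(rows, targets, lam=1.0)  # coefficients pulled toward zero
```

With `lam=0.0` the fit recovers the generating coefficients; with `lam=1.0` both coefficients shrink, which is the mechanism that limits over-fitting when few or noisy samples are available.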

Development of de-noised image reconstruction technique using Convolutional AutoEncoder for fast monitoring of fuel assemblies

  • Choi, Se Hwan;Choi, Hyun Joon;Min, Chul Hee;Chung, Young Hyun;Ahn, Jae Joon
    • Nuclear Engineering and Technology / v.53 no.3 / pp.888-893 / 2021
  • The International Atomic Energy Agency has developed a tomographic imaging system that completes rod-by-rod verification of a fuel assembly in roughly 1-2 h; however, limitations remain for some fuel types. The aim of this study is to develop a deep learning-based denoising process that increases the tomographic image acquisition speed for fuel assemblies compared to conventional techniques. A Convolutional AutoEncoder (CAE) was employed to denoise the low-quality images reconstructed by the filtered back-projection (FBP) algorithm. The image data set was constructed by the Monte Carlo method and comprises FBP and ground truth (GT) images for 511 patterns of missing fuel rods. The denoising performance of the CAE model was evaluated by comparing the pixel-by-pixel subtracted images between the GT and FBP images and between the GT and CAE images; the average differences of the pixel values for sample images 1, 2, and 3 were 7.7%, 28.0%, and 44.7% for the FBP images, and 0.5%, 1.4%, and 1.9% for the CAE-predicted images, respectively. Even for FBP images in which the source patterns could not be discerned, the CAE model successfully estimated patterns similar to those in the GT image.
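The pixel-by-pixel comparison described in this abstract can be illustrated with a small sketch. The abstract does not state how the percentage is normalized, so this example assumes the mean absolute difference is divided by the peak GT pixel value; the function name and the 2x2 images are hypothetical.

```python
def mean_abs_diff_percent(gt, pred):
    """Mean absolute pixel difference, as a percentage of the GT peak value.

    gt, pred : images given as lists of rows of numeric pixel values.
    The normalization by the GT peak is an assumption of this sketch.
    """
    flat_gt = [p for row in gt for p in row]
    flat_pred = [p for row in pred for p in row]
    if len(flat_gt) != len(flat_pred):
        raise ValueError("images must have the same number of pixels")
    peak = max(flat_gt)
    if peak == 0:
        raise ValueError("GT image has no non-zero pixels")
    total = sum(abs(a - b) for a, b in zip(flat_gt, flat_pred))
    return 100.0 * total / (len(flat_gt) * peak)


# Hypothetical 2x2 images: ground truth and a noisy reconstruction
gt = [[0, 100], [100, 0]]
noisy = [[20, 80], [90, 10]]

score = mean_abs_diff_percent(gt, noisy)
```

A lower score means the reconstruction is closer to the ground truth, which is the sense in which the CAE outputs improve on the FBP images.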

A study on the application of residual vector quantization for vector quantized-variational autoencoder-based foley sound generation model (벡터 양자화 변분 오토인코더 기반의 폴리 음향 생성 모델을 위한 잔여 벡터 양자화 적용 연구)

  • Seokjin Lee
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.243-252 / 2024
  • Among the Foley sound generation models that have recently begun to be studied, sound generation using the Vector Quantized-Variational AutoEncoder (VQ-VAE) structure together with a generation model such as Pixelsnail is one of the important research subjects. Meanwhile, in the field of deep learning-based acoustic signal compression, residual vector quantization is reported to be more suitable than the conventional VQ-VAE structure. Therefore, this paper studies whether residual vector quantization can be effectively applied to Foley sound generation. To this end, the paper applies the residual vector quantization technique to the conventional VQ-VAE-based Foley sound generation model and, in particular, derives a model that is compatible with existing models such as Pixelsnail and does not increase computational resource consumption. To evaluate the model, an experiment was conducted using DCASE2023 Task7 data. The results show that the proposed model improves the Fréchet audio distance by about 0.3. Unfortunately, the performance improvement was limited, which is believed to be due to the reduced time-frequency resolution needed to avoid increasing the consumption of computational resources.
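The idea of residual vector quantization mentioned in this abstract can be sketched in a few lines. This is a generic illustration with tiny hand-written codebooks, not the paper's model: each stage quantizes what the previous stages left unexplained, and the decoder sums the selected entries. All names and values here are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage encodes the remaining residual."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        # Subtract the chosen entry so the next stage sees only the leftover error
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(indices, codebooks):
    """Reconstruct the vector by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Two hypothetical stages: a coarse codebook followed by a fine one
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, -0.25]],
]

codes = rvq_encode([1.3, 0.9], codebooks)
approx = rvq_decode(codes, codebooks)
```

In a conventional single-codebook VQ-VAE only the first stage would exist; the additional stages refine the approximation without enlarging any individual codebook.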

Artificial intelligence application UX/UI study for language learning of children with articulation disorder (조음장애 아동의 언어학습을 위한 인공지능 애플리케이션 UX/UI 연구)

  • Yang, Eun-mi;Park, Dea-woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.174-176 / 2022
  • In this paper, we present a mobile application for 'personalized and customized learning' for children with articulation disorders using an artificial intelligence (AI) algorithm. A dataset is used to analyze, judge, and predict the learner's articulation condition and level. In particular, we designed a prototype model by examining how AI can improve on existing applications from the UX/UI (GUI) perspective. So far the focus has been on visual experience, but it is now important to process data and provide a UX/UI (GUI) experience to users based on it. The UX/UI (GUI) of the proposed mobile application is provided according to the learner's articulation level and condition by using a deep learning CRNN (Convolutional Recurrent Neural Network) and Auto Encoder GPT-3 (Generative Pretrained Transformer). The use of AI algorithms will provide a high-quality learning environment for children with articulation disorders, thereby enhancing the learning effect. We hope that, as 'personalized and customized learning' improves their articulation, these children will feel no fear or discomfort in conversation.

A study on Korean multi-turn response generation using generative and retrieval model (생성 모델과 검색 모델을 이용한 한국어 멀티턴 응답 생성 연구)

  • Lee, Hodong;Lee, Jongmin;Seo, Jaehyung;Jang, Yoonna;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.13 no.1 / pp.13-21 / 2022
  • Recent deep learning-based research shows excellent performance in most natural language processing (NLP) fields with pre-trained language models. In particular, auto-encoder-based language models have proven their performance and usefulness in various areas of Korean language understanding. However, decoder-based Korean generative models struggle to generate even simple sentences. In addition, there is little detailed research or data for the field of conversation, where generative models are most commonly used. Therefore, this paper constructs multi-turn dialogue data for a Korean generative model. We also compare and analyze performance after improving the dialogue ability of the generative model through transfer learning. Finally, we propose a method that supplements the model's insufficient dialogue generation ability by extracting recommended response candidates from external knowledge through a retrieval model.

Semantic Segmentation Intended Satellite Image Enhancement Method Using Deep Auto Encoders (심층 자동 인코더를 이용한 시맨틱 세그멘테이션용 위성 이미지 향상 방법)

  • K. Dilusha Malintha De Silva;Hyo Jong Lee
    • KIPS Transactions on Computer and Communication Systems / v.12 no.8 / pp.243-252 / 2023
  • Satellite imagery is of great importance for examining land cover. Numerous studies have applied semantic segmentation techniques to satellite images to extract information from their high-altitude viewpoint. The device that captures these images must employ wireless communication links to send them to receiving ground stations. Wireless communications from a satellite are inevitably affected by transmission errors, so the transmitted images are distorted by information loss. Current semantic segmentation techniques are not designed for segmenting distorted images, and traditional image enhancement methods have their own limitations when applied to satellite images. This paper proposes an auto-encoder based image pre-enhancement method for satellite images. Images received from a real radio transmitter were used as the distorted satellite image dataset. The proposed auto-encoder was trained to produce a proper approximation of the source image sent by the image transmitter. Unlike traditional image enhancement methods, the proposed method was able to provide images better suited to a segmentation model. Results showed that the proposed pre-enhancement technique greatly improved segmentation results. The enhancements made to the aerial images contribute to the correct assessment of land resources.

Network Anomaly Detection Technologies Using Unsupervised Learning AutoEncoders (비지도학습 오토 엔코더를 활용한 네트워크 이상 검출 기술)

  • Kang, Koohong
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.4 / pp.617-629 / 2020
  • To overcome the limitations of rule-based intrusion detection systems caused by changes in Internet computing environments, the emergence of new services, and the creativity of attackers, network anomaly detection (NAD) using machine learning and deep learning technologies has received much attention. Most existing machine learning and deep learning technologies for NAD use supervised learning methods that learn from training data sets labeled 'normal' and 'attack'. This paper presents the feasibility of applying an unsupervised AutoEncoder (AE) to NAD using data sets collected from secured network traffic without labeled responses. To verify the performance of the proposed AE model, we present experimental results in terms of accuracy, precision, recall, f1-score, and ROC AUC on the NSL-KDD training and test data sets. In particular, we derive a reference AE through an in-depth analysis of diverse AEs with varying hyper-parameters, such as the number of layers, while also considering regularization and denoising effects. The reference model achieves binary classification f1-scores of 90.4% and 89% on the KDDTest+ and KDDTest-21 test data sets, respectively, using a threshold set at the 82nd percentile of the AE reconstruction error on the training data set.
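The thresholding step described in this abstract, flagging traffic whose reconstruction error exceeds a percentile of the training errors, can be sketched as follows. The autoencoder itself is omitted: the error values are made up, and the function names and the linear-interpolation percentile are assumptions of this sketch rather than details taken from the paper.

```python
def percentile(values, q):
    """Percentile q (0 to 100) of a non-empty list, using linear interpolation."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def fit_threshold(train_errors, q=82.0):
    """Threshold taken at the q-th percentile of training reconstruction errors."""
    return percentile(train_errors, q)


def classify(test_errors, threshold):
    """Label a sample 'attack' when its reconstruction error exceeds the threshold."""
    return ["attack" if err > threshold else "normal" for err in test_errors]


# Hypothetical reconstruction errors from an AE trained on unlabeled traffic
train_errors = [0.01 * i for i in range(1, 101)]  # 0.01, 0.02, ..., 1.00

threshold = fit_threshold(train_errors)
labels = classify([0.10, 0.95], threshold)
```

The percentile is a tunable trade-off: a lower value flags more traffic as anomalous (higher recall, more false alarms), while a higher value does the opposite.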

Semi-supervised based Unknown Attack Detection in EDR Environment

  • Hwang, Chanwoong;Kim, Doyeon;Lee, Taejin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.12 / pp.4909-4926 / 2020
  • Cyberattacks penetrate servers and perform various malicious acts such as stealing confidential information, destroying systems, and exposing personal information. To achieve this, attackers infect endpoints and access the internal network. However, current countermeasures consist only of anti-virus products that operate on signatures or patterns, which lets initial unknown attacks through. Endpoint Detection and Response (EDR) technology focuses on providing visibility, and strong countermeasures are lacking. If the initial attack is not countered, further response is difficult because malicious behavior such as an Advanced Persistent Threat (APT) attack does not occur immediately but unfolds over a long period of time. In this paper, we propose a technique that detects unknown attacks from event logs without prior knowledge, even when the initial anti-virus response has failed. The proposed technique combines an AutoEncoder and a 1D CNN (1-Dimensional Convolutional Neural Network) based on semi-supervised learning. In the experiment, the model was trained on a dataset collected over one month in a real-world commercial endpoint environment and tested on data collected over the following month. As a result, 37 unknown attacks were detected in the event logs collected for one month in the commercial endpoint environment, and 26 of them were verified as malicious through VirusTotal (VT). In the future, the proposed model is expected to be applied to EDR technology to form a secure endpoint environment and to reduce the time and labor costs of detecting unknown attacks.

An AutoEncoder Model based on Attention and Inverse Document Frequency for Classification of Creativity in Essay (에세이의 창의성 분류를 위한 어텐션과 역문서 빈도 기반의 자기부호화기 모델)

  • Se-Jin Jeong;Deok-gi Kim;Byung-Won On
    • Annual Conference on Human and Language Technology / 2022.10a / pp.624-629 / 2022
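  • Existing major studies on automatically classifying the creativity of essays perform machine learning with a focus on words that appear infrequently in the corpus. However, such studies have the problem that an essay is classified as creative simply because it contains many novel words, regardless of its topic. In this paper, we use attention and inverse document frequency (IDF) to obtain context vectors that give high weight to words that are both important for conveying the essay's content and novel, and we use an AutoEncoder model to extract feature vectors of creative and non-creative essays from these context vectors. We then propose a deep learning model that, at test time, compares them with the feature vector of a new essay to classify whether that essay is creative. Experimental results show that the proposed method achieves higher accuracy than existing methods. Specifically, the average accuracy of the proposed method was 92%, a 9% improvement in accuracy over the existing major method.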

Development of a driver's emotion detection model using auto-encoder on driving behavior and psychological data

  • Eun-Seo, Jung;Seo-Hee, Kim;Yun-Jung, Hong;In-Beom, Yang;Jiyoung, Woo
    • Journal of the Korea Society of Computer and Information / v.28 no.3 / pp.35-43 / 2023
  • Emotion recognition while driving is an essential task for preventing accidents. Furthermore, in the era of autonomous driving, automobiles are becoming the main agents of mobility and require more emotional communication with drivers, and the emotion recognition market is gradually expanding. Accordingly, in this study, the driver's emotions are classified into seven categories using psychological and behavioral data, which are relatively easy to collect. The latent vectors extracted by the auto-encoder model were also used as features in the classification model, and this was confirmed to improve performance. It was also confirmed that the framework presented in this paper performed better than when the existing EEG data were included. Finally, a driver emotion classification accuracy of 81% and an F1-score of 80% were achieved using only psychological data, personal information, and behavioral data.