• Title/Summary/Keyword: Deep Representation Learning

A Study on Deep Learning model for classifying programs by functionalities

  • Yoon, Joo-Sung;Lee, Eun-Hun;An, Jin-Hyeon;Kim, Hyun-Cheol
    • Proceedings of the Korea Information Processing Society Conference / 2016.10a / pp.615-616 / 2016
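  • With the recent paradigm shift toward the Fourth Industrial Revolution, the software industry has become increasingly important. Accordingly, worldwide demand for coding education has grown, and companies place greater importance on code management in order to build good software. Because it is practically impossible for people to grade and manage large volumes of programming source code one by one, a code evaluation system that can solve this problem is needed. However, there is no clear standard for what constitutes good code or how code should be evaluated, and research on this question is lacking. Deep Learning, which has recently attracted attention, has achieved far better results than earlier Machine Learning algorithms in areas such as image processing and natural language processing. In the programming language domain, however, it has not yet been studied in depth. This study therefore seeks to address these problems by using a Tree-based Convolutional Neural Network, a variant of the Convolutional Neural Network known as a Deep Learning technique, to study algorithms for analyzing and classifying programming source code and Representation Learning of code.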

Facial Expression Recognition through Self-supervised Learning for Predicting Face Image Sequence

  • Yoon, Yeo-Chan;Kim, Soo Kyun
    • Journal of the Korea Society of Computer and Information / v.27 no.9 / pp.41-47 / 2022
  • In this paper, we propose a new and simple self-supervised learning method that predicts the middle image of a face image sequence for automatic expression recognition. Automatic facial expression recognition can achieve high performance through deep learning methods, but it generally requires a large and expensive dataset, and the performance of the algorithm tends to be proportional to the size of the dataset. The proposed method learns a latent deep representation of a face through self-supervised learning on an existing dataset, without constructing an additional dataset. It then transfers the learned parameters to a new facial expression recognition model to improve the performance of automatic expression recognition. The proposed method showed a large performance improvement on two datasets, CK+ and AFEW 8.0, demonstrating its effectiveness.
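
The pretext task in the abstract above (predicting the middle image of a face image sequence) can be illustrated by how training pairs might be built from unlabeled sequences. The following is a minimal sketch, not the authors' implementation: the function name `middle_frame_samples`, the sliding-window scheme, and the `window` parameter are assumptions, since the abstract does not say how frames are sampled.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def middle_frame_samples(
    sequence: Sequence[Frame], window: int = 3
) -> List[Tuple[List[Frame], Frame]]:
    """Build (context, target) pairs from one unlabeled frame sequence.

    Hypothetical helper: for every odd-length sliding window, the middle
    frame becomes the prediction target and the remaining frames become
    the input context. No expression labels are needed.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    half = window // 2
    samples = []
    for start in range(len(sequence) - window + 1):
        clip = list(sequence[start:start + window])
        target = clip[half]                        # frame to be predicted
        context = clip[:half] + clip[half + 1:]    # surrounding frames
        samples.append((context, target))
    return samples


# Frames are stand-in strings here; in practice they would be image arrays.
frames = ["f0", "f1", "f2", "f3", "f4"]
pairs = middle_frame_samples(frames, window=3)
```

A model trained on such pairs would supply the encoder parameters that the abstract describes transferring to the expression recognition model.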

Comparing State Representation Techniques for Reinforcement Learning in Autonomous Driving

  • Jihwan Ahn;Taesoo Kwon
    • Journal of the Korea Computer Graphics Society / v.30 no.3 / pp.109-123 / 2024
  • Research into vision-based end-to-end autonomous driving systems utilizing deep learning and reinforcement learning has been steadily increasing. These systems typically encode continuous and high-dimensional vehicle states, such as location, velocity, orientation, and sensor data, into latent features, which are then decoded into a vehicular control policy. The complexity of urban driving environments necessitates the use of state representation learning through networks like Variational Autoencoders (VAEs) or Convolutional Neural Networks (CNNs). This paper analyzes the impact of different image state encoding methods on reinforcement learning performance in autonomous driving. Experiments were conducted in the CARLA simulator using RGB images and semantically segmented images captured by the vehicle's front camera. These images were encoded using VAE and Vision Transformer (ViT) networks. The study examines how these networks influence the agents' learning outcomes and experimentally demonstrates the role of each state representation technique in enhancing the learning efficiency and decision-making capabilities of autonomous driving systems.

Subsurface anomaly detection utilizing synthetic GPR images and deep learning model

  • Ahmad Abdelmawla;Shihan Ma;Jidong J. Yang;S. Sonny Kim
    • Geomechanics and Engineering / v.33 no.2 / pp.203-209 / 2023
  • One major advantage of ground penetrating radar (GPR) over other field test methods is its ability to obtain subsurface images of roads in an efficient and non-intrusive manner. Not only can the strata of pavement structure be retrieved from the GPR scan images, but also various irregularities, such as cracks and internal cavities. This article introduces a deep learning-based approach, focusing on detecting subsurface cracks by recognizing their distinctive hyperbolic signatures in the GPR scan images. Given the limited road sections that contain target features, two data augmentation methods, i.e., feature insertion and generation, are implemented, resulting in 9,174 GPR scan images. One of the most popular real-time object detection models, You Only Learn One Representation (YOLOR), is trained for detecting the target features for two types of subsurface cracks: bottom cracks and full cracks from the GPR scan images. The former represents partial cracks initiated from the bottom of the asphalt layer or base layers, while the latter includes extended cracks that penetrate these layers. Our experiments show the test average precisions of 0.769, 0.803 and 0.735 for all cracks, bottom cracks, and full cracks, respectively. This demonstrates the practicality of deep learning-based methods in detecting subsurface cracks from GPR scan images.

3D Object Generation and Renderer System based on VAE ResNet-GAN

  • Min-Su Yu;Tae-Won Jung;GyoungHyun Kim;Soonchul Kwon;Kye-Dong Jung
    • International journal of advanced smart convergence / v.12 no.4 / pp.142-146 / 2023
  • We present a method for generating 3D structures and rendering objects by combining a VAE (Variational Autoencoder) and a GAN (Generative Adversarial Network). The approach aims to generate and render 3D models of improved quality by using residual learning to train the encoder. We stack the encoder layers deeply so that image features are captured accurately, and apply residual blocks to mitigate the vanishing and exploding gradient problems that arise when constructing deep neural networks. The generated model has more detailed voxels for a more accurate representation, is rendered by adding materials and lighting, and is finally converted into a mesh model. The resulting 3D models have excellent visual quality and accuracy, making them useful in fields such as virtual reality, game development, and the metaverse.

Crop Leaf Disease Identification Using Deep Transfer Learning

  • Changjian Zhou;Yutong Zhang;Wenzhong Zhao
    • Journal of Information Processing Systems / v.20 no.2 / pp.149-158 / 2024
  • Traditional manual identification of crop leaf diseases is challenging. Owing to the limitations in manpower and resources, it is challenging to explore crop diseases on a large scale. The emergence of artificial intelligence technologies, particularly the extensive application of deep learning technologies, is expected to overcome these challenges and greatly improve the accuracy and efficiency of crop disease identification. Crop leaf disease identification models have been designed and trained using large-scale training data, enabling them to predict different categories of diseases from unlabeled crop leaves. However, these models, which possess strong feature representation capabilities, require substantial training data, and there is often a shortage of such datasets in practical farming scenarios. To address this issue and improve the feature learning abilities of models, this study proposes a deep transfer learning adaptation strategy. The proposed method transfers the weights and parameters of models pre-trained on similar large-scale datasets, such as ImageNet. ImageNet pre-trained weights are adopted and fine-tuned with the features of crop leaf diseases to improve prediction ability. In this study, we collected 16,060 crop leaf disease images, spanning 12 categories, for training. The experimental results demonstrate that an accuracy of 98% is achieved using the proposed method on the transferred ResNet-50 model, thereby confirming the effectiveness of our transfer learning approach.

Video Quality Representation Classification of Encrypted HTTP Adaptive Video Streaming

  • Dubin, Ran;Hadar, Ofer;Dvir, Amit;Pele, Ofir
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.8 / pp.3804-3819 / 2018
  • The increasing popularity of HTTP adaptive video streaming services has dramatically increased bandwidth requirements on operator networks, which attempt to shape their traffic through Deep Packet Inspection (DPI). However, Google and certain content providers have started to encrypt their video services. As a result, operators often encounter difficulties in shaping their encrypted video traffic via DPI. This highlights the need for new traffic classification methods for encrypted HTTP adaptive video streaming to enable smart traffic shaping. These new methods will have to effectively estimate the quality representation layer and playout buffer. We present a new machine learning method and show for the first time that video quality representation classification for (YouTube) encrypted HTTP adaptive streaming is possible. The crawler code and the datasets are provided in [43,44,51]. An extensive empirical evaluation shows that our method is able to independently classify every video segment into one of the quality representation layers with 97% accuracy if the browser is Safari with a Flash Player and 77% accuracy if the browser is Chrome, Explorer, Firefox or Safari with an HTML5 player.

An Attention Method-based Deep Learning Encoder for the Sentiment Classification of Documents

  • Kwon, Sunjae;Kim, Juae;Kang, Sangwoo;Seo, Jungyun
    • KIISE Transactions on Computing Practices / v.23 no.4 / pp.268-273 / 2017
  • Recently, deep learning encoder-based approaches have been actively applied in the field of sentiment classification. However, the commonly used Long Short-Term Memory network encoder produces lower-quality vector representations as documents grow longer. In this study, for effective classification of sentiment documents, we suggest an attention method-based deep learning encoder that generates a document vector representation as a weighted sum of the outputs of the Long Short-Term Memory network, weighted by importance. In addition, we propose modifications that adapt the attention method-based encoder to sentiment classification: a window attention method and an attention weight adjustment. In the window attention method, the weights are obtained in window units to effectively recognize sentiment features that consist of more than one word. In the attention weight adjustment, the learned weights are smoothed. Experimental results show that the proposed method outperformed the Long Short-Term Memory network encoder, achieving 89.67% accuracy.
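
The three ingredients named in the abstract above (a weighted sum of encoder outputs, weights computed over windows, and smoothed weights) can be sketched in plain Python. This is an illustration under stated assumptions rather than the paper's model: the per-step `scores` are passed in instead of being learned, windowing is done by averaging neighboring scores, and smoothing is done by mixing with a uniform distribution through a hypothetical `alpha` parameter. The abstract does not specify any of these choices.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def window_scores(scores: List[float], window: int) -> List[float]:
    """Average each score with its neighbors (window truncated at the edges)."""
    half = window // 2
    out = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out


def attention_pool(
    hidden: List[Vector], scores: List[float], window: int = 3, alpha: float = 0.1
) -> Tuple[Vector, List[float]]:
    """Return (document vector, attention weights).

    hidden: one output vector per time step (e.g. LSTM outputs).
    scores: one raw importance score per time step (assumed given here).
    alpha:  assumed smoothing strength; 0 disables smoothing.
    """
    weights = softmax(window_scores(scores, window))
    n = len(weights)
    # Smoothing by mixing with a uniform distribution; weights still sum to 1.
    weights = [(1 - alpha) * w + alpha / n for w in weights]
    dim = len(hidden[0])
    doc = [sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)]
    return doc, weights


hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
doc_vector, attn = attention_pool(hidden_states, scores=[0.0, 0.0, 0.0])
```

With equal scores the weights are uniform, so the document vector reduces to the mean of the hidden states; unequal scores shift the vector toward the higher-scored steps.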

Multi-view learning review: understanding methods and their application

  • Bae, Kang Il;Lee, Yung Seop;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.32 no.1 / pp.41-68 / 2019
  • Multi-view learning considers data from various viewpoints and attempts to integrate the various information in the data. Multi-view learning has been studied recently and has shown superior performance to models learned from only a single view. With the introduction of deep learning techniques to the multi-view learning approach, it has shown good results in various fields such as image, text, voice, and video. In this study, we introduce how multi-view learning methods solve various problems faced in human behavior recognition, medical areas, information retrieval and facial expression recognition. In addition, we review data integration principles of multi-view learning methods by classifying traditional multi-view learning methods into data integration, classifier integration, and representation integration. Finally, we examine how CNN, RNN, RBM, Autoencoder, and GAN, which are commonly used among various deep learning methods, are applied to multi-view learning algorithms. We categorize CNN and RNN-based learning methods as supervised learning, and RBM, Autoencoder, and GAN-based learning methods as unsupervised learning.

Siamese Network for Learning Robust Feature of Hippocampi

  • Ahmed, Samsuddin;Jung, Ho Yub
    • Smart Media Journal / v.9 no.3 / pp.9-17 / 2020
  • The hippocampus is a complex brain structure embedded deep in the temporal lobe. Studies have shown that this structure is affected by neurological and psychiatric disorders and that it is a significant landmark for diagnosing neurodegenerative diseases. Hippocampus features play a very significant role in region-of-interest based analysis for disease diagnosis and prognosis. In this study, we have attempted to learn the embeddings of this important biomarker. As conventional metric learning methods for feature embedding are known to be lacking in capturing semantic similarity among the data under study, we have trained a deep Siamese convolutional neural network to learn a metric for the hippocampus. We used the Gwangju Alzheimer's and Related Dementia cohort data set in our study. The input to the network was pairs of three-view patches (TVPs) of size 32 × 32 × 3. The positive samples were taken from the vicinity of a specified landmark for the hippocampus and negative samples were taken from random locations of the brain excluding hippocampi regions. We have achieved 98.72% accuracy in verifying hippocampus TVPs.
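
The three-view patch input mentioned in the abstract above can be illustrated with a small extraction routine. This sketch assumes that a TVP consists of the axial, coronal, and sagittal 32 × 32 slices through a single center voxel; the abstract gives only the size 32 × 32 × 3, so that reading, the function name `three_view_patch`, and the `volume[z][y][x]` indexing are assumptions. The result is stored channel-first (3 × 32 × 32).

```python
from typing import List, Tuple

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]
Patch = List[List[float]]


def three_view_patch(
    volume: Volume, center: Tuple[int, int, int], size: int = 32
) -> List[Patch]:
    """Return [axial, coronal, sagittal] size x size patches around `center`.

    Hypothetical helper: each patch is a 2D slice through the center voxel
    along one of the three orthogonal planes of the volume.
    """
    z, y, x = center
    half = size // 2
    dims = (len(volume), len(volume[0]), len(volume[0][0]))
    for c, n in zip((z, y, x), dims):
        if c - half < 0 or c - half + size > n:
            raise ValueError("patch would fall outside the volume")
    zs = range(z - half, z - half + size)
    ys = range(y - half, y - half + size)
    xs = range(x - half, x - half + size)
    axial = [[volume[z][j][i] for i in xs] for j in ys]      # fixed z
    coronal = [[volume[k][y][i] for i in xs] for k in zs]    # fixed y
    sagittal = [[volume[k][j][x] for j in ys] for k in zs]   # fixed x
    return [axial, coronal, sagittal]


# Synthetic 40 x 40 x 40 volume whose voxel value encodes its own position.
N = 40
volume = [[[float(k * N * N + j * N + i) for i in range(N)]
           for j in range(N)] for k in range(N)]
tvp = three_view_patch(volume, center=(20, 20, 20))
```

Positive and negative pairs for a Siamese network would then be formed from patches centered near the landmark and at random non-hippocampal locations, as the abstract describes.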