• Title/Summary/Keyword: neural network learning

Search Result 4,177, Processing Time 0.033 seconds

Super High-Resolution Image Style Transfer (초-고해상도 영상 스타일 전이)

  • Kim, Yong-Goo
    • Journal of Broadcast Engineering
    • /
    • v.27 no.1
    • /
    • pp.104-123
    • /
    • 2022
  • Style transfer based on neural network provides very high quality results by reflecting the high level structural characteristics of images, and thereby has recently attracted great attention. This paper deals with the problem of resolution limitation due to GPU memory in performing such neural style transfer. We can expect that the gradient operation for style transfer based on partial image, with the aid of the fixed size of receptive field, can produce the same result as the gradient operation using the entire image. Based on this idea, each component of the style transfer loss function is analyzed in this paper to obtain the necessary conditions for partitioning and padding, and to identify, among the information required for gradient calculation, the one that depends on the entire input. By structuring such information for using it as auxiliary constant input for partition-based gradient calculation, this paper develops a recursive algorithm for super high-resolution image style transfer. Since the proposed method performs style transfer by partitioning input image into the size that a GPU can handle, it can perform style transfer without the limit of the input image resolution accompanied by the GPU memory size. With the aid of such super high-resolution support, the proposed method can provide a unique style characteristics of detailed area which can only be appreciated in super high-resolution style transfer.

Target-Aspect-Sentiment Joint Detection with CNN Auxiliary Loss for Aspect-Based Sentiment Analysis (CNN 보조 손실을 이용한 차원 기반 감성 분석)

  • Jeon, Min Jin;Hwang, Ji Won;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.4
    • /
    • pp.1-22
    • /
    • 2021
  • Aspect Based Sentiment Analysis (ABSA), which analyzes sentiment based on aspects that appear in the text, is drawing attention because it can be used in various business industries. ABSA is a study that analyzes sentiment by aspects for multiple aspects that a text has. It is being studied in various forms depending on the purpose, such as analyzing all targets or just aspects and sentiments. Here, the aspect refers to the property of a target, and the target refers to the text that causes the sentiment. For example, for restaurant reviews, you could set the aspect into food taste, food price, quality of service, mood of the restaurant, etc. Also, if there is a review that says, "The pasta was delicious, but the salad was not," the words "steak" and "salad," which are directly mentioned in the sentence, become the "target." So far, in ABSA, most studies have analyzed sentiment only based on aspects or targets. However, even with the same aspects or targets, sentiment analysis may be inaccurate. Instances would be when aspects or sentiment are divided or when sentiment exists without a target. For example, sentences like, "Pizza and the salad were good, but the steak was disappointing." Although the aspect of this sentence is limited to "food," conflicting sentiments coexist. In addition, in the case of sentences such as "Shrimp was delicious, but the price was extravagant," although the target here is "shrimp," there are opposite sentiments coexisting that are dependent on the aspect. Finally, in sentences like "The food arrived too late and is cold now." there is no target (NULL), but it transmits a negative sentiment toward the aspect "service." Like this, failure to consider both aspects and targets - when sentiment or aspect is divided or when sentiment exists without a target - creates a dual dependency problem. To address this problem, this research analyzes sentiment by considering both aspects and targets (Target-Aspect-Sentiment Detection, hereby TASD). This study detected the limitations of existing research in the field of TASD: local contexts are not fully captured, and the number of epochs and batch size dramatically lowers the F1-score. The current model excels in spotting overall context and relations between each word. However, it struggles with phrases in the local context and is relatively slow when learning. Therefore, this study tries to improve the model's performance. To achieve the objective of this research, we additionally used auxiliary loss in aspect-sentiment classification by constructing CNN(Convolutional Neural Network) layers parallel to existing models. If existing models have analyzed aspect-sentiment through BERT encoding, Pooler, and Linear layers, this research added CNN layer-adaptive average pooling to existing models, and learning was progressed by adding additional loss values for aspect-sentiment to existing loss. In other words, when learning, the auxiliary loss, computed through CNN layers, allowed the local context to be captured more fitted. After learning, the model is designed to do aspect-sentiment analysis through the existing method. To evaluate the performance of this model, two datasets, SemEval-2015 task 12 and SemEval-2016 task 5, were used and the f1-score increased compared to the existing models. When the batch was 8 and epoch was 5, the difference was largest between the F1-score of existing models and this study with 29 and 45, respectively. Even when batch and epoch were adjusted, the F1-scores were higher than the existing models. It can be said that even when the batch and epoch numbers were small, they can be learned effectively compared to the existing models. Therefore, it can be useful in situations where resources are limited. Through this study, aspect-based sentiments can be more accurately analyzed. Through various uses in business, such as development or establishing marketing strategies, both consumers and sellers will be able to make efficient decisions. In addition, it is believed that the model can be fully learned and utilized by small businesses, those that do not have much data, given that they use a pre-training model and recorded a relatively high F1-score even with limited resources.

A preliminary study for development of an automatic incident detection system on CCTV in tunnels based on a machine learning algorithm (기계학습(machine learning) 기반 터널 영상유고 자동 감지 시스템 개발을 위한 사전검토 연구)

  • Shin, Hyu-Soung;Kim, Dong-Gyou;Yim, Min-Jin;Lee, Kyu-Beom;Oh, Young-Sup
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.19 no.1
    • /
    • pp.95-107
    • /
    • 2017
  • In this study, a preliminary study was undertaken for development of a tunnel incident automatic detection system based on a machine learning algorithm which is to detect a number of incidents taking place in tunnel in real time and also to be able to identify the type of incident. Two road sites where CCTVs are operating have been selected and a part of CCTV images are treated to produce sets of training data. The data sets are composed of position and time information of moving objects on CCTV screen which are extracted by initially detecting and tracking of incoming objects into CCTV screen by using a conventional image processing technique available in this study. And the data sets are matched with 6 categories of events such as lane change, stoping, etc which are also involved in the training data sets. The training data are learnt by a resilience neural network where two hidden layers are applied and 9 architectural models are set up for parametric studies, from which the architectural model, 300(first hidden layer)-150(second hidden layer) is found to be optimum in highest accuracy with respect to training data as well as testing data not used for training. From this study, it was shown that the highly variable and complex traffic and incident features could be well identified without any definition of feature regulation by using a concept of machine learning. In addition, detection capability and accuracy of the machine learning based system will be automatically enhanced as much as big data of CCTV images in tunnel becomes rich.

Estimation of the Lodging Area in Rice Using Deep Learning (딥러닝을 이용한 벼 도복 면적 추정)

  • Ban, Ho-Young;Baek, Jae-Kyeong;Sang, Wan-Gyu;Kim, Jun-Hwan;Seo, Myung-Chul
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.66 no.2
    • /
    • pp.105-111
    • /
    • 2021
  • Rice lodging is an annual occurrence caused by typhoons accompanied by strong winds and strong rainfall, resulting in damage relating to pre-harvest sprouting during the ripening period. Thus, rapid estimations of the area of lodged rice are necessary to enable timely responses to damage. To this end, we obtained images related to rice lodging using a drone in Gimje, Buan, and Gunsan, which were converted to 128 × 128 pixels images. A convolutional neural network (CNN) model, a deep learning model based on these images, was used to predict rice lodging, which was classified into two types (lodging and non-lodging), and the images were divided in a 8:2 ratio into a training set and a validation set. The CNN model was layered and trained using three optimizers (Adam, Rmsprop, and SGD). The area of rice lodging was evaluated for the three fields using the obtained data, with the exception of the training set and validation set. The images were combined to give composites images of the entire fields using Metashape, and these images were divided into 128 × 128 pixels. Lodging in the divided images was predicted using the trained CNN model, and the extent of lodging was calculated by multiplying the ratio of the total number of field images by the number of lodging images by the area of the entire field. The results for the training and validation sets showed that accuracy increased with a progression in learning and eventually reached a level greater than 0.919. The results obtained for each of the three fields showed high accuracy with respect to all optimizers, among which, Adam showed the highest accuracy (normalized root mean square error: 2.73%). On the basis of the findings of this study, it is anticipated that the area of lodged rice can be rapidly predicted using deep learning.

Prediction and analysis of acute fish toxicity of pesticides to the rainbow trout using 2D-QSAR (2D-QSAR방법을 이용한 농약류의 무지개 송어 급성 어독성 분석 및 예측)

  • Song, In-Sik;Cha, Ji-Young;Lee, Sung-Kwang
    • Analytical Science and Technology
    • /
    • v.24 no.6
    • /
    • pp.544-555
    • /
    • 2011
  • The acute toxicity in the rainbow trout (Oncorhynchus mykiss) was analyzed and predicted using quantitative structure-activity relationships (QSAR). The aquatic toxicity, 96h $LC_{50}$ (median lethal concentration) of 275 organic pesticides, was obtained from EU-funded project DEMETRA. Prediction models were derived from 558 2D molecular descriptors, calculated in PreADMET. The linear (multiple linear regression) and nonlinear (support vector machine and artificial neural network) learning methods were optimized by taking into account the statistical parameters between the experimental and predicted p$LC_{50}$. After preprocessing, population based forward selection were used to select the best subsets of descriptors in the learning methods including 5-fold cross-validation procedure. The support vector machine model was used as the best model ($R^2_{CV}$=0.677, RMSECV=0.887, MSECV=0.674) and also correctly classified 87% for the training set according to EU regulation criteria. The MLR model could describe the structural characteristics of toxic chemicals and interaction with lipid membrane of fish. All the developed models were validated by 5 fold cross-validation and Y-scrambling test.

Removal of Seabed Multiples in Seismic Reflection Data using Machine Learning (머신러닝을 이용한 탄성파 반사법 자료의 해저면 겹반사 제거)

  • Nam, Ho-Soo;Lim, Bo-Sung;Kweon, Il-Ryong;Kim, Ji-Soo
    • Geophysics and Geophysical Exploration
    • /
    • v.23 no.3
    • /
    • pp.168-177
    • /
    • 2020
  • Seabed multiple reflections (seabed multiples) are the main cause of misinterpretations of primary reflections in both shot gathers and stack sections. Accordingly, seabed multiples need to be suppressed throughout data processing. Conventional model-driven methods, such as prediction-error deconvolution, Radon filtering, and data-driven methods, such as the surface-related multiple elimination technique, have been used to attenuate multiple reflections. However, the vast majority of processing workflows require time-consuming steps when testing and selecting the processing parameters in addition to computational power and skilled data-processing techniques. To attenuate seabed multiples in seismic reflection data, input gathers with seabed multiples and label gathers without seabed multiples were generated via numerical modeling using the Marmousi2 velocity structure. The training data consisted of normal-moveout-corrected common midpoint gathers fed into a U-Net neural network. The well-trained model was found to effectively attenuate the seabed multiples according to the image similarity between the prediction result and the target data, and demonstrated good applicability to field data.

Sensor Fault Detection Scheme based on Deep Learning and Support Vector Machine (딥 러닝 및 서포트 벡터 머신기반 센서 고장 검출 기법)

  • Yang, Jae-Wan;Lee, Young-Doo;Koo, In-Soo
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.18 no.2
    • /
    • pp.185-195
    • /
    • 2018
  • As machines have been automated in the field of industries in recent years, it is a paramount importance to manage and maintain the automation machines. When a fault occurs in sensors attached to the machine, the machine may malfunction and further, a huge damage will be caused in the process line. To prevent the situation, the fault of sensors should be monitored, diagnosed and classified in a proper way. In the paper, we propose a sensor fault detection scheme based on SVM and CNN to detect and classify typical sensor errors such as erratic, drift, hard-over, spike, and stuck faults. Time-domain statistical features are utilized for the learning and testing in the proposed scheme, and the genetic algorithm is utilized to select the subset of optimal features. To classify multiple sensor faults, a multi-layer SVM is utilized, and ensemble technique is used for CNN. As a result, the SVM that utilizes a subset of features selected by the genetic algorithm provides better performance than the SVM that utilizes all the features. However, the performance of CNN is superior to that of the SVM.

Semi-supervised domain adaptation using unlabeled data for end-to-end speech recognition (라벨이 없는 데이터를 사용한 종단간 음성인식기의 준교사 방식 도메인 적응)

  • Jeong, Hyeonjae;Goo, Jahyun;Kim, Hoirin
    • Phonetics and Speech Sciences
    • /
    • v.12 no.2
    • /
    • pp.29-37
    • /
    • 2020
  • Recently, the neural network-based deep learning algorithm has dramatically improved performance compared to the classical Gaussian mixture model based hidden Markov model (GMM-HMM) automatic speech recognition (ASR) system. In addition, researches on end-to-end (E2E) speech recognition systems integrating language modeling and decoding processes have been actively conducted to better utilize the advantages of deep learning techniques. In general, E2E ASR systems consist of multiple layers of encoder-decoder structure with attention. Therefore, E2E ASR systems require data with a large amount of speech-text paired data in order to achieve good performance. Obtaining speech-text paired data requires a lot of human labor and time, and is a high barrier to building E2E ASR system. Therefore, there are previous studies that improve the performance of E2E ASR system using relatively small amount of speech-text paired data, but most studies have been conducted by using only speech-only data or text-only data. In this study, we proposed a semi-supervised training method that enables E2E ASR system to perform well in corpus in different domains by using both speech or text only data. The proposed method works effectively by adapting to different domains, showing good performance in the target domain and not degrading much in the source domain.

Research on Text Classification of Research Reports using Korea National Science and Technology Standards Classification Codes (국가 과학기술 표준분류 체계 기반 연구보고서 문서의 자동 분류 연구)

  • Choi, Jong-Yun;Hahn, Hyuk;Jung, Yuchul
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.1
    • /
    • pp.169-177
    • /
    • 2020
  • In South Korea, the results of R&D in science and technology are submitted to the National Science and Technology Information Service (NTIS) in reports that have Korea national science and technology standard classification codes (K-NSCC). However, considering there are more than 2000 sub-categories, it is non-trivial to choose correct classification codes without a clear understanding of the K-NSCC. In addition, there are few cases of automatic document classification research based on the K-NSCC, and there are no training data in the public domain. To the best of our knowledge, this study is the first attempt to build a highly performing K-NSCC classification system based on NTIS report meta-information from the last five years (2013-2017). To this end, about 210 mid-level categories were selected, and we conducted preprocessing considering the characteristics of research report metadata. More specifically, we propose a convolutional neural network (CNN) technique using only task names and keywords, which are the most influential fields. The proposed model is compared with several machine learning methods (e.g., the linear support vector classifier, CNN, gated recurrent unit, etc.) that show good performance in text classification, and that have a performance advantage of 1% to 7% based on a top-three F1 score.

A Study on Deep Learning-based Pedestrian Detection and Alarm System (딥러닝 기반의 보행자 탐지 및 경보 시스템 연구)

  • Kim, Jeong-Hwan;Shin, Yong-Hyeon
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.18 no.4
    • /
    • pp.58-70
    • /
    • 2019
  • In the case of a pedestrian traffic accident, it has a large-scale danger directly connected by a fatal accident at the time of the accident. The domestic ITS is not used for intelligent risk classification because it is used only for collecting traffic information despite of the construction of good quality traffic infrastructure. The CNN based pedestrian detection classification model, which is a major component of the proposed system, is implemented on an embedded system assuming that it is installed and operated in a restricted environment. A new model was created by improving YOLO's artificial neural network, and the real-time detection speed result of average accuracy 86.29% and 21.1 fps was shown with 20,000 iterative learning. And we constructed a protocol interworking scenario and implementation of a system that can connect with the ITS. If a pedestrian accident prevention system connected with ITS will be implemented through this study, it will help to reduce the cost of constructing a new infrastructure and reduce the incidence of traffic accidents for pedestrians, and we can also reduce the cost for system monitoring.