• Title/Summary/Keyword: AI Training Data

Image Super-Resolution for Improving Object Recognition Accuracy (객체 인식 정확도 개선을 위한 이미지 초해상도 기술)

  • Lee, Sung-Jin;Kim, Tae-Jun;Lee, Chung-Heon;Yoo, Seok Bong
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.6 / pp.774-784 / 2021
  • Object detection and recognition are important tasks in computer vision, and related research is actively being conducted. In practice, however, recognition accuracy is often degraded by a resolution mismatch between the training images and the test images. To solve this problem, we designed and developed an integrated object recognition and super-resolution framework, proposing an image super-resolution technique to improve object recognition accuracy. Specifically, we built 11,231 license plate training images through web crawling and artificial data generation, and trained the image super-resolution neural network with an objective function defined to be robust to image flips. To verify the performance of the proposed algorithm, we ran the trained super-resolution and recognition models on 1,999 test images and confirmed that the proposed super-resolution technique improves character recognition accuracy.

Damage Detection and Damage Quantification of Temporary works Equipment based on Explainable Artificial Intelligence (XAI)

  • Cheolhee Lee;Taehoe Koo;Namwook Park;Nakhoon Lim
    • Journal of Internet Computing and Services / v.25 no.2 / pp.11-19 / 2024
  • This paper studies a technology for detecting damage to temporary works equipment used on construction sites with explainable artificial intelligence (XAI). Temporary works equipment is mostly made of steel or aluminum and, owing to the characteristics of these materials, is reused several times. However, because the regulations and restrictions on reuse are not strict, low-quality or degraded equipment sometimes causes accidents on construction sites. Currently, safety rules such as government laws, standards, and regulations for quality control of temporary works equipment have not been established. In addition, inspection results often differ depending on the inspector's level of training. To overcome these limitations, a method based on AI and image processing technology was developed. Explainable artificial intelligence (XAI) technology was applied so that the inspector can make more accurate decisions from the damage detection results produced by image analysis with the XAI, an AI model developed for analyzing temporary works equipment. In the experiments, temporary works equipment was photographed with a 4K-quality camera, the AI model was trained with 610 labeled data items, and accuracy was tested by analyzing recorded images of temporary works equipment. The damage detection accuracy of the XAI was 95.0% for the training dataset, 92.0% for the validation dataset, and 90.0% for the test dataset, which demonstrates the reliability of the developed AI. The experiments verified the usability of explainable artificial intelligence for detecting damage in temporary works equipment. However, to reach the level of commercial software, the XAI needs to be trained further on real datasets, and its damage detection ability must be maintained or improved when real datasets are applied.

An analysis of public perception on Artificial Intelligence(AI) education using Big Data: Based on News articles and Twitter (빅데이터 분석을 통해 본 AI교육에 대한 사회적 인식: 뉴스기사와 트위터를 중심으로)

  • Lee, Sang-Soog;Yoo, Inhyeok;Kim, Jinhee
    • Journal of Digital Convergence / v.18 no.6 / pp.9-16 / 2020
  • The purpose of this study is to understand the public needs for AI education, which is actively promoted and supported by the current government. To this end, 11 metropolitan news articles and Twitter posts regarding AI education posted from January 1, 2018 to December 31, 2019 were collected. Then, word frequency analysis using the TF (Term Frequency) method and topic modeling analysis using the LDA (Latent Dirichlet Allocation) method were conducted. The topics of the news articles turned out to concern macroscopic policy support, such as 'training female manpower in the AI field' and 'curriculum reform of university and K-12', whereas the topics of the Twitter posts delineate more detailed social perceptions of future society, such as future competencies and pedagogical methods, including 'coexistence with intelligent robots', 'coding education', and 'humane education competence development'. The findings are expected to inform the composition and management of AI curricula as well as the basic framework of human resources development in future industry.
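The TF (Term Frequency) step mentioned in this abstract can be illustrated with a minimal sketch. The tokenizer (lowercase ASCII words only), the function name `term_frequency`, and the sample posts are assumptions made for illustration; the study itself analyzed news articles and Twitter posts, and its actual preprocessing is not described here.

```python
import re
from collections import Counter

def term_frequency(documents):
    """Count how often each lowercase word token appears across all documents."""
    counts = Counter()
    for text in documents:
        # Assumption: a token is a run of ASCII letters; digits and punctuation are dropped.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts

# Hypothetical sample posts, invented for illustration only.
posts = [
    "AI education needs coding education",
    "Coding education for K-12 students",
]
tf = term_frequency(posts)
top_terms = tf.most_common(2)  # the two most frequent terms with their counts
```

A topic model such as LDA would typically take counts like these, arranged per document, as its input.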

Prediction Model of Real Estate Transaction Price with the LSTM Model based on AI and Bigdata

  • Lee, Jeong-hyun;Kim, Hoo-bin;Shim, Gyo-eon
    • International Journal of Advanced Culture Technology / v.10 no.1 / pp.274-283 / 2022
  • Korea is facing a number of difficulties arising from rising housing prices. As 'housing' takes the lion's share of personal assets, many difficulties are expected to arise from fluctuating housing prices. The purpose of this study is to create a housing price prediction model to prevent such risks and induce reasonable real estate purchases. This study made many attempts to understand real estate instability and to create an appropriate housing price prediction model. It predicted and validated housing prices using LSTM, a type of deep learning technique. LSTM is a network that adds a cell state, which acts as a conveyor belt, to the hidden state of the existing RNN, and in which the cell state and hidden state are computed recursively. The real sale prices of apartments in autonomous districts from January 2006 to December 2019 were collected through the Ministry of Land, Infrastructure, and Transport's real sale price open system, and basic apartment and commercial district information was collected through the Public Data Portal and the Seoul Metropolitan City Data. The collected real sale price data were scaled based on the monthly average sale price, and a total of 168 data points were organized by preprocessing the data by address. To predict prices, the LSTM was implemented with a training period of 29 months (April 2015 to August 2017), a validation period of 13 months (September 2017 to September 2018), and a test period of 13 months (December 2018 to December 2019), following the time series data set. The results of predicting 'prices' are as follows. First, this study obtained 76 percent prediction similarity. We set out to design a prediction model of real estate transaction prices with the LSTM model based on AI and big data. The final prediction model was built from the collected time series data and showed that a model with 76 percent prediction similarity can be made. This validated that predicting the rate of return through the LSTM method can gain reliability.
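The period lengths quoted in this abstract can be checked with a short sketch that enumerates months. The helper name `month_range` is hypothetical; only the start and end dates come from the abstract.

```python
def month_range(start, end):
    """Return (year, month) tuples from start to end, both inclusive."""
    year, month = start
    months = []
    while (year, month) <= end:
        months.append((year, month))
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months

# Dates as stated in the abstract.
all_months = month_range((2006, 1), (2019, 12))
train = month_range((2015, 4), (2017, 8))
validation = month_range((2017, 9), (2018, 9))
test = month_range((2018, 12), (2019, 12))

# Months that fall between the end of validation and the start of test.
gap = month_range((2018, 10), (2018, 11))
```

The counts agree with the abstract: 168 monthly points overall, 29 training months, and 13 months each for validation and test. As stated, the periods leave October and November 2018 unassigned.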

Korean Machine Reading Comprehension for Patent Consultation Using BERT (BERT를 이용한 한국어 특허상담 기계독해)

  • Min, Jae-Ok;Park, Jin-Woo;Jo, Yu-Jeong;Lee, Bong-Gun
    • KIPS Transactions on Software and Data Engineering / v.9 no.4 / pp.145-152 / 2020
  • Machine reading comprehension (MRC) is an AI NLP task that predicts the answer to a user's query by understanding the relevant document, and it can be used in automated consultation services such as chatbots. The BERT (Bidirectional Encoder Representations from Transformers) model, which has recently shown high performance in various fields of natural language processing, has two phases. The first phase is pre-training on the big data of each domain. The second phase is fine-tuning the model to solve each NLP task as a prediction problem. In this paper, we build a Patent MRC dataset and show how to construct patent consultation training data for the MRC task. We also propose a method to improve MRC performance using a Patent-BERT model pre-trained on the patent consultation corpus, together with a language processing algorithm suited to machine learning on patent counseling data. Experimental results show that the proposed method improves performance in answering patent counseling queries.

A Network Packet Analysis Method to Discover Malicious Activities

  • Kwon, Taewoong;Myung, Joonwoo;Lee, Jun;Kim, Kyu-il;Song, Jungsuk
    • Journal of Information Science Theory and Practice / v.10 no.spc / pp.143-153 / 2022
  • With the development of networks and the growing number of network devices, the number of cyber-attacks targeting them is also increasing. Since these cyber-attacks aim to steal important information and destroy systems, it is necessary to minimize social and economic damage through early detection and rapid response. Many studies using machine learning (ML) and artificial intelligence (AI) have been conducted, among which payload learning is one of the most intuitive and effective methods for detecting malicious behavior. In this study, we propose a preprocessing method to maximize model performance when learning the payload in term units. The proposed method constructs a high-quality training data set by eliminating unnecessary noise (stopwords) and preserving important features, taking into account the machine language and natural language characteristics of the packet payload. Our method consists of three steps: preserving significant special characters, generating a stopword list, and refining class labels. By processing packets of varied and complex structure with these three steps, it is possible to produce high-quality training data that helps build high-performance ML/AI models for security monitoring. We demonstrate the effectiveness of the proposed method by comparing the performance of AI models with and without it. Furthermore, by evaluating an AI model that uses the proposed method in a real-world Security Operating Center (SOC) environment with live network traffic, we demonstrate the applicability of our method to real environments.
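The first two steps named in this abstract (preserving special characters and generating a stopword list) might look roughly like the sketch below. Everything specific here is an assumption: the set of preserved characters, the rule that a term appearing in every payload is a stopword, and the sample payloads are invented for illustration and are not taken from the paper. The third step, class label refinement, is not shown.

```python
import re
from collections import Counter

# Assumption: these characters are treated as significant and kept as tokens.
KEPT_SPECIALS = "/=?&'<>"
SPECIAL_SET = set(KEPT_SPECIALS)

def tokenize(payload):
    """Split a payload into word terms and preserved special characters."""
    pattern = r"[A-Za-z0-9_]+|[" + re.escape(KEPT_SPECIALS) + r"]"
    return re.findall(pattern, payload)

def build_stopwords(payloads, min_doc_ratio=1.0):
    """Treat word terms present in at least min_doc_ratio of payloads as stopwords."""
    doc_freq = Counter()
    for payload in payloads:
        # Count each word term once per payload; special characters are never stopwords.
        doc_freq.update({t for t in tokenize(payload) if t not in SPECIAL_SET})
    threshold = min_doc_ratio * len(payloads)
    return {term for term, n in doc_freq.items() if n >= threshold}

def preprocess(payload, stopwords):
    """Tokenize a payload and drop stopword terms."""
    return [t for t in tokenize(payload) if t not in stopwords]

# Hypothetical payloads, invented for illustration only.
samples = [
    "GET /index.php?id=1 HTTP",
    "GET /login.php?user=admin'-- HTTP",
]
stopwords = build_stopwords(samples)
cleaned = preprocess(samples[1], stopwords)
```

In this toy case the terms shared by both payloads (`GET`, `php`, `HTTP`) are dropped, while the quote character that often matters for detecting injection attempts is kept.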

Development of Prediction Model for Nitrogen Oxides Emission Using Artificial Intelligence (인공지능 기반 질소산화물 배출량 예측을 위한 연구모형 개발)

  • Jo, Ha-Nui;Park, Jisu;Yun, Yongju
    • Korean Chemical Engineering Research / v.58 no.4 / pp.588-595 / 2020
  • Prediction and control of nitrogen oxides (NOx) emission is of great interest in industry due to stricter environmental regulations. Herein, we propose an artificial intelligence (AI)-based framework for prediction of NOx emission. The framework includes pre-processing of data for training of neural networks and evaluation of the AI-based models. In this work, Long Short-Term Memory (LSTM), a type of recurrent neural network, was adopted to reflect the time series characteristics of NOx emissions. A decision tree was used to determine the time window of the LSTM prior to training of the network. The neural network was trained with operational data from a heating furnace. The optimal model was obtained by optimizing hyper-parameters. The LSTM model provided a reliable prediction of NOx emission for both training and test data, showing an accuracy of 93% or more. The application of the proposed AI-based framework will provide new opportunities for predicting the emission of various air pollutants with time series characteristics.
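A time window for a recurrent model is commonly applied by slicing the series into fixed-length input runs, each paired with the value that follows. The sketch below shows that general idea only; the function name `make_windows`, the window length, and the readings are hypothetical, and the paper's decision-tree procedure for choosing the window is not reproduced.

```python
def make_windows(series, window):
    """Pair each run of `window` consecutive values with the value that follows it."""
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets

# Hypothetical NOx readings, invented for illustration only.
nox = [10.0, 12.0, 11.0, 13.0, 14.0]
X, y = make_windows(nox, window=3)
```

With five readings and a window of three, this yields two training pairs; a longer window gives the model more history per sample but fewer samples.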

Anomaly Detection via Pattern Dictionary Method and Atypicality in Application (패턴사전과 비정형성을 통한 이상치 탐지방법 적용)

  • Sehong Oh;Jongsung Park;Youngsam Yoon
    • Journal of Sensor Science and Technology / v.32 no.6 / pp.481-486 / 2023
  • Anomaly detection holds paramount significance across diverse fields, encompassing fraud detection, risk mitigation, and sensor evaluation tests. Its pertinence extends notably to the military, particularly within the Warrior Platform, a comprehensive combat equipment system with wearable sensors. Hence, we propose a data-compression-based anomaly detection approach tailored to unlabeled time series and sequence data. This method entailed the construction of two distinctive features, typicality and atypicality, to discern anomalies effectively. The typicality of a test sequence was determined by evaluating the compression efficacy achieved through the pattern dictionary. This dictionary was established based on the frequency of all patterns identified in a training sequence generated for each sensor within Warrior Platform. The resulting typicality served as an anomaly score, facilitating the identification of anomalous data using a predetermined threshold. To improve the performance of the pattern dictionary method, we leveraged atypicality to discern sequences that could undergo compression independently without relying on the pattern dictionary. Consequently, our refined approach integrated both typicality and atypicality, augmenting the effectiveness of the pattern dictionary method. Our proposed method exhibited heightened capability in detecting a spectrum of unpredictable anomalies, fortifying the stability of wearable sensors prevalent in military equipment, including the Army TIGER 4.0 system.
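The idea of scoring a test sequence by how well a pattern dictionary compresses it can be sketched loosely as follows. This is not the authors' algorithm: the greedy longest-match parsing, the code length of minus log2 of relative frequency, the flat penalty `unseen_bits`, and the use of character strings as already-discretized sensor data are all assumptions chosen to keep the example small.

```python
import math
from collections import Counter

def build_dictionary(train, max_len=3):
    """Count every substring of length 1..max_len in the training sequence."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(train) - n + 1):
            counts[train[i:i + n]] += 1
    return counts

def bits_per_symbol(seq, counts, max_len=3, unseen_bits=16.0):
    """Greedily cover seq with the longest known patterns; return average code length."""
    total = sum(counts.values())
    bits, i = 0.0, 0
    while i < len(seq):
        for n in range(min(max_len, len(seq) - i), 0, -1):
            pattern = seq[i:i + n]
            if pattern in counts:
                # Frequent patterns get short codes, rare ones long codes.
                bits += -math.log2(counts[pattern] / total)
                i += n
                break
        else:
            # Assumed flat penalty for a symbol never seen in training.
            bits += unseen_bits
            i += 1
    return bits / len(seq)

# Hypothetical discretized sensor sequences, invented for illustration only.
counts = build_dictionary("abcabcabcabc")
normal_score = bits_per_symbol("abcabc", counts)
anomalous_score = bits_per_symbol("abxxbc", counts)
```

In this sketch a higher value means the sequence compresses worse with the dictionary, so a sequence would be flagged when its score exceeds a chosen threshold.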

A Vision Transformer Based Recommender System Using Side Information (부가 정보를 활용한 비전 트랜스포머 기반의 추천시스템)

  • Kwon, Yujin;Choi, Minseok;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.28 no.3 / pp.119-137 / 2022
  • Recent recommendation system studies apply various deep learning models to better represent user and item interactions. One noteworthy study is ONCF (Outer product-based Neural Collaborative Filtering), which builds a two-dimensional interaction map via an outer product and employs a CNN (Convolutional Neural Network) to learn high-order correlations from the map. However, ONCF has limitations in recommendation performance due to problems with the CNN and the absence of side information. ONCF using a CNN has an inductive bias problem that causes poor performance on data with a distribution that does not appear in the training data. This paper proposes employing a Vision Transformer (ViT) instead of the vanilla CNN used in ONCF, because ViT has shown better results than state-of-the-art CNNs in many image classification cases. In addition, we propose a new architecture to reflect side information that ONCF did not consider. Unlike previous studies that reflect side information in a neural network using simple input combination methods, this study uses an independent auxiliary classifier to reflect side information more effectively in the recommender system. ONCF used a single latent vector each for the user and the item, but in this study, a channel is constructed using multiple vectors to enable the model to learn more diverse representations and to obtain an ensemble effect. The experiments showed that our deep learning model improved recommendation performance compared to ONCF.
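The two-dimensional interaction map described in this abstract is an outer product of a user vector and an item vector. The sketch below shows only that operation in plain Python; the function name `interaction_map` and the vector values are hypothetical, and the CNN or ViT that would consume the map is not shown.

```python
def interaction_map(user_vec, item_vec):
    """Return the outer product: entry [i][j] is user_vec[i] * item_vec[j]."""
    return [[u * v for v in item_vec] for u in user_vec]

# Hypothetical latent vectors, invented for illustration only.
user = [1.0, 2.0]
item = [3.0, 4.0, 5.0]
emap = interaction_map(user, item)
```

Each cell pairs one user dimension with one item dimension, which is what lets an image-style model look for correlations across dimension pairs.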

A Study on Impacts of De-identification on Machine Learning's Biased Knowledge (머신러닝 편향성 관점에서 비식별화의 영향분석에 대한 연구)

  • Soohyeon Ha;Jinsong Kim;Yeeun Son;Gaeun Won;Yujin Choi;Soyeon Park;Hyung-Jong Kim;Eunsung Kang
    • Journal of the Korea Society for Simulation / v.33 no.2 / pp.27-35 / 2024
  • We aimed to shed light on how societal disparities are perpetuated by analyzing the impact that inherent biases in datasets used to train artificial intelligence (AI) models have on the models' predictions. To examine the influence of data bias on AI models, we constructed an original dataset containing biases related to gender wage gaps and subsequently created a de-identified dataset. Using the decision tree algorithm, we compared the outputs of AI models trained on the original and de-identified datasets to analyze how data de-identification affects bias in the models' results. Through this, our goal was to highlight the significant role of data de-identification not only in safeguarding individual privacy but also in addressing biases within the data.