• Title/Summary/Keyword: 인공지능 학습용 데이터 (AI training data)


A Study on the Complementary Method of Aerial Image Learning Dataset Using Cycle Generative Adversarial Network (CycleGAN을 활용한 항공영상 학습 데이터 셋 보완 기법에 관한 연구)

  • Choi, Hyeoung Wook;Lee, Seung Hyeon;Kim, Hyeong Hun;Suh, Yong Cheol
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.38 no.6 / pp.499-509 / 2020
  • This study explores how to build training data for AI-based object classification. Such data has recently been studied in image classification and has considerable potential for use. Recognizing and extracting objects with reasonable accuracy using artificial intelligence requires a large amount of training data. However, there are currently not enough shareable datasets for object recognition, and generating data requires long working hours, high cost, and considerable labor. This study therefore fed a small amount of initial aerial image training data to a GAN (Generative Adversarial Network)-based generator network to produce image training data, and evaluated the quality of the generated images to determine whether they can serve as additional training data. Oversampling training data with a GAN can supplement the amount of training data, which has a crucial influence on deep learning. The method is therefore expected to be particularly effective when the initial dataset is insufficient.

A development of App to gather data for machine learning on Korean language writing recognition (한글 필기 인식을 위한 기계학습 용 데이터 수집 앱 개발)

  • Bae, Junwoo;Shim, Hyundo;Kim, Sungsuk;Sung, Mi-Young
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.753-754 / 2018
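  • As interest in artificial intelligence has grown and related research has become active, attempts to apply it to existing research fields are increasing. Our team also intends to apply machine learning to recognize Hangul handwriting; as an initial step, this study developed an Android app for collecting users' handwriting data. Because the final target users are young children beginning to learn Hangul, the Activities of the learning app were designed accordingly. In addition to the entered Hangul data, data is collected separately for the initial consonant (choseong), medial vowel (jungseong), and final consonant (jongseong) of each character so that it can be used later. That is, data generated during the learning process is stored both as images and as events for use in the final study.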

Efficient Data Design Approaches for Object Detection in CCTV (CCTV 환경에서의 Object Detection 을 위한 효율적인 데이터 설계 방안 연구)

  • Hwa-Yong Jeong;Jeong-Hyun Choi;Sang-Min Lee
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.615-618 / 2023
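  • Computer vision technology has been advancing rapidly, but some industries have not kept pace because of difficulties in industrial application and the characteristics of their data. In particular, CCTV is mostly operated outdoors, so the data is subject to varied environmental changes and, by nature, contains much noise; the resulting large data variance makes field application difficult. This paper proposes a data design approach for training object detectors that are robust to CCTV operating environments, taking the characteristics of CCTV data into account. The proposed approach comprises a method for guiding sampling that is effective for object detection from large volumes of CCTV video, and a method for improving generalization by jointly training on a small amount of labeled CCTV data mixed with multiple open labeled datasets such as MS COCO. Numerous experiments demonstrated the superiority of the proposed approach, showing in particular a 13.39% improvement in mAP.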

Study on the development of automatic translation service system for Korean astronomical classics by artificial intelligence - Focused on system analysis and design step (천문 고문헌 특화 인공지능 자동번역 서비스 시스템 개발 연구 - 시스템 요구사항 분석 및 설계 위주)

  • Seo, Yoon Kyung;Kim, Sang Hyuk;Ahn, Young Sook;Choi, Go-Eun;Choi, Young Sil;Baik, Hangi;Sun, Bo Min;Kim, Hyun Jin;Lee, Sahng Woon
    • The Bulletin of The Korean Astronomical Society / v.44 no.2 / pp.62.2-62.2 / 2019
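  • Korea holds many historical astronomical records spanning the Three Kingdoms period through the modern Joseon era, a documentary heritage that is rare worldwide, but few of these Classical Chinese texts have been translated, so their scholarly use remains limited. Translating the Classical Chinese sentences of old documents depends on manual work by specialists and takes a long time, so the return on investment is low. Artificial intelligence, which has recently been applied in many fields, offers an alternative, and an automatic translator can be a useful scholarly tool even if it produces only draft-level translations. The Korea Astronomy and Space Science Institute is participating jointly with the Institute for the Translation of Korean Classics in the 2019 Information and Communication Technology-based public service promotion project led by the National Information Society Agency, aiming to develop an automatic translation model for old documents that applies artificial neural network machine learning. The goal of this study is to develop an automatic translation model using AI machine learning techniques specialized for the historical astronomy domain and to offer it as a service. The work is divided into four development tasks. First, building the 'corpus' that serves as the AI training data: pairing the original Chinese-character text of old documents with its Korean translation to extract at least 60,000 items of data optimized for training. Second, applying the extracted training corpus to various AI machine learning techniques to generate an automatic translation model specialized for the astronomy domain of specialized classics. Third, building a cloud-based, institution-facing service platform on which participating institutions can derive and use domain-specialized models, based on the automatic translation model, for the old documents they hold. Fourth, building a cloud-based automatic translation service delivered via the web and a mobile messenger to open the developed translator to the public. Based on the analysis and definition of system requirements, the design is in progress or partly complete, and implementation is under way. Performance evaluation will be divided into evaluation of the automatic translation model and testing of the application system. The quality of the translation model can be measured numerically through automatic evaluation on a test set and human evaluation by experts. Application system testing will apply the stage-by-stage tests of the software development methodology. This study is significant in that historical astronomy is the first pilot case for a platform to spread AI automatic translation. That is, it is the first academic field to secure, with a relatively small initial investment enabled by a cloud-based system, a highly usable research infrastructure in the form of an automatic translator for Classical Chinese sentences. Scholarly activity in historical astronomy that makes use of it is expected to become more active.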


Object Recognition Using Convolutional Neural Network in military CCTV (합성곱 신경망을 활용한 군사용 CCTV 객체 인식)

  • Ahn, Jin Woo;Kim, Dohyung;Kim, Jaeoh
    • Journal of the Korea Society for Simulation / v.31 no.2 / pp.11-20 / 2022
  • There is a critical need for AI assistance in guard operations of Army base perimeters, which is exacerbated by changes in the national defense and security environment such as force reduction. In addition, the possibility for human error inherent to perimeter guard operations attests to the need for an innovative revamp of current systems. The purpose of this study is to propose a real-time object detection AI tailored to military CCTV surveillance with three unique characteristics. First, training data suitable for situations in which relatively small objects must be recognized is used due to the characteristics of military CCTV. Second, we utilize a data augmentation algorithm suited for military context applied in the data preparation step. Third, a noise reduction algorithm is applied to account for military-specific situations, such as camouflaged targets and unfavorable weather conditions. The proposed system has been field-tested in a real-world setting, and its performance has been verified.

AI Model-Based Automated Data Cleaning for Reliable Autonomous Driving Image Datasets (자율주행 영상데이터의 신뢰도 향상을 위한 AI모델 기반 데이터 자동 정제)

  • Kana Kim;Hakil Kim
    • Journal of Broadcast Engineering / v.28 no.3 / pp.302-313 / 2023
  • This paper aims to develop a framework that can fully automate the quality management of training data used in large-scale Artificial Intelligence (AI) models built by the Ministry of Science and ICT (MSIT) in the 'AI Hub Data Dam' project, which has invested more than 1 trillion won since 2017. Autonomous driving technology using AI has achieved excellent performance through many studies, but it requires a large amount of high-quality data to train the model. Moreover, it is still difficult for humans to directly inspect the processed data and prove it is valid, and a model trained with erroneous data can cause fatal problems in real life. This paper presents a dataset reconstruction framework that removes abnormal data from the constructed dataset and introduces strategies to improve the performance of AI models by reconstructing them into a reliable dataset to increase the efficiency of model training. The framework's validity was verified through an experiment on the autonomous driving dataset published through the AI Hub of the National Information Society Agency (NIA). As a result, it was confirmed that it could be rebuilt as a reliable dataset from which abnormal data has been removed.

Building-up and Feasibility Study of Image Dataset of Field Construction Equipments for AI Training (인공지능 학습용 토공 건설장비 영상 데이터셋 구축 및 타당성 검토)

  • Na, Jong Ho;Shin, Hyu Soun;Lee, Jae Kang;Yun, Il Dong
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.1 / pp.99-107 / 2023
  • Recently, construction sites have had the highest rate of fatalities and safety accidents of any industry. Applying artificial intelligence technology to construction sites requires a dataset that can serve as basic training data. In this paper, a large number of images were collected from actual construction sites, and the major construction equipment objects operated mainly at civil engineering sites were defined. The training dataset was completed by annotating about 90,000 images. Its reliability was verified by an mAP of over 90% with YOLO, a representative object detection model. The construction equipment training dataset built in this study has been released and is currently available on the public data portal of the Ministry of Public Administration and Security. The dataset is expected to be freely usable for applications of object detection technology at construction sites, especially in the field of construction safety.

Error Analysis of Recent Conversational Agent-based Commercialization Education Platform (최신 대화형 에이전트 기반 상용화 교육 플랫폼 오류 분석)

  • Lee, Seungjun;Park, Chanjun;Seo, Jaehyung;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.13 no.3 / pp.11-22 / 2022
  • Research and development using various artificial intelligence (AI) technologies is under way in the field of education. Within AI in Education (AIEd), conversational agents are not limited by time or space, and combining them with AI technologies such as voice recognition and translation enables more effective learning. This paper conducted a trend analysis of commercial applications that have a large number of users and use conversational agents for English learning. The trend analysis shows that currently commercialized educational platforms using conversational agents have several limitations and problems. To analyze them specifically, a comparative experiment was conducted with a recent large-capacity pre-trained dialogue model, and Sensibleness and Specificity Average (SSA) human evaluation was used to assess conversational human-likeness. Based on the experiment, this paper proposes that effective English conversation learning requires dialogue models trained with a large number of parameters, educational data, and information retrieval functions.

Fruit price prediction study using artificial intelligence (인공지능을 이용한 과일 가격 예측 모델 연구)

  • Im, Jin-mo;Kim, Weol-Youg;Byoun, Woo-Jin;Shin, Seung-Jung
    • The Journal of the Convergence on Culture Technology / v.4 no.2 / pp.197-204 / 2018
  • AI is one of the hottest issues of the 21st century. Just as the Industrial Revolution automated manual labor in agricultural society, the software revolution has turned the information society into an intelligent information society. With the advent of Google's 'Alpha Go', computers learn and predict through machine learning, and the time has come for computers to surpass humans even in Baduk (Go). Machine learning (ML) is a field of artificial intelligence in which technology is developed to allow computers to learn by themselves. Many companies use machine learning: Facebook, for example, keeps learning from images and then tells users who is in them. A neural network was also used to build an efficient energy usage model for Google's data center optimization. As another example, Microsoft's real-time interpretation model becomes more sophisticated as language-related input data increases through translation learning. As machine learning is used in more and more fields, we must enter the AI industry to move forward in 21st century society.

A Study on Recognition of Artificial Intelligence Utilizing Big Data Analysis (빅데이터 분석을 활용한 인공지능 인식에 관한 연구)

  • Nam, Soo-Tai;Kim, Do-Goan;Jin, Chan-Yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.05a / pp.129-130 / 2018
  • Big data analysis is a technique for effectively analyzing unstructured data, such as data from the Internet, social network services, web documents generated in mobile environments, e-mail, and social data, as well as well-formed structured data in databases. Most big data analysis techniques, such as data mining, machine learning, natural language processing, and pattern recognition, come from existing statistics and computer science. Global research institutes have identified big data analysis as the most noteworthy new technology since 2011, and companies in most industries are therefore working to create new value by applying big data. In this study, we used Social Matrics, a big data analysis tool of Daum Communications, to analyze public perceptions of the keyword "Artificial Intelligence" for a one-month period as of May 19, 2018. The results are as follows. First, the top related search keyword for "Artificial Intelligence" was technology (4,122). This study suggests theoretical implications based on the results.
