• Title/Abstract/Keywords: AI training data

Search results: 261 items; processing time: 0.025 s

A Survey of Applications of Artificial Intelligence Algorithms in Eco-environmental Modelling

  • Kim, Kang-Suk;Park, Joon-Hong
    • Environmental Engineering Research
    • /
    • Vol. 14, No. 2
    • /
    • pp.102-110
    • /
    • 2009
  • The application of artificial intelligence (AI) approaches in eco-environmental modeling has gradually increased over the last decade. A comprehensive understanding and evaluation of the applicability of this approach to eco-environmental modeling are needed. In this study, we reviewed previous studies that used AI techniques in eco-environmental modeling. Decision Tree (DT) and Artificial Neural Network (ANN) were found to be the major AI algorithms preferred by researchers in ecological and environmental modeling. When the effect of training data size on model prediction accuracy was explored using data from the previous studies, prediction accuracy and training data size showed a nonlinear correlation, which was best described by a hyperbolic saturation function among the tested nonlinear functions, including power and logarithmic functions. The hyperbolic saturation equations are proposed as a guideline for optimizing the size of the training data set, which is critically important in designing the field experiments required for training AI-based eco-environmental models.
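
The abstract above does not give the fitted equation, so the following is only a minimal sketch of how a hyperbolic saturation curve could be fitted to (training size, accuracy) pairs. It assumes the common form acc = a * n / (b + n), where a is the accuracy plateau and b is the training size at which half the plateau is reached, and it fits the linearized form 1/acc = 1/a + (b/a) * (1/n) by ordinary least squares. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
def fit_hyperbolic_saturation(sizes, accuracies):
    """Fit acc = a * n / (b + n) using the linearized form
    1/acc = 1/a + (b/a) * (1/n) and ordinary least squares."""
    xs = [1.0 / n for n in sizes]
    ys = [1.0 / acc for acc in accuracies]
    count = len(xs)
    x_mean = sum(xs) / count
    y_mean = sum(ys) / count
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # estimates b / a
    intercept = y_mean - slope * x_mean    # estimates 1 / a
    a = 1.0 / intercept
    b = slope * a
    return a, b


def predict_accuracy(n, a, b):
    """Accuracy predicted by the saturation curve for training size n."""
    return a * n / (b + n)


def size_for_fraction(fraction, b):
    """Training size at which accuracy reaches `fraction` of the plateau a.
    Solving a*n/(b+n) = fraction*a gives n = fraction*b / (1 - fraction)."""
    return fraction * b / (1.0 - fraction)


if __name__ == "__main__":
    # Hypothetical observations, for illustration only.
    sizes = [10, 25, 50, 100, 200, 400]
    accuracies = [0.16, 0.31, 0.44, 0.61, 0.71, 0.80]
    a, b = fit_hyperbolic_saturation(sizes, accuracies)
    print("plateau a:", round(a, 3), "half-saturation size b:", round(b, 1))
    print("size for 90% of plateau:", round(size_for_fraction(0.9, b)))
```

The linearized fit keeps the sketch dependency-free, but it weights small training sizes heavily; a nonlinear least-squares fit would be the usual choice for real data.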

A Study on Designing Metadata Standard for Building AI Training Dataset of Landmark Images

  • 김진묵
    • 한국문헌정보학회지
    • /
    • Vol. 54, No. 2
    • /
    • pp.419-434
    • /
    • 2020
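  • The purpose of this study is to propose a design for a metadata standard for building AI training data of landmark images. To this end, we comprehensively surveyed and analyzed the state of the art in image retrieval systems and their respective indexing methods, and examined the public training datasets essential for landmark recognition with AI machine learning as well as machine learning tools for image object recognition. On this basis, we selected metadata elements optimized for landmark image AI training data and defined the input data for each element. The conclusions and suggestions discuss approaches to developing application services, including recommendation systems that use landmark recognition.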

AI Model-Based Automated Data Cleaning for Reliable Autonomous Driving Image Datasets

  • 김가나;김학일
    • 방송공학회논문지
    • /
    • Vol. 28, No. 3
    • /
    • pp.302-313
    • /
    • 2023
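  • This study aims to develop a framework that can automate quality control of the AI model training data built under the 'AI Hub Dam' project, in which the Ministry of Science and ICT has invested more than 1 trillion won since 2017. Training the AI models used in autonomous driving development requires large amounts of high-quality data. It remains difficult for reviewers to inspect processed data for anomalies in the data itself and to prove its validity, and models trained on erroneous data can cause serious problems in real-world situations. This paper introduces a strategy for improving model recognition performance through a reliable dataset cleaning framework that removes anomalous data. The proposed method is designed around the indicators in the quality management guidelines for AI training data. Experiments on an autonomous driving dataset released through AI Hub of the National Information Society Agency demonstrated the effectiveness of the framework and confirmed that the dataset can be rebuilt as a reliable dataset with the anomalous data removed.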

A Study on Student Satisfaction according to Likert Scale in Big Data Training

  • 최현
    • 한국산업융합학회 논문집
    • /
    • Vol. 22, No. 6
    • /
    • pp.775-783
    • /
    • 2019
  • The big data industry market continues to grow and is expected to grow further. In this paper, college students in a big data youth talent development program were surveyed on a five-point Likert scale about their satisfaction with instructors in big data training, their expectations for improving job (education) skills based on AI convergence, and their intention to participate in the training program for young talent. Male students were more satisfied than female students. By semester, students below the 6th semester were the most satisfied, while students in the 7th and 8th semesters were less satisfied. By department, science and statistics students were the most satisfied, while students from other departments reported low satisfaction. By grade point average, students with averages of 3.5 to 4.0 were the most satisfied, and students below 3.0 were the least satisfied.

Object Detection Accuracy Improvements of Mobility Equipments through Substitution Augmentation of Similar Objects

  • 허지성;박지훈
    • 한국군사과학기술학회지
    • /
    • Vol. 25, No. 3
    • /
    • pp.300-310
    • /
    • 2022
  • A vast amount of labeled data is required for deep neural network training. A typical strategy for improving the performance of a neural network on a given training data set is data augmentation. The goal of this work is to offer a novel image augmentation method for improving object detection accuracy. An object in an image is removed, and a similar object from the training data set is placed in its area. An in-painting algorithm fills the space that was removed but is not covered by the similar object. Our technique shows improvements of up to 2.32 percent in mAP in our tests on a military vehicle dataset using the YOLOv4 object detector.
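
The three steps in the abstract above (remove the object, place a similar object in its area, in-paint whatever remains uncovered) can be sketched on a toy grayscale image. This is an illustrative assumption-laden sketch, not the authors' implementation: the function name, the (top, left, height, width) box format, and the centered placement are hypothetical, and the mean of the pixels bordering the box is used as a crude stand-in for a real in-painting algorithm.

```python
def substitute_object(image, box, donor_patch):
    """Replace the object inside `box` with `donor_patch`.

    image: 2D list of grayscale values.
    box: (top, left, height, width) of the object to remove.
    donor_patch: 2D list cut from a similar object, no larger than the box.
    """
    top, left, height, width = box
    out = [row[:] for row in image]  # copy so the input image is not modified

    # Stand-in for in-painting: average the pixels that border the box.
    border = []
    for r in range(top - 1, top + height + 1):
        for c in range(left - 1, left + width + 1):
            inside = top <= r < top + height and left <= c < left + width
            in_image = 0 <= r < len(image) and 0 <= c < len(image[0])
            if in_image and not inside:
                border.append(image[r][c])
    fill = sum(border) / len(border) if border else 0.0

    # Step 1: remove the original object by filling its whole box.
    for r in range(top, top + height):
        for c in range(left, left + width):
            out[r][c] = fill

    # Step 2: paste the similar object, centered inside the box.
    patch_h, patch_w = len(donor_patch), len(donor_patch[0])
    off_r = top + (height - patch_h) // 2
    off_c = left + (width - patch_w) // 2
    for r in range(patch_h):
        for c in range(patch_w):
            out[off_r + r][off_c + c] = donor_patch[r][c]

    # Step 3 is implicit: box pixels not covered by the patch keep `fill`.
    return out


if __name__ == "__main__":
    image = [[0] * 5 for _ in range(5)]
    for r in range(1, 4):
        for c in range(1, 4):
            image[r][c] = 9  # the object to be replaced
    augmented = substitute_object(image, (1, 1, 3, 3), [[5, 5], [5, 5]])
    for row in augmented:
        print(row)
```

In a detection setting the bounding-box label would also need to be updated to the pasted object's extent and class, which this sketch leaves out.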

A Study on the Land Cover Classification and Cross Validation of AI-based Aerial Photograph

  • Lee, Seong-Hyeok;Myeong, Soojeong;Yoon, Donghyeon;Lee, Moung-Jin
    • 대한원격탐사학회지
    • /
    • Vol. 38, No. 4
    • /
    • pp.395-409
    • /
    • 2022
  • The purpose of this study is to evaluate classification performance and applicability when land cover datasets constructed for AI training are cross-validated against other areas. Gyeongsang-do and Jeolla-do in South Korea were selected as the cross-validation areas, and training datasets were obtained from AI-Hub. The datasets for each region were applied to U-Net, a semantic segmentation algorithm, and accuracy was evaluated on test areas in the same region and in the other region. Overall classification accuracy differed by about 13-15% between the same and other areas. For rice fields, fields, and buildings, accuracy was higher in the Jeolla-do test areas. For roads, accuracy was higher in the Gyeongsang-do test areas. In terms of accuracy by weights, applying the Gyeongsang-do weights gave high accuracy for forests, while applying the Jeolla-do weights gave high accuracy for dry fields. The land cover classification results show that the classification performance of existing datasets differs by area. When constructing land cover maps for AI training, higher-quality datasets are expected if the characteristics of various areas are reflected. This study is highly scalable from two perspectives. First, it applies satellite images to AI research and to the field of land cover. Second, because it builds on satellite images, it can be extended to large-scale areas and to areas that are difficult to access.
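
The same-area versus other-area comparison described above rests on overall and per-class accuracy. As a rough sketch of how those two figures are computed from per-pixel labels, the code below uses flat lists of class names; the function names and the tiny label lists are hypothetical and do not come from the study.

```python
def overall_accuracy(predicted, reference):
    """Share of pixels whose predicted class matches the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    matches = sum(1 for p, r in zip(predicted, reference) if p == r)
    return matches / len(reference)


def per_class_accuracy(predicted, reference):
    """For each reference class, the share of its pixels predicted correctly."""
    totals = {}
    correct = {}
    for p, r in zip(predicted, reference):
        totals[r] = totals.get(r, 0) + 1
        if p == r:
            correct[r] = correct.get(r, 0) + 1
    return {cls: correct.get(cls, 0) / totals[cls] for cls in totals}


if __name__ == "__main__":
    # Hypothetical per-pixel labels for one test tile, for illustration only.
    reference = ["forest", "forest", "road", "rice field", "building", "road"]
    same_area = ["forest", "forest", "road", "rice field", "building", "forest"]
    other_area = ["forest", "road", "road", "building", "building", "forest"]

    acc_same = overall_accuracy(same_area, reference)
    acc_other = overall_accuracy(other_area, reference)
    print("same-area accuracy:", acc_same)
    print("other-area accuracy:", acc_other)
    print("gap:", acc_same - acc_other)
    print("per-class (other area):", per_class_accuracy(other_area, reference))
```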

Development of Dataset Items for Commercial Space Design Applying AI

  • Jung Hwa SEO;Segeun CHUN;Ki-Pyeong, KIM
    • 한국인공지능학회지
    • /
    • Vol. 11, No. 1
    • /
    • pp.25-29
    • /
    • 2023
  • The purpose of this paper is to create a standard for AI training dataset types for commercial space design. As the space design market continues to grow and people spend more time indoors after COVID-19, interest in space is expanding throughout society. In addition, more and more consumers are getting used to the digital environment. Therefore, identifying trends and preemptively proposing the atmosphere and specifications that customers require, quickly and easily, can increase customer trust and make sales more effective. For the dataset type, commercial districts were divided into a total of 8 categories, and processable images were derived by refining 4,009,30MB JPG format images collected through web crawling. Then, through bounding and labeling operations, we developed a 'Dataset for AI Training' of 3,356 commercial space image records in CSV format with a size of 2.08MB. Through this study, elements of spatial images such as place type, space classification, and furniture can be extracted and used when developing AI algorithms, and images requested by clients are expected to be collected easily and quickly from spatial image input information.

Necessity of AI Literacy Education to Enhance for the Effectiveness of AI Education

  • 양석재;신승기
    • 한국정보교육학회:학술대회논문집
    • /
    • 한국정보교육학회 2021 Proceedings
    • /
    • pp.295-301
    • /
    • 2021
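  • Ahead of the next revision of the national curriculum, this study examined the need for AI literacy education to increase the effectiveness of AI education. To this end, we conducted an AI modeling class for high school students and surveyed their perceptions of the need for AI literacy in AI education, its content, and the appropriate timing of instruction. Students generally agreed on the need for data utilization and data preprocessing in AI classes, and many struggled during the classes because they lacked basic competence in using databases. In particular, they showed a poor understanding of file structures for data analysis and of data storage formats for data analysis. To overcome this, students recognized the need for prior education in data processing, and most felt that the appropriate time for it was before entering high school. As for the content elements of AI literacy, demand was high for data creation and deletion, data transformation, and data visualization.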

Criteria for implementing artificial intelligence systems in reproductive medicine

  • Enric Guell
    • Clinical and Experimental Reproductive Medicine
    • /
    • Vol. 51, No. 1
    • /
    • pp.1-12
    • /
    • 2024
  • This review article discusses the integration of artificial intelligence (AI) in assisted reproductive technology and provides key concepts to consider when introducing AI systems into reproductive medicine practices. The article highlights the various applications of AI in reproductive medicine and discusses whether to use commercial or in-house AI systems. This review also provides criteria for implementing new AI systems in the laboratory and discusses the factors that should be considered when introducing AI in the laboratory, including the user interface, scalability, training, support, follow-up, cost, ethics, and data quality. The article emphasises the importance of ethical considerations, data quality, and continuous algorithm updates to ensure the accuracy and safety of AI systems.

Generating and Validating Synthetic Training Data for Predicting Bankruptcy of Individual Businesses

  • Hong, Dong-Suk;Baik, Cheol
    • Journal of information and communication convergence engineering
    • /
    • Vol. 19, No. 4
    • /
    • pp.228-233
    • /
    • 2021
  • In this study, we analyze the credit information (loan, delinquency information, etc.) of individual business owners and generate voluminous training data for a bankruptcy prediction model through a partial synthetic training technique. Furthermore, we evaluate the prediction performance of the newly generated data compared with the actual data. In the experiments (a logistic regression task), using training data generated with conditional tabular generative adversarial networks (CTGAN) improved recall by a factor of 1.75 compared with using the actual data. The probability that the actual and generated data are sampled from an identical distribution is verified to be much higher than 80%. Providing artificial intelligence training data through data synthesis in the fields of credit rating and default risk prediction for individual businesses, where research has been relatively inactive, promotes further in-depth research on such methods.
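
The recall comparison reported above can be illustrated with a short sketch: recall is TP / (TP + FN) on the positive (bankruptcy) class, and the improvement factor is the ratio of the two recalls. The labels below are made up for illustration and do not reproduce the paper's 1.75 figure; the function name is hypothetical.

```python
def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the class labelled `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive examples")
    return tp / (tp + fn)


if __name__ == "__main__":
    # Hypothetical test labels: 1 = bankrupt, 0 = not bankrupt.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    pred_trained_on_actual = [1, 0, 0, 0, 0, 0, 0, 0]
    pred_trained_on_synthetic = [1, 1, 0, 0, 0, 1, 0, 0]

    recall_actual = recall(y_true, pred_trained_on_actual)
    recall_synthetic = recall(y_true, pred_trained_on_synthetic)
    print("recall (actual data):", recall_actual)
    print("recall (synthetic data):", recall_synthetic)
    print("improvement factor:", recall_synthetic / recall_actual)
```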