• Title/Summary/Keyword: AI Training Data

Criteria for implementing artificial intelligence systems in reproductive medicine

  • Enric Guell
    • Clinical and Experimental Reproductive Medicine / v.51 no.1 / pp.1-12 / 2024
  • This review article discusses the integration of artificial intelligence (AI) in assisted reproductive technology and provides key concepts to consider when introducing AI systems into reproductive medicine practices. The article highlights the various applications of AI in reproductive medicine and discusses whether to use commercial or in-house AI systems. This review also provides criteria for implementing new AI systems in the laboratory and discusses the factors that should be considered when introducing AI in the laboratory, including the user interface, scalability, training, support, follow-up, cost, ethics, and data quality. The article emphasises the importance of ethical considerations, data quality, and continuous algorithm updates to ensure the accuracy and safety of AI systems.

Generating and Validating Synthetic Training Data for Predicting Bankruptcy of Individual Businesses

  • Hong, Dong-Suk;Baik, Cheol
    • Journal of information and communication convergence engineering / v.19 no.4 / pp.228-233 / 2021
  • In this study, we analyze the credit information (loans, delinquency records, etc.) of individual business owners and use a partial synthetic training technique to generate a large volume of training data for building a bankruptcy prediction model. We then evaluate the prediction performance obtained with the newly generated data against that obtained with the actual data. In the experiments (a logistic regression task), training data generated with conditional tabular generative adversarial networks (CTGAN) improves recall by 1.75 times compared with the actual data. The probability that the actual and generated data are sampled from an identical distribution is verified to be much higher than 80%. Providing artificial intelligence training data through data synthesis for credit rating and default risk prediction of individual businesses, fields in which research has been relatively inactive, should promote further in-depth research using such methods.

Method for improving video/image data quality for AI learning of unstructured data (비정형데이터의 AI학습을 위한 영상/이미지 데이터 품질 향상 방법)

  • Kim Seung Hee;Dongju Ryu
    • Convergence Security Journal / v.23 no.2 / pp.55-66 / 2023
  • Building on previous research on AI learning data, efforts to increase the value of AI learning data and to secure high-quality data are growing in all areas of society. Quality management is therefore very important in data construction projects. This paper presents quality management practices for securing high-quality data when building AI learning data, along with improvement plans for each construction process. In particular, more than 80% of the data quality of unstructured data built for AI learning is determined during the construction process. We performed quality inspection of image/video data, identified the inspection procedures and problem elements that arose in the construction phases of acquisition, data cleaning, labeling, and models, and suggested ways to secure high-quality data by resolving them. The results are expected to help research groups and operators participating in the construction of AI learning data overcome deviations in data quality.

Technical Trends in Hyperscale Artificial Intelligence Processors (초거대 인공지능 프로세서 반도체 기술 개발 동향)

  • W. Jeon;C.G. Lyuh
    • Electronics and Telecommunications Trends / v.38 no.5 / pp.1-11 / 2023
  • The emergence of generative hyperscale artificial intelligence (AI) has enabled new services, such as image-generating AI and conversational AI based on large language models. Such services are likely to attract numerous users, a load that conventional AI models cannot handle. Furthermore, the exponential increase in training data and computations, together with high user demand for AI models, has led to intensive hardware resource consumption, highlighting the need to develop domain-specific semiconductors for hyperscale AI. In this technical report, we describe development trends in technologies for hyperscale AI processors pursued by domestic and foreign semiconductor companies, such as NVIDIA, Graphcore, Tesla, Google, Meta, SAPEON, FuriosaAI, and Rebellions.

An Analysis Study of SW·AI elements of Primary Textbooks based on the 2015 Revised National Curriculum (2015 개정교육과정에 따른 초등학교 교과서의 SW·AI 요소 분석 연구)

  • Park, SunJu
    • Journal of The Korean Association of Information Education / v.25 no.2 / pp.317-325 / 2021
  • In this paper, the degree to which SW·AI elements and CT elements are reflected was investigated and analyzed for a total of 44 Korean, social studies, moral education, mathematics, and science textbooks based on the 2015 revised curriculum. The analysis showed that most of the data collection, data analysis, and data presentation activities, which are ICT elements, were not reflected; that algorithm and programming elements were not reflected among the SW·AI content elements; and that there were no abstraction, automation, or generalization elements among the CT elements. Therefore, to implement SW·AI convergence education effectively in elementary school subjects, ICT utilization activities should be expanded to SW·AI utilization activities. Teachers need training on understanding SW·AI convergence education and on improving teaching and learning methods using SW·AI. In addition, it is necessary to establish an information curriculum and secure separate class hours for substantial SW·AI education.

Efficient Hangul Word Processor (HWP) Malware Detection Using Semi-Supervised Learning with Augmented Data Utility Valuation (효율적인 HWP 악성코드 탐지를 위한 데이터 유용성 검증 및 확보 기반 준지도학습 기법)

  • JinHyuk Son;Gihyuk Ko;Ho-Mook Cho;Young-Kuk Kim
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.1 / pp.71-82 / 2024
  • With the advancement of information and communication technology (ICT), the use of electronic document types such as PDF, MS Office, and HWP files has increased. This trend has led cyber attackers to increasingly spread malicious documents through e-mail and messengers. To counter such attacks, AI-based methodologies have been actively employed to detect malicious document files. The main challenge in detecting malicious HWP (Hangul Word Processor) files is the lack of quality datasets, because HWP use is largely limited to Korea, whereas PDF and MS Office files are widely used worldwide. To address this limitation, data augmentation has been proposed to diversify training data by transforming existing datasets; however, because the usefulness of the augmented data is not evaluated, the augmented data can end up harming the model's performance. In this paper, we propose an effective semi-supervised learning technique for detecting malicious HWP document files, which improves overall AI model performance by quantifying the utility of augmented data and filtering out useless training data.

Enhanced Machine Learning Preprocessing Techniques for Optimization of Semiconductor Process Data in Smart Factories (스마트 팩토리 반도체 공정 데이터 최적화를 위한 향상된 머신러닝 전처리 방법 연구)

  • Seung-Gyu Choi;Seung-Jae Lee;Choon-Sung Nam
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.4 / pp.57-64 / 2024
  • The introduction of Smart Factories has transformed manufacturing towards more objective and efficient line management. However, most companies are not effectively utilizing the vast amount of sensor data collected every second. This study aims to use this data to predict product quality and manage production processes efficiently. Due to security issues, specific sensor data could not be verified, so semiconductor process-related training data from the "SAMSUNG SDS Brightics AI" site was used. Data preprocessing, including removing missing values, outliers, scaling, and feature elimination, was crucial for optimal sensor data. Oversampling was used to balance the imbalanced training dataset. The SVM (rbf) model achieved high performance (Accuracy: 97.07%, GM: 96.61%), surpassing the MLP model implemented by "SAMSUNG SDS Brightics AI". This research can be applied to various topics, such as predicting component lifecycles and process conditions.
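
The oversampling step and the reported metrics can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the abstract does not say which oversampling method was used, so simple random duplication of minority-class rows is assumed here, and "GM" is assumed to denote the geometric mean of sensitivity and specificity. The function names `random_oversample` and `accuracy_and_gmean` are hypothetical.

```python
import math
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many samples as the largest class (assumed method, not the paper's)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Pool of original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def accuracy_and_gmean(y_true, y_pred, positive=1):
    """Return (accuracy, G-mean) for a binary task, where G-mean is assumed
    to be sqrt(sensitivity * specificity)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, math.sqrt(sensitivity * specificity)
```

Unlike accuracy, the G-mean drops to zero when a classifier ignores one class entirely, which is why it is often reported alongside accuracy on imbalanced data.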

Class Classification and Type of Learning Data by Object for Smart Autonomous Delivery (스마트 자율배송을 위한 클래스 분류와 객체별 학습데이터 유형)

  • Young-Jin Kang;Jeong, Seok Chan
    • The Journal of Bigdata / v.7 no.1 / pp.37-47 / 2022
  • Autonomous delivery operation data is the key to driving a paradigm shift in last-mile delivery in the COVID-19 era. To bridge the technological gap between domestic autonomous delivery robots and those of overseas technology-leading countries, large-scale collection and verification of data that can be used for artificial intelligence training is the top priority. Overseas technology-leading countries are contributing to verification and technological development by releasing AI training data as public data that anyone can use. In this paper, 326 objects were collected to train autonomous delivery robots, and artificial intelligence models such as Mask R-CNN and YOLOv3 were trained and verified. In addition, the two models were compared, and the elements required for future autonomous delivery robot research were considered.

Color & Texture Attribute Classification System of Fashion Item Image for Standardizing Learning Data in Fashion AI (패션 AI의 학습 데이터 표준화를 위한 패션 아이템 이미지의 색채와 소재 속성 분류 체계)

  • Park, Nanghee;Choi, Yoonmi
    • Journal of the Korean Society of Clothing and Textiles / v.44 no.2 / pp.354-368 / 2020
  • Accurate and versatile image datasets based on a systematic attribute classification system are essential for fashion AI research and AI-based fashion businesses. This study constructs a hierarchical classification system for color and texture attributes by collecting fashion item images and analyzing the metadata of fashion items described by consumers. Essential dimensions to explain color and texture attributes were extracted; in addition, attribute values for each dimension were constructed based on metadata and previous studies. This hierarchical classification system satisfies consistency, exclusiveness, inclusiveness, and flexibility. Image tagging performed to confirm the usefulness of the proposed classification system showed that the attributes assigned to the same image differ depending on the annotator, which calls for a clear standard for distinguishing between the properties. This classification system will improve the reliability of training data for machine learning by providing standardized criteria for tasks such as tagging and annotating fashion items.

A Study on Tower Modeling for Artificial Intelligence Training in Artifact Restoration

  • Byong-Kwon Lee;Young-Chae Park
    • Journal of the Korea Society of Computer and Information / v.28 no.9 / pp.27-34 / 2023
  • This paper studies the 3D modeling process for the restoration of the 'Three-story Stone Pagoda of Bulguksa Temple in Gyeongju', a stone pagoda from the Unified Silla Period, using artificial intelligence (AI). Existing 3D modeling methods generate numerous vertices and faces, which makes AI learning take a considerable amount of time. Accordingly, a more efficient 3D modeling method that lowers the number of vertices and faces is required. To this end, the structure of the stone pagoda was analyzed in depth and a modeling method optimized for AI learning was studied. This study is also meaningful in that it proposes a new 3D modeling methodology for the restoration of stone pagodas in Korea and secures a dataset necessary for artificial intelligence learning.