• Title/Summary/Keyword: AI Training Data

Search Result 276, Processing Time 0.027 seconds

Implementation of a data collection system for big data analysis and learning based on infant body temperature data (영유아 체온 데이터 기반 빅데이터 분석 및 학습을 위한 데이터 수집 시스템 구현)

  • Lee, Hyoun-Sup;Heo, Gyeongyong
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.577-578
    • /
    • 2021
  • Recently, artificial intelligence systems are being used in various fields. The accuracy of the decision algorithm of artificial intelligence is greatly affected by the amount of learning and the accuracy of the learning data. In the case of the amount of learning, a large amount of data is required because it has a decisive effect on the performance of AI. In this paper, we propose a data collection system for constructing a system that analyzes future conditions and changes in infants' conditions based on the body temperature data of infants and toddlers. The proposed system is a system that collects and transmits data, and it is believed that it can minimize the resource consumption of the server system in existing big data analysis and training data construction.

  • PDF

Generating Training Dataset of Machine Learning Model for Context-Awareness in a Health Status Notification Service (사용자 건강 상태알림 서비스의 상황인지를 위한 기계학습 모델의 학습 데이터 생성 방법)

  • Mun, Jong Hyeok;Choi, Jong Sun;Choi, Jae Young
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.9 no.1
    • /
    • pp.25-32
    • /
    • 2020
  • In the context-aware system, rule-based AI technology has been used in the abstraction process for getting context information. However, the rules are complicated by the diversification of user requirements for the service and also data usage is increased. Therefore, there are some technical limitations to maintain rule-based models and to process unstructured data. To overcome these limitations, many studies have applied machine learning techniques to Context-aware systems. In order to utilize this machine learning-based model in the context-aware system, a management process of periodically injecting training data is required. In the previous study on the machine learning based context awareness system, a series of management processes such as the generation and provision of learning data for operating several machine learning models were considered, but the method was limited to the applied system. In this paper, we propose a training data generating method of a machine learning model to extend the machine learning based context-aware system. The proposed method define the training data generating model that can reflect the requirements of the machine learning models and generate the training data for each machine learning model. In the experiment, the training data generating model is defined based on the training data generating schema of the cardiac status analysis model for older in health status notification service, and the training data is generated by applying the model defined in the real environment of the software. In addition, it shows the process of comparing the accuracy by learning the training data generated in the machine learning model, and applied to verify the validity of the generated learning data.

A Methodology of AI Learning Model Construction for Intelligent Coastal Surveillance (해안 경계 지능화를 위한 AI학습 모델 구축 방안)

  • Han, Changhee;Kim, Jong-Hwan;Cha, Jinho;Lee, Jongkwan;Jung, Yunyoung;Park, Jinseon;Kim, Youngtaek;Kim, Youngchan;Ha, Jeeseung;Lee, Kanguk;Kim, Yoonsung;Bang, Sungwan
    • Journal of Internet Computing and Services
    • /
    • v.23 no.1
    • /
    • pp.77-86
    • /
    • 2022
  • The Republic of Korea is a country in which coastal surveillance is an imperative national task as it is surrounded by seas on three sides under the confrontation between South and North Korea. However, due to Defense Reform 2.0, the number of R/D (Radar) operating personnel has decreased, and the period of service has also been shortened. Moreover, there is always a possibility that a human error will occur. This paper presents specific guidelines for developing an AI learning model for the intelligent coastal surveillance system. We present a three-step strategy to realize the guidelines. The first stage is a typical stage of building an AI learning model, including data collection, storage, filtering, purification, and data transformation. In the second stage, R/D signal analysis is first performed. Subsequently, AI learning model development for classifying real and false images, coastal area analysis, and vulnerable area/time analysis are performed. In the final stage, validation, visualization, and demonstration of the AI learning model are performed. Through this research, the first achievement of making the existing weapon system intelligent by applying the application of AI technology was achieved.

Gradient Descent Training Method for Optimizing Data Prediction Models (데이터 예측 모델 최적화를 위한 경사하강법 교육 방법)

  • Hur, Kyeong
    • Journal of Practical Engineering Education
    • /
    • v.14 no.2
    • /
    • pp.305-312
    • /
    • 2022
  • In this paper, we focused on training to create and optimize a basic data prediction model. And we proposed a gradient descent training method of machine learning that is widely used to optimize data prediction models. It visually shows the entire operation process of gradient descent used in the process of optimizing parameter values required for data prediction models by applying the differential method and teaches the effective use of mathematical differentiation in machine learning. In order to visually explain the entire operation process of gradient descent, we implement gradient descent SW in a spreadsheet. In this paper, first, a two-variable gradient descent training method is presented, and the accuracy of the two-variable data prediction model is verified by comparison with the error least squares method. Second, a three-variable gradient descent training method is presented and the accuracy of a three-variable data prediction model is verified. Afterwards, the direction of the optimization practice for gradient descent was presented, and the educational effect of the proposed gradient descent method was analyzed through the results of satisfaction with education for non-majors.

Construction of Artificial Intelligence Training Platform for Multi-Center Clinical Research (다기관 임상연구를 위한 인공지능 학습 플랫폼 구축)

  • Lee, Chung-Sub;Kim, Ji-Eon;No, Si-Hyeong;Kim, Tae-Hoon;Yoon, Kwon-Ha;Jeong, Chang-Won
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.9 no.10
    • /
    • pp.239-246
    • /
    • 2020
  • In the medical field where artificial intelligence technology is introduced, research related to clinical decision support system(CDSS) in relation to diagnosis and prediction is actively being conducted. In particular, medical imaging-based disease diagnosis area applied AI technologies at various products. However, medical imaging data consists of inconsistent data, and it is a reality that it takes considerable time to prepare and use it for research. This paper describes a one-stop AI learning platform for converting to medical image standard R_CDM(Radiology Common Data Model) and supporting AI algorithm development research based on the dataset. To this, the focus is on linking with the existing CDM(common data model) and model the system, including the schema of the medical imaging standard model and report information for multi-center research based on DICOM(Digital Imaging and Communications in Medicine) tag information. And also, we show the execution results based on generated datasets through the AI learning platform. As a proposed platform, it is expected to be used for various image-based artificial intelligence researches.

A Study on the Medical Application and Personal Information Protection of Generative AI (생성형 AI의 의료적 활용과 개인정보보호)

  • Lee, Sookyoung
    • The Korean Society of Law and Medicine
    • /
    • v.24 no.4
    • /
    • pp.67-101
    • /
    • 2023
  • The utilization of generative AI in the medical field is also being rapidly researched. Access to vast data sets reduces the time and energy spent in selecting information. However, as the effort put into content creation decreases, there is a greater likelihood of associated issues arising. For example, with generative AI, users must discern the accuracy of results themselves, as these AIs learn from data within a set period and generate outcomes. While the answers may appear plausible, their sources are often unclear, making it challenging to determine their veracity. Additionally, the possibility of presenting results from a biased or distorted perspective cannot be discounted at present on ethical grounds. Despite these concerns, the field of generative AI is continually advancing, with an increasing number of users leveraging it in various sectors, including biomedical and life sciences. This raises important legal considerations regarding who bears responsibility and to what extent for any damages caused by these high-performance AI algorithms. A general overview of issues with generative AI includes those discussed above, but another perspective arises from its fundamental nature as a large-scale language model ('LLM') AI. There is a civil law concern regarding "the memorization of training data within artificial neural networks and its subsequent reproduction". Medical data, by nature, often reflects personal characteristics of patients, potentially leading to issues such as the regeneration of personal information. The extensive application of generative AI in scenarios beyond traditional AI brings forth the possibility of legal challenges that cannot be ignored. Upon examining the technical characteristics of generative AI and focusing on legal issues, especially concerning the protection of personal information, it's evident that current laws regarding personal information protection, particularly in the context of health and medical data utilization, are inadequate. These laws provide processes for anonymizing and de-identification, specific personal information but fall short when generative AI is applied as software in medical devices. To address the functionalities of generative AI in clinical software, a reevaluation and adjustment of existing laws for the protection of personal information are imperative.

Vulnerability Threat Classification Based on XLNET AND ST5-XXL model

  • Chae-Rim Hong;Jin-Keun Hong
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.16 no.3
    • /
    • pp.262-273
    • /
    • 2024
  • We provide a detailed analysis of the data processing and model training process for vulnerability classification using Transformer-based language models, especially sentence text-to-text transformers (ST5)-XXL and XLNet. The main purpose of this study is to compare the performance of the two models, identify the strengths and weaknesses of each, and determine the optimal learning rate to increase the efficiency and stability of model training. We performed data preprocessing, constructed and trained models, and evaluated performance based on data sets with various characteristics. We confirmed that the XLNet model showed excellent performance at learning rates of 1e-05 and 1e-04 and had a significantly lower loss value than the ST5-XXL model. This indicates that XLNet is more efficient for learning. Additionally, we confirmed in our study that learning rate has a significant impact on model performance. The results of the study highlight the usefulness of ST5-XXL and XLNet models in the task of classifying security vulnerabilities and highlight the importance of setting an appropriate learning rate. Future research should include more comprehensive analyzes using diverse data sets and additional models.

A New Study on Vibration Data Acquisition and Intelligent Fault Diagnostic System for Aero-engine

  • Ding, Yongshan;Jiang, Dongxiang
    • Proceedings of the Korean Society of Propulsion Engineers Conference
    • /
    • 2008.03a
    • /
    • pp.16-21
    • /
    • 2008
  • Aero-engine, as one kind of rotating machinery with complex structure and high rotating speed, has complicated vibration faults. Therefore, condition monitoring and fault diagnosis system is very important for airplane security. In this paper, a vibration data acquisition and intelligent fault diagnosis system is introduced. First, the vibration data acquisition part is described in detail. This part consists of hardware acquisition modules and software analysis modules which can realize real-time data acquisition and analysis, off-line data analysis, trend analysis, fault simulation and graphical result display. The acquisition vibration data are prepared for the following intelligent fault diagnosis. Secondly, two advanced artificial intelligent(AI) methods, mapping-based and rule-based, are discussed. One is artificial neural network(ANN) which is an ideal tool for aero-engine fault diagnosis and has strong ability to learn complex nonlinear functions. The other is data mining, another AI method, has advantages of discovering knowledge from massive data and automatically extracting diagnostic rules. Thirdly, lots of historical data are used for training the ANN and extracting rules by data mining. Then, real-time data are input into the trained ANN for mapping-based fault diagnosis. At the same time, extracted rules are revised by expert experience and used for rule-based fault diagnosis. From the results of the experiments, the conclusion is obvious that both the two AI methods are effective on aero-engine vibration fault diagnosis, while each of them has its individual quality. The whole system can be developed in local vibration monitoring and real-time fault diagnosis for aero-engine.

  • PDF

Class 1·3 Vehicle Classification Using Deep Learning and Thermal Image (열화상 카메라를 활용한 딥러닝 기반의 1·3종 차량 분류)

  • Jung, Yoo Seok;Jung, Do Young
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.19 no.6
    • /
    • pp.96-106
    • /
    • 2020
  • To solve the limitation of traffic monitoring that occur from embedded sensor such as loop and piezo sensors, the thermal imaging camera was installed on the roadside. As the length of Class 1(passenger car) is getting longer, it is becoming difficult to classify from Class 3(2-axle truck) by using an embedded sensor. The collected images were labeled to generate training data. A total of 17,536 vehicle images (640x480 pixels) training data were produced. CNN (Convolutional Neural Network) was used to achieve vehicle classification based on thermal image. Based on the limited data volume and quality, a classification accuracy of 97.7% was achieved. It shows the possibility of traffic monitoring system based on AI. If more learning data is collected in the future, 12-class classification will be possible. Also, AI-based traffic monitoring will be able to classify not only 12-class, but also new various class such as eco-friendly vehicles, vehicle in violation, motorcycles, etc. Which can be used as statistical data for national policy, research, and industry.

Study on OCR Enhancement of Homomorphic Filtering with Adaptive Gamma Value

  • Heeyeon Jo;Jeongwoo Lee;Hongrae Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.2
    • /
    • pp.101-108
    • /
    • 2024
  • AI-OCR (Artificial Intelligence Optical Character Recognition) combines OCR technology with Artificial Intelligence to overcome limitations that required human intervention. To enhance the performance of AI-OCR, training on diverse data sets is essential. However, the recognition rate declines when image colors have similar brightness levels. To solve this issue, this study employs Homomorphic filtering as a preprocessing step to clearly differentiate color levels, thereby increasing text recognition rates. While Homomorphic filtering is ideal for text extraction because of its ability to adjust the high and low frequency components of an image separately using a gamma value, it has the downside of requiring manual adjustments to the gamma value. This research proposes a range for gamma threshold values based on tests involving image contrast, brightness, and entropy. Experimental results using the proposed range of gamma values in Homomorphic filtering suggest a high likelihood for effective AI-OCR performance.