• Title/Summary/Keyword: Automated Machine Learning (AutoML)

Prediction of medication-related osteonecrosis of the jaw (MRONJ) using automated machine learning in patients with osteoporosis associated with dental extraction and implantation: a retrospective study

  • Da Woon Kwack;Sung Min Park
    • Journal of the Korean Association of Oral and Maxillofacial Surgeons / v.49 no.3 / pp.135-141 / 2023
  • Objectives: This study aimed to develop and validate machine learning (ML) models using H2O-AutoML, an automated ML program, for predicting medication-related osteonecrosis of the jaw (MRONJ) in patients with osteoporosis undergoing tooth extraction or implantation. Patients and Methods: We conducted a retrospective chart review of 340 patients who visited Dankook University Dental Hospital between January 2019 and June 2022 who met the following inclusion criteria: female, age ≥55 years, osteoporosis treated with antiresorptive therapy, and recent dental extraction or implantation. We considered medication administration and duration, demographics, and systemic factors (age and medical history). Local factors, such as surgical method, number of operated teeth, and operation area, were also included. Six algorithms were used to generate the MRONJ prediction model. Results: Gradient boosting demonstrated the best diagnostic accuracy, with an area under the receiver operating characteristic curve (AUC) of 0.8283. Validation with the test dataset yielded a stable AUC of 0.7526. Variable importance analysis identified duration of medication as the most important variable, followed by age, number of teeth operated, and operation site. Conclusion: ML models can help predict MRONJ occurrence in patients with osteoporosis undergoing tooth extraction or implantation based on questionnaire data acquired at the first visit.
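  The AUC figures reported above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that rank-based reading in plain Python. The function name `roc_auc` and the toy labels and scores are assumptions for illustration only; they are not taken from the study, which used H2O-AutoML.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = event occurred, 0 = no event.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

  The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a sort-based version would be preferable for large datasets.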

A study on automated soil moisture monitoring methods for the Korean peninsula based on Google Earth Engine (Google Earth Engine 기반의 한반도 토양수분 모니터링 자동화 기법 연구)

  • Jang, Wonjin;Chung, Jeehun;Lee, Yonggwan;Kim, Jinuk;Kim, Seongjoon
    • Journal of Korea Water Resources Association / v.57 no.9 / pp.615-626 / 2024
  • To accurately and efficiently monitor soil moisture (SM) across South Korea, this study developed an SM estimation model that integrates the cloud computing platform Google Earth Engine (GEE) and Automated Machine Learning (AutoML). Spatial data from Terra MODIS (Moderate Resolution Imaging Spectroradiometer) and the global precipitation observation satellite mission GPM (Global Precipitation Measurement) were used to test optimal combinations of input data. The results indicated that GPM-based accumulated dry days, 5-day antecedent average precipitation, NDVI (Normalized Difference Vegetation Index), the sum of LST (Land Surface Temperature) acquired during nighttime and daytime, soil properties (sand and clay content, bulk density), terrain data (elevation and slope), and seasonal classification had high feature importance. After setting the objective functions (coefficient of determination, R2; Root Mean Square Error, RMSE; Mean Absolute Percent Error, MAPE) in AutoML for this combination of data, a comparative evaluation of machine learning techniques was conducted. Tree-based models performed well, with Random Forest performing best (R2: 0.72, RMSE: 2.70 vol%, MAPE: 0.14).
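  For reference, the three evaluation metrics named above can be written out as follows. This is a minimal sketch using the standard definitions in plain Python; the function names and the toy soil moisture values are assumptions for illustration, not data from the study. MAPE is returned as a fraction here, which seems to be the scale of the value reported in the abstract.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the observations."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percent error as a fraction; observations must be nonzero."""
    n = len(y_true)
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


# Hypothetical soil moisture values in vol% (observed vs. predicted).
observed = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 29.0, 37.0]
metrics = (r2_score(observed, predicted),
           rmse(observed, predicted),
           mape(observed, predicted))
```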

Development of Big Data and AutoML Platforms for Smart Plants (스마트 플랜트를 위한 빅데이터 및 AutoML 플랫폼 개발)

  • Jin-Young Kang;Byeong-Seok Jeong
    • The Journal of Bigdata / v.8 no.2 / pp.83-95 / 2023
  • Big data analytics and AI play a critical role in the development of smart plants. This study presents a big data platform for plant data and an 'AutoML platform' for AI-based plant O&M (Operation and Maintenance). The big data platform collects, processes, and stores large volumes of data generated in plants using Hadoop, Spark, and Kafka. The AutoML platform is a machine learning automation system aimed at constructing predictive models for equipment prognostics and process optimization in plants. The developed platforms configure a data pipeline that takes into account compatibility with existing plant OISs (Operation Information Systems) and employ a web-based GUI to improve accessibility and convenience for users. They also allow user-customizable modules to be loaded into the data processing and learning algorithms, which increases process flexibility. This paper demonstrates the operation of the platforms for a specific process of an oil company in Korea and presents an example of an effective data utilization platform for smart plants.

Dam Inflow Prediction and Evaluation Using Hybrid Auto-sklearn Ensemble Model (하이브리드 Auto-sklearn 앙상블 모델을 이용한 댐 유입량 예측 및 평가)

  • Lee, Seoro;Bae, Joo Hyun;Lee, Gwanjae;Yang, Dongseok;Hong, Jiyeong;Kim, Jonggun;Lim, Kyoung Jae
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.307-307 / 2022
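  • Recently, the variability of dam inflow has increased due to various causes such as climate change and land-use change upstream of dams, making decisions on dam management and operation more difficult. A method is therefore needed that reflects these variability characteristics and predicts dam inflow accurately and efficiently. With advances in machine learning, Auto-ML (Automated Machine Learning) is being used in a wide range of fields. Auto-ML automates the entire process of data preprocessing, optimal algorithm selection, hyperparameter tuning, and model training and evaluation. However, few studies in hydrology have used Auto-ML to develop dam inflow prediction models, and in particular no study has developed and evaluated an Auto-ML-based ensemble model through a hybrid combination approach that accounts for the variability characteristics of high inflow and low inflow in order to secure prediction accuracy. In this study, we used Auto-sklearn, one of the Auto-ML packages, to develop a hybrid ensemble dam inflow prediction model that reflects the inflow variability of the flood and non-flood seasons. Applied to Soyanggang Dam, the hybrid Auto-sklearn ensemble model achieved R2 0.868, RMSE 66.23 m3/s, and MAE 16.45 m3/s, performing better overall than an ensemble model built with a single Auto-sklearn run. In particular, the two models' predictions differed markedly in the low-flow and drought-flow segments of the FDC (Flow Duration Curve), where the predictions of the hybrid Auto-sklearn model were closer to the observed values. We attribute this to the hyperparameters of each model being optimized as the ensemble models for the flood and non-flood seasons were built independently. The methodology is expected to be useful not only for establishing ways to generate more accurate dam inflow prediction data but also for building ensemble models from imbalanced datasets in other fields.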
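  The hybrid idea described above, one model per inflow regime with each prediction routed to the model for its season, can be sketched as follows. This is an illustration only: the study built Auto-sklearn ensembles for each regime, whereas this sketch substitutes a one-variable least-squares line so that it runs with the standard library alone. The class name `HybridSeasonModel`, the flood-season months (June to September), and the toy data are all assumptions that do not come from the abstract.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


class HybridSeasonModel:
    """Fits one model per regime and routes each prediction by month."""

    def __init__(self, flood_months=(6, 7, 8, 9)):  # assumed season definition
        self.flood_months = set(flood_months)
        self.models = {}

    def _regime(self, month):
        return "flood" if month in self.flood_months else "non_flood"

    def fit(self, months, xs, ys):
        # Each regime needs at least two samples with distinct x values.
        groups = {"flood": ([], []), "non_flood": ([], [])}
        for m, x, y in zip(months, xs, ys):
            gx, gy = groups[self._regime(m)]
            gx.append(x)
            gy.append(y)
        for name, (gx, gy) in groups.items():
            self.models[name] = fit_line(gx, gy)
        return self

    def predict(self, month, x):
        a, b = self.models[self._regime(month)]
        return a * x + b


# Hypothetical toy data: month, rainfall, and inflow for each sample.
months = [1, 2, 3, 7, 8, 9]
rain = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
inflow = [2.0, 4.0, 6.0, 100.0, 200.0, 300.0]
model = HybridSeasonModel().fit(months, rain, inflow)
```

  Because the two regimes are fitted independently, each stand-in model can take on very different parameters, which mirrors the explanation the abstract gives for the hybrid model's better low-flow behavior.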

AutoFe-Sel: A Meta-learning based methodology for Recommending Feature Subset Selection Algorithms

  • Irfan Khan;Xianchao Zhang;Ramesh Kumar Ayyasam;Rahman Ali
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.7 / pp.1773-1793 / 2023
  • Automated machine learning, often referred to as "AutoML," is the process of automating the time-consuming and iterative procedures associated with building machine learning models. There have been significant contributions in this area across several stages of a data-mining task, including model selection, hyper-parameter optimization, and preprocessing method selection. Among them, preprocessing method selection is a relatively new and fast-growing research area. The current work focuses on the recommendation of preprocessing methods, specifically feature subset selection (FSS) algorithms. One limitation of existing studies on FSS algorithm recommendation is the use of a single learner for meta-modeling, which restricts the capability of the meta-model. Moreover, the meta-modeling in existing studies is typically based on a single group of data characterization measures (DCMs). However, there are several complementary DCM groups, and combining them leverages their diversity, resulting in improved meta-modeling. This study aims to address these limitations by proposing an architecture for preprocessing method selection that uses ensemble learning for meta-modeling, namely AutoFE-Sel. To evaluate the proposed method, we performed an extensive experimental evaluation involving 8 FSS algorithms, 3 groups of DCMs, and 125 datasets. Results show that the proposed method achieves better performance than three baseline methods. The proposed architecture can also be easily extended to the selection of other preprocessing methods, e.g., noise-filter selection and imbalance handling method selection.

Preliminary Test of Google Vertex Artificial Intelligence in Root Dental X-ray Imaging Diagnosis (구글 버텍스 AI을 이용한 치과 X선 영상진단 유용성 평가)

  • Hyun-Ja Jeong
    • Journal of the Korean Society of Radiology / v.18 no.3 / pp.267-273 / 2024
  • Using Vertex AI, a cloud-based platform for developing artificial intelligence (AI) models without coding, this study tested whether a non-expert could readily develop an AI model and assessed the model's clinical applicability. A total of 2,999 root disease X-ray images covering nine dental diseases, published on Kaggle, were used as the data, and the images were randomly split into training, validation, and test sets. Image classification with multi-label learning was performed, with hyper-parameter tuning carried out by the training pipeline in Vertex AI's basic model workflow. With AutoML (Automated Machine Learning), the AUC (Area Under Curve) was 0.967, precision was 95.6%, and recall was 95.2%. The trained AI model was found to be sufficient for clinical diagnosis.

Prediction of intensive care unit admission using machine learning in patients with odontogenic infection

  • Joo-Ha Yoon;Sung Min Park
    • Journal of the Korean Association of Oral and Maxillofacial Surgeons / v.50 no.4 / pp.216-221 / 2024
  • Objectives: This study aimed to develop and validate a model to predict the need for intensive care unit (ICU) admission in patients with dental infections using an automated machine learning (ML) program called H2O-AutoML. Materials and Methods: Two models were created using only the information available at the initial examination. Model 1 was parameterized with only clinical symptoms and blood tests, excluding contrast-enhanced multi-detector computed tomography (MDCT) images available at the initial visit, whereas model 2 was created by adding the MDCT information to the model 1 parameters. Although model 2 was expected to be superior to model 1, we wanted to verify this independently. A total of 210 patients who visited the Department of Oral and Maxillofacial Surgery at the Dankook University Dental Hospital from March 2013 to August 2023 were included in this study. The patients' demographic characteristics (sex, age, and place of residence), systemic factors (hypertension, diabetes mellitus [DM], kidney disease, liver disease, heart disease, anticoagulation therapy, and osteoporosis), local factors (smoking status, site of infection, postoperative wound infection, dysphagia, odynophagia, and trismus), and factors known from initial blood tests were obtained from their medical charts and retrospectively reviewed. Results: The generalized linear model algorithm provided the best diagnostic accuracy, with area under the receiver operating characteristic curve (AUC) values of 0.8289 in model 1 and 0.8415 in model 2. In both models, the C-reactive protein level was the most important variable, followed by DM. Conclusion: This study provides unprecedented data on the use of ML for successful prediction of ICU admission based on initial examination results. These findings will considerably contribute to the development of the field of dentistry, especially oral and maxillofacial surgery.

Automated Machine Learning-Based Solar PV Forecasting Considering Solar Position Information (태양 위치 정보를 고려한 AutoML 기반의 태양광 발전량 예측)

  • Jinyeong Oh;Dayeong So;Byeongcheon Lee;Jihoon Moon
    • Annual Conference of KIPS / 2023.05a / pp.322-323 / 2023
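  • Solar photovoltaic (PV) generation, a sustainable form of energy, is one of the most widely used renewable energy sources worldwide, and research on accurately forecasting PV output for efficient operation of PV systems is active. Building a PV forecasting model requires, beyond weather and atmospheric conditions, information on solar irradiance as determined by the sun's position, yet few studies have used the sun's real-time position as an input variable. This paper therefore proposes calculating the solar altitude and azimuth in real time from the time and the location of the PV plant and using them as input variables. To this end, we built various AutoML-based machine learning models to predict the PV generation rate and compared their performance. The experiments confirmed that including solar position information improved forecasting performance considerably compared with using environmental variables alone; for the Extra Trees model, adding solar position information lowered the MAE (Mean Absolute Error) from 33.90 to 22.38.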
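  The abstract does not say which solar position algorithm was used. As a rough illustration of how solar altitude (elevation) and azimuth can be derived from a timestamp and a site location, the following sketch uses a common textbook approximation that ignores the equation of time and atmospheric refraction. The function name `solar_position` and the example coordinates are assumptions; a production feature pipeline would more likely rely on a dedicated, validated solar position library.

```python
import math
from datetime import datetime, timezone


def solar_position(when_utc, lat_deg, lon_deg):
    """Return (elevation_deg, azimuth_deg); azimuth is clockwise from north.

    Simplified: ignores the equation of time and atmospheric refraction.
    """
    day_of_year = when_utc.timetuple().tm_yday
    # Approximate solar declination for the given day of the year.
    decl = math.radians(-23.44) * math.cos(
        2.0 * math.pi * (day_of_year + 10) / 365.0)
    # Approximate local solar time from UTC and longitude (15 degrees per hour).
    utc_hours = when_utc.hour + when_utc.minute / 60.0 + when_utc.second / 3600.0
    solar_hours = (utc_hours + lon_deg / 15.0) % 24.0
    hour_angle = math.radians(15.0 * (solar_hours - 12.0))
    lat = math.radians(lat_deg)

    sin_elev = (math.sin(lat) * math.sin(decl)
                + math.cos(lat) * math.cos(decl) * math.cos(hour_angle))
    elevation = math.asin(max(-1.0, min(1.0, sin_elev)))
    # East and north components of the horizontal direction towards the sun.
    east = -math.cos(decl) * math.sin(hour_angle)
    north = (math.sin(decl) * math.cos(lat)
             - math.cos(decl) * math.sin(lat) * math.cos(hour_angle))
    azimuth = math.degrees(math.atan2(east, north)) % 360.0
    return math.degrees(elevation), azimuth


# Hypothetical site near 37 N, 127 E, close to local solar noon on 21 June 2023.
noon_elev, noon_az = solar_position(
    datetime(2023, 6, 21, 3, 32, tzinfo=timezone.utc), 37.0, 127.0)
# Same site earlier in the day (00:00 UTC), when the sun is in the east.
morning_elev, morning_az = solar_position(
    datetime(2023, 6, 21, 0, 0, tzinfo=timezone.utc), 37.0, 127.0)
```

  The two returned angles can then be appended to the weather features of each timestamp before model training.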

Gap-Filling of Sentinel-2 NDVI Using Sentinel-1 Radar Vegetation Indices and AutoML (Sentinel-1 레이더 식생지수와 AutoML을 이용한 Sentinel-2 NDVI 결측화소 복원)

  • Youjeong Youn;Jonggu Kang;Seoyeon Kim;Yemin Jeong;Soyeon Choi;Yungyo Im;Youngmin Seo;Myoungsoo Won;Junghwa Chun;Kyungmin Kim;Keunchang Jang;Joongbin Lim;Yangwon Lee
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1341-1352 / 2023
  • The normalized difference vegetation index (NDVI) derived from satellite images is a crucial tool for monitoring forests and agriculture over broad areas because periodic acquisition of the data is ensured. However, optical sensor-based vegetation indices (VI) are not available in areas covered by clouds. This paper presents a synthetic aperture radar (SAR) based approach to retrieving the optical sensor-based NDVI using machine learning. SAR systems can observe the land surface day and night in all weather conditions. Radar vegetation indices (RVI) from the Sentinel-1 vertical-vertical (VV) and vertical-horizontal (VH) polarizations, surface elevation, and air temperature are used as the input features for an automated machine learning (AutoML) model to conduct the gap-filling of the Sentinel-2 NDVI. The mean bias error (MBE) was 7.214E-05, and the correlation coefficient (CC) was 0.878, demonstrating the feasibility of the proposed method. This approach can be applied to gap-free nationwide NDVI construction using Sentinel-1 and Sentinel-2 images for environmental monitoring and resource management.
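  The abstract does not give the RVI definition used. One dual-polarization form often applied to Sentinel-1 data is RVI = 4 * VH / (VV + VH), with backscatter expressed in linear power units. The sketch below assumes that form and assumes the inputs arrive in decibels; the function names and sample values are illustrative only and are not taken from the paper.

```python
def db_to_linear(value_db):
    """Convert backscatter from decibels to linear power units."""
    return 10.0 ** (value_db / 10.0)


def dual_pol_rvi(vv_db, vh_db):
    """Dual-pol radar vegetation index, assuming RVI = 4 * VH / (VV + VH)."""
    vv = db_to_linear(vv_db)
    vh = db_to_linear(vh_db)
    return 4.0 * vh / (vv + vh)


# Hypothetical backscatter values in dB for a single pixel.
sample_rvi = dual_pol_rvi(-10.0, -20.0)
```

  With this definition the index grows as the cross-polarized (VH) return increases relative to the co-polarized (VV) return, which is the behavior expected over denser vegetation.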

An Automated Production System Design for Natural Language Processing Models Using Korean Pre-trained Model (한국어 사전학습 모델을 활용한 자연어 처리 모델 자동 산출 시스템 설계)

  • Jihyoung Jang;Hoyoon Choi;Gun-woo Lee;Myung-seok Choi;Charmgil Hong
    • Annual Conference on Human and Language Technology / 2022.10a / pp.613-618 / 2022
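  • Since the introduction of the Transformer architecture, proposed for effective natural language processing (NLP), large-scale pre-trained language models built on it such as BERT, GPT, and OPT have been released, followed by pre-trained models more specialized for Korean such as KoBERT and KoGPT. Although training resources for obtaining NLP models are growing, applying a pre-trained model to a downstream task still requires complex steps such as data preparation, code writing, fine-tuning, and saving; this remains challenging for many application users, and obtaining correct results is not easy. To ease these difficulties and make it easier to apply various machine learning models to user data, techniques collectively known as AutoML, such as automatic hyperparameter search and model architecture search, are being devised. This study formalizes and proceduralizes the process of producing NLP models from Korean pre-trained models and Korean text data, and introduces the design of a system that automatically produces the target prediction model.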
