• Title/Summary/Keyword: MAE(Mean Absolute Error)


Movie recommendation system using community detection based on label propagation (레이블 전파에 기반한 커뮤니티 탐지를 이용한 영화추천시스템)

  • Xinchang, Khamphaphone;Vilakone, Phonexay;Lee, Han-Hyung;Song, Min-Hyuk;Park, Doo-Soon
    • Annual Conference of KIPS / 2019.05a / pp.273-276 / 2019
  • With so much information available, quickly accessing the most accurate information, or simply finding the information we need, has become more difficult and complicated. Recommendation systems have become important for helping users quickly find products that match their preferences. This paper proposes a social recommendation system using community detection based on label propagation. We apply community detection based on label propagation, together with collaborative filtering, to a movie recommendation system. Using the MovieLens dataset, users are clustered into communities by the label propagation algorithm, and the proposed algorithm recommends movies by finding the community most similar to a new user according to users' personal propensities. Mean Absolute Error (MAE) is used to show the effectiveness of the proposed method.

Automated Machine Learning-Based Solar PV Forecasting Considering Solar Position Information (태양 위치 정보를 고려한 AutoML 기반의 태양광 발전량 예측)

  • Jinyeong Oh;Dayeong So;Byeongcheon Lee;Jihoon Moon
    • Annual Conference of KIPS / 2023.05a / pp.322-323 / 2023
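  • Solar photovoltaic (PV) power, a sustainable energy source, is one of the most widely used renewable energy sources worldwide, and research on accurately forecasting PV power generation has recently been active in support of efficient PV system operation. Building a PV forecasting model requires not only weather and atmospheric conditions but also information on solar irradiance as determined by the sun's position, yet few studies have used the sun's real-time position as an input variable. This paper therefore proposes computing the solar altitude and azimuth in real time from the time and the location of the PV plant and using them as input variables. To this end, we build various AutoML-based machine learning models to predict the PV generation rate and compare and analyze their performance. The experiments confirmed that including solar position information substantially improved prediction performance compared with considering environmental variables alone; for the Extra Trees model, adding solar position information lowered the MAE (Mean Absolute Error) from 33.90 to 22.38.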

An interpretable machine learning approach for forecasting personal heat strain considering the cumulative effect of heat exposure

  • Seo, Seungwon;Choi, Yujin;Koo, Choongwan
    • Korean Journal of Construction Engineering and Management / v.24 no.6 / pp.81-90 / 2023
  • Climate change has resulted in increased frequency and intensity of heat waves, which poses a significant threat to the health and safety of construction workers, particularly those engaged in labor-intensive and heat-stress vulnerable working environments. To address this challenge, this study aimed to propose an interpretable machine learning approach for forecasting personal heat strain by considering the cumulative effect of heat exposure as a situational variable, which has not been taken into account in the existing approach. As a result, the proposed model, which incorporated the cumulative working time along with environmental and personal variables, was found to have superior forecast performance and explanatory power. Specifically, the proposed Multi-Layer Perceptron (MLP) model achieved a Mean Absolute Error (MAE) of 0.034 (℃) and an R-squared of 99.3% (0.933). Feature importance analysis revealed that the cumulative working time, as a situational variable, had the most significant impact on personal heat strain. These findings highlight the importance of systematic management of personal heat strain at construction sites by comprehensively considering the cumulative working time as a situational variable as well as environmental and personal variables. This study provided a valuable contribution to the construction industry by offering a reliable and accurate heat strain forecasting model, enhancing the health and safety of construction workers.

A Empirical Study on Recommendation Schemes Based on User-based and Item-based Collaborative Filtering (사용자 기반과 아이템 기반 협업여과 추천기법에 관한 실증적 연구)

  • Ye-Na Kim;In-Bok Choi;Taekeun Park;Jae-Dong Lee
    • Annual Conference of KIPS / 2008.11a / pp.714-717 / 2008
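  • Collaborative filtering recommendation comes in user-based and item-based variants, and its procedure consists of three stages: similarity measurement, neighbor selection, and prediction generation. Similarity measurement methods include Euclidean Distance, Cosine Similarity, and the Pearson Correlation Coefficient; neighbor selection methods include Correlation-Threshold and Best-N-Neighbors. Finally, prediction generation methods include Simple Average, Weighted Sum, and Adjusted Weighted Sum. A variety of techniques are thus used in collaborative filtering recommendation. In this paper, we therefore ran performance experiments and a comparative analysis to find the optimal combination of similarity measurement and prediction generation techniques for user-based and item-based collaborative filtering. The experiments used GroupLens's MovieLens dataset and compared the recommendation schemes by their MAE (Mean Absolute Error) values. The experiments identified the optimal combination of similarity measurement and prediction generation techniques, and a performance comparison of the two approaches confirmed that item-based collaborative filtering performed better.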
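
The three stages named in this abstract can be illustrated with a minimal sketch, assuming user-based filtering with Cosine Similarity, all other raters as neighbors, and a Weighted Sum prediction. The toy ratings and the function names are hypothetical and are not taken from the paper or from MovieLens.

```python
from math import sqrt

# Toy user -> {item: rating} data; illustrative values only (an assumption).
ratings = {
    "u1": {"A": 5, "B": 3, "C": 4},
    "u2": {"A": 4, "B": 2, "C": 5, "D": 4},
    "u3": {"A": 1, "B": 5, "D": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def predict_weighted_sum(user, item, data):
    """Weighted Sum: neighbors' ratings for the item, weighted by similarity."""
    num = 0.0
    den = 0.0
    for other, other_ratings in data.items():
        # Neighbor selection here is simply "every other user who rated the item".
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(data[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None


def mean_absolute_error(actual, predicted):
    """MAE: the mean of |actual - predicted| over paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Predict u1's rating for item D from the users who rated it.
prediction = predict_weighted_sum("u1", "D", ratings)
```

In an evaluation of this kind, known ratings would be held out, predicted, and then compared with `mean_absolute_error(actual, predicted)`; a lower value indicates predictions closer to the true ratings.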

Perceived Age Prediction from Face Image Based on Super-resolution and Tanh-polar Transform (얼굴영상의 초해상도화 및 Tanh-polar 변환 기반의 인지나이 예측)

  • Ilkoo Ahn;Siwoo Lee
    • Journal of Biomedical Engineering Research / v.44 no.5 / pp.329-335 / 2023
  • Perceived age is defined as age estimated based on physical appearance. Perceived age is an important indicator of the overall health status of the elderly, because people who appear older tend to have higher rates of morbidity and mortality than people of the same chronological age. Although perceived age is an important indicator, there is a lack of objective methods to quantify it. In this paper, we construct a quantified perceived age model from face images using a convolutional neural network. The face images are enlarged by super-resolution so that the skin, an important feature in perceived age, is rendered clearly. Moreover, through the Tanh-polar transformation, the central area of the face occupies a relatively larger area than the boundary area, helping the neural network better recognize facial skin features. The experimental results show a mean absolute error (MAE) of 6.59, indicating that the proposed model is superior to existing methods.

Gray Matter and White Matter-Based Brain Age Prediction Model Using Multi-MRI Datasets (다중 MRI 데이터셋을 활용한 회백질 및 백질 기반 뇌 연령 예측 모델)

  • Seung-Jun Lee;Myungeun Lee;Hyung-Jeong Yang
    • Annual Conference of KIPS / 2024.10a / pp.719-722 / 2024
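  • Brain age is attracting attention as an important biomarker for predicting neurodegenerative disease and cognitive decline, and it allows an individual's brain health to be assessed more precisely. In particular, gray matter and white matter play a key role in evaluating brain structure and function, and analyzing structural changes in the brain can improve the accuracy of brain age prediction. In addition, generalized performance is hard to expect when only a specific dataset is used, so research that draws on diverse datasets for brain age prediction is needed. This study therefore proposes a 3D CNN-based brain age prediction model that combines multimodal MRI data. The proposed model combines gray matter and white matter features with preprocessed T1 images so that it can learn richer brain structural information, improving the accuracy of brain age prediction. In the experiments, the model that additionally used gray matter and white matter information outperformed existing CNN and ResNet models that used only T1 images on the MAE (Mean Absolute Error) metric, confirming that gray matter and white matter information contributes significantly to brain age prediction.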

Development of a Biophysical Rice Yield Model Using All-weather Climate Data (MODIS 전천후 기상자료 기반의 생물리학적 벼 수량 모형 개발)

  • Lee, Jihye;Seo, Bumsuk;Kang, Sinkyu
    • Korean Journal of Remote Sensing / v.33 no.5_2 / pp.721-732 / 2017
  • With the increasing socio-economic importance of rice as a global staple food, several models have been developed for rice yield estimation by combining remote sensing data with carbon cycle modelling. In this study, we aimed to estimate rice yield in Korea with an integrative model that combines satellite remote sensing data with a biophysical crop growth model. Specifically, daily meteorological inputs derived from MODIS (Moderate Resolution Imaging Spectroradiometer) and radar satellite products were used to run a light use efficiency based crop growth model, which is based on the MODIS gross primary production (GPP) algorithm. The modelled biomass was converted to rice yield using a harvest index model. We estimated rice yield from 2003 to 2014 at the county level and evaluated the modelled yield using the official rice yield and rice straw biomass statistics of Statistics Korea (KOSTAT). The estimated rice biomass, yield, and harvest index and their spatial distributions were investigated. Annual mean rice yield at the national level showed good agreement with the yield statistics, with a mean error (ME) of +0.56% and a mean absolute error (MAE) of 5.73%. The estimated county-level yield resulted in small ME (+0.10~+2.00%) and MAE (2.10~11.62%). Compared to the county-level yield statistics, the rice yield was overestimated in the counties in Gangwon province and underestimated in the urban and coastal counties in the south of Chungcheong province. Compared to the rice straw statistics, the estimated rice biomass showed error patterns similar to those of the yield estimates. The subpixel heterogeneity of the 1 km MODIS FPAR (Fraction of absorbed Photosynthetically Active Radiation) may have contributed to these errors. In addition, the growth and harvest index models can be further developed to take account of annually varying growth conditions and growth timings.

Early Prediction of Fine Dust Concentration in Seoul using Weather and Fine Dust Information (기상 및 미세먼지 정보를 활용한 서울시의 미세먼지 농도 조기 예측)

  • HanJoo Lee;Minkyu Jee;Hakdong Kim;Taeheul Jun;Cheongwon Kim
    • Journal of Broadcast Engineering / v.28 no.3 / pp.285-292 / 2023
  • Recently, the impact of fine dust on health has become a major topic. Fine dust is dangerous because it can penetrate the body and affect the respiratory system, without being filtered out by the mucous membrane in the nose. Since fine dust is directly related to the industry, it is practically impossible to completely remove it. Therefore, if the concentration of fine dust can be predicted in advance, pre-emptive measures can be taken to minimize its impact on the human body. Fine dust can travel over 600km in a day, so it not only affects neighboring areas, but also distant regions. In this paper, wind direction and speed data and a time series prediction model were used to predict the concentration of fine dust in Seoul, and the correlation between the concentration of fine dust in Seoul and the concentration in each region was confirmed. In addition, predictions were made using the concentration of fine dust in each region and in Seoul. The lowest MAE (mean absolute error) in the prediction results was 12.13, which was about 15.17% better than the MAE of 14.3 presented in previous studies.
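
The "about 15.17% better" figure in this abstract is consistent with a relative reduction in MAE measured against the earlier result. A small sketch of that arithmetic follows; the function name is hypothetical, and reading the figure as a relative reduction is an assumption about how it was computed.

```python
def relative_improvement(baseline_mae, new_mae):
    """Percentage reduction in MAE relative to the baseline value."""
    return (baseline_mae - new_mae) / baseline_mae * 100.0


# Values quoted in the abstract: previous MAE 14.3, new lowest MAE 12.13.
improvement = relative_improvement(14.3, 12.13)
print(f"{improvement:.2f}%")  # prints "15.17%"
```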

Restoration of Missing Data in Satellite-Observed Sea Surface Temperature using Deep Learning Techniques (딥러닝 기법을 활용한 위성 관측 해수면 온도 자료의 결측부 복원에 관한 연구)

  • Won-Been Park;Heung-Bae Choi;Myeong-Soo Han;Ho-Sik Um;Yong-Sik Song
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.6 / pp.536-542 / 2023
  • Satellites represent cutting-edge technology, offering significant advantages in spatial and temporal observations. National agencies worldwide harness satellite data to respond to marine accidents and analyze ocean fluctuations effectively. However, challenges arise with high-resolution satellite-based sea surface temperature data (Operational Sea Surface Temperature and Sea Ice Analysis, OSTIA), where gaps or empty areas may occur due to satellite instrumentation, geographical errors, and cloud cover. These issues can take several hours to rectify. This study addressed the issue of missing OSTIA data by employing LaMa, the latest deep learning-based algorithm. We evaluated its performance by comparing it to three existing image processing techniques. The results of this evaluation, using the coefficient of determination (R2) and mean absolute error (MAE) values, demonstrated the superior performance of the LaMa algorithm. It consistently achieved R2 values of 0.9 or higher and kept MAE values at or below 0.5 ℃. This outperformed the traditional methods, including bilinear interpolation, bicubic interpolation, and DeepFill v1 techniques. We plan to evaluate the feasibility of integrating the LaMa technique into an operational satellite data provision system.

Effect of input variable characteristics on the performance of an ensemble machine learning model for algal bloom prediction (앙상블 머신러닝 모형을 이용한 하천 녹조발생 예측모형의 입력변수 특성에 따른 성능 영향)

  • Kang, Byeong-Koo;Park, Jungsu
    • Journal of Korean Society of Water and Wastewater / v.35 no.6 / pp.417-424 / 2021
  • Algal bloom is an ongoing issue in the management of freshwater systems for drinking water supply, and the chlorophyll-a concentration is commonly used to represent the status of algal bloom. Thus, the prediction of chlorophyll-a concentration is essential for the proper management of water quality. However, the chlorophyll-a concentration is affected by various water quality and environmental factors, so predicting it is not an easy task. In recent years, advanced machine learning algorithms have increasingly been used to develop surrogate models that predict the chlorophyll-a concentration in freshwater systems such as rivers or reservoirs. This study used a light gradient boosting machine (LightGBM), a gradient boosting decision tree algorithm, to develop an ensemble machine learning model to predict chlorophyll-a concentration. The field water quality data observed at Daecheong Lake, obtained from the real-time water information system in Korea, were used for the development of the model. The data include temperature, pH, electric conductivity, dissolved oxygen, total organic carbon, total nitrogen, total phosphorus, and chlorophyll-a. First, a LightGBM model was developed to predict the chlorophyll-a concentration by using the other seven items as independent input variables. Second, the time-lagged values of all the input variables were added as input variables to understand the effect of the time lag of input variables on model performance. The time lag (i) ranges from 1 to 50 days. The model performance was evaluated using three indices: root mean squared error-observation standard deviation ratio (RSR), Nash-Sutcliffe coefficient of efficiency (NSE), and mean absolute error (MAE). The model showed the best performance when a dataset with a one-day time lag (i=1) was added, where RSR, NSE, and MAE were 0.359, 0.871, and 1.510, respectively. An improvement in model performance was observed when datasets with a time lag of up to about 15 days (i=15) were added.
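
The three indices named in this abstract can be sketched as follows, assuming their commonly used definitions: RSR is the RMSE divided by the standard deviation of the observations, and NSE is one minus the ratio of the squared-error sum to the observation variance sum. The function names and sample values are hypothetical and are not the Daecheong Lake data.

```python
from math import sqrt


def _error_sums(obs, sim):
    """Return (sum of squared errors, sum of squared deviations of obs from its mean)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return sse, sst


def mae(obs, sim):
    """Mean absolute error between observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    sse, sst = _error_sums(obs, sim)
    return 1.0 - sse / sst


def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    sse, sst = _error_sums(obs, sim)
    return sqrt(sse) / sqrt(sst)


# Illustrative values only (an assumption, not measurements).
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [2.5, 3.5, 6.5, 7.0]

scores = {
    "MAE": mae(observed, simulated),
    "NSE": nse(observed, simulated),
    "RSR": rsr(observed, simulated),
}
```

Under these definitions NSE = 1 − RSR², which agrees with the values reported in the abstract (1 − 0.359² ≈ 0.871).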