• Title/Summary/Keyword: 결측치 (missing values)

Store Sales Prediction Using Gradient Boosting Model (그래디언트 부스팅 모델을 활용한 상점 매출 예측)

  • Choi, Jaeyoung;Yang, Heeyoon;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.2 / pp.171-177 / 2021
  • With the rapid development of machine learning, its applications have spread not only across industry but also into daily life. Applications of machine learning to financial data have also attracted interest. Herein, we apply machine learning algorithms to store sales data and present future applications for fintech enterprises. We use several missing-data processing methods and apply gradient boosting algorithms (XGBoost, LightGBM, and CatBoost) to predict the future revenue of individual stores. We found that median imputation of missing data combined with the XGBoost algorithm gave the best accuracy. By employing the proposed method, both fintech enterprises and their customers can benefit: stores can receive financial assistance in advance from fintech companies, while these companies can offer financial support to such stores at low risk.
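The median imputation step named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `impute_median` and the revenue figures are hypothetical, and the gradient boosting stage is left out.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical monthly revenue for one store; None marks a missing month.
monthly_revenue = [120.0, None, 95.0, 110.0, None, 130.0]
filled = impute_median(monthly_revenue)
```

The median is less sensitive to a few unusually large sales months than the mean, which is one common reason to prefer it for skewed revenue data.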

A Study on the Index Estimation of Missing Real Estate Transaction Cases Using Machine Learning (머신러닝을 활용한 결측 부동산 매매 지수의 추정에 대한 연구)

  • Kim, Kyung-Min;Kim, Kyuseok;Nam, Daisik
    • Journal of the Economic Geographical Society of Korea / v.25 no.1 / pp.171-181 / 2022
  • The real estate price index plays a key role as quantitative data in real estate market analysis. International organizations including the OECD publish real estate price indexes by country, and the Korea Real Estate Board announces metropolitan-level and municipal-level indexes. However, when the index is computed for spatial units smaller than the metropolitan or municipal level, missing values become a problem. As the spatial scope narrows, some unit periods have few or no transactions, which makes index calculation difficult or even impossible. This study proposes a supervised machine learning model to compensate for missing values that arise when no transactions occur in a specific area and period. We verify the accuracy of the proposed models in predicting both existing and missing values.

The Effect of Decision-making Attitudes within the Family on the Human Rights Awareness of Adolescents: Mediating Effect of Self-Esteem (가족 내 의사결정 태도가 청소년의 인권의식에 미치는 영향: 자아존중감의 매개효과)

  • Kim, Jung-Hui;Choi, Yeon-Sun
    • Journal of Industrial Convergence / v.20 no.10 / pp.131-136 / 2022
  • This study examines the mediating effect of self-esteem in the influence of family decision-making attitudes on adolescents' human rights awareness. To this end, data surveyed in 2018 by the Korea Youth Policy Research Institute were analyzed. After extracting 693 adolescents with part-time work experience from all respondents, missing values, outliers, and weights were removed, and a total of 511 people were selected as the final research subjects. The SPSS WIN 25.0 program was used to verify the influence and mediating effect between measurement variables. The analysis confirmed a partial mediating effect of self-esteem in the influence of decision-making attitudes within the family on the human rights awareness of adolescents. In addition, the Sobel test was conducted to confirm the significance of the mediating effect of self-esteem. Based on these results, the need for social welfare intervention was suggested to promote desirable communication between parents and children, raise awareness of human rights, and enhance self-esteem.
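For readers unfamiliar with the Sobel test mentioned here, the sketch below computes the standard statistic z = ab / sqrt(b²·se_a² + a²·se_b²), where a is the path from predictor to mediator and b the path from mediator to outcome. The coefficient values are hypothetical and are not taken from the study; the function names are also illustrative.

```python
import math


def sobel_z(a, se_a, b, se_b):
    """Sobel z statistic for the indirect effect a * b."""
    se_ab = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    return (a * b) / se_ab


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical path coefficients and standard errors (not from the study).
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
p = two_sided_p(z)
```

A |z| above roughly 1.96 corresponds to significance at the 5% level for the indirect effect.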

Analysis of Factors Affecting Academic Ability of Preschool-age Children

  • Moon, Kyung-Im
    • Journal of the Korea Society of Computer and Information / v.27 no.5 / pp.205-213 / 2022
  • This study analyzes the relationships among the latent variables of self-development, social development, learning readiness, and academic ability using data from the Panel Study on Korean Children surveyed in 2014, and identifies factors affecting the academic ability of preschool children. The subjects were 6-year-old children from 1113 of the 2150 households in the 7th Panel Study on Korean Children (2014) data, after excluding 1037 households with non-responses or system-missing values. The path analysis of the research model showed that self-development had a direct effect on academic skills and also a significant indirect effect mediated by social development and learning readiness. In addition, among self-development, social development, and learning readiness, learning readiness had the greatest influence on academic skills. Consequently, the academic skills of preschool-age children should be given great importance so that these children can develop into talents with creativity and problem-solving ability.

Data Quality Assessment and Improvement for Water Level Prediction of the Han River (한강 수위 예측을 위한 데이터 품질 진단 및 개선)

  • Ji-Hyun Choi;Jin-Yeop Kang;Hyun Ahn
    • Journal of Advanced Navigation Technology / v.27 no.1 / pp.133-138 / 2023
  • As a side effect of recent rapid climate change and global warming, the frequency and scale of flood disasters are increasing worldwide. In Korea, the water level of the Han River is a major management target for preventing flood disasters in Seoul, the capital. In this paper, to improve machine learning-based water level prediction for the Han River, we perform a comprehensive assessment of the quality of the related dataset and propose data preprocessing methods to improve it. Specifically, we improve the dataset in terms of completeness, validity, and accuracy through missing value processing and cross-correlation analysis. In addition, we conduct a performance evaluation using random forest and LightGBM to analyze the effect of the proposed data improvement method on water level prediction performance.
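One common form of cross-correlation analysis is to find the time lag at which a driver series best correlates with the target series. The sketch below illustrates that idea only; the abstract does not state how the authors performed the analysis, and the series, the function names `pearson` and `best_lag`, and the lag range are all assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def best_lag(driver, target, max_lag):
    """Return (lag, r) maximizing |corr(driver[t - lag], target[t])| for lag in 0..max_lag."""
    scores = {}
    for lag in range(max_lag + 1):
        x = driver[: len(driver) - lag]  # driver values at time t - lag
        y = target[lag:]                 # target values at time t
        scores[lag] = pearson(x, y)
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]


# Hypothetical data: the target repeats the driver two steps later.
driver = [0, 1, 4, 2, 0, 3, 5, 1, 0, 2]
target = [0, 0] + driver[:-2]
lag, r = best_lag(driver, target, max_lag=3)
```

A lag found this way can guide which shifted input features are worth feeding to a downstream model.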

Smoothed RSSI-Based Distance Estimation Using Deep Neural Network (심층 인공신경망을 활용한 Smoothed RSSI 기반 거리 추정)

  • Hyeok-Don Kwon;Sol-Bee Lee;Jung-Hyok Kwon;Eui-Jik Kim
    • Journal of Internet of Things and Convergence / v.9 no.2 / pp.71-76 / 2023
  • In this paper, we propose a smoothed received signal strength indicator (RSSI)-based distance estimation scheme using a deep neural network (DNN) for accurate distance estimation in an environment where a single receiver is used. The proposed scheme performs data preprocessing consisting of data splitting, missing value imputation, and smoothing steps to improve distance estimation accuracy, thereby deriving the smoothed RSSI values. The smoothed RSSI values are used as input to a Multi-Input Single-Output (MISO) DNN model, pass through the input and hidden layers, and are returned as an estimated distance at the output layer. To verify the proposed scheme, we compared its performance with that of a linear regression-based distance estimation scheme. The proposed scheme showed 29.09% higher distance estimation accuracy than the linear regression-based scheme.
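The imputation and smoothing steps of the preprocessing can be sketched as below. The abstract does not name the specific methods, so forward fill and a trailing moving average are assumptions chosen for illustration, as are the function names, the window size, and the RSSI readings.

```python
def forward_fill(samples):
    """Replace None with the most recent observed value; the first sample must be observed."""
    filled = []
    last = None
    for s in samples:
        if s is None:
            if last is None:
                raise ValueError("series starts with a missing value")
            s = last
        filled.append(s)
        last = s
    return filled


def moving_average(samples, window):
    """Trailing moving average; early positions average over the samples available so far."""
    out = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical RSSI readings in dBm; None marks a dropped packet.
rssi = [-60, None, -64, -62, None, -70]
smoothed = moving_average(forward_fill(rssi), window=3)
```

The smoothed values would then form the inputs to the distance estimation model.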

Comparison of Machine Learning Techniques in Urban Weather Prediction using Air Quality Sensor Data (실외공기측정기 자료를 이용한 도심 기상 예측 기계학습 모형 비교)

  • Jong-Chan Park;Heon Jin Park
    • The Journal of Bigdata / v.6 no.2 / pp.39-49 / 2021
  • Recently, large and diverse weather data are being collected by sensors from various sources. Efforts to predict the concentration of fine dust through machine learning are widespread, and this study compares PM10 and PM2.5 prediction models using data from 840 outdoor air meters installed throughout the city. Predicting the fine dust concentration 5 minutes ahead allows information to be provided in real time and can serve as the basis for developing models that predict 10 minutes, 30 minutes, and 1 hour ahead. Data preprocessing, such as noise removal and missing value replacement, was performed, and derived variables that account for temporal and spatial factors were created. The parameters of the models were selected through the response surface method. XGBoost, Random Forest, and Deep Learning (Multilayer Perceptron) were used as predictive models to check the difference between observed and predicted fine dust concentrations and to compare performance between models.

Analysis of water demand characteristics using water consumption data measured by smart water meter from block 112 in YeongJong Island (영종도 112 블록 지능형 수도 계량기에서 계측된 물 사용 자료를 이용한 용도별 물 수요 특성 분석)

  • Koo, Kang Min;Han, Kuk Heon;Jun, Kyung Soo;Yum, Kyung Taek
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.390-390 / 2022
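  • Urban domestic water demand continues to rise with improving living standards and urbanization, and now faces problems such as climate change, aging facilities, urbanization, and water pollution. These problems deepen water scarcity and make it harder for current water supply systems to allocate limited water resources. To address this, smart water grid technology has been introduced into water supply systems, making it possible to monitor individual consumers' water consumption more precisely and in real time using smart water meters. Water consumption data based on real-time measurement can help forecast future water demand and manage water operations. Domestic water can be classified by use or by billing criteria into residential, office, commercial, bathhouse, and industrial categories. Countries such as the United States and Australia have developed water-saving measures by strengthening monitoring by use category in preparation for water shortages. In Korea, earlier studies have attempted to classify non-residential water (all water uses other than residential) systematically, but the classification scheme has not been standardized. This is because the consumption characteristics of individual consumers by use have not been analyzed sufficiently, and because many local governments still rely on monthly manual meter reading, so sufficient water consumption data have not been available. In this study, we used water consumption data collected at 1-hour intervals from January 1, 2018 to January 1, 2020 from 527 individual consumers in the smart water grid pilot plant built in block 112 of YeongJong Island to analyze water demand characteristics by daily average peak consumption and its time of occurrence, pipe diameter, day of the week, and season. Missing and erroneous measurements in the collected data were corrected to improve the reliability of the data. The results help to better understand water demand characteristics by use and are expected to serve as baseline data for classifying non-residential water by use.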
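Correcting missing and erroneous readings in an hourly meter series could look like the sketch below. The abstract does not describe the correction method, so treating negative readings as erroneous and filling gaps by linear interpolation are assumptions made for illustration; the function names and the readings are hypothetical.

```python
def mask_invalid(series, lower=0.0):
    """Mark readings below `lower` as missing (None); existing None entries are kept."""
    return [None if (v is not None and v < lower) else v for v in series]


def interpolate_gaps(series):
    """Linearly interpolate None entries that lie between two observed values."""
    result = list(series)
    known = [i for i, v in enumerate(result) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for k in range(1, gap):
            step = (result[right] - result[left]) * k / gap
            result[left + k] = result[left] + step
    return result  # leading or trailing None entries are left unchanged


# Hypothetical hourly consumption in liters; None is a gap, -5.0 an erroneous reading.
hourly = [10.0, None, None, 16.0, -5.0, 12.0]
cleaned = interpolate_gaps(mask_invalid(hourly))
```

Leading and trailing gaps are deliberately left as missing, since interpolation needs an observed value on both sides.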

Analysis of Phenological Changes by Phenocams on Some Major Species Distributed in Wetland and Forest Ecosystems in Korea (Phenocam을 활용한 국내 습지 및 산림생태계 대표 수종의 계절적 변화 분석)

  • Minki Hong;Hyohyemi Lee;Jeong-Soo Park
    • Ecology and Resilient Infrastructure / v.10 no.4 / pp.226-236 / 2023
  • As climate change intensifies, the importance of studying plant phenology has increased, leading to a surge in research employing automated video recording devices such as Phenocams. In this study, using the Phenocams operated by the National Institute of Ecology, we examined the trends in plant phenological changes across diverse ecosystem types in South Korea and analyzed their correlations with climate factors. The patterns of plant phenological changes varied by region and tree species. Pinus thunbergii and Pinus densiflora typically showed an overall increase in their growth period, which correlated positively with winter temperature and precipitation. For Abies koreana on Hallasan Mt., however, higher precipitation in August led to an earlier end of season (EOS), and a correlation analysis with the recent dieback of A. koreana seems necessary. Beyond the analysis, solutions for handling missing data during the data collection process were proposed. Furthermore, to expand the scope of future research and encompass diverse ecosystem types, combining Phenocam research with satellite observations was suggested.

The Mediating Effect of Customer Satisfaction in the Relationship between Bakery Cafes Servicescape and Revisit Intention (베이커리카페의 서비스스케이프와 재방문의도 간 관계에 고객만족의 매개효과)

  • Kwon, Ki-Wan;Woo, Sung-Keun
    • Culinary science and hospitality research / v.21 no.6 / pp.14-27 / 2015
  • This study aims to analyze the influence of bakery cafes' servicescape on customer satisfaction and revisit intention, and to verify the mediating effect of customer satisfaction on the relationship between servicescape and revisit intention. The study targeted 10 bakery cafes located in Seoul, and after seeking the understanding of the persons in charge of the bakery cafes, a survey of customers aged 20 or over was conducted over 10 days from March 15th to 24th, 2015. A total of 250 self-administered questionnaires were distributed, and 244 questionnaires (97.6%) were used for analysis after the exclusion of 6 incomplete and unreliable responses. To investigate the demographic characteristics of the respondents, a frequency analysis was carried out; to verify the reliability and validity of the measuring tools, a reliability analysis and exploratory factor analysis were carried out; and to test the research hypotheses, simple and multiple regression analyses as well as a mediation analysis were carried out. All data were analyzed using the SPSS 18.0 statistical program. The findings showed that servicescape influenced customer satisfaction and revisit intention, and that customer satisfaction had a mediating effect. Based on these findings, future marketing strategies and differentiated servicescape application methods for bakery cafes were suggested. Moreover, the limitations of the study and directions for further research were discussed.