• Title/Summary/Keyword: RandomForest

Search Result 1,033, Processing Time 0.024 seconds

A Comparative Study on Mapping and Filtering Radii of Local Climate Zone in Changwon city using WUDAPT Protocol (WUDAPT 절차를 활용한 창원시의 국지기후대 제작과 필터링 반경에 따른 비교 연구)

  • Tae-Gyeong KIM;Kyung-Hun PARK;Bong-Geun SONG;Seoung-Hyeon KIM;Da-Eun JEONG;Geon-Ung PARK
    • Journal of the Korean Association of Geographic Information Studies / v.27 no.2 / pp.78-95 / 2024
  • To establish and compare environmental plans across domains in light of climate change and urban issues, it is crucial to build regional-scale spatial data classified with consistent criteria. This study mapped the Local Climate Zones (LCZ) of Changwon City, where climate and environmental research is actively conducted, using the protocol suggested by the World Urban Database and Access Portal Tools (WUDAPT). To address fragmentation, in which some grid cells are assigned a different class even though they lie in areas with homogeneous climate characteristics, a filtering technique was applied and the LCZ classification characteristics were compared across filtering radii. Using satellite images, ground reference data, and the supervised machine learning classifier Random Forest, classification maps were produced without filtering and with filtering radii of 1, 2, and 3, and their accuracies were compared. Furthermore, to compare the LCZ classification characteristics by building type in urban areas, an urban form index used in GIS-based classification methodology was computed and compared with the ranges suggested in previous studies. The overall accuracy was highest when the filtering radius was 1. The differences in the urban form index between LCZ types were minimal, and most values fell within the ranges of previous studies. However, the study identified a limitation in reflecting building height information, and adding data to compensate for this is expected to yield more accurate results. The findings can serve as reference material for creating fundamental spatial data for urban climate research in South Korea.
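
The filtering step described in this abstract can be illustrated with a simple majority filter. This is a minimal sketch, not the WUDAPT implementation: it assumes a square window of side 2 × radius + 1 on a rectangular grid of class labels, and the function name `majority_filter` and the toy grid are hypothetical.

```python
from collections import Counter


def majority_filter(grid, radius):
    """Return a copy of `grid` in which each cell takes the most common
    label found in the square window of the given radius around it."""
    if radius <= 0:
        return [row[:] for row in grid]
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            # Collect labels in the window, clipped at the grid border.
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            counts = Counter(window)
            best = max(counts.values())
            # On ties, keep the original label so cells are not flipped arbitrarily.
            if counts[grid[r][c]] == best:
                new_row.append(grid[r][c])
            else:
                new_row.append(counts.most_common(1)[0][0])
        out.append(new_row)
    return out


# Toy grid (invented): one isolated cell of class 6 inside a class 2 area.
lcz = [
    [2, 2, 2, 2],
    [2, 6, 2, 2],
    [2, 2, 2, 2],
]
filtered = majority_filter(lcz, radius=1)
```

A larger radius removes more isolated cells but also erases small genuine patches, which is one plausible reason an intermediate radius can give the best overall accuracy.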

Basic Research on the Possibility of Developing a Landscape Perceptual Response Prediction Model Using Artificial Intelligence - Focusing on Machine Learning Techniques - (인공지능을 활용한 경관 지각반응 예측모델 개발 가능성 기초연구 - 머신러닝 기법을 중심으로 -)

  • Kim, Jin-Pyo;Suh, Joo-Hwan
    • Journal of the Korean Institute of Landscape Architecture / v.51 no.3 / pp.70-82 / 2023
  • The recent surge of IT and data acquisition is shifting the paradigm in all aspects of life, and these advances are also affecting academic fields. Research topics and methods are being improved through academic exchange and connections. In particular, data-based research methods are employed in various academic fields, including landscape architecture, where continuous research is needed. This study therefore investigates the possibility of developing a landscape preference evaluation and prediction model using machine learning, a branch of artificial intelligence. To this end, machine learning techniques were applied to the landscape field to build a landscape preference evaluation and prediction model and to verify its simulation accuracy. Landscape images of wind power facilities, which have recently attracted attention as a renewable energy source, were selected as the research objects. Images of wind power facility landscapes were collected using web crawling, and an analysis dataset was built. Orange version 3.33, a program from the University of Ljubljana, was used for the machine learning analysis to derive a prediction model with excellent performance. A model that integrates the evaluation criteria and a model with a separate structure for each evaluation criterion were built using the kNN, SVM, Random Forest, Logistic Regression, and Neural Network algorithms, which are suitable for machine learning classification. The generated models were evaluated to derive the most suitable prediction model. The prediction model derived in this study evaluates three criteria separately (classification by type of landscape, classification by distance between landscape and target, and classification by preference) and then synthesizes the results into a prediction. The resulting model achieved high accuracies of 0.986 for the landscape type criterion, 0.973 for the distance criterion, and 0.952 for the preference criterion, and verification through evaluation of the prediction results showed that it exceeds the required performance value. As an experimental attempt to investigate the possibility of developing a prediction model using machine learning in landscape-related research, this study confirmed that a high-performance prediction model can be created by building a dataset through the collection and refinement of image data and then applying it in landscape-related research. Based on the results, implications, and limitations of this study, it appears possible to develop various types of landscape prediction models, including models for wind power facility, natural, and cultural landscapes. Machine learning techniques can become more useful and valuable in landscape architecture by exploring and applying research methods appropriate to the topic, by reducing data classification time with models that classify images by landscape type, and by analyzing the importance of landscape planning factors through the analysis of landscape prediction factors.

The big data method for flash flood warning (돌발홍수 예보를 위한 빅데이터 분석방법)

  • Park, Dain;Yoon, Sanghoo
    • Journal of Digital Convergence / v.15 no.11 / pp.245-250 / 2017
  • A flash flood is defined as flooding caused by intense rainfall over a relatively small area, in which water flows rapidly through rivers and valleys in a short time with no advance warning. It can therefore cause property damage and casualties. This study aims to establish a flash flood warning system using 38 accident records reported by the National Disaster Information Center and output from a land surface model (TOPLATS) between 2009 and 2012. Three variables from the land surface model were used: precipitation, soil moisture, and surface runoff. The values of the three variables for the 6 hours preceding each flash flood were reduced to 3 factors through factor analysis. Decision tree, random forest, naive Bayes, support vector machine, and logistic regression models were considered as big data methods. Prediction performance was evaluated by comparing Accuracy, Kappa, TP Rate, FP Rate, and F-Measure. The best method was suggested based on a reproducibility evaluation at each point of flash flood occurrence and on predicted versus actual counts using the 4 years of data.
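
The five evaluation measures named in this abstract can all be derived from a binary confusion matrix. The sketch below uses their standard definitions; the function name `binary_metrics` and the example counts are invented for illustration and are not results from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute Accuracy, Kappa, TP Rate, FP Rate and F-Measure from the
    four cells of a binary confusion matrix (no zero-division guards)."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    tp_rate = tp / (tp + fn)      # recall / sensitivity
    fp_rate = fp / (fp + tn)      # false alarm rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "tp_rate": tp_rate,
        "fp_rate": fp_rate,
        "f_measure": f_measure,
    }


# Invented counts: 30 hits, 10 false alarms, 8 misses, 52 correct rejections.
metrics = binary_metrics(tp=30, fp=10, fn=8, tn=52)
```

Kappa is useful alongside accuracy when events are rare, as flash floods are, because a model that never predicts a flood can still score a high accuracy.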

Isolation of an Agarolytic Bacteria, Cellvibrio mixtus SC-22 and The Enzymatic Properties (한천분해세균 Cellvibrio mixtus SC-22의 분리 및 효소적 특성)

  • Cha, Jeong-Ah;Kim, Yoo-Jin;Seo, Yung-Bum;Yoon, Min-Ho
    • Journal of Applied Biological Chemistry / v.52 no.4 / pp.157-162 / 2009
  • An agar-liquefying bacterium (SC-22), which produces a diffusible agarase that causes agar softening around the colony, was isolated from Daecheong Lake in Korea. Chemotaxonomic and phylogenetic analyses based on 16S rRNA gene sequences classified the strain as Cellvibrio mixtus SC-22. The isolate SC-22 showed maximal extracellular agarase activity of 58.5 U/mL after 48 h of cultivation in the presence of 0.2% agar. The isolate produced two kinds of extracellular and three kinds of intracellular isoenzymes. The major agarase was purified from the culture filtrate by ammonium sulfate precipitation, anion exchange, and gel filtration column chromatography. The molecular mass of the purified enzyme was estimated to be 25 kDa by SDS-PAGE. The optimum pH and temperature of the purified enzyme were pH 7.0 and $50^{\circ}C$, respectively. The agarase activity was enhanced by $Fe^{2+}$, $Na^+$ and $Ca^{2+}$ ions, while it was inhibited by $Hg^{2+}$, $Mn^{2+}$ and $Cu^{2+}$ at 1 mM concentration. The predominant hydrolysis products of agarose by the enzyme on TLC were galactose and disaccharide, indicating cleavage of the $\beta$-1,4 linkage in a random manner. Among various polysaccharides, the enzyme showed high substrate specificity for agar and agarose only.

Application of Machine Learning to Predict Weight Loss in Overweight, and Obese Patients on Korean Medicine Weight Management Program (한의 체중 조절 프로그램에 참여한 과체중, 비만 환자에서의 머신러닝 기법을 적용한 체중 감량 예측 연구)

  • Kim, Eunjoo;Park, Young-Bae;Choi, Kahye;Lim, Young-Woo;Ok, Ji-Myung;Noh, Eun-Young;Song, Tae Min;Kang, Jihoon;Lee, Hyangsook;Kim, Seo-Young
    • The Journal of Korean Medicine / v.41 no.2 / pp.58-79 / 2020
  • Objectives: This study aimed to predict weight loss by applying machine learning to real-world clinical data from overweight and obese adults enrolled in a weight loss program at 4 Korean Medicine obesity clinics. Methods: From January 2017 to May 2019, we collected data from overweight and obese adults (BMI ≥ 23 kg/m²) who registered for a 3-month Gamitaeeumjowi-tang prescription program. Predictive analysis was conducted at each of three prescription time points, and the expected reduction rate and weight reduction at the next prescription were predicted as binary classifications (classification benchmarks: highest quartile, median, lowest quartile). For the median benchmark, further analysis was conducted after variable selection. The data set size was 25,988 for the first analysis, 6,304 for the second, and 833 for the third. 5-fold cross-validation was used to prevent overfitting. Results: Prediction accuracy increased from the 1st to the 2nd and 3rd analyses. After variable selection based on the median, the artificial neural network showed the highest accuracy in the 1st (54.69%), 2nd (73.52%), and 3rd (81.88%) prediction analyses based on reduction rate. Prediction performance was additionally assessed with AUC; Random Forest showed the highest AUC in the 1st (0.640), 2nd (0.816), and 3rd (0.939) prediction analyses based on weight reduction. Conclusions: Predicting weight loss with machine learning became more accurate when initial weight loss information was used. The approach could be used to screen patients who need intensive intervention when the expected weight loss is low.
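
Two mechanics mentioned in this abstract, the median-based binary target and 5-fold cross-validation, can be sketched as follows. This is a minimal illustration under assumptions: the helper names `binarize_by_median` and `k_fold_indices` are hypothetical, values strictly above the median are treated as the positive class, and folds are formed by a seeded shuffle; the abstract does not specify these details.

```python
import random
from statistics import median


def binarize_by_median(values):
    """Label each value 1 if it lies above the median, else 0."""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding keeps fold sizes within one sample of each other.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 833 is the size of the third analysis data set reported in the abstract.
splits = list(k_fold_indices(833, k=5))
labels = binarize_by_median([1.0, 2.0, 3.0, 4.0])
```

Each sample appears in exactly one test fold, so every reported score is measured on data the model did not see during fitting.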

Convergence Implementing Emotion Prediction Neural Network Based on Heart Rate Variability (HRV) (심박변이도를 이용한 인공신경망 기반 감정예측 모형에 관한 융복합 연구)

  • Park, Sung Soo;Lee, Kun Chang
    • Journal of the Korea Convergence Society / v.9 no.5 / pp.33-41 / 2018
  • The purpose of this study is to develop a more accurate and robust emotion prediction neural network (EPNN) model by combining heart rate variability (HRV) with a neural network. To improve prediction performance more reliably, the proposed EPNN model embeds several types of activation functions, including hyperbolic tangent, linear, and Gaussian functions, in its hidden nodes. To verify the validity of the proposed EPNN model, a number of HRV metrics were calculated from 20 valid and qualified participants whose emotions were induced using a money game. To add rigor to the experiment, the participants' valence and arousal were checked and used as the output nodes of the EPNN. The experimental results show that the F-Measure for valence and arousal is 80% and 95%, respectively, indicating that the EPNN yields robust and well-balanced performance. The EPNN was compared with competing models such as a neural network, logistic regression, support vector machine, and random forest, and it was more accurate and reliable than the competing models. The results of this study can be applied to many types of wearable computing devices once a ubiquitous digital health environment becomes feasible and permeates everyday life.

Calibration of Portable Particulate Matter-Monitoring Device using Web Query and Machine Learning

  • Loh, Byoung Gook;Choi, Gi Heung
    • Safety and Health at Work / v.10 no.4 / pp.452-460 / 2019
  • Background: Monitoring and control of PM2.5 are recognized as key to addressing health issues attributed to PM2.5. The availability of low-cost PM2.5 sensors has made it possible to introduce a number of portable PM2.5 monitors based on light scattering to the consumer market at an affordable price. The accuracy of light scattering-based PM2.5 monitors depends significantly on the method of calibration. A static calibration curve is the most popular calibration method for low-cost PM2.5 sensors, particularly because of its ease of application. The drawback of this approach, however, is its lack of accuracy. Methods: This study discusses the calibration of a low-cost PM2.5-monitoring device (PMD) to improve its accuracy and reliability for practical use. The proposed method is based on the construction of a PM2.5 sensor network using the Message Queuing Telemetry Transport (MQTT) protocol and web query of reference measurement data available at a government-authorized PM monitoring station (GAMS) in the Republic of Korea. Four machine learning (ML) algorithms, namely support vector machine, k-nearest neighbors, random forest, and extreme gradient boosting, were used as regression models to calibrate the PMD measurements of PM2.5. The performance of each ML algorithm was evaluated using stratified K-fold cross-validation, and a linear regression model was used as a reference. Results: Based on the performance of the ML algorithms used, regression of the PMD output to the PM2.5 concentration data available from the GAMS through web query was effective. The extreme gradient boosting algorithm showed the best performance, with a mean coefficient of determination (R²) of 0.78 and a standard error of 5.0 µg/m³, corresponding to an 8% increase in R² and a 12% decrease in root mean square error in comparison with the linear regression model. A calibration period of at least 100 hours was found to be required to calibrate the PMD to its full capacity. The proposed calibration method requires the PMD to be located in the vicinity of the GAMS. As the number of PMDs participating in the sensor network increases, however, calibrated PMDs can be used as reference devices for nearby PMDs that require calibration, forming a calibration chain through the MQTT protocol. Conclusions: Calibration of a low-cost PMD, based on the construction of a PM2.5 sensor network using the MQTT protocol and web query of reference measurement data available at a GAMS, significantly improves the accuracy and reliability of the PMD, thereby making practical use of the low-cost PMD possible.
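
The linear regression reference model and the two scores quoted in this abstract (R² and root mean square error) can be sketched in a few lines. This is an illustrative baseline only, not the authors' pipeline: the function names and the four paired readings are invented, and the machine learning regressors from the study are not reproduced here.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_rmse(y_true, y_pred):
    """Return the coefficient of determination and root mean square error."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot, sqrt(ss_res / n)


# Invented paired readings: raw device output vs. reference station values.
raw = [10.0, 20.0, 30.0, 40.0]
reference = [13.0, 21.0, 33.0, 41.0]

slope, intercept = fit_line(raw, reference)
calibrated = [slope * v + intercept for v in raw]
r2, rmse = r2_and_rmse(reference, calibrated)
```

In practice the scores would be computed on held-out data, as with the cross-validation described in the abstract, rather than on the readings used for fitting.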

A Smart Farm Environment Optimization and Yield Prediction Platform based on IoT and Deep Learning (IoT 및 딥 러닝 기반 스마트 팜 환경 최적화 및 수확량 예측 플랫폼)

  • Choi, Hokil;Ahn, Heuihak;Jeong, Yina;Lee, Byungkwan
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.12 no.6 / pp.672-680 / 2019
  • This paper proposes "A Smart Farm Environment Optimization and Yield Prediction Platform based on IoT and Deep Learning," which gathers bio-sensor data from farms, diagnoses the diseases of growing crops, and predicts the year's harvest. The platform collects all currently available information, such as weather and soil microbes, optimizes the farm environment so that crops can grow well, diagnoses crop diseases from the leaves of crops grown on the farm, and predicts the year's harvest using all the information on the farm. The results show that the average accuracy of the AEOM is about 15% higher than that of the RF and about 8% higher than that of the GBD. As the amount of data increases, its accuracy decreases less than that of the RF or GBD. Linear regression shows that the slope of accuracy is -3.641E-4 for the ReLU, -4.0710E-4 for the Sigmoid, and -7.4534E-4 for the step function. Therefore, as the amount of test data increases, the ReLU remains more accurate than the other two activation functions. The proposed platform manages the entire farm and, if introduced to actual farms, is expected to contribute greatly to the development of smart farms in Korea.
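
For reference, the three activation functions compared in this abstract have the following standard definitions. This is a generic sketch: the step threshold at zero is an assumption, and the abstract does not say how the functions are used inside the platform's network.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def step(x):
    """Binary step: 1 at or above the threshold (assumed to be 0), else 0."""
    return 1.0 if x >= 0 else 0.0


samples = [-2.0, 0.0, 3.0]
outputs = {f.__name__: [f(v) for v in samples] for f in (relu, sigmoid, step)}
```

Unlike the step function, ReLU and the sigmoid have usable gradients, which is a common reason they train more reliably with gradient-based methods.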

A study on the optimum cutter spacing ratio according to penetration depth using decision tree-based and SVM regressions (의사결정나무 기반 회귀분석과 SVM 회귀분석을 이용한 커터 관입깊이에 따른 최적 커터간격 비 연구)

  • Lee, Gi-Jun;Ryu, Hee-Hwan;Kwon, Tae-Hyuk
    • Journal of Korean Tunnelling and Underground Space Association / v.22 no.5 / pp.501-513 / 2020
  • Cutting tests to determine cutter placement on the cutter head have been conducted in various studies. Although the cutter spacing at the minimum specific energy is mainly reflected in cutter head design, the optimum cutter spacing at the same cutter penetration depth varies with rock conditions, so studies on determining the optimum cutter spacing should be actively pursued. Machine learning techniques, namely a decision tree-based regression model and an SVM regression model, were applied to predict the optimum cutter spacing ratio from the nonlinear relationship between cutter penetration depth and cutter spacing. Because decision tree-based methods are strongly influenced by the amount of data, SVM regression predicted the optimum cutter spacing ratio according to penetration depth more accurately. SVM regression is therefore judged to be effective for deciding cutter spacing in cutter head design, provided that a large amount of data on the optimum cutter spacing ratio according to penetration depth is accumulated.

A Study on Detecting Fake Reviews Using Machine Learning: Focusing on User Behavior Analysis (머신러닝을 활용한 가짜리뷰 탐지 연구: 사용자 행동 분석을 중심으로)

  • Lee, Min Cheol;Yoon, Hyun Shik
    • Knowledge Management Research / v.21 no.3 / pp.177-195 / 2020
  • Growing social awareness of fake reviews has prompted researchers to propose ways to cope with them, either by analyzing the content of fake reviews or by detecting them through their structural characteristics. This study collected data from Naver blog posts, applied a machine learning model to variables extracted from blogs and blog posts to detect habitual patterns that users follow unconsciously, and aimed to use the technique to predict fake reviews. Data analysis showed a very strong relationship between the total number of posts registered on the writer's blog and the date on which the post was registered. Random Forest was found to be the most suitable model for detecting advertising reviews. If a review is predicted to be an advertising review by the model suggested in this study, it is very likely to be a fake review and to violate the guidelines on investigation into markings and advertising regarding recommendation and guarantee in the Law of Marking and Advertising. Instead of morphological analysis of the review text, this study adopts behavior analysis of the writer, collects characteristic data of blogs and blog posts with an automated system rather than manually, and determines whether a given post is advertising; this approach is expected to improve the efficiency and effectiveness of fake review detection.