• Title/Summary/Keyword: Machine Learning system

Analysis on the Determinants of Land Compensation Cost: The Use of the Construction CALS Data (토지 보상비 결정 요인 분석 - 건설CALS 데이터 중심으로)

  • Lee, Sang-Gyu;Seo, Myoung-Bae;Kim, Jin-Uk
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.10 / pp.461-470 / 2020
  • This study analyzed the determinants of land compensation costs using data generated by the construction CALS (Continuous Acquisition & Life-Cycle Support) system, which covers the construction process (planning, design, building, management). The analysis used eight variables drawn from related research on land costs: Land Area, Individual Public Land Price, Appraisal & Assessment, Land Category, Use District 1, Terrain Elevation, Terrain Shape, and Road. The variables were first analyzed with the machine learning-based XGBoost algorithm, which identified Individual Public Land Price as the most important variable in determining land cost. A multiple linear regression analysis was then used to verify the determinants of land compensation. In this verification, the dependent variable was the Individual Public Land Price, and the independent variables were one numeric variable (Land Area) and five factor variables (Land Category, Use District 1, Terrain Elevation, Terrain Shape, Road). The significant variables were Land Category, Use District 1, and Road.

A Method of Detecting the Aggressive Driving of Elderly Driver (노인 운전자의 공격적인 운전 상태 검출 기법)

  • Koh, Dong-Woo;Kang, Hang-Bong
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.537-542 / 2017
  • Aggressive driving is a major cause of car accidents. Previous studies have mainly analyzed the aggressive driving tendencies of young drivers, and they relied only on plain clustering or classification techniques from machine learning. However, because elderly drivers have different driving habits owing to their frailer physical condition, a new method, such as one that enhances the characteristics of the driving data, is needed to properly analyze their aggressive driving. In this study, acceleration data collected from a smartphone in a moving vehicle are analyzed with the newly proposed ECA (Enhanced Clustering method for Acceleration data) technique, coupled with conventional clustering techniques (K-means clustering and the expectation-maximization (EM) algorithm). ECA selects high-intensity data from the clusters detected through K-means and EM across all subjects' data and models the characteristic data through scaled values. With this method, unlike with plain clustering, aggressive driving data were obtained for all young and elderly participants in the experiment. We further found that K-means clustering has higher detection efficiency than the EM method. The K-means results also show that young drivers have a driving intensity 1.29 times higher than that of elderly drivers. In conclusion, the proposed method can detect aggressive driving maneuvers in data from elderly drivers, whose operating intensity is low, and can be used to build a customized safe driving system for elderly drivers. In the future, it will be possible to detect abnormal driving conditions and to use the collected data for early warnings to drivers.
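
The clustering step described in this abstract can be illustrated with a minimal sketch. This is not the authors' ECA implementation: it only shows plain K-means on scalar acceleration magnitudes, followed by a hypothetical selection of the highest-intensity cluster and scaling by that cluster's peak. The function names, the deterministic initialization, and the sample values are all assumptions made for illustration.

```python
def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's K-means on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start (an assumption): spread centroids across the sorted data.
    centroids = [ordered[i * (n - 1) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def high_intensity(values, k=2):
    """Hypothetical helper: keep the cluster with the largest centroid
    and scale its values by the cluster's peak magnitude."""
    centroids, labels = kmeans_1d(values, k)
    top = max(range(k), key=lambda c: centroids[c])
    selected = [v for v, lab in zip(values, labels) if lab == top]
    peak = max(selected)
    return selected, [v / peak for v in selected]


if __name__ == "__main__":
    # Made-up acceleration magnitudes (arbitrary units), not data from the study.
    magnitudes = [0.1, 0.2, 0.15, 0.3, 2.1, 2.4, 0.25, 1.9]
    print(high_intensity(magnitudes))
```

Scaling within the selected cluster is one simple way to compare drivers whose overall operating intensity differs; the abstract does not state which scaling the authors used.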

Analysis of Disaster Safety Situation Classification Algorithm Based on Natural Language Processing Using 119 Calls Data (119 신고 데이터를 이용한 자연어처리 기반 재난안전 상황 분류 알고리즘 분석)

  • Kwon, Su-Jeong;Kang, Yun-Hee;Lee, Yong-Hak;Lee, Min-Ho;Park, Seung-Ho;Kang, Myung-Ju
    • KIPS Transactions on Software and Data Engineering / v.9 no.10 / pp.317-322 / 2020
  • With the development of artificial intelligence, AI is being used in disaster response support systems. Disasters can occur anywhere, at any time. Reports of a disaster fall into four types: fire, rescue, emergency, and other calls. The response to a 119 call also differs depending on the type and situation. In this paper, a data set of 1,280 119 calls with 3 classes was used to train and test six situation classification algorithms: SVM, NB, k-NN, DT, SGD, and RF. Classification performance ranged from a minimum of 77% to a maximum of 92%. In the future, effective data sets for each type of disaster need to be secured across various fields to advance research on disaster response.
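
As a rough illustration of one of the classifiers named in this abstract (NB), the sketch below implements a multinomial Naive Bayes text classifier with add-one smoothing in plain Python. It is not the authors' pipeline: the tokenization (whitespace splitting), the English call transcripts, and the function names `train_nb` and `predict_nb` are assumptions made for the example, and the actual study used Korean 119 call data.

```python
import math
from collections import Counter, defaultdict


def train_nb(samples):
    """Count documents per class and words per class from (text, label) pairs."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_docs, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Made-up transcripts for illustration only.
    calls = [
        ("smoke coming from the kitchen", "fire"),
        ("building on fire flames visible", "fire"),
        ("hiker trapped on the mountain", "rescue"),
        ("person stuck in elevator", "rescue"),
        ("man collapsed not breathing", "emergency"),
        ("chest pain and breathing trouble", "emergency"),
    ]
    model = train_nb(calls)
    print(predict_nb(model, "flames and smoke in building"))
```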

Development of Music Classification of Light and Shade using VCM and Beat Tracking (VCM과 Beat Tracking을 이용한 음악의 명암 분류 기법 개발)

  • Park, Seung-Min;Park, Jun-Heong;Lee, Young-Hwan;Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.6 / pp.884-889 / 2010
  • Music genre classification has been studied recently. However, because experts use different criteria, it is difficult to derive accurate results from these classifications. In addition, whenever a new genre emerges, the genres must be redefined. For search, music should therefore be classified by emotional words rather than separated by genre. This paper attempts to categorize music by the brightness and darkness that people feel in it. The proposed system classifies the light and shade of music by applying VCM (Variance Considered Machines). Three kinds of musical characteristics are used. The VCM was trained on survey results using the musical attributes (beat, timbre, note), and the classifications produced by the trained VCM were then compared with the survey results. Notes were extracted in MATLAB: the music was sampled at regular intervals and analyzed by FFT, the average of each frequency band was defined as its representative element, and the extracted notes were quantified to identify the overall pitch distribution. Timbre was quantified using differences in the cumulative frequency distribution over the entire frequency range. Comparing the experimental results of the VCM applied to these three characteristics with the survey results confirmed that the light and shade of music can be separated into the two classes with a probability of 95.4%.

Empirical Research on Search model of Web Service Repository (웹서비스 저장소의 검색기법에 관한 실증적 연구)

  • Hwang, You-Sub
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.173-193 / 2010
  • The World Wide Web is transitioning from being a mere collection of documents that contain useful information toward providing a collection of services that perform useful tasks. The emerging Web service technology has been envisioned as the next technological wave and is expected to play an important role in this recent transformation of the Web. By providing interoperable interface standards for application-to-application communication, Web services can be combined with component-based software development to promote application interaction and integration within and across enterprises. To make Web services for service-oriented computing operational, it is important that Web services repositories not only be well-structured but also provide efficient tools for an environment supporting reusable software components for both service providers and consumers. As the potential of Web services for service-oriented computing is becoming widely recognized, the demand for an integrated framework that facilitates service discovery and publishing is concomitantly growing. In our research, we propose a framework that facilitates Web service discovery and publishing by combining clustering techniques and leveraging the semantics of the XML-based service specification in WSDL files. We believe that this is one of the first attempts at applying unsupervised artificial neural network-based machine-learning techniques in the Web service domain. We have developed a Web service discovery tool based on the proposed approach using an unsupervised artificial neural network and empirically evaluated the proposed approach and tool using real Web service descriptions drawn from operational Web services repositories. We believe that both service providers and consumers in a service-oriented computing environment can benefit from our Web service discovery approach.

A point-scale gap filling of the flux-tower data using the artificial neural network (인공신경망 기법을 이용한 청미천 유역 Flux tower 결측치 보정)

  • Jeon, Hyunho;Baik, Jongjin;Lee, Seulchan;Choi, Minha
    • Journal of Korea Water Resources Association / v.53 no.11 / pp.929-938 / 2020
  • In this study, we estimated missing evapotranspiration (ET) data at an eddy-covariance flux tower in the Cheongmicheon farmland site using an Artificial Neural Network (ANN). ANNs have shown excellent performance in numerical analysis and are being applied in an expanding range of fields. To evaluate the performance of the ANN-based gap-filling, ET was also calculated using the existing gap-filling methods of Mean Diurnal Variation (MDV) and Food and Agriculture Organization Penman-Monteith (FAO-PM). ET was then evaluated by time series analysis and statistical analysis (coefficient of determination, index of agreement (IOA), root mean squared error (RMSE), and mean absolute error (MAE)). For the validation of each gap-filling model, we used 30-minute data from 2015. Of the 121 missing values, the MDV, FAO-PM, and ANN methods filled 70, 53, and 84, respectively, so the ANN method performed best. The coefficient of determination (0.673, 0.784, and 0.841 for the MDV, FAO-PM, and ANN methods, respectively) and the IOA (0.899, 0.890, and 0.951, respectively) indicated that all three methods were highly correlated with the observations and usable, and that the ANN model showed the highest performance and suitability. The results of this study could inform further research on gap-filling methods for flux tower data using machine learning.
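
The four evaluation statistics named in this abstract can be sketched as follows. The abstract does not give the formulas, so two choices here are assumptions: the coefficient of determination is computed as the squared Pearson correlation, and the IOA follows Willmott's original index of agreement. Function names and sample values are illustrative only.

```python
import math


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Coefficient of determination as squared Pearson correlation (assumed)."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement: 1 means perfect agreement."""
    mean_o = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return 1 - num / den


if __name__ == "__main__":
    # Made-up ET values, not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    filled = [2.0, 3.0, 4.0, 5.0]
    print(rmse(observed, filled), mae(observed, filled))
    print(r_squared(observed, filled), index_of_agreement(observed, filled))
```

In the made-up example a constant offset leaves the correlation perfect while the IOA drops below 1, which is one reason to report both statistics.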

Spark based Scalable RDFS Ontology Reasoning over Big Triples with Confidence Values (신뢰값 기반 대용량 트리플 처리를 위한 스파크 환경에서의 RDFS 온톨로지 추론)

  • Park, Hyun-Kyu;Lee, Wan-Gon;Jagvaral, Batselem;Park, Young-Tack
    • Journal of KIISE / v.43 no.1 / pp.87-95 / 2016
  • Recently, due to the development of the Internet and electronic devices, there has been an enormous increase in the amount of available knowledge and information. As this growth has proceeded, studies on large-scale ontological reasoning have been actively carried out. In general, a machine learning program or knowledge engineer measures and provides a degree of confidence for each triple in a large ontology. Yet the collected ontology data contain uncertainty, and reasoning over such data can produce vague results. To address this uncertainty, we propose an RDFS reasoning approach that uses confidence values indicating the degree of uncertainty in the collected data. Unlike conventional reasoning approaches, which do not take data uncertainty into account, our approach uses the in-memory cluster computing framework Spark and applies uncertainty estimation methods to compute confidence values for the data inferred through RDFS-based reasoning. The computed confidence values thus represent the uncertainty in the inferred data. To evaluate our approach, ontology reasoning was carried out over the LUBM standard benchmark data set with arbitrary confidence values added to the ontology triples. Experimental results indicated that the proposed system is capable of running over the largest data set, LUBM3000, in 1179 seconds, inferring 350K triples.
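
To make the idea of confidence-carrying RDFS inference concrete, the sketch below applies two standard RDFS entailment rules (rdfs11, subclass transitivity, and rdfs9, type inheritance) to a small in-memory triple store. It is a single-machine illustration, not the authors' Spark system. The abstract does not say how confidences are combined, so multiplying the premises' confidences and keeping the maximum over alternative derivations are assumptions, as are the function name and the example triples.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rdfs_closure(triples):
    """Fixed-point RDFS reasoning over {(s, p, o): confidence}.

    Assumed combination: confidence of a conclusion is the product of its
    premises' confidences; the highest value wins when a triple is derived
    in several ways.
    """
    facts = dict(triples)
    changed = True
    while changed:
        changed = False
        sub = [(s, o, c) for (s, p, o), c in facts.items() if p == SUBCLASS]
        typ = [(s, o, c) for (s, p, o), c in facts.items() if p == TYPE]
        derived = []
        # rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
        for a, b, c1 in sub:
            for b2, d, c2 in sub:
                if b == b2 and a != d:
                    derived.append(((a, SUBCLASS, d), c1 * c2))
        # rdfs9: (x type A), (A subClassOf B) => (x type B)
        for x, cls, c1 in typ:
            for a, b, c2 in sub:
                if cls == a:
                    derived.append(((x, TYPE, b), c1 * c2))
        for triple, conf in derived:
            if conf > facts.get(triple, 0.0):
                facts[triple] = conf
                changed = True
    return facts


if __name__ == "__main__":
    # Made-up triples and confidence values for illustration.
    kb = {
        ("alice", TYPE, "GradStudent"): 0.9,
        ("GradStudent", SUBCLASS, "Student"): 0.8,
        ("Student", SUBCLASS, "Person"): 1.0,
    }
    for triple, conf in sorted(rdfs_closure(kb).items()):
        print(triple, round(conf, 3))
```

On a cluster framework the two nested loops would typically become joins keyed on the shared class, but that mapping is not shown here.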

Study for implementation of smart water management system on Cisangkuy river basin in Indonesia (인도네시아 찌상쿠이강 유역의 지능형 물관리 시스템 적용 연구)

  • Kim, Eugene;Ko, Ick Hwan;Park, Chan Ho
    • Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.469-469 / 2017
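  • The number of water-scarce countries is increasing worldwide because of climate change and environmental pollution, and as concentrated rainfall becomes more frequent, flood damage and water supply problems have become major social issues. Through the rapid economic growth and urbanization of the past 20 or so years, Indonesia has come to face environmental problems far more serious than those Korea experienced during its industrialization in the 1960s-80s, owing to the excessive concentration of population and industry in cities; water shortages, water pollution, and environmental problems in the greater metropolitan area that includes Jakarta and Bandung have already reached a very dangerous level. In particular, Bandung, one of Indonesia's three largest cities, located in the upper-middle reaches of the Citarum River, suffers from chronic water shortages. As of 2010, the city was short of about 15 CMS of water on a daily average, and by 2030 about 23 CMS of additional water is expected to be needed because of continued population growth. To solve this water supply problem, the city of Bandung and the Citarum River basin management agency are pursuing not only structural measures, such as dam and groundwater development and inter-basin water transfer, but also non-structural measures that raise the efficiency of water use through the linked operation of existing and new reservoirs. Accordingly, as one such non-structural measure to relieve the basin's water supply shortage, this study presents a plan for applying a smart water management system that optimizes the operation of hydraulic facilities in the basin, including various dams and weirs, small hydropower plants, and water intake stations. Built on sensors, the Internet of Things (IoT), and network technology, the smart water management system of this study can provide an organically interlinked framework through two-way communication among facilities, operators, and related agencies. By checking the basin's hydrological conditions, the operating status of facilities, and the status of water supply and demand in real time, it can also adjust the amount of water supplied immediately in response to demand. In addition, through big data analysis and machine learning, it can update the optimal operating rules for individual water management facilities and select optimal water supply priorities in view of the basin's hydrological conditions and water demand. The purpose of developing the smart water management system is to monitor the hydrological status of the Cisangkuy basin in real time, analyze the operation of river facilities, and improve the efficiency of water resource use in the basin through optimal water supply and allocation. To this end, a hydrological data collection system and an inter-agency information sharing system are established to form the base infrastructure for analysis. On this basis, basin runoff, reservoir operation, and water balance analyses are carried out, and the analysis and prediction results, together with past operating data, are used to support new operating rules for water management facilities, plans for linked operation among facilities, and decisions on water supply priorities. By presenting rational criteria for operating river facilities through simulation analysis of hydraulic and hydrological phenomena based on an integrated DB, the smart water management system of this study is expected to resolve disagreements and disputes among the various management bodies over facility operation, and to serve as a decision-making tool for the efficient and rational allocation of limited water resources among diverse demands and for facility operation problems.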

Vulnerability Assessment for Fine Particulate Matter (PM2.5) in the Schools of the Seoul Metropolitan Area, Korea: Part I - Predicting Daily PM2.5 Concentrations (인공지능을 이용한 수도권 학교 미세먼지 취약성 평가: Part I - 미세먼지 예측 모델링)

  • Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.37 no.6_2 / pp.1881-1890 / 2021
  • Particulate matter (PM) affects humans, ecosystems, and weather. Motorized vehicles and combustion generate fine particulate matter (PM2.5), which can contain toxic substances and therefore requires systematic management. Consequently, it is important to monitor and predict PM2.5 concentrations, especially in large cities with dense populations and infrastructure. This study aimed to predict PM2.5 concentrations in large cities using meteorological and chemical variables as well as satellite-based aerosol optical depth. For the prediction, a random forest (RF) model was selected because, among machine learning models, it shows excellent performance in predicting PM concentrations. The model achieved R2, RMSE, MAE, and MAPE values of 0.97, 3.09, 2.18, and 13.31, respectively, on the training data and 0.82, 6.03, 4.36, and 25.79, respectively, on the test data. The variables used in this study showed high correlation with PM2.5 concentrations. Therefore, we conclude that these variables can be used in a random forest model to generate reliable PM2.5 concentration predictions, which can then be used to assess the vulnerability of schools to PM2.5.

Prediction Model of User Physical Activity using Data Characteristics-based Long Short-term Memory Recurrent Neural Networks

  • Kim, Joo-Chang;Chung, Kyungyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.2060-2077 / 2019
  • Recently, mobile healthcare services have attracted significant attention because of the emerging development and supply of diverse wearable devices. Smartwatches and health bands are the most common type of mobile-based wearable devices and their market size is increasing considerably. However, simple value comparisons based on accumulated data have revealed certain problems, such as the standardized nature of health management and the lack of personalized health management service models. The convergence of information technology (IT) and biotechnology (BT) has shifted the medical paradigm from continuous health management and disease prevention to the development of a system that can be used to provide ground-based medical services regardless of the user's location. Moreover, the IT-BT convergence has necessitated the development of lifestyle improvement models and services that utilize big data analysis and machine learning to provide mobile healthcare-based personal health management and disease prevention information. Users' health data, which are specific as they change over time, are collected by different means according to the users' lifestyle and surrounding circumstances. In this paper, we propose a prediction model of user physical activity that uses data characteristics-based long short-term memory (DC-LSTM) recurrent neural networks (RNNs). To provide personalized services, the characteristics and surrounding circumstances of data collectable from mobile host devices were considered in the selection of variables for the model. The data characteristics considered were ease of collection, which represents whether or not variables are collectable, and frequency of occurrence, which represents whether or not changes made to input values constitute significant variables in terms of activity. 
    The variables selected for providing personalized services were activity, weather, temperature, mean daily temperature, humidity, UV, fine dust, asthma and lung disease probability index, skin disease probability index, cadence, travel distance, mean heart rate, and sleep hours. The selected variables were classified according to the data characteristics. To predict activity, an LSTM RNN was built that uses the classified variables as input data and learns the dynamic characteristics of time series data. LSTM RNNs mitigate the vanishing gradient problem that occurs in existing RNNs. The LSTMs in the model are divided into three types according to data characteristics, and the network is constructed by connecting them. The constructed neural network learns from training data and predicts user activity. To evaluate the proposed model, its root mean square error (RMSE) was compared with that of prediction methods based on an autoregressive integrated moving average (ARIMA) model, a convolutional neural network (CNN), and an RNN. The results show that the proposed DC-LSTM RNN method yields an excellent mean RMSE value of 0.616. Unlike existing standardized activity prediction services, the proposed method predicts significant activity by considering the surrounding circumstances and user status. It can also be used to predict user physical activity and provide personalized healthcare based on the data collectable from mobile host devices.