• Title/Summary/Keyword: Machine Learning & Training

Short-term Predictive Models for Influenza-like Illness in Korea: Using Weekly ILI Surveillance Data and Web Search Queries (한국 인플루엔자 의사환자 단기 예측 모형 개발: 주간 ILI 감시 자료와 웹 검색 정보의 활용)

  • Jung, Jae Un
    • Journal of Digital Convergence / v.16 no.9 / pp.147-157 / 2018
  • Since Google launched a prediction service for influenza-like illness (ILI), studies on ILI prediction based on web search data have proliferated worldwide. This study builds short-term predictive models for ILI in Korea using ILI and web search data and measures their performance. The proposed Korea-specific models use ILI surveillance data from the Korea CDC and Korean web search data from Google and Naver together with the ARIMA model. Model 1 used only ILI data. Models 2 and 3 added Google and Naver search data, respectively, to the data of Model 1. Model 4 added a query common to Models 2 and 3 to the data used in Model 1. In the training period, the goodness of fit of all predictive models was higher than 95% ($R^2$). In predictive periods 1 and 2, Model 1 yielded the best predictions (99.98% and 96.94%, respectively). Models 3(a), 4(b), and 4(c) achieved stable predictability higher than 90% in all predictive periods, but they did not outperform Model 1. The proposed models, which yielded accurate and stable predictions, can be applied to early warning systems for influenza pandemics in Korea, given supplementary studies to improve their performance.

Analysis of Disaster Safety Situation Classification Algorithm Based on Natural Language Processing Using 119 Calls Data (119 신고 데이터를 이용한 자연어처리 기반 재난안전 상황 분류 알고리즘 분석)

  • Kwon, Su-Jeong;Kang, Yun-Hee;Lee, Yong-Hak;Lee, Min-Ho;Park, Seung-Ho;Kang, Myung-Ju
    • KIPS Transactions on Software and Data Engineering / v.9 no.10 / pp.317-322 / 2020
  • With the development of artificial intelligence, AI is increasingly used in disaster response support systems. Disasters can occur anywhere, at any time. Reports made in the event of a disaster fall into four types: fire, rescue, emergency, and other calls. The response to a 119 call also differs depending on its type and situation. In this paper, a data set of 1,280 119 calls was used to train and test six situation classification algorithms (SVM, NB, k-NN, DT, SGD, and RF) on 3 classes. Classification performance ranged from a minimum of 77% to a maximum of 92%. Future work needs to secure effective data sets for each disaster type across various fields to support the study of disaster response.

Convergence of Artificial Intelligence Techniques and Domain Specific Knowledge for Generating Super-Resolution Meteorological Data (기상 자료 초해상화를 위한 인공지능 기술과 기상 전문 지식의 융합)

  • Ha, Ji-Hun;Park, Kun-Woo;Im, Hyo-Hyuk;Cho, Dong-Hee;Kim, Yong-Hyuk
    • Journal of the Korea Convergence Society / v.12 no.10 / pp.63-70 / 2021
  • Generating super-resolution meteorological data by using a high-resolution deep neural network can support precise research and useful real-life services. We propose a new technique for generating improved training data for super-resolution deep neural networks. To generate high-resolution meteorological data with domain-specific knowledge, Lambert conformal conic projection and objective analysis were applied to observation data and ERA5 reanalysis field data from specialized institutions. As a result, the temperature and humidity analysis data based on domain-specific knowledge showed RMSE improvements of up to 42% and 46%, respectively. Next, a super-resolution generative adversarial network (SRGAN), one of the artificial intelligence techniques, was used to automate the manual data generation technique based on the domain-specific techniques described above. Experiments were conducted to generate high-resolution data with 1 km resolution from global model data with 10 km resolution. Finally, the results generated with SRGAN had a higher resolution than the global model input data and showed an analysis pattern similar to that of the manually generated high-resolution analysis data, but they also showed a smooth boundary.

Vulnerability Assessment for Fine Particulate Matter (PM2.5) in the Schools of the Seoul Metropolitan Area, Korea: Part I - Predicting Daily PM2.5 Concentrations (인공지능을 이용한 수도권 학교 미세먼지 취약성 평가: Part I - 미세먼지 예측 모델링)

  • Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.37 no.6_2 / pp.1881-1890 / 2021
  • Particulate matter (PM) affects humans, ecosystems, and weather. Motorized vehicles and combustion generate fine particulate matter (PM2.5), which can contain toxic substances and, therefore, requires systematic management. Consequently, it is important to monitor and predict PM2.5 concentrations, especially in large cities with dense populations and infrastructures. This study aimed to predict PM2.5 concentrations in large cities using meteorological and chemical variables as well as satellite-based aerosol optical depth. For PM2.5 concentration prediction, a random forest (RF) model was selected because it has shown excellent performance in PM concentration prediction among machine learning models. The model achieved training values of 0.97, 3.09, 2.18, and 13.31 and testing values of 0.82, 6.03, 4.36, and 25.79 for the performance indicators R2, RMSE, MAE, and MAPE, respectively. The variables used in this study showed high correlation with PM2.5 concentrations. Therefore, we conclude that these variables can be used in a random forest model to generate reliable PM2.5 concentration predictions, which can then be used to assess the vulnerability of schools to PM2.5.
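
The four indicators named in this abstract have standard definitions, which the following minimal sketch computes in plain Python. The function name `regression_metrics` and the sample values are assumptions made for illustration; they are not the study's code or data, and MAPE as written here is undefined when an observed value is zero.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, and MAPE (in percent) for paired observations."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }

# Hypothetical concentrations, made up for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

A large gap between training and testing values of the same indicator, as reported in the abstract, is the usual sign to check for overfitting.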

Character Motion Control by Using Limited Sensors and Animation Data (제한된 모션 센서와 애니메이션 데이터를 이용한 캐릭터 동작 제어)

  • Bae, Tae Sung;Lee, Eun Ji;Kim, Ha Eun;Park, Minji;Choi, Myung Geol
    • Journal of the Korea Computer Graphics Society / v.25 no.3 / pp.85-92 / 2019
  • A 3D virtual character playing a role in digital storytelling has a unique style in its appearance and motion. Because the style reflects the unique personality of the character, it is very important to preserve the style and keep it consistent. However, when the character's motion is directly controlled by the motion of a user wearing motion sensors, the unique style can be lost. We present a novel character motion control method that preserves the style of the character's motion by using only a small amount of animation data created specifically for the character. Instead of machine learning approaches, which require a large amount of training data, we suggest a search-based method that directly searches the animation data for the character pose most similar to the user's current pose. To show the usability of our method, we conducted experiments with a character model and its animation data created by an expert designer for a virtual reality game. To show that our method preserves the original motion style of the character well, we compared our result with the result obtained by using general human motion capture data. In addition, to show the scalability of our method, we present experimental results with different numbers of motion sensors.
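
The search-based idea described here can be sketched as a nearest-neighbor lookup over stored animation frames. The abstract does not state which pose features or distance measure the authors use, so the Euclidean distance over flat sensor coordinates, the function names, and the sample poses below are all assumptions made for illustration.

```python
import math

def pose_distance(pose_a, pose_b):
    """Euclidean distance between two poses given as flat lists of coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pose_a, pose_b)))

def find_closest_pose(user_pose, animation_poses):
    """Return (index, distance) of the animation frame nearest to the user's pose."""
    best_index = min(
        range(len(animation_poses)),
        key=lambda i: pose_distance(user_pose, animation_poses[i]),
    )
    return best_index, pose_distance(user_pose, animation_poses[best_index])

# Hypothetical data: each pose holds (x, y, z) for two sensors.
animation_poses = [
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.5],
    [0.3, 1.2, 0.1, -0.3, 1.2, 0.1],
    [0.5, 0.8, 0.4, -0.5, 0.8, 0.4],
]
user_pose = [0.28, 1.15, 0.1, -0.25, 1.1, 0.15]
index, distance = find_closest_pose(user_pose, animation_poses)
print(index, round(distance, 4))
```

A linear scan like this is enough for a small animation set; with fewer sensors the pose vectors simply get shorter, which is one way to read the scalability experiment mentioned in the abstract.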

Prediction Model of User Physical Activity using Data Characteristics-based Long Short-term Memory Recurrent Neural Networks

  • Kim, Joo-Chang;Chung, Kyungyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.2060-2077 / 2019
  • Recently, mobile healthcare services have attracted significant attention because of the emerging development and supply of diverse wearable devices. Smartwatches and health bands are the most common type of mobile-based wearable devices and their market size is increasing considerably. However, simple value comparisons based on accumulated data have revealed certain problems, such as the standardized nature of health management and the lack of personalized health management service models. The convergence of information technology (IT) and biotechnology (BT) has shifted the medical paradigm from continuous health management and disease prevention to the development of a system that can be used to provide ground-based medical services regardless of the user's location. Moreover, the IT-BT convergence has necessitated the development of lifestyle improvement models and services that utilize big data analysis and machine learning to provide mobile healthcare-based personal health management and disease prevention information. Users' health data, which are specific as they change over time, are collected by different means according to the users' lifestyle and surrounding circumstances. In this paper, we propose a prediction model of user physical activity that uses data characteristics-based long short-term memory (DC-LSTM) recurrent neural networks (RNNs). To provide personalized services, the characteristics and surrounding circumstances of data collectable from mobile host devices were considered in the selection of variables for the model. The data characteristics considered were ease of collection, which represents whether or not variables are collectable, and frequency of occurrence, which represents whether or not changes made to input values constitute significant variables in terms of activity. 
The variables selected for providing personalized services were activity, weather, temperature, mean daily temperature, humidity, UV, fine dust, asthma and lung disease probability index, skin disease probability index, cadence, travel distance, mean heart rate, and sleep hours. The selected variables were classified according to the data characteristics. To predict activity, an LSTM RNN was built that uses the classified variables as input data and learns the dynamic characteristics of time series data. LSTM RNNs resolve the vanishing gradient problem that occurs in existing RNNs. They are classified into three different types according to data characteristics and constructed through connections among the LSTMs. The constructed neural network learns from the training data and predicts user activity. To evaluate the proposed model, the root mean square error (RMSE) was used to compare its performance in user physical activity prediction with that of an autoregressive integrated moving average (ARIMA) model, a convolutional neural network (CNN), and an RNN. The results show that the proposed DC-LSTM RNN method yields an excellent mean RMSE value of 0.616. The proposed method can be used to predict significant activity by considering the surrounding circumstances and user status, building on existing standardized activity prediction services. It can also be used to predict user physical activity and provide personalized healthcare based on the data collectable from mobile host devices.

Landslide Susceptibility Prediction using Evidential Belief Function, Weight of Evidence and Artificial Neural Network Models (Evidential Belief Function, Weight of Evidence 및 Artificial Neural Network 모델을 이용한 산사태 공간 취약성 예측 연구)

  • Lee, Saro;Oh, Hyun-Joo
    • Korean Journal of Remote Sensing / v.35 no.2 / pp.299-316 / 2019
  • The purpose of this study was to analyze landslide susceptibility in the Pyeongchang area using Weight of Evidence (WOE) and Evidential Belief Function (EBF) as probability models and Artificial Neural Networks (ANN) as a machine learning model in a geographic information system (GIS). This study examined the widespread shallow landslides triggered by heavy rainfall during Typhoon Ewiniar in 2006, which caused serious property damage and significant loss of life. For the landslide susceptibility mapping, 3,955 landslide occurrences were detected using aerial photographs, and environmental spatial data such as terrain, geology, soil, forest, and land use were collected and constructed in a spatial database. Seventeen factors that could affect landsliding were extracted from the spatial database. All landslides were randomly separated into two datasets, a training set (50%) and validation set (50%), to establish and validate the EBF, WOE, and ANN models. According to the validation results of the area under the curve (AUC) method, the accuracy was 74.73%, 75.03%, and 70.87% for WOE, EBF, and ANN, respectively. The EBF model had the highest accuracy. However, all models had predictive accuracy exceeding 70%, the level that is effective for landslide susceptibility mapping. These models can be applied to predict landslide susceptibility in an area where landslides have not occurred previously based on the relationships between landslide and environmental factors. This susceptibility map can help reduce landslide risk, provide guidance for policy and land use development, and save time and expense for landslide hazard prevention. In the future, more generalized models should be developed by applying landslide susceptibility mapping in various areas.
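
The abstract reports accuracy as an area under the curve but does not say which curve is used. Assuming the common ROC reading, AUC is the probability that a randomly chosen landslide cell receives a higher susceptibility score than a randomly chosen non-landslide cell. The sketch below computes that directly; the function name, labels, and scores are made up for illustration.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical validation cells: 1 = landslide occurred, 0 = no landslide.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

Under this reading, a value of 0.5 corresponds to random ranking, which puts the reported values of roughly 71% to 75% in context.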

Application of Google Search Queries for Predicting the Unemployment Rate for Koreans in Their 30s and 40s (한국 30~40대 실업률 예측을 위한 구글 검색 정보의 활용)

  • Jung, Jae Un;Hwang, Jinho
    • Journal of Digital Convergence / v.17 no.9 / pp.135-145 / 2019
  • A prolonged recession has kept the youth unemployment rate in Korea at a high level of approximately 10% for years. Recently, the number of unemployed Koreans in their 30s and 40s has shown an upward trend. To expand the government's employment promotion and unemployment benefits from youth-centered policies to diverse age groups, including people in their 30s and 40s, prediction models for different age groups are required. Thus, we aimed to develop unemployment prediction models for specific age groups (30s and 40s) using the unemployment rates provided by Statistics Korea and related Google search queries. We first estimated multiple linear regressions (Model 1) using a seasonal autoregressive integrated moving average approach with the relevant unemployment rates. Then, we introduced Google search queries to obtain improved models (Model 2). For both groups, Model 2, which additionally used web queries, outperformed Model 1 during the training and predictive periods. This result indicates that web search queries remain significant for improving unemployment predictive models for Koreans. Further study is needed for practical application, but this work will contribute to obtaining age-wise unemployment predictions.

Construction of a Bark Dataset for Automatic Tree Identification and Developing a Convolutional Neural Network-based Tree Species Identification Model (수목 동정을 위한 수피 분류 데이터셋 구축과 합성곱 신경망 기반 53개 수종의 동정 모델 개발)

  • Kim, Tae Kyung;Baek, Gyu Heon;Kim, Hyun Seok
    • Journal of Korean Society of Forest Science / v.110 no.2 / pp.155-164 / 2021
  • Many studies have developed automatic plant identification algorithms by applying machine learning to various plant features, such as leaves and flowers. Unlike other plant characteristics, bark changes little with the season and is maintained for a long period. Nevertheless, bark has a complex shape that varies widely with the environment, and there are insufficient materials that can be used to train algorithms. Here, in addition to the previously published bark image dataset, BarkNet v.1.0, images of bark were collected, and a dataset consisting of 53 tree species that can be easily observed in Korea was presented. A convolutional neural network (CNN) was trained and tested on the dataset, and the factors that interfere with the model's performance were identified. VGG-16 and VGG-19 were used as the CNN architectures. As a result, VGG-16 achieved 90.41% accuracy and VGG-19 achieved 92.62% accuracy. When the model was tested on new tree images that do not exist in the original dataset but belong to the same genus or family, more than 80% of cases were successfully identified as the same genus or family. Meanwhile, the model tended to misclassify images that contained distracting features, including leaves, mosses, and knots. For these cases, we propose random cropping and classification by majority vote as valid ways to reduce possible errors in training and inference.
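
The closing proposal, random cropping followed by a majority vote, can be sketched as below. The abstract does not give the crop size, the number of crops, or the voting rule, so those parameters, the function names, and the stand-in classifier are all assumptions made for illustration; a trained CNN would take the place of `classify_crop`.

```python
import random
from collections import Counter

def random_crops(width, height, crop_size, count, rng):
    """Return `count` crop boxes (left, top, right, bottom) that lie inside the image."""
    boxes = []
    for _ in range(count):
        left = rng.randint(0, width - crop_size)
        top = rng.randint(0, height - crop_size)
        boxes.append((left, top, left + crop_size, top + crop_size))
    return boxes

def majority_vote(predictions):
    """Return the most frequent label and its share of the votes."""
    label, votes = Counter(predictions).most_common(1)[0]
    return label, votes / len(predictions)

def classify_by_vote(classify_crop, width, height, crop_size=224, count=9, seed=0):
    """Classify several random crops of one image and combine them by majority vote."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    boxes = random_crops(width, height, crop_size, count, rng)
    return majority_vote([classify_crop(box) for box in boxes])

# Hypothetical stand-in for a trained model: always predicts the same label.
def constant_classifier(box):
    return "species_A"

label, share = classify_by_vote(constant_classifier, width=1024, height=768)
print(label, share)
```

The intent is that a crop dominated by moss or a knot contributes only one vote, so a single distracting region is less likely to decide the final label.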

A research on the possibility of restoring cultural assets of artificial intelligence through the application of artificial neural networks to roof tile(Wadang)

  • Kim, JunO;Lee, Byong-Kwon
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.19-26 / 2021
  • Cultural assets excavated in historical areas have their own characteristics reflecting the background of their times, and their patterns and characteristics change little by little with history and with the regions through which they spread. Cultural properties excavated in some areas represent the culture of their time, and some remain intact, but most are damaged, lost, or broken into parts, and many experts are mobilized to research their composition and repair the damaged parts. The purpose of this research is to learn the patterns and characteristics of the past with artificial neural networks for such restoration research, and to restore the lost parts of excavated cultural assets using a Generative Adversarial Network (GAN)[1]. In this process, the GAN restores the damaged or lost parts from the surviving parts of an excavated cultural asset. To recover the damaged parts, the network was trained on 2D images of complete cultural assets. The research focuses on how well the damaged parts are recovered and also on how well colors and materials are reproduced. Finally, the trained neural network was applied to real damaged cultural assets to confirm the extent of the recovered area and the limitations of the approach.