• Title/Summary/Keyword: learned patterns (학습된 패턴)


Ensemble of Nested Dichotomies for Activity Recognition Using Accelerometer Data on Smartphone (Ensemble of Nested Dichotomies 기법을 이용한 스마트폰 가속도 센서 데이터 기반의 동작 인지)

  • Ha, Eu Tteum;Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.19 no.4 / pp.123-132 / 2013
  • As smartphones are equipped with various sensors such as the accelerometer, GPS, gravity sensor, gyroscope, ambient light sensor, and proximity sensor, there has been much research on using these sensors to create valuable applications. Human activity recognition is one such application, motivated by welfare applications such as support for the elderly, measurement of calorie consumption, analysis of lifestyles, and analysis of exercise patterns. One challenge in using smartphone sensors for activity recognition is that the number of sensors used should be minimized to save battery power. When the number of sensors is restricted, it is difficult to build a highly accurate activity recognizer (classifier), because subtly different activities are hard to distinguish from limited information. The difficulty becomes especially severe when the number of activity classes to be distinguished is large. In this paper, we show that a fairly accurate classifier that distinguishes ten different activities can be built using data from only a single sensor, the smartphone accelerometer. Our approach to this ten-class problem is the ensemble of nested dichotomies (END) method, which transforms a multi-class problem into multiple two-class problems. END builds a committee of binary classifiers in a nested fashion using a binary tree. At the root of the binary tree, the set of all classes is split into two subsets by a binary classifier. At a child node, a subset of classes is again split into two smaller subsets by another binary classifier. Continuing in this way, we obtain a binary tree in which each leaf node contains a single class. This binary tree can be viewed as a nested dichotomy that can make multi-class predictions.
Depending on how the set of classes is split into two subsets at each node, the final tree can differ. Since some classes may be correlated, a particular tree may perform better than others. However, the best tree can hardly be identified without deep domain knowledge. The END method copes with this problem by building multiple dichotomy trees randomly during learning and then combining the predictions made by each tree during classification. The END method is generally known to perform well even when the base learner is unable to model complex decision boundaries. As the base classifier at each node of the dichotomy, we used another ensemble classifier, the random forest. A random forest is built by repeatedly generating decision trees, each from a bootstrap sample with a different random subset of features. By combining bagging with random feature subset selection, a random forest has more diverse ensemble members than simple bagging. Overall, our ensemble of nested dichotomies can be seen as a committee of committees of decision trees that handles a multi-class problem with high accuracy. The ten activity classes that we distinguish in this paper are 'Sitting', 'Standing', 'Walking', 'Running', 'Walking Uphill', 'Walking Downhill', 'Running Uphill', 'Running Downhill', 'Falling', and 'Hobbling'. The features used for classifying these activities include not only the magnitude of the acceleration vector at each time point but also the maximum, the minimum, and the standard deviation of the vector magnitude within a time window of the last 2 seconds, among others. To compare the performance of END with that of other methods, accelerometer data were collected every 0.1 second for 2 minutes for each activity from 5 volunteers.
Of the 5,900 ($=5{\times}(60{\times}2-2)/0.1$) data points collected for each activity (the data for the first 2 seconds are discarded because they lack time window data), 4,700 were used for training and the rest for testing. Although 'Walking Uphill' is often confused with other similar activities, END classified all ten activities with a fairly high accuracy of 98.4%. In comparison, a decision tree, a k-nearest neighbor classifier, and a one-versus-rest support vector machine achieved accuracies of 97.6%, 96.5%, and 97.6%, respectively.
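The nested splitting of classes described in this abstract can be illustrated with a minimal Python sketch. It only builds the random tree structures; the binary classifier at each internal node (a random forest in the paper) is left out. The function names and the ensemble size of 5 are hypothetical, and this simple splitting rule is an assumption that does not necessarily sample all possible dichotomies uniformly.

```python
import random

ACTIVITIES = [
    "Sitting", "Standing", "Walking", "Running", "Walking Uphill",
    "Walking Downhill", "Running Uphill", "Running Downhill",
    "Falling", "Hobbling",
]


def build_random_dichotomy(classes, rng):
    """Recursively split a list of classes into two random non-empty subsets.

    A leaf is a class name; an internal node is a (left, right) tuple that
    would hold one binary classifier in a full implementation.
    """
    if len(classes) == 1:
        return classes[0]
    shuffled = classes[:]
    rng.shuffle(shuffled)
    cut = rng.randint(1, len(shuffled) - 1)  # keeps both sides non-empty
    return (build_random_dichotomy(shuffled[:cut], rng),
            build_random_dichotomy(shuffled[cut:], rng))


def leaves(tree):
    """Collect the class names stored at the leaves of a dichotomy tree."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def count_internal_nodes(tree):
    """Count the binary classifiers a dichotomy tree would need."""
    if isinstance(tree, str):
        return 0
    left, right = tree
    return 1 + count_internal_nodes(left) + count_internal_nodes(right)


rng = random.Random(0)  # fixed seed, chosen only for repeatability
ensemble = [build_random_dichotomy(ACTIVITIES, rng) for _ in range(5)]
for tree in ensemble:
    print(count_internal_nodes(tree), "binary classifiers,",
          len(leaves(tree)), "leaves")
```

Whatever the random splits, a tree over ten classes has ten leaves and therefore nine internal nodes, so each ensemble member would train nine binary classifiers.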

Rough Set Analysis for Stock Market Timing (러프집합분석을 이용한 매매시점 결정)

  • Huh, Jin-Nyung;Kim, Kyoung-Jae;Han, In-Goo
    • Journal of Intelligence and Information Systems / v.16 no.3 / pp.77-97 / 2010
  • Market timing is an investment strategy used to obtain excess returns from financial markets. In general, detecting market timing means determining when to buy and sell to obtain excess returns from trading. In many market timing systems, trading rules have been used as the engine that generates trading signals. On the other hand, some researchers have proposed rough set analysis as a suitable tool for market timing because, by using the control function, it does not generate a trading signal when the market pattern is uncertain. Numeric data must be discretized for rough set analysis because rough sets accept only categorical data. Discretization searches for proper "cuts" in numeric data that determine intervals. All values that lie within an interval are transformed into the same value. In general, there are four methods for data discretization in rough set analysis: equal frequency scaling, expert's knowledge-based discretization, minimum entropy scaling, and naïve and Boolean reasoning-based discretization. Equal frequency scaling fixes the number of intervals, examines the histogram of each variable, and then determines cuts so that approximately the same number of samples fall into each interval. Expert's knowledge-based discretization determines cuts according to the knowledge of domain experts, obtained through literature review or interviews. Minimum entropy scaling recursively partitions the value set of each variable so that a local measure of entropy is optimized. Naïve and Boolean reasoning-based discretization first obtains categorical values by naïve scaling of the data and then finds optimized discretization thresholds through Boolean reasoning.
Although rough set analysis is promising for market timing, there is little research on how the various data discretization methods affect trading performance when rough set analysis is used. In this study, we compare stock market timing models using rough set analysis with various data discretization methods. The research data are the KOSPI 200 from May 1996 to October 1998. The KOSPI 200 is the underlying index of KOSPI 200 futures, the first derivative instrument in the Korean stock market. It is a market value-weighted index consisting of 200 stocks selected by criteria on liquidity and their status in the corresponding industries, including manufacturing, construction, communication, electricity and gas, distribution and services, and financing. The total number of samples is 660 trading days. In addition, this study uses popular technical indicators as independent variables. The experimental results show that naïve and Boolean reasoning-based discretization is the most profitable method for the training sample, whereas expert's knowledge-based discretization is the most profitable for the validation sample. In addition, expert's knowledge-based discretization produced robust performance for both the training and validation samples. We also compared rough set analysis with a decision tree, using C4.5 for the comparison. The results show that rough set analysis with expert's knowledge-based discretization produced more profitable rules than C4.5.
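Of the four discretization methods named in this abstract, equal frequency scaling is the simplest to illustrate. The following is a minimal Python sketch, not the procedure used in the paper: the function names, the midpoint placement of cuts, and the indicator values are all assumptions made for illustration.

```python
from bisect import bisect_right


def equal_frequency_cuts(values, n_intervals):
    """Return cut points that put roughly equal numbers of samples per interval."""
    if n_intervals < 1 or len(values) < n_intervals:
        raise ValueError("need at least one sample per interval")
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, n_intervals):
        idx = i * n // n_intervals
        # Assumption: place the cut halfway between neighbouring sorted values.
        cut = (ordered[idx - 1] + ordered[idx]) / 2
        if not cuts or cut > cuts[-1]:  # skip duplicate cuts caused by ties
            cuts.append(cut)
    return cuts


def discretize(value, cuts):
    """Map a numeric value to the index of the interval it falls into."""
    return bisect_right(cuts, value)


# Made-up values standing in for one technical indicator.
indicator = [12, 3, 7, 1, 9, 5, 11, 2, 8, 4, 10, 6]
cuts = equal_frequency_cuts(indicator, 3)
codes = [discretize(v, cuts) for v in indicator]
print(cuts)   # [4.5, 8.5]
print(codes)  # interval index 0, 1, or 2 for each value
```

With twelve distinct values and three intervals, each interval receives four samples, which is the property equal frequency scaling aims for.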

Importance and Specialization Plan of the Indicators by the Function of the Arboretum (수목원 기능별 지표의 중요도와 특성화방안 - 대구, 경북, 경남 수목원을 대상으로 -)

  • Kim, Yong-Soo;Ha, Sun-Gyone;Park, Chan-Yong
    • Journal of Korean Society of Forest Science / v.98 no.4 / pp.370-378 / 2009
  • This study aims to provide basic direction for developing arboretums with distinct features by supplying basic data to support a differentiated strategy for each arboretum. For this purpose, users' usage patterns, the importance of indicators by function, and the importance of promotion and specialization measures were examined for Daegu Arboretum, Gyeongbuk Arboretum, and Gyeongnam Arboretum in Gyeongsang Province. Regarding the functions of an arboretum, the results show that the most important indicator was the preservation of endangered species for the collection function; use as a nature-experience space through plant exhibition for the display function; the study of plant systematics for the research function; the protection of native plants for the education function; and provision of a healthy recreational space for the recreational function. Among the measures for promoting arboretums, operating public education programs, such as arboretum interpretation and plant education, was rated most important. Therefore, arboretums need to establish systems for education programs, material development, and specialist training. As specialization plans, it seems desirable for Daegu Arboretum to concentrate on research and education related to the renewal of natural resources; for Gyeongbuk Arboretum to concentrate on the protection and display of species and on a resort site for disabled visitors, utilizing its mountainous geography; and for Gyeongnam Arboretum to specialize mainly in tree species suited to warm weather and in programs for children.

EFFECT OF THE SOCIAL SKILL TRAINING IN ADHD CHILDREN (주의력 결핍 과잉운동장애 아동에서 사회기술훈련의 효과)

  • Park, Soon-Young;Kwack, Young-Sook;Kim, Mi-Koung
    • Journal of the Korean Academy of Child and Adolescent Psychiatry / v.9 no.2 / pp.154-164 / 1998
  • Medication is widely accepted as an effective method of reducing attention deficit, hyperactivity, impulsivity, resistance, and violence in children with ADHD. However, it does not by itself change conflicting, routinized behavioral patterns so that children gain a high level of self-control and acceptable behavior. As an alternative to medication, this study applies a social skills training program to children with ADHD and measures the improvement in social skills and the change in behavioral patterns. The experiment was carried out with 16 children aged 6 to 13 years over 10 weeks. The patients were divided into three groups: a pure ADHD group, an ADHD group with conduct disorder, and an ADHD group with mental retardation and other symptoms. Changes in symptoms and social skills were measured with the Child Behavior Checklist (CBCL), the ADD-H Comprehensive Teacher's Rating Scale (ACTeRS), the Social Skills Rating Scale (SSRS), and the Matson Evaluation of Social Skills with Youngsters (MESSY). The Wilcoxon signed-rank test was used to evaluate the effect of the treatment, and the Kruskal-Wallis test was used to compare the change after treatment across the three groups. In the ADHD group with conduct disorder, the treatment produced a significant reduction in violence in the behavior domain (p<.05) and a significant difference in activity and social skills in the social competence domain (p<.001). In the ADHD group with mental retardation and other symptoms, a significant rise in social skills was found in the social skills evaluation (p<.05). However, there was no significant difference in treatment effect among the three groups. In addition, the results show that the social skills training program did not make a statistically significant contribution to the social skills of the children with ADHD.
On the other hand, the training helps some children when it suits their characteristics and accompanying symptoms: it reduces the level of violence in the ADHD group with conduct disorder, and it raises social skills in the ADHD group with mental retardation. In other words, the social skills training program will reduce conduct disorder and help peer relations in children with ADHD.

Monitoring Ground-level SO2 Concentrations Based on a Stacking Ensemble Approach Using Satellite Data and Numerical Models (위성 자료와 수치모델 자료를 활용한 스태킹 앙상블 기반 SO2 지상농도 추정)

  • Choi, Hyunyoung;Kang, Yoojin;Im, Jungho;Shin, Minso;Park, Seohui;Kim, Sang-Min
    • Korean Journal of Remote Sensing / v.36 no.5_3 / pp.1053-1066 / 2020
  • Sulfur dioxide (SO2) is primarily released through industrial, residential, and transportation activities, and it creates secondary air pollutants through chemical reactions in the atmosphere. Long-term exposure to SO2 can harm the human body, causing respiratory or cardiovascular disease, which makes effective and continuous monitoring of SO2 crucial. In South Korea, SO2 is monitored at ground stations, but this does not provide spatially continuous information on SO2 concentrations. Thus, this research estimated spatially continuous ground-level SO2 concentrations at 1 km resolution over South Korea through the synergistic use of satellite data and numerical models. A stacking ensemble approach, fusing multiple machine learning algorithms at two levels (i.e., base and meta), was adopted for ground-level SO2 estimation using data from January 2015 to April 2019. Random forest and extreme gradient boosting were used as base models, and multiple linear regression was adopted as the meta-model. The cross-validation results showed that the meta-model improved performance by 25% compared to the base models, resulting in a correlation coefficient of 0.48 and a root-mean-square error of 0.0032 ppm. In addition, the temporal transferability of the approach was evaluated on one year of data that were not used in model development. The spatial distribution of ground-level SO2 concentrations based on the proposed model agreed with the general seasonality of SO2 and the temporal patterns of emission sources.

Estimation of Ground-level PM10 and PM2.5 Concentrations Using Boosting-based Machine Learning from Satellite and Numerical Weather Prediction Data (부스팅 기반 기계학습기법을 이용한 지상 미세먼지 농도 산출)

  • Park, Seohui;Kim, Miae;Im, Jungho
    • Korean Journal of Remote Sensing / v.37 no.2 / pp.321-335 / 2021
  • Particulate matter (PM10 and PM2.5, with diameters less than 10 and 2.5 μm, respectively) can be absorbed by the human body and adversely affect human health. Although most PM monitoring relies on ground-based observations, these are limited to point-based measurement sites, which leads to uncertainty in PM estimation for regions without observation sites. Satellite data can overcome this spatial limitation. In this study, we developed a machine learning-based retrieval algorithm for ground-level PM10 and PM2.5 concentrations using aerosol parameters from the Geostationary Ocean Color Imager (GOCI) satellite and various meteorological parameters from a numerical weather prediction model for January to December 2019. Gradient Boosted Regression Trees (GBRT) and Light Gradient Boosting Machine (LightGBM) were used to estimate PM concentrations. Model performance was examined for two feature sets: all input parameters (Feature set 1) and a subset of input parameters without meteorological and land-cover parameters (Feature set 2). Both models showed higher accuracy (about 10% higher in R2) with Feature set 1 than with Feature set 2. The GBRT model using Feature set 1 was chosen as the final model for further analysis (PM10: R2 = 0.82, nRMSE = 34.9%; PM2.5: R2 = 0.75, nRMSE = 35.6%). The spatial distribution of the seasonal and annual-averaged PM concentrations was similar to in-situ observations, except for the northeastern part of China, which has bright surface reflectance. The spatial distribution and seasonal changes matched in-situ measurements well.
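The two accuracy measures reported in this abstract can be computed as in the Python sketch below. The abstract does not define them, so two assumptions are made here: R2 is taken as the coefficient of determination, and nRMSE as the RMSE divided by the mean of the observations, expressed in percent. The function names and the sample concentrations are made up for illustration.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition of R2)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def nrmse_percent(observed, predicted):
    """RMSE normalized by the mean observation, in percent (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100 * rmse / (sum(observed) / n)


# Made-up concentrations standing in for station observations and
# model estimates.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(round(r_squared(observed, predicted), 3))      # 0.915
print(round(nrmse_percent(observed, predicted), 1))  # 11.9
```

Other normalizations of RMSE exist (for example by the observed range), so reported nRMSE values are only comparable when the same definition is used.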

Predicting Crime Risky Area Using Machine Learning (머신러닝기반 범죄발생 위험지역 예측)

  • HEO, Sun-Young;KIM, Ju-Young;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies / v.21 no.4 / pp.64-80 / 2018
  • In Korea, citizens have access to only general information about crime, so it is difficult for them to know how exposed they are to crime. If the police can predict crime-risk areas, they can cope with crime efficiently even with insufficient police and enforcement resources. However, there is no such prediction system in Korea, and related research is scarce. Against this background, the final goal of this study is to develop an automated crime prediction system. As a first step, we built a big data set consisting of real local crime information and urban physical and non-physical data. Then, we developed a crime prediction model using machine learning. Finally, we assumed several possible scenarios, calculated the probability of crime, and visualized the results on a map to improve public understanding. Among the factors affecting crime occurrence identified in previous research and case studies, the following data were processed into a big data set for machine learning: real crime information, weather information (temperature, rainfall, wind speed, humidity, sunshine, insolation, snowfall, cloud cover), and local information (average building coverage, average floor area ratio, average building height, number of buildings, average appraised land value, average area of residential buildings, average number of ground floors). Among supervised machine learning algorithms, the decision tree, random forest, and SVM models, which are known to be powerful and accurate in various fields, were used to construct a crime prevention model. As a result, the decision tree model, which had the lowest RMSE, was selected as the optimal prediction model. Based on this model, several scenarios were set for theft and violence, the most frequent crimes in the case city J, and the probability of crime was estimated on a $250{\times}250$ m grid.
As a result, we found that high crime-risk areas occur in three patterns in case city J. The probability of crime was divided into three classes and visualized on a map on a $250{\times}250$ m grid. Finally, we developed a crime prediction model using a machine learning algorithm and visualized crime-risk areas on a map, in a way that allows the model to be recalculated and the results visualized as time and urban conditions change.
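The gridding and three-class visualization step mentioned in this abstract can be sketched in Python as follows. Only the 250 m cell size comes from the abstract. The class thresholds (1/3 and 2/3), the averaging of probabilities within a cell, the function names, and the sample points are all hypothetical, since the abstract does not state how the three classes were defined.

```python
from collections import defaultdict

CELL_SIZE_M = 250  # grid spacing taken from the abstract


def grid_cell(x_m, y_m):
    """Return the (row, col) index of the 250 m cell containing a point."""
    return int(y_m // CELL_SIZE_M), int(x_m // CELL_SIZE_M)


def risk_class(probability, low_cut=1 / 3, high_cut=2 / 3):
    """Assign one of three classes; the cut values are hypothetical."""
    if probability < low_cut:
        return "low"
    if probability < high_cut:
        return "medium"
    return "high"


def classify_cells(points):
    """Average the predicted probabilities per cell, then assign a class."""
    by_cell = defaultdict(list)
    for x_m, y_m, probability in points:
        by_cell[grid_cell(x_m, y_m)].append(probability)
    return {cell: risk_class(sum(p) / len(p)) for cell, p in by_cell.items()}


# Made-up (x, y, predicted probability) triples in projected metres.
predictions = [(120.0, 80.0, 0.2), (130.0, 200.0, 0.3), (400.0, 90.0, 0.9)]
print(classify_cells(predictions))
```

Because the classification is a pure function of the predicted probabilities, it can be rerun whenever the model is recalculated for a new scenario.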

An Exploratory Study on the Status of and Demand for Higher Education Programs in Fashion in Myanmar (미얀마의 패션 고등교육 현황과 수요에 대한 탐색적 연구)

  • Kang, Min-Kyung;Jin, Byoungho Ellie;Cho, Ahra;Lee, Hyojeong;Lee, Jaeil;Lee, Yoon-Jung
    • Journal of Korean Home Economics Education Association / v.34 no.3 / pp.1-23 / 2022
  • This study examined the perceptions of Myanmar university students and professors regarding the status and necessity of higher education programs in fashion. Data were collected from professors in textile engineering at Yangon Technological University and from Myanmar university students. Closed- and open-ended questions were asked either through interviews or by email. The responses were analyzed using keyword extraction and categorization, and descriptive statistics (for the closed questions). Generally, the professors perceived higher education, as well as cultural industries including art and fashion, as important for Myanmar's social and economic development. According to the students, interest in pursuing a degree in textiles was limited, despite high interest in fashion. Low wages in the apparel industry and a lack of fashion degrees that meet students' demand were cited as reasons. Demand was high for educational programs in fashion product development, fashion design, pattern-making, fashion marketing, branding, management, costume history, and cultural studies. Students expected to find their future careers in textile and clothing factories. Many students wanted to be hired by global fashion brands for higher salaries and for training in advanced knowledge and technical skills. They perceived that advanced fashion education programs would have various positive effects on Myanmar's national economy.

Application of deep learning method for decision making support of dam release operation (댐 방류 의사결정지원을 위한 딥러닝 기법의 적용성 평가)

  • Jung, Sungho;Le, Xuan Hien;Kim, Yeonsu;Choi, Hyungu;Lee, Giha
    • Journal of Korea Water Resources Association / v.54 no.spc1 / pp.1095-1105 / 2021
  • More advanced dam operation is increasingly required to cope with the rainy season, typhoons, and torrential rains. Moreover, physical models based on specific rules may have limitations in controlling dam release discharge because of inherent uncertainty and complex factors. This study aims to forecast, multiple time steps ahead, the water level at the station nearest to the dam and to evaluate the applicability of an LSTM (Long Short-Term Memory) deep learning model to decisions on dam release discharge. The LSTM model was trained and tested on eight data sets with a 1-hour temporal resolution, including primary data used in dam operation and downstream water level station data covering about 13 years (2009-2021). The trained model forecasted the water level time series at six lead times (1, 3, 6, 9, 12, and 18 hours), and the forecasts were compared with the observed data. The 1-hour-ahead predictions exhibited the best performance in all cases, with an average MAE of 0.01 m, RMSE of 0.015 m, and NSE of 0.99. In addition, as the lead time increases, the predictive performance of the model tends to decrease slightly. The model reliably reproduces the temporal pattern of the observed water level. Thus, the LSTM model is judged capable of producing predictive data by extracting the characteristics of complex, non-linear hydrological data, and it can be used to determine the amount of release discharge when simulating dam operation.
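How hourly data can be arranged for forecasts at several lead times, and how the Nash-Sutcliffe efficiency (NSE) scores them, can be sketched in Python as below. This is not the paper's LSTM pipeline: a persistence forecast (repeating the last observed value) stands in for the trained model, and the window length of 6 hours, the synthetic series, and the function names are assumptions made for illustration. Only the six lead times come from the abstract.

```python
LEAD_TIMES_H = [1, 3, 6, 9, 12, 18]  # lead times listed in the abstract


def make_samples(series, n_lags, lead):
    """Pair each window of n_lags hourly values with the value `lead` hours ahead."""
    samples = []
    for t in range(n_lags - 1, len(series) - lead):
        samples.append((series[t - n_lags + 1:t + 1], series[t + lead]))
    return samples


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Synthetic hourly "water level" with a daily sawtooth pattern.
levels = [float(h % 24) for h in range(72)]
for lead in LEAD_TIMES_H:
    samples = make_samples(levels, n_lags=6, lead=lead)
    observed = [target for _, target in samples]
    persistence = [window[-1] for window, _ in samples]  # stand-in forecast
    print(lead, len(samples), round(nse(observed, persistence), 3))
```

Each lead time gets its own set of input windows and targets, so a separate score can be reported per lead time, as the abstract does.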

Trends in Pre-service Science Teacher Education Research in Korea (우리나라 예비 과학교사 교육 연구의 동향)

  • Lee, Gyeong-Geon;An, Taesoo;Mun, Seonyeong;Hong, Hun-Gi
    • Journal of The Korean Association For Science Education / v.42 no.1 / pp.127-147 / 2022
  • Pre-service science teacher education is important for improving the quality of science teaching and learning in schools. Accordingly, many studies on pre-service science teacher education have been conducted in Korea. However, almost no research has comprehensively reviewed this literature, including the secondary teacher education context. This study reviewed 410 studies on pre-service science teacher education in Korea, published from 1995 to 2021 in 17 KCI journals. Trends were analyzed with respect to the number of articles by period, keyword frequency, and qualitative features. The qualitative features were coded along multiple aspects: pre-service teachers' type, major, subject matter in the research context, research approach, data type, and number of participants. The results indicate that the number of research articles has increased by about 40 in every 5-year period. JKASE has published the most articles, and the diversity of journals has increased since 2010. Keyword frequency revealed that scientific concepts, science teaching efficacy, nature of science, and other teaching and learning contexts were emphasized. Among the qualitative features, the most frequent pre-service type was secondary in the 'general' science context. For research topic, 'pre-service teacher education program' and 'perception and cognitive domain' were the most frequent. Most of the articles 'analyzed' the phenomena or consequences of an educational issue. Most research was conducted with 11 to 30 participants. These patterns of qualitative features differed by period and by type of pre-service teacher. Suggestions for future research topics in pre-service science teacher education were explored, such as policy-administrative research, integrated science teacher education, teacher agency, and environmental education.