• Title/Summary/Keyword: TREE FEATURE

Feature Extraction and Evaluation for Classification Models of Injurious Falls Based on Surface Electromyography

  • Lim, Kitaek;Choi, Woochol Joseph
    • Physical Therapy Korea / v.28 no.2 / pp.123-131 / 2021
  • Background: Only 2% of falls in older adults result in serious injuries (i.e., hip fracture). It is therefore important to differentiate injurious from non-injurious falls in order to develop effective interventions for injury prevention. Objectives: The purpose of this study was to (a) extract the best features of surface electromyography (sEMG) for classification of injurious falls, and (b) find the best model provided by data mining techniques using the extracted features. Methods: Twenty young adults self-initiated falls and landed sideways. The falling trials consisted of three initial fall directions (forward, sideways, or backward) and three knee positions at the time of hip impact: the impacting-side knee contacted the other knee ("knee together"), contacted the mat ("knee on mat"), or contacted neither the other knee nor the mat ("free knee"). Falls involving the "backward initial fall direction" or the "free knee" position were defined as "injurious falls", as suggested by previous studies. Nine features were extracted from the sEMG signals of four hip muscles during a fall, including integral of absolute value (IAV), Wilson amplitude (WAMP), zero crossing (ZC), number of turns (NT), mean of amplitude (MA), root mean square (RMS), average amplitude change (AAC), and difference absolute standard deviation value (DASDV). A decision tree and a support vector machine (SVM) were used to classify the injurious falls. Results: For the initial fall direction, the accuracy of the best model (SVM with DASDV) was 48%. For the knee position, the accuracy of the best model (SVM with AAC) was 49%. Furthermore, no model had both sensitivity and specificity of 80% or greater. Conclusion: Our results suggest that classification models built upon the sEMG features of the four hip muscles are not effective for classifying injurious falls. Future studies should consider other data mining techniques with different muscles.
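The abstract above names its sEMG features but gives no formulas. The following is a minimal sketch that assumes the usual textbook time-domain definitions; the function name `semg_features`, the `threshold` parameter, and the sample window are hypothetical and are not taken from the paper. Number of turns (NT) is omitted because its definition varies between authors.

```python
import math

def semg_features(x, threshold=0.0):
    """Time-domain features of one sEMG window (assumed textbook definitions)."""
    n = len(x)
    # First differences between consecutive samples
    diffs = [x[i + 1] - x[i] for i in range(n - 1)]
    iav = sum(abs(v) for v in x)                       # integral of absolute value
    ma = iav / n                                       # mean of (absolute) amplitude
    rms = math.sqrt(sum(v * v for v in x) / n)         # root mean square
    aac = sum(abs(d) for d in diffs) / n               # average amplitude change
    dasdv = math.sqrt(sum(d * d for d in diffs) / (n - 1))
    # Zero crossing: sign change between neighbours whose step is at least the threshold
    zc = sum(1 for i in range(n - 1)
             if x[i] * x[i + 1] < 0 and abs(diffs[i]) >= threshold)
    # Wilson amplitude: number of steps larger than the threshold
    wamp = sum(1 for d in diffs if abs(d) > threshold)
    return {"IAV": iav, "MA": ma, "RMS": rms, "AAC": aac,
            "DASDV": dasdv, "ZC": zc, "WAMP": wamp}

# Illustrative window only; these are not data from the study
window = [1.0, -1.0, 2.0, -2.0]
features = semg_features(window)
```

In a classification setting, one such feature value per muscle would form the input vector for a decision tree or SVM; the threshold would normally be tuned to the noise level of the recording.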

Machine Learning Methods to Predict Vehicle Fuel Consumption

  • Ko, Kwangho
    • Journal of the Korea Society of Computer and Information / v.27 no.9 / pp.13-20 / 2022
  • This paper proposes and analyzes machine learning (ML) models that predict vehicle fuel consumption (FC) in real time. A test drive was conducted with a car to measure vehicle speed, acceleration, road gradient, and FC for the training dataset. Various ML models were trained with speed, acceleration, and road gradient as features and FC as the target. Two kinds of ML models are considered in the study: regression models (linear regression and k-nearest neighbors regression) and classification models (k-nearest neighbors classifier, logistic regression, decision tree, random forest, and gradient boosting). The prediction accuracy for real-time FC is low, in the range of 0.5 to 0.6, and the classification models are more accurate than the regression ones. The prediction error for total FC is very low, about 0.2% to 2.0%, and the regression models are more accurate than the classification ones. This is because the accuracy score of the regression models is the coefficient of determination (R2), and the predicted values are distributed around the mean of the targets as the coefficient decreases. Therefore, regression models are suitable for total FC prediction and classification models for real-time FC prediction.
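One way to read the remark about R2 above is that a regressor whose predictions collapse toward the mean of the targets scores poorly on per-sample accuracy yet still gets the total almost right. The sketch below illustrates this with made-up numbers; the helper names `r2_score` and `total_error_pct` are hypothetical and the values are not from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def total_error_pct(y_true, y_pred):
    """Relative error of the summed prediction, in percent."""
    return abs(sum(y_pred) - sum(y_true)) / sum(y_true) * 100

# Illustrative instantaneous fuel-consumption samples (arbitrary units)
measured = [1.0, 3.0, 2.0, 4.0]
# A degenerate model that always predicts the mean of the targets
predicted = [2.5, 2.5, 2.5, 2.5]

r2 = r2_score(measured, predicted)             # no per-sample explanatory power
total = total_error_pct(measured, predicted)   # yet the summed consumption matches
```

Here R2 is zero while the total error is also zero, which is the extreme case of the behavior the abstract describes.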

Vacant House Prediction and Important Features Exploration through Artificial Intelligence: In Case of Gunsan (인공지능 기반 빈집 추정 및 주요 특성 분석)

  • Lim, Gyoo Gun;Noh, Jong Hwa;Lee, Hyun Tae;Ahn, Jae Ik
    • Journal of Information Technology Services / v.21 no.3 / pp.63-72 / 2022
  • The extinction crisis of local cities, caused by the increasing concentration of population in the capital region, directly increases the number of vacant houses in local cities. According to the population and housing census, Gunsan-si showed a continuously increasing trend in vacant houses from 2015 to 2019. In particular, since Gunsan-si suffers from the doughnut effect and industrial decline, problems regarding vacant houses seem to be worsening. This study aims to provide a foundation for a system that can predict and deal with buildings at high risk of becoming vacant houses by implementing a data-driven machine learning model for vacant house prediction. Methodologically, this study analyzes three types of machine learning models that differ in their data components. The first model is trained on building register, individual declared land value, house price, and socioeconomic data; the second model is trained on the same data as the first model plus POI (Point of Interest) data. Finally, the third model is trained on the same data as the second model but excluding water usage and electricity usage data. As a result, the second model shows the best performance based on F1-score. Random Forest, Gradient Boosting Machine, XGBoost, and LightGBM, which are tree ensemble methods, show the best performance overall. Additionally, model complexity can be reduced by eliminating independent variables whose correlation coefficient with vacant house status is lower than 0.1 in absolute value. Finally, this study proposes XGBoost- and LightGBM-based machine learning models, which can handle missing values, as the final vacant house prediction models.
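The correlation-based variable elimination mentioned above can be sketched as follows. This assumes the Pearson correlation coefficient (the abstract does not say which coefficient was used); the feature names and values are hypothetical placeholders, not data from the study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def select_features(features, target, min_abs_corr=0.1):
    """Keep features whose |correlation| with the target reaches the cutoff."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) >= min_abs_corr]

# Hypothetical toy data: 1 = vacant, 0 = occupied
vacant = [1, 0, 1, 0]
features = {
    "water_usage": [0.0, 5.0, 1.0, 6.0],    # strongly related to vacancy
    "unrelated_noise": [1.0, 1.0, 2.0, 2.0] # uncorrelated with vacancy
}
kept = select_features(features, vacant)
```

With the 0.1 cutoff from the abstract, only `water_usage` survives in this toy example. Note that a linear correlation filter can discard variables that matter only through interactions, which tree ensembles would otherwise be able to exploit.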

Protecting Accounting Information Systems using Machine Learning Based Intrusion Detection

  • Biswajit Panja
    • International Journal of Computer Science & Network Security / v.24 no.5 / pp.111-118 / 2024
  • In general, a network-based intrusion detection system is designed to detect malicious behavior directed at a network or its resources. The key goal of this paper is to examine network data and identify whether it is normal or anomalous traffic, specifically for accounting information systems. Today, there are a variety of approaches for detecting various forms of network-based intrusion. In this paper, we use supervised machine learning techniques. Classification models are used to train and validate data. We train the system with these algorithms on a training dataset and then use the trained system to detect intrusions in the testing dataset. Our proposed method detects whether the network data is normal or an anomaly. Using this method, we can prevent unauthorized activity on the network and on the systems under that network. Decision Tree and K-Nearest Neighbor are applied in the proposed model to distinguish abnormal from normal behavior in network traffic data. In addition, Logistic Regression and Support Vector Classification algorithms are used in our model to support the proposed concepts. Furthermore, a feature selection method is used to collect valuable information from the dataset and enhance the efficiency of the proposed approach. A Random Forest algorithm is used, which helps the system identify crucial features and focus on them rather than on all the features. The experimental findings revealed that the suggested method for network intrusion detection has a negligible false alarm rate, with the accuracy of the result expected to be between 95% and 100%. As a result of the high precision rate, this concept can be used to detect network data intrusion and prevent vulnerabilities on the network.
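The abstract reports accuracy, precision, and false alarm rate without defining them. A minimal sketch of the standard confusion-matrix definitions follows, assuming label 1 means "anomaly" and 0 means "normal"; the function name and the sample labels are hypothetical.

```python
def detection_metrics(y_true, y_pred):
    """Accuracy, precision, and false alarm rate for binary labels (1 = anomaly)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # intrusions caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal traffic passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed intrusions
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Share of normal traffic wrongly flagged as an intrusion
        "false_alarm_rate": fp / (fp + tn) if fp + tn else 0.0,
    }

# Illustrative labels only; not results from the paper
actual = [1, 1, 0, 0, 0, 0, 1, 0]
flagged = [1, 0, 0, 0, 1, 0, 1, 0]
metrics = detection_metrics(actual, flagged)
```

Because intrusion datasets are often imbalanced, accuracy alone can look high even when many attacks are missed, so the false alarm rate and precision are worth reporting alongside it.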

Prioritization of Species Selection Criteria for Urban Fine Dust Reduction Planting (도시 미세먼지 저감 식재를 위한 수종 선정 기준의 우선순위 도출)

  • Cho, Dong-Gil
    • Korean Journal of Environment and Ecology / v.33 no.4 / pp.472-480 / 2019
  • Selecting plant material for fine dust reduction planting requires comprehensive consideration of visual characteristics, such as the shape and texture of the leaves and the form of the bark, which affect the adsorption function of the plant. However, previous studies on fine dust reduction by plants have focused on the absorption function rather than the adsorption function, and on indoor foliage plants rather than outdoor plants. In particular, the criteria for selecting fine dust reduction species are not specific, so research on the selection criteria for plant materials for fine dust reduction in urban areas is needed. The purpose of this study is to identify the priorities of eight indicators that affect fine dust reduction using the fuzzy multi-criteria decision-making (MCDM) model and to establish tree selection criteria for urban planting to reduce fine dust. For this purpose, we conducted a questionnaire survey of people who majored in fine dust-related academic fields and people with experience researching fine dust. The survey showed that leaf area and tree species received the highest scores as factors affecting fine dust reduction, followed by leaf surface roughness, tree height, growth rate, leaf complexity, leaf edge shape, and bark features, in that order. When selecting species with coarse leaf surfaces, it is better to select trees with woolly, glossy, and waxy layers on the leaves. When considering leaf shape, it is better to select two-type or three-type leaves and palm-shaped leaves rather than single-type leaves, and serrated leaves rather than smooth-edged leaves, to increase the surface area for adsorbing airborne fine dust on the leaves. When considering bark characteristics, it is better to select trees that have cork layers or that show, or are likely to show, bark loosening or cracks rather than trees with lenticels or patterned bark. This study is significant in that it presents the priorities of plant material selection criteria based on the visual characteristics that affect fine dust adsorption, for planning fine dust reduction planting in urban areas. The results can be used as basic data for selecting trees in urban planting plans.

Comparison of Association Rule Learning and Subgroup Discovery for Mining Traffic Accident Data (교통사고 데이터의 마이닝을 위한 연관규칙 학습기법과 서브그룹 발견기법의 비교)

  • Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.1-16 / 2015
  • Traffic accidents have been one of the major causes of death worldwide for the last several decades. According to World Health Organization statistics, approximately 1.24 million deaths occurred on the world's roads in 2010. To reduce future traffic accidents, multipronged approaches have been adopted, including traffic regulations, injury-reducing technologies, and driver training programs. Records on traffic accidents are generated and maintained for this purpose. To make these records meaningful and effective, it is necessary to analyze the relationships between traffic accidents and related factors such as vehicle design, road design, weather, and driver behavior. Insights derived from these analyses can be used in accident prevention approaches. Traffic accident data mining is the activity of finding useful knowledge about such relationships that is not well known and that may interest the user. Many studies on mining accident data have been reported over the past two decades. Most of them focused on predicting the risk of accidents from accident-related factors. Supervised learning methods such as decision trees, logistic regression, k-nearest neighbors, and neural networks are used for this prediction. However, the prediction models derived from these algorithms are too complex for humans to understand, because the main purpose of these algorithms is prediction, not explanation of the data. Some studies use unsupervised clustering algorithms to divide the data into several groups, but the derived groups themselves are still not easy for humans to understand, so additional analytic work is necessary. Rule-based learning methods are appropriate when we want to derive a comprehensible form of knowledge about the target domain. They derive a set of if-then rules that represent the relationship between the target feature and other features. Rules are fairly easy for humans to understand, so they can provide insight and comprehensible results. Association rule learning methods and subgroup discovery methods are representative rule-based learning methods for descriptive tasks. These two kinds of algorithms have been used in a wide range of areas, including transaction analysis, accident data analysis, detection of statistically significant patient risk groups, and discovery of key persons in social communities. We use both the association rule learning method and the subgroup discovery method to discover useful patterns in a traffic accident dataset consisting of many features, including driver profile, accident location, accident type, vehicle information, and regulation violations. The association rule learning method, which is an unsupervised learning method, searches for frequent item sets in the data and translates them into rules. In contrast, the subgroup discovery method is a supervised learning method that discovers rules for user-specified concepts satisfying a certain degree of generality and unusualness. Depending on which aspect of the data we focus on, we may combine multiple relevant features of interest into a synthetic target feature and give it to the rule learning algorithms. After a set of rules is derived, postprocessing steps are taken to make the rule set more compact and easier to understand by removing uninteresting or redundant rules. We conducted a set of experiments mining our traffic accident data in both unsupervised and supervised modes to compare these rule-based learning algorithms. The experiments reveal that association rule learning, in its pure unsupervised mode, can discover some hidden relationships among the features. Under a supervised learning setting with a combinatorial target feature, however, the subgroup discovery method finds good rules much more easily than the association rule learning method, which requires considerable effort to tune its parameters.
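To make the two rule-quality notions above concrete, the sketch below scores a single if-then rule with support, confidence, and lift (the usual association rule measures) and with weighted relative accuracy (WRAcc), a common subgroup discovery measure that combines generality and unusualness. The abstract does not state which measures the authors used, so these are assumptions, and the accident records are invented for illustration.

```python
def rule_metrics(records, antecedent, consequent):
    """Score the rule `antecedent -> consequent` over a list of item sets."""
    n = len(records)
    n_a = sum(1 for r in records if antecedent <= r)   # records matching the IF part
    n_c = sum(1 for r in records if consequent <= r)   # records matching the THEN part
    n_ac = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    base_rate = n_c / n
    lift = confidence / base_rate if base_rate else 0.0
    # WRAcc = generality * unusualness = P(A) * (P(C | A) - P(C))
    wracc = (n_a / n) * (confidence - base_rate)
    return {"support": support, "confidence": confidence,
            "lift": lift, "wracc": wracc}

# Hypothetical accident records, each a set of attribute values
records = [
    {"night", "rain", "severe"},
    {"night", "severe"},
    {"day", "rain"},
    {"day"},
    {"night", "rain", "severe"},
]
scores = rule_metrics(records, {"night"}, {"severe"})
```

In the supervised setting the consequent is fixed to the target concept, so a search only needs to vary the antecedent; in the unsupervised setting both sides vary, which is one reason thresholds on support and confidence need more tuning.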

Classification and Stand Characteristics of Subalpine Forest Vegetation at Hyangjeukbong and Jungbong in Mt. Deogyusan (덕유산 향적봉 및 중봉 아고산대의 산림식생유형분류와 임분 특성)

  • Han, Sang Hak;Han, Sim Hee;Yun, Chung Weon
    • Journal of Korean Society of Forest Science / v.105 no.1 / pp.48-62 / 2016
  • This study was conducted to classify the forest vegetation structure and stand features of Mt. Deogyusan National Park from Hyangjeukbong to Jungbong; 48 plots were surveyed. The vegetation types were classified with the Z-M phytosociological method. As a result, the Quercus mongolica community group was classified into the Picea jezoensis community, the Carpinus cordata community, and the Tilia amurensis community at the community level. The P. jezoensis community was subdivided into the Deutzia glabrata group and the Viburnum opulus var. calvescens group at the group level. The D. glabrata group was subdivided into the Acer mandshuricum subgroup and the Ribes mandshuricum subgroup, and the V. opulus var. calvescens group was subdivided into the Hemerocallis dumortieri subgroup and the Prunus padus subgroup at the subgroup level. According to the estimated importance values, the tree layer consisted of Q. mongolica (23.9%), Abies koreana (14.7%), Taxus cuspidata (10.2%), P. jezoensis (8.2%), and Betula ermanii (7.4%). The subtree layer consisted of Acer komarovii (18.6%), Acer pseudosieboldianum (18.4%), and Q. mongolica (8.9%). The shrub layer consisted of Rhododendron schlippenbachii (20.7%), A. pseudosieboldianum (17.4%), and Symplocos chinensis (8.5%). The indicator species of vegetation unit 1 were Hydrangea serrata, Fraxinus mandshurica, and D. glabrata, species that prefer moist subalpine valleys or rocky sites. In the species diversity analysis, vegetation units 1, 4, and 5 showed distinct and complex local distributions. As for the similarity between vegetation units, units 1, 2, 3, and 4 showed high similarity of 0.5 or above, indicating that there were no differences in species composition among these vegetation units.

An Optimized Combination of π-fuzzy Logic and Support Vector Machine for Stock Market Prediction (주식 시장 예측을 위한 π-퍼지 논리와 SVM의 최적 결합)

  • Dao, Tuanhung;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.43-58 / 2014
  • As the use of trading systems has increased rapidly, many researchers have become interested in developing effective stock market prediction models using artificial intelligence techniques. Stock market prediction involves multifaceted interactions between market-controlling factors and unknown random processes. A successful stock prediction model achieves the most accurate result from minimum input data with the least complex model. In this research, we develop a combination model of π-fuzzy logic and support vector machine (SVM) models, using a genetic algorithm to optimize the parameters of the SVM and π-fuzzy functions, as well as feature subset selection to improve the performance of stock market prediction. To evaluate the performance of our proposed model, we compare the performance of our model to other comparative models, including the logistic regression, multiple discriminant analysis, classification and regression tree, artificial neural network, SVM, and fuzzy SVM models, with the same data. The results show that our model outperforms all other comparative models in prediction accuracy as well as return on investment.
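The abstract does not define its π-fuzzy function. The sketch below assumes the standard π-shaped membership function from the fuzzy-set literature, with a center and a radius as its two parameters; these would be the kind of parameters a genetic algorithm could tune. The function name and parameter names are hypothetical.

```python
def pi_membership(x, center, radius):
    """Standard pi-shaped fuzzy membership (assumed form, not confirmed by the paper).

    Returns 1.0 at the center, 0.5 at half the radius, and 0.0 at or beyond the radius.
    """
    d = abs(x - center)
    if d <= radius / 2:
        return 1 - 2 * (d / radius) ** 2
    if d <= radius:
        return 2 * (1 - d / radius) ** 2
    return 0.0

# Membership of a few indicator values in a fuzzy set centered at 5 with radius 2
memberships = [pi_membership(v, center=5.0, radius=2.0) for v in (5.0, 6.0, 7.0, 8.0)]
```

Each raw input feature can be mapped through several such functions (for example "low", "medium", "high") before being passed to the SVM, which is one common way to combine fuzzy logic with a kernel classifier.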

Finding the One-to-One Optimum Path Considering User's Route Perception Characteristics of Origin and Destination (Focused on the Origin-Based Formulation and Algorithm) (출발지와 도착지의 경로인지특성을 반영한 One-to-One 최적경로탐색 (출발지기반 수식 및 알고리즘을 중심으로))

  • Shin, Seong-Il;Sohn, Kee-Min;Cho, Chong-Suk;Cho, Tcheol-Woong;Kim, Won-Keun
    • Journal of Korean Society of Transportation / v.23 no.7 s.85 / pp.99-110 / 2005
  • The total travel cost of a route connecting an origin and a destination (O-D) consists of the sum of the link travel costs and the route perception cost. If the link perception cost differs by origin and destination, optimal route search is limited in reflecting actual conditions because of the route enumeration problem. The purpose of this study is to propose an optimal route search formulation and algorithm that can reflect different link perception costs for each route while avoiding the enumeration problem between origin and destination. This method defines the link as the minimum unit of a route and ultimately compares routes using link unit costs. The proposed method considers the perceived travel cost at both the origin and the destination in the optimal route search process, whereas conventional models reflect the perception cost only at the origin. However, this two-way search algorithm still cannot guarantee an optimum solution. To overcome this problem, this study proposes an origin-based optimal route search method developed on the basis of a destination-based optimal perception route tree. Through an example study that includes the diversity of route information for the surrounding area and the perception cost for the road hierarchy, this study investigates whether the proposed formulas and algorithms can reflect route perception behavior that accounts for the characteristics of the origin and destination in a real traffic network.

Matching Algorithms using the Union and Division (결합과 분배를 이용한 정합 알고리즘)

  • 박종민;조범준
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.5 / pp.1102-1107 / 2004
  • A fingerprint recognition system consists of off-line processing and on-line processing. The former registers all the features extracted from the digitized fingerprint, which is obtained from the analog fingerprint through a fingerprint acquisition device. The latter decides whether users are approved to access the system by matching the features extracted from the input fingerprint against those in the database when the users attempt to use the system. In matching between on-line and off-line processing, the most important question is which features to use as the reference. Until now, the “Delta” and “Core” have been used as this reference, but they have the drawback that they do not exist in every person's fingerprint. To handle users who lack those features, matching methods that build a spanning tree or a triangulation from the relations among the extracted features are still used. However, these methods incur time overhead, and it is not certain that they produce a correct match. Therefore, this paper presents a more accurate matching algorithm, which has both a higher matching rate and a lower mismatching rate than the existing matching algorithm, by selecting the line segment connecting two minutiae on the same ridge and furrow structures as the reference point.