• Title/Summary/Keyword: machine learning

A Performance Improvement of Automatic Butterfly Identification Method Using Color Intensity Entropy (영상의 색체 강도 엔트로피를 이용한 나비 종 자동 인식 향상 방법)

  • Kang, Seung-Ho;Kim, Tae-Hee
    • The Journal of the Korea Contents Association / v.17 no.5 / pp.624-632 / 2017
  • Automatic butterfly identification from images is an active research area because it supports researchers who study species diversity and evolutionary and developmental processes. The performance of a butterfly species identification system depends heavily on the quality of the selected features. In this paper, we propose color intensity (CI) entropy, a feature computed from the distribution of color intensities in a butterfly image. We show that CI entropy can increase the recognition rate by 10% when used together with the previously proposed branch length similarity entropy. In addition, we compare its performance with other features, such as Eigenface, 2D Fourier transform, and 2D wavelet transform, using several well-known machine learning methods.
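
The abstract does not give the exact definition of CI entropy, so the following is only a minimal sketch under the assumption that it is the Shannon entropy of an image's intensity histogram. The function name `intensity_entropy` and the toy pixel lists are hypothetical.

```python
import math
from collections import Counter

def intensity_entropy(intensities):
    """Shannon entropy (in bits) of a flat sequence of pixel intensities.

    Assumption: this stands in for the paper's CI entropy, whose exact
    formula is not stated in the abstract.
    """
    total = len(intensities)
    if total == 0:
        raise ValueError("need at least one pixel")
    entropy = 0.0
    for count in Counter(intensities).values():
        p = count / total          # relative frequency of this intensity
        entropy -= p * math.log2(p)
    return entropy

# Hypothetical toy "images", flattened to intensity lists.
uniform_patch = [128] * 16             # a single intensity
two_tone_patch = [0] * 8 + [255] * 8   # two equally frequent intensities

print(intensity_entropy(uniform_patch))   # 0.0
print(intensity_entropy(two_tone_patch))  # 1.0
```

A flat patch carries no intensity variation and scores zero, while a richer intensity distribution scores higher, which is why such a value can act as a compact descriptor alongside shape features.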

Formation of Nearest Neighbors Set Based on Similarity Threshold (유사도 임계치에 근거한 최근접 이웃 집합의 구성)

  • Lee, Jae-Sik;Lee, Jin-Chun
    • Journal of Intelligence and Information Systems / v.13 no.2 / pp.1-14 / 2007
  • Case-based reasoning (CBR) is one of the most widely applied data mining techniques and has proven its effectiveness in various domains. Since CBR is essentially based on the k-Nearest Neighbors (k-NN) method, the value of k directly affects the performance of a CBR model. Once the value of k is set, it is fixed for the lifetime of the CBR model. However, if the value is set greater or smaller than the optimal value, the performance of the CBR model deteriorates. In this research, we propose a new method of composing the nearest neighbors (NN) set that uses the similarity scores themselves rather than a fixed value of k, which we call the s-NN method. In the s-NN method, a different number of nearest neighbors can be selected for each new case. Performance evaluation using data from the UCI Machine Learning Repository shows that the CBR model adopting the s-NN method outperforms the CBR model adopting the traditional k-NN method.
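
The contrast between a fixed k and a similarity threshold can be sketched as follows. The abstract does not spell out the exact selection rule, so the simple "similarity >= threshold" test, the function names, and the similarity scores are all assumptions made for illustration.

```python
def knn_select(similarities, k):
    """Indices of the k most similar stored cases (fixed-size neighbor set)."""
    ranked = sorted(range(len(similarities)),
                    key=lambda i: similarities[i], reverse=True)
    return ranked[:k]

def snn_select(similarities, threshold):
    """Indices of all stored cases whose similarity reaches the threshold.

    Assumed rule: keep a case when similarity >= threshold, so the size of
    the neighbor set varies from one new case to the next.
    """
    ranked = sorted(range(len(similarities)),
                    key=lambda i: similarities[i], reverse=True)
    return [i for i in ranked if similarities[i] >= threshold]

# Hypothetical similarity scores of two new cases against five stored cases.
sims_a = [0.95, 0.91, 0.40, 0.35, 0.10]
sims_b = [0.55, 0.30, 0.20, 0.15, 0.05]

print(knn_select(sims_b, 3))      # [0, 1, 2] even though cases 1 and 2 are weak matches
print(snn_select(sims_a, 0.5))    # [0, 1]
print(snn_select(sims_b, 0.5))    # [0]
```

With a threshold, a new case that has few close matches simply gets a smaller neighbor set instead of being padded with dissimilar cases. A real system would also need a fallback for the case where no stored case reaches the threshold.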


Design of Optimized Radial Basis Function Neural Networks Classifier with the Aid of Principal Component Analysis and Linear Discriminant Analysis (주성분 분석법과 선형판별 분석법을 이용한 최적화된 방사형 기저 함수 신경회로망 분류기의 설계)

  • Kim, Wook-Dong;Oh, Sung-Kwun
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.735-740 / 2012
  • In this paper, we introduce design methodologies for a polynomial radial basis function neural network (RBFNN) classifier with the aid of Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Feature data are obtained by preprocessing the given data with PCA and LDA while minimizing information loss, and these data are then used as the input to the RBFNNs. The hidden layer of the RBFNNs is built by the Fuzzy C-Means (FCM) clustering algorithm instead of receptive fields, and linear polynomial functions are used as the connection weights between the hidden and output layers. To design an optimized classifier, structural and parametric values, such as the number of eigenvectors of PCA and LDA and the fuzzification coefficient of the FCM algorithm, are optimized by the Artificial Bee Colony (ABC) optimization algorithm. The proposed classifier is applied to several machine learning datasets, and its results are compared with those of other classifiers.
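
To illustrate how FCM can stand in for receptive fields, the sketch below computes the standard FCM membership of one input point in each cluster center; such memberships could serve as hidden-layer activations. This is the textbook membership formula, not the paper's implementation, and the function name, centers, and input point are hypothetical. The parameter `m` plays the role of the fuzzification coefficient.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard Fuzzy C-Means membership of one point in each center.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with m > 1.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

# Hypothetical cluster centers and one (already PCA/LDA-reduced) input point.
centers = [(0.0, 0.0), (4.0, 0.0)]
memberships = fcm_memberships((1.0, 0.0), centers, m=2.0)
print(memberships)  # memberships sum to 1; the nearer center gets the larger share
```

Raising `m` flattens the memberships toward equal shares, which is why the fuzzification coefficient is a natural target for the optimizer mentioned in the abstract.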

Convergence study to detect metabolic syndrome risk factors by gender difference (성별에 따른 대사증후군의 위험요인 탐색을 위한 융복합 연구)

  • Lee, So-Eun;Rhee, Hyun-Sill
    • Journal of Digital Convergence / v.19 no.12 / pp.477-486 / 2021
  • This study was conducted to detect metabolic syndrome risk factors and gender differences in adults. Data on 18,616 adults were collected by the Korea Health and Nutrition Examination Study from 2016 to 2019. Four machine learning methods (Logistic Regression, Decision Tree, Naïve Bayes, Random Forest) were used to predict metabolic syndrome. The results showed that Random Forest was superior to the other methods for both men and women. For both men and women, BMI, diet (fat, vitamin C, vitamin A, protein, energy intake), number of underlying chronic diseases, and age were the most important factors. In women, education level, age at menarche, and menopause were additional important factors, and age and number of underlying chronic diseases were more important than in men. Future studies should verify various strategies to prevent metabolic syndrome.

Consultation Management Model based on Behavior Classification of Special-Needs Students (특수학생들의 행동 분류 기반의 상담관리 모델)

  • Park, Won-Cheol;Park, Koo-Rack
    • Journal of the Korea Convergence Society / v.12 no.9 / pp.21-30 / 2021
  • Unlike generally known behaviors, unspecific behaviors are poorly documented. To provide education or guidance regarding unspecific behaviors, data on the unspecific behaviors of special-needs students must be collected and managed. In this paper, a consultation management model based on behavior classification of special-needs students using machine learning is proposed. It collects data by photographing the behavior of special-needs students in real time, analyzes the behavior patterns, composes a data set, and trains the suggestion system on it. Accuracy can be improved by feeding behavior photographed later into the suggestion system and comparing the results with the existing data again. The test was performed by arbitrarily applying unspecific behaviors not stored in the database, and the forecast model accurately classified and grouped the input data. It was also verified that the behaviors can be accurately distinguished and classified through their feature data even if there are some errors in the input process.

A study on the practical use of smart meter end-user demand data (스마트미터 데이터 활용 방법에 대한 연구)

  • Park, Geunyeong;Jung, Donghwi;Jun, Sanghoon
    • Journal of Korea Water Resources Association / v.54 no.10 / pp.759-768 / 2021
  • This work introduces a new approach that classifies individual household water usage by examining the characteristics of smart meter end-user demand data. The K-means algorithm, one of the best-known unsupervised machine learning methods, is applied to classify the water consumption of each household. The intensity and duration of end-user demands are used as the main features to identify households with similar water consumption patterns. The results showed that 21 households were classified into 13 clusters, with each cluster containing one, two, three, or five households. The reasons why multiple households are classified into the same cluster are described in this paper with respect to the collected data and end-user water consumption behavior.
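
The clustering step can be sketched with a small K-means (Lloyd's algorithm) over two features per household. The feature values, units, number of clusters, and the deterministic "first k points" initialization are assumptions for illustration; the paper's data and settings are not given in the abstract.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    centers = [points[i] for i in range(k)]  # assumed init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers

# Hypothetical households as (demand intensity, demand duration) pairs.
households = [(2.0, 5.0), (9.0, 30.0), (2.2, 6.0), (8.5, 28.0), (1.8, 4.5)]
labels, centers = kmeans(households, k=2)
print(labels)  # [0, 1, 0, 1, 0]
```

In practice the two features would normally be scaled to comparable ranges first, since Euclidean distance otherwise lets the feature with the larger numeric range dominate the grouping.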

Risk Prediction and Analysis of Building Fires -Based on Property Damage and Occurrence of Fires- (건물별 화재 위험도 예측 및 분석: 재산 피해액과 화재 발생 여부를 바탕으로)

  • Lee, Ina;Oh, Hyung-Rok;Lee, Zoonky
    • The Journal of Bigdata / v.6 no.1 / pp.133-144 / 2021
  • This paper derives the fire risk of buildings in Seoul by predicting property damage and the occurrence of fires. This study differs from prior research in that it uses variables covering not only a building's characteristics but also its administrative area and the accessibility of nearby fire-fighting facilities. We use ensemble voting techniques to combine different machine learning algorithms to predict property damage and fire occurrence, and we extract feature importance to produce the fire risk. Fire risk was predicted for 300 buildings in Seoul using the established model. Buildings at fire risk Level 1 housed a high number of households and had many factors that could increase the size of a fire, including a lack of nearby fire-fighting facilities and a distant 119 Safety Center. In contrast, for Level 5 buildings, the number of buildings and businesses is large, but the responsible 119 Safety Center is located closest to the building and can therefore respond properly to a fire.
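
Ensemble voting combines the outputs of several base models into one prediction. The sketch below shows generic hard (majority) and soft (averaged probability) voting; which variant, base models, and weights the paper uses is not stated in the abstract, so the function names and the example outputs are hypothetical.

```python
from collections import Counter

def hard_vote(predictions):
    """Majority label among base-model predictions.

    Ties go to the label that appears first in the input.
    """
    return Counter(predictions).most_common(1)[0][0]

def soft_vote(probabilities, weights=None):
    """Weighted average of base-model probabilities for the positive class."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    return sum(w * p for w, p in zip(weights, probabilities)) / sum(weights)

# Hypothetical outputs of three base models for one building.
fire_labels = [1, 0, 1]          # 1 = fire predicted
fire_probs = [0.8, 0.4, 0.6]     # predicted probability of fire

print(hard_vote(fire_labels))          # 1
print(soft_vote(fire_probs) >= 0.5)    # True
```

Soft voting keeps the confidence of each model, so one strongly confident model can outweigh two weakly dissenting ones, whereas hard voting counts every model equally.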

Analysis of suitable evacuation routes through multi-agent system simulation within buildings

  • Castillo Osorio, Ever Enrique;Seo, Min Song;Yoo, Hwan Hee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.39 no.5 / pp.265-278 / 2021
  • When a dangerous event threatens people inside a building and immediate evacuation is required, it is important that suitable routes have been defined in advance. Such situations are especially critical when buildings are crowded, because occupants are highly vulnerable and can be trapped if they do not evacuate quickly and safely. However, in most cases, routes are chosen based only on their proximity or short distance to the exit areas, and evacuation simulations that include more variables are not performed. This work proposes a methodology for indoor building evacuation based on processing simulation scenarios in multi-agent environments. In the methodology, importance indexes of simplified and validated geometry data from a BIM (Building Information Modeling) model are used as heuristic input data to a proposed algorithm. The algorithm is based on AP-Theta* pathfinding and collision-avoidance machine learning techniques. It also includes conditioning variables that influence evacuation times, such as the number of people, the speed of movement, and the reaction ability of the agents. Moreover, collision avoidance is applied between people and with objects along the route. Simulations using the proposed algorithm are run in NetLogo for diverse scenarios, showing feasible evacuation routes and calculating evacuation times in a multi-agent environment. The experimental results, obtained by applying the method to a case study, demonstrate the effectiveness of the algorithm and the influence of the conditioning variables, analyzed together, when determining safe evacuation routes.

Identification of Profane Words in Cyberbullying Incidents within Social Networks

  • Ali, Wan Noor Hamiza Wan;Mohd, Masnizah;Fauzi, Fariza
    • Journal of Information Science Theory and Practice / v.9 no.1 / pp.24-34 / 2021
  • The popularity of social networking sites (SNS) has facilitated communication between users. SNS help users in their daily lives in various ways, such as sharing opinions, keeping in touch with old friends, making new friends, and getting information. However, some users misuse SNS to belittle or hurt others using profanities, which is typical of cyberbullying incidents. In this study, we therefore aim to identify profane words in the ASKfm corpus, using a lexicon dictionary, and to analyze the distribution of profane words across four roles involved in cyberbullying. These four roles are: harasser, victim, bystander who assists the bully, and bystander who defends the victim. The evaluation focused on the occurrences of profane words for each role in the corpus. The top 10 most common words in the corpus are also identified and represented in a graph. The analysis shows that the four roles used profane words in their conversations with different weightage and distribution, even though the profane words used are mostly similar. The harasser ranked first in the use of profane words compared with the other roles. The results can be explored further and considered as a potential feature in a cyberbullying detection model using a machine learning approach. The results of this work will contribute to formulating a suitable representation. They are also useful, in future work, for building a cyberbullying detection model based on the distribution of profane words across different cyberbullying roles in social networks.
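
Lexicon-based counting of profane words per role can be sketched as below. The tiny lexicon, the role labels, the example posts, and the tokenization rule are all hypothetical placeholders; the study's actual lexicon and the ASKfm data are not reproduced here.

```python
import re
from collections import Counter, defaultdict

def profane_counts_by_role(posts, lexicon):
    """Count lexicon hits per role from (role, text) pairs."""
    counts = defaultdict(Counter)
    for role, text in posts:
        # Assumed tokenization: lowercase runs of letters and apostrophes.
        for token in re.findall(r"[a-z']+", text.lower()):
            if token in lexicon:
                counts[role][token] += 1
    return counts

# Hypothetical mild stand-ins for a profanity lexicon and annotated posts.
lexicon = {"idiot", "jerk"}
posts = [
    ("harasser", "You idiot, you total jerk"),
    ("victim", "Stop calling me an idiot"),
    ("bystander_defender", "Leave them alone"),
]

counts = profane_counts_by_role(posts, lexicon)
for role in ("harasser", "victim", "bystander_defender"):
    print(role, sum(counts[role].values()))
```

Per-role totals like these, or the full per-word counters, are the kind of distribution that could later be fed to a detection model as features.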

Electrical fire prediction model study using machine learning (기계학습을 통한 전기화재 예측모델 연구)

  • Ko, Kyeong-Seok;Hwang, Dong-Hyun;Park, Sang-June;Moon, Ga-Gyeong
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.6 / pp.703-710 / 2018
  • Although various efforts, such as accident analysis and inspection, are made every year to reduce electric fire accidents, there is no effective countermeasure because an effective decision support system and a method for utilizing the existing cumulative data are lacking. The purpose of this study is to develop an algorithm for predicting electric fires based on data such as electric safety inspection data, electric fire accident information, building information, and weather information. Through pre-processing of data collected from institutions such as the Korea Electrical Safety Corporation, the Meteorological Administration, the Ministry of Land, Infrastructure, and Transport, and the Fire Defense Headquarters, followed by convergence, analysis, modeling, and verification, we derive the factors influencing electric fires and develop prediction models. The results showed that the influencing factors were insulation resistance value, humidity, wind speed, building deterioration (aging), floor space ratio, building coverage ratio, and building use. The accuracy of the prediction model using the random forest algorithm was 74.7%.