• Title/Abstract/Keywords: Random Forest Classification

Search Result 311, Processing Time 0.026 seconds

Korean Traditional Music Genre Classification Using Sample and MIDI Phrases

  • Lee, JongSeol;Lee, MyeongChun;Jang, Dalwon;Yoon, Kyoungro
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.4 / pp.1869-1886 / 2018
  • This paper proposes a MIDI- and audio-based music genre classification method for Korean traditional music. There are many traditional instruments in Korea, and most of the traditional songs played on these instruments have similar patterns and rhythms. Although music information processing tasks such as music genre classification and audio melody extraction have been studied, most studies have focused on pop, jazz, rock, and other universal genres. There are few studies on Korean traditional music because of the lack of datasets. This paper analyzes raw audio and MIDI phrases in Korean traditional music, performed using Korean traditional musical instruments. The samples and MIDI phrases classified by our classification system will be used to construct a database or to implement our Kontakt-based instrument library. Thus, we can construct a management system for a Korean traditional music library using this classification system. Appropriate feature sets for raw audio and MIDI phrases are proposed, and the classification results, based on machine learning algorithms such as support vector machine, multi-layer perceptron, decision tree, and random forest, are outlined in this paper.

Land-cover classification using multi-temporal Radarsat-1 and ENVISAT data (다중 시기 Radarsat-1 자료와 ENVISAT 자료를 이용한 토지 피복 분류)

  • Park No-Wook;Chi Kwang-Hoon
    • Proceedings of the KSRS Conference / 2006.03a / pp.303-306 / 2006
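  • This study performs land-cover classification using multi-temporal Radarsat-1 data and ENVISAT ASAR data, both of which are C-band SAR data and which together provide different polarization states. From the multi-temporal, multi-polarization data, features such as the mean backscattering coefficient, temporal variability, and coherence were extracted as the baseline feature set, and feature extraction by principal component analysis was also attempted for comparison. Random Forests was applied as the classifier on these features. In a case study of the Yedang plain area in Chungnam, classification accuracy improved when the principal-component features and the multi-polarization data were used.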


Crowdfunding Scams: The Profiles and Language of Deceivers

  • Lee, Seung-hun;Kim, Hyun-chul
    • Journal of the Korea Society of Computer and Information / v.23 no.3 / pp.55-62 / 2018
  • In this paper, we propose a model to detect crowdfunding scams, which have reportedly been occurring over the last several years, based on their project information and linguistic features. To this end, we first collect and analyze crowdfunding scam projects, and then reveal which specific project-related information and linguistic features are particularly useful in distinguishing scam projects from non-scams. Our proposed model, built with the selected features and the Random Forest machine learning algorithm, detects scam campaigns with 84.46% accuracy.

Effective Feature Extraction and Classification for IDS in Accessible IOT Environment (접근이 어려운 IOT 환경에서의 IDS를 위한 효과적인 특징 추출과 분류)

  • Lee, Joo-Hwa;Park, Ki-Hyun
    • Annual Conference of KIPS / 2019.05a / pp.714-717 / 2019
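  • The IoT is a complex and heterogeneous network environment, and the presence of new routing protocols for low-power devices calls for innovative intrusion detection systems. In hard-to-access IoT environments in particular, attacks must be detected accurately and quickly. This paper therefore proposes SAR (Stacked Auto Encoder + Random Forest), a system for effective feature extraction and classification that targets detection accuracy and the detection of rare attacks.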

Data-driven Analysis for Future Land-use Change Prediction : Case Study on Seoul (서울 데이터 기반 필지별 용도전환 발생 예측)

  • Yun, Sung Bum;Mun, Sungchul;Park, Soon Yong;Kim, Taehyun
    • Journal of Broadcast Engineering / v.25 no.2 / pp.176-184 / 2020
  • Because parts of Seoul continue to develop while others decline, the Seoul government is pursuing various policies to regenerate declining areas. These policies lead to land-use changes across numerous Seoul districts. This study aims to build a prediction model that can foresee future land-use changes and, in doing so, to derive the factors that influence those changes. To this end, various open data from national departments and the Seoul government were collected and fed into a random forest algorithm. The results showed promising accuracy and identified multiple influential factors that cause land-use changes around Seoul districts. The results could be applied to public-sector policy making, or used as a basis for studying gentrification problems in the Seoul area.

Vulnerability Assessment for Fine Particulate Matter (PM2.5) in the Schools of the Seoul Metropolitan Area, Korea: Part II - Vulnerability Assessment for PM2.5 in the Schools (인공지능을 이용한 수도권 학교 미세먼지 취약성 평가: Part II - 학교 미세먼지 범주화)

  • Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.37 no.6_2 / pp.1891-1900 / 2021
  • Fine particulate matter (FPM; diameter ≤ 2.5 ㎛) is frequently found in metropolitan areas due to activities associated with rapid urbanization and population growth. Many adolescents spend a substantial amount of time at school where, for various reasons, FPM generated outdoors may flow into indoor areas. The aims of this study were to estimate FPM concentrations and categorize types of FPM in schools. Meteorological and chemical variables as well as satellite-based aerosol optical depth were analyzed as input data in a random forest model, which applied 10-fold cross validation and a grid-search method, to estimate school FPM concentrations, with four statistical indicators used to evaluate accuracy. Loose and strict standards were established to categorize types of FPM in schools. Under the former classification scheme, FPM in most schools was classified as type 2 or 3, whereas under strict standards, school FPM was mostly classified as type 3 or 4.
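
The combination of 10-fold cross validation and grid search mentioned in the abstract above can be sketched roughly as follows. This is a minimal illustration rather than the authors' pipeline: the helper names (`k_fold_indices`, `grid_search_cv`, `knn_mean`), the toy data, and the RMSE score are all assumptions, and a simple nearest-neighbour averaging model stands in for the random forest so that the example runs with the Python standard library alone.

```python
import itertools
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def grid_search_cv(X, y, fit_predict, param_grid, k=10):
    """Return (best_params, best_rmse) over every combination in param_grid."""
    names = sorted(param_grid)
    best = None
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        sq_err, count = 0.0, 0
        for train, test in k_fold_indices(len(X), k):
            preds = fit_predict(
                [X[i] for i in train], [y[i] for i in train],
                [X[i] for i in test], **params,
            )
            sq_err += sum((p - y[i]) ** 2 for p, i in zip(preds, test))
            count += len(test)
        rmse = (sq_err / count) ** 0.5  # pooled over all held-out samples
        if best is None or rmse < best[1]:
            best = (params, rmse)
    return best


def knn_mean(X_train, y_train, X_test, n_neighbors):
    """Stand-in model (NOT a random forest): mean of the nearest targets."""
    preds = []
    for x in X_test:
        order = sorted(range(len(X_train)), key=lambda i: abs(X_train[i] - x))
        nearest = order[:n_neighbors]
        preds.append(sum(y_train[i] for i in nearest) / len(nearest))
    return preds


# Hypothetical toy data: one predictor, one target.
X = [i / 10 for i in range(50)]
y = [2.0 * x + 1.0 for x in X]
best_params, best_rmse = grid_search_cv(X, y, knn_mean, {"n_neighbors": [1, 3, 5]})
print(best_params, round(best_rmse, 3))
```

In practice a random forest implementation would take the place of `knn_mean`, and the grid would range over its hyperparameters, such as the number of trees.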

A Comparative Study of Prediction Models for College Student Dropout Risk Using Machine Learning: Focusing on the case of N university (머신러닝을 활용한 대학생 중도탈락 위험군의 예측모델 비교 연구 : N대학 사례를 중심으로)

  • So-Hyun Kim;Sung-Hyoun Cho
    • Journal of The Korean Society of Integrative Medicine / v.12 no.2 / pp.155-166 / 2024
  • Purpose : This study aims to identify key factors for predicting dropout risk at the university level and to provide a foundation for policy development aimed at dropout prevention. It explores the optimal machine learning algorithm by comparing the performance of various algorithms on data about college students' dropout risk. Methods : Data on factors influencing dropout risk and propensity were collected from N University. The collected data were applied to several machine learning algorithms, including random forest, decision tree, artificial neural network, logistic regression, support vector machine (SVM), k-nearest neighbor (k-NN) classification, and Naive Bayes. The performance of these models was compared and evaluated, with a focus on predictive validity and on identifying significant dropout factors through the information gain index. Results : The binary logistic regression analysis showed that the year of the program, department, grades, and year of entry had a statistically significant effect on dropout risk. Among the machine learning algorithms, random forest performed the best. The relative importance of the predictor variables was highest for department, followed by age, grade, and whether the student's residence matched the school location. Conclusion : Machine learning-based prediction of dropout risk focuses on the early identification of students at risk. The types and causes of dropout crises vary significantly among students. It is important to identify these types and causes so that appropriate actions and support can be taken to remove risk factors and increase protective factors. The relative importance of the factors affecting dropout risk found in this study will help guide educational prescriptions for preventing college student dropout.

Object Classification based on Weakly Supervised E2LSH and Saliency map Weighting

  • Zhao, Yongwei;Li, Bicheng;Liu, Xin;Ke, Shengcai
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.1 / pp.364-380 / 2016
  • The most popular approach in object classification is based on the bag of visual-words model, which has several fundamental problems that restrict its performance, such as low time efficiency, the synonymy and polysemy of visual words, and the lack of spatial information between visual words. In view of this, an object classification method based on weakly supervised E2LSH and saliency map weighting is proposed. Firstly, E2LSH (Exact Euclidean Locality Sensitive Hashing) is employed to generate a group of weakly randomized visual dictionaries by clustering SIFT features of the training dataset, and the selection of hash functions is supervised, inspired by random forest ideas, to reduce the randomness of E2LSH. Secondly, the graph-based visual saliency (GBVS) algorithm is applied to detect the saliency maps of different images and to weight the visual words according to the saliency prior. Finally, a saliency map weighted visual language model is used to accomplish object classification. Experimental results on the Pascal 2007 and Caltech-256 datasets indicate that the distinguishability of objects is effectively improved and that our method is superior to the state-of-the-art object classification methods.

Classification Abnormal temperatures based on Meteorological Environment using Random forests (랜덤포레스트를 이용한 기상 환경에 따른 이상기온 분류)

  • Youn Su Kim;Kwang Yoon Song;In Hong Chang
    • Journal of Integrative Natural Science / v.17 no.1 / pp.1-12 / 2024
  • Many abnormal climate events are occurring around the world. The cause of abnormal climate is related to temperature. Factors that affect temperature include excessive emissions of carbon and greenhouse gases from a global perspective, and air circulation from a local perspective. Due to the air circulation, many abnormal climate phenomena such as abnormally high temperature and abnormally low temperature are occurring in certain areas, which can cause very serious human damage. Therefore, the problem of abnormal temperature should not be approached only as a case of climate change, but should be studied as a new category of climate crisis. In this study, we proposed a model for the classification of abnormal temperature using random forests based on various meteorological data such as longitudinal observations, yellow dust, ultraviolet radiation from 2018 to 2022 for each region in Korea. Here, the meteorological data had an imbalance problem, so the imbalance problem was solved by oversampling. As a result, we found that the variables affecting abnormal temperature are different in different regions. In particular, the central and southern regions are influenced by high pressure (Mainland China, Siberian high pressure, and North Pacific high pressure) due to their regional characteristics, so pressure-related variables had a significant impact on the classification of abnormal temperature. This suggests that a regional approach can be taken to predict abnormal temperatures from the surrounding meteorological environment. In addition, in the event of an abnormal temperature, it seems that it is possible to take preventive measures in advance according to regional characteristics.
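
The abstract above says only that the class imbalance was handled by oversampling, without naming a method. One common choice, random oversampling, can be sketched as follows; the function name `random_oversample` and the toy pressure readings are hypothetical, and this is not necessarily the procedure the authors used.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 6 "normal" days and 2 "abnormal" days,
# each described by a single made-up pressure reading.
rows = [[1012.0], [1010.5], [1011.2], [1009.8], [1013.1], [1012.4],
        [1031.0], [1029.5]]
labels = ["normal"] * 6 + ["abnormal"] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling is usually applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.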

Comparison of Machine Learning Analysis on Predictive Factors of Children's Planning-Organizing Executive Function by Income Level: Through Home Environment Quality and Wealth Factors

  • Lim, Hye-Kyung;Kim, Hyun-Ok;Park, Hae-Seon
    • Journal of People, Plants, and Environment / v.24 no.6 / pp.651-662 / 2021
  • Background and objective: This study examines whether children's planning-organizing executive function can be significantly classified and predicted by home environment quality and wealth factors. Methods: For empirical analysis, we used data collected from the 10th Panel Study on Korean Children in 2017. Using machine learning tools such as support vector machine (SVM) and random forest (RF), we evaluated the accuracy of the model in which home environment factors classify and predict children's planning-organizing executive functions, and extracted the relative importance of the variables that determine these executive functions by income group. Results: First, SVM analysis shows that home environment quality and wealth factors achieve high accuracy in classification and prediction in all three groups. Second, RF analysis shows that estate had the highest predictive power in the high-income group, followed by income, asset, learning, reinforcement, and emotional environment. In the middle-income group, emotional environment showed the highest score, followed by estate, asset, reinforcement, and income. In the low-income group, estate showed the highest score, followed by income, asset, learning, reinforcement, and emotional environment. Conclusion: This study confirmed that home environment quality and wealth factors are significant factors in predicting children's planning-organizing executive functions.