Search | Korea Science

An Experimental Study on Feature Ranking Schemes for Text Classification (텍스트 분류를 위한 자질 순위화 기법에 관한 연구)

Pan Jun Kim
- Journal of the Korean Society for Information Management / v.40 no.1 / pp.1-21 / 2023
This study examined the performance of ranking schemes as an efficient feature selection method for text classification. To date, feature ranking schemes have mostly been based on document frequency, and relatively few have used term frequency. The study therefore first examined single ranking metrics that use term frequency or document frequency individually as feature selection methods for text classification, and then reviewed combination ranking schemes that use both. Specifically, classification experiments were conducted with two data sets (Reuters-21578, 20NG) and five classifiers (SVM, NB, ROC, TRA, RNN), and 5-fold cross-validation and t-tests were applied to ensure the reliability of the results. Among single ranking schemes, the document frequency-based metric (chi) performed well overall. In addition, no significant difference was found between the best-performing single ranking scheme and the combination ranking schemes. Therefore, when sufficient training documents are available for text classification, it is more efficient to use a single document frequency-based ranking metric (chi) as the feature selection method.
https://doi.org/10.3743/KOSIM.2023.40.1.001
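
The abstract above names chi as a document frequency-based ranking metric. As a rough illustration only, the following minimal sketch scores each term with the standard chi-square statistic computed from a 2x2 table of document counts and ranks terms for one target class. The function names, the toy documents, and the one-vs-rest setup are assumptions made for this example and are not taken from the paper.

```python
def chi_square(a, b, c, d):
    """Chi-square score of a term for a class from document counts.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # term or class is absent or universal; no evidence
    return n * (a * d - b * c) ** 2 / denom


def rank_terms(docs, labels, target):
    """Rank vocabulary terms by chi-square for `target` (highest first).

    docs is a list of token lists; labels holds one class label per document.
    Ties are broken alphabetically so the order is deterministic.
    """
    vocab = {term for doc in docs for term in doc}
    n_pos = sum(1 for y in labels if y == target)
    n_neg = len(labels) - n_pos
    scores = {}
    for term in vocab:
        a = sum(1 for doc, y in zip(docs, labels) if y == target and term in doc)
        b = sum(1 for doc, y in zip(docs, labels) if y != target and term in doc)
        scores[term] = chi_square(a, b, n_pos - a, n_neg - b)
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical toy corpus, used only to exercise the functions.
docs = [["oil", "price"], ["oil", "barrel"], ["game", "team"], ["team", "score"]]
labels = ["econ", "econ", "sport", "sport"]
top_terms = rank_terms(docs, labels, "econ")[:2]
```

Feature selection would then keep only the top-k ranked terms before training a classifier; how k is chosen is not stated in the abstract.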

Classification and evaluation of river environment using Hyperspectral images (초분광 영상정보를 활용한 하천환경 분류 및 평가)

Han, Hyeong Jun;Lee, Chang Hun;Kang, Joon Gu;Kim, Jong Tae
- Proceedings of the Korea Water Resources Association Conference / 2019.05a / pp.423-423 / 2019
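RGB and multispectral images, owing to their high spatial resolution, are effective for assigning classes to small objects, but their low spectral resolution limits their use for classifying diverse types of land cover and for distinguishing targets with subtle spectral differences. Hyperspectral images, by contrast, capture a target's spectral reflectance curve in detail across hundreds of contiguous spectral bands. Recently in Korea, studies have attempted to apply hyperspectral imagery to fields such as land-cover mapping and environmental monitoring. Small UAVs such as drones now make it possible to acquire imagery with high spatiotemporal resolution at low cost, and advances in spectral imaging equipment have produced lightweight, compact hyperspectral sensors that can be mounted on drones, enabling imagery with higher spectral resolution. In this study, high-resolution hyperspectral images were acquired with a UAV for efficient river environment surveys, and the river environment was classified and evaluated by analyzing spatial classification accuracy according to the dimensionality reduction method and classifier applied. The hyperspectral images acquired in the study area were reduced in dimension with the MNF and PCA methods to lessen the effect of noise, and the supervised classification methods MLC (Maximum Likelihood Classification), SVM (Support Vector Machine), and SAM (Spectral Angle Mapping) were applied to perform spatial classification according to river environment characteristics. The highest classification accuracy was obtained when MLC supervised classification was applied to the MNF-reduced image, although misclassification occurred mainly in some classes, at water-body boundaries, and in shadowed areas.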
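
Of the three classifiers named in the abstract above, SAM is the simplest to illustrate: it assigns each pixel to the reference spectrum with which it forms the smallest angle. The sketch below is a generic, minimal version of that idea; the reference spectra, band count, and class names are hypothetical and are not data from the study.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm = math.sqrt(sum(p * p for p in pixel)) * math.sqrt(sum(r * r for r in reference))
    if norm == 0:
        raise ValueError("spectra must be non-zero vectors")
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.acos(cosine)


def classify_pixel(pixel, references):
    """Return the name of the reference spectrum with the smallest angle."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))


# Hypothetical three-band reference spectra, for illustration only.
references = {
    "water": [0.30, 0.20, 0.05],
    "vegetation": [0.05, 0.30, 0.45],
}
label = classify_pixel([0.10, 0.40, 0.50], references)
```

Because the angle ignores vector length, a pixel and a uniformly brighter or darker copy of it receive the same label, which is one reason SAM is often described as relatively insensitive to illumination differences.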

A Study on Method for User Gender Prediction Using Multi-Modal Smart Device Log Data (스마트 기기의 멀티 모달 로그 데이터를 이용한 사용자 성별 예측 기법 연구)

Kim, Yoonjung;Choi, Yerim;Kim, Solee;Park, Kyuyon;Park, Jonghun
- The Journal of Society for e-Business Studies / v.21 no.1 / pp.147-163 / 2016
Gender information about a smart device user is essential for providing personalized services, and multi-modal data obtained from the device is useful for predicting the user's gender. However, the way each type of multi-modal data can be used for gender prediction differs according to the characteristics of the data. Therefore, this study proposes an ensemble method that predicts the gender of a smart device user with three classifiers that take text, application, and acceleration data as inputs, respectively. To alleviate the privacy issues that arise when text data generated on a smart device is sent outside, the text-based classifier scans the text data only on the device and classifies the user's gender by matching the text against predefined sets of words. The application-based classifier assigns gender labels to executed applications and predicts the user's gender by comparing the label ratio. Acceleration data is classified with a Support Vector Machine. The proposed method was evaluated on actual smart device log data collected from an Android application. The experimental results showed that the proposed method outperformed the compared methods.
https://doi.org/10.7838/jsebs.2016.21.1.147

Design of a Pattern Classifier for Pain Awareness using Electrocardiogram (심전도를 이용한 통증자각 패턴분류기 설계)

Lim, Hyunjun;Yoo, Sun Kook
- Journal of Korea Multimedia Society / v.20 no.9 / pp.1509-1518 / 2017
Although several methods have been used to assess pain levels, few practical methods for classifying the presence or absence of pain with pattern classifiers have been suggested. The aim of this study is to design a pattern classifier that classifies the presence or absence of pain using the electrocardiogram (ECG). We measured the ECG signal from 10 subjects in the painless state and in the pain state (induced by mechanical stimulation). Ten heart rate variability (HRV) features were extracted from the ECG: MeanRRI, SDNN, rMSSD, NN50, and pNN50 in the time domain, and VLF, LF, HF, Total Power, and LF/HF in the frequency domain. These features were used as the input vector of the pattern classifier, an artificial neural network (ANN) or a support vector machine (SVM), to classify the presence or absence of pain. The results showed that the ANN and SVM classifiers could classify the presence or absence of pain with accuracies of 81.58% and 81.84%, respectively. The proposed classifiers can be applied to the objective assessment of pain level.
https://doi.org/10.9717/kmms.2017.20.9.1509
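
The five time-domain HRV features listed in the abstract above have widely used textbook definitions, sketched below for a list of RR intervals in milliseconds. This is a minimal illustration and not the authors' code: the function name is hypothetical, the sample standard deviation (n - 1) for SDNN is an assumed convention, and the input intervals are made up.

```python
import math


def hrv_time_features(rr_ms):
    """Compute MeanRRI, SDNN, rMSSD, NN50 and pNN50 from RR intervals (ms)."""
    n = len(rr_ms)
    if n < 2:
        raise ValueError("need at least two RR intervals")
    mean_rri = sum(rr_ms) / n
    # SDNN: standard deviation of the intervals (sample form, an assumption).
    sdnn = math.sqrt(sum((x - mean_rri) ** 2 for x in rr_ms) / (n - 1))
    # Differences between successive intervals drive the remaining features.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    pnn50 = 100.0 * nn50 / len(diffs)
    return {
        "MeanRRI": mean_rri,
        "SDNN": sdnn,
        "rMSSD": rmssd,
        "NN50": nn50,
        "pNN50": pnn50,
    }


# Made-up RR intervals in milliseconds, for illustration only.
features = hrv_time_features([800, 860, 800, 820])
```

The frequency-domain features (VLF, LF, HF, Total Power, LF/HF) require a power spectral density estimate of the resampled RR series and are not covered by this sketch.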

Performance Evaluation between Models for Smoker Classification Based on Health Examination Data (건강검진 데이터 기반 흡연자 분류를 위한 모형별 성능 분석)

Yun, Jisun;Yu, Heonchang
- Annual Conference of KIPS / 2018.10a / pp.648-651 / 2018
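Indicators for determining smoking status exist, but they have the drawback that their results vary with factors such as half-life. An indicator that is less affected by external factors is therefore needed for determining smoking status. This study was conducted in the hope that finding a model suited to determining smoking status would help in developing such an indicator. The experiments use health examination data provided by the National Health Insurance Service and apply machine learning models such as SVM, Logistic Regression, and KNN to determine smoking status. Model performance was measured from two perspectives: change in performance according to the attributes used, and change in performance according to the number of training samples. The models were evaluated by accuracy, precision, recall, and F1-score (harmonic mean), and showed accuracy of about 70% and recall in the 60% range. SVM performed best at identifying smokers, with 63% recall in the attribute experiment and 68% recall in the training-data-size experiment. In addition, comparing the best- and worst-performing models by recall in each experimental round showed a difference of up to 36% in the attribute experiment and up to 42% in the training-data-size experiment. This confirms that, while the attributes used for identification are important, choosing a suitable model is also important.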
https://doi.org/10.3745/PKIPS.y2018m10a.648
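
The four evaluation measures named in the abstract above can be computed from a binary confusion matrix as follows. This is a minimal sketch with made-up labels (1 standing for smoker is an assumption of the example); it does not reproduce the study's data or results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or true positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration: 1 = smoker, 0 = non-smoker.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

Recall is the share of actual smokers that the model finds, which is why the abstract uses it as the main basis for comparing models.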

Enhancement of Speech/Music Classification for 3GPP2 SMV Codec Employing Discriminative Weight Training (변별적 가중치 학습을 이용한 3GPP2 SVM의 실시간 음성/음악 분류 성능 향상)

Kang, Sang-Ick;Chang, Joon-Hyuk;Lee, Seong-Ro
- The Journal of the Acoustical Society of Korea / v.27 no.6 / pp.319-324 / 2008
In this paper, we propose a novel approach to improving the performance of speech/music classification for the selectable mode vocoder (SMV) of 3GPP2, using discriminative weight training based on the minimum classification error (MCE) algorithm. We first present an analysis of the features and the classification method adopted in the conventional SMV. The proposed speech/music decision rule is then expressed as the geometric mean of optimally weighted features selected from the SMV. The performance of the proposed algorithm is evaluated under various conditions and yields better results than the conventional SMV scheme.
https://doi.org/10.7776/ASK.2008.27.6.319
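
The abstract above describes the decision rule only as a geometric mean of weighted features. One generic way to write such a rule is a weighted geometric mean compared against a threshold, sketched below. Everything specific here is an assumption: the feature values, the weights, the threshold, and the direction of the decision are hypothetical, and the MCE training that would produce the weights is not shown.

```python
import math


def weighted_geometric_mean(values, weights):
    """Weighted geometric mean of positive values, computed in the log domain."""
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and of equal length")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)) / total)


def is_music(features, weights, threshold):
    """Hypothetical decision: label the frame as music when the score exceeds the threshold."""
    return weighted_geometric_mean(features, weights) > threshold


# Hypothetical per-frame feature values and weights, for illustration only.
score = weighted_geometric_mean([2.0, 8.0], [3.0, 1.0])
decision = is_music([2.0, 8.0], [3.0, 1.0], threshold=2.5)
```

Working in the log domain turns the weighted product into a weighted sum, which keeps the computation numerically stable when many features are combined.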

Development of a Face Detection and Recognition System Using a RaspberryPi (라즈베리파이를 이용한 얼굴검출 및 인식 시스템 개발)

Kim, Kang-Chul;Wei, Hai-tong
- The Journal of the Korea institute of electronic communication sciences / v.12 no.5 / pp.859-864 / 2017
IoT is an emerging technology leading the 4th industrial revolution and has been widely used in industry and at home to improve the quality of human life. In this paper, an IoT-based face detection and recognition system for a smart elevator is developed. A Haar cascade classifier is used in the face detection system, and in the face recognition system a proposed PCA algorithm, written in Python, is implemented to reduce the execution time of calculating the eigenfaces. An SVM or the Euclidean metric is used to recognize the faces detected by the face detection system. The proposed system runs on a RaspberryPi 3. From the ORL face database, 200 sample images are used for training and 200 for testing. The simulation results show that the recognition rate is over 93% for PP+EU and over 96% for PP+SVM. The execution times of the proposed PCA and the conventional PCA are 0.11 sec and 1.1 sec, respectively, so the proposed PCA is much faster than the conventional one. The proposed system is suitable for an elevator monitoring system, a real-time home security system, and similar applications.
https://doi.org/10.13067/JKIECS.2017.12.5.859

Strip Rupture Detection System of Cold Rolling Mill using Transient Current Signal (과도 전류신호를 이용한 냉간 압연기의 판 터짐 검지 시스템)

Yang, S.W.;Oh, J.S.;Shim, M.C.;Kim, S.J.;Yang, B.S.;Lee, W.H.
- Journal of Power System Engineering / v.14 no.2 / pp.40-47 / 2010
This paper proposes a fault detection system that detects strip rupture in six-high stand cold rolling mills based on the transient current signal of an electric motor. A signal smoothing technique is used to highlight the features that distinguish the normal condition from the fault condition. Subtracting the smoothed signal from the original signal gives residuals that contain the information related to the normal or faulty condition. A discrete wavelet transform is then applied to the residual signal to obtain a signal that represents the fault features well. Feature extraction and classification are carried out using PCA, KPCA, and SVM. Actual data acquired from POSCO is used to validate the proposed method.
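
The residual step described in the abstract above (smoothed signal subtracted from the original) can be sketched as follows. The abstract does not say which smoothing technique was used, so the centered moving average here is an assumption, as are the window length and the toy signal; the later wavelet, PCA/KPCA, and SVM stages are not shown.

```python
def moving_average(signal, window):
    """Centered moving average; the window shrinks at the edges of the signal."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        segment = signal[max(0, i - half):min(len(signal), i + half + 1)]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def residual(signal, window):
    """Original signal minus its smoothed version, sample by sample."""
    smoothed = moving_average(signal, window)
    return [x - s for x, s in zip(signal, smoothed)]


# Toy current signal with one abrupt spike, for illustration only.
signal = [1.0, 1.0, 1.0, 10.0, 1.0, 1.0, 1.0]
res = residual(signal, window=3)
spike_index = max(range(len(res)), key=lambda i: abs(res[i]))
```

Slow trends survive smoothing and cancel out in the subtraction, so the residual mostly retains abrupt changes such as the spike in the toy signal.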

Malware Application Classification based on Feature Extraction and Machine Learning for Malicious Behavior Analysis in Android Platform (안드로이드 플랫폼에서 악성 행위 분석을 통한 특징 추출과 머신러닝 기반 악성 어플리케이션 분류)

Kim, Dong-Wook;Na, Kyung-Gi;Han, Myung-Mook;Kim, Mijoo;Go, Woong;Park, Jun Hyung
- Journal of Internet Computing and Services / v.19 no.1 / pp.27-35 / 2018
This paper studies the classification of malicious applications in the Android environment, together with threat and behavioral analysis of malicious Android applications. In addition, experiments were performed in which malicious apps were classified by machine learning. Android behavior analysis can use dynamic analysis tools, through which API calls, runtime logs, system resource information, and network information for an application can be extracted. We redefined the extracted properties for machine learning and evaluated the classification results by comparing the full feature set with the main features. The results show that using the key features improved performance by 1~4% over the full feature set; the SVM classifier in particular improved by 10%. From these results, we found that using the key features was more effective for the performance of the classification algorithms than using the full feature set. Selecting meaningful features from the data sets was also identified as important.
https://doi.org/10.7472/jksii.2018.19.1.27

Perceptual Color Difference based Image Quality Assessment Method and Evaluation System according to the Types of Distortion (인지적 색 차이 기반의 이미지 품질 평가 기법 및 왜곡 종류에 따른 평가 시스템 제안)

Lee, Jee-Yong;Kim, Young-Jin
- Journal of KIISE / v.42 no.10 / pp.1294-1302 / 2015
Many image quality assessment metrics that can precisely reflect the human visual system (HVS) have previously been researched. The Structural SIMilarity (SSIM) index is a remarkable HVS-aware metric that utilizes structural information, since the HVS is sensitive to the overall structure of an image. However, SSIM fails to deal with color difference in terms of the HVS. In order to solve this problem, the Structural and Hue SIMilarity (SHSIM) index has been selected with the Hue, Saturation, Intensity (HSI) model as a color space, but it cannot reflect the HVS-aware color difference between two color images. In this paper, we propose a new image quality assessment method for color images that uses the CIE Lab color space. In addition, using a support vector machine (SVM) classifier, we propose an optimization system that applies the optimal metric according to the type of distortion. To evaluate the proposed index, the LIVE database, the best-known database in the area of image quality assessment, is employed and four criteria are used. Experimental results show that the proposed index is more consistent with the other methods.
https://doi.org/10.5626/JOK.2015.42.10.1294
