• Title/Abstract/Keywords: Supervised Classification

421 search results; processing time 0.024 seconds

A Transfer Learning Method for Solving Imbalance Data of Abusive Sentence Classification

  • 서수인;조성배
    • 정보과학회 논문지
    • /
    • Vol. 44, No. 12
    • /
    • pp.1275-1281
    • /
    • 2017
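  • Classifying abusive sentences with a supervised learning approach requires training sentences labeled as abusive or not. A character-level convolutional neural network is well suited to abusive-sentence classification because it is robust to variations in individual characters, but it has the drawback of requiring a large amount of training data. To address this, this paper proposes a transfer learning method: a classifier based on a convolutional neural network is first trained on randomly generated abusive/non-abusive sentence pairs so that its filters are tuned to capture features of abusive language, and those filters are then reused when training on the actual training sentences. This is expected to reduce the effects of data scarcity and class imbalance and thereby improve classification performance. Experiments and evaluation were carried out on three datasets, and the classifier using the character-level convolutional neural network achieved a higher F1 score on every dataset when transfer learning was applied.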
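
The abstract above reports F1 rather than accuracy, which matters under class imbalance. The following is a minimal sketch of why, not the paper's evaluation code: the label lists are hypothetical toy data, and `f1_score` and `accuracy` are illustrative helper names.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive (here: abusive) class, computed from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced toy set: 2 abusive sentences (1) out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10           # a classifier that never flags abuse
one_hit = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # finds one of the two abusive sentences

print(accuracy(y_true, always_negative), f1_score(y_true, always_negative))
print(accuracy(y_true, one_hit), f1_score(y_true, one_hit))
```

A classifier that never predicts the minority class still scores high accuracy here while its F1 is zero, which is the reason F1 is the more informative metric for this kind of data.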

Extraction of the aquaculture farms information from the Landsat-TM imagery of the Younggwang coastal area

  • Shanmugam, P.;Ahn, Yu-Hwan;Yoo, Hong-Ryong
    • 한국GIS학회:학술대회논문집
    • /
    • Proceedings of the Korean GIS Society 2004 GIS/RS Joint Spring Conference
    • /
    • pp.493-498
    • /
    • 2004
  • The objective of the present study is to compare various conventional and recently evolved satellite image-processing techniques and to ascertain the best technique for accurately identifying and locating aquaculture farms in and around the Younggwang coastal area. Several conventional techniques applied to extract such information from the Landsat-TM imagery do not seem to yield reliable information about the aquaculture farms, and lead to misclassification. The large errors between the actual and extracted aquaculture farm information are due to spectral confusion and the inadequate spatial resolution of the sensor. This leads to the possible occurrence of mixture pixels, or 'mixels', which are a source of error in the classification techniques. Addressing the spectral confusion and mixture pixel problems requires the development of efficient methods that enable more reliable extraction of aquaculture farm information. Thus, more recently evolved methods, namely step-by-step partial spectral end-member extraction and linear spectral unmixing, are introduced. The former assumes that an end-member, often referred to as the 'spectrally pure signature' of a target feature, does not appear in a spectrally pure form but is always mixed with other features in certain proportions. The assumption of linear spectral unmixing is that the measured reflectance of a pixel is the linear sum of the reflectances of the mixture components that make up that pixel. The classification accuracy of the step-by-step partial end-member extraction improved significantly compared to that obtained from the traditional supervised classifiers. However, this method did not distinguish between aquaculture ponds and non-aquaculture ponds within the aquaculture farming areas. In contrast, the linear spectral unmixing model produced a set of fraction images for aquaculture, water, and soil. Of these, the aquaculture fraction yields good estimates of the proportion of aquaculture farm in each pixel. The acquired proportion was compared with NDVI values, and the two are positively correlated ($R^2 = 0.91$), indicating the reliability of the sub-pixel classification.
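
The linear mixing assumption stated in the abstract above can be sketched as a small least-squares problem. This is an illustrative sketch only, not the authors' procedure: the end-member reflectances are hypothetical numbers, the end-members are assumed to be linearly independent, and clipping negatives and rescaling the fractions to sum to one is an added assumption.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def unmix(pixel, endmembers):
    """Least-squares fractions f with pixel ~= sum_i f[i] * endmembers[i]."""
    k = len(endmembers)
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    # Normal equations (E E^T) f = E pixel, one end-member spectrum per row of E.
    gram = [[dot(endmembers[i], endmembers[j]) for j in range(k)] for i in range(k)]
    rhs = [dot(endmembers[i], pixel) for i in range(k)]
    fractions = solve_linear(gram, rhs)
    # Assumption: clip negative fractions and rescale so they sum to one.
    clipped = [max(f, 0.0) for f in fractions]
    total = sum(clipped)
    return [f / total for f in clipped]


# Hypothetical four-band reflectances for the three fraction classes.
aquaculture = [0.10, 0.12, 0.08, 0.30]
water = [0.06, 0.05, 0.03, 0.01]
soil = [0.20, 0.25, 0.30, 0.35]

# Synthetic mixed pixel: 50% aquaculture, 30% water, 20% soil.
pixel = [0.5 * a + 0.3 * w + 0.2 * s for a, w, s in zip(aquaculture, water, soil)]
fractions = unmix(pixel, [aquaculture, water, soil])
print([round(f, 3) for f in fractions])
```

Applied to every pixel, the first component would form an aquaculture fraction image of the kind the abstract describes.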


Efficient Learning and Classification for Vehicle Type using Moving Cast Shadow Elimination in Vehicle Surveillance Video

  • 신욱선;이창훈
    • 정보처리학회논문지B
    • /
    • Vol. 15B, No. 1
    • /
    • pp.1-8
    • /
    • 2008
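  • In surveillance video, moving objects are generally extracted using background subtraction or frame differencing. However, the shadows cast by objects can cause serious detection errors. In particular, when vehicle information is analyzed from images captured by surveillance cameras installed on roads, the shadow cast by a vehicle distorts the vehicle's shape and produces inaccurate results. Shadow removal is therefore essential for accurate object extraction in surveillance video. This paper removes the shadows attached to moving objects in order to improve vehicle-type classification of moving vehicles in road surveillance video. The object region that remains after shadow removal is fitted to a three-dimensional object using the vanishing point, and the measured data are then used for supervised learning to obtain the desired vehicle-type classification. The experiments use three machine learning methods (IBL, C4.5, and NN (neural network)) to evaluate the effect of shadow removal on vehicle-type classification performance.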
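
Background subtraction and the way a cast shadow inflates the detected region can be illustrated on a toy grayscale frame. This is a rough sketch, not the shadow-elimination method of the paper: the brightness-ratio shadow rule, the thresholds, and the function name `foreground_mask` are all assumptions chosen for illustration.

```python
def foreground_mask(background, frame, diff_threshold=30,
                    remove_shadows=True, shadow_ratio=(0.5, 0.9)):
    """Mark pixels that differ from the background, optionally dropping shadows.

    Assumed shadow rule: a shadow darkens the background by a moderate factor,
    so a pixel whose frame/background ratio lies in shadow_ratio is a shadow.
    """
    low, high = shadow_ratio
    mask = []
    for bg_row, fr_row in zip(background, frame):
        row = []
        for b, f in zip(bg_row, fr_row):
            moving = abs(f - b) > diff_threshold
            shadow = remove_shadows and b > 0 and low <= f / b <= high
            row.append(1 if moving and not shadow else 0)
        mask.append(row)
    return mask


# Toy 2x3 grayscale images: uniform road at intensity 200.
background = [[200, 200, 200],
              [200, 200, 200]]
# Second pixel is a shadow (120), third pixel is a dark vehicle body (40).
frame = [[200, 120, 40],
         [200, 200, 200]]

with_shadow = foreground_mask(background, frame, remove_shadows=False)
without_shadow = foreground_mask(background, frame)
print(with_shadow)
print(without_shadow)
```

Without the shadow test, the shadow pixel is counted as part of the vehicle, which is the shape distortion the abstract refers to.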

Optimal Parameter Extraction based on Deep Learning for Premature Ventricular Contraction Detection

  • 조익성;권혁숭
    • 한국정보통신학회논문지
    • /
    • Vol. 23, No. 12
    • /
    • pp.1542-1550
    • /
    • 2019
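  • Previous studies on arrhythmia classification have used methods such as artificial neural networks, fuzzy logic, and machine learning to improve classification accuracy. Deep learning in particular is the most widely used for arrhythmia classification with the error backpropagation algorithm, because it overcomes the limit on the number of hidden layers that constrains conventional neural networks. To apply a deep learning model to ECG signals, an appropriate model must be chosen and its parameters selected as close to optimal as possible. This study proposes a deep-learning-based method of optimal parameter extraction for detecting premature ventricular contraction (PVC) beats. To this end, R waves are first detected in the noise-removed ECG signal, and QRS and RR-interval segments are extracted. The weights are then learned by supervised deep learning, and the model is evaluated on validation data. To assess the validity of the proposed method, the training and validation accuracy of the deep learning model were checked for each parameter setting using the MIT-BIH arrhythmia database. In the performance evaluation, the average R-wave detection rate was 99.77% and the average PVC classification rate was 97.84%.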
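
The RR-interval step mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the R-peak positions are hypothetical sample indices, 360 Hz is the sampling rate of the MIT-BIH arrhythmia recordings, and the "shorter than 80% of the running mean" rule is an assumed heuristic standing in for the learned classifier.

```python
def rr_intervals(r_peaks, fs=360):
    """RR intervals in seconds from R-peak sample indices at sampling rate fs."""
    return [(b - a) / fs for a, b in zip(r_peaks, r_peaks[1:])]


def flag_premature(rr, ratio=0.8):
    """Flag an interval as premature if it is shorter than ratio * mean of
    all earlier intervals. Assumed heuristic, for illustration only."""
    flags = []
    for i, interval in enumerate(rr):
        if i == 0:
            flags.append(False)  # no history to compare against
            continue
        baseline = sum(rr[:i]) / i
        flags.append(interval < ratio * baseline)
    return flags


# Hypothetical R-peak positions (sample indices); the fourth beat arrives early.
r_peaks = [0, 360, 720, 936, 1368]
rr = rr_intervals(r_peaks)
flags = flag_premature(rr)
print(rr)
print(flags)
```

In a learned model such as the one the abstract describes, RR intervals like these would be input features alongside the QRS segments rather than being thresholded directly.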

Arabic Stock News Sentiments Using the Bidirectional Encoder Representations from Transformers Model

  • Eman Alasmari;Mohamed Hamdy;Khaled H. Alyoubi;Fahd Saleh Alotaibi
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 24, No. 2
    • /
    • pp.113-123
    • /
    • 2024
  • Stock market news sentiment analysis (SA) aims to identify the attitudes that news published on official platforms expresses toward companies' stocks. It supports sound investment decisions and analysts' evaluations. However, research on Arabic SA is limited compared to that on English SA due to the complexity of the Arabic language and its limited corpora. This paper develops a sentiment classification model to predict the polarity of Arabic stock news in microblogs. It also aims to extract the reasons that lead to the polarity categorization, as the main economic causes or aspects, based on semantic unity. To that end, this paper presents an Arabic SA approach based on the logistic regression model and the Bidirectional Encoder Representations from Transformers (BERT) model. The proposed model is used to classify articles as positive, negative, or neutral. It was trained on data collected from an official Saudi stock market article platform, which were then preprocessed and labeled. Moreover, the economic reasons for the articles based on semantic unit, divided into seven economic aspects to highlight the polarity of the articles, were investigated. The supervised BERT model obtained 88% article classification accuracy based on SA, and the unsupervised mean Word2Vec encoder obtained 80% economic-aspect clustering accuracy. Predicting the polarity of Arabic stock market news and its economic reasons would provide valuable benefits to the stock SA field.

Improving the Performance of Machine Learning Models for Anomaly Detection based on Vibration Analog Signals

  • 김재훈;엄상천;박철순
    • 산업경영시스템학회지
    • /
    • Vol. 47, No. 2
    • /
    • pp.1-9
    • /
    • 2024
  • New motor development requires high-speed load testing using dynamo equipment to calculate the efficiency of the motor. Abnormal noise and vibration may occur in test equipment rotating at high speed due to misalignment of the connecting shaft or looseness of the fixation, which may lead to safety accidents. In this study, three single-axis vibration sensors for the X, Y, and Z axes were attached to the surface of the test motor to measure vibration. Analog data collected from these sensors were used in classification models for anomaly detection. Since the classification accuracy was only around 93%, commonly used hyperparameter optimization techniques such as grid search, random search, and Bayesian optimization were applied to increase accuracy. In addition, the Response Surface Method based on Design of Experiments was also used for hyperparameter optimization. However, these methods were found to be limited in how much they could improve accuracy, because data sampled from an analog signal do not reflect the patterns hidden in the signal. Therefore, to capture pattern information in the sampled data, we computed descriptive statistics of the analog data, such as mean, variance, skewness, kurtosis, and percentiles, and applied them to the classification models. Classification models using descriptive statistics showed excellent performance improvement. The developed model can be used as a monitoring system that detects abnormal conditions during motor testing.
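
The descriptive-statistics features named in the abstract above can be computed per signal window roughly as follows. This is a sketch under assumptions: the window contents are toy values, the choice of quartiles as the percentiles is illustrative, and `describe` is a hypothetical helper name, not code from the study.

```python
import statistics


def describe(window):
    """Descriptive-statistics features for one window of sampled values."""
    n = len(window)
    mean = statistics.fmean(window)
    var = statistics.pvariance(window, mu=mean)  # population variance
    if var > 0:
        skew = sum((x - mean) ** 3 for x in window) / (n * var ** 1.5)
        kurt = sum((x - mean) ** 4 for x in window) / (n * var ** 2) - 3.0
    else:
        skew = kurt = 0.0  # constant window: shape statistics are undefined
    # Assumption: quartiles stand in for the "percentiles" of the abstract.
    q1, median, q3 = statistics.quantiles(window, n=4, method="inclusive")
    return {"mean": mean, "variance": var, "skewness": skew,
            "kurtosis": kurt, "p25": q1, "p50": median, "p75": q3}


# Toy window of vibration samples from one axis (hypothetical values).
features = describe([1.0, 2.0, 3.0, 4.0, 5.0])
print(features)
```

One such feature vector per axis and per window would replace the raw samples as classifier input.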

Spatio-temporal change detection of land-use and urbanization in rural areas using GIS and RS - Case studies of Yongin and Anseong regions -

  • 고옥결;김대식
    • 농업과학연구
    • /
    • Vol. 38, No. 1
    • /
    • pp.153-162
    • /
    • 2011
  • This study analyzed spatio-temporal changes in land use and urbanization in the Yongin and Anseong regions, Kyunggi Province, using three Landsat-5 TM images from 1990, 1996, and 2000. Remote sensing (RS) and geographic information system (GIS) techniques were used for image classification and result analysis. Six land-use types were classified using supervised maximum likelihood classification. In both study areas, land use changed significantly, especially through the decrease of arable land and forest and the increase of built-up area. Spatially, urban expansion in the Yongin region spread mainly along the national road and expressways, whereas in the Anseong region it showed an 'urban sprawl phenomenon' with an irregular, starfish-like shape. Temporally, the urban expansion showed disparity: the growth rates of urbanized area rose from the period 1990-1996 to 1996-2000 in both study areas. The increased built-up areas were converted mainly from paddy, dry vegetation, and forest.
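
Supervised maximum likelihood classification, as named in the abstract above, assigns each pixel to the class whose fitted Gaussian gives it the highest likelihood. The sketch below is a simplified illustration, not the study's workflow: it assumes equal class priors and independent bands (a diagonal covariance, whereas the usual formulation uses the full covariance matrix), and the training pixels are hypothetical two-band values.

```python
import math


def fit_class_stats(samples_by_class):
    """Per-class, per-band mean and variance from labeled training pixels."""
    stats = {}
    for label, samples in samples_by_class.items():
        n = len(samples)
        bands = list(zip(*samples))
        means = [sum(band) / n for band in bands]
        variances = [sum((x - m) ** 2 for x in band) / n
                     for band, m in zip(bands, means)]
        stats[label] = (means, variances)
    return stats


def log_likelihood(pixel, means, variances):
    """Gaussian log-likelihood with independent bands (simplifying assumption)."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
               for x, m, v in zip(pixel, means, variances))


def classify(pixel, stats):
    """Pick the class with the highest likelihood (equal priors assumed)."""
    return max(stats, key=lambda label: log_likelihood(pixel, *stats[label]))


# Hypothetical two-band training pixels for three of the land-use types.
training = {
    "paddy": [(40, 90), (44, 96), (42, 93)],
    "built-up": [(120, 80), (126, 86), (123, 83)],
    "forest": [(30, 140), (34, 146), (32, 143)],
}
stats = fit_class_stats(training)
print(classify((41, 92), stats), classify((124, 84), stats))
```

Classifying each image date this way and comparing the resulting maps is one way change detection of the kind described could be set up.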

Evolutionary Learning of Sigma-Pi Neural Trees and Its Application to Classification and Prediction

  • 장병탁
    • 한국지능시스템학회논문지
    • /
    • Vol. 6, No. 2
    • /
    • pp.13-21
    • /
    • 1996
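  • The need for and usefulness of higher-order neural networks have been well known since the early days of neural network research. However, because the number of terms grows rapidly as the order increases, such networks have been difficult to design and train. This paper presents an evolutionary learning method for efficiently constructing a higher-order neural network model suited to a given problem. The method uses a neural tree representation that combines sigma units and pi units. With an MDL-based fitness measure, the usefulness of the presented method is verified on classification and prediction problems.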


Assessment of Land Cover Changes from Protected Forest Areas of Satchari National Park in Bangladesh and Implications for Conservation

  • Masum, Kazi Mohammad;Hasan, Md. Mehedi
    • Journal of Forest and Environmental Science
    • /
    • Vol. 36, No. 3
    • /
    • pp.199-206
    • /
    • 2020
  • Satchari National Park is one of the most biodiverse forests in Bangladesh and home to many endangered flora and fauna. The park sequesters 206 tons of CO2 per hectare every year, which helps to mitigate climate issues. As people living near the area depend on this forest, degradation has become a regular phenomenon, destroying forest biodiversity by altering the forest cover. It is therefore important to map land cover quickly and accurately for the sustainable management of Satchari National Park. The main objective of this study was to obtain information on land cover change using remote sensing data. A combination of unsupervised NDVI classification and supervised maximum likelihood classification was used to produce the land cover map. The analysis showed that the land cover is gradually converting from one land use type to another: dense forest is becoming degraded forest or bare land. Although the conversion was slowed by the establishment of the national park on the study site, forecasting shows that this is not enough to mitigate forest degradation. Legal steps and proper management strategies should be taken to mitigate causes of degradation such as illegal felling.
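
NDVI, which the abstract above uses for classification, is computed per pixel as (NIR - Red) / (NIR + Red). The sketch below shows the index and a simple thresholding into the three cover types the abstract mentions; the threshold values and reflectances are hypothetical, and fixed thresholds are only a stand-in for the unsupervised classification the study performed.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def label_from_ndvi(value, dense=0.6, degraded=0.3):
    """Map an NDVI value to a cover type. Thresholds are hypothetical."""
    if value >= dense:
        return "dense forest"
    if value >= degraded:
        return "degraded forest"
    return "bare land"


# Hypothetical (NIR, Red) reflectance pairs for three pixels.
pixels = [(0.50, 0.10), (0.30, 0.15), (0.20, 0.18)]
labels = [label_from_ndvi(ndvi(nir, red)) for nir, red in pixels]
print(labels)
```

Comparing such label maps between acquisition dates gives the per-pixel conversions (for example, dense forest to degraded forest) that a land cover change analysis reports.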

A Method of Analyzing ECG to Diagnose Heart Abnormality utilizing SVM and DWT

  • Shdefat, Ahmed;Joo, Moonil;Kim, Heecheol
    • Journal of Multimedia Information System
    • /
    • Vol. 3, No. 2
    • /
    • pp.35-42
    • /
    • 2016
  • An electrocardiogram (ECG) signal gives a clear indication of whether the heart is healthy, and early notification of a cardiac problem could save the patient's life. Several methods have been proposed for diagnosing abnormalities in ECG waveforms. However, some of them lack accuracy in the diagnosis phase. In this research, we present an accurate and successive method for diagnosing abnormality through Discrete Wavelet Transform (DWT), QRS complex detection, and Support Vector Machine (SVM) classification, with an overall accuracy rate of 95.26%. DWT refers to any wavelet transform in which the wavelets are discretely sampled, while SVM is a model with an associated learning algorithm, based on supervised learning, that performs regression analysis and classification on data samples. We tested the ECG signals of 10 patients, in different file formats, collected from the PhysioNet database, to observe the accuracy level for each patient whose ECG data were processed. The results are presented in terms of accuracy, which ranged from 92.1% to 97.6%, and diagnosis status, classified as either normal or abnormal.
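
A single level of a discrete wavelet transform can be sketched with the Haar wavelet. The abstract above does not say which wavelet family was used, so Haar is an assumption made here for simplicity, and the input samples are toy values rather than ECG data.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]  # low-pass: local averages
    detail = [(a - b) / s for a, b in pairs]  # high-pass: local differences
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the Haar DWT."""
    s = math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


# Toy signal (hypothetical samples).
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 0.0, 2.0]
approx, detail = haar_dwt(signal)
reconstructed = haar_idwt(approx, detail)
print(approx)
print(detail)
```

The approximation coefficients carry the slow trend and the detail coefficients carry sharp changes, which is why wavelet coefficients are a common choice of features for locating QRS complexes before a classifier such as an SVM is applied.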