• Title/Abstract/Keywords: Unsupervised Classification

Search results: 276 items (processing time: 0.026 seconds)

Comparison between Possibilistic c-Means (PCM) and Artificial Neural Network (ANN) Classification Algorithms in Land use/ Land cover Classification

  • Ganbold, Ganchimeg;Chasia, Stanley
    • International Journal of Knowledge Content Development & Technology
    • /
    • Vol. 7, No. 1
    • /
    • pp.57-78
    • /
    • 2017
  • Several statistical classification algorithms are available for land use/land cover classification, but each carries a certain bias or compromise. Some methods, such as the parallelepiped approach in supervised classification, cannot classify continuous regions within a feature. Unsupervised classification, on the other hand, takes maximum advantage of the spectral variability in an image, but the maximally separable clusters in spectral space may not correspond to our perception of the important classes in a given study area. In this research, the output of an ANN algorithm was compared with that of Possibilistic c-Means (PCM), an improvement on fuzzy c-Means, on both moderate-resolution Landsat 8 and high-resolution Formosat 2 images. The Formosat 2 multispectral data come with an 8 m spatial resolution. This multispectral image data was resampled to 10 m in order to maintain a uniform ratio of 1:3 against the Landsat 8 image. Six classes were chosen for analysis: dense forest, eucalyptus, water, grassland, wheat, and riverine sand. In a standard false color composite (FCC), the six features reflected differently in the infrared region, with wheat producing the brightest pixel values. Signatures per class were therefore easily collected for all classifications. The outputs of both ANN and FCM were analyzed separately for accuracy, and an error matrix was generated to assess the quality and accuracy of the classification algorithms. When the results of the two methods are compared on a per-class basis, ANN had a crisper output than PCM, which yielded clusters with pixels, especially on the moderate-resolution Landsat 8 imagery.
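The error matrix mentioned in this abstract can be illustrated with a minimal sketch. The function names, class subset, and labels below are hypothetical and are not data or code from the paper; the sketch only shows how overall, producer's, and user's accuracy are usually derived from such a matrix.

```python
def error_matrix(reference, predicted, classes):
    """Build an error (confusion) matrix.

    Rows are reference (ground truth) classes, columns are predicted classes.
    """
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix


def accuracy_report(matrix):
    """Return overall accuracy plus per-class producer's and user's accuracy."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    producers, users = [], []
    for i in range(n):
        row_total = sum(matrix[i])                       # reference pixels of class i
        col_total = sum(matrix[r][i] for r in range(n))  # pixels predicted as class i
        producers.append(matrix[i][i] / row_total if row_total else 0.0)
        users.append(matrix[i][i] / col_total if col_total else 0.0)
    return correct / total, producers, users


# Hypothetical labels for illustration only (not results from the paper).
classes = ["water", "wheat", "grassland"]
reference = ["water", "water", "wheat", "wheat", "grassland", "grassland"]
predicted = ["water", "water", "wheat", "grassland", "grassland", "grassland"]

matrix = error_matrix(reference, predicted, classes)
overall, producers, users = accuracy_report(matrix)
print(matrix)
print(overall, producers, users)
```

Producer's accuracy corresponds to recall per class and user's accuracy to precision per class, which is why both are normally reported alongside the overall figure.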

Detection of Red Tide Patches using AVHRR and Landsat TM data

  • 정종철
    • 환경영향평가
    • /
    • Vol. 10, No. 1
    • /
    • pp.1-8
    • /
    • 2001
  • Red tides can be detected by satellite remote sensing either by detecting enhanced levels of chlorophyll pigment or by detecting changes in the spectral composition of pixels. Using chlorophyll concentration, however, is currently not effective for two reasons: 1) chlorophyll-a is a universal pigment of phytoplankton, and 2) no accurate chlorophyll algorithm for Case 2 waters is available yet. Here, a red band algorithm, classification, and PCA (Principal Component Analysis) techniques were applied to detect patches of Cochlodinium polykrikoides red tides that occurred in Korean waters in 1995. This dinoflagellate species appears dark red because its characteristic pigments absorb light most effectively in the blue and green wavelengths. In the satellite image, the brightness of red tide pixels was low in all three visible bands, making detection difficult. The red band algorithm was not good at detecting the red tide because of the reflectance of suspended sediments. For supervised classification, selecting training areas was difficult, while unsupervised classification was not effective in delineating the patches from surrounding pixels. PCA, on the other hand, gave a good qualitative discrimination of the distribution compared with actual observations.

Analysis of forest types and stand structures over Korean peninsula Using NOAA/AVHRR data

  • Lee, Seung-Ho;Kim, Cheol-Min;Oh, Dong-Ha
    • 대한원격탐사학회:학술대회논문집
    • /
    • 대한원격탐사학회, 1999, Proceedings of International Symposium on Remote Sensing
    • /
    • pp.386-389
    • /
    • 1999
  • In this study, visible and near-infrared channels of NOAA/AVHRR data were used to classify land use and vegetation types over the Korean peninsula. The analysis of forest stand structures and the prediction of forest productivity using satellite data were also reviewed. Land use and land cover classification was performed with unsupervised clustering methods. Monthly Normalized Difference Vegetation Index (NDVI) composite images were derived for April to November 1998 and used as temporal feature vectors in the clustering analysis. Interpreted visually, the classification result was satisfactory overall, as it matched the general land cover patterns well. However, the subclassification of forests into coniferous, deciduous, and mixed forests was considerably confused because of the low ground resolution of AVHRR data and the lack of a defined classification scheme. To investigate forest stand structures, digital forest type maps were used as ancillary data. The forest type maps, which were compiled and digitized by the Forestry Research Institute, were registered to AVHRR image coordinates. The two data sets were compared, and percent forest cover over the whole region was estimated by multiple regression analysis. With this method, other forest stand structure characteristics within the primary data pixels are expected to be extracted and estimated.
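As a rough illustration of how monthly NDVI images become temporal feature vectors, the sketch below computes NDVI = (NIR - RED) / (NIR + RED) per pixel and stacks the monthly values. The function names and reflectance values are hypothetical, and the monthly composites are assumed to be already built; the abstract does not describe the compositing procedure.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index for one pixel."""
    denom = nir + red
    return (nir - red) / denom if denom != 0 else 0.0


def temporal_feature_vectors(monthly_red, monthly_nir):
    """Stack monthly NDVI values into one feature vector per pixel.

    monthly_red / monthly_nir: one flat list of pixel values per month.
    Returns a list with one NDVI time series per pixel.
    """
    n_pixels = len(monthly_red[0])
    return [
        [ndvi(red[p], nir[p]) for red, nir in zip(monthly_red, monthly_nir)]
        for p in range(n_pixels)
    ]


# Hypothetical reflectances: 3 months x 2 pixels (illustration only).
monthly_red = [[0.10, 0.20], [0.10, 0.20], [0.05, 0.20]]
monthly_nir = [[0.30, 0.20], [0.50, 0.20], [0.45, 0.20]]

vectors = temporal_feature_vectors(monthly_red, monthly_nir)
print(vectors)
```

Each resulting vector (eight values for April to November in the study) can then be passed to any clustering routine as a point in feature space.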

Multi-Radial Basis Function SVM Classifier: Design and Analysis

  • Wang, Zheng;Yang, Cheng;Oh, Sung-Kwun;Fu, Zunwei
    • Journal of Electrical Engineering and Technology
    • /
    • Vol. 13, No. 6
    • /
    • pp.2511-2520
    • /
    • 2018
  • In this study, a Multi-Radial Basis Function Support Vector Machine (Multi-RBF SVM) classifier based on a composite kernel function is introduced. In the proposed classifier, the input space is divided into several local subsets to handle extremely nonlinear classification tasks. Each local subset is expressed as a nonlinear classification subspace and mapped into feature space by a kernel function. The composite kernel function employs a dual RBF structure. By capturing the nonlinear distribution knowledge of the local subsets, the training data are mapped into a higher-dimensional feature space, and the Multi-SVM classifier is then realized with the composite kernel function through an optimization procedure similar to that of a conventional SVM classifier. The original training data set is partitioned by unsupervised learning methods such as clustering. Three clustering methods are considered: Affinity Propagation (AP), Hard C-Means (HCM), and the Iterative Self-Organizing Data Analysis Technique Algorithm (ISODATA). Experimental results on benchmark machine learning datasets show that the proposed method efficiently improves classification performance.
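The abstract does not give the form of the composite kernel. Purely as an assumption for illustration, the sketch below combines two RBF kernels with different widths as a weighted sum; a convex combination of valid kernels is itself a valid kernel. The function names, gamma values, and weight are hypothetical and should not be read as the authors' formulation.

```python
from math import exp


def rbf(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * ||x - y||^2)."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def composite_rbf(x, y, gamma_wide=0.1, gamma_narrow=10.0, weight=0.5):
    """Assumed composite kernel: weighted sum of a wide and a narrow RBF.

    The wide kernel captures global structure, the narrow one local structure.
    This specific form is an illustrative assumption, not the paper's kernel.
    """
    return weight * rbf(x, y, gamma_wide) + (1.0 - weight) * rbf(x, y, gamma_narrow)


a, b = [0.0, 0.0], [1.0, 0.0]
print(composite_rbf(a, a), composite_rbf(a, b))
```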

Comparative Study of Tokenizer Based on Learning for Sentiment Analysis

  • 김원준
    • 품질경영학회지
    • /
    • Vol. 48, No. 3
    • /
    • pp.421-431
    • /
    • 2020
  • Purpose: This study compares and analyzes tokenizers in natural language processing for sentiment analysis of customer satisfaction. Methods: A supervised learning-based tokenizer, Mecab-Ko, and an unsupervised learning-based tokenizer, SentencePiece, were compared. Three algorithms (Naïve Bayes, k-Nearest Neighbor, and Decision Tree) were selected to compare the performance of each tokenizer, and three metrics (accuracy, precision, and recall) were used for the comparison. Results: Performance evaluation and verification confirmed that SentencePiece shows better classification performance than Mecab-Ko. To confirm the robustness of the results, independent t-tests were conducted on the evaluation results for the two tokenizers. The study confirmed that the classification performance of the SentencePiece tokenizer was high with the k-Nearest Neighbor and Decision Tree algorithms. In addition, Decision Tree showed slightly higher accuracy than the other two classification algorithms. Conclusion: The SentencePiece tokenizer can be used to classify and interpret customer sentiment in Korean online reviews more accurately. It also appears possible to assign a specific meaning to short words or jargon that users often employ when evaluating products but that are not defined in advance.

Statistical Approach to Noisy Band Removal for Enhancement of HIRIS Image Classification

  • Huan, Nguyen Van;Kim, Hak-Il
    • 대한원격탐사학회:학술대회논문집
    • /
    • 대한원격탐사학회, 2008 Spring Conference Proceedings
    • /
    • pp.195-200
    • /
    • 2008
  • The accuracy of pixel classification in HIRIS images is usually degraded by noisy bands, since noisy bands may deform the typical shape of the spectral reflectance. This paper proposes a statistical method for noisy band removal that mainly uses the correlation coefficients between bands. Treating each band as a random variable, the correlation coefficient measures the strength and direction of the linear relationship between two random variables. While the correlation between two signal bands is high, a noisy band produces a low correlation due to ill-correlativeness and undirectedness. The correlation coefficient is applied as a measure for detecting noisy bands under a two-pass screening scheme. The method does not depend on prior knowledge of the sensor or of the cause of the noise. Classification in the experiment uses the unsupervised k-nearest neighbor algorithm with the well-accepted Euclidean distance measure and the spectral angle mapper measure. The paper also proposes a hierarchical combination of these measures for spectral matching. Finally, a separability assessment based on the between-class and within-class scatter matrices is used to evaluate the performance.

The Clustering Application of Spectral Characteristics of Rock Samples from Ulsan

  • 박종남;김지훈
    • 대한원격탐사학회지
    • /
    • Vol. 6, No. 2
    • /
    • pp.115-133
    • /
    • 1990
  • The spectral characteristics of rock samples, including bentonites, collected from the northern Ulsan area were studied. The geology of the area consists mainly of sediments of the Kyongsang Series and Bulguksa granite, the Tertiary volcanics, andesites, and tuffs. Relative reflectances of meshed samples (2.5~10 mm) with respect to BaSO$_4$ were measured at 6 Landsat TM spectral windows (excluding the thermal band) with HHRR, and their reflection characteristics were analysed. In addition, three different data selection schemes, namely the Euclidean distance, multiple regression, and PCA weight methods, were applied to the 30 TM ratio channels derived from the above 6 bands. The selected data sets were subjected to two unsupervised classification techniques (FA and ISODATA) in order to compare their effectiveness in separating bentonite in particular from the other rocks. In the ISODATA analysis, the multiple regression model performed best, followed by the Euclidean distance model, while the PCA weight model showed some confusion. In FA, although quantitative analysis is difficult, the regression model still appears to be the best. Among the ratio bands, ratios of band 7 or 5 against other bands contribute most to separating bentonites from the other rocks.
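The figure of 30 ratio channels is consistent with taking every ordered pair of the 6 reflective bands (6 x 5 = 30); that reading is an assumption, since the abstract does not spell it out. A minimal sketch with hypothetical reflectance values:

```python
from itertools import permutations

# The six reflective Landsat TM bands; thermal band 6 is excluded.
TM_BANDS = (1, 2, 3, 4, 5, 7)


def ratio_channels(reflectance):
    """Compute all ordered band ratios for one sample.

    reflectance: dict mapping band number -> relative reflectance (non-zero).
    Returns a dict mapping (numerator_band, denominator_band) -> ratio.
    """
    return {
        (num, den): reflectance[num] / reflectance[den]
        for num, den in permutations(TM_BANDS, 2)
    }


# Hypothetical relative reflectances for one sample (illustration only).
sample = {1: 0.10, 2: 0.12, 3: 0.15, 4: 0.30, 5: 0.40, 7: 0.20}
ratios = ratio_channels(sample)
print(len(ratios), ratios[(7, 1)], ratios[(5, 4)])
```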

Opera Clustering: K-means on librettos datasets

  • 정하림;유주헌
    • 인터넷정보학회논문지
    • /
    • Vol. 23, No. 2
    • /
    • pp.45-52
    • /
    • 2022
  • With the development of artificial intelligence analysis methods, especially machine learning, many fields are widening their range of applications. In classical music, however, applying machine learning techniques remains difficult. Genre classification and music recommendation systems built on deep learning algorithms are actively used for general music, but not for classical music. In this paper, we attempted to classify opera within classical music. To this end, an experiment was conducted to determine which of the basic features of music (composer, period of composition, and emotional atmosphere) is the most suitable criterion. To generate emotional labels, we adopted zero-shot classification with four basic emotions: 'happiness', 'sadness', 'anger', and 'fear'. After the opera librettos are embedded with the doc2vec model, the optimal number of clusters is computed with the elbow method. The resulting four centroids are then used in k-means clustering to classify the unlabeled libretto datasets. We obtained an optimized clustering based on adjusted Rand index scores and compared the results with the annotated variables of the music. The four clusters computed by the machine after training were most similar to the grouping by period. Additionally, we verified that emotional similarity did not appear significantly between composer and period. Having found that period is the right criterion, we hope this makes it easier for music listeners to find music that suits their tastes.
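The elbow method with k-means can be sketched as below. This is a toy example on hypothetical 2-D points rather than doc2vec libretto embeddings, and the deterministic farthest-point initialization is an assumption made for reproducibility; it is not claimed to match the authors' setup.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda j: squared_distance(point, centroids[j]))


def kmeans(points, k, iterations=50):
    """Plain k-means; returns (labels, centroids, inertia)."""
    # Farthest-point initialization (assumed here; k-means++ is more common).
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append([sum(d) / len(members) for d in zip(*members)] if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


# Hypothetical points forming three well-separated groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [20, 0], [20, 1], [21, 0]]

# Elbow method: compute inertia for several k and look for the bend in the curve.
inertias = {k: kmeans(points, k)[2] for k in (1, 2, 3, 4)}
labels3, centroids3, inertia3 = kmeans(points, 3)
for k, value in inertias.items():
    print(k, round(value, 2))
```

On this toy data the inertia drops sharply up to k = 3 and only marginally afterwards, which is the bend the elbow method looks for.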

Artificial Intelligence for Clinical Research in Voice Disease

  • 석준걸;권택균
    • 대한후두음성언어의학회지
    • /
    • Vol. 33, No. 3
    • /
    • pp.142-155
    • /
    • 2022
  • Diagnosis using voice is non-invasive and can be implemented with various voice recording devices; it can therefore serve as a screening or diagnostic aid that helps clinicians with laryngeal voice disease. The development of artificial intelligence algorithms, such as machine learning led by the latest deep learning technology, began with binary classification that distinguishes normal from pathological voices and has since contributed to improving the accuracy of multi-class classification of various types of pathological voices. However, no conclusions that can be applied in the clinical field have been reached yet. Most studies on pathological voice classification have used the sustained short vowel /ah/, which is relatively easier to work with than continuous or running speech. Continuous speech, however, has the potential to yield more accurate results, because additional information can be obtained from the change in the voice signal over time. This review explains terms related to artificial intelligence research and surveys the latest trends in machine learning and deep learning algorithms; it also introduces the latest research results and limitations to provide future directions for researchers.

Developing a Multiclass Classification and Intelligent Matching System for Cold Rolled Steel Wire using Machine Learning

  • 이근원;이동건;권영준;조기훈;박성수;조기섭
    • 열처리공학회지
    • /
    • Vol. 36, No. 2
    • /
    • pp.69-76
    • /
    • 2023
  • In this study, we present a system that uses machine learning techniques to identify equivalent grades of standardized wire rod steel based on alloy composition. The system comprises two models, one based on a supervised multi-class classification algorithm and the other on an unsupervised autoencoder algorithm. Our evaluation showed that the supervised model performed better in terms of prediction stability and reliability of the prediction results. The system provides a useful tool for non-experts seeking similar grades of steel based on alloy composition.