• Title/Summary/Keyword: Supervised Classification

Unsupervised Classification of Landsat-8 OLI Satellite Imagery Based on Iterative Spectral Mixture Model (자동화된 훈련 자료를 활용한 Landsat-8 OLI 위성영상의 반복적 분광혼합모델 기반 무감독 분류)

  • Choi, Jae Wan;Noh, Sin Taek;Choi, Seok Keun
    • Journal of Korean Society for Geospatial Information Science / v.22 no.4 / pp.53-61 / 2014
  • Landsat OLI satellite imagery includes multiple multispectral bands, so it can be applied to a range of remote sensing tasks such as land cover mapping, urban area analysis, vegetation index extraction, and change detection. Land cover maps, in turn, are important for monitoring and analyzing land cover with GIS. In this paper, a land cover map is generated from Landsat OLI imagery and an existing land cover map. First, a training dataset is obtained automatically from the correlation between the existing land cover map and an unsupervised K-means classification result. Next, the spectral signature of each class is determined from the training data. Finally, an abundance map and a land cover map are generated with an iterative spectral mixture model. The experiment uses Landsat OLI imagery of the Cheongju area. Quantitative and visual comparison with the existing land cover map and with a supervised SVM classification result shows that the proposed method can produce a land cover map without a manually prepared training dataset.
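The automatic training step in this abstract can be illustrated with a minimal sketch. The abstract only says that training data come from the "correlation" between the existing map and the K-means result; the majority-overlap rule below is an assumption standing in for that step, and the function names, class names, and toy pixel values are all hypothetical.

```python
from collections import Counter, defaultdict


def cluster_to_class(cluster_ids, map_classes):
    """Map each K-means cluster to the existing-map class it overlaps most."""
    votes = defaultdict(Counter)
    for cluster, cls in zip(cluster_ids, map_classes):
        votes[cluster][cls] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}


def select_training_pixels(cluster_ids, map_classes):
    """Keep pixel indices where the cluster's majority class agrees with the map.

    Assumption: agreement between the two sources marks a reliable training pixel.
    """
    mapping = cluster_to_class(cluster_ids, map_classes)
    return [
        i
        for i, (cluster, cls) in enumerate(zip(cluster_ids, map_classes))
        if mapping[cluster] == cls
    ]


# Toy example: 8 pixels, 2 clusters, made-up class names.
clusters = [0, 0, 0, 1, 1, 1, 1, 0]
existing = ["water", "water", "urban", "forest", "forest", "forest", "water", "water"]
mapping = cluster_to_class(clusters, existing)
training = select_training_pixels(clusters, existing)
print(mapping)
print(training)
```

Pixels where the two sources disagree are simply dropped here; the spectral signatures for the mixture model would then be computed from the retained pixels only.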

A Study on Automatic Classification Technique of Malware Packing Type (악성코드 패킹유형 자동분류 기술 연구)

  • Kim, Su-jeong;Ha, Ji-hee;Lee, Tae-jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.5 / pp.1119-1127 / 2018
  • Most cyber attacks are carried out with malicious code. The damage is gradually expanding to IoT and CPS, so it is no longer limited to cyberspace and poses a serious threat to real life. Accordingly, various malicious code analysis techniques have appeared. Dynamic analysis has been widely used because it readily identifies the resulting malicious behavior, but it struggles with the growing number of anti-VM malware samples that do not run once they detect a VM environment. Static analysis, on the other hand, is hampered by various packing techniques. In this paper, we propose a malware classification technique that works for both known and unknown packers. To this end, we designed supervised and unsupervised learning models over features available in the PE structure, and verified the results on 98,000 samples. We expect that accurate analysis will become possible through analysis techniques customized for each class.

Analysis of Burn Severity in Large-fire Area Using SPOT5 Images and Field Survey Data (SPOT5영상과 현장조사자료를 융합한 대형산불지역의 피해강도 분석)

  • Won, Myoungsoo;Kim, Kyongha;Lee, Sangwoo
    • Korean Journal of Agricultural and Forest Meteorology / v.16 no.2 / pp.114-124 / 2014
  • To classify fire-damaged areas and analyze the burn severity of two large-fire areas (over 100 ha damaged) in 2011, three approaches were employed: supervised classification, unsupervised classification, and the Normalized Difference Vegetation Index (NDVI). Post-fire SPOT imagery was used to compute Maximum Likelihood (MLC), Minimum Distance (MIN), ISODATA, K-means, and NDVI results and to evaluate large-scale patterns of burn severity at spatial resolutions from 1 m to 5 m. Accuracy verification of burn severity from the satellite images gave an average overall accuracy of 88.38 % and a Kappa coefficient of 0.8147. To compare satellite-derived burn severity with field survey results, two large fire sites, Uljin and Youngduk, were selected as study areas, and forty-four sampling plots were assigned in each for the field survey. Burn severity in the study areas was estimated by analyzing burn severity (BS) classes from SPOT images taken one month after the fire. The applicability of the composite burn index (CBI) was validated with a correlation analysis between the field survey data and the burn severity classified from SPOT5, and with their confusion matrix. Field survey data and BS from SPOT5 were closely correlated in both Uljin (r = -0.544, p<0.01) and Youngduk (r = -0.616, p<0.01). These results support the proposed burn severity analysis as an adequate method for measuring the burn severity of large fire areas in Korea.
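For readers unfamiliar with the metrics quoted in this abstract, the sketch below shows how NDVI, overall accuracy, and Cohen's Kappa are conventionally computed. It uses the standard textbook definitions, not code from the paper, and the reflectance values and 2x2 confusion matrix are made-up toy numbers.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)


def overall_accuracy_and_kappa(matrix):
    """Compute overall accuracy and Cohen's Kappa from a square confusion matrix.

    matrix[i][j] = number of samples of reference class i classified as class j.
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Toy values only (hypothetical reflectances and a hypothetical burned/unburned matrix).
toy_ndvi = ndvi(nir=0.5, red=0.1)
accuracy, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
print(round(toy_ndvi, 4), round(accuracy, 4), round(kappa, 4))
```

Kappa discounts the agreement expected by chance, which is why it is reported alongside overall accuracy rather than instead of it.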

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.1-13 / 2015
  • As opinion mining in big data applications has come to the fore, much research has been done on unstructured data. Social media on the Internet generate unstructured or semi-structured data every second, often written in the natural languages we use in daily life. Many words in human languages have multiple meanings or senses, which makes it very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, which can produce results far from users' intentions. Although much progress has been made in recent years in improving search engines so that they return appropriate results, there is still considerable room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in the area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based. This paper presents a method that automatically generates a corpus for word sense disambiguation from the examples in existing dictionaries, avoiding expensive sense-tagging processes. The effectiveness of the method is tested with a Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. In the experiment, the Korean standard unabridged dictionary and the Sejong Corpus were evaluated both in combination and as separate entities, using cross validation. Only nouns, the target of word sense disambiguation, were selected. 93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because it was tagged with the sense indices defined by that dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating the sense vectors were added to the named entity dictionary of the Korean morphological analyzer. Using the extended named entity dictionary, term vectors were extracted from the input sentences. Given an extracted term vector and the sense vector model built during pre-processing, sense-tagged terms were determined by vector space model based word sense disambiguation. The study also shows the effectiveness of the corpus merged from the dictionary examples and the Sejong Corpus: precision and recall were better with the merged corpus. The study suggests that the method can practically enhance the performance of internet search engines and help determine the meaning of a sentence more accurately in natural language processing tasks related to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem. It assumes that all senses are independent. Although this assumption is unrealistic and ignores the correlation between attributes, the Naïve Bayes classifier is widely used because of its simplicity and is known in practice to be very effective in many applications, such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
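As a rough illustration of Naïve Bayes word sense disambiguation trained on sense-tagged example sentences, the sketch below scores each sense by its prior and the smoothed likelihood of the context words. It is a generic textbook version with add-one smoothing, not the paper's implementation; the English toy data, sense labels, and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count senses and context words from (sense, context_words) examples."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab


def disambiguate(model, context):
    """Return the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        # Add-one (Laplace) smoothing so unseen words do not zero out a sense.
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Toy sense-tagged examples for the ambiguous English noun "bank".
examples = [
    ("bank/finance", ["deposit", "money", "account"]),
    ("bank/finance", ["loan", "money"]),
    ("bank/river", ["river", "water", "shore"]),
]
model = train(examples)
print(disambiguate(model, ["money", "loan"]))
print(disambiguate(model, ["river", "water"]))
```

In the setting the abstract describes, the `examples` list would be filled from dictionary example sentences and corpus sentences tagged with the same sense indices, which is what makes merging the two sources straightforward.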

Experience Way of Artificial Intelligence PLAY Educational Model for Elementary School Students

  • Lee, Kibbm;Moon, Seok-Jae
    • International Journal of Internet, Broadcasting and Communication / v.12 no.4 / pp.232-237 / 2020
  • Given the recent pace of development and expansion of Artificial Intelligence (AI) technology, its influence and ripple effects on our lives will be very large and will spread rapidly. The National Artificial Intelligence R&D Strategy, published in 2019, emphasizes the importance of AI education for K-12 students. It also mentions STEM education, an AI convergence curriculum, and a budget for supporting the development of teaching materials and tools. However, because no AI curriculum has existed before, a new type of curriculum must be created, and attempts and discussions are moving quickly in all countries from almost the same starting line. In addition, there are no suitable instructors for K-12 students, and it is difficult to make K-12 students understand the concept of AI. In particular, it is difficult to teach elementary school students AI through professional programming, and it is also difficult to learn the tools that can teach AI concepts. In this paper, we propose an educational model that helps elementary school students improve their understanding of AI through play or experience. This is an experiential education model that combines exploratory learning and discovery learning, using multi-intelligence and the PLAY teaching-learning model, to help students understand the importance of training data, or the data required, in AI education. The model is designed so that students learn, through UA, how a computer that knows only binary numbers recognizes images. Through code.org, students learned to train AI robots in activities configured to convey data bias through play. In addition, students trained image models directly on a computer with TeachableMachine, a tool capable of supervised learning, to understand the concepts of dataset, learning process, and accuracy, and the process of AI inference is proposed.

Development of Retirement Prediction Model based on Work Life Profile Using Machine Learning Method (기계 학습 방법을 이용한 직장 생활 프로파일 기반의 퇴직 예측 모델 개발)

  • Yun, You-Dong;Lee, Seol-Hwa;Ji, Hye-Sung;Lim, Heui-Seok
    • The Journal of Korean Association of Computer Education / v.20 no.1 / pp.87-97 / 2017
  • Recently, much research has been done on the turnover and retirement intentions of organization members, as many companies recognize the negative impact of human resource outflow on the organization. However, most of these studies take the form of questionnaires, and studies of turnover and retirement intentions based on work life data are still lacking. In this study, we analyzed the factors affecting employee retirement based on the work life profile and built a retirement prediction model using a machine learning method. As a result, we identified various factors that were not covered in previous research. In addition, by producing a retirement prediction model with good performance, we established a basis for research that can offer solutions to the problem of human resource outflow.

Recognition of damage pattern and evolution in CFRP cable with a novel bonding anchorage by acoustic emission

  • Wu, Jingyu;Lan, Chengming;Xian, Guijun;Li, Hui
    • Smart Structures and Systems / v.21 no.4 / pp.421-433 / 2018
  • Carbon fiber reinforced polymer (CFRP) cable has good mechanical properties and corrosion resistance. However, anchoring CFRP cable is a major issue because of the anisotropic nature of CFRP material. In this article, a high-efficiency bonding anchorage with a novel configuration is developed for CFRP cables. The acoustic emission (AE) technique is employed to evaluate the performance of the anchorage in a fatigue test and a post-fatigue ultimate bearing capacity test. The AE signals obtained are analyzed using a combination of unsupervised K-means clustering and supervised K-nearest neighbor (K-NN) classification to quantify the performance of the anchorage and the damage evolution. An AE feature vector (including both frequency and energy characteristics of the AE signal) is proposed for clustering analysis, and under-sampling approaches are employed to reduce the influence of the imbalanced class distribution in the AE dataset and improve clustering quality. The results indicate that four classes exist in the AE dataset, corresponding to shear deformation of the potting compound, matrix cracking, fiber-matrix debonding, and fiber fracture in the CFRP bars. The AE intensity released by deformation of the potting compound is very slight throughout the loading process, and no obvious premature damage caused by the anchorage effect is observed in the CFRP bars at relatively low stress levels, indicating that the anchorage configuration in this study is reliable.
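The two-stage analysis in this abstract (unsupervised K-means to discover classes, then supervised K-NN to classify further signals) can be sketched as follows. This is a generic illustration, not the authors' pipeline: the two-dimensional "frequency, energy" features are made-up toy values, the deterministic centroid initialization is an arbitrary choice for reproducibility, and the under-sampling step is omitted.

```python
import math
from collections import Counter


def kmeans(points, k, iterations=20):
    """Plain K-means; centroids start at the first k points (arbitrary, deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels, centroids


def knn_predict(train_points, train_labels, query, k=3):
    """Classify a query by majority vote among its k nearest training points."""
    nearest = sorted(
        range(len(train_points)), key=lambda i: math.dist(query, train_points[i])
    )[:k]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]


# Toy AE feature vectors: (frequency, energy), arbitrary units.
signals = [(0.1, 0.2), (5.0, 5.1), (0.2, 0.1), (5.2, 4.9), (0.0, 0.3), (4.8, 5.0)]
labels, centroids = kmeans(signals, k=2)
new_signal_class = knn_predict(signals, labels, (0.15, 0.15))
print(labels, new_signal_class)
```

The cluster labels from the first stage serve as the training labels for the second, so later signals can be assigned to a damage class without re-running the clustering.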

Developing a Text Categorization System Based on Unsupervised Learning Using an Information Retrieval Technique (정보검색 기술을 이용한 비지도 학습 기반 문서 분류 시스템 개발)

  • Noh, Dae-Wook;Lee, Soo-Yong;Ra, Dong-Yul
    • Journal of KIISE:Software and Applications / v.34 no.2 / pp.160-168 / 2007
  • Developing a text classifier with supervised learning requires a large manually labeled corpus, which takes a lot of time and human effort to build. Recently, a research paradigm was proposed that uses a raw corpus and a small amount of seed information instead of a manually labeled corpus. In this paper we introduce an unsupervised learning method that achieves better performance than other related work. The characteristic of our approach is that average mutual information is used to learn representative words and their weights, and the weights are then updated using a technique inspired by work in information retrieval. By iterating this learning process, it was shown that a high-performance system can be developed.
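The abstract does not define its "average mutual information" precisely, so the sketch below uses the standard mutual information between a word's presence and the document category as a stand-in. The toy documents, category names, and the function name are hypothetical, and the iterative weight update inspired by information retrieval is not shown.

```python
import math
from collections import Counter


def mutual_information(docs, word):
    """Mutual information I(W; C) in bits between word presence W and category C.

    docs: list of (category, set_of_words) pairs.
    """
    n = len(docs)
    joint = Counter((word in words, category) for category, words in docs)
    presence_counts = Counter()
    category_counts = Counter()
    for (present, category), count in joint.items():
        presence_counts[present] += count
        category_counts[category] += count
    mi = 0.0
    for (present, category), count in joint.items():
        p_joint = count / n
        p_independent = (presence_counts[present] / n) * (category_counts[category] / n)
        mi += p_joint * math.log2(p_joint / p_independent)
    return mi


# Toy provisionally labeled documents (e.g. labels assigned from seed words).
docs = [
    ("sports", {"goal", "team"}),
    ("sports", {"team", "coach"}),
    ("politics", {"vote", "party"}),
    ("politics", {"party", "team"}),
]
mi_party = mutual_information(docs, "party")
mi_team = mutual_information(docs, "team")
print(round(mi_party, 4), round(mi_team, 4))
```

A word that appears in exactly one category scores highest, which makes it a candidate representative word; words spread across categories score lower.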

Monitoring the Vegetation Coverage Rate of Small Artificial Wetland Using Radio Controlled Helicopter (무선조종 헬리콥터를 이용한 소규모 인공 습지의 식생피복율 변화 모니터링)

  • Lee, Chun-Seok
    • Journal of the Korean Society of Environmental Restoration Technology / v.9 no.2 / pp.81-89 / 2006
  • The purpose of this study was to evaluate the applicability of a small RC (radio-controlled) helicopter and a single-lens reflex camera as an SFAP (Small Format Aerial Photography) acquisition system for monitoring the vegetation coverage of a wetland. The system used to photograph the small artificial wetland was a common optical camera (Nikon F80 with a manual lens of 28 mm focal length) attached to the bottom of an RC helicopter with a 50 cubic inch glow engine. Three hundred pictures were taken at an altitude of 50 m above the ground from 23 June to 7 September 2005. Four of the images were selected and scanned to digital images of 2048 × 1357 pixels. These images were processed and rectified with GCPs (Ground Control Points) and a digital map, and then classified with the supervised classification module of the image processing program PG-steamer Version 2.2. The major findings were as follows. 1. The final processed images had very high spatial resolution, so objects larger than 30 mm, such as lotus (Nelumbo nucifera), rocks, and the deck, were easily identified. 2. The dominant plants of the monitoring site were Monochoria vaginalis, Typha latifolia, Beckmannia syzigachne, etc. Because these species have long, narrow leaves and form irregular biomass, individual plants were hardly identifiable, but the distribution of populations was easily identified from color differences. 3. The area covered by vegetation increased rapidly during the first month of monitoring. At the beginning of monitoring, on 23 June 2005, vegetation covered only 34% of the area, but after 27 and 60 days it had increased to 74% and 86%, respectively.

Analysis Process based on Modify K-means for Efficiency Improvement of Electric Power Data Pattern Detection (전력데이터 패턴 추출의 효율성 향상을 위한 변형된 K-means 기반의 분석 프로세스)

  • Jung, Se Hoon;Shin, Chang Sun;Cho, Yong Yun;Park, Jang Woo;Park, Myung Hye;Kim, Young Hyun;Lee, Seung Bae;Sim, Chun Bo
    • Journal of Korea Multimedia Society / v.20 no.12 / pp.1960-1969 / 2017
  • Research is ongoing to identify and analyze patterns in electric power IoT data from sensor nodes in order to support a stable power supply and efficient energy consumption. This study proposes an analysis process for electric power IoT data based on the K-means algorithm, an unsupervised rather than supervised learning technique. The conventional K-means algorithm has several problems, one of which is that the number of clusters K is selected heuristically or at random. That approach is suited to the age of standardized data. The authors propose an analysis process that selects the number of clusters K automatically through principal component analysis and the space division of the normal distribution, and apply it to electric power IoT data. Performance evaluation shows that it outperformed the conventional algorithm in cluster classification and in the analysis of the pitch and roll values contained in the communication bodies of utility poles.