• Title/Summary/Keyword: semi-supervised learning

Gene ontology based semi-supervised clustering method (유전자 온톨로지를 활용한 반지도 클러스터링 기법)

  • Go, Song;Kim, Dae-Won
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2008.04a / pp.183-187 / 2008
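  • This paper presents a method that assigns prior-information values according to the degree of functional similarity between genes and uses that prior information during clustering. Genes, a real-world problem domain, each perform diverse functions, so the older approach of encoding prior information as binary values such as 1 and 0 cannot describe them well. The prior-information value needs to be determined by the degree of similarity between genes, and it is computed by analyzing the Gene Ontology constructed by biologists. The Gene Ontology classifies genes into functional categories, with more specific functions forming subcategories, which yields a large tree structure. The prior-information values obtained from the ontology analysis are continuous values between 0 and 1, and using them in the distance computation during clustering is shown to give superior results.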

Performance Improvement of LVQ Network for Pattern Classification (패턴 분류를 위한 LVQ 네트워크의 성능 개선)

  • 정경권;이정훈;김주웅;손동설;엄기환
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2003.05a / pp.245-248 / 2003
  • In this paper, we propose a learning method that improves the performance of the LVQ network by using the radius of a hypersphere defined over n-dimensional input vectors. The proposed method determines the reference vectors from the radius of the hypersphere that contains a set of n+1 input vectors of the same class. To verify the effectiveness of the proposed method, we performed experiments on Fisher's Iris data. The experimental results show that the proposed method considerably improves on the performance of the conventional LVQ network.
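For orientation, the conventional LVQ baseline mentioned in this abstract can be sketched with the standard LVQ1 update rule. This is a minimal illustration, not the authors' hypersphere-based method: the `[vector, label]` prototype layout, the function names, and the learning rate `lr` are assumptions made for the example.

```python
import math

def nearest(prototypes, x):
    """Return the index of the prototype closest to x (Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: math.dist(prototypes[i][0], x))

def lvq1_step(prototypes, x, label, lr=0.1):
    """Apply one standard LVQ1 update in place and return the winner index.

    prototypes: list of [vector, class_label] pairs (hypothetical layout).
    The winning prototype moves toward x when its class matches the label
    and away from x otherwise.
    """
    i = nearest(prototypes, x)
    w, w_label = prototypes[i]
    sign = 1.0 if w_label == label else -1.0
    prototypes[i][0] = [wi + sign * lr * (xi - wi) for wi, xi in zip(w, x)]
    return i

# Usage: one reference vector per class, then a single labeled sample.
protos = [[[0.0, 0.0], "a"], [[1.0, 1.0], "b"]]
lvq1_step(protos, [0.2, 0.0], "a", lr=0.5)
```

In this baseline the initial reference vectors are supplied by hand; the abstract describes choosing them from hypersphere radii instead, which the sketch does not attempt.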

Application of AI technology for various disaster analysis (다양한 재해분석을 위한 AI 기술적용 사례 소개)

  • Giha Lee;Xuan-Hien Le;Van-Giang Nguyen;Van-Linh Ngyen;Sungho Jung
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.97-97 / 2023
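  • AI techniques such as artificial neural networks (ANN), machine learning (ML), and deep learning (DL) are increasingly used in the disaster field. They have been applied and validated in diverse areas, including facility safety management linked to sensing data, disaster monitoring linked to remote sensing (algal blooms, landslides, wildfires, etc.), prediction of hydrological time series (water level, discharge, etc.), correction and prediction of radar and satellite precipitation data, and leak prediction in water supply and sewer pipe networks. This study introduces several disaster-analysis cases that use ML, DL, and physics-informed neural networks (PINNs), and discusses their usefulness and limitations. The main cases are: (1) detection of a disaster-damaged area (the Uljin wildfire) using SAR imagery and machine learning; (2) identification of landslide-risk areas (the Inje landslide) using national digital information; (3) correction and prediction of satellite precipitation data, and runoff analysis, using machine learning and deep learning techniques; and (4) an evaluation of the applicability of PINNs to numerical analysis for hydraulic problems (solving the one-dimensional Saint-Venant equations). In particular, PINNs, which are constrained to satisfy the governing (physical) equations rather than being trained only on input and output data, showed better simulation capability than artificial neural network models, and are judged to have very high potential for future use in numerical analysis such as complex hydraulic modeling.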

An Ensemble Clustering Algorithm based on a Prior Knowledge (사전정보를 활용한 앙상블 클러스터링 알고리즘)

  • Ko, Song;Kim, Dae-Won
    • Journal of KIISE:Software and Applications / v.36 no.2 / pp.109-121 / 2009
  • Although prior knowledge can improve clustering performance, the benefit depends on how it is used. In particular, when prior knowledge is used to construct the initial centroids of cluster groups, the similarities among the prior-knowledge objects should be taken into account. Even when some prior-knowledge objects share the same label, objects with low similarity to one another should be separated. Separating them keeps the initial centroids from being distorted by collisions between low-similarity objects. The separated prior knowledge can then be used in various ways, such as in different initializations. To apply association rules, the proposed method creates a sufficient number of cluster groups so that the initial centroids can be constructed from the separated prior knowledge. The ensemble of the resulting clusterings outperforms the approach in which the prior knowledge is not separated.

Wifi Fingerprint Calibration Using Semi-Supervised Self Organizing Map (반지도식 자기조직화지도를 이용한 wifi fingerprint 보정 방법)

  • Thai, Quang Tung;Chung, Ki-Sook;Keum, Changsup
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.2 / pp.536-544 / 2017
  • Wireless RSSI (Received Signal Strength Indication) fingerprinting is one of the most popular methods for indoor positioning because it provides reasonable accuracy while exploiting existing wireless infrastructure. However, radio map construction (also known as fingerprint calibration) is laborious and time-consuming, since precise physical coordinates and wireless signals have to be measured at multiple locations in the target environment. This paper proposes a method that builds the map from a combination of RSSIs without location information, collected in a crowdsourcing fashion, and a handful of labeled RSSIs, using a semi-supervised self-organizing map learning algorithm. Experiments on simulated data show promising results: the method recovers the full map effectively with only 1% of the RSSI samples from the fingerprint database.

An Experimental Study on AutoEncoder to Detect Botnet Traffic Using NetFlow-Timewindow Scheme: Revisited (넷플로우-타임윈도우 기반 봇넷 검출을 위한 오토엔코더 실험적 재고찰)

  • Koohong Kang
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.4 / pp.687-697 / 2023
  • Botnets, whose attack patterns are becoming more sophisticated and diverse, are recognized as one of the most serious cybersecurity threats today. This paper revisits the experimental results of botnet detection using an autoencoder, a semi-supervised deep learning model, on the UGR and CTU-13 data sets. To prepare the input vectors of the autoencoder, we create data points by grouping the NetFlow records into sliding windows based on source IP address and aggregating them to form features. In particular, we discover a simple power law: the number of data points that have a given flow-degree is proportional to the number of NetFlow records aggregated in them. Moreover, we show that this power law fits the real data very well, with correlation coefficients of 97% or higher. We also show that the power law affects the learning of the autoencoder and, as a result, influences botnet detection performance. Furthermore, we evaluate the performance of the autoencoder using the area under the Receiver Operating Characteristic (ROC) curve.
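The grouping-and-aggregation step described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's pipeline: the record keys (`ts`, `src`, `dst`, `bytes`), the three aggregated features, and the 60-second window are hypothetical, and the windows here are non-overlapping for brevity, whereas the abstract refers to sliding windows.

```python
from collections import defaultdict

def aggregate_windows(records, window=60):
    """Group flow records by (source IP, time window) and aggregate features.

    records: iterable of dicts with hypothetical keys
             'ts' (seconds), 'src', 'dst', 'bytes'.
    Returns {(src, window_index): feature dict}, one data point per key.
    """
    groups = defaultdict(list)
    for r in records:
        # Non-overlapping windows: index = floor(timestamp / window length).
        groups[(r["src"], int(r["ts"] // window))].append(r)

    features = {}
    for key, flows in groups.items():
        features[key] = {
            "n_flows": len(flows),                       # records aggregated
            "n_dsts": len({f["dst"] for f in flows}),    # distinct destinations
            "total_bytes": sum(f["bytes"] for f in flows),
        }
    return features

# Usage with a few made-up records.
records = [
    {"ts": 0, "src": "a", "dst": "x", "bytes": 10},
    {"ts": 30, "src": "a", "dst": "y", "bytes": 5},
    {"ts": 61, "src": "a", "dst": "x", "bytes": 1},
    {"ts": 10, "src": "b", "dst": "x", "bytes": 7},
]
feats = aggregate_windows(records, window=60)
```

Each resulting feature dict would then be turned into a numeric input vector for the autoencoder; the choice and scaling of features is left open here.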

Design and Implementation of WBI System for Test and Diagnoses based on WWW (WWW기반에서 테스트 및 진단을 위한 WBI 시스템의 설계 및 구현)

  • Kim, Du-Gyu;Lee, Jae-Mu
    • Journal of KIISE:Software and Applications / v.28 no.12 / pp.938-946 / 2001
  • The web offers an open environment whose flexibility has allowed it to be applied increasingly in education, but the WBI (Web Based Instruction) systems built on it have many limitations and problems as far as learning efficiency is concerned. In particular, existing web-based evaluation systems only report whether a learner's replies are 'correct' or 'incorrect' and present the evaluation results as scores. It is therefore difficult for learners to get more detailed information about their shortcomings and errors. What learners need is a web-based instruction system that diagnoses their comprehension status and provides the causes: why did the learners make the errors? In this paper, we propose the development of a web-based instruction system that learners can access with their browsers at any time and from anywhere. Our system analyzes learners' weak points and diagnoses the causes of errors, giving advice to learners and more detailed error information than existing systems. By accumulating user behaviors, it can provide relevant individualized information to learners.

An Enhanced Fuzzy ART Algorithm for Effective Image Recognition (효과적인 영상 인식을 위한 개선된 퍼지 ART 알고리즘)

  • Kim, Kwang-Baek;Park, Choong-Shik
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2007.06a / pp.262-267 / 2007
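  • In the fuzzy ART algorithm, the vigilance parameter acts as a radius when clustering patterns and determines the tolerance for mismatch between an arbitrary pattern and the stored patterns. When the vigilance parameter is large, an input is classified into a new category even if the input vector and the expected vector differ only slightly. Conversely, when the vigilance parameter is small, similarity is accepted even if the input vector and the expected vector differ considerably, so input vectors are classified only coarsely. Applying the algorithm to image recognition therefore has the drawback that the vigilance parameter must be set empirically. In addition, during connection-weight adjustment, information about stored patterns can be lost depending on the learning-rate setting, which lowers the recognition rate. To address these problems, this paper proposes an enhanced fuzzy ART algorithm that dynamically adjusts the vigilance parameter using fuzzy logic connective operators and, to adequately reflect the actual degree of distortion between stored patterns and training patterns, sets the learning rate from the frequency with which a node is selected as the winner and applies it to weight adjustment. To evaluate the proposed method, experiments were performed on English characters extracted from real English business cards. The method generated fewer clusters than the conventional ART1, ART2, and fuzzy ART algorithms and also achieved better recognition performance.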
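The role of the vigilance parameter in the baseline algorithm can be illustrated with a minimal sketch of standard fuzzy ART (category choice, vigilance test, weight update). It does not implement the dynamic vigilance or frequency-based learning rate proposed in the paper; the function names and the defaults for `rho` (vigilance), `alpha` (choice parameter), and `beta` (learning rate) are assumptions, and inputs are assumed to be nonzero with components in [0, 1].

```python
def fuzzy_and(a, b):
    """Component-wise minimum (the fuzzy AND operator)."""
    return [min(x, y) for x, y in zip(a, b)]

def norm1(v):
    """L1 norm of a vector with non-negative components."""
    return sum(v)

def present(weights, x, rho=0.75, alpha=0.001, beta=1.0):
    """Present input x and return the index of the category it joins.

    weights is mutated in place. A new category is appended when no stored
    category passes the vigilance test |x AND w| / |x| >= rho.
    """
    # Try categories in decreasing order of the choice function.
    order = sorted(
        range(len(weights)),
        key=lambda j: norm1(fuzzy_and(x, weights[j])) / (alpha + norm1(weights[j])),
        reverse=True,
    )
    for j in order:
        m = fuzzy_and(x, weights[j])
        if norm1(m) / norm1(x) >= rho:  # vigilance test
            weights[j] = [beta * mi + (1 - beta) * wi
                          for mi, wi in zip(m, weights[j])]
            return j
    weights.append(list(x))
    return len(weights) - 1

# Usage: a high vigilance splits inputs that a low vigilance would merge.
strict, loose = [], []
for x in ([0.5, 0.5], [0.1, 0.9]):
    present(strict, x, rho=0.9)
    present(loose, x, rho=0.5)
```

With these two inputs the strict setting ends up with two categories and the loose setting with one, which matches the behavior the abstract attributes to large and small vigilance values.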

Nearest-neighbor Rule based Prototype Selection Method and Performance Evaluation using Bias-Variance Analysis (최근접 이웃 규칙 기반 프로토타입 선택과 편의-분산을 이용한 성능 평가)

  • Shim, Se-Yong;Hwang, Doo-Sung
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.10 / pp.73-81 / 2015
  • The paper proposes a prototype selection method and evaluates the generalization performance of standard algorithms and prototype based classification learning. The proposed prototype classifier defines multidimensional spheres with variable radii within class areas and generates a small set of training data. The nearest-neighbor classifier uses the new training set for predicting the class of test data. By decomposing bias and variance of the mean expected error value, we compare the generalization errors of k-nearest neighbor, Bayesian classifier, prototype selection using fixed radius and the proposed prototype selection method. In experiments, the bias-variance changing trends of the proposed prototype classifier are similar to those of nearest neighbor classifiers with all training data and the prototype selection rates are under 27.0% on average.

Utilizing Local Bilingual Embeddings on Korean-English Law Data (한국어-영어 법률 말뭉치의 로컬 이중 언어 임베딩)

  • Choi, Soon-Young;Matteson, Andrew Stuart;Lim, Heui-Seok
    • Journal of the Korea Convergence Society / v.9 no.10 / pp.45-53 / 2018
  • Recently, studies about bilingual word embedding have been gaining much attention. However, bilingual word embedding with Korean is not actively pursued due to the difficulty in obtaining a sizable, high quality corpus. Local embeddings that can be applied to specific domains are relatively rare. Additionally, multi-word vocabulary is problematic due to the lack of one-to-one word-level correspondence in translation pairs. In this paper, we crawl 868,163 paragraphs from a Korean-English law corpus and propose three mapping strategies for word embedding. These strategies address the aforementioned issues including multi-word translation and improve translation pair quality on paragraph-aligned data. We demonstrate a twofold increase in translation pair quality compared to the global bilingual word embedding baseline.