• Title/Summary/Keyword: learning algorithms


An Outlier Cluster Detection Technique for Real-time Network Intrusion Detection Systems (실시간 네트워크 침입탐지 시스템을 위한 아웃라이어 클러스터 검출 기법)

  • Chang, Jae-Young;Park, Jong-Myoung;Kim, Han-Joon
    • Journal of Internet Computing and Services
    • /
    • v.8 no.6
    • /
    • pp.43-53
    • /
    • 2007
  • Intrusion detection systems (IDS) have recently evolved by combining the signature-based detection approach with the anomaly detection approach. Although signature-based IDS tools that utilize machine learning algorithms are in common use, they detect only network intrusions with already known patterns. An ideal IDS tool should always keep its signature database up to date. The system needs to generate signatures that detect new possible attacks while monitoring and analyzing incoming network data. In this paper, we propose a new outlier cluster detection algorithm with a density (or influence) function. Our method assumes that, in the context of network intrusion, an outlier is a kind of cluster with similar instances instead of a single object. Through extensive experiments using the KDD Cup 1999 intrusion detection dataset, we show that the proposed method outperforms the conventional outlier detection method using a Euclidean distance function, especially when attacks occur frequently.

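The idea that outliers can form a small cluster rather than isolated points can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a Gaussian influence function, a fixed density threshold, and a simple single-link grouping step, and every name and parameter value here (`sigma`, `threshold`, `link_radius`, the toy data) is hypothetical.

```python
import math

def density(point, points, sigma):
    """Sum of Gaussian influences of all points on `point` (assumed influence function)."""
    return sum(math.exp(-math.dist(point, q) ** 2 / (2 * sigma ** 2)) for q in points)

def low_density_clusters(points, sigma, threshold, link_radius):
    """Group low-density points that lie within `link_radius` of each other."""
    sparse = [p for p in points if density(p, points, sigma) < threshold]
    clusters = []
    for p in sparse:
        # Clusters that already contain a point close enough to p.
        linked = [c for c in clusters if any(math.dist(p, q) <= link_radius for q in c)]
        merged = [p] + [q for c in linked for q in c]
        clusters = [c for c in clusters if c not in linked] + [merged]
    return clusters

# Toy data: a dense 5x5 grid of "normal" points and three similar "attack" points far away.
normal = [(0.2 * i, 0.2 * j) for i in range(5) for j in range(5)]
attack = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
outlier_clusters = low_density_clusters(normal + attack, sigma=0.5, threshold=5.0, link_radius=0.5)
```

With these assumed settings the three attack points fall below the density threshold together and are reported as one group, while the dense grid is left alone.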

An Efficient Damage Information Extraction from Government Disaster Reports

  • Shin, Sungho;Hong, Seungkyun;Song, Sa-Kwang
    • Journal of Internet Computing and Services
    • /
    • v.18 no.6
    • /
    • pp.55-63
    • /
    • 2017
  • One of the purposes of Information Technology (IT) is to support the human response to natural and social problems, such as natural disasters and the spread of disease, and to improve the quality of human life. Climate change is occurring worldwide, natural disasters threaten the quality of life, and human safety is no longer guaranteed. IT must be able to support tasks related to disaster response and, more importantly, should be used to predict and minimize future damage. In South Korea, damage data is collected by each local government and then aggregated by the federal government. This data is included in the disaster reports that the federal government discloses for each disaster case, but it is difficult to obtain the raw damage data even for research purposes. To obtain the data, information extraction may be applied to the disaster reports. In the field of information extraction, most extraction targets are web documents, commercial reports, SNS text, and so on, and there is little research on information extraction for government disaster reports. These reports are mostly text, but the structure of each sentence is very different from that of news articles and commercial reports, so the features of government disaster reports should be carefully considered. In this paper, an information extraction method for South Korean government reports in the word format is presented. The method is based on patterns and dictionaries and provides some additional ideas for tokenizing the damage representations in the text. In the experiment, it achieves an F1 score of 80.2 on the test set, which is close to the best information extraction performance reported before the recent deep learning algorithms were applied.
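As a rough illustration of pattern-and-dictionary extraction and of how an F1 score is computed, the following sketch may help. It is not the method from the paper: the dictionary, the regular expression, the English sample sentence, and the gold annotations are all hypothetical.

```python
import re

# Hypothetical dictionary mapping surface terms to damage fields.
DAMAGE_TERMS = {"dead": "deaths", "injured": "injuries", "houses flooded": "flooded_houses"}

# Pattern: a number (commas allowed) followed by one of the dictionary terms.
PATTERN = re.compile(r"(\d[\d,]*)\s+(" + "|".join(map(re.escape, DAMAGE_TERMS)) + r")")

def extract_damage(text):
    """Return a set of (field, count) pairs found in the text."""
    return {(DAMAGE_TERMS[term], int(num.replace(",", ""))) for num, term in PATTERN.findall(text)}

def f1_score(predicted, gold):
    """Return (precision, recall, F1) for two sets of extracted items."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

report = "The typhoon left 3 dead and 12 injured, with 1,240 houses flooded."
predicted = extract_damage(report)
# Hypothetical gold annotations; the "missing" field has no matching pattern here.
gold = {("deaths", 3), ("injuries", 12), ("flooded_houses", 1240), ("missing", 1)}
precision, recall, f1 = f1_score(predicted, gold)
```

Here every extracted pair is correct but one gold item is missed, so precision is 1.0, recall is 0.75, and F1 is their harmonic mean.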

Building Living Lab for Acquiring Behavioral Data for Early Screening of Developmental Disorders

  • Kim, Jung-Jun;Kwon, Yong-Seop;Kim, Min-Gyu;Kim, Eun-Soo;Kim, Kyung-Ho;Sohn, Dong-Seop
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.8
    • /
    • pp.47-54
    • /
    • 2020
  • Developmental disorders are impairments of the brain and/or central nervous system and refer to disorders of brain function that affect language, communication skills, perception, sociality, and so on. In the diagnosis of developmental disorders, behavioral responses such as expressing emotions in the proper situation are among the observable indicators of whether or not an individual has a disorder. However, diagnosis by observation allows subjective evaluation that can lead to erroneous conclusions. This research presents a technological environment and data acquisition system for AI-based screening of autism disorder. The environment was built considering the activities of two screening protocols, namely the Autism Diagnostic Observation Schedule (ADOS) and Behavior Development Screening for Toddler (BeDevel). The activities between therapist and child during the screening are fully recorded. The proposed software was designed to support recording, monitoring, and data tagging for training AI algorithms.

Real-time Estimation on Service Completion Time of Logistics Process for Container Vessels (선박 물류 프로세스의 실시간 서비스 완료시간 예측에 대한 연구)

  • Yun, Shin-Hwi;Ha, Byung-Hyun
    • The Journal of Society for e-Business Studies
    • /
    • v.17 no.2
    • /
    • pp.149-163
    • /
    • 2012
  • Logistics systems provide their services to customers by coordinating resources with limited capacity across interrelated underlying processes. To maintain a high level of service under such complicated conditions, it is essential to monitor logistics processes in real time and to manage them continuously. In this study, we propose a method of estimating the service completion time of key processes based on process-state information collected in real time. We first identify the factors that influence the process completion time by modeling and analyzing an influence diagram, and then suggest algorithms for quantifying those factors. We consider container terminal logistics, specifically the process of discharging and loading containers for a vessel. The remaining service time of a vessel is estimated using a decision tree learned from historical data. We validated the estimation model using a container terminal simulation. The proposed model is expected to improve the competitiveness of logistics systems by forecasting service completion in real time, as well as to prevent the waste of resources.

Performance Evaluation of the Extraction Method of Representative Keywords by Fuzzy Inference (퍼지추론 기반 대표 키워드 추출방법의 성능 평가)

  • Rho Sun-Ok;Kim Byeong Man;Oh Sang Yeop;Lee Hyun Ah
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.10 no.1
    • /
    • pp.28-37
    • /
    • 2005
  • In our previous work, we suggested a method that extracts representative keywords from a few positive documents and assigns weights to them. To show the usefulness of the method, in this paper we evaluate the performance of a well-known classification algorithm called GIS (Generalized Instance Set) when it is combined with our method. In the GIS algorithm, generalized instances are built from training documents by a generalization function, and then the K-NN algorithm is applied to them. Here, our method is used as the generalization function. For comparison, the Rocchio and Widrow-Hoff algorithms are also used as generalization functions. Experimental results show that our method is better than the others when only positive documents are considered, but not when negative documents are considered as well.


Real Time Face Detection and Recognition using Rectangular Feature based Classifier and Class Matching Algorithm (사각형 특징 기반 분류기와 클래스 매칭을 이용한 실시간 얼굴 검출 및 인식)

  • Kim, Jong-Min;Kang, Myung-A
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.1
    • /
    • pp.19-26
    • /
    • 2010
  • This paper proposes a classifier based on rectangular features to detect faces in real time. The goal is to realize a strong detection algorithm that satisfies both computational efficiency and detection performance. The proposed algorithm consists of three stages: feature creation, classifier learning, and real-time facial domain detection. Feature creation organizes a feature set from the five proposed rectangular features and calculates the feature values efficiently by using SAT (Summed-Area Tables). Classifier learning creates classifiers hierarchically by using the AdaBoost algorithm. In addition, it achieves excellent detection performance by applying important face patterns repeatedly at the next level. Real-time facial domain detection finds facial domains rapidly and efficiently through the rectangular-feature-based classifier that was created. In addition, the recognition rate was improved by using the detected face domain as the input image and by applying the PCA and KNN algorithms with a Class to Class technique rather than the existing Point to Point technique.
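The efficiency gain from summed-area tables can be sketched as follows: once the table is built, the sum over any rectangle takes four lookups, so a rectangular feature costs a constant number of operations. This is a generic illustration, not the paper's implementation; the two-rectangle feature shown is a common example and is not necessarily one of the five features the authors propose, and the tiny image is made up.

```python
def summed_area_table(image):
    """Build a summed-area table with one extra row and column of zeros."""
    height, width = len(image), len(image[0])
    sat = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def rect_sum(sat, top, left, height, width):
    """Sum of the pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]

def two_rect_feature(sat, top, left, height, width):
    """Left half minus right half of a rectangle; `width` is assumed to be even."""
    half = width // 2
    return rect_sum(sat, top, left, height, half) - rect_sum(sat, top, left + half, height, half)

# Hypothetical 3x4 grayscale patch.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
sat = summed_area_table(image)
inner = rect_sum(sat, 1, 1, 2, 2)            # 6 + 7 + 10 + 11
feature = two_rect_feature(sat, 0, 0, 3, 4)  # left two columns minus right two columns
```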

Sparse Document Data Clustering Using Factor Score and Self Organizing Maps (인자점수와 자기조직화지도를 이용한 희소한 문서데이터의 군집화)

  • Jun, Sung-Hae
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.22 no.2
    • /
    • pp.205-211
    • /
    • 2012
  • Retrieved documents have to be transformed into a proper data structure for the clustering algorithms of statistics and machine learning. A popular data structure for document clustering is the document-term matrix, which holds the occurrence frequency of each term in each document. This matrix has a sparsity problem because most of its frequencies are 0, and this sparseness degrades clustering performance. This research therefore uses factor scores obtained by factor analysis to solve the sparsity problem in document clustering. The document-term matrix is transformed into a document-factor score matrix using the factor scores, and the document-factor score matrix is used as input data for document clustering. To compare the clustering performance of the document-term matrix and the document-factor score matrix, this research applies both types of matrices to self-organizing map (SOM) clustering.
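The sparsity problem is easy to see on a toy corpus. The sketch below builds a document-term matrix with whitespace tokenization and measures the fraction of zero cells; the three documents are hypothetical, and the factor analysis and SOM steps described in the abstract are not implemented here.

```python
from collections import Counter

def document_term_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing terms count as 0
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix

def sparsity(matrix):
    """Fraction of cells in the matrix that are zero."""
    cells = sum(len(row) for row in matrix)
    zeros = sum(value == 0 for row in matrix for value in row)
    return zeros / cells

# Hypothetical three-document corpus.
documents = [
    "neural network learning",
    "document clustering with maps",
    "network intrusion detection",
]
vocabulary, matrix = document_term_matrix(documents)
zero_fraction = sparsity(matrix)
```

Even with three short documents, 17 of the 27 cells are zero; real corpora with thousands of terms are far sparser, which is what motivates replacing the term columns with a small number of factor scores.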

Self-Organizing Polynomial Neural Networks Based on Genetically Optimized Multi-Layer Perceptron Architecture

  • Park, Ho-Sung;Park, Byoung-Jun;Kim, Hyun-Ki;Oh, Sung-Kwun
    • International Journal of Control, Automation, and Systems
    • /
    • v.2 no.4
    • /
    • pp.423-434
    • /
    • 2004
  • In this paper, we introduce a new topology of Self-Organizing Polynomial Neural Networks (SOPNN) based on a genetically optimized Multi-Layer Perceptron (MLP) and discuss its comprehensive design methodology involving mechanisms of genetic optimization. Let us recall that the design of the 'conventional' SOPNN uses the extended Group Method of Data Handling (GMDH) technique to exploit polynomials as well as to consider a fixed number of input nodes at polynomial neurons (or nodes) located in each layer. However, this design process does not guarantee that the conventional SOPNN generated through learning results in an optimal network architecture. The design procedure applied in the construction of each layer of the SOPNN deals with its structural optimization involving the selection of preferred nodes (or PNs) with specific local characteristics (such as the number of input variables, the order of the polynomials, and input variables) and addresses specific aspects of parametric optimization. An aggregate performance index with a weighting factor is proposed in order to achieve a sound balance between the approximation and generalization (predictive) abilities of the model. To evaluate the performance of the GA-based SOPNN, the model is tested on pH neutralization process data as well as sewage treatment process data. A comparative analysis indicates that the proposed SOPNN has higher accuracy and better predictive capability than other intelligent models presented previously.

Comparison on Effectiveness of SW Education using Robots based on Narrative-Paper Art Activities (내러티브-종이아트 활동 기반 로봇활용 SW교육 효과성 비교)

  • Sohn, Kyungjin;Han, JeongHye
    • Journal of The Korean Association of Information Education
    • /
    • v.22 no.4
    • /
    • pp.419-425
    • /
    • 2018
  • The national curriculum includes the problem solving process, algorithms, and programming in SW education. Education using robots is an attractive alternative for students who have no interest in SW or are poor at programming. We developed courseware using robots for SW education, based on paper art activities with narrative storytelling, to enhance students' creative thinking and problem solving within the limited class time available in schools. We applied the courseware and obtained pre- and post-test results on the creative problem solving ability of third graders in elementary school. The four factors of creative problem solving showed significant increases. In addition, the courseware had significant effects on understanding of robot technology and on attitudes toward learning SW or programming using robots.

Combined Artificial Bee Colony for Data Clustering (융합 인공벌군집 데이터 클러스터링 방법)

  • Kang, Bum-Su;Kim, Sung-Soo
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.40 no.4
    • /
    • pp.203-210
    • /
    • 2017
  • Data clustering is one of the most difficult and challenging problems and can be formally considered a particular kind of NP-hard grouping problem. The K-means algorithm is one of the most popular and widely used clustering methods because it is easy to implement and very efficient. However, it is likely to become trapped in a local optimum, and for large data sets its solutions vary widely with different initial values. Therefore, an efficient computational intelligence method is needed to find the global optimal solution to the data clustering problem within limited computational time. The objective of this paper is to propose a combined artificial bee colony (CABC) method that uses K-means for initialization and finalization to find optimal solutions effectively for the data clustering optimization problem. The artificial bee colony (ABC) is an algorithm motivated by the intelligent behavior exhibited by honeybees when searching for food. The performance of ABC is better than or similar to that of other population-based algorithms, with the added advantage of employing fewer control parameters. Our proposed CABC method is able to provide a near-optimal solution within reasonable time by balancing converged and diversified searches. We validate the performance of CABC against previous studies using the Iris, Wine, Glass, Vowel, and Cloud datasets from the UCI machine learning repository. The experiments and analysis demonstrate that CABC is competitive with previous partitioning approaches in solution quality. Our proposed KABCK (K-means+ABC+K-means) is better than ABCK (ABC+K-means), KABC (K-means+ABC), ABC, and K-means in our simulations.
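The sensitivity of K-means to its initial centroids, which motivates pairing it with a global search such as ABC, can be shown with a minimal sketch. The code below is plain Lloyd's algorithm only; it does not implement ABC or the CABC method, and the four-point data set and both initializations are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, max_iter=100):
    """Run Lloyd's algorithm from the given initial centroids; return (centroids, sse)."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster; keep it if the cluster is empty.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse

# Corners of a wide rectangle: the natural split is left pair versus right pair.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, sse_good = kmeans(points, [(0, 0), (4, 0)])  # converges to the left/right split
_, sse_bad = kmeans(points, [(2, 0), (2, 1)])   # stays stuck in a top/bottom split
```

Both runs converge, but the second one stops at a local optimum with a sum of squared errors of 16 instead of 1, which is the kind of outcome a better initialization or a global search step is meant to avoid.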