• Title/Summary/Keyword: Machine learning algorithm


Detection of API(Anomaly Process Instance) Based on Distance for Process Mining (프로세스 마이닝을 위한 거리 기반의 API(Anomaly Process Instance) 탐지법)

  • Jeon, Daeuk;Bae, Hyerim
    • Journal of Korean Institute of Industrial Engineers / v.41 no.6 / pp.540-550 / 2015
  • There have been many attempts to extract knowledge from data using conventional statistics, data mining, artificial intelligence, machine learning, and pattern recognition. In those research areas, knowledge is approached in two ways: researchers either discover knowledge represented by general features for universal recognition, or they discover exceptional and distinctive features. In process mining, an instance is sequential information bound by a case ID, known as a process instance. An exceptional process instance can cause problems for analysis and discovery algorithms. Hence, in this paper we develop a method to detect exceptional and distinctive features when performing process mining. We propose an anomaly detection method named Distance-based Anomaly Process Instance Detection (DAPID), which uses the distance between process instances. DAPID contributes to discovering the distinctive characteristics of process instances. To verify the proposed methodology, we discovered characteristics of exceptional situations from log data. Additionally, we experimented on real data from a domestic port terminal to demonstrate the methodology.
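
The idea of scoring process instances by their distance to one another can be sketched as follows. This is an illustrative sketch only, not the DAPID method itself: the abstract does not state which distance measure or decision rule the authors use, so the edit distance between activity sequences, the k-nearest-neighbour score, the threshold, and all function names here are assumptions.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two activity sequences (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x with y
        prev = curr
    return prev[-1]


def anomaly_scores(traces: List[Sequence[str]], k: int = 2) -> List[float]:
    """Mean distance from each trace to its k nearest other traces.

    Assumes at least two traces and k >= 1.
    """
    scores = []
    for i, trace in enumerate(traces):
        dists = sorted(edit_distance(trace, other)
                       for j, other in enumerate(traces) if j != i)
        nearest = dists[:k]
        scores.append(sum(nearest) / len(nearest))
    return scores


def detect_anomalies(traces: List[Sequence[str]], k: int = 2,
                     threshold: float = 2.0) -> List[int]:
    """Indices of traces whose score exceeds a hypothetical threshold."""
    return [i for i, s in enumerate(anomaly_scores(traces, k)) if s > threshold]


# Toy event log: each inner list is one process instance (one case ID).
traces = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["A", "B", "D"],
    ["D", "X", "X", "A", "Y"],  # unlike every other instance
]
print(detect_anomalies(traces))  # -> [4]
```

In this toy log the last instance sits five edits away from every other instance, so its score is far above those of the regular instances, whose scores are at most 1.5.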

The Adaptive-Neuro Controller Design of Industrial Robot Using TMS320C3X Chip (TMS320C30칩을 사용한 산업용 로봇의 적응-신경제어기 설계)

  • 하석흥
    • Proceedings of the Korean Society of Machine Tool Engineers Conference / 1999.10a / pp.162-169 / 1999
  • This paper presents a new adaptive-neuro control scheme for real-time control of a robot manipulator using digital signal processors (DSPs). DSPs are microprocessors developed particularly for computations on variables. Digital versions of most advanced control algorithms can be defined as sums and products of measured variables, so they can be programmed and executed on DSPs. In addition, DSPs are as fast in computation as most 32-bit microprocessors, yet cost a fraction of their price. These features make DSPs a viable computational tool for the digital implementation of sophisticated controllers. Unlike the well-established theory for the adaptive control of linear systems, relatively little general theory exists for the adaptive control of nonlinear systems. Adaptive control is essential for stable and robust performance in robot control applications. The proposed neuro-control algorithm learns through a model-based error back-propagation scheme, using the Lyapunov stability analysis method. Simulation and experiment show the proposed adaptive-neuro control scheme to be an efficient scheme for real-time control of a robot system.


CS-RANSAC Algorithm using Machine Learning Technique (머신러닝 기법올 적용한 CS-RANSAC 알고리즘)

  • Ko, Seunghyun;Yoon, Ui-Nyoung;Alikhanov, Jumabek;Jo, Geun-Sik
    • Proceedings of the Korea Information Processing Society Conference / 2016.10a / pp.632-635 / 2016
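  • In augmented reality, an accurate homography matrix must be estimated to reduce the sense of mismatch between the video and the augmented content, and the RANSAC algorithm is widely used for this estimation. However, because RANSAC repeats a random sampling process, it performs unnecessary computation, which lowers the algorithm's efficiency. The DCS-RANSAC algorithm was proposed to overcome this drawback. DCS-RANSAC classifies images into groups according to their feature-point distribution patterns and applies a constraint satisfaction problem to each group, reducing unnecessary computation and improving accuracy. However, the image group data used by DCS-RANSAC was classified manually and intuitively, and because its feature-point distribution patterns are not diverse, classification accuracy sometimes degrades. To solve this problem, this paper proposes the MCS-RANSAC algorithm, which classifies images automatically using a machine learning technique and applies different constraints to each group. The proposed algorithm uses machine learning to classify images in a preprocessing step and applies constraints to the classified images, reducing processing time and improving accuracy. In experiments, compared with DCS-RANSAC, the proposed MCS-RANSAC reduced execution time by about 6%, reduced the homography error rate by about 15%, and increased the inlier ratio by 2.8%.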
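
The repeated random sampling that the abstract identifies as the cost of plain RANSAC can be seen in a minimal sketch. To stay short and dependency-free, this example fits a 2D line rather than a homography, and it includes none of the constraint or image-classification steps of DCS-RANSAC or MCS-RANSAC. The function names, tolerance, iteration count, and data are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float, float]


def fit_line(p: Point, q: Point) -> Line:
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def ransac_line(points: List[Point], iterations: int = 200,
                tol: float = 0.1, seed: int = 0
                ) -> Tuple[Optional[Line], List[Point]]:
    """Plain RANSAC: sample a minimal set, fit, count inliers, keep the best."""
    rng = random.Random(seed)
    best_model: Optional[Line] = None
    best_inliers: List[Point] = []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)   # minimal sample for a line
        if p == q:                     # duplicate points define no line
            continue
        a, b, c = fit_line(p, q)
        # A point is an inlier if its distance to the line is within tol.
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b, c), inliers
    return best_model, best_inliers


# Ten points on y = 2x + 1 plus three outliers.
on_line = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outliers = [(2.0, 9.0), (5.0, 0.0), (7.0, 3.0)]
model, inliers = ransac_line(on_line + outliers)
print(len(inliers), "inliers out of", len(on_line) + len(outliers))
```

Every iteration refits and rescans all points, even after a good model has been found. This is the redundant work that constraint-based variants aim to cut down.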

HKIB-20000 & HKIB-40075: Hangul Benchmark Collections for Text Categorization Research

  • Kim, Jin-Suk;Choe, Ho-Seop;You, Beom-Jong;Seo, Jeong-Hyun;Lee, Suk-Hoon;Ra, Dong-Yul
    • Journal of Computing Science and Engineering / v.3 no.3 / pp.165-180 / 2009
  • The HKIB, or Hankookilbo, test collections are two archives of Korean newswire stories manually categorized with semi-hierarchical or hierarchical category taxonomies. The base newswire stories were made available by the Hankook Ilbo (The Korea Daily) for research purposes. First, Chungnam National University and KISTI collaborated to manually tag 40,075 news stories with categories using a semi-hierarchical, balanced three-level classification scheme, in which each news story has only one level-3 category (single-labeling). We refer to this original data set as the HKIB-40075 test collection. Then Yonsei University and KISTI collaborated to select 20,000 newswire stories from the HKIB-40075 test collection, to rearrange the classification scheme to be fully hierarchical but unbalanced, and to assign one or more categories to each news story (multi-labeling). We refer to this modified data set as the HKIB-20000 test collection. We benchmark a k-NN categorization algorithm on both HKIB-20000 and HKIB-40075, illustrating properties of the collections, providing baseline results for future studies, and suggesting new directions for further research on the Korean text categorization problem.
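
A k-NN text categorizer of the general kind benchmarked here can be sketched in a few lines. This is a generic single-label sketch under stated assumptions, not the authors' benchmark setup: the whitespace tokenizer, raw term counts, cosine similarity, majority vote, and the toy English documents are all illustrative choices, and none of the data comes from the HKIB collections.

```python
import math
from collections import Counter
from typing import List, Tuple


def vectorize(text: str) -> Counter:
    """Bag-of-words term counts using a naive whitespace tokenizer."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_categorize(train: List[Tuple[str, str]], query: str, k: int = 3) -> str:
    """Label the query with the majority category of its k nearest documents."""
    q = vectorize(query)
    ranked = sorted(train, key=lambda ex: cosine(vectorize(ex[0]), q),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set (hypothetical documents and categories).
train = [
    ("stock market shares fall", "economy"),
    ("central bank raises interest rates", "economy"),
    ("bank profits and stock prices rise", "economy"),
    ("team wins the league final", "sports"),
    ("striker scores in the final match", "sports"),
    ("coach praises the team after match", "sports"),
]
print(knn_categorize(train, "stock prices fall as bank rates rise"))  # -> economy
print(knn_categorize(train, "the team scores in the final"))          # -> sports
```

A multi-label collection such as HKIB-20000 would need a different decision rule, for example keeping every category whose vote share passes a cutoff instead of only the top one.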

Face Recognition using 2D-PCA and Image Partition (2D - PCA와 영상분할을 이용한 얼굴인식)

  • Lee, Hyeon Gu;Kim, Dong Ju
    • Journal of Korea Society of Digital Industry and Information Management / v.8 no.2 / pp.31-40 / 2012
  • Face recognition refers to the process of identifying individuals based on their facial features. It has recently become one of the most popular research areas in computer vision, machine learning, and pattern recognition because it spans numerous consumer applications, such as access control, surveillance, security, credit-card verification, and criminal identification. However, illumination variation on faces generally degrades the performance of face recognition systems in practical environments. Thus, this paper proposes a novel face recognition system using a fusion approach based on the local binary pattern and two-dimensional principal component analysis. To minimize illumination effects, the face image undergoes the local binary pattern operation, and the resulting image is divided into two sub-images. Then, the two-dimensional principal component analysis algorithm is applied separately to each sub-image. The individual scores obtained from the two sub-images are integrated using a weighted-summation rule, and the fused score is used to classify the unknown user. The proposed system was evaluated on the Yale B and CMU-PIE databases, and the proposed method shows better recognition results than existing face recognition techniques.
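
Two steps of the pipeline described above, the local binary pattern operation and the weighted-summation fusion of two scores, can be sketched as follows. This is a basic 3x3 LBP under assumed conventions (clockwise bit order starting at the top-left neighbour, `>=` comparison) and a hypothetical fusion weight. The two-dimensional PCA step is omitted, and the paper's exact variant and weights are not given in the abstract.

```python
from typing import List

Image = List[List[int]]

# Clockwise neighbour offsets (row, col) starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp(image: Image) -> Image:
    """Basic 3x3 local binary pattern for every interior pixel."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # Set the bit when the neighbour is at least as bright.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row.append(code)
        out.append(row)
    return out


def fuse_scores(score_a: float, score_b: float, weight_a: float = 0.5) -> float:
    """Weighted-summation rule for two sub-image match scores."""
    return weight_a * score_a + (1.0 - weight_a) * score_b


patch = [[1, 2, 3],
         [8, 5, 4],
         [7, 6, 9]]
brighter = [[v + 10 for v in row] for row in patch]

print(lbp(patch))      # -> [[240]]
print(lbp(brighter))   # -> [[240]]
```

The code depends only on whether each neighbour is brighter than the centre, so adding a constant to every pixel leaves it unchanged. This is why LBP is a common choice for reducing illumination effects.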

A Risk Classification Based Approach for Android Malware Detection

  • Ye, Yilin;Wu, Lifa;Hong, Zheng;Huang, Kangyu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.2 / pp.959-981 / 2017
  • Existing Android malware detection approaches have mostly concentrated on superficial features such as requested or used permissions, which cannot reflect the essential differences between benign apps and malware. In this paper, we propose a quantitative calculation model of application risks based on the key observation that the essential differences between benign apps and malware lie in the way permissions are used, or rather the way their corresponding permission methods are used. Specifically, we perform a fine-grained analysis of Android application risks. We first classify application risks into five specific categories and then introduce a comprehensive risk, computed from those five, to describe the overall risk of an application. Given that users' risk preference and risk-bearing ability are naturally fuzzy, we design and implement a fuzzy logic system to calculate the comprehensive risk. On the basis of the quantitative calculation model, we propose a risk-classification-based approach for Android malware detection. The experiments show that our approach can achieve high accuracy with a low false positive rate using the RandomForest algorithm.

Online Reviews Analysis for Prediction of Product Ratings based on Topic Modeling (토픽 모델링에 기반한 온라인 상품 평점 예측을 위한 온라인 사용 후기 분석)

  • Park, Sang Hyun;Moon, Hyun Sil;Kim, Jae Kyeong
    • Journal of Information Technology Services / v.16 no.3 / pp.113-125 / 2017
  • Customers are influenced by others' opinions when they make a purchase. Thanks to the development of technology, people share their experiences, such as reviews or ratings, through online or social network services. However, although ratings give others intuitive information, many reviews contain only text without ratings. Also, because of the huge number of reviews, customers and companies cannot read all of them, so it is hard for them to evaluate a product without ratings. Therefore, in this study, we propose a methodology to predict ratings for a product from its reviews. In this methodology, we first estimate the topic-review matrix using the Latent Dirichlet Allocation technique, which is widely used in topic modeling. Next, we predict ratings from the topic-review matrix using an artificial neural network model based on the backpropagation algorithm. Through experiments with actual reviews, we find that our methodology can predict ratings from customers' reviews, and that it performs better on reviews that include definite opinions. As a result, our study can be used by customers and companies that want to understand a product precisely through ratings. Moreover, we hope that our study leads to future studies that combine machine learning and topic modeling.

Accurate and Efficient Log Template Discovery Technique

  • Tak, Byungchul
    • Journal of the Korea Society of Computer and Information / v.23 no.10 / pp.11-21 / 2018
  • In this paper we propose a novel log template discovery algorithm that achieves high-quality log templates through an iterative log filtering technique. Log templates are the static string patterns from which actual logs are produced by inserting variable values at runtime. Correctly classifying individual logs into their template categories enables automated analysis using state-of-the-art machine learning techniques. Our technique looks at a group of logs column-wise and filters the logs that have the value with the highest proportion. We repeat this process for each column until we are left with a highly homogeneous set of logs that most likely belong to the same log template category. Then, we determine which columns are static parts and which are variable parts by vertically comparing all the logs in the group. This process repeats until we have discovered all the templates in the given logs. During this process we also discover custom patterns, such as ID formats, that are unique to the application. This information helps us quickly identify such strings in the logs as variable parts, further increasing the accuracy of the discovered log templates. Existing solutions suffer from log templates that are too general or too specific because they cannot detect custom patterns. Through extensive evaluations we have found that our proposed method achieves 2 to 20 times better accuracy.
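
The column-wise filtering and vertical comparison described above can be sketched roughly as follows. This is a simplified reading of the abstract, not the author's algorithm: whitespace tokenization, grouping by token count, the `min_ratio` dominance threshold, and the `<*>` placeholder are assumptions, and the custom-pattern (ID format) detection step is left out.

```python
from collections import Counter
from typing import List


def discover_templates(logs: List[str], min_ratio: float = 0.75) -> List[str]:
    """Iteratively peel off one homogeneous group of logs per template."""
    remaining = [line.split() for line in logs]
    templates = []
    while remaining:
        # Work on logs sharing the most common token count so columns align.
        width = Counter(len(tokens) for tokens in remaining).most_common(1)[0][0]
        group = [tokens for tokens in remaining if len(tokens) == width]
        # Column-wise filtering: keep only logs carrying the dominant value,
        # but skip columns with no dominant value (likely variable parts).
        for col in range(width):
            value, count = Counter(t[col] for t in group).most_common(1)[0]
            if count / len(group) >= min_ratio:
                group = [t for t in group if t[col] == value]
        # Vertical comparison: a column is static if every log agrees on it.
        template = [
            group[0][col] if all(t[col] == group[0][col] for t in group) else "<*>"
            for col in range(width)
        ]
        templates.append(" ".join(template))
        # Remove the logs just explained and repeat on the rest.
        remaining = [t for t in remaining if t not in group]
    return templates


logs = [
    "Connected to 10.0.0.1 port 22",
    "User alice logged out",
    "Connected to 10.0.0.2 port 22",
    "User bob logged out",
    "Disk sda1 is full",
    "User carol logged out",
    "Connected to 10.0.0.3 port 22",
    "User dave logged out",
]
for template in discover_templates(logs):
    print(template)
# User <*> logged out
# Connected to <*> port 22
# Disk sda1 is full
```

The threshold controls the trade-off the abstract mentions. Set too low, a variable column that happens to have a frequent value splits one template into several overly specific ones. Set too high, unrelated logs of the same length merge into one overly general template.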

Measuring Pattern Recognition from Decision Tree and Geometric Data Analysis of Industrial CR Images (산업용 CR영상의 기하학적 데이터 분석과 의사결정나무에 의한 측정 패턴인식)

  • Hwang, Jung-Won;Hwang, Jae-Ho
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.5 / pp.56-62 / 2008
  • This paper proposes the use of decision tree classification for measuring pattern recognition in industrial Computed Radiography (CR) images used in the nondestructive evaluation (NDE) of steel tubes. NDE problems naturally lend themselves to machine learning techniques that identify and classify patterns. The attributes of the decision tree are taken from the NDE test procedure. Geometric features, such as radiative angle, gradient, and distance, are estimated from the analysis of the input image data. These factors make it easy to classify an input object accurately into one of the pre-specified classes of the decision tree. The algorithm simplifies the characterization of NDE results and facilitates the determination of features. The experimental results verify the usefulness of the proposed algorithm.

The Algorithm For The Flow Of Debris Through Machine Learning (머신러닝 기법을 통한 토석류 흐름 구현 알고리즘)

  • Moon, Ju-Hwan;Yoon, Hong-Sik
    • Proceedings of the Korean Society of Disaster Information Conference / 2017.11a / pp.366-368 / 2017
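  • This study concerns an algorithm that reproduces the debris flow of landslides by training a simulation model with machine learning techniques on domestic landslide occurrence data. Landslide simulation models developed through traditional programming focus on reproducing debris flow from an engineering standpoint by integrating more, and more advanced, physical laws into the system. By contrast, the machine-learning-based model development addressed in this study focuses on deriving and defining the variables and parameters that affect debris flow by learning from the data fed into the system. The machine learning algorithm used to develop the landslide simulation model is a reinforcement learning algorithm: an agent is set up at the site of a past landslide, and at each time step of the simulation the flow of debris, that is, the action, is computed from weights determined by the environment. These environment-dependent weights are computed by methods defined in the simulation model. When time reaches the target value and a result is output, the simulation is evaluated by comparing that output with data on the area actually damaged by the landslide at that site, that is, the ideal value of the simulation result. This evaluation derives a loss rate from a similarity comparison between the simulated data and the actual data, and the loss rate is minimized with an optimization algorithm such as gradient descent to obtain the optimal debris-flow algorithm for the input data.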
