Improving an Ensemble Model Using Instance Selection Method (사례 선택 기법을 활용한 앙상블 모형의 성능 개선)

  • Min, Sung-Hwan
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.1 / pp.105-115 / 2016
  • Ensemble classification combines individually trained classifiers to yield more accurate predictions than individual models. Ensemble techniques are very useful for improving the generalization ability of classifiers. The random subspace ensemble technique is a simple but effective method for constructing ensemble classifiers; it involves randomly drawing some of the features for each classifier in the ensemble. The instance selection technique involves selecting critical instances while removing irrelevant and noisy instances from the original dataset. The instance selection and random subspace methods are both well known in the field of data mining and have proven to be very effective in many applications. However, few studies have focused on integrating the two methods. Therefore, this study proposes a new hybrid ensemble model that integrates instance selection and random subspace techniques using genetic algorithms (GAs) to improve the performance of a random subspace ensemble model. GAs are used to select optimal (or near-optimal) instances, which are used as input data for the random subspace ensemble model. The proposed model was applied to both Kaggle credit data and corporate credit data, and the results were compared with those of other models in terms of classification accuracy, levels of diversity, and average classification rates of the base classifiers in the ensemble. The experimental results demonstrated that the proposed model outperformed the other models, including the single model, the instance selection model, and the original random subspace ensemble model.
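
The combination described in this abstract can be illustrated with a minimal sketch. It assumes a GA chromosome is a 0/1 mask over training instances and that its fitness is the validation accuracy of a random subspace ensemble trained on the selected instances. The nearest-centroid base classifier, all function names, and all parameter values are hypothetical stand-ins, not the author's implementation, and the GA loop itself (selection, crossover, mutation) is omitted.

```python
import random
from collections import Counter

def fit_centroids(X, y, features):
    """Nearest-centroid base learner restricted to the given feature indices."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def predict_one(centroids, features, row):
    """Return the label whose centroid is closest on the chosen features."""
    def dist2(c):
        return sum((row[f] - c[j]) ** 2 for j, f in enumerate(features))
    return min(centroids, key=lambda lab: dist2(centroids[lab]))

def fit_random_subspace(X, y, mask, n_models=5, n_features=2, seed=0):
    """Train base classifiers on random feature subsets, using only the
    instances whose mask bit is 1 (the instance selection step)."""
    rng = random.Random(seed)
    Xs = [row for row, keep in zip(X, mask) if keep]
    ys = [lab for lab, keep in zip(y, mask) if keep]
    models = []
    for _ in range(n_models):
        features = sorted(rng.sample(range(len(X[0])), n_features))
        models.append((features, fit_centroids(Xs, ys, features)))
    return models

def predict(models, row):
    """Majority vote over the base classifiers."""
    votes = Counter(predict_one(c, f, row) for f, c in models)
    return votes.most_common(1)[0][0]

def fitness(mask, X, y, X_val, y_val):
    """Assumed GA fitness: validation accuracy of the ensemble built
    from the instances selected by this chromosome."""
    selected_labels = {lab for lab, keep in zip(y, mask) if keep}
    if len(selected_labels) < 2:
        return 0.0  # a mask that drops a whole class is useless
    models = fit_random_subspace(X, y, mask)
    hits = sum(predict(models, r) == lab for r, lab in zip(X_val, y_val))
    return hits / len(y_val)
```

A GA would evaluate `fitness` for each candidate mask and keep the masks that score highest.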

The use of Local API(Anomaly Process Instances) Detection for Analyzing Container Terminal Event (로컬 API(Anomaly Process Instances) 탐지법을 이용한 컨테이너 터미널 이벤트 분석)

  • Jeon, Daeuk;Bae, Hyerim
    • The Journal of Society for e-Business Studies / v.20 no.4 / pp.41-59 / 2015
  • Information systems have been developed and used in various business areas, so an abundance of historical data (log data) has been stored, and it has become necessary to analyze those log data. Previous studies have focused on discovering relationships between events rather than on identifying anomalous instances. Anomalous instances have previously been treated as noise and simply ignored. However, this kind of anomalous instance can occur repeatedly. Hence, a new methodology for detecting anomalous instances is needed. In this paper, we propose a methodology called LAPID (Local Anomaly Process Instance Detection) for discriminating anomalous process instances in log data. We specify a distance metric based on the activity relation matrix of each instance and use it to detect APIs (Anomaly Process Instances). To verify the suggested methodology, we discovered characteristics of exceptional situations from log data. To demonstrate the proposed methodology, we performed an experiment on real data from a domestic port terminal.
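
The idea of comparing process instances through their activity relation matrices can be sketched as follows. This is only an illustration under stated assumptions: the relation matrix is taken to be a count of directly-follows activity pairs, the distance is a Manhattan distance between matrices, and the local score is the mean distance to the k nearest instances. The abstract does not specify any of these choices, and the function names are hypothetical.

```python
from collections import Counter

def relation_matrix(trace):
    """Sparse activity relation matrix: counts of directly-follows pairs."""
    return Counter(zip(trace, trace[1:]))

def matrix_distance(m1, m2):
    """Manhattan distance between two sparse relation matrices."""
    return sum(abs(m1[key] - m2[key]) for key in set(m1) | set(m2))

def anomaly_scores(traces, k=2):
    """Mean distance from each instance to its k nearest neighbours;
    a large score suggests a locally anomalous process instance."""
    mats = [relation_matrix(t) for t in traces]
    scores = []
    for i, m in enumerate(mats):
        dists = sorted(matrix_distance(m, other)
                       for j, other in enumerate(mats) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Hypothetical event log: three identical instances and one reordered instance.
log = [list("ABCD"), list("ABCD"), list("ABCD"), list("ADCB")]
scores = anomaly_scores(log)
```

In this toy log, the reordered instance receives the highest score because none of its directly-follows pairs occur in the other instances.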

Analyzing Problem Instance Space Based on Difficulty-distance Correlation (난이도-거리 상관관계 기반의 문제 인스턴스 공간 분석)

  • Jeon, So-Yeong;Kim, Yong-Hyuk
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.4 / pp.414-424 / 2012
  • Finding or automatically generating problem instances is useful for analyzing and testing algorithms. The topic has been of interest in hardware/software engineering and the theory of computation. We apply objective value-distance correlation analysis to problem spaces, as previous researchers applied it to solution spaces. Depending on the problem, we define the objective function as (1) the execution time of the tested algorithm or (2) its optimality; this value is interpreted as the difficulty of the problem instance being solved. Our correlation analysis covers the following aspects: (1) how the correlation changes when different algorithms or different distance functions are used for the same problem, (2) how it changes when the tested algorithm is improved, and (3) the relation between the problem instance space and the solution space for the same problem. Our research demonstrates a way to analyze problem instance spaces and, as an early study in this direction, is expected to accelerate further problem instance space analysis.
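
A difficulty-distance correlation of this kind can be sketched on a toy problem. The example below assumes that the problem instances are permutations to be sorted, that difficulty is the number of swaps bubble sort performs (the inversion count), that distance is the Hamming distance to the hardest instance found, and that the correlation is Pearson's coefficient. All of these choices and names are illustrative assumptions rather than the authors' setup.

```python
import math
from itertools import permutations

def inversions(seq):
    """Difficulty proxy: bubble sort swap count equals the inversion count."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])

def hamming(a, b):
    """Number of positions at which two instances differ."""
    return sum(x != y for x, y in zip(a, b))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def difficulty_distance_correlation(instances, difficulty, distance):
    """Correlate each instance's difficulty with its distance to the hardest one."""
    hardest = max(instances, key=difficulty)
    diffs = [difficulty(p) for p in instances]
    dists = [distance(p, hardest) for p in instances]
    return pearson(diffs, dists)

instances = list(permutations(range(5)))
r = difficulty_distance_correlation(instances, inversions, hamming)
```

A negative `r` means that instances closer to the hardest one tend to be harder themselves. Swapping in a different `difficulty` or `distance` function corresponds to the first aspect of the analysis listed in the abstract.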

Learning Multiple Instance Support Vector Machine through Positive Data Distribution (긍정 데이터 분포를 반영한 다중 인스턴스 지지 벡터 기계 학습)

  • Hwang, Joong-Won;Park, Seong-Bae;Lee, Sang-Jo
    • Journal of KIISE / v.42 no.2 / pp.227-234 / 2015
  • This paper proposes a modified MI-SVM algorithm by considering data distribution. The previous MI-SVM algorithm seeks the margin by considering the "most positive" instance in a positive bag. Positive instances included in positive bags are located in a similar area in a feature space. In order to reflect this characteristic of positive instances, the proposed method selects the "most positive" instance by calculating the distance between each instance in the bag and a pivot point that is the intersection point of all positive instances. This paper suggests two ways to select the "most positive" pivot point in the training data. First, the algorithm seeks the "most positive" pivot point along the current predicted parameter, and then selects the nearest instance in the bag as a representative from the pivot point. Second, the algorithm finds the "most positive" pivot point by using a Diverse Density framework. Our experiments on 12 benchmark multi-instance data sets show that the proposed method results in higher performance than the previous MI-SVM algorithm.
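
The representative-selection step described here can be sketched in a few lines. The sketch assumes the pivot point has already been computed (the abstract names two ways to obtain it, neither of which is reproduced here) and that Euclidean distance is used. The function names and the sample bags are hypothetical.

```python
import math

def nearest_to_pivot(bag, pivot):
    """Pick the instance in a positive bag that lies closest to the pivot."""
    return min(bag, key=lambda inst: math.dist(inst, pivot))

def select_representatives(positive_bags, pivot):
    """One representative ("most positive") instance per positive bag."""
    return [nearest_to_pivot(bag, pivot) for bag in positive_bags]

# Hypothetical 2-D bags and pivot, for illustration only.
bags = [[(0.0, 0.0), (2.0, 2.0), (5.0, 5.0)],
        [(3.0, 2.0), (9.0, 9.0)]]
representatives = select_representatives(bags, pivot=(2.5, 2.0))
```

In an MI-SVM style loop, these representatives would stand in for their bags when the margin is recomputed.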

Developing a Module to Store 3DF-GML Instance Documents in a Database (3DF-GML 인스턴스 문서의 데이터베이스 저장을 위한 모듈 개발)

  • Lee, Kang-Jae;Jang, Gun-Up;Lee, Ji-Yeong
    • Spatial Information Research / v.19 no.6 / pp.87-100 / 2011
  • Recently, a variety of GML application schemas have been designed in many fields. GML application schemas are specific to the application domain of interest and specify object types using primitive object types defined in the GML standard. GML instance documents are created based on such GML application schemas. GML instance documents are generally large because they represent huge numbers of geographic objects. Thus, it is essential to store such GML instance documents in a relational database for efficient management and use. A relational database is relatively convenient to use and is widely applied in various fields. Furthermore, it is fundamentally more efficient than a file structure for handling large datasets. Many studies on storing GML documents have been carried out so far. However, there are few studies on the storage of GML instance documents. Therefore, in this study, we developed a storage module to store GML instance documents in a relational database.

Graph-based ISA/instanceOf Relation Extraction from Category Structure (그래프 구조를 이용한 카테고리 구조로부터 상하위 관계 추출)

  • Choi, Dong-Hyun;Choi, Key-Sun
    • Journal of KIISE:Software and Applications / v.37 no.6 / pp.464-469 / 2010
  • In this paper, we propose a method to extract isa/instanceOf relations from a category structure. Existing studies use lexical patterns, e.g. head word matching, to determine whether a given category link is an isa/instanceOf relation. We propose a new approach that analyzes other category links related to the given category link to make this determination. The experimental results show that our algorithm can cover many cases that the existing algorithms were not able to handle.

A Combined Method of Rule Induction Learning and Instance-Based Learning (귀납법칙 학습과 개체위주 학습의 결합방법)

  • Lee, Chang-Hwan
    • The Transactions of the Korea Information Processing Society / v.4 no.9 / pp.2299-2308 / 1997
  • While most machine learning research has been primarily concerned with the development of systems that implement one type of learning strategy, we use a multistrategy approach that integrates rule induction learning and instance-based learning, and show how this combination allows for better overall performance. In the rule induction learning phase, we derive an entropy function, based on Hellinger divergence, which can measure the amount of information each inductive rule contains, and show how well the Hellinger divergence measures the importance of each rule. We also propose some heuristics to reduce the computational complexity by analyzing the characteristics of the Hellinger measure. In the instance-based learning phase, we improve the current instance-based learning method in a number of ways. The system has been implemented and tested on a number of well-known machine learning data sets. The performance of the system has been compared with that of other classification learning techniques.
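
As a rough illustration of how a Hellinger-based measure can score a rule, the sketch below computes the standard Hellinger distance between the prior class distribution and the class distribution among instances covered by a rule. This is the textbook definition, not necessarily the entropy function derived in the paper, and the probabilities are made-up values.

```python
import math

def hellinger(p, q):
    """Standard Hellinger distance between two discrete distributions (0 to 1)."""
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total) / math.sqrt(2)

# Hypothetical two-class problem: prior vs. class distribution under a rule.
prior = [0.5, 0.5]
given_rule = [0.9, 0.1]
rule_information = hellinger(prior, given_rule)
```

Under this reading, a rule whose antecedent shifts the class distribution far from the prior carries more information than a rule that leaves it nearly unchanged.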

A XML Instance Repository Model based on the Edge-Labeled Graph (Edge-Labeled 그래프 기반의 XML 인스턴스 저장 모델)

  • Kim Jeong-Hee;Kwak Ho-Young
    • Journal of Internet Computing and Services / v.4 no.6 / pp.33-42 / 2003
  • An XML instance repository model based on the edge-labeled graph is proposed for storing XML instances in relational databases. This repository model represents the XML instance as a data graph based on the edge-labeled graph, extracts the defined values based on the structure of the data path, element, attribute, and table index table presented as the database schema, and stores these values using the Mapper module. To support queries, the XML repository model offers a module that translates XQL, a query language under XPATH, into SQL, and a DBtoXML generator module that restores the stored XML instance. As a result, it is possible to represent the storage relationship between XML instances and the proposed repository model in terms of graph-based paths, and the model shows that specific element and attribute information can be searched easily.
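
The general idea of storing an XML instance as an edge-labeled graph can be sketched as follows. The sketch flattens a document into rows of the form (source node, edge label, target node, value), which could then be inserted into a relational table. The row layout, the `@` prefix for attributes, and the function name are assumptions for illustration and do not reproduce the schema or the Mapper module of the paper.

```python
import xml.etree.ElementTree as ET
from itertools import count

def xml_to_edges(xml_text):
    """Flatten an XML instance into (source, label, target, value) rows.

    Each element and attribute becomes a node; the edge from its parent
    is labeled with the tag or attribute name. Node 0 is a virtual root.
    """
    ids = count(1)
    rows = []

    def visit(elem, parent_id):
        node_id = next(ids)
        text = (elem.text or "").strip() or None
        rows.append((parent_id, elem.tag, node_id, text))
        for name, value in elem.attrib.items():
            rows.append((node_id, "@" + name, next(ids), value))
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), 0)
    return rows

edges = xml_to_edges('<book id="b1"><title>XML</title></book>')
```

With rows in this shape, a path query such as `book/title` becomes a self-join on the edge table, which is the kind of translation a query-to-SQL module would perform.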

Automatic Dataset Generation of Object Detection and Instance Segmentation using Mask R-CNN (Mask R-CNN을 이용한 물체인식 및 개체분할의 학습 데이터셋 자동 생성)

  • Jo, HyunJun;Kim, Dawit;Song, Jae-Bok
    • The Journal of Korea Robotics Society / v.14 no.1 / pp.31-39 / 2019
  • A robot usually adopts ANN (artificial neural network)-based object detection and instance segmentation algorithms to recognize objects, but creating datasets for these algorithms incurs high labeling costs because the datasets must be labeled manually. To lower the labeling cost, a new scheme is proposed that can automatically generate training images and label them for specific objects. This scheme uses an instance segmentation algorithm trained to give the masks of unknown objects, so that the masks can be obtained in a simple environment. The RGB images of objects can be obtained by using these masks, and the object classes must be labeled under human supervision. After the object images are obtained, they are synthesized with various background images to create new images. Labeling the synthesized images is performed automatically using the masks and the previously input object classes. In addition, human intervention is further reduced by using the robot arm to collect object images. The experiments show that the instance segmentation performance obtained with the proposed method is equivalent to that obtained with the real dataset and that the time required to generate the dataset can be significantly reduced.
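
The synthesis and automatic labeling step can be sketched with plain nested lists standing in for images. The sketch assumes a binary mask aligned with the object image, pastes the masked pixels onto a background at a chosen offset, and derives a bounding-box label from the pasted mask. The function name, the box format (x_min, y_min, x_max, y_max), and the toy pixel values are hypothetical, and the mask is assumed to contain at least one foreground pixel.

```python
def paste_object(background, obj, mask, top, left):
    """Composite masked object pixels onto a copy of the background.

    Returns the synthesized image and a bounding box
    (x_min, y_min, x_max, y_max) computed from the pasted mask.
    """
    out = [row[:] for row in background]  # leave the background untouched
    xs, ys = [], []
    for i, mask_row in enumerate(mask):
        for j, is_object in enumerate(mask_row):
            if is_object:
                out[top + i][left + j] = obj[i][j]
                ys.append(top + i)
                xs.append(left + j)
    bbox = (min(xs), min(ys), max(xs), max(ys))
    return out, bbox

# Toy 4x4 background, 2x2 object, and its mask.
background = [[0] * 4 for _ in range(4)]
obj = [[9, 9], [9, 9]]
mask = [[1, 0], [1, 1]]
image, bbox = paste_object(background, obj, mask, top=1, left=2)
```

Because the mask position is known at paste time, the label comes for free, which is where the reduction in manual labeling comes from.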

Real-Time Tomato Instance Tracking Algorithm by using Deep Learning and Probability Model (딥러닝과 확률모델을 이용한 실시간 토마토 개체 추적 알고리즘)

  • Ko, KwangEun;Park, Hyun Ji;Jang, In Hoon
    • The Journal of Korea Robotics Society / v.16 no.1 / pp.49-55 / 2021
  • Recently, smart farm technology has been drawing attention as a response to the decline of the farm labor population caused by the aging society. In particular, there is increasing demand for automatic harvesting systems that can be commercialized. Pre-harvest crop detection is the most important issue for a harvesting robot system in a real-world environment. In this paper, we propose a real-time tomato instance tracking algorithm using deep learning and probability models. In general, without a stochastic approach it is hard to keep track of the same tomato instance between successive frames, because the tomato growing environment is disturbed by changing lighting conditions and background clutter. Therefore, in this work, individual tomato detection in each frame is performed by a YOLOv3 model, and continuous instance tracking between frames is performed by a Kalman filter and a probability model. We verified the performance of the proposed method, and the experiments showed good results on real-world test data.
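
The tracking half of this pipeline can be illustrated with a minimal constant-velocity Kalman filter on a single image coordinate. This is a sketch under stated assumptions: the per-frame detections are plain numbers standing in for detector output, the class name `Kalman1D` and the noise values `q` and `r` are hypothetical, and the paper's additional probability model is not reproduced.

```python
class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate (position, velocity)."""

    def __init__(self, pos, q=1e-2, r=1.0):
        self.x = [pos, 0.0]                   # state: position, velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]     # state covariance
        self.q, self.r = q, r                 # process / measurement noise

    def predict(self, dt=1.0):
        """Propagate the state one frame ahead: x = F x, P = F P F^T + qI."""
        p, v = self.x
        (a, b), (c, d) = self.P
        self.x = [p + dt * v, v]
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.x[0]

    def update(self, z):
        """Correct the state with a measured position z (H = [1, 0])."""
        (a, b), (c, d) = self.P
        s = a + self.r                        # innovation covariance
        k0, k1 = a / s, c / s                 # Kalman gain
        y = z - self.x[0]                     # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]

# Stand-in detections: one tomato drifting one pixel per frame.
tracker = Kalman1D(pos=0.0)
for detection in range(1, 20):
    tracker.predict()
    tracker.update(float(detection))
next_position = tracker.predict()
```

In a multi-object setting, the predicted position of each track would be compared with the new detections to decide which detection continues which tomato instance.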