Search | Korea Science

User Identification and Session completion in Input Data Preprocessing for Web Mining (웹 마이닝을 위한 입력 데이타의 전처리과정에서 사용자구분과 세션보정)

최영환;이상용
- Journal of KIISE:Software and Applications / v.30 no.9 / pp.843-849 / 2003
Web usage mining is a data mining technique that analyzes users' usage patterns from large web logs. To apply it, users and user sessions must be identified correctly during preprocessing, but log files in the standard web log format alone do not allow them to be identified completely. Identification is complicated by local caches, firewalls, ISPs, user privacy, cookies, and other factors, and no definitive method for solving these problems exists yet. The local cache problem in particular is the hardest obstacle to identifying user sessions, which serve as input to web mining systems. In this paper, we propose a heuristic method that addresses the local cache problem using only server-side clickstream data, namely the referrer log, agent log, and access log, and that identifies user sessions and completes them.
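As a rough illustration of this kind of preprocessing, the sketch below shows two common heuristics, not the paper's own method. It assumes that a user can be approximated by an (IP, agent) pair, that a 30-minute gap ends a session, and that a referrer which was visited earlier but is not the previous page indicates a page re-read from the local cache. The function names, record layout, and timeout are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity threshold; the abstract does not state one.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(records, timeout=SESSION_TIMEOUT):
    """Group (ip, agent, timestamp, url) records into sessions.

    A user is approximated by the (ip, agent) pair. A new session starts
    whenever the gap between consecutive requests exceeds `timeout`.
    Returns a list of ((ip, agent), [urls]) tuples.
    """
    by_user = defaultdict(list)
    for ip, agent, ts, url in records:
        by_user[(ip, agent)].append((ts, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort(key=lambda h: h[0])
        current = [hits[0][1]]
        last_ts = hits[0][0]
        for ts, url in hits[1:]:
            if ts - last_ts > timeout:
                sessions.append((user, current))
                current = []
            current.append(url)
            last_ts = ts
        sessions.append((user, current))
    return sessions


def complete_path(hits):
    """Re-insert pages that were presumably served from the local cache.

    `hits` is a list of (url, referrer) pairs in request order. If the
    referrer is not the previous page but was visited earlier in the
    session, the user probably went back through cached pages, so the
    referrer is inserted again before the requested page.
    """
    path = []
    for url, referrer in hits:
        if path and referrer and referrer != path[-1] and referrer in path:
            path.append(referrer)
        path.append(url)
    return path
```

For example, `complete_path([("/a", None), ("/b", "/a"), ("/c", "/a")])` restores the unlogged return to `/a` before `/c`. Real logs would need more care, for instance with proxies that share one IP address.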

A Clustering Algorithm based on Heuristic Evolution Algorithm (휴리스틱 진화 알고리즘을 이용한 클러스터링 알고리즘)

강명구;류정우;김명원
- Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.78-80 / 2000
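Clustering partitions given data into groups with similar properties. It is applied in many fields and, in particular, is actively used as an important technique in data mining, which has recently drawn much attention. Existing clustering algorithms tend to converge to local optima and require the number of clusters to be fixed in advance. In this paper, we use an evolutionary algorithm, which looks for the optimum through parallel search, to mitigate convergence to local optima and to determine an appropriate number of clusters automatically. We also define heuristic operations to counter a drawback of evolutionary algorithms, namely the growth in search time as the search space expands. To demonstrate the performance and validity of the proposed algorithm, we ran experiments on Gaussian-distributed data and showed that it performs well.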

Detection Models for Intrusion Types based on Data Mining (데이터 마이닝 기반의 침입유형별 탐지 모델)

Kim, Sang-Young;Woo, Chong-Woo
- Proceedings of the Korea Information Processing Society Conference / 2003.05c / pp.2049-2052 / 2003
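Behind the benefits brought by the rapid growth of the Internet, damage from malicious intrusions into public systems is increasing daily. Intrusion detection systems have been introduced in response, but because attack forms keep changing, continued research is needed so that these systems can keep up. Among recent efforts, research that uses data mining techniques to analyze intruder information is particularly active. In this paper, we present a model for effective classification that applies data mining techniques to the KDD CUP 99 training set. The proposed model applies heuristics to generate the required data effectively and assigns a separate classifier to each attack type to enable more accurate and efficient detection.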

Workflow Mining Technique(Heuristic Approach) (워크플로우 마이닝 기법(휴리스틱접근))

Lee Myoung-Hee;Chang Young-Won;Yoo Cheol-jung;Jang Ok-bae
- Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.412-414 / 2005
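As business tasks have recently become more specialized and complex, workflow systems have also become more complex and diverse. As a result, there is a need to manage and derive the processes that are actually required. In this paper, we analyze workflow mining for deriving and supporting influential processes and, based on that analysis, present mining rules that allow workflows to be managed more efficiently through correlation analysis and principal component analysis.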

Workflow Mining based on Heuristic Approach using Log data (워크플로우 마이닝 : 휴리스틱 접근)

Lee, Myoung-Hee;Yoo, Cheol-Jung;Jang, Ok-Bae
- Proceedings of the CALSEC Conference / 2005.03a / pp.195-200 / 2005
As workflow systems become more complex and opaque, discrepancies arise between the actual workflow process and the designed process. We have therefore developed techniques for discovering workflow models. The starting point for such techniques is a so-called 'workflow log', which contains information about the workflow process as it is actually executed. This paper presents a heuristic algorithm for mining workflow processes from the workflow logs that arise in business process systems.
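One widely used heuristic for discovering a model from such a log is the dependency measure of the Heuristics Miner family, which compares how often activity a is directly followed by b with how often the reverse happens. The sketch below assumes that measure for illustration; the abstract does not say which heuristic the paper uses, and the function names and the toy log are hypothetical.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity a is immediately followed by b.

    `log` is a list of traces, each trace a list of activity names.
    """
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Dependency measure (|a>b| - |b>a|) / (|a>b| + |b>a| + 1).

    Values near 1 suggest that b depends on a; values near 0 suggest
    no clear ordering, for example for parallel activities.
    """
    ab = counts[(a, b)]
    ba = counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


# Toy workflow log: B and C occur in both orders, A always comes first.
toy_log = [["A", "B", "C"], ["A", "B", "C"], ["A", "C", "B"]]
toy_counts = directly_follows(toy_log)
```

With this toy log, `dependency(toy_counts, "A", "B")` is 2/3 while `dependency(toy_counts, "B", "C")` is only 0.25, so a threshold between the two would keep the A-to-B edge and drop the weaker B-to-C edge.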

Frequent Itemset Search Using LSI Similarity (LSI 유사도를 이용한 효율적인 빈발항목 탐색 알고리즘)

Ko, Younhee;Kim, Hyeoncheol;Lee, Wongyu
- The Journal of Korean Association of Computer Education / v.6 no.1 / pp.1-8 / 2003
We introduce an efficient vertical mining algorithm that significantly reduces the search complexity for frequent k-itemsets. The method sorts items by their LSI (Least Support Itemsets) similarity and then searches for frequent itemsets in a tree-based manner. The search tree structure provides several useful heuristics and therefore reduces the search space significantly at early stages. Experimental results on various data sets show that the proposed algorithm improves search performance compared to other algorithms, especially for databases with long patterns.
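To make "vertical mining" concrete, here is a minimal sketch of a generic tid-set intersection search in the style of Eclat. It is not the LSI-similarity algorithm from the paper: as a simple stand-in for the paper's ordering, items are sorted by ascending support, and the function name and parameters are hypothetical.

```python
from collections import defaultdict


def vertical_mine(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets.

    Uses the vertical layout: each item maps to the set of transaction
    ids (tid-set) that contain it, and the support of an itemset is the
    size of the intersection of its items' tid-sets.
    """
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)

    # Keep frequent single items, ordered by ascending support so that
    # small tid-sets are extended first (an assumed, simple ordering).
    frequent = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_support),
        key=lambda pair: (len(pair[1]), pair[0]),
    )

    result = {}

    def extend(prefix, candidates):
        # Depth-first walk of the search tree rooted at `prefix`.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            result[itemset] = len(tids)
            children = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    children.append((other, shared))
            if children:
                extend(itemset, children)

    extend(frozenset(), frequent)
    return result
```

Because an extension is only explored when its intersected tid-set is still large enough, infrequent branches are cut before any of their supersets are generated.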

Selection of controller using improved Artificial Bee Colony algorithm based on Apriori algorithm in SDN environment (SDN 환경에서 Apriori 알고리즘 기반의 향상된 인공벌 군집(ABC) 알고리즘을 이용한 컨트롤러 선택)

Yoo, Seung-Eon;Lim, Hwan-Hee;Lee, Byung-Jun;Kim, Kyung-Tae;Youn, Hee-Yong
- Proceedings of the Korean Society of Computer Information Conference / 2019.01a / pp.39-40 / 2019
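In this paper, we propose a model that selects distributed controllers in an SDN environment by applying an improved artificial bee colony (ABC) algorithm based on Apriori, an association rule mining algorithm. The goal is to improve controller selection by preferentially selecting frequently used controllers.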

Analysis of Startup Process based on Process Mining Techniques: ICT Service Cases (프로세스 마이닝 기반 창업 프로세스 분석: ICT 서비스 창업 사례를 중심으로)

Min Woo Park;Hyun Sil Moon;Jae Kyeong Kim
- Information Systems Review / v.21 no.1 / pp.135-152 / 2019
Following the success of venture companies in ICT services, many development and support policies for start-up companies have recently been introduced. However, because these policies focus on the initial stage of a start-up, many start-up companies have difficulty sustaining growth. The main reason is that start-up tasks are treated as independent activities, whereas many experts and related articles describe them as related processes that run from the initial stage to the stable stage of a start-up firm. In this study, we model start-up processes based on survey data collected from start-up companies and analyze the start-up process of ICT service companies with process mining techniques. Process mining analysis yields a sequential flow of start-up tasks and their characteristics. The results identify entrepreneur analysis, idea derivation, business model creation, and business diversification as important processes, whereas marketing and investment fund management are not. This suggests that marketing and investment fund management are activities that need ongoing attention. Moreover, we find temporal and complementary tasks that analysis of independent, individual activities could not capture. Our process analysis results are expected to be used in simulation-based web-intelligent systems that support start-up businesses, and accumulating more start-up cases will help provide more detailed, personalized services. The proposed process model and analysis results can also help address many of the difficulties that start-up companies face.
https://doi.org/10.14329/isr.2019.21.1.135

An Efficient Clustering Algorithm based on Heuristic Evolution (휴리스틱 진화에 기반한 효율적 클러스터링 알고리즘)

Ryu, Joung-Woo;Kang, Myung-Ku;Kim, Myung-Won
- Journal of KIISE:Software and Applications / v.29 no.1_2 / pp.80-90 / 2002
Clustering is a useful technique for grouping data points so that points within a single group (cluster) have similar characteristics. Many clustering algorithms have been developed and used in engineering applications, including pattern recognition and image processing. Recently, clustering has drawn increasing attention as an important technique in data mining. However, clustering algorithms such as K-means and Fuzzy C-means have two difficulties: the number of clusters must be determined a priori, and the results depend on the initial set of clusters, which can lead to undesirable results. In this paper, we propose a new clustering algorithm that solves these problems. Our method uses an evolutionary algorithm to address the local optima problem, in which clustering that starts from an inappropriate set of clusters converges to an undesirable state. We also adopt a new measure of how well data are clustered, defined in terms of both intra-cluster dispersion and inter-cluster separability. Using this measure, the number of clusters is determined automatically as a result of the optimization process. In addition, we combine heuristics, that is, problem-specific knowledge, with the evolutionary algorithm to speed up its search. We tested our algorithm on several sets of multi-dimensional data, and the results show that it outperforms existing algorithms.
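The abstract does not give the formula for its clustering measure, so the sketch below uses an assumed, illustrative one: the mean distance of points to their own centroid (dispersion) divided by the smallest distance between two centroids (separability), where lower is better. A score of this kind could serve as the fitness that an evolutionary search minimizes over candidate partitions with different numbers of clusters. All names here are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def validity(points, labels):
    """Illustrative score: intra-cluster dispersion / inter-cluster separation.

    Lower values mean tighter, better separated clusters. This is an
    assumed stand-in, not the measure defined in the paper.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    centers = {lab: centroid(pts) for lab, pts in clusters.items()}
    dispersion = sum(
        math.dist(p, centers[lab]) for p, lab in zip(points, labels)
    ) / len(points)

    labs = list(centers)
    separation = min(
        math.dist(centers[a], centers[b])
        for i, a in enumerate(labs)
        for b in labs[i + 1:]
    )
    if separation == 0:
        return math.inf  # coinciding centroids: worst possible score
    return dispersion / separation
```

For the four points `(0, 0)`, `(0, 1)`, `(10, 0)`, `(10, 1)`, the labeling `[0, 0, 1, 1]` scores far lower than `[0, 1, 0, 1]`, which matches the intuition that the first grouping is the natural one.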

Feature Selection for Anomaly Detection Based on Genetic Algorithm (유전 알고리즘 기반의 비정상 행위 탐지를 위한 특징선택)

Seo, Jae-Hyun
- Journal of the Korea Convergence Society / v.9 no.7 / pp.1-7 / 2018
Feature selection, a data preprocessing technique, is a major research area in many applications that deal with large datasets. It has been used in pattern recognition, machine learning, and data mining, and is now widely applied in fields such as text classification, image retrieval, intrusion detection, and genome analysis. The proposed method is based on a genetic algorithm, which is a meta-heuristic algorithm. There are two approaches to finding feature subsets: filter methods and wrapper methods. In this study, we use a wrapper method, which evaluates feature subsets with an actual classifier, to find an optimal feature subset. The training dataset used in the experiment has a severe class imbalance, which makes it difficult to improve classification performance for rare classes. After preprocessing the training dataset with SMOTE, we select features and evaluate them with various machine learning algorithms.
https://doi.org/10.15207/JKCS.2018.9.7.001

