• Title/Summary/Keyword: process mining


Big Data Analysis in School Adjustment Factors using Data Mining

  • Ko, Sujeong
    • International journal of advanced smart convergence / v.8 no.1 / pp.87-97 / 2019
  • Data mining technology is applied in many fields because it analyzes vast amounts of data to find useful information. In this paper, we propose a big data analysis method that uses the Apriori algorithm, a data mining technique, to find related factors that have negative and positive influences on school adjustment. From the Korea Child and Youth Panel Survey (KCYPS), data on adjustment to school life and data on parental inclinations were extracted for fourth-grade elementary school students, first-year middle school students, and first-year high school students, respectively, and the useful association rules among them were mapped. As a result, the factors affecting school adjustment differed according to the stage of the growth process, and we were able to find interesting rules by looking for connections between rules. On the other hand, the factors that positively influenced school adjustment did not differ significantly from each other and were, overall, associated with positive variables.
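The Apriori step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the item names (`parental_neglect`, `low_adjustment`, and so on) and the thresholds are hypothetical, and each survey respondent is assumed to be encoded as a set of categorical items.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level-1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Derive (lhs, rhs, confidence) rules from the frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Hypothetical respondents, each encoded as a set of survey items.
survey = [
    {"parental_neglect", "low_adjustment"},
    {"parental_neglect", "low_adjustment"},
    {"parental_affection", "high_adjustment"},
    {"parental_neglect", "high_adjustment"},
]
frequent = apriori(survey, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.6)
```

Each rule pairs a left-hand itemset with a right-hand itemset and a confidence value, which is the form in which association rules between parental inclination and school adjustment would be read.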

A Better Prediction for Higher Education Performance using the Decision Tree

  • Hilal, Anwar;Zamani, Abu Sarwar;Ahmad, Sultan;Rizwanullah, Mohammad
    • International Journal of Computer Science & Network Security / v.21 no.4 / pp.209-213 / 2021
  • Data mining is the application of specific algorithms for extracting patterns from data, and KDD is the automated or convenient extraction of patterns representing knowledge implicitly stored or captured in large databases, data warehouses, the Web, other massive information repositories, or data streams. Data mining can be used for decision making in educational systems. However, educational institutions do not apply any knowledge discovery process to these data, even though this knowledge could be used to increase the quality of education. The problem arises in the educational management system; to make the education system more flexible and to discover knowledge from its huge volume of data, we use data mining techniques to address it.

HBase based Business Process Event Log Schema Design of Hadoop Framework

  • Ham, Seonghun;Ahn, Hyun;Kim, Kwanghoon Pio
    • Journal of Internet Computing and Services / v.20 no.5 / pp.49-55 / 2019
  • Organizations design and operate business process models to achieve their goals efficiently and systematically. With the advancement of IT, the number of items in which computer systems can participate has grown, and processes have become large and complicated. This phenomenon has created more complex and subdivided business process flows. The process instances, which contain workcases and events, are larger and hold more data. They are an essential resource for process mining and are used directly in model discovery, analysis, and improvement of processes. The event log is getting bigger and broader, which leads to problems such as capacity management and I/O load when it is managed by an existing row-level program or through a relational database. In this paper, as the event log becomes big data, we identify the management limits of the existing original file or relational database approach. We design and apply schemas to archive and analyze large event logs through Hadoop, an open source distributed file system, and HBase, a NoSQL database system.
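The abstract does not give the schema itself, so the following is only a sketch of one common way to lay out an event log in a store such as HBase, which keeps rows sorted lexicographically by row key. The key layout (`case_id#timestamp#event_id`), the column family `e`, and the field names are all assumptions made for illustration, not the authors' design. No HBase client is used; the code only builds the bytes that would be written.

```python
def make_row_key(case_id: str, timestamp_ms: int, event_id: str) -> bytes:
    """Build a row key that groups events by process instance, then by time.

    The timestamp is zero-padded to 13 digits so that lexicographic byte
    order matches chronological order within one case. The '#' separator
    is an assumption; case IDs containing '#' would need escaping.
    """
    return f"{case_id}#{timestamp_ms:013d}#{event_id}".encode("utf-8")


def to_cells(event: dict) -> dict:
    """Map event attributes to 'family:qualifier' columns (hypothetical family 'e')."""
    return {
        b"e:activity": event["activity"].encode("utf-8"),
        b"e:performer": event["performer"].encode("utf-8"),
        b"e:type": event["type"].encode("utf-8"),
    }


# Hypothetical events belonging to one process instance (workcase).
events = [
    {"case": "case42", "ts": 20000, "id": "e1",
     "activity": "approve", "performer": "kim", "type": "complete"},
    {"case": "case42", "ts": 1000, "id": "e2",
     "activity": "submit", "performer": "lee", "type": "complete"},
]
rows = {make_row_key(e["case"], e["ts"], e["id"]): to_cells(e) for e in events}
ordered_activities = [rows[k][b"e:activity"] for k in sorted(rows)]
```

With this layout, a scan over the key prefix of one case would return that case's events in time order, which is the access pattern trace-based process mining typically needs.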

PPFP(Push and Pop Frequent Pattern Mining): A Novel Frequent Pattern Mining Method for Bigdata Frequent Pattern Mining (PPFP(Push and Pop Frequent Pattern Mining): 빅데이터 패턴 분석을 위한 새로운 빈발 패턴 마이닝 방법)

  • Lee, Jung-Hun;Min, Youn-A
    • KIPS Transactions on Software and Data Engineering / v.5 no.12 / pp.623-634 / 2016
  • Most existing frequent pattern mining methods focus on time efficiency and rely heavily on primary memory. However, in the era of big data, the size of real-world databases to be mined is increasing exponentially, and primary memory is no longer sufficient for mining frequent patterns from large real-world data sets. To solve this problem, some studies have proposed disk-based frequent pattern mining methods that improve scalability, but their processing is very time consuming compared with memory-based methods. In this paper, we present PPFP, a novel disk-based approach for mining frequent itemsets from big data, which reduces the main memory size bottleneck. The PPFP algorithm is based on the FP-growth method, one of the most popular and efficient frequent pattern mining approaches. Mining with PPFP consists of two steps. (1) Constructing an IFP-tree: after constructing the FP-tree, we assign an index number to each node in the FP-tree using a novel index numbering method, and then store the indexed FP-tree (IFP-tree) on disk as an IFP-table. (2) Mining frequent patterns with PPFP: we mine frequent patterns by expanding patterns using a stack-based PUSH-POP method (the PPFP method). With this new approach, which uses a very small amount of memory for the recursive and time-consuming operations in the mining process, we improved the scalability and time efficiency of frequent pattern mining. The reported test results demonstrate these improvements.
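The idea of expanding patterns with an explicit stack instead of recursion can be illustrated with the sketch below. It is not PPFP: it does not build an FP-tree, an IFP-tree, or an on-disk IFP-table, and it keeps transaction-ID sets in memory. It only shows the push/pop control flow under those simplifying assumptions, with a hypothetical function name and toy data.

```python
def frequent_itemsets_with_stack(transactions, min_count):
    """Enumerate frequent itemsets depth-first using an explicit stack."""
    transactions = [frozenset(t) for t in transactions]
    # Map each item to the set of transaction indices that contain it.
    tids = {}
    for i, t in enumerate(transactions):
        for item in t:
            tids.setdefault(item, set()).add(i)
    # Only items that are frequent on their own can extend a pattern.
    items = sorted(item for item, s in tids.items() if len(s) >= min_count)

    result = {}
    # Stack entry: (pattern so far, transactions supporting it, next item index).
    stack = [((), set(range(len(transactions))), 0)]
    while stack:
        pattern, support, start = stack.pop()  # POP the pattern to expand
        for j in range(start, len(items)):
            new_support = support & tids[items[j]]
            if len(new_support) >= min_count:
                new_pattern = pattern + (items[j],)
                result[new_pattern] = len(new_support)
                # PUSH the extended pattern so it is expanded later.
                stack.append((new_pattern, new_support, j + 1))
    return result


# Hypothetical toy database.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
patterns = frequent_itemsets_with_stack(db, min_count=2)
```

Because the pending work lives on a data structure rather than the call stack, the traversal state could in principle be bounded or spilled to disk, which is the motivation the abstract gives for a stack-based method.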

A Study on the Site Selection Process of Field Emergency Medical Facilities Based on Text Mining (텍스트마이닝 기반의 재난현장 응급의료시설 대상지선정 프로세스 연구)

  • Suh, Sangwook
    • Journal of The Korea Institute of Healthcare Architecture / v.24 no.2 / pp.27-36 / 2018
  • Purpose: In a mass disaster, temporary medical facilities for first aid and treatment must be established to stably accommodate the patients the disaster causes. However, the criteria for decisions on deploying field emergency medical facilities are not specified. The purpose of this study is therefore to derive the factors that should be considered when deploying field emergency medical facilities and to propose a site selection process based on those factors. Methods: This study performs text mining of disaster-related laws, guidelines, and documents to derive key factors affecting site selection, proposes a decision-making process, and conducts a virtual deployment to validate the process. Results: The key factors for site selection were derived as the size of the damage, the size of the DMAT inputs, the location of available places, and the distance to the disaster base hospital. The virtual deployment following the proposed decision-making process confirmed that the site of field emergency medical facilities changes depending on the type of disaster, even when the scope of the disaster damage is the same. Implications: The deployment of field emergency medical facilities requires separate criteria for each type of disaster rather than uniform criteria; as future research, a quantitative approach to the criteria is needed.

A Study on Searching Stabled EMI Shielding Effectiveness Measurement Point for Military Communication Shelter Using Support Vector Machine and Process Capability Analysis (서포트 벡터 머신과 공정능력분석을 이용한 군 통신 쉘터의 EMI 차폐효과 안정 포인트 탐색 연구)

  • Ku, Ki-Beom;Kwon, Jae-Wook;Jin, Hong-Sik
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.2 / pp.321-328 / 2019
  • A military shelter for communication and information is necessary to optimize the integrated combat ability of weapon systems in network-centric warfare. The military shelter is therefore required to provide EMI shielding performance. This study examines the stable measurement points for the EMI shielding effectiveness of a military shelter for communication and information. The measurement points were found by analyzing the EMI shielding effectiveness measurement data with a data mining technique and with process capability analysis. First, a support vector machine was used to separate the measurement points that have stable EMI shielding effectiveness according to a set condition. Second, the same separation was conducted with process capability analysis. Finally, the results of the data mining technique were compared with those of the process capability analysis. As a result, 24 measurement points with stable EMI shielding effectiveness were found.
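The process capability side of this comparison can be sketched with the one-sided lower capability index, C_pl = (mean - LSL) / (3 * sigma), which fits a quantity such as shielding effectiveness that only has a lower specification limit. Everything concrete here is an assumption for illustration: the readings, the lower limit of 50 dB, and the 1.33 acceptance threshold (a common convention, not a value taken from the paper). The support vector machine step is not shown.

```python
from statistics import mean, stdev


def cpl(samples, lsl):
    """One-sided lower process capability index: (mean - LSL) / (3 * sigma).

    Uses the sample standard deviation; requires at least two samples
    that are not all identical.
    """
    return (mean(samples) - lsl) / (3 * stdev(samples))


def stable_points(measurements, lsl, threshold=1.33):
    """Return the IDs of measurement points whose C_pl meets the threshold."""
    return sorted(
        point for point, samples in measurements.items()
        if cpl(samples, lsl) >= threshold
    )


# Hypothetical shielding-effectiveness readings in dB per measurement point.
readings = {
    "P1": [62.0, 64.0, 66.0],  # tight spread, well above the limit
    "P2": [52.0, 60.0, 68.0],  # wide spread, sometimes close to the limit
}
stable = stable_points(readings, lsl=50.0)
```

A point with a high mean but a wide spread can still fail the index, which is why capability analysis separates "stable" points differently from a simple average.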

Research on no coal pillar protection technology in a double lane with pre-set isolation wall

  • Liu, Hui;Li, Xuelong;Gao, Xin;Long, Kun;Chen, Peng
    • Geomechanics and Engineering / v.27 no.6 / pp.537-550 / 2021
  • Various technical problems must be solved when constructing a pre-set isolation wall in a double lane in an outburst-prone mine. This study presents a methodology for pre-setting an isolation wall in a double lane without a coal pillar. This requires excavating two small-section roadways to form a wide-section roadway, followed by construction of the separation wall. During this process the connecting lane is reserved. To ensure the stability of the separation wall, the required bearing capacity of the isolation wall is 4.66 MN/m and the deformation of the isolation wall is approximately 25 cm. To reduce the difficulty of implementing support, the roadway is driven at 5 m/d. After the separation wall is constructed, the left-side coal wall is brushed by 1.5 m so that the width of the gas roadway reaches 2.5 m, and the roadway support uses an anchor rod, ladder beam, anchor cable beam, and net configuration. During construction, a concrete pump and a removable self-propelled hydraulic wall mold are used to pump and pour the concrete of the isolation wall. During mining, the stress distribution of the coal body and the isolation wall is measured on site. The results demonstrate that the deformation of the rock surrounding the roadway and the separation of the roof in the roadway are small. The stress of the bolt and anchor cable is within equipment tolerance, validating their selection. The roadway is well supported and the intended goal is achieved. The methodology can serve as a reference for gas control in similar mines.

Privacy Preserving Sequential Patterns Mining for Network Traffic Data (사이트의 접속 정보 유출이 없는 네트워크 트래픽 데이타에 대한 순차 패턴 마이닝)

  • Kim, Seung-Woo;Park, Sang-Hyun;Won, Jung-Im
    • Journal of KIISE:Databases / v.33 no.7 / pp.741-753 / 2006
  • As the total amount of network traffic data has been growing at an alarming rate, many studies are mining traffic data to obtain useful information. However, network users' privacy can be compromised during the mining process. In this paper, we propose an efficient and practical privacy-preserving sequential pattern mining method for network traffic data. To discover frequent sequential patterns without violating privacy, our method uses the N-repository server model and the retention replacement technique. In addition, our method accelerates the overall mining process by maintaining meta tables so as to quickly determine whether candidate patterns have ever occurred. Various experiments with real network traffic data revealed the efficiency of the proposed method.
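Retention replacement, as it is usually described, keeps each value with probability p and otherwise replaces it with a value drawn uniformly from the domain; the true frequency of a value can then be estimated from the perturbed data. The sketch below assumes that standard formulation and uniform replacement over the whole domain. The site names and parameters are hypothetical, and the N-repository server model and meta tables from the abstract are not modeled.

```python
import random


def perturb(values, domain, p, rng):
    """Keep each value with probability p; otherwise replace it with a
    value drawn uniformly at random from the domain."""
    return [v if rng.random() < p else rng.choice(domain) for v in values]


def estimate_fraction(observed_fraction, p, domain_size):
    """Estimate the true fraction f of a value from its observed fraction o.

    Under uniform replacement, o = p * f + (1 - p) / domain_size,
    so f = (o - (1 - p) / domain_size) / p.
    """
    return (observed_fraction - (1.0 - p) / domain_size) / p


# Hypothetical visited-site log and site domain.
domain = ["site_a", "site_b", "site_c", "site_d"]
log = ["site_a"] * 600 + ["site_b"] * 400
rng = random.Random(0)  # fixed seed so the sketch is repeatable
noisy = perturb(log, domain, p=0.7, rng=rng)
observed = noisy.count("site_a") / len(noisy)
estimated = estimate_fraction(observed, p=0.7, domain_size=len(domain))
```

The miner sees only `noisy`, yet `estimated` approximates the true share of `site_a` (0.6 here), which is what allows support counting without access to any individual's real value.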

Adaptive Data Mining Model using Fuzzy Performance Measures (퍼지 성능 측정자를 이용한 적응 데이터 마이닝 모델)

  • Rhee, Hyun-Sook
    • The KIPS Transactions:PartB / v.13B no.5 s.108 / pp.541-546 / 2006
  • Data mining is the process of finding hidden patterns inside a large data set. Cluster analysis is a popular technique for data mining. It is a fundamental process of data analysis and has been playing an important role in solving many problems in pattern recognition and image processing. If fuzzy cluster analysis is to make a significant contribution to engineering applications, much more attention must be paid to the fundamental decision on the number of clusters in the data. This is related to the cluster validity problem, that is, how well the clustering has identified the structure that is present in the data. In this paper, we design an adaptive data mining model using fuzzy performance measures. It discovers clusters through an unsupervised neural network model based on a fuzzy objective function and evaluates clustering results with a fuzzy performance measure. We also present experimental results on newsgroup data. They show that the proposed model can be used as a document classifier.
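The abstract does not define its fuzzy performance measure, so the sketch below substitutes a standard one to show what a fuzzy cluster-validity score looks like: the partition coefficient, the mean of squared memberships, which is 1.0 for a crisp partition and 1/c when every point belongs equally to all c clusters. Memberships are computed with the usual fuzzy c-means formula for fixed centers. The one-dimensional data, the centers, and the fuzzifier m = 2 are assumptions for illustration; the neural network model is not reproduced.

```python
def memberships(points, centers, m=2.0):
    """Fuzzy c-means membership of each 1-D point in each cluster center."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: full membership there.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [
                1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                for d_i in dists
            ]
        table.append(row)
    return table


def partition_coefficient(table):
    """Mean of squared memberships; higher means a crisper partition."""
    return sum(u * u for row in table for u in row) / len(table)


# Hypothetical data: compare two candidate sets of cluster centers.
points = [0.0, 1.0, 9.0, 10.0]
score_good = partition_coefficient(memberships(points, [0.5, 9.5]))
score_poor = partition_coefficient(memberships(points, [4.0, 6.0]))
```

Comparing such scores across candidate cluster counts or center placements is one way a model can adapt its choice of the number of clusters, the decision the abstract highlights.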

Aspect Mining Process Design Using Abstract Syntax Tree (추상구문트리를 이용한 어스팩트 마이닝 프로세스 설계)

  • Lee, Seung-Hyung;Song, Young-Jae
    • The Journal of the Korea Contents Association / v.11 no.5 / pp.75-83 / 2011
  • Aspect-oriented programming is a paradigm that extracts crosscutting concerns from a system and resolves the scattering of functions and tangling of code through software modularization. Existing aspect development methods have difficulty extracting a target area, so aspect mining is not easy to apply. Aspect mining needs a technique that converts the refactoring elements of an existing program into crosscutting areas. This paper proposes an aspect mining technique for extracting crosscutting concerns in a system. Using an abstract syntax structure specification, it extracts functionally duplicated relation elements. Through the Apriori algorithm, it can create a duplicated syntax tree and automatically create and optimize a duplicated source module, the target of the crosscutting area. When a module of Berkeley Yacc (berbose.c) was applied to the mining process, the length and volume of the program decreased by 9.47% compared with the original module, and decreased by 4.92% in length and 5.11% in volume compared with CCFinder.
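Finding duplicated elements through an abstract syntax tree can be sketched with Python's standard `ast` module as a stand-in; the paper works on a C module, and its Apriori step and module generation are not reproduced here. The function name and the sample source are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast
from collections import Counter


def duplicated_statements(source, min_count=2):
    """Return {statement text: occurrences} for statements whose syntax
    trees are structurally identical at least min_count times."""
    tree = ast.parse(source)
    # Consider simple statements only; skip whole function/class definitions.
    statements = [
        node for node in ast.walk(tree)
        if isinstance(node, ast.stmt)
        and not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    # ast.dump omits line numbers by default, so equal dumps mean equal structure.
    counts = Counter(ast.dump(node) for node in statements)
    first_seen = {}
    for node in statements:
        first_seen.setdefault(ast.dump(node), node)
    return {
        ast.unparse(first_seen[key]): count
        for key, count in counts.items()
        if count >= min_count
    }


# Hypothetical module in which a logging call is scattered across functions.
sample = '''
def pay(order):
    log("enter")
    return order.total

def ship(order):
    log("enter")
    return order.address
'''
duplicates = duplicated_statements(sample)
```

Statements that recur across otherwise unrelated functions, such as the logging call here, are the kind of scattered code that aspect mining treats as a candidate crosscutting concern.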