• Title/Abstract/Keywords: Distributed data mining

110 search results (processing time: 0.022 seconds)

Discovery Temporal Association Rules in Distributed Database

  • Yan Zhao;Kim, Long;Sungbo Seo;Ryu, Keun-Ho
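    • Korean Institute of Information Scientists and Engineers: Conference Proceedings / Proceedings of the 2004 Spring Conference of the Korean Institute of Information Scientists and Engineers, Vol.31 No.1 (B) / pp.115-117 / 2004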
  • Recently, mining for association rules in distributed database environments has become a central problem in the knowledge discovery area. Because the data are located on different shared-nothing machines and each data site grows over time, mining global frequent itemsets is hard and inefficient across a large number of distributed servers. In many distributed databases, the time component (which is usually attached to transactions in the database) contains meaningful time-related rules. In this paper, we design a new DTA (distributed temporal association) algorithm that combines temporal concepts with distributed association rules. The algorithm confirms the time interval for applying association rules in distributed databases. The experimental results show that DTA can generate interesting correlated frequent itemsets related to time periods.

Intelligent Distributed Platform using Mobile Agent based on Dynamic Group Binding

  • Mateo, Romeo;Lee, Jae-Wan
    • Journal of the Korean Society for Internet Information / Vol. 8, No. 3 / pp.131-143 / 2007
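  • Today, information technology and intelligent systems use data mining techniques to find patterns and extract rules from distributed databases. Rules extracted by data mining in a distributed environment can be used for dynamic replication, adaptive load balancing, and other techniques. However, transmitting large volumes of data can introduce errors and lead to unreliable results. This paper proposes an intelligent distributed platform based on dynamic group binding using mobile agents. A classification algorithm is implemented for efficient object search through the group service. The intelligent model uses the extracted rules for dynamic replication. A data mining agent extracts rules from the service node database, and a data compression agent compresses its data. Before transmitting data, the proposed algorithm performs a preprocessing step that merges infrequent data sets using a neuro-fuzzy classifier. Simulations were performed for object group classification, service node database mining, data compression, and rule extraction. The experimental results on efficient data compression and reliable rule extraction showed that the proposed algorithm outperforms other methods in these respects.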

A New Approach to Web Data Mining Based on Cloud Computing

  • Zhu, Wenzheng;Lee, Changhoon
    • Journal of Computing Science and Engineering / Vol. 8, No. 4 / pp.181-186 / 2014
  • Web data mining aims at discovering useful knowledge from various Web resources. There is a growing trend among companies, organizations, and individuals alike of gathering information through Web data mining and using that information in their best interest. In science, cloud computing is a synonym for distributed computing over a network; it relies on the sharing of resources to achieve coherence and economies of scale, similar to a utility over a network, and means the ability to run a program or application on many connected computers at the same time. In this paper, we propose a new system framework based on the Hadoop platform to collect useful information from Web resources. The system framework is based on the Map/Reduce programming model of cloud computing. We propose a new data mining algorithm to be used in this system framework. Finally, we demonstrate the feasibility of this approach through simulation experiments.
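
The Map/Reduce programming model mentioned in this abstract can be illustrated with a small single-process sketch. This is not the authors' algorithm and it does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample pages are assumptions made purely for illustration. It counts term occurrences across a handful of Web pages.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (term, 1) for every whitespace-separated term in a page."""
    for term in text.lower().split():
        yield term, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(term: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the partial counts for one term."""
    return term, sum(counts)


def run_job(pages: Dict[str, str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence inside one process."""
    mapped = [
        pair
        for doc_id, text in pages.items()
        for pair in map_phase(doc_id, text)
    ]
    return dict(reduce_phase(term, counts) for term, counts in shuffle(mapped).items())


# Hypothetical sample input: page identifier -> page text.
pages = {"page1": "web data mining", "page2": "data mining on hadoop"}
term_counts = run_job(pages)
print(term_counts)
```

In a real cluster the map and reduce steps would run on different machines and the shuffle would move data over the network; the sketch only shows how the three steps fit together.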

A Prototyping Framework of the Documentation Retrieval System for Enhancing Software Development Quality

  • Chang, Wen-Kui;Wang, Tzu-Po
    • International Journal of Quality Innovation / Vol. 2, No. 2 / pp.93-100 / 2001
  • This paper presents a prototyping framework for a documentation-standards retrieval system that uses a data mining approach to enhance software development quality. We first present an approach for designing a retrieval algorithm based on data mining, applying its three basic technologies (machine learning, statistics, and database management) to reduce search time and increase the fitness of the results. This approach derives from the observation that data mining can discover unsuspected relationships among elements in large databases. This observation suggests that data mining can be used to elicit new knowledge about the design of a subject system and that it can be applied efficiently to large legacy systems. Finally, software development quality improves when project managers retrieve the documentation standards.

A Study on the Data Mining Preprocessing Tool For Efficient Database Marketing

  • Lee, Jun-Seok
    • Journal of Digital Convergence / Vol. 12, No. 11 / pp.257-264 / 2014
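  • For efficient database marketing, there is a growing need for data mining that can segment customers and discover new knowledge. Building a data mining tool requires stage-by-stage implementation; in this study, we constructed a data preprocessing tool for data mining that is adaptable to distributed environments. We reviewed the preprocessing components of the existing data mining tools AnswerTree, Clementine, Enterprise Miner, Kensington, and Weka, and constructed a data mining preprocessing tool that can be used efficiently in a distributed environment. The newly proposed system is based on Enterprise JavaBeans and XML.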

RFID-based Supply Chain Process Mining for Imported Beef

  • Kang, Yong-Shin;Lee, Kyounghun;Lee, Yong-Han;Chung, Ku-Young
    • Korean Journal for Food Science of Animal Resources / Vol. 33, No. 4 / pp.463-473 / 2013
  • Through the development of efficient data collecting technologies like RFID, and inter-enterprise collaboration platforms such as web services, companies that participate in supply chains can acquire visibility over the whole supply chain and can make decisions to optimize the overall supply chain networks and processes, based on knowledge extracted from historical data collected by the visibility system. Although not currently active, the MeatWatch system has been developed, and is used in part for this purpose, in the imported beef distribution network in Korea. However, the imported beef distribution network is too complicated for its various aspects to be analyzed using ordinary process analysis approaches. In this paper, we suggest a novel approach, called RFID-based supply chain process mining, to automatically discover and analyze the overall supply chain processes from the distributed RFID event data, without any prior knowledge. The proposed approach was implemented and validated using a case study of the imported beef distribution network in Korea. Specifically, we demonstrated that the proposed approach can be successfully applied to discover supply chain networks from the distributed event data, to simplify the supply chain networks, and to analyze anomalies in the distribution networks. Such novel process mining functionalities can reinforce the capability of traceability services like MeatWatch in the future.
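
As a rough illustration of discovering a supply chain network from event data, the sketch below builds a directly-follows relation, a common first step in process discovery. It is not the method of the paper above: the event schema `(tag_id, timestamp, location)`, the function name `directly_follows`, and the sample events are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical event schema: (tag_id, timestamp, location).
Event = Tuple[str, int, str]


def directly_follows(events: List[Event]) -> Counter:
    """Count how often one location is immediately followed by another
    within the trace of a single RFID tag."""
    traces: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for tag_id, timestamp, location in events:
        traces[tag_id].append((timestamp, location))

    edges: Counter = Counter()
    for steps in traces.values():
        steps.sort()  # order each tag's events by timestamp
        locations = [location for _, location in steps]
        for source, target in zip(locations, locations[1:]):
            edges[(source, target)] += 1
    return edges


# Hypothetical events, as if merged from several distributed readers.
events = [
    ("tag-1", 1, "importer"), ("tag-1", 2, "wholesaler"), ("tag-1", 3, "retailer"),
    ("tag-2", 1, "importer"), ("tag-2", 4, "retailer"),
    ("tag-3", 2, "importer"), ("tag-3", 5, "wholesaler"),
]
dfg = directly_follows(events)
for (source, target), count in sorted(dfg.items()):
    print(f"{source} -> {target}: {count}")
```

Edges with low counts, such as a direct importer-to-retailer hop in this toy data, are the kind of path one might inspect when simplifying the network or looking for anomalies.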

Violation Pattern Analysis for Good Manufacturing Practice for Medicine using t-SNE Based on Association Rule and Text Mining

  • Lee, Jun-O;Sohn, So-Young
    • Journal of Korean Society for Quality Management / Vol. 50, No. 4 / pp.717-734 / 2022
  • Purpose: The purpose of this study is to effectively detect simultaneous violations of Good Manufacturing Practice that were concealed by drug manufacturers. Methods: In this study, we present a framework for analyzing regulatory violation patterns using Association Rule Mining (ARM), Text Mining, and t-distributed Stochastic Neighbor Embedding (t-SNE) to increase the effectiveness of on-site inspection. Results: A number of simultaneous violation patterns were discovered by applying Association Rule Mining to FDA's inspection data collected from October 2008 to February 2022. Among them were 'concurrent violation patterns' derived from similar regulatory ranges of two or more regulations. These patterns do not help to predict violations that appear simultaneously but belong to different regulations. Those unnecessary patterns were excluded by applying t-SNE based on text mining. Conclusion: Our proposed approach enables the recognition of simultaneous violation patterns during on-site inspection. It is expected to decrease detection time by increasing the likelihood of finding intentionally concealed violations.
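
The Association Rule Mining step named in this abstract rests on three standard measures: support, confidence, and lift. The sketch below computes them for pairwise rules only; it is a simplified illustration, not the authors' framework, and the regulation codes, thresholds, and the function name `pair_rules` are assumptions.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Rule = Tuple[str, str, float, float, float]  # (lhs, rhs, support, confidence, lift)


def pair_rules(
    transactions: List[Set[str]], min_support: float, min_confidence: float
) -> List[Rule]:
    """Return rules lhs -> rhs over item pairs that meet both thresholds."""
    n = len(transactions)
    item_count: Dict[str, int] = {}
    pair_count: Dict[Tuple[str, str], int] = {}
    for transaction in transactions:
        for item in transaction:
            item_count[item] = item_count.get(item, 0) + 1
        for a, b in combinations(sorted(transaction), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules: List[Rule] = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of inspections containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]  # P(rhs | lhs)
            if confidence >= min_confidence:
                lift = confidence / (item_count[rhs] / n)  # confidence / P(rhs)
                rules.append((lhs, rhs, support, confidence, lift))
    return rules


# Hypothetical inspections, each a set of violated regulation codes.
inspections = [
    {"reg-A", "reg-B"},
    {"reg-A", "reg-B", "reg-C"},
    {"reg-A", "reg-C"},
    {"reg-B"},
]
rules = pair_rules(inspections, min_support=0.5, min_confidence=0.9)
for lhs, rhs, support, confidence, lift in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

With this toy data, only the rule from `reg-C` to `reg-A` passes both thresholds. A lift above 1 indicates that the two violations co-occur more often than they would if they were independent.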

A Design of SOA-based Data Integration Framework for Effective Spatial Data Mining

  • Moon, Il-Hwan;Heo, Hwan;Kim, Sam-Keun
    • The KIPS Transactions: Part D / Vol. 18D, No. 5 / pp.385-392 / 2011
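  • Recently, research on agriculture-IT convergence technology, which applies IT to agriculture, has been attracting attention. In particular, studies have sought to reduce damage from natural disasters and raise crop productivity through crop-related prediction services that use spatial data mining (SDM). However, preparing the training data that SDM needs for prediction services incurs high cost and time in data transformation and integration because of heterogeneity among the distributed data. In addition, computing spatial neighborhood relationships between spatial and non-spatial data requires complex operations over large volumes of data. In this paper, we propose an SOA-based data integration framework that treats each data source as a service unit, which makes it possible to integrate and manage distributed heterogeneous data effectively and improves the productivity of training data for SDM, thereby supporting the discovery of optimal prediction services. Experiments confirmed that the proposed framework can be applied effectively to discover the optimal prediction service for areas of freeze damage to peach trees in Icheon, Gyeonggi Province.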

The HCARD Model using an Agent for Knowledge Discovery

  • Gerardo Bobby D.;Lee Jae-Wan;Joo Su-Chong
    • Journal of the Korea Association of Information Systems: The Journal of Information Systems / Vol. 14, No. 3 / pp.53-58 / 2005
  • In this study, we employ a multi-agent approach for the search and extraction of data in a distributed environment. We use an Integrator Agent in the proposed model for Hierarchical Clustering and Association Rule Discovery (HCARD). HCARD addresses the inadequate processing performance and efficiency of other data mining tools when used for knowledge discovery. The Integrator Agent was developed on the CORBA architecture to search for and extract data from heterogeneous servers in the distributed environment. Our experiment shows that HCARD generated essential association rules that can be practically explained for decision-making purposes. Processing time was shorter when the clusters were computed with HCARD than when the rules were computed without it.

Model test on slope deformation and failure caused by transition from open-pit to underground mining

  • Zhang, Bin;Wang, Hanxun;Huang, Jie;Xu, Nengxiong
    • Geomechanics and Engineering / Vol. 19, No. 2 / pp.167-178 / 2019
  • Open-pit (OP) and underground (UG) mining are usually used to exploit shallow and deep ore deposits, respectively. When an ore deposit starts in the shallow subsurface and extends to a great depth, sequential use of OP and UG mining is an efficient and economical way to maintain mining productivity. However, a transition from OP to UG mining could induce significant rock movements that cause slope instability in the open pit. Based on the Yanqianshan Iron Mine, which was in transition from OP to UG mining, a large-scale two-dimensional (2D) model test was built according to similarity theory. Thereafter, UG mining was carried out to mimic the transition from OP to UG mining, to disclose the triggered rock movement, and to assess the associated slope instability. By jointly using three-dimensional (3D) laser scanning, distributed fiber optics, and digital photogrammetry measurement, the deformations, movements, and strains of the rock slope during mining were monitored. The obtained data showed that the transition from OP to UG mining led to significant slope movements and deformations that can trigger catastrophic slope failure. The progressive movement of the slope could be divided into three stages: onset of micro-fracture, propagation of tensile cracks, and overturning and/or sliding of slopes. The failure mode depended on the orientation of the structural joints of the rock mass as well as the formation of tension cracks. This study also showed that these non-contact monitoring technologies are valid methods for acquiring interior strain and external deformation with high precision.