• Title/Abstract/Keywords: Database Mining

Search results: 572 items (processing time 0.023 seconds)

Algorithm mining Association Rules by considering Weight Support

  • 김근형;황병웅;김민철
    • KIPS Transactions: Part D / Vol. 11D, No. 3 / pp.545-552 / 2004
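  • Association rule mining, one of the data mining techniques, targets data that appear frequently in a database and are strongly associated with each other. However, even rare data that do not appear frequently can have significant value as business information if they are important, highly weighted data and are strongly associated with each other. This paper proposes an association rule mining algorithm that can discover data that appear rarely in a database but carry important meaning and are highly associated with each other. Evaluating the proposed algorithm through simulation showed that it efficiently mines association rules among rare but important data.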
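
A minimal sketch of how weighting can let a rare but important itemset pass a threshold that plain support would fail. The abstract does not state the paper's formula, so the definition used here (mean item weight multiplied by ordinary support) is an assumption, as are the item names, the weights, and every function name.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)

def weighted_support(itemset, transactions, weights):
    """Assumed definition: mean item weight times ordinary support."""
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return mean_weight * support(itemset, transactions)

def weighted_frequent_pairs(transactions, weights, min_wsup):
    """Exhaustively score every item pair; suitable for a toy example only."""
    items = sorted(set().union(*transactions))
    scored = {
        pair: weighted_support(pair, transactions, weights)
        for pair in combinations(items, 2)
    }
    return {pair: ws for pair, ws in scored.items() if ws >= min_wsup}

# Hypothetical data: "diamond" and "watch" are rare but heavily weighted.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "milk"},
    {"diamond", "watch"},
    {"diamond", "watch", "bread"},
]
weights = {"bread": 0.1, "milk": 0.1, "butter": 0.1, "diamond": 1.0, "watch": 1.0}

rare_pairs = weighted_frequent_pairs(transactions, weights, min_wsup=0.2)
```

With these made-up numbers, the pair (bread, milk) has the highest plain support but a low weighted support, while (diamond, watch) is the only pair that clears the weighted threshold.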

A Personalized Recommender System, WebCF-PT: A Collaborative Filtering using Web Mining and Product Taxonomy

  • 김재경;안도현;조윤호
    • Asia pacific journal of information systems / Vol. 15, No. 1 / pp.63-79 / 2005
  • Recommender systems are a personalized information filtering technology that helps customers find the products they would like to purchase. Collaborative filtering (CF) is known to be the most successful recommendation technology, but its widespread use has exposed problems such as sparsity and scalability in the e-business environment. In this paper, we propose a recommendation system, WebCF-PT, based on Web usage mining and product taxonomy, to enhance the recommendation quality and the system performance of traditional CF-based recommender systems. Web usage mining populates the rating database by tracking customers' shopping behaviors on the Web, leading to better quality recommendations. The product taxonomy is used to improve the performance of searching for nearest neighbors through dimensionality reduction of the rating database. A prototype recommendation system, WebCF-PT, is developed, and an Internet shopping mall, EBIB (e-Business & Intelligence Business), is constructed to test the WebCF-PT system.
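
The dimensionality-reduction idea in the abstract can be illustrated roughly as follows: product-level ratings are rolled up to taxonomy categories before the nearest-neighbor search, which shortens and densifies each customer vector. This is a sketch under assumptions, not the WebCF-PT implementation; the taxonomy, the ratings, the use of cosine similarity, and all names are hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical product taxonomy: product -> category.
taxonomy = {
    "jeans": "clothing", "shirt": "clothing",
    "novel": "books", "atlas": "books",
    "phone": "electronics",
}

# Hypothetical implicit ratings derived from Web usage (e.g. clicks, purchases).
ratings = {
    "ann": {"jeans": 1.0, "novel": 1.0},
    "bob": {"shirt": 1.0, "atlas": 1.0},
    "cat": {"phone": 1.0},
}

def to_category_profile(product_ratings, taxonomy):
    """Collapse product-level ratings into a shorter category-level vector."""
    profile = defaultdict(float)
    for product, value in product_ratings.items():
        profile[taxonomy[product]] += value
    return dict(profile)

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def nearest_neighbors(target, ratings, taxonomy, k=1):
    """Rank the other customers by similarity of their category profiles."""
    profiles = {c: to_category_profile(r, taxonomy) for c, r in ratings.items()}
    others = [c for c in profiles if c != target]
    others.sort(key=lambda c: cosine(profiles[target], profiles[c]), reverse=True)
    return others[:k]
```

In this toy data, "ann" and "bob" share no products, so their product-level similarity is zero, yet their category profiles coincide and `nearest_neighbors("ann", ratings, taxonomy)` ranks "bob" first. That is the sparsity benefit the abstract alludes to.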

Data Mining using ID3

  • 석현태
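    • Proceedings of the Korea Institute of Convergence Signal Processing Conference / Proceedings of the Korea Institute of Signal Processing and Systems 2003 Summer Conference / pp.38-41 / 2003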
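  • Many kinds of algorithms are currently used for data mining worldwide, but data mining results cannot be interpreted correctly without an accurate understanding of the algorithm being used. From this perspective, this paper covers the principle of ID3, one of the important decision tree induction algorithms, and presents a method for generating training examples and a method for handling continuous values so that ID3 can be applied successfully to relational databases, the most widely used databases in the real world.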
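
As a rough illustration of the ID3 principle mentioned above, the sketch below computes entropy and information gain for a categorical attribute, and picks a binary threshold for a continuous attribute by testing midpoints between sorted values. The midpoint approach is a commonly used technique and is an assumption here; the abstract does not say how the paper handles continuous values. All function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on a categorical attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_threshold(values, labels):
    """Test midpoints between sorted distinct values; return (threshold, gain)."""
    base = entropy(labels)
    distinct = sorted(set(values))
    best = (None, 0.0)
    for low, high in zip(distinct, distinct[1:]):
        t = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        split = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if base - split > best[1]:
            best = (t, base - split)
    return best
```

ID3 would choose, at each node, the attribute with the largest gain; a continuous column can take part in that comparison through the gain of its best threshold.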

TEMPORAL CLASSIFICATION METHOD FOR FORECASTING LOAD PATTERNS FROM AMR DATA

  • Lee, Heon-Gyu;Shin, Jin-Ho;Ryu, Keun-Ho
    • The Korean Society of Remote Sensing: Conference Proceedings / The Korean Society of Remote Sensing, 2007, Proceedings of ISRS 2007 / pp.594-597 / 2007
  • We present in this paper a novel mid- and long-term power load prediction method using temporal pattern mining from AMR (Automatic Meter Reading) data. Since power load patterns have time-varying characteristics and differ greatly according to the hour, time, day, week, and so on, using only traditional data mining gives rise to uninformative results. Also, research on data mining for analyzing electric load patterns has focused on cluster analysis and classification methods. However, despite the usefulness of rules that include a temporal dimension and the fact that AMR data have temporal attributes, the above methods were limited to static pattern extraction and did not consider temporal attributes. Therefore, we propose a new classification method for predicting power load patterns. The main tasks are clustering and temporal classification. Cluster analysis is used to create load pattern classes and the representative load profile for each class. Next, the classification method uses the representative load profiles to build a classifier able to assign different load patterns to the existing classes. The proposed classification method is calendar-based temporal mining, which discovers electric load patterns in multiple time granularities. Lastly, we show that the proposed method, applied to AMR data, discovers more interesting patterns.

Schema- and Data-driven Discovery of SQL Keys

  • Le, Van Bao Tran;Sebastian, Link;Mozhgan, Memari
    • Journal of Computing Science and Engineering / Vol. 6, No. 3 / pp.193-206 / 2012
  • Keys play a fundamental role in all data models. They allow database systems to uniquely identify data items, and therefore promote efficient data processing in many applications. For this reason, support is required for discovering keys, including keys that are semantically meaningful for the application domain and keys that are satisfied by a given database. We study the discovery of keys from SQL tables. We investigate the structural and computational properties of Armstrong tables for sets of SQL keys. Inspections of Armstrong tables enable data engineers to consolidate their understanding of semantically meaningful keys, and to communicate this understanding to other stakeholders. The stakeholders may want to make changes to the tables or provide entirely different tables to communicate their views to the data engineers. For such a purpose, we propose data mining algorithms that discover keys from a given SQL table. We combine the key mining algorithms with Armstrong table computations to generate informative Armstrong tables, that is, key-preserving semantic samples of existing SQL tables. Finally, we define formal measures to assess the distance between sets of SQL keys. The measures can be applied to validate the usefulness of Armstrong tables, and to automate the marking and feedback of non-multiple choice questions in database courses.
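
To make "discovering keys that are satisfied by a given table" concrete, here is a brute-force sketch that enumerates column sets by increasing size and keeps the minimal ones on which no two rows collide. It is not the paper's mining algorithm. The NULL handling (rows with a NULL in a candidate column never cause a violation, as with SQL UNIQUE constraints) is an assumption about the intended semantics, and the table and all names are hypothetical.

```python
from itertools import combinations

def satisfies_key(rows, columns):
    """True if no two rows agree on all `columns` with non-NULL values.

    Assumption: a row containing None (NULL) in any candidate column never
    causes a violation, mirroring how SQL UNIQUE constraints treat NULLs.
    """
    seen = set()
    for row in rows:
        value = tuple(row[c] for c in columns)
        if None in value:
            continue
        if value in seen:
            return False
        seen.add(value)
    return True

def discover_minimal_keys(rows, all_columns):
    """Return every minimal column set that the given rows satisfy as a key."""
    minimal = []
    for size in range(1, len(all_columns) + 1):
        for candidate in combinations(all_columns, size):
            # Supersets of an already found key cannot be minimal.
            if any(set(k) <= set(candidate) for k in minimal):
                continue
            if satisfies_key(rows, candidate):
                minimal.append(candidate)
    return minimal

# Hypothetical table; None stands for SQL NULL.
columns = ("emp_id", "email", "dept")
rows = [
    {"emp_id": 1, "email": "a@x.org", "dept": "R&D"},
    {"emp_id": 2, "email": None, "dept": "R&D"},
    {"emp_id": 3, "email": None, "dept": "Ops"},
    {"emp_id": 4, "email": "d@x.org", "dept": "Ops"},
]
keys = discover_minimal_keys(rows, columns)
```

The enumeration is exponential in the number of columns, so it only serves to show what the output of key discovery looks like. Note also that a key satisfied by the current rows need not be semantically meaningful, which is the gap the abstract says Armstrong tables help to close.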

Temporal Classification Method for Forecasting Power Load Patterns From AMR Data

  • Lee, Heon-Gyu;Shin, Jin-Ho;Park, Hong-Kyu;Kim, Young-Il;Lee, Bong-Jae;Ryu, Keun-Ho
    • Korean Journal of Remote Sensing / Vol. 23, No. 5 / pp.393-400 / 2007
  • We present in this paper a novel power load prediction method using temporal pattern mining from AMR (Automatic Meter Reading) data. Since power load patterns have time-varying characteristics and differ greatly according to the hour, time, day, week, and so on, using only traditional data mining gives rise to uninformative results. Also, research on data mining for analyzing electric load patterns has focused on cluster analysis and classification methods. However, despite the usefulness of rules that include a temporal dimension and the fact that AMR data have temporal attributes, the above methods were limited to static pattern extraction and did not consider temporal attributes. Therefore, we propose a new classification method for predicting power load patterns. The main tasks are clustering and temporal classification. Cluster analysis is used to create load pattern classes and the representative load profile for each class. Next, the classification method uses the representative load profiles to build a classifier able to assign different load patterns to the existing classes. The proposed classification method is calendar-based temporal mining, which discovers electric load patterns in multiple time granularities. Lastly, we show that the proposed method, applied to AMR data, discovers more interesting patterns.

An Efficient Data Mining Algorithm based on the Database Characteristics

  • 박지현;고찬
    • Journal of the Korean Society for Industrial and Applied Mathematics / Vol. 10, No. 1 / pp.107-119 / 2006
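  • With the development of Internet and Web technologies, the amount of data accumulated in databases is increasing rapidly. As the range of database applications expands, research on data mining techniques for discovering useful knowledge from large databases is being actively pursued. Most existing algorithms have evolved by reducing the candidate itemsets while also reducing the size of the database. However, efforts to reduce the candidate itemsets or the size of the database are not needed throughout the entire process of generating frequent itemsets, because such methods save time in some phases but take more time to apply in others. This paper proposes and tests an effective association rule mining algorithm for transaction databases in which transactions are short or the number of distinct items is relatively small. Using hashing, it scans the database once while storing every possible subset of each transaction in a hash table, so that it is unaffected by the minimum support and discovers frequent itemsets in less time than existing algorithms.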
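
The single-scan idea described in the abstract can be sketched as follows, with a Python `Counter` standing in for the hash table. This is a simplified reading of the abstract, not the authors' implementation, and the function names and sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations

def count_all_subsets(transactions):
    """Scan once, hashing every non-empty subset of each transaction."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1
    return counts

def frequent_itemsets(counts, n_transactions, min_support):
    """Filter the precomputed table; a new threshold needs no rescan."""
    return {s: c for s, c in counts.items() if c / n_transactions >= min_support}

# Hypothetical short transactions.
transactions = [("a", "b", "c"), ("a", "b"), ("a", "c"), ("b",)]
counts = count_all_subsets(transactions)
```

Because every subset is counted up front, changing `min_support` only changes the final filter, which matches the claim of being unaffected by the minimum support. A transaction with n items contributes 2^n - 1 subsets, which is why the abstract restricts the approach to short transactions or small item vocabularies.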

Implementation of a Web-Based Intelligent Decision Support System for Apartment Auction

  • 나민영;이현호
    • The Transactions of the Korea Information Processing Society / Vol. 6, No. 11 / pp.2863-2874 / 1999
  • Apartment auction is a system that citizens use to obtain housing. This paper deals with the implementation of a web-based intelligent decision support system that uses OLAP and data mining techniques to support auction decisions. The implemented decision support system works on a real auction database and is mainly composed of the OLAP Knowledge Extractor, based on a data warehouse, and the Auction Data Miner, based on data mining methodology. The OLAP Knowledge Extractor extracts the required knowledge from the auction database and visualizes it. The OLAP technique uses facts, dimensions, and hierarchies to provide the results of data analysis by means of roll-up, drill-down, slicing, dicing, and pivoting. The Auction Data Miner predicts a successful bid price by applying classification to the auction database. The Miner is based on the lazy model-based classification algorithm and applies concepts such as decision fields, dynamic domain information, and a field weighted function to this algorithm to reflect the characteristics of the auction database.

3D Cube Mining and Calendar Pattern Based Temporal Mining for Analyzing Power Load Pattern

  • 박진형;신진호;이헌규;류근호
    • Korea Information Processing Society: Conference Proceedings / Korea Information Processing Society 2008 Spring Conference / pp.200-203 / 2008
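  • Recently, load forecasting has become an important factor in planning the management policies of power companies, owing to fluctuations in energy prices, supply, and demand in the power industry, as well as changes in climate. The techniques we propose in this paper for optimal operation planning of the power system are 3D cube mining, which enables multidimensional analysis, and calendar-based temporal data mining, which enables pattern prediction as time changes. Together they make possible the multidimensional analysis of load data from a wireless load monitoring system and the prediction of different load patterns over time.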

Data mining and Copyright

  • Kim, Kyungsuk
    • International Journal of Internet, Broadcasting and Communication / Vol. 14, No. 4 / pp.11-19 / 2022
  • Data mining has broad applications that reach beyond scholarly and scientific research; internet search engine services, for example, are a commonly used form of Text and Data Mining ('TDM') of websites. Exceptions and limitations for data mining provide a competitive advantage in the global race for policy innovation because they permit researchers to conduct computational analysis (TDM) on any materials to which they have access. For this purpose, Japan and the EU added limitations on copyright to legalize some TDM research through amendments to copyright law, and U.S. copyright law has allowed data mining under the fair use provision. On the other hand, there are no explicit exceptions and limitations for data mining under the Korean Copyright Act, and there are no cases that consider data mining to be fair use. We comparatively review exceptions and limitations on copyright, which will help to encourage AI-related business by allowing more data to be used smoothly in the mining process and more valuable information to be extracted.