• Title/Summary/Keyword: Database Mining

Identification Process Variables and Process Improvement Using Data Mining (데이터마이닝을 이용한 공정변수 확인 및 공정개선)

  • Jeong, Young-Soo;Gang, Chang-Uk;Byeon, Seong-Kyu
    • Journal of Korean Society of Industrial and Systems Engineering / v.28 no.3 / pp.166-171 / 2005
  • As databases have grown, the volume of data on process variables and the manufacturing process has become too large for traditional statistical process control methods to identify the process variables related to assignable causes. Data mining is useful in this situation and provides a variety of approaches for improving the process. In this paper, we apply control charts to monitor the process; when assignable causes are detected, we apply the SVM technique and sequence pattern analysis to find the suspected process variables. These techniques make it possible to predict the behavior of process variables. We illustrate the proposed methods with real manufacturing process data.
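
The monitoring step in this abstract can be illustrated with a minimal sketch. The abstract does not say which control chart was used, so the individuals chart with 3-sigma limits below is an assumption, as are the function names and the sample data; the SVM and sequence pattern steps are not shown.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (LCL, center line, UCL) estimated from in-control baseline data."""
    center = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    return center - k * sigma, center, center + k * sigma

def out_of_control(samples, lcl, ucl):
    """Return the indices of samples that fall outside the control limits."""
    return [i for i, x in enumerate(samples) if x < lcl or x > ucl]

# Hypothetical measurements of one process variable.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
lcl, center, ucl = control_limits(baseline)

new_samples = [10.0, 10.1, 11.5, 9.9, 8.2]
signals = out_of_control(new_samples, lcl, ucl)  # indices 2 and 4 are flagged
```

A flagged index is only a signal that an assignable cause may be present; in the workflow the abstract describes, such signals would trigger the follow-up analysis of the suspected process variables.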

An Improved Rectangular Decomposition Algorithm for Data Mining (데이터 마이닝을 위한 개선된 직사각형 분해 알고리즘)

  • Song, Ji-Young;Im, Young-Hee;Park, Dai-Hee
    • The KIPS Transactions:PartB / v.10B no.3 / pp.265-272 / 2003
  • In this paper, we propose an improved algorithm for the rectangular decomposition technique, intended for data mining from large-scale databases in a dynamic environment. The proposed algorithm performs rectangular decomposition by transforming a binary matrix into a bipartite graph and finding bicliques in that graph. To demonstrate its effectiveness, we compare the proposed algorithm, which is based on newly derived mathematical properties, with other methods in terms of classification rate, number of rules, and complexity analysis.
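
The link between rectangles and bicliques can be sketched as follows. This is not the paper's algorithm, whose details the abstract does not give; it is a simple greedy heuristic under the assumption that a rectangle is a set of rows and a set of columns whose cells are all 1, which is exactly a biclique in the row-column bipartite graph. All function names are hypothetical.

```python
def to_bipartite(matrix):
    """Map each row index to the set of column indices that hold a 1."""
    return {r: {c for c, v in enumerate(row) if v} for r, row in enumerate(matrix)}

def rectangular_cover(matrix):
    """Greedily cover every 1 in the matrix with all-ones rectangles (bicliques)."""
    adj = to_bipartite(matrix)
    covered = set()
    rectangles = []
    # Visit rows with few ones first: their column sets are shared by more rows.
    for r, cols in sorted(adj.items(), key=lambda kv: len(kv[1])):
        if not cols or all((r, c) in covered for c in cols):
            continue
        # Every row whose ones include `cols` can join the biclique.
        rows = {q for q, qcols in adj.items() if cols <= qcols}
        rectangles.append((sorted(rows), sorted(cols)))
        covered.update((q, c) for q in rows for c in cols)
    return rectangles

matrix = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]
rects = rectangular_cover(matrix)
# rects holds two rectangles: rows [1, 2] x column [2], and rows [0, 1] x columns [0, 1]
```

The heuristic always yields a valid cover, but it makes no claim to produce the minimum number of rectangles.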

Research Trends on Literature Reviews in Scopus Journals by Authors from Indonesia, Japan, South Korea, Vietnam, Singapore, and Malaysia: A Bibliometric Analysis from 2003 to 2022

  • Prakoso Bhairawa Putera;Amelya Gustina
    • Asian Journal of Innovation and Policy / v.12 no.3 / pp.304-322 / 2023
  • Text data mining ('big data methods') has been one of the most widely used approaches during the COVID-19 pandemic, in particular text data mining on the Scopus or Web of Science (WoS) databases. It is widely used to collect literature for later bibliometric analysis, which ultimately becomes a literature review article. In this article, we therefore reveal the publication trends of literature reviews in Scopus journals by authors from Indonesia, Japan, South Korea, Vietnam, Singapore, and Malaysia. The article has two essential parts: 1) a comparison of international publication trends and subject areas of literature review publications, and 2) a comparison of the top five authors, affiliations, source titles, and collaborating countries.

Association Rule Mining Considering Strategic Importance (전략적 중요도를 고려한 연관규칙 탐사)

  • Choi, Doug-Won;Shin, Jin-Gyu
    • Annual Conference of KIPS / 2007.05a / pp.443-446 / 2007
  • This paper presents a new association rule mining algorithm that reflects the strategic importance of associative relationships between items. The algorithm builds on the basic framework of the Apriori procedure and on the TSAA (transitive support association Apriori) procedure developed by Hyun and Choi for evaluating non-frequent itemsets. It considers the strategic importance (weight) of feature variables in the association rule mining process. Sample feature variables of strategic importance include profitability, marketing value, customer satisfaction, and frequency. A database of 730 transactions from a large-scale discount store was used to compare and verify the performance of the presented algorithm against the existing Apriori and TSAA algorithms. The results clearly indicated that the new algorithm produced substantially different association itemsets depending on the weights assigned to the strategic feature variables.
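
How item weights can change which itemsets are selected may be easier to see in a small sketch. The abstract does not state the weighting formula, so the definition below (support multiplied by the mean item weight) is an assumption, and the code enumerates itemsets by brute force rather than reproducing Apriori or TSAA. Item names and weights are made up.

```python
from itertools import combinations

def weighted_itemsets(transactions, weights, min_wsup, max_size=2):
    """Return itemsets whose weighted support reaches min_wsup.

    Weighted support here is support * mean item weight (an assumed definition).
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            count = sum(1 for t in transactions if set(itemset) <= t)
            support = count / n
            mean_weight = sum(weights[i] for i in itemset) / size
            wsup = support * mean_weight
            if wsup >= min_wsup:
                result[itemset] = wsup
    return result

# Hypothetical transactions and strategic weights (e.g. profitability).
transactions = [
    {"milk", "bread"},
    {"milk", "wine"},
    {"bread", "wine"},
    {"milk", "bread", "wine"},
]
weights = {"milk": 0.2, "bread": 0.2, "wine": 1.0}

selected = weighted_itemsets(transactions, weights, min_wsup=0.25)
# Every single item and every pair has the same plain support,
# yet only itemsets containing the heavily weighted "wine" are selected.
```

Brute-force enumeration is used because, under this weighting, an itemset can score higher than one of its subsets, so the usual Apriori pruning rule would not be safe without modification.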

Datawarehousing Technology as the Basis for Formulation of Database Marketing Strategy (데이터웨어하우징 기술을 이용한 DB 마케팅 전략에 관한 연구)

  • 조재희
    • The Journal of Information Technology and Database / v.6 no.1 / pp.103-114 / 1999
  • Marketing decision support systems rely on an underlying enterprise information infrastructure. In traditional business situations, a limited number of product lines and markets divided into large chunks were adequately served by existing management information systems. With the advent of an increasingly segmented focus on niche markets and individual customers, the demand for market information has grown exponentially. The practical solutions offered by such data warehousing tools as OLAP and data mining directly address this need, allowing organizations to discover new niches. Marketing decision support systems built on these foundations provide organizations with new avenues for creating specifically targeted marketing strategies and promotional campaigns. The contribution of this article lies in introducing a graphical framework for data warehousing applications. Based on prior research, the framework links data warehousing and database marketing. To illustrate the effectiveness of this approach, three case examples of successful database marketing conclude the paper.

A PROPOSAL OF SEMI-AUTOMATIC INDEXING ALGORITHM FOR MULTI-MEDIA DATABASE WITH USERS' SENSIBILITY

  • Mitsuishi, Takashi;Sasaki, Jun;Funyu, Yutaka
    • Proceedings of the Korean Society for Emotion and Sensibility Conference / 2000.04a / pp.120-125 / 2000
  • We propose a semi-automatic, dynamic indexing algorithm for multimedia databases (e.g., movie files, audio files), for which it is difficult to create indexes that express emotional or abstract content. The algorithm indexes data according to users' sensibility by using their histories of access to the database. We first categorize the data simply, then create a vector space of each user's interests (user model) from the history of which categories the accessed data belong to, and a vector space of each data item (title model) from the history of which users accessed it. By repeating this procedure, we can create suitable indexes that show the emotional content of each data item. In this paper, we define the recurrence formulas based on the proposed algorithm and show its effectiveness through simulation results.

Methodologies to Selecting Tunable Resources (튜닝 가능한 자원선택 방법론)

  • Kim, Hye-Sook;Oh, Jeong-Soek
    • Journal of Information Technology Applications and Management / v.15 no.1 / pp.271-282 / 2008
  • Database administrators must acquire considerable knowledge and expend great effort to keep system performance consistent. Many studies and commercial products have proposed principles, methods, and tools to alleviate this burden, resulting in DBMS automation that reduces administrator intervention. This paper suggests a resource selection method that estimates the status of the database system from workload characteristics and recommends tunable resources. Our method uses data-mining techniques to simplify the information used to assess DBMS status, enhance the accuracy of the selection model, and recommend tunable resources. To evaluate the method, instances were collected under TPC-C and TPC-W workloads, accuracy was calculated using 10-fold cross-validation, and our scheme was compared with a method that uses only the classification procedure without any simplification of the information. The results show that our method achieves over 90% accuracy and can perform tunable resource selection.
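
The evaluation procedure named in this abstract, 10-fold cross-validation, can be sketched in a few lines. The classifier below is a toy majority-label stand-in, not the paper's selection model, and the labels "buffer" and "cpu" are hypothetical resource names; folds are contiguous and unshuffled to keep the example deterministic.

```python
from collections import Counter

def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def cross_val_accuracy(samples, labels, fit, predict, k=10):
    """Train on k-1 folds, test on the held-out fold, and pool the accuracy."""
    correct = 0
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, samples[i]) == labels[i] for i in test)
    return correct / len(samples)

# Toy stand-in classifier: always predicts the majority label of the training fold.
def fit_majority(train_samples, train_labels):
    return Counter(train_labels).most_common(1)[0][0]

def predict_majority(model, sample):
    return model

samples = list(range(20))                 # placeholder workload instances
labels = ["buffer"] * 15 + ["cpu"] * 5    # hypothetical tunable resources
accuracy = cross_val_accuracy(samples, labels, fit_majority, predict_majority)
```

In practice the instances would usually be shuffled or stratified before splitting, so that each fold reflects the overall label distribution.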

TIME-VARIANT OUTLIER DETECTION METHOD ON GEOSENSOR NETWORKS

  • Kim, Dong-Phil;I, Gyeong-Min;Lee, Dong-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2008.10a / pp.410-413 / 2008
  • Outlier detection has been widely studied in geosensor networks. Recently, machine learning and data mining have been applied to outlier detection to build models that distinguish outliers based on an anchored criterion. However, existing methods have difficulty detecting outliers in incoming time-variant data, because outlier detection needs to monitor incoming data and classify irregular attacks. To solve this problem, we propose a time-variant outlier detection method that uses a 2-dimensional grid based on an unanchored criterion. In this paper, the method is applied to geosensor data to classify outliers efficiently. The proposed method can be used in applications such as network intrusion detection, stock market analysis, and detection of erroneous data in bank accounts.

Temporal Pattern Mining of Moving Objects considering Ambiguity (모호성을 고려한 이동 객체의 시간 패턴 탐사)

  • Lee, Yang-Woo;Lee, Jun-Wook;Kim, Ryong;Ryu, Geun-Ho
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.7-9 / 2002
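  • Location-based services are emerging as a new issue for the wireless Internet. Pattern mining of moving objects makes it possible to provide useful location-based services by discovering the temporal patterns of moving objects. Because moving objects move frequently over time, patterns must also be mined frequently to reflect recent trends. An approach that mines temporal patterns incrementally is therefore required. In this paper, we present the ambiguity that measured location data can have when mining the temporal patterns of moving objects. To handle this ambiguity in the pattern mining stage, we define three thresholds according to the cause of the ambiguity. We then present the structure of a temporal pattern mining procedure that takes these thresholds into account.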

Continuous Mining Over Append-Only Databases (추가전용 데이터베이스에 대한 연속 마이닝)

  • Jin, Long;Lee, Jun-Wook;Lee, Yang-Woo;Ryu, Keun-Ho
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.10-12 / 2002
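  • Recently, with the growing use of information systems in many new types of applications, continuous queries have become the focus of several research projects and are being actively studied. In particular, for time series, continuous neighbor queries that can respond quickly whenever a new value arrives, using a prediction model for future values and the FFT (Fast Fourier Transform), have already been studied. However, that work addresses neighbor queries only and is not very efficient at processing massive data. In this paper, we predict future values of a time series using a prediction model. We then propose a continuous range query process and system architecture that transforms the series with the DFT (Discrete Fourier Transform), builds an R*-tree, and quickly finds and returns similar time series whenever a new value arrives.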
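
The DFT-based similarity search described here can be sketched under some simplifying assumptions: a linear scan stands in for the R*-tree, the prediction model is omitted, and all names and data are hypothetical. The sketch relies on a standard property of the orthonormal DFT: the distance over a subset of coefficients never exceeds the true Euclidean distance, so filtering on it cannot dismiss a true match.

```python
import cmath
import math

def dft_features(series, n_coeffs=2):
    """First n_coeffs coefficients of the orthonormal DFT of the series."""
    n = len(series)
    feats = []
    for k in range(n_coeffs):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        feats.append(coeff / math.sqrt(n))
    return feats

def feature_distance(a, b):
    """Euclidean distance between two complex feature vectors."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))

def range_query(database, query, radius, n_coeffs=2):
    """Return names of series within `radius` of the query (filter, then verify)."""
    q_feats = dft_features(query, n_coeffs)
    hits = []
    for name, series in database.items():
        lower_bound = feature_distance(dft_features(series, n_coeffs), q_feats)
        if lower_bound > radius + 1e-9:   # small tolerance for rounding error
            continue                      # true distance must exceed radius too
        if math.dist(series, query) <= radius:
            hits.append(name)
    return hits

database = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [1.0, 2.0, 3.0, 5.0],
    "c": [10.0, 10.0, 10.0, 10.0],
}
hits = range_query(database, [1.0, 2.0, 3.0, 4.0], radius=1.5)
```

In a system like the one the abstract outlines, the low-dimensional feature vectors would be stored in a spatial index so that the filter step avoids scanning every series.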
