• Title/Summary/Keyword: Knowledge-Based Data Mining


Design and Implementation of Spatial Characterization System using Density-Based Clustering (밀도 클러스터링을 이용한 공간 특성화 시스템 설계 및 구현)

  • You Jae-Hyun;Park Tae-Su;Ahn Chan-Min;Park Sang-Ho;Hong Jun-Sik;Lee Ju-Hong
    • Journal of the Korea Society of Computer and Information / v.11 no.2 s.40 / pp.43-52 / 2006
  • Recently, with growing interest in ubiquitous computing, knowledge discovery methods are needed that handle wide-ranging and varied forms of data efficiently and effectively. Spatial characterization, which extends earlier characterization methods by considering both spatial and non-spatial properties, makes it possible to find various forms of knowledge in a spatial region. Previous spatial characterization methods have two problems. First, the discovered knowledge cannot support multiple spatial analyses. Second, because the search is limited to the spatial region specified by the user, useful knowledge is not guaranteed to be found. This study therefore proposes a spatial characterization method that applies density-based clustering.
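The abstract does not say which density-based algorithm the authors use or how clusters are summarized. As a rough illustration only, the sketch below assumes a DBSCAN-style clustering over point coordinates and then counts a non-spatial attribute per cluster. The function names, the `eps` and `min_pts` parameters, and the attribute values are all hypothetical.

```python
from collections import Counter, defaultdict
from math import dist


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels


def characterize(labels, attributes):
    """Count non-spatial attribute values per cluster, ignoring noise."""
    summary = defaultdict(Counter)
    for label, attr in zip(labels, attributes):
        if label != -1:
            summary[label][attr] += 1
    return dict(summary)
```

Because the clusters come from the data's density rather than from a user-drawn region, the per-cluster attribute counts are not restricted to an area chosen in advance, which is the limitation the abstract raises.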


Data Mining for Business & Marketing Based on Customer (고객 중심의 기업 경영 및 마케팅을 위한 데이터 마이닝의 활용 : 멀티플렉스에 적용)

  • Donghan Chung;Wongil Choi;UngMo Kim
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.311-314 / 2008
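  • The business management and marketing environment has been changing rapidly in recent years. To gain an advantage over competitors, it is very important for a company to build and maintain relationships with its customers, and retaining existing customers is more beneficial to the company than acquiring new ones. To this end, business intelligence (BI) and customer relationship management (CRM) can be applied on the basis of data mining methods. In this paper, we apply this knowledge to a multiplex.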

High-Precision Search System Based on Text Mining (텍스트마이닝 기반 고정밀 검색시스템)

  • 안태성;서형국;이경일
    • Korea Information Processing Society Review / v.11 no.2 / pp.88-97 / 2004
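  • Over the past ten years, thanks to the popularization of the Internet, the World Wide Web and e-mail have become common means of conveying information. The Internet and the e-Business built on it continue to grow in importance as strategic tools for raising efficiency and productivity across all sectors of existing industry, and knowledge workers spend most of their working hours producing and searching for information and knowledge in the form of documents. New corporate information is constantly being registered and older material revised and updated, so diverse knowledge assets are continuously created and reused at countless companies around the world. However, only 20% of the information that companies create, store, and reuse consists of highly usable structured data; the remaining 80% consists of unstructured text, such as compound documents (word-processor files, e-mail, presentations, spreadsheets, PDF) and web pages [1]. (Remainder omitted)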


Inferring Undiscovered Public Knowledge by Using Text Mining-driven Graph Model (텍스트 마이닝 기반의 그래프 모델을 이용한 미발견 공공 지식 추론)

  • Heo, Go Eun;Song, Min
    • Journal of the Korean Society for information Management / v.31 no.1 / pp.231-250 / 2014
  • Due to the recent development of Information and Communication Technologies (ICT), the number of research publications has increased exponentially. In response to this rapid growth, demand has risen for automated text processing methods that can deal with massive amounts of text data. Biomedical text mining, which discovers hidden biological meanings and treatments in the biomedical literature, has become a pivotal methodology and helps medical disciplines reduce time and cost. Many researchers have conducted literature-based discovery studies to generate new hypotheses. However, existing approaches require either an intensive manual process or a semi-automatic procedure to find and select biomedical entities. In addition, they are limited to showing a single dimension, namely the cause-and-effect relationship between two concepts. This study therefore proposes a novel approach that discovers various relationships among source concepts, target concepts, and their intermediate concepts by expanding the intermediate concepts to multiple levels. The study offers a distinct perspective on literature-based discovery: it not only discovers meaningful relationships among concepts in the biomedical literature through graph-based path inference but also generates feasible new hypotheses.
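The abstract does not describe the graph model in detail. One way to picture multi-level intermediate concepts is as path enumeration over a concept graph, sketched below under that assumption. The adjacency-dict representation, the function name `find_paths`, and the placeholder concept labels are hypothetical and are not taken from the paper.

```python
def find_paths(graph, source, target, max_intermediates):
    """Enumerate simple paths source -> ... -> target that pass through
    between 1 and max_intermediates intermediate concepts.

    graph maps each concept to the set of concepts it is linked to.
    A direct source -> target edge is skipped, since it would be an
    already known relationship rather than an inferred one.
    """
    paths = []

    def extend(path):
        last = path[-1]
        for nxt in sorted(graph.get(last, ())):
            if nxt in path:
                continue  # keep paths simple (no repeated concepts)
            if nxt == target:
                if len(path) >= 2:  # at least one intermediate concept
                    paths.append(path + [nxt])
            elif len(path) - 1 < max_intermediates:
                extend(path + [nxt])

    extend([source])
    return paths


# Toy graph with placeholder concept names.
toy_graph = {
    "A": {"B1", "B2", "C"},
    "B1": {"C"},
    "B2": {"B3"},
    "B3": {"C"},
}
one_level = find_paths(toy_graph, "A", "C", max_intermediates=1)
two_levels = find_paths(toy_graph, "A", "C", max_intermediates=2)
```

With one intermediate level only the path through `B1` is found; allowing two levels also surfaces the longer chain through `B2` and `B3`, which a single-level model would miss.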

A Text Mining-based Intrusion Log Recommendation in Digital Forensics (디지털 포렌식에서 텍스트 마이닝 기반 침입 흔적 로그 추천)

  • Ko, Sujeong
    • KIPS Transactions on Computer and Communication Systems / v.2 no.6 / pp.279-290 / 2013
  • In digital forensics, log files are stored as large data sets for the purpose of tracing users' past behavior. It is difficult for investigators to analyze such large log data manually without clues. In this paper, we propose a text mining technique that extracts intrusion logs from a large log set in order to recommend reliable evidence to investigators. In the training stage, the proposed method preprocesses a training log set, extracts intrusion association words from it using the Apriori algorithm, and computes the probability of intrusion for each association word by combining support and confidence. Robinson's method of computing confidences for spam filtering is applied to the extraction of intrusion logs. As a result, an association word knowledge base is constructed that includes the intrusion probability weights of the association words, which improves accuracy. In the test stage, the probability that a log in the test set is an intrusion log and the probability that it is a normal log are each computed by Fisher's inverse chi-square classification algorithm using the association word knowledge base, and intrusion logs are extracted by combining the two results. The intrusion logs are then recommended to investigators. The proposed method uses a training procedure that clearly analyzes the meaning of data in large unstructured logs, which mitigates the loss of accuracy caused by data ambiguity. In addition, because it recommends intrusion logs using Fisher's inverse chi-square classification algorithm, it reduces the false positive (FP) rate and the laborious effort of extracting evidence manually.
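The abstract names Robinson's method and Fisher's inverse chi-square combination but gives no formulas. The sketch below follows the form those techniques take in Robinson's spam-filtering work and applies it to intrusion versus normal logs. How the paper derives per-word probabilities from support and confidence is not stated, so the inputs here are assumed to be already computed, and all function and parameter names (including the defaults `s=1.0` and `x=0.5`) are illustrative.

```python
import math


def robinson_f(p_w, n, s=1.0, x=0.5):
    """Robinson's smoothed word probability.

    p_w: raw intrusion probability of an association word
    n:   number of training logs containing the word
    s:   strength given to the prior; x: prior probability
    With little evidence (small n) the result stays close to x.
    """
    return (s * x + n * p_w) / (s + n)


def chi2_sf_even(chi, df):
    """P(X >= chi) for a chi-square variable with an even number of
    degrees of freedom, using the closed-form series."""
    if df % 2 != 0:
        raise ValueError("df must be even")
    m = chi / 2.0
    term = total = math.exp(-m)
    for i in range(1, df // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def fisher_indicator(probs, eps=1e-9):
    """Combine per-word intrusion probabilities into one score in [0, 1].

    Values near 1 suggest an intrusion log, near 0 a normal log,
    and near 0.5 an uncertain one.
    """
    clamped = [min(max(p, eps), 1.0 - eps) for p in probs]
    df = 2 * len(clamped)
    # High when the word probabilities are consistently high.
    q_intrusion = chi2_sf_even(-2.0 * sum(math.log(p) for p in clamped), df)
    # High when the word probabilities are consistently low.
    q_normal = chi2_sf_even(-2.0 * sum(math.log(1.0 - p) for p in clamped), df)
    return (1.0 + q_intrusion - q_normal) / 2.0
```

Computing the two tail probabilities separately and then combining them mirrors the abstract's description of estimating an intrusion probability and a normal probability "respectively" before merging the results; logs whose score lands near 0.5 can be left unrecommended, which is one way such a scheme keeps false positives down.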

Data Mining-Based Performance Prediction Technology of Geothermal Heat Pump System (지열 히트펌프 시스템의 데이터 마이닝 기반 성능 예측 기술)

  • Hwang, Min Hye;Park, Myung Kyu;Jun, In Ki;Sohn, Byonghu
    • Transactions of the KSME C: Technology and Education / v.4 no.1 / pp.27-34 / 2016
  • This preliminary study investigated data mining-based methods to assess and predict the performance of a geothermal heat pump (GHP) system. Data mining is a key process of knowledge discovery in databases (KDD), which includes five steps: 1) Selection; 2) Pre-processing; 3) Transformation; 4) Analysis (data mining); and 5) Interpretation/Evaluation. We used two analysis models, a categorical and a numerical decision tree model, to ascertain the patterns of performance (COP) and electrical consumption of the GHP system. Before applying the decision tree models, we statistically analyzed the measurement database to determine the effect of the sampling interval on the system performance. The analysis showed that, for the performance analysis, the 10-min sampling data had the highest accuracy, 97.7%, relative to the actual dataset of the GHP system.

Design of Process Management System based on Data Mining and Artificial Modelling for the Etching Process (데이터 마이닝과 지능 모델링에 기반한 에칭공정의 공정관리시스템 설계)

  • Bae, Hyeon;Kim, Sung-shin;Woo, Kwang-Bang
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.4 / pp.390-395 / 2004
  • A semiconductor manufacturing process is complicated and dynamic, and consists of many sub-processes. The etching process is the most important process in semiconductor fabrication. A decision support system based on data mining and knowledge discovery is an important factor in improving productivity and yield. The proposed decision support system consists of a neural network model and an inference system based on fuzzy logic. First, the product results are predicted by the neural network model, which is constructed from the product patterns that represent the quality of the etching process. Next, the product patterns are classified using expert knowledge. Finally, the product conditions are estimated by the fuzzy inference system using rules extracted from the classified patterns. Predictions of product quality can be linked to each input and process variable. We employ data mining and intelligent techniques to find the best condition for the etching process. The proposed decision support system is efficient and easy to implement for process management based on expert knowledge.

Prognostic Modeling of Metabolic Syndrome Using Bayesian Networks (베이지안 네트워크를 이용한 대사증후군의 예측 모델링)

  • Park Han-Saem;Cho Sung-Bae;Lee Hong Kyu
    • Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.292-294 / 2005
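  • Metabolic syndrome refers to the simultaneous occurrence in one individual of conditions such as diabetes, hypertension, abdominal obesity, and hyperlipidemia. More than 25% of adults in the United States are reported to have metabolic syndrome, and with improving economic conditions and changing dietary habits it has recently become a serious problem in Korea as well. Bayesian networks, which are widely used for handling uncertainty, are probability-based models that humans can analyze, and they have recently proved useful in medicine as tools for knowledge discovery and data mining. This paper addresses the problem of predicting metabolic syndrome and proposes a prediction model that uses a Bayesian network together with medical knowledge. Using the proposed model, we ran a classification experiment that predicts the 1995 state from 1993 data; the model achieved a prediction rate of 81.5%, higher than classifiers such as multilayer neural networks and k-nearest neighbors.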


Analysis of Judicial Precedent Information related to Debt Recovery based on Deep-Learning (심층 학습 기반의 채권 회수 판례 분석)

  • Kim, Seon-wu;Ji, Sun-young;Choi, Sung-pil
    • Annual Conference on Human and Language Technology / 2018.10a / pp.373-377 / 2018
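  • A judicial precedent is a prior court ruling and one of the key clues on which legal decisions are based. In this study, we collect and analyze debt recovery precedents in order to extract clues for building a service that predicts debt recovery. First, for a basic analysis, we collected and analyzed 20 recovery cases and 20 non-recovery cases; we then collected 12,457 debt-related precedents from the Supreme Court and a legal knowledge base and processed them according to whether the debt was recovered. We analyzed and labeled the patterns in the precedents that distinguish recovery cases from non-recovery cases, then built and trained a Bidirectional LSTM-based deep learning model to classify them automatically. We constructed four datasets according to the processing criteria for debt-related precedents and split each dataset 8:2 for the experiments; the model achieved a strong F1 score of 89.82% on the validation data.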
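To make the evaluation setup concrete, the sketch below shows an 8:2 split and a binary F1 computation in plain Python. It is a generic illustration, not the authors' pipeline: the shuffling, the seed, and the choice of which class counts as positive are assumptions, and the function names are hypothetical.

```python
import random


def split_8_2(examples, seed=0):
    """Shuffle and split examples into 80% training and 20% validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

F1 is a reasonable headline metric for this kind of task because it penalizes a classifier that does well on only one of precision or recall, which plain accuracy can hide when the recovery and non-recovery classes are unbalanced.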


The Prediction of Cryptocurrency on Using Text Mining and Deep Learning Techniques : Comparison of Korean and USA Market (텍스트 마이닝과 딥러닝을 활용한 암호화폐 가격 예측 : 한국과 미국시장 비교)

  • Won, Jonggwan;Hong, Taeho
    • Knowledge Management Research / v.22 no.2 / pp.1-17 / 2021
  • In this study, we predicted bitcoin prices on Bithumb and Coinbase, leading exchanges in Korea and the USA respectively, using ARIMA and Recurrent Neural Networks (RNNs). We also used news articles from each country to propose a separated RNN model. The proposed model partitions the training data according to the changing trend of prices and then applies a time series prediction technique (RNNs) to each partition to create multiple models. We then used daily news data to create a term-based dictionary for each trend change point. We detected trend change points in the test data using the daily news keywords of the test set and the term-based dictionary, and applied the matching model to produce the prediction results. With this approach we obtained higher accuracy than a model that predicts prices using a time series prediction technique alone. This study shows that the limitations of time series prediction techniques can be overcome by detecting trend change points with news data, and that in further research various time series prediction techniques can be combined with text mining techniques to improve model performance.