• Title/Summary/Keyword: Mining method


Feature Extraction of Web Document using Association Word Mining (연관 단어 마이닝을 사용한 웹문서의 특징 추출)

  • 고수정;최준혁;이정현
    • Journal of KIISE:Databases / v.30 no.4 / pp.351-361 / 2003
  • Previous studies that extract document features through word association suffer from three problems: profiles must be updated periodically, noun phrases must be handled, and probabilities must be calculated for indices. We propose a more effective feature extraction method that uses association word mining. Using the Apriori algorithm, the association word mining method represents a document feature not as single words but as association-word vectors. The association words that the Apriori algorithm extracts from a document depend on confidence, support, and the number of words composing each association word. This paper proposes an effective method for determining these three parameters. Because the feature extraction method does not use a profile, it need not update one, and it generates noun phrases automatically from the confidence and support of the Apriori algorithm without calculating probabilities for indices. We apply the proposed method to document classification with a Naive Bayes classifier and compare it with the information gain and TF-IDF methods. We also compare the proposed method with document classification methods that use index association and word association based on a probability model.
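
The role of support, confidence, and the maximum number of composing words described above can be illustrated with a small sketch. This is not the authors' implementation: it is a textbook Apriori pass over toy "documents", and the function names `apriori` and `rules`, the thresholds, and the sample data are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support, max_len):
    """Return {frozenset(words): support} for every frequent word set."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)

    def support(itemset):
        # Fraction of transactions that contain every word of the itemset.
        return sum(itemset <= t for t in sets) / n

    current = {frozenset([w]) for t in sets for w in t}
    current = {c for c in current if support(c) >= min_support}
    frequent = {}
    k = 1
    while current and k <= max_len:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: merge frequent (k-1)-sets into candidate k-sets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset must already be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for single-word consequents."""
    out = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for word in itemset:
            antecedent = itemset - {word}
            conf = sup / frequent[antecedent]
            if conf >= min_confidence:
                out.append((antecedent, word, conf))
    return out


# Hypothetical toy data: each list holds the nouns of one sentence.
docs = [
    ["data", "mining", "web"],
    ["data", "mining"],
    ["data", "web"],
    ["mining", "algorithm"],
]
frequent = apriori(docs, min_support=0.5, max_len=3)
strong = rules(frequent, min_confidence=0.9)
```

Raising `min_support` or `min_confidence` shrinks the set of association words, and `max_len` caps how many words one association word may contain, which are the three parameters the abstract says must be tuned.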

Location Generalization Method of Moving Object using $R^*$-Tree and Grid ($R^*$-Tree와 Grid를 이용한 이동 객체의 위치 일반화 기법)

  • Ko, Hyun;Kim, Kwang-Jong;Lee, Yon-Sik
    • Journal of the Korea Society of Computer and Information / v.12 no.2 s.46 / pp.231-242 / 2007
  • Existing pattern mining methods [1,2,3,4,5,6,11,12,13] do not apply location generalization to the location history data of moving objects; they simply extract frequent patterns that carry no spatio-temporal constraints on movement within a specific space. These methods are therefore difficult to apply to frequent pattern mining with spatio-temporal constraints, such as finding optimal movement or scheduling paths among specific points. They also require more memory because they keep a pattern tree in memory to reduce repeated database scans. A more effective pattern mining technique is needed to solve these problems. To that end, this paper proposes a new location generalization method that converts detailed-level data into meaningful spatial information, reducing both the processing time and the storage space needed to mine patterns from massive history data sets of moving objects. In the preprocessing stage of pattern mining, the proposed method generalizes the location attributes of moving objects into 2D spatial areas based on an $R^*$-Tree and an Area Grid Hash Table (AGHT) and creates moving sequences from them, which enables efficient mining of spatial moving patterns.
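
The idea of generalizing raw coordinates into areas before mining can be sketched as follows. This is a deliberately reduced illustration: it uses only a uniform grid and omits the $R^*$-Tree and the AGHT described in the abstract. The names `to_cell` and `generalize`, the cell size, and the sample trajectory are hypothetical.

```python
def to_cell(x, y, cell_size):
    """Map a coordinate to the (column, row) index of its grid cell."""
    return (int(x // cell_size), int(y // cell_size))


def generalize(trajectory, cell_size):
    """Turn (time, x, y) samples into a moving sequence of grid cells.

    Samples are ordered by time, and consecutive samples that fall in the
    same cell collapse into a single element of the sequence.
    """
    sequence = []
    for _, x, y in sorted(trajectory):
        cell = to_cell(x, y, cell_size)
        if not sequence or sequence[-1] != cell:
            sequence.append(cell)
    return sequence


# Hypothetical location history of one moving object.
history = [(0, 1.0, 1.0), (1, 2.5, 1.2), (2, 11.0, 3.0), (3, 12.0, 14.0)]
moving_sequence = generalize(history, cell_size=10)
```

A pattern miner can then count frequent subsequences over these short cell sequences instead of over every raw coordinate, which is where the savings in time and space would come from.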


FEROM: Feature Extraction and Refinement for Opinion Mining

  • Jeong, Ha-Na;Shin, Dong-Wook;Choi, Joong-Min
    • ETRI Journal / v.33 no.5 / pp.720-730 / 2011
  • Opinion mining involves the analysis of customer opinions using product reviews and provides meaningful information, including the polarity of the opinions. In opinion mining, feature extraction is important because customers normally express their opinions not about a product as a whole but separately for each of its features. However, previous research on feature-based opinion mining has not produced good results because of drawbacks such as selecting features from syntactic information alone or treating features with similar meanings as different. To solve these problems, this paper proposes an enhanced feature extraction and refinement method called FEROM, which extracts correct features from review data by exploiting both the grammatical properties and the semantic characteristics of feature words, and refines the features by recognizing and merging similar ones. A series of experiments on actual online review data demonstrated that FEROM is highly effective at extracting and refining features for analyzing customer review data and thus contributes to accurate and functional opinion mining.

A View from the Bottom: Project-Oriented Risk Mining Approach for Overseas Construction Projects

  • Lee, JeeHee;Son, JeongWook;Yi, June-Seong
    • International conference on construction engineering and project management / 2015.10a / pp.97-100 / 2015
  • Analysis of construction tender documents in overseas projects is very important from a risk management point of view. Unfortunately, the majority of construction firms are preoccupied with winning contracts and do not analyze tender documents in depth. As a result, many contractors have incurred losses in overseas projects. Although many risk analysis techniques have been introduced, most of them focus on a project's external, unexpected risks, such as country conditions and the owner's financial standing. However, because those external risks are difficult to control or to act on preemptively, we need to concentrate on the risks inherent in the project. Based on this premise, this paper proposes a project-oriented risk mining approach that can automatically detect and extract project risk factors before they materialize and then assess them. This study presents a methodology for extracting potential risks in the owner's project requirements and project tender documents using state-of-the-art data analysis methods such as text mining, data mining, and information visualization. The project-oriented risk mining approach is expected to reflect project characteristics effectively in project risk management and could provide construction firms with valuable business intelligence.


Cooperative bearing behaviors of roadside support and surrounding rocks along gob-side

  • Tan, Yunliang;Ma, Qing;Zhao, Zenghui;Gu, Qingheng;Fan, Deyuan;Song, Shilin;Huang, Dongmei
    • Geomechanics and Engineering / v.18 no.4 / pp.439-448 / 2019
  • The bearing capacity of roadside support is the key problem in gob-side entry retaining technology. To study the cooperative bearing characteristics of the roof-roadside support-floor system along the gob-side entry retaining, a mechanical model of the composite roof-roadside support-floor structure was first established. A method for determining the structural parameters of gob-side entry retaining was then proposed. Based on this model, an adaptability analysis of roadside support was carried out. The results showed that the reasonable width of the gob-side entry roadway was inversely proportional to the mining height and directly proportional to the bearing strength of the roof and floor. The reasonable width of the "flexible-hard" roadside support was directly proportional to its own strength and inversely proportional to the width of the gob-side entry retaining. When determining the position and size of the roadside support along the gob-side entry retaining, the surrounding rock environment should be fully considered. Measured results from a case study also support the rationality of the model and the calculation method.

Finding Frequent Itemsets based on Open Data Mining in Data Streams (데이터 스트림에서 개방 데이터 마이닝 기반의 빈발항목 탐색)

  • Chang, Joong-Hyuk;Lee, Won-Suk
    • The KIPS Transactions:PartD / v.10D no.3 / pp.447-458 / 2003
  • The basic assumption of conventional data mining methodology is that the data set of a knowledge discovery process is fixed and available before the process begins. Consequently, this assumption is valid only when the target of data mining is the static knowledge embedded in a specific data set. In addition, a conventional data mining method requires considerable computing time to produce a mining result from a large data set. For these reasons, it is almost impossible to apply such a method to real-time analysis of a data stream, where new transactions are generated continuously and an up-to-date mining result that includes them is needed as quickly as possible. This paper proposes a new mining concept for this purpose: open data mining in a data stream. In open data mining, whenever a new transaction is generated, the updated mining result over all transactions, including the new one, is obtained instantly. To implement this mechanism efficiently, it is necessary to incorporate delayed insertion of newly identified information from recent transactions as well as pruning of insignificant information from the mining result of past transactions. The proposed algorithm is analyzed through a series of experiments to identify its characteristics.
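
The combination of per-transaction updates and pruning of insignificant entries can be sketched with a different, well-known technique. The code below is not the algorithm proposed in the paper; it is a simplified variant of Lossy Counting that tracks single items rather than itemsets, with the transaction count as the stream length. The class name `LossyItemCounter`, the error bound `epsilon`, and the sample stream are assumptions.

```python
import math


class LossyItemCounter:
    """Approximate per-item support over a transaction stream (Lossy Counting)."""

    def __init__(self, epsilon):
        self.epsilon = epsilon                 # maximum undercount as a fraction of n
        self.width = math.ceil(1 / epsilon)    # transactions per bucket
        self.n = 0                             # transactions seen so far
        self.entries = {}                      # item -> [count, max_missed_count]

    def add(self, transaction):
        """Update the result with one new transaction."""
        self.n += 1
        bucket = math.ceil(self.n / self.width)
        for item in set(transaction):
            if item in self.entries:
                self.entries[item][0] += 1
            else:
                # A new item may have been missed at most (bucket - 1) times.
                self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # Bucket boundary: prune entries that cannot be significant.
            self.entries = {
                item: e for item, e in self.entries.items() if e[0] + e[1] > bucket
            }

    def frequent(self, min_support):
        """Items whose support may reach min_support, with their counts."""
        threshold = (min_support - self.epsilon) * self.n
        return {item: e[0] for item, e in self.entries.items() if e[0] >= threshold}


# Hypothetical stream: "a" in every transaction, "b" in every other one,
# plus one item per transaction that never recurs.
counter = LossyItemCounter(epsilon=0.2)
for i in range(20):
    transaction = ["a", f"rare{i}"] + (["b"] if i % 2 == 0 else [])
    counter.add(transaction)
```

The result is available after every call to `add`, and memory stays bounded because items that occur only once are dropped at each bucket boundary.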

Variable Coefficient Inductance Model-Based Four-Quadrant Sensorless Control of SRM

  • Kuai, Song-Yan;Li, Xue-Feng;Li, Xing-Hong;Ma, Jinyang
    • Journal of Power Electronics / v.14 no.6 / pp.1243-1253 / 2014
  • The phase inductance of a switched reluctance motor (SRM) is significantly nonlinear, and the shape of the phase inductance changes markedly under different saturation conditions. This study focuses on the relationship between the coefficients and the current in an inductance model that ignores harmonics above order 3. A position estimation method based on the variable coefficient inductance model is proposed. A four-quadrant sensorless control system for the SRM drive is constructed based on the relationship between the variable coefficient inductance and the rotor position. The proposed algorithms are implemented on an experimental SRM test setup. Experimental results show that the proposed method estimates position accurately in both two-quadrant and four-quadrant operation. The entire system also has good static and dynamic performance.

Short-term Water Demand Forecasting Algorithm Based on Kalman Filtering with Data Mining (데이터 마이닝과 칼만필터링에 기반한 단기 물 수요예측 알고리즘)

  • Choi, Gee-Seon;Shin, Gang-Wook;Lim, Sang-Heui;Chun, Myung-Geun
    • Journal of Institute of Control, Robotics and Systems / v.15 no.10 / pp.1056-1061 / 2009
  • This paper proposes a short-term water demand forecasting algorithm based on Kalman filtering with data mining for sustainable water supply and effective energy saving. The proposed algorithm applies a mining method to water supply data and a decision tree method to special days such as Chuseok. The parameters of the MLAR (Multi Linear Auto Regression) model are estimated by the Kalman filtering algorithm. Good results on actual operation data demonstrate the practicality of the proposed forecasting algorithm.
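
Estimating autoregressive parameters with a Kalman filter can be sketched in its simplest form. The code below is a reduced illustration, not the paper's MLAR model: it assumes a single AR(1) coefficient treated as the filter state, and the function name `kalman_ar1`, the noise variances, and the synthetic demand series are all hypothetical.

```python
def kalman_ar1(series, process_var=0.0, meas_var=1.0):
    """Estimate a in y[t] = a * y[t-1] + noise with a scalar Kalman filter.

    The coefficient a is the state; the previous observation acts as the
    measurement matrix for the current observation.
    """
    a, p = 0.0, 1.0  # initial estimate and its variance
    for prev, y in zip(series, series[1:]):
        p_pred = p + process_var                       # predict step
        gain = p_pred * prev / (prev * p_pred * prev + meas_var)
        a = a + gain * (y - prev * a)                  # correct with the innovation
        p = (1.0 - gain * prev) * p_pred
    return a


# Synthetic, noise-free demand that decays by a factor of 0.8 per step.
demand = [100.0 * 0.8 ** t for t in range(10)]
a_hat = kalman_ar1(demand)
forecast = a_hat * demand[-1]  # one-step-ahead forecast
```

Setting `process_var` above zero lets the coefficient drift over time, which is one way such a filter could follow changing demand patterns.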

Preparative and Thermal Studies of Tris (8-hydroxyquinolino)molybdenum (III) (Tris(8-hydroxyquinolino) molybdenum (III)의 합성과 열적 성질에 관하여)

  • Choi, Q. Won;Oh, Joon-Suk;Lee, Kwang-Woo;Lee, Won
    • Journal of the Korean Chemical Society / v.12 no.4 / pp.146-149 / 1968
  • A new chelate compound, tris(8-hydroxyquinolino)molybdenum(III), [$Mo(C_9H_6ON)_3$], has been prepared by electrolytic reduction of an acidic molybdate solution. The thermal decomposition products of the chelate have been studied by DTA and TGA. It is concluded that the decomposition product is the yellowish-green bis(8-hydroxyquinolino)dioxomolybdenum(VI), [$MoO_2(C_9H_6ON)_2$].


A patent analysis method for identifying core technologies: Data mining and multi-criteria decision making approach (핵심 기술 파악을 위한 특허 분석 방법: 데이터 마이닝 및 다기준 의사결정 접근법)

  • Kim, Chul-Hyun
    • Journal of the Korea Safety Management & Science / v.16 no.1 / pp.213-220 / 2014
  • This study suggests a new approach to identifying core technologies through patent analysis. Specifically, the approach applies a data mining technique and a multi-criteria decision making method to the co-classification information of registered patents. First, technological interrelationship matrices for the intensity, relatedness, and cross-impact perspectives are constructed from the support, lift, and confidence values calculated by association rule mining on the co-classification information of patent data. Second, the analytic network process is applied to the constructed matrices to produce the importance value of each technology from each perspective. Finally, data envelopment analysis is applied to the derived importance values to prioritize technologies by combining the three perspectives. The suggested approach is expected to help technology planners formulate strategy and policy for technological innovation.
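
The first step, computing support, confidence, and lift from co-classification data, can be sketched as follows. This is an illustration under assumptions, not the author's procedure: each patent is modeled as a list of classification codes, the function name `association_matrices` is hypothetical, and the sample codes are illustrative data only. The later analytic network process and data envelopment analysis steps are not covered.

```python
from itertools import permutations


def association_matrices(patents):
    """Return (support, confidence, lift) dicts keyed by ordered class pairs (a, b)."""
    sets = [set(p) for p in patents]
    n = len(sets)
    classes = sorted(set().union(*sets))
    count = {c: sum(c in s for s in sets) for c in classes}
    support, confidence, lift = {}, {}, {}
    for a, b in permutations(classes, 2):
        both = sum(a in s and b in s for s in sets)
        support[(a, b)] = both / n                     # P(a and b)
        confidence[(a, b)] = both / count[a]           # P(b | a)
        lift[(a, b)] = (both / n) / ((count[a] / n) * (count[b] / n))
    return support, confidence, lift


# Hypothetical co-classification data: one list of class codes per patent.
patents = [["G06F", "H04L"], ["G06F", "H04L"], ["G06F"], ["A61K"]]
support, confidence, lift = association_matrices(patents)
```

Support and lift are symmetric in the two classes, while confidence is directional, so the confidence matrix can differ between (a, b) and (b, a).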