• Title/Summary/Keyword: Pattern mining


Discovery and Recommendation of User Search Patterns from Web Data (웹 데이터에서의 사용자 탐색 패턴 발견 및 추천)

  • 구흠모;양재영;홍광희;최중민
    • Proceedings of the Korea Intelligent Information System Society Conference / 2002.11a / pp.287-296 / 2002
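  • Web usage mining builds on data mining and uses the information in user log files to discover the patterns in which a web site is used. These patterns can be used to improve the site so that users find the content they want more quickly, and they give system administrators information for building an efficient web structure. The data used in web usage mining is not well structured and even contains noise that interferes with the analysis of web usage patterns. This makes it difficult to apply the many existing data mining techniques. To resolve these difficulties, this paper proposes SPMiner, which introduces a new method. SPMiner is a system that uses the structure of the web to reduce log file preprocessing and to analyze users' navigation patterns efficiently. SPMiner uses a WebTree agent to analyze the structure of a web site and generate a WebTree, and it analyzes user log files to extract information on the usage frequency of each web page. SPMiner merges the WebTree with the page information extracted from the log files into WebTree$^{+}$, a form that can be used for pattern analysis. WebTree$^{+}$ makes pattern discovery easier and makes it possible to actively recommend information or web pages to users.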

  • PDF
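The merge step described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the SPMiner implementation: the `Node` class, `merge_hits`, `most_visited_child`, and the sample URLs are all hypothetical names invented here, and the abstract does not say how WebTree$^{+}$ is actually represented. The sketch assumes the site structure is already available as a tree and the log has been reduced to a list of requested URLs.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One page in a hypothetical site-structure tree."""
    url: str
    children: list = field(default_factory=list)
    hits: int = 0  # usage frequency, filled in from the log


def merge_hits(root, log_urls):
    """Annotate every node of the tree with its hit count from the log.

    URLs that appear in the log but not in the tree are ignored,
    which acts as a crude filter for noise entries (an assumption).
    """
    hits = Counter(log_urls)
    stack = [root]
    while stack:
        node = stack.pop()
        node.hits = hits.get(node.url, 0)
        stack.extend(node.children)
    return root


def most_visited_child(node):
    """Return the child page with the most hits, or None for a leaf."""
    return max(node.children, key=lambda child: child.hits, default=None)


# Hypothetical site and log, invented for illustration.
site = Node("/", [Node("/news"), Node("/shop", [Node("/shop/cart")])])
log = ["/", "/shop", "/shop", "/shop/cart", "/favicon.ico", "/news"]

merge_hits(site, log)
suggestion = most_visited_child(site)
print(suggestion.url, suggestion.hits)
```

Keeping the frequencies on the tree nodes means a recommendation for the current page is a lookup over its children rather than a new pass over the raw log.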

Analysis of Agrifood Purchasing Pattern Using Association Rule Mining - Case of the Seoul·Gyeonggido·Incheon in South Korea -

  • Jo, Hyebin;Choe, Young Chan
    • Agribusiness and Information Management / v.4 no.2 / pp.14-21 / 2012
  • Since the Free Trade Agreements (FTAs) with Chile, the EU, and the U.S., Korean agricultural produce markets have become fiercely competitive. Under these circumstances, marketing is critical. The objective of this research is to identify trends in purchased items and, from them, to understand the characteristics of customer preferences. On this basis, the research seeks to establish sustainable strategies for Korean agricultural produce. The investigation applied market-basket analysis to panel data. Market-basket analysis is a technique that attempts to find groups of items that are commonly found together. The results show that, over one year, wheat-based processed food, processed marine products, and pork are commonly bought together, and that yogurt and milk are also bought together. The customers buying these items are typically 44 years old and live in a four-person household with two children. These customers do not live with their parents.

  • PDF
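As a rough illustration of market-basket analysis, the sketch below computes support, confidence, and lift for item pairs. It is a minimal example, not the procedure used in the paper: the function name `pair_rules`, the `min_support` threshold, and the baskets are assumptions invented for illustration and are not taken from the panel data.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence, lift) tuples
    for every item pair whose support is at least min_support."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            lift = confidence / (item_counts[y] / n)
            rules.append((x, y, support, confidence, lift))
    return rules


# Hypothetical baskets, invented for illustration only.
baskets = [
    ["yogurt", "milk"],
    ["yogurt", "milk", "pork"],
    ["milk", "pork"],
    ["yogurt", "milk"],
    ["pork"],
]

rules = pair_rules(baskets)
for x, y, support, confidence, lift in rules:
    print(f"{x} -> {y}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 indicates that the two items appear together more often than they would if purchased independently.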

Modeling concrete fracturing using a hybrid finite-discrete element method

  • Elmo, Davide;Mitelman, Amichai
    • Computers and Concrete / v.27 no.4 / pp.297-304 / 2021
  • The hybrid Finite-Discrete Element Method (FDEM) combines aspects of both finite elements and discrete elements with fracture mechanics principles, and is therefore well suited to realistic simulation of quasi-brittle materials. Nevertheless, its application to the analysis of concrete is rather limited in the literature. In this paper, the proprietary FDEM code ELFEN is used to model concrete specimens of different sizes under uniaxial compression and indirect tension (Brazilian tests). The results show that phenomena such as the size effect and the influence of strain rate are captured by this modeling technique. In addition, a preliminary model of a slab subjected to dynamic punching shear due to progressive collapse is presented. The resulting fracture pattern of the impacted slab is similar to observations from an actual collapse.

Comparing Results of Classification Techniques Regarding Heart Disease Diagnosing

  • AL badr, Benan Abdullah;AL ghezzi, Raghad Suliman;AL moqhem, ALjohara Suliman;Eljack, Sarah
    • International Journal of Computer Science & Network Security / v.22 no.5 / pp.135-142 / 2022
  • Despite global medical advancements, many patients are misdiagnosed, and more people are dying as a result. Techniques are needed that provide the most accurate diagnosis of heart disease based on recorded data. To support prompt and accurate diagnosis of heart disease, several data mining methods are used to predict the disease. The large amount of available clinical information allows data mining strategies to uncover hidden patterns. This paper presents a comparison of different classification techniques applied to the same dataset to determine which performs best. We found that the Random Forest algorithm gave the best results.

Load Forecasting and ESS Scheduling Considering the Load Pattern of Building (부하 패턴을 고려한 건물의 전력수요예측 및 ESS 운용)

  • Hwang, Hye-Mi;Park, Jong-Bae;Lee, Sung-Hee;Roh, Jae Hyung;Park, Yong-Gi
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.9 / pp.1486-1492 / 2016
  • This study presents an electrical load forecasting and error correction method based on the load pattern of a real building, together with a way to manage the energy storage system using the forecasting results for economical load operation. In a previous study, to identify the characteristic patterns of the target load, we performed hierarchical clustering, a data mining technique, defined load patterns (groups), and forecast the demand load according to the clustering result. In this paper, we propose a new reference demand to improve the predictive accuracy of load demand forecasting. In addition, we study an error correction method for responding to load events in demand load forecasting, and verify the effect of the proposed correction method through an EMS scheduling simulation with load forecasting correction.

Hybrid Approach to Sentiment Analysis based on Syntactic Analysis and Machine Learning (구문분석과 기계학습 기반 하이브리드 텍스트 논조 자동분석)

  • Hong, Mun-Pyo;Shin, Mi-Young;Park, Shin-Hye;Lee, Hyung-Min
    • Language and Information / v.14 no.2 / pp.159-181 / 2010
  • This paper presents a hybrid approach to the sentiment analysis of online texts. The sentiment of a text refers to the feelings that its author has towards a certain topic. Many existing approaches are either pattern-based or machine learning based. The former shows relatively high precision in classifying sentiments, but suffers from the data sparseness problem, i.e. the lack of patterns. The latter shows relatively lower precision, but 100% recall. The approach presented in this work adopts the merits of both. It combines the pattern-based approach with the machine learning based approach so that relatively high precision and high recall can both be maintained. Our experiment shows that the hybrid approach improves the F-measure by more than 50% compared with the pattern-based approach and by around 1% compared with the machine learning based approach. The numerical improvement over the machine learning based approach may not seem very encouraging, but the current approach is promising because it can classify not only the sentiment or polarity of sentences but also additional information such as the target of the sentiment.

  • PDF
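The trade-off between precision and recall that this abstract describes is summarized by the F-measure. The sketch below shows the standard definition. The precision and recall values are hypothetical, chosen only to mimic a "high precision, low recall" system and a "lower precision, full recall" system; they are not results from the paper.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta is 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical figures, invented for illustration only.
pattern_based = f_measure(precision=0.9, recall=0.3)     # sparse patterns
machine_learning = f_measure(precision=0.7, recall=1.0)  # full recall

print(f"pattern-based F1:    {pattern_based:.3f}")
print(f"machine learning F1: {machine_learning:.3f}")
```

Because the F-measure is a harmonic mean, a system with very low recall scores poorly even when its precision is high, which is why raising recall can produce a large gain over a pattern-only baseline.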

Cost-effectiveness of Tunnel Blasting Pattern by Applying Large Blasting Holes (대구경의 발파공을 적용한 터널 발파 패턴의 비용 효과)

  • Choi, Won-Gyu
    • Journal of Convergence for Information Technology / v.10 no.7 / pp.147-152 / 2020
  • This research analyzes the cost-effectiveness of blasting patterns with regard to the diameter and design of blasting holes. Blasting patterns with a single-diameter array and a mixed-diameter array were compared in terms of drilling time, charging time, and materials required. The single array pattern required 138 blasting holes and the mixed array pattern 93. In the drilling time analysis, the mixed pattern reduced drilling time by 139 minutes (25%) compared with the single pattern. Charging time for the mixed pattern was reduced by 22.5 minutes per worker (33%) compared with the single pattern. For 4 m of tunnelling, the single array pattern required explosive quantities of 270 (G1) and 30 (G2), while the mixed array pattern required 222 and 20. The single pattern also required 45 more detonators than the mixed pattern. The reduction in materials required can therefore also contribute to lowering the cost of tunnel construction.

An Investigation on Expanding Traditional Sequential Analysis Method by Considering the Reversion of Purchase Realization Order (구매의도 생성 순서와 구매실현 순서의 역전 현상을 감안한 확장된 순차분석 방법론)

  • Kim, Minseok;Kim, Namgyu
    • The Journal of Information Systems / v.22 no.3 / pp.25-42 / 2013
  • Recently, various kinds of information technology services have been created and the volume of data flowing through them is increasing rapidly. In addition, the data patterns we deal with are gradually becoming more diverse. As a result, demand for discovering meaningful knowledge and information through mining analyses such as linkage analysis, sequencing analysis, classification, and prediction has been steadily increasing. However, data mining analysis does not always solve business problems successfully; one major cause of this limitation is that some analyzed data cannot accurately reflect real-world phenomena. For example, even when the time gap between the purchases of two products is very short, traditional sequencing analysis treats the precedence relationship between the two products as clearly defined. In the real world, however, when the time interval is very short, the precedence relationship between the two purchases may not be definable. Worse, the order in which the purchase intentions arose and the order in which the purchases were realized may be reversed. This study therefore proposes an expanded sequencing analysis methodology that reflects this situation. The proposed methodology identifies purchases made within a very short time interval, for which the purchase order may not be important, and extends the analysis by including both the original sequence and the reversed sequence. Because the length of a "very short" time interval can be defined in different ways, an experiment on actual data was carried out to examine how the results vary with the chosen interval.
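One simplified reading of the idea in this abstract is sketched below: ordered item pairs are counted as in ordinary sequential analysis, and when two purchases fall within a short interval the reversed pair is counted as well. This is an assumption-laden illustration, not the authors' methodology. The function `count_ordered_pairs`, the `short_gap` parameter, and the sample purchases are hypothetical.

```python
from collections import Counter


def count_ordered_pairs(purchases, short_gap):
    """Count ordered item pairs (earlier, later) for one customer.

    purchases: list of (timestamp, item) tuples with numeric timestamps.
    short_gap: purchases at most this far apart are treated as having
               no reliable order, so the reversed pair is counted too.
    """
    counts = Counter()
    events = sorted(purchases)  # order by timestamp
    for i, (t1, first) in enumerate(events):
        for t2, second in events[i + 1:]:
            if first == second:
                continue
            counts[(first, second)] += 1      # observed order
            if t2 - t1 <= short_gap:
                counts[(second, first)] += 1  # possible reversed order
    return counts


# Hypothetical purchase log (minutes since first purchase, item).
purchases = [(0, "printer"), (2, "ink"), (60, "paper")]

traditional = count_ordered_pairs(purchases, short_gap=-1)  # never reverse
extended = count_ordered_pairs(purchases, short_gap=5)

print(traditional[("ink", "printer")], extended[("ink", "printer")])
```

Varying `short_gap` and comparing the resulting pair counts corresponds to the kind of sensitivity experiment the abstract mentions.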

Automated Generation Algorithm of the Penetration Scenarios using Association Mining Technique (연관 마이닝 기법을 이용한 침입 시나리오 자동생성 알고리즘)

  • 정경훈;주정은;황현숙;김창수
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 1999.05a / pp.203-207 / 1999
  • In this paper, we propose an algorithm that automatically generates penetration scenarios using an association mining technique. Known intrusion detection approaches are classified into anomaly detection and misuse detection. The former uses statistical methods, feature selection, and neural networks to identify intrusions; the latter uses conditional probability, expert systems, state transition analysis, and pattern matching. In many proposed intrusion detection algorithms, previously unknown penetration scenarios are created and updated by security experts. Our algorithm generates penetration scenarios automatically by applying an association mining technique to the state transition technique. Association mining discovers useful, previously unknown information in existing data. The proposed algorithm can automatically generate penetration scenarios that have so far been produced by security experts, and it copes with intrusions more easily than existing intrusion detection algorithms. It also has the advantage of low maintenance cost.

  • PDF

Identification of major risk factors association with respiratory diseases by data mining (데이터마이닝 모형을 활용한 호흡기질환의 주요인 선별)

  • Lee, Jea-Young;Kim, Hyun-Ji
    • Journal of the Korean Data and Information Science Society / v.25 no.2 / pp.373-384 / 2014
  • Data mining aims to reveal patterns or correlations in large, complexly structured data and to predict diverse outcomes. The technique is used in fields such as finance, telecommunications, distribution, and medicine. In this paper, we select risk factors for respiratory diseases in the field of medicine. The data, drawn from the Gyeongsangbuk-do database of the Community Health Survey conducted in 2012, were divided into a respiratory disease group and a healthy group. To select the major risk factors, we applied data mining techniques including neural networks, logistic regression, Bayesian networks, C5.0, and CART. We divided the data into training and testing sets and applied the models built on the training data to the testing data. Comparing prediction accuracy, CART was identified as the best model. Depression, smoking, and stress were identified as the major risk factors for respiratory disease.