• Title/Summary/Keyword: 연관마이닝

Search Result 486, Processing Time 0.03 seconds

Association Analysis of Comorbidity of Cerebral Infarction Using Data Mining (데이터 마이닝을 활용한 뇌경색증과 동반되는 질환의 연관성 분석)

  • Lee, In-Hee;Shin, A-Mi;Son, Chang-Sik;Park, Hee-Joon;Kim, Joong-Hwi;Park, Sang-Young;Choi, Jin-Ho;Kim, Yoon-Nyun
    • The Journal of Korean Physical Therapy
    • /
    • v.22 no.1
    • /
    • pp.75-81
    • /
    • 2010
  • Purpose: The purpose of this study was to apply association rule mining to explore the labyrinthine network of cerebral infarction comorbidity and basic data supply to develop cutting-edge physical therapy protocols for cerebral infarction with comorbidity Methods: From clinic records of enrollees of A Hospital in D city, patients over 18 years of age with cerebral infarction and cerebral infarction comorbidity were recruited as a case group. All diagnoses of that hospital were categorized according to the "International Classification of Disease (ICD)" diagnosis system. We extracted code I63 from the "Korea Classification of Disease (KCD)-4". Associated rule mining was done with a priori modeling and Web nodes to examine the strengths of associations among those diagnoses. The support and confidence values of associated rule mining results were examined. Results: The subjects of this study were 2,267 cerebral infarction patients. E11 (Non-insulin-dependent diabetes mellitus), E78 (Disorders of lipoprotein metabolism and other lipidaemias), G81 (Hemiplegia), I10 (Essential hypertension), and K29 (Gastritis and duodenitis) were high frequency diagnoses, being found in 10% or more of total diagnoses of cerebral infarction from frequency analysis results. The highest frequency diagnosis was 1,042 (46.0%) for I10. The second most frequent diagnosis was for E11(21.5%) while the third most frequent diagnosis was E78 (20.2%). Results from a priori modeling and Web nodes indicated that cerebral infarction has a strong association withessential hypertension, non-insulin-dependent diabetes mellitus, disorders of lipoprotein metabolism and other lipidaemias. Conclusion: Cerebral infarction is associated with hypertension, diabetes mellitus, and disorders of lipoprotein metabolism and other lipidaemias. The result of this study will be helpful to clinicians treating patients with cerebral infarction.

Topic-Specific Mobile Web Contents Adaptation (주제기반 모바일 웹 콘텐츠 적응화)

  • Lee, Eun-Shil;Kang, Jin-Beom;Choi, Joong-Min
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.6
    • /
    • pp.539-548
    • /
    • 2007
  • Mobile content adaptation is a technology of effectively representing the contents originally built for the desktop PC on wireless mobile devices. Previous approaches for Web content adaptation are mostly device-dependent. Also, the content transformation to suit to a smaller device is done manually. Furthermore, the same contents are provided to different users regardless of their individual preferences. As a result, the user has difficulty in selecting relevant information from a heavy volume of contents since the context information related to the content is not provided. To resolve these problems, this paper proposes an enhanced method of Web content adaptation for mobile devices. In our system, the process of Web content adaptation consists of 4 stages including block filtering, block title extraction, block content summarization, and personalization through learning. Learning is initiated when the user selects the full content menu from the content summary page. As a result of learning, personalization is realized by showing the information for the relevant block at the top of the content list. A series of experiments are performed to evaluate the content adaptation for a number of Web sites including online newspapers. The results of evaluation are satisfactory, both in block filtering accuracy and in user satisfaction by personalization.

Re-ranking the Results from Two Image Retrieval System in Cooperative Manner (두 영상검색 시스템의 협력적 이용을 통한 재순위화)

  • Hwang, Joong-Won;Kim, Hyunwoo;Kim, Junmo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.24 no.1
    • /
    • pp.7-15
    • /
    • 2014
  • Image retrieval has become a huge part of computer vision and data mining. Although commercial image retrieval systems such as Google show great performances, the improvement on the performances are constantly on demand because of the rapid growth of data on web space. To satisfy the demand, many re-ranking methods, which enhance the performances by reordering retrieved results with independent algorithms, has been proposed. Conventional re-ranking algorithms are based on the assumption that visual patterns are not used on initial image retrieval stage. However, image search engines in present have begun to use the visual and the assumption is required to be reconsidered. Also, though it is possible to suspect that integration of multiple retrieval systems can improve the overall performance, the research on the topic has not been done sufficiently. In this paper, we made the condition that other manner than cooperation cannot improve the ranking result. We evaluate the algorithm on toy model and show that propose module can improve the retrieval results.

A Spatial Entropy based Decision Tree Method Considering Distribution of Spatial Data (공간 데이터의 분포를 고려한 공간 엔트로피 기반의 의사결정 트리 기법)

  • Jang, Youn-Kyung;You, Byeong-Seob;Lee, Dong-Wook;Cho, Sook-Kyung;Bae, Hae-Young
    • The KIPS Transactions:PartB
    • /
    • v.13B no.7 s.110
    • /
    • pp.643-652
    • /
    • 2006
  • Decision trees are mainly used for the classification and prediction in data mining. The distribution of spatial data and relationships with their neighborhoods are very important when conducting classification for spatial data mining in the real world. Spatial decision trees in previous works have been designed for reflecting spatial data characteristic by rating Euclidean distance. But it only explains the distance of objects in spatial dimension so that it is hard to represent the distribution of spatial data and their relationships. This paper proposes a decision tree based on spatial entropy that represents the distribution of spatial data with the dispersion and dissimilarity. The dispersion presents the distribution of spatial objects within the belonged class. And dissimilarity indicates the distribution and its relationship with other classes. The rate of dispersion by dissimilarity presents that how related spatial distribution and classified data with non-spatial attributes we. Our experiment evaluates accuracy and building time of a decision tree as compared to previous methods. We achieve an improvement in performance by about 18%, 11%, respectively.

IRFP-tree: Intersection Rule Based FP-tree (IRFP-tree(Intersection Rule Based FP-tree): 메모리 효율성을 향상시키기 위해 교집합 규칙 기반의 패러다임을 적용한 FP-tree)

  • Lee, Jung-Hun
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.5 no.3
    • /
    • pp.155-164
    • /
    • 2016
  • For frequency pattern analysis of large databases, the new tree-based frequency pattern analysis algorithm which can compensate for the disadvantages of the Apriori method has been variously studied. In frequency pattern tree, the number of nodes is associated with memory allocation, but also affects memory resource consumption and processing speed of the growth. Therefore, reducing the number of nodes in the tree is very important in the frequency pattern mining. However, the absolute criteria which need to order the transaction items for construction frequency pattern tree has lowered the compression ratio of the tree nodes. But most of the frequency based tree construction methods adapted the absolute criteria. FP-tree is typically frequency pattern tree structure which is an extended prefix-tree structure for storing compressed frequent crucial information about frequent patterns. For construction the tree, all the frequent items in different transactions are sorted according to the absolute criteria, frequency descending order. CanTree also need to absolute criteria, canonical order, to construct the tree. In this paper, we proposed a novel frequency pattern tree construction method that does not use the absolute criteria, IRFP-tree algorithm. IRFP-tree(Intersection Rule based FP-tree). IRFP-tree is constituted with the new paradigm of the intersection rule without the use of the absolute criteria. It increased the compression ratio of the tree nodes, and reduced the tree construction time. Our method has the additional advantage that it provides incremental mining. The reported test result demonstrate the applicability and effectiveness of the proposed approach.

Classification and Analysis of Data Mining Algorithms (데이터마이닝 알고리즘의 분류 및 분석)

  • Lee, Jung-Won;Kim, Ho-Sook;Choi, Ji-Young;Kim, Hyon-Hee;Yong, Hwan-Seung;Lee, Sang-Ho;Park, Seung-Soo
    • Journal of KIISE:Databases
    • /
    • v.28 no.3
    • /
    • pp.279-300
    • /
    • 2001
  • Data mining plays an important role in knowledge discovery process and usually various existing algorithms are selected for the specific purpose of the mining. Currently, data mining techniques are actively to the statistics, business, electronic commerce, biology, and medical area and currently numerous algorithms are being researched and developed for these applications. However, in a long run, only a few algorithms, which are well-suited to specific applications with excellent performance in large database, will survive. So it is reasonable to focus our effort on those selected algorithms in the future. This paper classifies about 30 existing algorithms into 7 categories - association rule, clustering, neural network, decision tree, genetic algorithm, memory-based reasoning, and bayesian network. First of all, this work analyzes systematic hierarchy and characteristics of algorithms and we present 14 criteria for classifying the algorithms and the results based on this criteria. Finally, we propose the best algorithms among some comparable algorithms with different features and performances. The result of this paper can be used as a guideline for data mining researches as well as field applications of data mining.

  • PDF

Extracting Typical Group Preferences through User-Item Optimization and User Profiles in Collaborative Filtering System (사용자-상품 행렬의 최적화와 협력적 사용자 프로파일을 이용한 그룹의 대표 선호도 추출)

  • Ko Su-Jeong
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.7
    • /
    • pp.581-591
    • /
    • 2005
  • Collaborative filtering systems have problems involving sparsity and the provision of recommendations by making correlations between only two users' preferences. These systems recommend items based only on the preferences without taking in to account the contents of the items. As a result, the accuracy of recommendations depends on the data from user-rated items. When users rate items, it can be expected that not all users ran do so earnestly. This brings down the accuracy of recommendations. This paper proposes a collaborative recommendation method for extracting typical group preferences using user-item matrix optimization and user profiles in collaborative tittering systems. The method excludes unproven users by using entropy based on data from user-rated items and groups users into clusters after generating user profiles, and then extracts typical group preferences. The proposed method generates collaborative user profiles by using association word mining to reflect contents as well as preferences of items and groups users into clusters based on the profiles by using the vector space model and the K-means algorithm. To compensate for the shortcoming of providing recommendations using correlations between only two user preferences, the proposed method extracts typical preferences of groups using the entropy theory The typical preferences are extracted by combining user entropies with item preferences. The recommender system using typical group preferences solves the problem caused by recommendations based on preferences rated incorrectly by users and reduces time for retrieving the most similar users in groups.

A Study on the Research Trends for Smart City using Topic Modeling (토픽 모델링을 활용한 스마트시티 연구동향 분석)

  • Park, Keon Chul;Lee, Chi Hyung
    • Journal of Internet Computing and Services
    • /
    • v.20 no.3
    • /
    • pp.119-128
    • /
    • 2019
  • This study aims to analyze the research trends on Smart City and to present implications to policy maker, industry professional, and researcher. Cities around globe have undergone the rapid progress in urbanization and the consequent dramatic increase in urban dwellings over the past few decades, and faced many urban problems in such areas as transportation, environment and housing. Cities around the globe are in a hurry to introduce Smart City to pursue a common goal of solving these urban problems and improving the quality of their lives. However, various conceptual approaches to smart city are causing uncertainty in setting policy goals and establishing direction for implementation. The study collected 11,527 papers titled "Smart City(cities)" from the Scopus DB and Springer DB, and then analyze research status, topic, trends based on abstracts and publication date(year) information using the LDA based Topic Modeling approaches. Research topics are classified into three categories(Services, Technologies, and User Perspective) and eight regarding topics. Out of eight topics, citizen-driven innovation is the most frequently referred. Additional topic network analysis reveals that data and privacy/security are the most prevailing topics affecting others. This study is expected to helps understand the trends of Smart City researches and predict the future researches.

An Analysis of the Research Trends for Urban Study using Topic Modeling (토픽모델링을 이용한 도시 분야 연구동향 분석)

  • Jang, Sun-Young;Jung, Seunghyun
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.22 no.3
    • /
    • pp.661-670
    • /
    • 2021
  • Research trends can be usefully used to determine the importance of research topics by period, identify insufficient research fields, and discover new fields. In this study, research trends of urban spaces, where various problems are occurring due to population concentration and urbanization, were analyzed by topic modeling. The analysis target was the abstracts of papers listed in the Korea Citation Index (KCI) published between 2002 and 2019. Topic modeling is an algorithm-based text mining technique that can discover a certain pattern in the entire content, and it is easy to cluster. In this study, the frequency of keywords, trends by year, topic derivation, cluster by topic, and trend by topic type were analyzed. Research in urban regeneration is increasing continuously, and it was analyzed as a field where detailed topics could be expanded in the future. Furthermore, urban regeneration is now becoming a regular research field. On the other hand, topics related to development/growth and energy/environment have entered a stagnation period. This study is meaningful because the correlation and trends between keywords were analyzed using topic modeling targeting all domestic urban studies.

A Study on the Factors of Well-aging through Big Data Analysis : Focusing on Newspaper Articles (빅데이터 분석을 활용한 웰에이징 요인에 관한 연구 : 신문기사를 중심으로)

  • Lee, Chong Hyung;Kang, Kyung Hee;Kim, Yong Ha;Lim, Hyo Nam;Ku, Jin Hee;Kim, Kwang Hwan
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.22 no.5
    • /
    • pp.354-360
    • /
    • 2021
  • People hope to live a healthy and happy life achieving satisfaction by striking a good work-life balance. Therefore, there is a growing interest in well-aging which means living happily to a healthy old age without worry. This study identified important factors related to well-aging by analyzing news articles published in Korea. Using Python-based web crawling, 1,199 articles were collected on the news service of portal site Daum till November 2020, and 374 articles were selected which matched the subject of the study. The frequency analysis results of text mining showed keywords such as 'elderly', 'health', 'skin', 'well-aging', 'product', 'person', 'aging', 'female', 'domestic' and 'retirement' as important keywords. Besides, a social network analysis with 45 important keywords revealed strong connections in the order of 'skin-wrinkle', 'skin-aging' and 'old-health'. The result of the CONCOR analysis showed that 45 main keywords were composed of eight clusters of 'life and happiness', 'disease and death', 'nutrition and exercise', 'healing', 'health', and 'elderly services'.