• Title/Summary/Keyword: 연관규칙분석 (association rule analysis)

On the Privacy Preserving Mining Association Rules by using Randomization (연관규칙 마이닝에서 랜덤화를 이용한 프라이버시 보호 기법에 관한 연구)

  • Kang, Ju-Sung;Cho, Sung-Hoon;Yi, Ok-Yeon;Hong, Do-Won
    • The KIPS Transactions:PartC / v.14C no.5 / pp.439-452 / 2007
  • We study privacy-preserving data mining (PPDM) using randomization. Theoretical PPDM based on secure multi-party computation techniques is impractical because of its computational inefficiency, so we concentrate on practical PPDM, in particular the randomization technique. We survey various privacy measures and study privacy-preserving mining of association rules using randomization. We propose a new randomization operator, the binomial selector, for privacy-preserving association rule mining. A binomial selector is a special case of the select-a-size operator of Evfimievski et al. [3]. We also present simulation results on finding an appropriate parameter for a binomial selector. Randomization by the so-called cut-and-paste method in [3] is inefficient and yields high variance in the recovered support values for large itemsets. Our randomization by a binomial selector makes up for these defects of the cut-and-paste method.
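
The randomize-then-recover idea in this abstract can be illustrated with a minimal sketch. It assumes that "binomial selector" means each true item is kept independently with probability p (so the number of kept items is binomial) and each absent item is inserted with probability rho; that reading, the function names, and the parameters are our assumptions, not the paper's definitions. Under those assumptions the observed single-item support satisfies observed = s*p + (1 - s)*rho, which can be inverted for s.

```python
import random

def randomize(transaction, universe, p, rho, rng):
    """Keep each true item with probability p; insert each absent item with probability rho.

    Assumed reading of a binomial-style selector, not the paper's exact operator.
    """
    kept = {item for item in transaction if rng.random() < p}
    noise = {item for item in universe
             if item not in transaction and rng.random() < rho}
    return kept | noise

def recover_support(observed, p, rho):
    """Invert observed = s*p + (1 - s)*rho for the true single-item support s."""
    if p == rho:
        raise ValueError("p and rho must differ for the support to be recoverable")
    return (observed - rho) / (p - rho)
```

With many randomized transactions, the fraction that contain an item gives `observed`, and `recover_support` returns an estimate of the original support. The variance of that estimate grows as p and rho move closer together.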

A Topic Analysis of Abstracts in Journal of Korean Data Analysis Society (한국자료분석학회지에 대한 토픽분석)

  • Kang, Changwan;Kim, Kyu Kon;Choi, Seungbae
    • Journal of the Korean Data Analysis Society / v.20 no.6 / pp.2907-2915 / 2018
  • The Journal of the Korean Data Analysis Society, founded in 1998, has served as a major journal for applied research. In this study, we examined the objectives of the journal by analyzing 10 years of its abstracts. Abstract data were crawled from the online journal site (kdas.jems.or.kr) and analyzed with a topic model. As a result, we found 18 topics in 2680 abstracts covering varied content, for example nursing, marketing, economics, regression, factor analysis, data mining, and statistical inference. Topic 1 (regression) was the most frequent, with 460 documents, which shows the usefulness of regression in applied science. We confirmed 10 significant association rules using Fisher's exact test. To explore topic trends, we also conducted the topic analysis for two periods, 2006-2011 and 2012-2016. We found that control studies became more frequent than survey studies over time, and that regression and factor analysis were frequent regardless of period.
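
As a hedged illustration of testing an association rule with Fisher's exact test, the sketch below computes a one-sided p-value for a rule A => B from a 2x2 table of document counts. The abstract does not say whether a one-sided or two-sided test was used, so the one-sided form and the function name are assumptions.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher p-value for the 2x2 table [[a, b], [c, d]].

    a = documents with both A and B, b = A only, c = B only, d = neither.
    Returns P(X >= a) under the hypergeometric null of independence.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    total = comb(n, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / total
```

A small p-value suggests that A and B co-occur more often than independence would predict, which is one way to screen candidate rules.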

SME Bakery's Marketing Strategies Based on Apriori Algorithm (Apriori 알고리즘 기반의 중소 베이커리 기업의 대응 전략)

  • Kim, Do Hoon;Lee, Hyeon June;Lee, Bong Gyou
    • Journal of Convergence for Information Technology / v.12 no.4 / pp.328-337 / 2022
  • The importance of online marketing is growing due to the prevalence of COVID-19. To respond to the changing business environment, we collected ten years of sales data from an SME bakery company that experienced a decrease in sales due to COVID-19. The analysis shows that switching from offline markets to omnichannel B2B and B2C markets, and moving from 'small quantity batch production' to 'mass production in a small variety', can improve management. This study presents online and offline marketing strategies based on data analysis of small and medium-sized bakery companies, which have relatively limited digital capabilities compared with large companies, and could serve as a guideline for many SMEs.

A Topic Modeling-based Recommender System Considering Changes in User Preferences (고객 선호 변화를 고려한 토픽 모델링 기반 추천 시스템)

  • Kang, So Young;Kim, Jae Kyeong;Choi, Il Young;Kang, Chang Dong
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.43-56 / 2020
  • Recommender systems help users make the best choice among various options. They play an especially important role on internet sites, where countless pieces of digital information are generated every second. Many studies on recommender systems have focused on accurate recommendation. However, several problems must be overcome for a recommender system to be commercially successful. First, recommender systems lack transparency: users cannot know why products are recommended. Second, recommender systems cannot immediately reflect changes in user preferences: although a user's product preferences change over time, the system must rebuild its model to reflect them. Therefore, in this study, we propose a recommendation methodology that uses topic modeling and sequential association rule mining on review data to solve these problems. Product reviews provide useful information for recommendation because they include not only product ratings but also varied content such as user experiences and emotional states, so reviews imply user preferences for products. Topic modeling is therefore useful for explaining why items are recommended to users, and sequential association rule mining is useful for identifying changes in user preferences. The proposed methodology is divided into two phases. The first phase creates a user profile based on topic modeling: topics are extracted from users' product reviews, and a profile over those topics is built for each user. The second phase recommends products using sequential rules that appear in users' buying behaviors over time. The buying behaviors are derived from changes in each user's topics.
A collaborative filtering-based recommender system was developed as a benchmark, and we compared the performance of the proposed methodology against it using Amazon's review dataset. Accuracy, recall, precision, and F1 were used as evaluation metrics. For topic modeling, collapsed Gibbs sampling was used, and we extracted 15 topics. Among the main topics, topic 1, topic 3, topic 4, topic 7, topic 9, topic 13, and topic 14 are related to "comedy shows", "high-teen drama series", "crime investigation drama", "horror theme", "British drama", "medical drama", and "science fiction drama", respectively. In the comparative analysis, the proposed methodology outperformed the collaborative filtering-based recommender system. The results also show that the period just prior to the recommendation is very important for inferring changes in user preference. The proposed methodology can therefore both secure the transparency of the recommender system and reflect user preferences that change over time. However, it has some limitations. It cannot recommend products precisely when a topic contains a large number of products. In addition, the number of sequential patterns is small because the number of topics is small. Future research needs to address these limitations.
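
The second phase described above can be sketched in a simplified form. The code below assumes each user is represented by a time-ordered list of dominant topic ids and mines only length-two sequential rules (previous topic -> next topic) with a confidence threshold. The representation, the function names, and the threshold are illustrative assumptions and not the authors' implementation.

```python
from collections import Counter

def sequential_rules(sequences, min_conf=0.5):
    """Return {(prev, nxt): confidence} for consecutive topic transitions."""
    pair_counts, prev_counts = Counter(), Counter()
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    return {
        pair: count / prev_counts[pair[0]]
        for pair, count in pair_counts.items()
        if count / prev_counts[pair[0]] >= min_conf
    }

def next_topic(rules, last_topic):
    """Pick the highest-confidence consequent for the most recent topic, or None."""
    candidates = {nxt: conf for (prev, nxt), conf in rules.items()
                  if prev == last_topic}
    return max(candidates, key=candidates.get) if candidates else None
```

Conditioning only on the most recent topic mirrors the abstract's observation that the period just before the recommendation matters most. Products would then be drawn from the predicted topic.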

An Integrated Data Mining Model for Customer Relationship Management (고객관계관리를 위한 데이터마이닝 통합모형에 관한 연구)

  • Song, Im-Young;Oh, R.D.;Yi, T.S.;Shin, K.J.;Kim, K.C.
    • Proceedings of the Korean Information Science Society Conference / 2006.10c / pp.154-159 / 2006
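  • This paper proposes an integrated model that, working from log files collected automatically by a web server, bases the criteria for judging customer value on customer behavior, segments customers with a clustering technique, and classifies customers by applying a decision tree to the segmentation results. In addition, to analyze the main service usage patterns of the classified customers, we applied association rule techniques to analyze associations in customers' use of science and technology information, and thereby studied a new method of providing information and interfaces suited to each class of users of a site that offers a science information portal service. From a customer management perspective, by classifying the existing customers of a website that provides information services and analyzing their patterns, this paper presents ways of using the website to provide customer-oriented site operation policies and dynamic interfaces. The approach should also contribute to continuous customer management and to providing services suited to each customer class.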

Discovering Web Page Association Rules & Evaluating Web Site Performance To Improve Web Site Structure (웹사이트 구조 개선을 위한 웹페이지 연관 규칙 발견과 웹사이트 성능 평가)

  • 김민정;박승수
    • Proceedings of the Korean Information Science Society Conference / 2001.10b / pp.46-48 / 2001
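  • Countless websites now exist on the web and offer services. Because users tend to visit sites that are easy to access and well organized, running a well-organized website is a survival strategy for the site and is essential for retaining visitors. To this end, web server log data (hereafter, web log files), which record users' visits to the site, can be analyzed to identify users' browsing patterns, access trends, web server error information, and so on. In this paper, we perform log file analysis and website structure analysis as Web Usage Mining and Web Structure Mining tasks, discover associations among pages and structural information about the website, and propose a way to improve the website's structure.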

A Korean Revision System Using the governal and collocational relation between words (단어 간 지배 관계 및 연관 관계를 이용한 한국어 교열 시스템)

  • Sim, Chul-Min;Kim, Min-Jung;Lee, Young-Sik;Kwon, Hyuk-Chul
    • Annual Conference on Human and Language Technology / 1993.10a / pp.303-316 / 1993
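  • Error-handling techniques such as spellers are confined to processing within a single eojeol (a spacing unit in Korean text), or go beyond the eojeol only in a few restricted part-of-speech areas, such as numeral handling. Meanwhile, error handling beyond the eojeol unit, such as text revision, has been thought to require complete syntactic analysis and semantic interpretation. Because complete syntactic and semantic processing is currently not possible for Korean, there has been almost no research on revision systems or on error handling beyond the eojeol unit. This paper classifies the types of errors that extend beyond the eojeol and proposes a technique for checking, at the sentence level, usage errors among related words, together with the structure of a rule database for processing related words. By using the syntactic and semantic government relations and collocation relations between words as lexical selection constraints, revision becomes possible without complete syntactic and semantic analysis.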

A Patent Trend Analysis for Technological Convergence of IoT and Wearables (IoT와 Wearables 기술융합을 위한 특허동향분석)

  • Kang, Ji Ho;Kim, Jong Chan;Lee, Jun Hyuck;Park, Sang Sung;Jang, Dong Sik
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.3 / pp.306-311 / 2015
  • This study analyzes the convergence of Internet-of-Things (IoT) and wearables technologies using the Cooperative Patent Classification (CPC). CPC, which is being introduced to an increasing number of technological fields of Korean patents, is expected to be widely used in patent informatics because CPC classification codes are more specific than those of IPC and reflect the characteristics of technologies accurately and in detail. CPC has seldom been used to date, and most previous research on technological convergence used IPC. As a pre-analysis step for analyzing the trend of technological convergence of IoT and wearables, the CPC and IPC codes assigned to each patent were compared. By applying association rule mining to the CPC codes, we identified the technological fields where convergence frequently takes place and examined the trend of technological convergence over time.
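
Association rule mining over classification codes can be sketched as follows. Each patent is treated as a transaction of codes, and support, confidence, and lift are computed for every co-occurring pair. The function name, the threshold, and the restriction to pairs are illustrative assumptions; the study's actual procedure and settings are not given in the abstract.

```python
from collections import Counter
from itertools import combinations

def pair_rules(patents, min_support=0.2):
    """Return (a, b, support, confidence, lift) tuples for rules a -> b over code pairs."""
    n = len(patents)
    # Count each code and each unordered code pair once per patent.
    single = Counter(code for codes in patents for code in set(codes))
    pairs = Counter(pair for codes in patents
                    for pair in combinations(sorted(set(codes)), 2))
    rules = []
    for (x, y), count in pairs.items():
        support = count / n
        if support < min_support:
            continue
        for a, b in ((x, y), (y, x)):
            confidence = count / single[a]
            lift = confidence / (single[b] / n)
            rules.append((a, b, support, confidence, lift))
    return rules
```

A lift above 1 for a pair of codes from different technology areas indicates that they appear together more often than chance would suggest, which is one simple signal of convergence. Running the same computation per filing year would give a view of the trend over time.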

Analysis of Educational Issues through Topic Modeling of National Petitions Text (국민청원글의 토픽 모델링을 통한 교육이슈 분석)

  • Shim, Jaekwoun
    • Journal of The Korean Association of Information Education / v.25 no.4 / pp.633-640 / 2021
  • Education-related issues are social problems in which various groups and situations are intricately linked. It is difficult to find issues by analyzing social phenomena related to education. Korean-language text can be analyzed quantitatively. With the development of text analysis techniques, research results have recently been reported, and these techniques can be fully used to derive educational issues from Korean text data. In this study, petitions in the childcare/education category were collected from the online board of the Blue House National Petition website, and text analysis was used to derive issues in education. The analysis derived 6 topics using Latent Dirichlet Allocation (LDA), a topic modeling technique. The association rules of major keywords were analyzed and visualized as graphs. Beyond deriving educational issues through existing questionnaire-based methods, this study offers implications for future research directions and policy in that issues can be sufficiently discovered through text-based analysis.

Analysis on the Importance Factor of Residential Environment using R (R을 활용한 주거환경 중요도 요소에 대한 분석)

  • Oh, Hyungjun;Choi, Youngoh
    • Journal of Creative Information Culture / v.6 no.3 / pp.209-217 / 2020
  • Interest in data analysis has recently increased, and convergence research based on data analysis is being actively conducted in fields such as engineering, natural science, and social science. In architecture, various studies using data analysis are under way, and in particular efforts are being made to solve the problems of a field that has expanded quantitatively through urbanization. In this study, we analyze data on the residential satisfaction of residents in residential environment improvement areas and similar neighborhoods created through urban regeneration projects. Using R to analyze the post-occupancy evaluation elements collected after construction and occupancy, we identify the important evaluation items that affect satisfaction with the residential environment by analyzing the association rules between evaluation elements and the frequency of residents' major requirements. Through this, we intend to pursue convergence research between IT and architecture, such as developing a system that can recommend high-quality residential areas, and to provide data for securing high-quality residential spaces when residential areas are constructed in the future.