• Title/Abstract/Keywords: Association Mining

Search results: 1,060 items; processing time: 0.025 seconds

온라인 뉴스 웹사이트의 로그를 이용한 연관규칙 발견에 관한 연구 (Mining Association Rules from the Web Access Log of an Online News website)

  • 황현석;유기동
    • 한국산업정보학회논문지 / Vol. 18, No. 2 / pp. 47-57 / 2013
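  • With the spread of the Internet, many areas of business activity now take place online. Online shopping malls analyze web logs to understand what customers do after visiting the website and to link that activity to business performance. Online news sites likewise need to understand visitor activity, such as which articles attract the most interest and which categories of articles are read most, in order to serve their readers. However, research analyzing the website logs of news organizations remains insufficient. This study uses logs collected from an online news website to identify visitors' activity within the site and to derive association rules among news articles. The study consists of two stages: the first identifies visitor sessions, and the second examines association rules among the news articles that visitors read. The analysis used website logs collected on two occasions. Finally, the study presents the meaning of the derived rules and the implications that online news sites should consider.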
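
The first stage described above, identifying visitor sessions, can be sketched as follows. This is a minimal illustration and not the authors' procedure: the record layout `(visitor_id, timestamp, article_id)`, the function name `sessionize`, and the 30-minute inactivity timeout are all assumptions introduced here. Each resulting session is a set of articles that could then serve as one transaction for association rule mining.

```python
from collections import defaultdict

# Assumed inactivity timeout in seconds; the abstract does not state one.
SESSION_GAP_SECONDS = 30 * 60


def sessionize(log, gap=SESSION_GAP_SECONDS):
    """Group (visitor_id, timestamp, article_id) records into sessions.

    Returns a list of sets; each set holds the articles read in one session.
    """
    # Collect every hit under the visitor that produced it.
    by_visitor = defaultdict(list)
    for visitor, ts, article in log:
        by_visitor[visitor].append((ts, article))

    sessions = []
    for hits in by_visitor.values():
        hits.sort()  # order each visitor's hits by timestamp
        current, last_ts = set(), None
        for ts, article in hits:
            # A pause longer than the gap closes the current session.
            if last_ts is not None and ts - last_ts > gap:
                sessions.append(current)
                current = set()
            current.add(article)
            last_ts = ts
        sessions.append(current)
    return sessions


# Toy log (made-up data): visitor "a" returns after a long pause.
toy_log = [("a", 0, "n1"), ("a", 100, "n2"), ("a", 5000, "n3"), ("b", 10, "n1")]
toy_sessions = sessionize(toy_log)
```

With the toy log, visitor "a" yields two sessions because the gap between the second and third hit exceeds the assumed timeout.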

공간 연관규칙을 이용한 대형할인점의 입지 분석 (Analyzing the Location Decision of the Large-Scale Discount Store Using the Spatial Association Rules Mining)

  • 이용익;홍성언;김정엽;박수홍
    • 대한지리학회지 / Vol. 41, No. 3 / pp. 319-330 / 2006
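  • The purpose of this study is to extract the location factors of large-scale discount stores, which have increased rapidly since the 1990s, in order to bring objectivity to decision-making, and to apply useful information hidden in large databases to site selection. To this end, statistical data were collected on various demographic, economic, and surrounding-environment factors that affect the siting of large-scale discount stores, spatial data were built for the study area, and a spatial association analysis was performed to extract spatial association rules. To verify the results, the extracted rules were compared against the sales of the large-scale discount stores to assess their applicability. The verification showed that the more closely a store matched the extracted spatial association rules, the higher its sales. The study suggests that spatial association rules can support objective, sales-enhancing selection of optimal locations for large-scale discount stores.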

연관규칙 분석을 통한 ESG 우려사안 키워드 도출에 관한 연구 (A Study on the Keyword Extraction for ESG Controversies Through Association Rule Mining)

  • 안태욱;이희승;이준서
    • 한국정보시스템학회지:정보시스템연구 / Vol. 30, No. 1 / pp. 123-149 / 2021
  • Purpose: The purpose of this study is to define the anti-ESG activities of companies as recognized by the media, reflecting the recent attention paid to ESG. This study extracts keywords for ESG controversies through association rule mining. Design/methodology/approach: A research framework is designed to extract keywords for ESG controversies as follows: 1) From the DeepSearch DB, we collect 23,837 articles, published by 130 media outlets from 2013 to 2018, on the anti-ESG activities of 294 listed companies with ESG ratings. 2) We set keywords related to environment, social, and governance, and delete or merge them with other keywords based on the support, confidence, and lift derived from association rule mining. 3) We illustrate the importance of keywords and the relevance between keywords through density, degree centrality, and closeness centrality in network analysis. Findings: We identify a total of 26 keywords for ESG controversies. 'Gapjil' records the highest frequency, followed by 'corruption', 'bribery', and 'collusion'. Of the 26 keywords, 16 are related to governance, 8 to social, and 2 to environment. The highly ranked keywords are mostly related to the responsibility of shareholders within corporate governance. ESG controversies associated with social issues are often related to unfair trade. The confidence analysis shows that the keywords related to social and governance issues are clustered, and that the probability of mutual occurrence between keywords is high within each group. In particular, "owner's arrest" is caused by "bribery" and "misappropriation" with 80% confidence. The network analysis shows that 'corruption', the keyword most likely to occur alone, is located at the center and is highly related to 'breach of duty', 'embezzlement', and 'bribery'.
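
The three measures named in step 2 have standard definitions: for a rule A → B, support is the fraction of transactions containing both A and B, confidence is that support divided by the support of A, and lift is confidence divided by the support of B. The sketch below computes them on made-up toy data; the function name `rule_metrics` and the item labels are hypothetical and do not come from the study.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    `transactions` is a list of sets, e.g. the keywords found in each article.
    """
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    # Count transactions containing the antecedent, the consequent, and both.
    count_a = sum(1 for t in transactions if antecedent <= t)
    count_c = sum(1 for t in transactions if consequent <= t)
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= t)

    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Toy transactions (made-up data, not from the study).
toy = [{"A", "B"}, {"A", "B"}, {"A"}, {"C"}]
support, confidence, lift = rule_metrics(toy, {"A"}, {"B"})
```

A lift above 1 means the two sides co-occur more often than they would if they were independent, which is the usual reason to keep or merge a keyword pair.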

Corporate Social Responsibility Regulation in the Indonesian Mining Companies

  • NUSWANTARA, Dian Anita;PRAMESTI, Dhea Ayu
    • The Journal of Asian Finance, Economics and Business / Vol. 7, No. 10 / pp. 161-169 / 2020
  • Because mining companies exploit natural resources in their business processes, this research emphasizes social and environmental issues. After twelve years of government regulation on CSR practices, this study investigates the factors that influence mining companies in disclosing information about corporate social responsibility, based on legitimacy, stakeholder, and agency theory. The independent variables are foreign ownership, company size, leverage, and the board of commissioners. The dependent variable is corporate social reporting disclosure, measured using GRI indexing. For sampling, we used thirty-four Indonesian mining companies listed on the IDX during 2014-2018, out of which only fifty-two companies meet the sample criteria. All data must pass the classical assumption test to obtain the best estimator. Multiple linear regression is used to test the hypotheses, and the results show that the model fits well and can explain 60% of the variation in the dependent variable. Based on the F-test, all four variables affect CSR practices simultaneously. The findings suggest that foreign ownership and firm size influence CSR disclosure in a positive direction. However, this study did not support the hypotheses that leverage negatively affects CSR disclosure and that board size positively affects CSR disclosure.

빅데이터마이닝을 이용한 회계정보처리 모형 (Accounting Information Processing Model Using Big Data Mining)

  • 김경일
    • 융합정보논문지 / Vol. 10, No. 7 / pp. 14-19 / 2020
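  • This study proposes an accounting information processing model based on XBRL (eXtensible Business Reporting Language), an Internet standard that applies XML technology to accounting reporting. Because document characteristics differ from company to company, such a model matters in light of accounting's purpose of providing useful information to decision makers. The study proposes a data mining model based on the XML hierarchical structure stored as XBRL in an X-Hive database. The data mining analysis was tested with association rules, and a DC-Apriori data mining method based on XBRL is proposed that combines the Apriori algorithm with XQuery. Finally, the validity and effectiveness of the proposed model were verified through experiments.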

Data Mining and FNN-Driven Knowledge Acquisition and Inference Mechanism for Developing A Self-Evolving Expert Systems

  • Kim, Jin-Sung
    • 한국산학기술학회:학술대회논문집 / 한국산학기술학회 2003 Proceedings / pp. 99-104 / 2003
  • In this research, we propose a mechanism for developing self-evolving expert systems (SEES) based on data mining (DM), fuzzy neural networks (FNN), and a relational database (RDB)-driven forward/backward inference engine. Most earlier researchers tried to develop a text-oriented knowledge base (KB) and inference engine (IE). However, these have limitations in 1) automatic rule extraction, 2) manipulation of ambiguousness in knowledge, 3) expandability of the knowledge base, and 4) speed of inference. To overcome these limitations, many researchers have tried to develop automatic knowledge extraction and refining mechanisms. As a result, the adaptability of expert systems has improved. Nonetheless, they did not suggest a hybrid and generalized solution for developing self-evolving expert systems. To this end, we propose an automatic knowledge acquisition and composite inference mechanism based on DM, FNN, and RDB-driven inference. Empirically, the proposed mechanism has five advantages. First, it can extract and reduce specific domain knowledge from an incomplete database by using a data mining algorithm. Second, it can manipulate the ambiguousness in knowledge by using fuzzy membership functions. Third, it can construct a relational knowledge base and expand it without limit using an RDBMS (relational database management system). Fourth, the proposed hybrid data mining mechanism can reflect both association rule-based logical inference and complicated fuzzy logic. Fifth, RDB-driven forward and backward inference is faster than traditional text-oriented inference.

Data Mining for Knowledge Management in a Health Insurance Domain

  • Chae, Young-Moon;Ho, Seung-Hee;Cho, Kyoung-Won;Lee, Dong-Ha;Ji, Sun-Ha
    • 지능정보연구 / Vol. 6, No. 1 / pp. 73-82 / 2000
  • This study examined the characteristics of knowledge discovery and data mining algorithms to demonstrate how they can be used to predict health outcomes and provide policy information for hypertension management, using the Korea Medical Insurance Corporation database. Specifically, this study validated the predictive power of data mining algorithms by comparing the performance of logistic regression and two decision tree algorithms, CHAID (Chi-squared Automatic Interaction Detection) and C5.0 (a variant of C4.5), since logistic regression has assumed a major position in the healthcare field as a method for predicting or classifying health outcomes based on the specific characteristics of each individual case. The comparison was performed using a test set of 4,588 beneficiaries and the training set of 13,689 beneficiaries that was used to develop the models. Contrary to the previous study, the CHAID algorithm performed better than logistic regression in predicting hypertension, while C5.0 had the lowest predictive power. In addition, the CHAID algorithm and association rules also provided segment characteristics for the risk factors, which may be used in developing hypertension management programs. This showed that a data mining approach can be a useful analytic tool for predicting and classifying health outcomes data.

기업과 소비자간 전자상거래에서의 웹 마이닝을 이용한 상품관리 (Merchandise Management Using Web Mining in Business To Customer Electronic Commerce)

  • 임광혁;홍한국;박상찬
    • 지능정보연구 / Vol. 7, No. 1 / pp. 97-121 / 2001
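  • This study uses web mining to present a merchandise management methodology, built on a systematic approach, that enables efficient merchandise management from the standpoint of the merchandise manager of a virtual store (cyber market) in a business-to-customer electronic commerce environment. The methodology is also applied directly to a virtual store actually operating on the web to provide an empirical example.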

트리 구조를 이용한 연관규칙의 효율적 탐색 (An Efficient Tree Structure Method for Mining Association Rules)

  • 김창오;안광일;김성집;김재련
    • 대한산업공학회지 / Vol. 27, No. 1 / pp. 30-36 / 2001
  • We present a new algorithm for mining association rules in large databases. Association rules are relationships among items in the same transaction, and they provide useful information for marketing. Since the Apriori algorithm was introduced in 1994, many researchers have worked to improve it. However, the drawback of Apriori-based algorithms is that they scan the transaction database repeatedly. The algorithm we propose scans the database only twice. The first scan collects the frequent 1-itemsets. The algorithm then scans the database once more to construct a data structure, the Common-Item Tree, which stores information about frequent itemsets. To find all frequent itemsets, the algorithm scans the Common-Item Tree instead of the database. Because scanning the Common-Item Tree takes less time than scanning the database, the proposed algorithm is more efficient than Apriori-based algorithms.
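
To make the stated drawback concrete, the sketch below is a plain level-wise Apriori baseline that counts how many full passes it makes over the transaction list. It is not the paper's Common-Item Tree algorithm, whose details the abstract does not give; the function name `apriori`, the `min_count` parameter, and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise Apriori baseline.

    Returns ({frequent itemset: count}, number of full database scans).
    """
    transactions = [frozenset(t) for t in transactions]

    # Scan 1: count single items to obtain the frequent 1-itemsets.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}

    result, scans, k = dict(frequent), 1, 2
    while frequent:
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a, b in combinations(list(frequent), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        if not candidates:
            break
        # Each level costs one more full pass over the database.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        scans += 1
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result, scans


# Toy transactions (made-up data).
toy_db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
itemsets, n_scans = apriori(toy_db, min_count=3)
```

On this toy database the baseline needs three passes, one per itemset length it considers, whereas the approach described in the abstract fixes the number of database scans at two regardless of itemset length.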

VOC 기반 연관규칙 마이닝을 이용한 통신선로설비의 장애 예측 (Fault Prediction of a Telecommunications Network using Association Rules Mining based on Voice of the Customer)

  • 나기주;한인섭;조남욱
    • 디지털산업정보학회논문지 / Vol. 11, No. 4 / pp. 13-24 / 2015
  • Handling customer complaints helps organizations retain existing customers and attract new ones. Because Voice of the Customer (VOC) is one of the main sources of customer complaints, many organizations use VOC to enhance customer satisfaction. Effective management of VOC has proven to be one of the best ways to maintain an organization's brand image and reputation. Despite its importance, little has been reported on using VOC to detect faults in the telecommunications industry. In this paper, association rule mining based on VOC is used to identify the root causes of faults in a telecommunications network. To do so, the VOC of a communication service provider was collected first. Association rule mining was then conducted at various support and confidence levels. As a result, the root causes of faults in the telecommunications network can be identified. This study is expected to serve as a basis for decisions about customer satisfaction management, such as preventive maintenance or reduction of customer maintenance costs.