• Title/Summary/Keyword: Rule mining

Cryptocurrency Recommendation Model using the Similarity and Association Rule Mining (유사도와 연관규칙분석을 이용한 암호화폐 추천모형)

  • Kim, Yechan;Kim, Jinyoung;Kim, Chaerin;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems, v.28 no.4, pp.287-308, 2022
  • The explosive growth of cryptocurrency, led by Bitcoin, has recently emerged as a major issue in the financial market. As a result, interest in cryptocurrency investment is increasing, but the market's round-the-clock operation (24 hours a day, 365 days a year), price volatility, and the exponentially increasing number of cryptocurrencies pose risks to cryptocurrency investors. For these reasons, there is a growing need for research that reduces investors' risks by screening out cryptocurrencies that are not suitable for recommendation. Unlike previous studies that maximize returns by simply predicting future cryptocurrency prices or that construct cryptocurrency portfolios focused on returns, this paper reflects the tendencies of investors and presents an interpretable recommendation method that can reduce investors' risks by selecting suitable altcoins, recommended using the Apriori algorithm, a machine learning technique, on the basis of their similarity to and association rules with Bitcoin.
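
The abstract names the Apriori algorithm but gives no data or thresholds, so the following is only a minimal sketch of level-wise frequent-itemset mining. The baskets of coin tickers, the 0.4 support threshold, and the function name `apriori` are all assumptions made for illustration; they are not taken from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical investor portfolios (invented toy data).
baskets = [
    {"BTC", "ETH", "XRP"},
    {"BTC", "ETH"},
    {"BTC", "XRP"},
    {"ETH", "ADA"},
    {"BTC", "ETH", "ADA"},
]
frequent = apriori(baskets, min_support=0.4)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(sup, 2))
```

The prune step is what separates Apriori from brute-force counting: a three-coin candidate is never counted unless every pair inside it is already frequent.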

Comparison of Herbs in Prescription Composition of Consumptive Disease and Internal Injury in Donguibogam Through Network Analysis (네트워크 분석을 통한 동의보감(東醫寶鑑) 내상(內傷)문과 허로(虛勞)문의 처방 구성 본초 비교)

  • Chien-hsin Kuo;Heung Ko;Seon-mi Shin
    • The Journal of Internal Korean Medicine, v.44 no.1, pp.35-52, 2023
  • Objective: Internal injuries and consumptive disease have different causes, yet they can affect each other. The relationship and combination of prescription drugs in the clinical practice of internal injuries and consumptive disease were analyzed for various diseases of "Donguibogam" through network analysis. Methods: The prescriptions used in consumptive disease and internal injury were established by conducting a full survey on the papers extracted from Donguibogam. The R version 4.0.3 (2020-10-10) and the igraph and arules package were used to perform network analysis and association rule relationship mining analysis in the first and second prescription compositions. Results: The herb frequently used for internal injury was Glycyrrhizae Radix, while the herb combination frequently used was Citri Pericarpium-Glycyrrhizae Radix. For centrality, the main factor was generally Glycyrrhizae Radix. In the case of consumptive disease, the herb most frequently used was Angelicae Gigantis Radix, and the combination most frequently used was Rehmanniae Radix Preparata-Angelicae Gigantis Radix. In terms of centrality, it was Angelicae Gigantis Radix. As a result of the network analysis of herbal prescription frequency, each group was divided into three. Conclusion: The interrelationship between internal injury and consumptive disease prescription drugs may reveal the differences and similarities between internal injury and consumptive disease and may serve as a basis for the development of new drugs or materials that can enhance mutual effectiveness in the treatment of internal injury and consumptive diseases.
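
The study itself used R with the igraph and arules packages. As a rough Python analogue of the same idea, the sketch below counts herb frequencies, counts co-occurring herb pairs, and derives degree centrality from the resulting co-occurrence network. The three prescriptions are invented toy data, not compositions taken from Donguibogam, and the variable names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy prescriptions (invented for illustration, NOT from Donguibogam).
prescriptions = [
    {"Glycyrrhizae Radix", "Citri Pericarpium", "Ginseng Radix"},
    {"Glycyrrhizae Radix", "Citri Pericarpium"},
    {"Glycyrrhizae Radix", "Angelicae Gigantis Radix"},
]

# How often each herb appears across prescriptions.
herb_freq = Counter(h for p in prescriptions for h in p)

# How often each unordered herb pair appears in the same prescription.
pair_freq = Counter(
    frozenset(pair) for p in prescriptions for pair in combinations(sorted(p), 2)
)

# Build the co-occurrence network: an edge joins herbs that share a prescription.
neighbors = {h: set() for h in herb_freq}
for pair in pair_freq:
    a, b = tuple(pair)
    neighbors[a].add(b)
    neighbors[b].add(a)

# Degree centrality: share of the other nodes a herb is directly linked to.
n = len(herb_freq)
degree_centrality = {h: len(nb) / (n - 1) for h, nb in neighbors.items()}

print(herb_freq.most_common(1))
print(pair_freq.most_common(1))
print(max(degree_centrality, key=degree_centrality.get))
```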

Selection of Key Management Targets for Claim Causes through Relational Analysis on the Causes of Change Order Claims

  • Min, Kwang-Ho;Ko, Gun-Ho;Jin, Chengquan;Hyun, Chang-Taek;Han, Sang-Won
    • International conference on construction engineering and project management, 2017.10a, pp.281-290, 2017
  • As various stakeholders are involved in construction projects, disputes between the parties are more likely to occur, which is a very important issue for the participants in the projects. Claims in construction projects, however, are very complex and thus difficult to manage. In particular, as the cause of a claim in the preceding stage that has not been resolved in a timely manner has an effect on the cause of a claim in the following stage, it is difficult to find a point of compromise regarding a claim caused by the relationship between the causes that occur in the preceding and following stages. In this regard, this study sought to examine the rules for the generation of change order claims, which occur most frequently among the construction claims, and thus to select the key management targets through the analysis of the relationship between the causes of claims arising in the preceding and following stages for the efficient management of claims. It is expected that the use of rules for the generation of change order claims as well as of representative and similar cases will help the construction practitioners in judging claims, considering the relationships among the causes of the claims. Meanwhile, in this study, association analysis was conducted regarding the causes of the occurrence of change order claims in a design-build delivery method, and therefore, it is necessary to verify the effectiveness of the method when applied to other delivery methods.

Extension Method of Association Rules Using Social Network Analysis (사회연결망 분석을 활용한 연관규칙 확장기법)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems, v.23 no.4, pp.111-126, 2017
  • Recommender systems based on association rule mining contribute significantly to sellers' sales by reducing the time consumers spend searching for the products they want. Recommendations based on the frequency of transactions such as orders can effectively screen out the products that are statistically marketable among multiple products. A product with a high possibility of sales, however, can be omitted from the recommendation if it records an insufficient number of transactions at the beginning of the sale. Products missing from the associated recommendations may lose the chance of exposure to consumers, which leads to a decline in the number of transactions. In turn, diminished transactions may create a vicious circle of lost opportunities to be recommended. Thus, initial sales are likely to remain stagnant for a certain period of time. Products that are susceptible to fashion or seasonality, such as clothing, may be greatly affected. This study aimed to expand association rules so that the list of recommendations includes products whose initial transaction frequency is low despite their high sales potential. The particular purpose is to predict the strength of the direct connection between two unconnected items through the properties of the paths located between them. An association between two items revealed in transactions can be interpreted as an interaction between them, which can be expressed as a link in a social network whose nodes are items. The first step calculates the centralities of the nodes in the middle of the paths that indirectly connect two nodes that have no direct connection. The next step identifies the number of such paths and the shortest among them. These extracted features are used as independent variables in a regression analysis to predict the future connection strength between the nodes. 
The strength of the connection between the two nodes of the model, which is defined by the number of nodes between the two nodes, is measured after a certain period of time. The regression analysis results confirm that the number of paths between the two products, the distance of the shortest path, and the number of neighboring items connected to the products are significantly related to their potential strength. This study used actual order transaction data collected for three months from February to April in 2016 from an online commerce company. To reduce the complexity of analytics as the scale of the network grows, the analysis was performed only on miscellaneous goods. Two consecutively purchased items were chosen from each customer's transactions to obtain a pair of antecedent and consequent, which secures a link needed for constituting a social network. The direction of the link was determined in the order in which the goods were purchased. Except for the last ten days of the data collection period, the social network of associated items was built for the extraction of independent variables. The model predicts the number of links to be connected in the next ten days from the explanatory variables. Of the 5,711 previously unconnected links, 611 were newly connected for the last ten days. Through experiments, the proposed model demonstrated excellent predictions. Of the 571 links that the proposed model predicts, 269 were confirmed to have been connected. This is 4.4 times more than the average of 61, which can be found without any prediction model. This study is expected to be useful regarding industries whose new products launch quickly with short life cycles, since their exposure time is critical. Also, it can be used to detect diseases that are rarely found in the early stages of medical treatment because of the low incidence of outbreaks. 
Since the complexity of the social networking analysis is sensitive to the number of nodes and links that make up the network, this study was conducted in a particular category of miscellaneous goods. Future research should consider that this condition may limit the opportunity to detect unexpected associations between products belonging to different categories of classification.
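
The reported 4.4-fold improvement can be checked from the figures in the abstract. The sketch below assumes that "the average of 61" is the number of hits expected when 571 links are picked at random from the 5,711 candidates; the abstract does not spell this out, so treat that reading as an assumption. Variable names are illustrative.

```python
# Figures quoted in the abstract.
unconnected_links = 5711   # candidate pairs with no link before the test window
newly_connected = 611      # pairs that actually became linked in the last ten days
predicted_links = 571      # pairs the model predicted would connect
confirmed_hits = 269       # predicted pairs that did connect

# Assumed baseline: choose the same number of pairs uniformly at random.
base_rate = newly_connected / unconnected_links
expected_random_hits = predicted_links * base_rate

improvement = confirmed_hits / expected_random_hits
precision = confirmed_hits / predicted_links

print(f"expected random hits: {expected_random_hits:.1f}")
print(f"improvement over random: {improvement:.1f}x")
print(f"precision of predictions: {precision:.3f}")
```

Under that assumption the numbers are mutually consistent: about 61 hits would be expected by chance, and 269 confirmed hits is roughly 4.4 times that.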

A Literature Review and Classification of Recommender Systems on Academic Journals (추천시스템관련 학술논문 분석 및 분류)

  • Park, Deuk-Hee;Kim, Hyea-Kyeong;Choi, Il-Young;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems, v.17 no.1, pp.139-152, 2011
  • Recommender systems have become an important research field since the emergence of the first paper on collaborative filtering in the mid-1990s. In general, recommender systems are defined as supporting systems that help users find information, products, or services (such as books, movies, music, digital products, web sites, and TV programs) by aggregating and analyzing suggestions from other users, which mean reviews from various authorities, and user attributes. Although academic research on recommender systems has increased significantly over the last ten years, more research is required before it becomes applicable in real-world situations, because the research field of recommender systems is still wide and less mature than other research fields. Accordingly, the existing articles on recommender systems need to be reviewed with a view to the next generation of recommender systems. However, it is not easy to confine recommender system research to specific disciplines, given its nature. We therefore reviewed all articles on recommender systems from 37 journals published from 2001 to 2010. The 37 journals were selected from the top 125 journals of the MIS Journal Rankings. The literature search was based on the descriptors "Recommender system", "Recommendation system", "Personalization system", "Collaborative filtering" and "Contents filtering". The full text of each article was reviewed to eliminate articles that were not actually related to recommender systems. Many items were excluded because conference papers, master's and doctoral dissertations, textbooks, unpublished working papers, non-English publications, and news items were unfit for our research. We classified articles by year of publication, journal, recommendation field, and data mining technique. 
The recommendation fields and data mining techniques of 187 articles were reviewed and classified into eight recommendation fields (book, document, image, movie, music, shopping, TV program, and others) and eight data mining techniques (association rule, clustering, decision tree, k-nearest neighbor, link analysis, neural network, regression, and other heuristic methods). The results presented in this paper have several significant implications. First, based on previous publication rates, interest in recommender system research will grow significantly in the future. Second, 49 articles are related to movie recommendation, whereas image and TV program recommendation are identified in only 6 articles. This result is attributable to the easy availability of the MovieLens data set, so data sets for other fields need to be prepared. Third, social network analysis has recently been used in various applications, but studies on recommender systems using social network analysis are scarce. We expect that new recommendation approaches using social network analysis will be developed. Evaluating recommender system research that uses social network analysis will therefore be an interesting area for further research. This result shows the trend of recommender system research by examining the published literature, and provides practitioners and researchers with insight and future direction on recommender systems. We hope that this research helps anyone who is interested in recommender systems research to gain insight for future research.

Recommending Core and Connecting Keywords of Research Area Using Social Network and Data Mining Techniques (소셜 네트워크와 데이터 마이닝 기법을 활용한 학문 분야 중심 및 융합 키워드 추천 서비스)

  • Cho, In-Dong;Kim, Nam-Gyu
    • Journal of Intelligence and Information Systems, v.17 no.1, pp.127-138, 2011
  • The core service of most research portal sites is providing researchers with relevant research papers that match their research interests. This kind of service is effective and easy to use only when a user can provide correct and concrete information about a paper, such as the title, authors, and keywords. Unfortunately, most users of this service are not acquainted with concrete bibliographic information, which implies that most users inevitably go through repeated trial and error in keyword-based search. Retrieving a relevant research paper is especially difficult when a user is a novice in the research domain and does not know appropriate keywords. In this case, a user has to perform iterative searches as follows: i) perform an initial search with an arbitrary keyword, ii) acquire related keywords from the retrieved papers, and iii) perform another search with the acquired keywords. This usage pattern implies that the service quality and user satisfaction of a portal site are strongly affected by the level of its keyword management and search mechanism. To overcome this kind of inefficiency, some leading research portal sites adopt an association rule mining-based keyword recommendation service similar to the product recommendation of online shopping malls. However, keyword recommendation based only on association analysis has the limitation that it can show only a simple and direct relationship between two keywords. In other words, association analysis by itself is unable to present the complex relationships among many keywords in adjacent research areas. To overcome this limitation, we propose a hybrid approach for establishing an association network among the keywords used in research papers. 
The keyword association network can be established by the following phases : i) a set of keywords specified in a certain paper are regarded as co-purchased items, ii) perform association analysis for the keywords and extract frequent patterns of keywords that satisfy predefined thresholds of confidence, support, and lift, and iii) schematize the frequent keyword patterns as a network to show the core keywords of each research area and connecting keywords among two or more research areas. To estimate the practical application of our approach, we performed a simple experiment with 600 keywords. The keywords are extracted from 131 research papers published in five prominent Korean journals in 2009. In the experiment, we used the SAS Enterprise Miner for association analysis and the R software for social network analysis. As the final outcome, we presented a network diagram and a cluster dendrogram for the keyword association network. We summarized the results in Section 4 of this paper. The main contribution of our proposed approach can be found in the following aspects : i) the keyword network can provide an initial roadmap of a research area to researchers who are novice in the domain, ii) a researcher can grasp the distribution of many keywords neighboring to a certain keyword, and iii) researchers can get some idea for converging different research areas by observing connecting keywords in the keyword association network. Further studies should include the following. First, the current version of our approach does not implement a standard meta-dictionary. For practical use, homonyms, synonyms, and multilingual problems should be resolved with a standard meta-dictionary. Additionally, more clear guidelines for clustering research areas and defining core and connecting keywords should be provided. Finally, intensive experiments not only on Korean research papers but also on international papers should be performed in further studies.
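
The three phases described above (treat each paper's keywords as a basket, keep rules that pass support, confidence, and lift thresholds, then draw the surviving rules as a network) can be sketched as follows. The authors used SAS Enterprise Miner and R; this Python version, the four toy papers, and the threshold values are assumptions made purely for illustration.

```python
from itertools import permutations

# Phase i: each paper's keyword set plays the role of a co-purchase basket.
papers = [
    {"data mining", "association rule", "recommender system"},
    {"data mining", "social network"},
    {"association rule", "recommender system"},
    {"data mining", "association rule"},
]


def support(itemset):
    """Fraction of papers that contain every keyword in itemset."""
    return sum(1 for p in papers if itemset <= p) / len(papers)


def rule_metrics(antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    sup = support(antecedent | consequent)
    conf = sup / support(antecedent)
    lift = conf / support(consequent)
    return sup, conf, lift


# Phase ii: keep keyword-to-keyword rules that pass all three thresholds.
MIN_SUPPORT, MIN_CONFIDENCE, MIN_LIFT = 0.5, 0.6, 1.0  # assumed values
keywords = sorted(set().union(*papers))
edges = []
for a, b in permutations(keywords, 2):
    sup, conf, lift = rule_metrics({a}, {b})
    if sup >= MIN_SUPPORT and conf >= MIN_CONFIDENCE and lift > MIN_LIFT:
        edges.append((a, b))

# Phase iii: the surviving rules are the directed edges of the keyword network.
for a, b in edges:
    print(f"{a} -> {b}")
```

In this toy data the pair "data mining" and "association rule" co-occurs often enough to pass the support threshold, yet its lift is below 1, so it is dropped. That is the reason for checking lift in addition to support and confidence.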

Text Mining and Association Rules Analysis to a Self-Introduction Letter of Freshman at Korea National College of Agricultural and Fisheries (2) (한국농수산대학 신입생 자기소개서의 텍스트 마이닝과 연관규칙 분석 (2))

  • Joo, J.S.;Lee, S.Y.;Kim, J.S.;Shin, Y.K.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research, v.22 no.2, pp.99-114, 2020
  • In this study, we performed topic analysis and correlation analysis through text mining of the self-introduction letters of freshmen at Korea National College of Agriculture and Fisheries (KNCAF) in 2020. The analysis items of the 3rd question were and the 4th question were the motivation for applying to college, the academic plan and the career plan. Text mining of the 3rd question showed that the frequency of 'friends' was overwhelmingly high, followed by keywords such as 'thought', 'time', 'opinion', 'activity', and 'club'. In the 4th question, the frequency of keywords such as 'thought', 'agriculture', 'KNCAF', 'farm', and 'father' was high. The association rules analysis for each question showed that the relationships with the highest support level, which indicates the frequency and importance of a rule, were {friend} <=> {thought} and {thought} <=> {KNCAF}. The confidence level, which indicates the correlation between keywords, was highest for the rules {teacher} => {friend} and {agriculture, KNCAF} => {thought}. The lift level, which indicates the closeness of two words, was highest for the rules {friend} <=> {teacher} and {knowledge} <=> {professional}. These keywords were found to play very important roles in the analysis of betweenness centrality and degree centrality between keywords. The results of the frequency analysis and association analysis were visualized with word clouds and correlation graphs to make all the results easier to understand.

An Analysis of the 20th National Congress Report through Text-mining Methods (텍스트 마이닝을 활용한 중국공산당 20차 당대회 보고문 분석)

  • Kwon, Dokyung;Kim, Jungsoo;Park, Jihyun
    • Analyses & Alternatives, v.7 no.1, pp.115-145, 2023
  • The 20th National Congress of the Chinese Communist Party (hereafter referred to as "the 20th National Congress") was under the global spotlight long before it was held for seven days from 16 to 22 October 2022. People wondered whether Xi Jinping would secure a third term as China's leader or whether he would lay the foundations to be in power forever during the third term. In Korea, the press and media questioned whether the event would become the "crowning of Emperor Xi (Xi Huangdi)," whose power rivaled that of the first emperor in China, Shi Huangdi, and featured the scene where Hu Jintao was forced to leave the venue during the Congress. On the other hand, many Korean academics focused more on how Xi would organize the Politburo and its Standing Committee and whether the outline of his heirs would appear during the event. This tendency in academia in turn worsened the media's concerns. This paper presents a quantitative analysis of the 20th National Congress Report, as opposed to an analysis of Xi's political intentions at the event. The National Congress Report outlines the Party's visions, goals, and strategies for the next five years in politics, economy, society, culture, foreign affairs, and relationship with Taiwan. The authoritative document is rich in narrative and logic and deserves academic study. This research analyzes the 18th, 19th, and 20th Reports by identifying their keywords and regular expressions and checking their frequency and percentage through text-mining methods. This approach enables the quantification and visualization of the significant changes in the Party's sovereign vision over the fifteen years of Xi's rule from 2013 to 2027.
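
The frequency-and-percentage comparison described in the abstract can be illustrated with a very small sketch. The two "reports" below are invented English placeholder strings, not excerpts from any Party document, and the function name `keyword_shares` is hypothetical. The actual study worked on Chinese text, which would also need a word segmentation step that is not shown here.

```python
import re
from collections import Counter


def keyword_shares(text, keywords):
    """Return {keyword: (count, percentage of all tokens)} for one document."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {k: (counts[k], 100 * counts[k] / total) for k in keywords}


# Placeholder documents (invented toy text).
reports = {
    "report_A": "security and development and security",
    "report_B": "development reform development opening",
}

tracked = ["security", "development"]
shares = {name: keyword_shares(text, tracked) for name, text in reports.items()}

for name, result in shares.items():
    for keyword, (count, pct) in result.items():
        print(f"{name}: {keyword} appears {count} times ({pct:.1f}% of tokens)")
```

Comparing percentages rather than raw counts keeps documents of different lengths comparable.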

Performance Comparison of Clustering using Discritization Algorithm (이산화 알고리즘을 이용한 계층적 클러스터링의 실험적 성능 평가)

  • Won, Jae Kang;Lee, Jeong Chan;Jung, Yong Gyu;Lee, Young Ho
    • Journal of Service Research and Studies, v.3 no.2, pp.53-60, 2013
  • Various data mining techniques have been developed for obtaining information from large data sets. In recent years, in pattern recognition and machine learning, which are among the most sought-after areas, most existing learning algorithms build a rule or decision model based on categorical attributes. Real-world data, however, often consist of numeric attributes, or contain attributes with numerical values alongside ordinary categorical attributes. In such cases, a preprocessing step is required to convert the values into a type appropriate for learning. In this paper, the domain of each numeric attribute is divided into several segments using a discretization algorithm. Clustering is described together with other data mining techniques. Clustering splits a large database into smaller groups of records with similar characteristics; given a finite set of patterns in the pattern space, patterns that are close to each other together make up a cluster. Patterns are extracted from the given data without specifying particular categories in advance. Clustering is thus described as a technique that classifies data by grouping similar data.
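
The abstract does not say which discretization algorithm was used. As one common possibility, the sketch below shows equal-width binning, which divides the range of a numeric attribute into segments of equal size. The choice of method, the function name `equal_width_bins`, and the sample ages are all assumptions made for illustration.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1 (equal-width bins)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls in the first bin.
        return [0] * len(values)
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical numeric attribute.
ages = [18, 22, 25, 31, 40, 58]
labels = equal_width_bins(ages, n_bins=4)
print(list(zip(ages, labels)))
```

After this step the numeric attribute behaves like a categorical one, so it can be used by algorithms that expect categorical inputs.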

Identification of Emerging Research at the national level: Scientometric Approach using Scopus (국가적 차원의 유망연구영역 탐색: Scopus 데이터베이스를 이용한 과학계량학적 접근)

  • Yeo, Woon-Dong;Sohn, Eun-Soo;Jung, Eui-Seob;Lee, Chang-Hoan
    • Journal of Information Management, v.39 no.3, pp.95-113, 2008
  • In today's environment, in which scientific technologies are changing faster than ever, companies have to monitor and search for emerging technologies to gain competitiveness. Many nations try to do the same. Most of them use the Delphi approach, based on expert review, as a search method. But expert review has been criticized for possible bias and its derivative problems, in the sense that it relies only on experts' subjectivity. To overcome such problems, we used a scientometric method to identify emerging technologies, a task that has as a rule been done with Delphi. We made three particular efforts to improve the quality of the result. Firstly, we selected one alternative database between SCI and Scopus, hoping to see results evenly distributed across a wide range of active fields. Secondly, we used fractional citation counting when counting citations in the linear regression analysis stage. Lastly, we verified the scientometric result with expert opinions to minimize probable errors in scientometric research. As a result, we derived 290 emerging technologies from the scientometric analysis with the Scopus database, and visualized them on a 2-dimensional map with a data mining system named KnowledgeMatrix, which was developed by KISTI.