• Title/Summary/Keyword: graph mining (그래프 마이닝)

Search Result 71, Processing Time 0.04 seconds

Selecting a key issue through association analysis of realtime search words (실시간 검색어 연관 분석을 통한 핵심 이슈 선정)

  • Chong, Min-Yeong
    • Journal of Digital Convergence / v.13 no.12 / pp.161-169 / 2015
  • Typical portal sites list realtime search words in descending order of search frequency and refresh the list every few seconds to show issues whose interest is rising rapidly. However, because the list reorders within such a short time, the key issues of the day can be overlooked. This paper proposes a method for deriving a key issue through association analysis of realtime search words. The proposed method first scores realtime search words according to their ranking and relative interest, and derives the top 10 search words through descriptive statistics for groups. Then it extracts association rules based on 'support' and 'confidence', and chooses the key issue from a graph that visualizes the results. Experimental results show that the key issue selected through association rules is more meaningful than the top-ranked realtime search word.
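
The 'support' and 'confidence' measures named in this abstract can be illustrated with a minimal Python sketch. The snapshot data, the thresholds, and the function name `pair_rules` are assumptions made for illustration only; the paper's scoring step and its actual rule-mining procedure are not reproduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each set is one snapshot of realtime search words.
snapshots = [
    {"election", "debate", "candidate"},
    {"election", "debate"},
    {"election", "weather"},
    {"debate", "candidate"},
]

def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) tuples
    for single-word rules x -> y that pass both thresholds."""
    n = len(transactions)
    # How many snapshots contain each word.
    item_counts = Counter(w for t in transactions for w in t)
    # How many snapshots contain each unordered pair of words.
    pair_counts = Counter(
        frozenset(p) for t in transactions for p in combinations(sorted(t), 2)
    )
    rules = []
    for pair, count in pair_counts.items():
        support = count / n  # share of snapshots containing both words
        if support < min_support:
            continue
        a, b = sorted(pair)
        for x, y in ((a, b), (b, a)):
            # confidence(x -> y) = count(x and y) / count(x)
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)

rules = pair_rules(snapshots)
for x, y, s, c in rules:
    print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}")
```

Rules that survive both thresholds could then be drawn as directed edges between words, which is one plausible reading of the graph visualization the abstract mentions.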

Design and Application of a Winning Forecast Model of the AOS Genre Game (AOS 장르 게임의 승패 예측 모형의 설계와 활용)

  • Ku, Ji-Min;Yu, Kyeonah
    • KIISE Transactions on Computing Practices / v.23 no.1 / pp.37-44 / 2017
  • Games of the AOS genre are classified as e-sports rather than recreational computer games. Because sports demand expertise, statistical analyses of aspects such as game-playing patterns and the season's characters are gaining importance. In this study, we conducted a strategic analysis of computer games by applying data mining techniques to League of Legends, a representative AOS game. We designed and tested a winning forecast model using win-probability prediction techniques such as logistic regression analysis, discriminant analysis, and artificial neural networks. The game data analysis results were represented as a probabilistic graph and used in a visualization tool for game play. Experimental results showed that the winning forecast model achieved a high classification rate of 95% on average, suggesting that it can be used with the visualization tool to establish various game-play strategies.

Adaptive Customer Relation Management Strategies using Association Rules (연관 규칙을 이용한 적응적 고객 관계 관리 전략)

  • Han, Ki-Tae;Chung, Kyung-Yong;Baek, Jun-Ho;Kim, Jong-Hun;Ryu, Joong-Kyung;Lee, Jung-Hyun
    • Proceedings of the Korea Contents Association Conference / 2008.05a / pp.84-86 / 2008
  • Customer relationship marketing has emerged as a way for companies to manage and efficiently obtain filtered information. Data mining is being applied to build management systems that can even predict and recommend products to customers. In this paper, we propose adaptive customer relation management strategies using the association rules of data mining. The proposed method composes frequent customer sets from the occurrences of candidate customer sets and creates rules of associated customers. We analyzed the features of purchasing customers efficiently using hypergraph partitioning according to the lift of the created association rules. From this, we derived cross-selling and up-selling strategies for customers.

Employee's Discontent Text Analysis on Anonymous Company Review Web and Suggestions for Discontent Resolve (기업 리뷰 웹 사이트 텍스트 분석을 통한 직원 불만 표현 추출과 불만 원인 도출 및 해소 방안)

  • Baek, HyeYeon;Park, Yongsuk
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.4 / pp.357-364 / 2019
  • As insiders account for around 80% of industrial information leaks, most related studies briefly attribute the cause to discontent with salary or the human resources system. This paper scrapes texts from Jobplanet, an anonymous company review website, and analyzes discontent keywords and their contexts in 7 related areas to examine those briefly stated causes in more detail. After drawing an LGG (Local Grammar Graph) for each area with a related dictionary list, this paper shows an example concordance as evidence and suggests several ways to prevent human resources leakage. Finally, the text analysis results are compared with previous studies based on surveys with limited questions and answers. This study is meaningful in that it expands the scope of employee discontent analysis using company review text and provides more specific, granular, and honest discontent vocabulary.

Korean Collective Intelligence in Sharing Economy Using R Programming: A Text Mining and Time Series Analysis Approach (R프로그래밍을 활용한 공유경제의 한국인 집단지성: 텍스트 마이닝 및 시계열 분석)

  • Kim, Jae Won;Yun, You Dong;Jung, Yu Jin;Kim, Ki Youn
    • Journal of Internet Computing and Services / v.17 no.5 / pp.151-160 / 2016
  • The purpose of this research is to investigate current Korean popular attitudes toward and social perceptions of the term 'sharing economy' from a creative or socio-economic point of view. By applying text mining within a big data analysis approach, this study discovers and interprets the objective and tangible annual changes and patterns of sociocultural collective intelligence in Korea over the last five years. Through web crawling and Google searches, this study collected a significant amount of time series web meta-data on the theme of the sharing economy from the World Wide Web from 2010 to 2014. The large amounts of raw data concerning the sharing economy were then processed into meaningful, value-added graphs and figures in word cloud form using the word cloud function of R programming. Although accumulated data and collective intelligence about the sharing economy are still lacking, it is worth noting that this study carried out preliminary research on time-series big data analysis from the perspective of knowledge management and processing. Thus, the results of this study can be used as fundamental data to help understand the academic and industrial aspects of future sharing economy-related markets or consumer behavior.

Text Mining Analysis Technique on ECDIS Accident Report (텍스트 마이닝 기법을 활용한 ECDIS 사고보고서 분석)

  • Lee, Jeong-Seok;Lee, Bo-Kyeong;Cho, Ik-Soon
    • Journal of the Korean Society of Marine Environment & Safety / v.25 no.4 / pp.405-412 / 2019
  • SOLAS requires that ECDIS be installed on ships of more than 500 gross tonnage engaged in international navigation by the first inspection after July 1, 2018. Several accidents related to the use of ECDIS have occurred since its installation as a new major navigation instrument. The 12 incident reports issued by MAIB, BSU, BEAmer, DMAIB, and DSB were analyzed, and the causes of the accidents were determined to be related to the operation of the navigator and the ECDIS system. The text was analyzed using the R program to quantitatively analyze words related to the causes of the accidents. We used text mining techniques such as Wordcloud, Wordnetwork, and Wordweight to represent the importance of words according to their frequency of occurrence. Wordcloud uses the N-gram model to express the frequency of used words in cloud form. In the uni-gram analysis of the N-gram model, the word "ECDIS" occurred most often, and the bi-gram analysis showed that "Safety Contour" was the most frequent term. Based on the bi-gram analysis, the causative words were classified into those related to the officer and those related to the ECDIS system, and the related words were represented by Wordnetwork. Finally, the words related to the officer and the ECDIS system were compiled into word corpora, and Wordweight was applied to analyze the change in corpus frequency by year. Analysis of the corpus variation with a trend line graph shows that the officer corpus has decreased more recently, whereas the ECDIS system corpus is gradually increasing.
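
The uni-gram and bi-gram counting described in this abstract can be sketched as follows. The study itself used R; this Python version, the sample text, and the helper name `ngram_counts` are assumptions for illustration, and no stop-word removal or other preprocessing from the paper is implied.

```python
import re
from collections import Counter

def ngram_counts(text, n):
    """Count word n-grams in text after lowercasing and simple tokenizing."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # Slide a window of length n across the token list.
    # Note: windows are allowed to cross sentence boundaries in this sketch.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)

# Hypothetical excerpt standing in for an accident report.
report = (
    "The safety contour was set incorrectly. "
    "The officer did not check the safety contour alarm on the ECDIS."
)

unigrams = ngram_counts(report, 1)
bigrams = ngram_counts(report, 2)
print(unigrams.most_common(3))
print(bigrams.most_common(3))
```

In practice, function words such as "the" dominate raw uni-gram counts, so a stop-word filter would usually be applied before the counts are drawn as a word cloud.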

The Prediction of the Helpfulness of Online Review Based on Review Content Using an Explainable Graph Neural Network (설명가능한 그래프 신경망을 활용한 리뷰 콘텐츠 기반의 유용성 예측모형)

  • Eunmi Kim;Yao Ziyan;Taeho Hong
    • Journal of Intelligence and Information Systems / v.29 no.4 / pp.309-323 / 2023
  • As the role of online reviews has become increasingly crucial, numerous studies have been conducted to utilize helpful reviews. Helpful reviews, perceived by customers, have been verified in various research studies to be influenced by factors such as ratings, review length, review content, and so on. The determination of a review's helpfulness is generally based on the number of 'helpful' votes from consumers, with more 'helpful' votes considered to have a more significant impact on consumers' purchasing decisions. However, recently written reviews that have not been exposed to many customers may have relatively few 'helpful' votes and may lack 'helpful' votes altogether due to a lack of participation. Therefore, rather than relying on the number of 'helpful' votes to assess the helpfulness of reviews, we aim to classify them based on review content. In addition, the text of the review emerges as the most influential factor in review helpfulness. This study employs text mining techniques, including topic modeling and sentiment analysis, to analyze the diverse impacts of content and emotions embedded in the review text. In this study, we propose a review helpfulness prediction model based on review content, utilizing movie reviews from IMDb, a global movie information site. We construct a review helpfulness prediction model by using an explainable Graph Neural Network (GNN), while addressing the interpretability limitations of the machine learning model. The explainable graph neural network is expected to provide more reliable information about helpful or non-helpful reviews as it can identify connections between reviews.

Analysis of Gene-Drug Interactions Using Bayesian Networks (베이지안망을 이용한 유전자와 약물 간 관계 분석)

  • O, Seok-Jun;Hwang, Gyu-Baek;Jang, Jeong-Ho;Jang, Byeong-Tak
    • Proceedings of the Korean Statistical Society Conference / 2002.05a / pp.91-97 / 2002
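  • The recent automation and acceleration of instruments for biological research have brought a rapid increase in the volume of biological information. For example, microarrays obtained from DNA chips measure the expression levels of thousands of genes simultaneously. These technologies offer the opportunity to observe the diverse phenomena occurring in cells and tissues from a holistic perspective, and they are expected to advance biotechnology as a whole. Accordingly, large volumes of biological data are being analyzed and mined, with representative techniques including various clustering methods and neural-network-based models. This paper applies the Bayesian network, a kind of probabilistic graphical model, to the analysis of biological data. Specifically, we model the probabilistic relationships among gene expression patterns, drug activity patterns, and cancer types. The model is built by learning a Bayesian network from the NCI60 dataset (http://discover.nci.nih.gov). Techniques are presented to address the difficulties that arise because the data under analysis are sparse, and the learned model is validated by comparison with biologically established facts. The learned Bayesian network model represents many actual biological relationships between genes, or between genes and the applied drugs, which shows that the proposed method can be useful for efficiently generating biologically meaningful hypotheses through data analysis.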

Text-mining Based Graph Model for Keyword Extraction from Patent Documents (특허 문서로부터 키워드 추출을 위한 텍스트 마이닝 기반 그래프 모델)

  • Lee, Soon Geun;Leem, Young Moon;Um, Wan Sup
    • Journal of the Korea Safety Management & Science / v.17 no.4 / pp.335-342 / 2015
  • Increasing interest in patents has led many individuals and companies to apply for patents in various areas. Patent applications are stored as electronic documents, and searching and categorizing these documents are major issues in data mining. Keyword extraction, which retrieves the representative keywords of a document, is especially important. Most keyword extraction techniques are based on the vector space model. This model relies only on the frequency of terms in documents: it weights terms by their frequency and selects keywords in order of weight. As a result, it cannot reflect the relations between keywords. This paper proposes an improved way to extract more representative keywords by overcoming this limit. The proposed model first prepares a candidate set using the vector model, then builds a graph that represents the relations between pairs of candidate keywords in the set, and selects keywords based on this relationship graph.
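
The two-stage idea in this abstract (frequency-based candidates, then graph-based selection) can be sketched in Python. The abstract does not say how relations between keywords are measured or how keywords are chosen from the graph, so sentence-level co-occurrence and weighted degree are assumptions here, as are the toy corpus and the function name `graph_keywords`.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each inner list is one tokenized sentence.
sentences = [
    ["battery", "cell", "electrode"],
    ["battery", "electrode", "coating"],
    ["battery", "housing"],
    ["electrode", "coating"],
]

def graph_keywords(sentences, n_candidates=3, n_keywords=2):
    # Step 1: candidate set from raw term frequency
    # (a simple stand-in for the vector space model).
    freq = Counter(w for s in sentences for w in s)
    candidates = {w for w, _ in freq.most_common(n_candidates)}

    # Step 2: relationship graph. Edge weight is the number of sentences
    # in which two candidates co-occur (assumed relation measure).
    weights = Counter()
    for s in sentences:
        present = sorted(set(s) & candidates)
        for a, b in combinations(present, 2):
            weights[(a, b)] += 1

    # Step 3: rank candidates by weighted degree (assumed selection rule),
    # breaking ties alphabetically.
    score = Counter()
    for (a, b), w in weights.items():
        score[a] += w
        score[b] += w
    ranked = sorted(candidates, key=lambda w: (-score[w], w))
    return ranked[:n_keywords]

keywords = graph_keywords(sentences)
print(keywords)
```

In this toy corpus "battery" and "electrode" are tied on frequency, but "electrode" has stronger links to the other candidates, so the graph step ranks it first. That is the kind of relational information a frequency-only model cannot capture.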

Idea proposal of InfograaS for Visualization of Public Big-data (공공 빅데이터의 시각화를 위한 InfograaS의 아이디어 제안)

  • Cha, Byung-Rae;Lee, Hyung-Ho;Sim, Su-Jeong;Kim, Jong-Won
    • Journal of Advanced Navigation Technology / v.18 no.5 / pp.524-531 / 2014
  • In this paper, we propose processing and analyzing linked open data (LOD), a kind of big data, using cloud computing resources. LOD is web-based open data intended for sharing and reusing public data. In particular, we define InfograaS (Info-graphic as a service), a new business area of SaaS (software as a service), to support visualization techniques for BA (business analytics) and info-graphics. The goal of this study is to make it easy for non-specialists and beginners to use without the help of visualization and business analysis experts. Data visualization is the process of representing data visually so that the analysis is easy to understand. Its purpose is to deliver information clearly and effectively through charts and figures. Public big data are shared and presented in easily understood charts and graphics built from the results of various processing steps using open source tools such as Hadoop, R, machine learning, and data mining, together with cloud computing resources.