• Title/Summary/Keyword: big data mining


Study on prediction for a film success using text mining (텍스트 마이닝을 활용한 영화흥행 예측 연구)

  • Lee, Sanghun;Cho, Jangsik;Kang, Changwan;Choi, Seungbae
    • Journal of the Korean Data and Information Science Society / v.26 no.6 / pp.1259-1269 / 2015
  • Big data has recently become a key term in academia, and its usefulness is now recognized by governments, local public bodies, and enterprises as well, all of which are working to extract useful information from it. This study analyzes the box office success or failure of films using text mining. The data were taken from portal site 'D', together with film review data, grade point averages, and the number of screens obtained from the Korean Film Commission. The purpose of this paper is to propose a model that predicts from these data whether a film will succeed. In the analysis, the proposed prediction model achieved a correct classification rate of 95.74%.

A Comparison of Starbucks between South Korea and U.S.A. through Big Data Analysis (빅데이터 분석을 통한 한국과 미국의 스타벅스 비교 분석)

  • Jo, Ara;Kim, Hak-Seon
    • Culinary science and hospitality research / v.23 no.8 / pp.195-205 / 2017
  • The purpose of this study was to compare Starbucks in South Korea with Starbucks in the U.S.A. through semantic network analysis of big data. Online data were collected with SCTM (Smart Crawling & Text Mining), a data collecting and processing program developed by the big data research institute at Kyungsung University. The data collection period was from January 1st 2014 to December 7th 2017, and NetDraw, packaged with UCINET 6.0, was used for data analysis and visualization. After performing CONCOR (convergence of iterated correlation) analysis and centrality analysis, this study illustrated the current characteristics of Starbucks in Korea and the U.S.A. as reflected in the social network, and the differences between the two countries. Because Starbucks has grown considerably, especially in Korea, this study also aimed to provide significant, social-network-oriented suggestions for Starbucks USA, Starbucks Korea, and the coffee industry as a whole. The study also showed that big data analytics can generate new insights into variables that have been extensively studied in the existing hospitality literature. In addition, implications for theory and practice as well as directions for future research are discussed.
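
To make the centrality step concrete, here is a minimal sketch of building a keyword co-occurrence network and computing normalized degree centrality. It is an illustration only: the study used UCINET 6.0 and NetDraw, and the function names, the tokenized input format, and the sample keywords below are assumptions, not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_network(docs):
    """Count how often each pair of keywords appears in the same document.

    `docs` is assumed to be a list of already-tokenized keyword lists.
    Returns a dict mapping (keyword_a, keyword_b) -> co-occurrence count,
    with each pair stored in alphabetical order.
    """
    weights = defaultdict(int)
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def degree_centrality(edges):
    """Normalized degree centrality: neighbors / (number of nodes - 1)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nb) / (n - 1) for node, nb in neighbors.items()}


if __name__ == "__main__":
    # Hypothetical keyword lists, for illustration only.
    sample = [
        ["coffee", "starbucks", "menu"],
        ["starbucks", "store"],
        ["coffee", "starbucks"],
    ]
    edges = cooccurrence_network(sample)
    for node, score in sorted(degree_centrality(edges).items()):
        print(node, round(score, 3))
```

Keywords that never co-occur with another keyword do not appear in the network in this sketch; a fuller analysis would decide explicitly how to treat such isolates.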

A Big Data Learning for Patent Analysis (특허분석을 위한 빅 데이터학습)

  • Jun, Sunghae
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.5 / pp.406-411 / 2013
  • Big data has been considered in diverse fields, and big data learning is required in areas ranging from engineering to social science. Statistics and machine learning algorithms are representative tools for big data learning. In this paper, we study learning tools for big data and propose an efficient methodology for big data learning, from legacy data to practical application. We apply our big data learning to patent analysis, because patents are one kind of big data, and we use the patent analysis results for technology forecasting. To illustrate how the proposed methodology could be applied in a real domain, we retrieve patents related to big data from patent databases around the world. Using the retrieved patent data, we perform a case study with text mining preprocessing and multiple linear regression.

Study on Application of Big Data in Packaging (패키징(Packaging) 분야에서의 빅데이터(Big data) 적용방안 연구)

  • Kang, WookGeon;Ko, Euisuk;Shim, Woncheol;Lee, Hakrae;Kim, Jaineung
    • KOREAN JOURNAL OF PACKAGING SCIENCE & TECHNOLOGY / v.23 no.3 / pp.201-209 / 2017
  • Big Data, an element of the Fourth Industrial Revolution, has drawn attention since the Fourth Industrial Revolution was discussed at the 2016 World Economic Forum. Big Data is used in various fields because it can predict the near future and create new business. However, its use and study in the field of packaging are lacking. Packaging today is expected to include marketing elements that affect consumer choice, and big data is actively used in marketing, where it can be used to analyze sales information and consumer reactions to produce meaningful results. This study therefore proposes a method of applying big data in the field of packaging, with a focus on marketing. It suggests using private data and community data to analyze the interaction between consumers and products. Using social big data makes it possible to understand preferred packaging and consumer perceptions and emotions within the same product line. It can also be used to analyze the effect of packaging among the various components of a product. Because packaging is only one of many product components, the impact of a single packaging element is not easy to isolate. Nevertheless, this study presents the possibility of using Big Data to analyze consumers' perceptions and feelings about packaging.

Inter-category Map: Building Cognition Network of General Customers through Big Data Mining

  • Song, Gil-Young;Cheon, Youngjoon;Lee, Kihwang;Park, Kyung Min;Rim, Hae-Chang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.2 / pp.583-600 / 2014
  • Social media is considered a valuable platform for gathering and analyzing the collective and subconscious opinions of people in Internet and mobile environments, where they express, explicitly and implicitly, their daily preferences for brands and products. Extracting and tracking the various attitudes and concerns that people express through social media could enable us to categorize brands and decipher individuals' cognitive decision-making structure in their choice of brands. We investigate the cognitive network structure of consumers by building an inter-category map through the mining of big data. In so doing, we create an improved online recommendation model. Building on economic sociology theory, we suggest a framework for revealing collective preference by analyzing the patterns of brand names that users frequently mention in the online public sphere. We expect that our study will be useful for those conducting theoretical research on digital marketing strategies and doing practical work on branding strategies.

PPFP(Push and Pop Frequent Pattern Mining): A Novel Frequent Pattern Mining Method for Bigdata Frequent Pattern Mining (PPFP(Push and Pop Frequent Pattern Mining): 빅데이터 패턴 분석을 위한 새로운 빈발 패턴 마이닝 방법)

  • Lee, Jung-Hun;Min, Youn-A
    • KIPS Transactions on Software and Data Engineering / v.5 no.12 / pp.623-634 / 2016
  • Most existing frequent pattern mining methods focus on time efficiency and rely heavily on primary memory. However, in the era of big data, the size of real-world databases to be mined is increasing exponentially, and primary memory is no longer sufficient to mine frequent patterns from large real-world data sets. Disk-based frequent pattern mining methods have been studied to solve this problem and improve scalability, but they are very time consuming compared with memory-based methods. In this paper, we present PPFP, a novel disk-based approach for mining frequent itemsets from big data, which reduces the main memory size bottleneck. The PPFP algorithm is based on the FP-growth method, one of the most popular and efficient frequent pattern mining approaches. Mining with PPFP consists of two steps. (1) Constructing an IFP-tree: after constructing the FP-tree, we assign an index number to each node of the FP-tree with a novel index numbering method, and then store the indexed FP-tree (IFP-tree) on disk as an IFP-table. (2) Mining frequent patterns with PPFP: frequent patterns are mined by expanding patterns using a stack-based PUSH-POP method (the PPFP method). By using only a very small amount of memory for the recursive and time-consuming operations of the mining process, this approach improves the scalability and time efficiency of frequent pattern mining, as the reported test results demonstrate.
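
The idea of expanding patterns with an explicit stack instead of recursion can be sketched as follows. This is not the PPFP algorithm: it has no FP-tree, IFP-table, or disk storage, and it substitutes a simple in-memory intersection of transaction-id sets for FP-growth. The function name and data layout are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Set


def frequent_itemsets(
    transactions: List[Set[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Enumerate frequent itemsets with an explicit push/pop stack.

    Support is counted by intersecting transaction-id sets, which stands
    in for the tree traversal a real FP-growth variant would perform.
    """
    # Vertical layout: item -> ids of the transactions that contain it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    items_sorted = sorted(
        item for item, tids in tidsets.items() if len(tids) >= min_support
    )
    result: Dict[FrozenSet[str], int] = {}

    # Each stack entry: (prefix itemset, ids containing it, next item index).
    stack = [((), set(range(len(transactions))), 0)]
    while stack:
        prefix, tids, start = stack.pop()  # POP a pattern to expand
        for idx in range(start, len(items_sorted)):
            item = items_sorted[idx]
            new_tids = tids & tidsets[item]
            if len(new_tids) >= min_support:
                new_prefix = prefix + (item,)
                result[frozenset(new_prefix)] = len(new_tids)
                # PUSH the longer pattern so it is expanded later.
                stack.append((new_prefix, new_tids, idx + 1))
    return result


if __name__ == "__main__":
    # Hypothetical transactions, for illustration only.
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in frequent_itemsets(data, min_support=3).items():
        print(sorted(itemset), support)
```

Because pending patterns live on an explicit stack, the expansion state is plain data that could in principle be kept outside the call stack, which is the property the abstract attributes to the PUSH-POP design.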

A Study on the Perception of Corona19 Period Play Culture Based on Big Data Analysis

  • Jung, Seon-Jin
    • International Journal of Internet, Broadcasting and Communication / v.12 no.4 / pp.196-203 / 2020
  • In this study, we explored a practical direction for play culture by examining, through big data analysis, the social perception of how play culture has changed due to Corona 19. We used Textom, a website specializing in big data collection, and collected 10,216 data items from various portal sites at home and abroad using the keywords "Corona + Play," "Play Culture" and "Leisure" for the period from January 19, 2020, when the first confirmed case of Corona 19 occurred in Korea, to September 30, 2020. The results showed that the positive social perception of play culture during Corona 19 was 51.61%, not much different from the negative perception of 48.15%. On the premise that situations similar to the current With Corona period and Corona 19 can occur at any time, it is necessary to develop play culture programs that identify people's various desires and emotions and help them find mental and physical stability and vitality in unstable situations. The results of this study can be used as basic data for developing play culture policies or programs, and the study is significant in helping to vitalize research on big data utilization in the fields of play, leisure, and culture.

Knowledge Extractions, Visualizations, and Inference from the big Data in Healthcare and Medical

  • Kim, Jin Sung
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.5 / pp.400-405 / 2013
  • The purpose of this study is to develop a composite platform for knowledge extraction, visualization, and inference. Big data sets are frequently used in the healthcare and medical area. To help knowledge managers and users working in the field, this study focuses on knowledge management (KM) based on Data Mining (DM), Knowledge Distribution Map (KDM), Decision Tree (DT), RDBMS, and SQL-inference. The proposed mechanism is composed of five key processes. First, in Knowledge Parsing, it extracts logical rules from a big data set using DM technology and then transforms the rules into RDB tables. Second, through Knowledge Maintenance, it refines and manages the knowledge so that it is ready for computing knowledge distributions. Third, in the Knowledge Distribution process, the knowledge distributions can be viewed using the DT mechanism. Fourth, in Knowledge Hierarchy, the platform shows the hierarchy of the knowledge. Finally, in Inference, it deduces conclusions from the given facts and data. This approach offers diversity in knowledge representation and inference to improve the quality of computer-based medical diagnosis.
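
One way to picture rules stored in RDB tables and evaluated by SQL is the sketch below, which uses Python's built-in sqlite3 module. The schema (a `rules` table with one row per condition and a `facts` table), the function name, and the sample values are all hypothetical; the abstract does not describe the platform's actual tables or queries.

```python
import sqlite3


def infer(rules, facts):
    """Return conclusions of rules whose conditions are all matched by facts.

    `rules` rows are (rule_id, attribute, value, conclusion), one row per
    condition; `facts` rows are (attribute, value). Each rule is assumed
    to have a single conclusion. This schema is an illustrative assumption.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rules ("
        "rule_id INTEGER, attribute TEXT, value TEXT, conclusion TEXT)"
    )
    conn.execute("CREATE TABLE facts (attribute TEXT, value TEXT)")
    conn.executemany("INSERT INTO rules VALUES (?, ?, ?, ?)", rules)
    conn.executemany("INSERT INTO facts VALUES (?, ?)", facts)

    # A rule fires when no condition row is left without a matching fact:
    # COUNT(*) counts every joined row, COUNT(f.attribute) only matched ones.
    rows = conn.execute(
        """
        SELECT r.rule_id, r.conclusion
        FROM rules AS r
        LEFT JOIN facts AS f
          ON f.attribute = r.attribute AND f.value = r.value
        GROUP BY r.rule_id, r.conclusion
        HAVING COUNT(*) = COUNT(f.attribute)
        ORDER BY r.rule_id
        """
    ).fetchall()
    conn.close()
    return [conclusion for _, conclusion in rows]


if __name__ == "__main__":
    # Hypothetical rules and facts, for illustration only.
    sample_rules = [
        (1, "age_band", "60+", "group_A"),
        (1, "marker_x", "high", "group_A"),
        (2, "age_band", "60+", "group_B"),
        (2, "marker_y", "high", "group_B"),
    ]
    sample_facts = [("age_band", "60+"), ("marker_x", "high")]
    print(infer(sample_rules, sample_facts))
```

Storing one condition per row keeps the rule base editable with ordinary SQL, which fits the maintenance step described above, at the cost of needing a grouped query to check that every condition of a rule holds.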

Big Data Analysis of the Women Who Score Goal Sports Entertainment Program: Focusing on Text Mining and Semantic Network Analysis.

  • Hyun-Myung, Kim;Kyung-Won, Byun
    • International Journal of Internet, Broadcasting and Communication / v.15 no.1 / pp.222-230 / 2023
  • The purpose of this study is to provide basic data on sports entertainment programs by collecting unstructured data from Naver and Google about the SBS entertainment program 'Women Who Score Goal', which began regular broadcast in June 2021, and analyzing public perceptions through data mining, a semantic matrix, and CONCOR analysis. Data were collected using Textom, yielding 27,911 cases accumulated over 16 months from June 16, 2021 to October 15, 2022. From the collected data, 80 key keywords related to 'Kick a Goal' were derived through simple frequency and TF-IDF analysis. Semantic network analysis was then conducted to analyze the relationships among these top 80 keywords. Centrality was derived with the UCINET 6.0 program, and NetDraw of UCINET 6.0 was used to understand the characteristics of the network and to visualize the connections between keywords clearly. CONCOR analysis was conducted to derive clusters of words with similar characteristics based on the semantic network. The analysis identified a 'program' cluster related to the broadcast content of 'Kick a Goal' and a 'Soccer' cluster related to the sport featured in 'Kick a Goal'. In addition to scenes of the cast's games, it identified an 'Everyday Life' cluster about training and daily life, and a 'Broadcast Manipulation' cluster about the manipulation of game content that disappointed viewers.
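
The TF-IDF weighting used to rank keywords can be sketched in a few lines. The version below uses relative term frequency and an unsmoothed natural-log inverse document frequency, which is one common variant; Textom's exact formula is not stated in the abstract, so this choice, the function name, and the sample tokens are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF scores for each term in each document.

    `docs` is assumed to be a list of non-empty token lists.
    tf = term count / document length; idf = ln(N / document frequency).
    Returns one {term: score} dict per document.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append(
            {
                term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return scores


if __name__ == "__main__":
    # Hypothetical tokenized posts, for illustration only.
    sample = [
        ["goal", "soccer", "goal"],
        ["soccer", "program"],
        ["soccer", "training"],
    ]
    for doc_scores in tf_idf(sample):
        print({term: round(score, 4) for term, score in doc_scores.items()})
```

With this variant, a term that appears in every document gets a score of zero, which is why TF-IDF is often reported alongside simple frequency rather than in place of it.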

Comparison and Analysis of Domestic and Foreign Sports Brands Using Text Mining and Opinion Mining Analysis (텍스트 마이닝과 오피니언 마이닝 분석을 활용한 국내외 스포츠용품 브랜드 비교·분석 연구)

  • Kim, Jae-Hwan;Lee, Jae-Moon
    • The Journal of the Korea Contents Association / v.18 no.6 / pp.217-234 / 2018
  • In this study, big data analysis was conducted on domestic and international sports goods brands. Text mining, TF-IDF, opinion mining, and interestity graph analyses were conducted using the social matrix program Textom and the fashion data analysis platform MISP. To examine recent perceptions of sports brands, the study period was limited to one year, from January 1, 2017 to December 31, 2017. As a result of the analysis, first, we confirmed the products that represent each brand. Second, we confirmed the marketing that represents each brand. Third, we identified the common words extracted across brands. Fourth, we confirmed the positive and negative sentiment toward each brand.