• Title/Summary/Keyword: Box-office Prediction

Product Community Analysis Using Opinion Mining and Network Analysis: Movie Performance Prediction Case (오피니언 마이닝과 네트워크 분석을 활용한 상품 커뮤니티 분석: 영화 흥행성과 예측 사례)

  • Jin, Yu;Kim, Jungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.49-65
    • /
    • 2014
  • Word of Mouth (WOM) is a behavior used by consumers to transfer or communicate their product or service experience to other consumers. Due to the popularity of social media such as Facebook, Twitter, blogs, and online communities, electronic WOM (e-WOM) has become important to the success of products or services. As a result, most enterprises pay close attention to e-WOM for their products or services. This is especially important for movies, as these are experiential products. This paper aims to identify the network factors of an online movie community that impact box office revenue using social network analysis. In addition to traditional WOM factors (volume and valence of WOM), network centrality measures of the online community are included as influential factors in box office revenue. Based on previous research results, we develop five hypotheses on the relationships between potential influential factors (WOM volume, WOM valence, degree centrality, betweenness centrality, closeness centrality) and box office revenue. The first hypothesis is that the accumulated volume of WOM in online product communities is positively related to the total revenue of movies. The second hypothesis is that the accumulated valence of WOM in online product communities is positively related to the total revenue of movies. The third hypothesis is that the average of degree centralities of reviewers in online product communities is positively related to the total revenue of movies. The fourth hypothesis is that the average of betweenness centralities of reviewers in online product communities is positively related to the total revenue of movies. The fifth hypothesis is that the average of closeness centralities of reviewers in online product communities is positively related to the total revenue of movies.
To verify our research model, we collect movie review data from the Internet Movie Database (IMDb), which is a representative online movie community, and movie revenue data from the Box-Office-Mojo website. The movies in this analysis are the weekly top-10 movies from September 1, 2012, to September 1, 2013. We collect movie metadata such as screening periods and user ratings, and community data in IMDb including reviewer identification, review content, review times, responder identification, reply content, reply times, and reply relationships. For the same period, the revenue data from Box-Office-Mojo is collected on a weekly basis. Movie community networks are constructed based on reply relationships between reviewers. Using a social network analysis tool, NodeXL, we calculate the averages of three centralities (degree, betweenness, and closeness) for each movie. Correlation analysis of the focal variables and the dependent variable (final revenue) shows that the three centrality measures are highly correlated, prompting us to perform multiple regressions separately with each centrality measure. Consistent with previous research results, our regression analysis results show that the volume and valence of WOM are positively related to the final box office revenue of movies. Moreover, the averages of betweenness centralities from initial community networks impact the final movie revenues. However, neither the averages of degree centralities nor those of closeness centralities influence final movie performance. Based on the regression results, three hypotheses, 1, 2, and 4, are accepted, and two hypotheses, 3 and 5, are rejected. This study tries to link the network structure of e-WOM on online product communities with the product's performance. Based on the analysis of a real online movie community, the results show that online community network structures can work as a predictor of movie performance.
The results show that the betweenness centralities of the reviewer community are critical for the prediction of movie performance. However, degree centralities and closeness centralities do not influence movie performance. As future research topics, similar analyses are required for other product categories such as electronic goods and online content to generalize the study results.
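
The three centrality averages described in this abstract can be illustrated with a small sketch. This is not the authors' NodeXL workflow: it assumes an undirected network built from hypothetical (responder, reviewer) reply pairs, uses the textbook definitions of degree, closeness, and betweenness centrality, and NodeXL's own normalization may differ. All function names are illustrative.

```python
from collections import deque

def build_graph(replies):
    """Undirected adjacency sets from (responder, reviewer) reply pairs."""
    graph = {}
    for a, b in replies:
        if a == b:
            continue  # ignore self-replies (an assumption of this sketch)
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def degree_centrality(graph):
    """Number of neighbours divided by (n - 1)."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) if n > 1 else 0.0 for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    result = {}
    for s in graph:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

def betweenness_centrality(graph):
    """Unnormalized betweenness via Brandes' algorithm (undirected graph)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack, dist = [], {s: 0}
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # each unordered pair is counted from both endpoints, so halve the totals
    return {v: b / 2 for v, b in score.items()}

def average(centrality):
    """Per-movie average of a centrality measure over all reviewers."""
    return sum(centrality.values()) / len(centrality) if centrality else 0.0

# Hypothetical reply pairs for one movie's community
replies = [("u1", "u2"), ("u2", "u3")]
g = build_graph(replies)
movie_features = {
    "avg_degree": average(degree_centrality(g)),
    "avg_closeness": average(closeness_centrality(g)),
    "avg_betweenness": average(betweenness_centrality(g)),
}
```

Each movie would contribute one such feature row; because the abstract reports that the three measures are highly correlated, each average would enter its own regression rather than a joint one.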

Thermal Crack Control of Massive Foundation Mat of Office-tel Using Thermal Analysis (오피스텔 대형 기초매트의 온도해석을 통한 온도균열제어)

  • 김태홍;하재담;김동석;이종열
    • Proceedings of the Korea Concrete Institute Conference
    • /
    • 2000.10b
    • /
    • pp.1181-1186
    • /
    • 2000
  • Cracking of concrete induced by the heat of hydration is a serious problem, particularly in concrete structures such as piers, thick walls, box-type walls, mat slabs of nuclear reactor buildings, dams, and foundations of high-rise buildings. The temperature rise, combined with the restraint conditions of the foundation, can produce thermal stress that may induce cracks. Various techniques for controlling thermal stress in massive concrete have therefore been widely used. One of them is prediction of the thermal stress; others include low-heat cement, which mitigates the temperature rise, design changes that consider steel bar reinforcement, and operation control. This study first introduces a thermal crack control technique that employs low-heat cement concrete and a thermal stress analysis that considers the season. It then shows the application of crack control techniques such as block placement.

Box Office Hit Prediction Using Data mining and Text mining (데이터마이닝과 텍스트마이닝을 활용한 영화 흥행 예측)

  • Jo, Hyo-jung
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2021.05a
    • /
    • pp.316-318
    • /
    • 2021
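  • Whether a film becomes a hit has a major effect on its revenue. As the film industry has grown, the factors behind box-office success have become something that many production companies and investors must consider, and many models for predicting box-office success have therefore been studied. The purpose of this study is to build a box-office prediction model that uses online word-of-mouth variables together with intrinsic attributes that prior research has found to significantly affect box-office success, such as the number of screens, the director's name, and the production company's name. Instead of using variables that represent the volume of online word of mouth, such as the number of articles or blog posts, we apply text mining to audience reviews from the first week after release, assign a score according to the proportion of positive reviews among all reviews, and use that score as an independent variable. We then predict box-office success by feeding these independent variables into models built with data mining techniques. Finally, we ran a decision tree and a logistic regression, identified the independent variables that affect box-office success, and evaluated model performance. In the logistic regression, audience count and rating were selected as variables with an especially significant effect on box-office success, and the review score was also selected as a significant variable. The resulting model showed a high accuracy of about 90%. In the decision tree, audience count was selected as the most important variable.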
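
The review-based variable in this abstract, a score derived from the share of positive first-week reviews, can be sketched as follows. The abstract does not give the scoring scale, so the 5-point binning below is purely an assumption, as are the function names; the per-review labels are assumed to come from some upstream sentiment classifier.

```python
def positive_review_ratio(review_labels):
    """Share of reviews labelled positive (1) among all reviews (labels are 0 or 1)."""
    if not review_labels:
        return 0.0
    return sum(review_labels) / len(review_labels)

def ratio_to_score(ratio, n_bins=5):
    """Map a ratio in [0, 1] to an integer score 1..n_bins (hypothetical scale)."""
    return min(int(ratio * n_bins) + 1, n_bins)

# Hypothetical first-week review labels for one film: 1 = positive, 0 = not positive
labels = [1, 1, 1, 0]
ratio = positive_review_ratio(labels)   # 0.75
review_score = ratio_to_score(ratio)    # 4 on the assumed 5-point scale
```

The resulting `review_score` would then sit alongside screen count, director, and production company as one input column to the decision tree or logistic regression.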

Study on prediction for a film success using text mining (텍스트 마이닝을 활용한 영화흥행 예측 연구)

  • Lee, Sanghun;Cho, Jangsik;Kang, Changwan;Choi, Seungbae
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.6
    • /
    • pp.1259-1269
    • /
    • 2015
  • Big data has recently become a keyword in academia. Its usefulness is also recognized by government, local public bodies, and enterprises, all of which are endeavoring to extract useful information from big data. This research mainly analyzes the box-office success or failure of films using text mining. The data come from a portal site 'D' and from film review data, average ratings, and the number of screens obtained from the Korean Film Commission. The purpose of this paper is to propose a model that uses these data to predict whether a film will succeed. In the analysis, the proposed prediction model achieved a correct classification rate of 95.74%.

Data analysis by Integrating statistics and visualization: Visual verification for the prediction model (통계와 시각화를 결합한 데이터 분석: 예측모형 대한 시각화 검증)

  • Mun, Seong Min;Lee, Kyung Won
    • Design Convergence Study
    • /
    • v.15 no.6
    • /
    • pp.195-214
    • /
    • 2016
  • Predictive analysis is based on probabilistic learning algorithms known as pattern recognition or machine learning. Consequently, users who want to extract more information from the data need substantial statistical knowledge, and it remains difficult to identify the patterns and characteristics of the data. This study combined statistical and visual data analyses to compensate for these weaknesses of predictive analysis, and it yielded implications not reported in previous studies. First, we could identify data patterns by adjusting the data selection according to the splitting criteria of the decision tree method. Second, we could identify what type of data was included in the final prediction model. In the statistical analysis, we found relationships among multiple variables and derived a prediction model for high box office performance. In the visual analysis, we proposed a visual analysis method with various interactive functions. Finally, through this study we verified the final prediction model and suggested an analysis method that extracts a variety of information from the data.

Predicting Movie Success based on Machine Learning Using Twitter (트위터를 이용한 기계학습 기반의 영화흥행 예측)

  • Yim, Junyeob;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.3 no.7
    • /
    • pp.263-270
    • /
    • 2014
  • This paper proposes a method for predicting the box-office success of a film. With the recent growth of the film industry, a variety of studies on predicting market demand are being conducted. A film is a cultural good with a relatively short product life cycle. Therefore, to generate stable profits, both the marketing costs before release and the number of screens after release need to be planned. Such planning must be preceded by estimates of the demand for the product and of the scale of economic profit. Existing studies primarily use market competition factors or the properties of the film as predictor variables. However, they give relatively little weight to the potential audience who would purchase the goods. Therefore, to take people's perception of a movie into account, this paper uses Twitter as one of the survey samples. The existing variables and the information extracted from Twitter are defined as off-line and on-line elements, respectively, and the two are combined and applied in machine learning. The proposed predictive technique is validated experimentally, and it predicted a film's chance of success with about 95% accuracy.

An Exploratory Study on Influencing Factors of Film Equity Crowdfunding Success: Based on Chinese Movie Crowdfunding (영화 크라우드펀딩 성공에 영향을 미치는 요인에 관한 탐색적 연구: 중국의 영화 플랫폼 크라우드펀딩을 중심으로)

  • Bao, Tantan;Kim, Hun;Chang, Byeng-Hee
    • The Journal of the Korea Contents Association
    • /
    • v.21 no.2
    • /
    • pp.1-14
    • /
    • 2021
  • Recently, crowdfunding platforms have received attention as one of the content investment platforms for the public. This research explores the factors that influence the success of movie equity crowdfunding projects. We use 'number of texts', 'number of images', 'star influence power', 'IP-based movie project', 'movie production stage', 'box office prediction', 'investment capital ratio', 'amount of surplus available investment', 'profit calculation method' and 'minimum investment amount' as independent variables, and we examine how these factors affect the achievement rate of movie crowdfunding. Multiple regression analysis shows that 'movie production stage', 'investment capital ratio', 'amount of surplus available investment' and 'profit calculation method' have a significant effect on the crowdfunding achievement rate. The results of this research can be used for reference when planning film crowdfunding projects.

A study on the use of a Business Intelligence system : the role of explanations (비즈니스 인텔리전스 시스템의 활용 방안에 관한 연구: 설명 기능을 중심으로)

  • Kwon, YoungOk
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.155-169
    • /
    • 2014
  • With the rapid advances in technologies, organizations are more likely to depend on information systems in their decision-making processes. Business Intelligence (BI) systems, in particular, have become a mainstay in dealing with complex problems in an organization, partly because a variety of advanced computational methods from statistics, machine learning, and artificial intelligence can be applied to solve business problems such as demand forecasting. In addition to the ability to analyze past and present trends, these predictive analytics capabilities provide huge value to an organization's ability to respond to change in markets, business risks, and customer trends. While the performance effects of BI system use in organizational settings have been studied, the use of predictive analytics technologies embedded in BI systems for forecasting tasks has received little discussion. Thus, this study aims to find important factors that can help organizations take advantage of the advanced technologies of a BI system. More generally, a BI system can be viewed as an advisor, defined as the one that formulates judgments or recommends alternatives and communicates these to the person in the role of the judge, and the information generated by the BI system as advice that a decision maker (judge) can follow. Thus, we refer to the findings from the advice-giving and advice-taking literature, focusing on the role of explanations of the system in users' advice taking. It has been shown that advice discounting could occur when an advisor's reasoning or evidence justifying the advisor's decision is not available. However, the majority of current BI systems merely provide a number, which may influence decision makers in accepting the advice and inferring the quality of advice.
In this study, we explore the following key factors that can influence users' advice taking within the setting of a BI system: explanations of how the box-office grosses are predicted; types of advisor, i.e., system (data mining technique) or human-based business advice mechanisms such as prediction markets (aggregated human advice) and human advisors (individual human expert advice); users' evaluations of the provided advice; and individual differences among decision-makers. Each subject performs the following four tasks by going through a series of display screens on the computer. First, given information about a movie such as its director and genre, the subjects are asked to predict the opening weekend box office of the movie. Second, in light of the information generated by an advisor, the subjects are asked to adjust their original predictions, if they desire to do so. Third, they are asked to evaluate the value of the given information (e.g., perceived usefulness, trust, satisfaction). Lastly, a short survey is conducted to identify individual differences that may affect advice-taking. The results from the experiment show that subjects are more likely to follow system-generated advice than human advice when the advice is provided with an explanation. When the subjects, as system users, think the information provided by the system is useful, they are also more likely to take the advice. In addition, individual differences affect advice-taking. Subjects with more expertise on advisors, or who tend to agree with others, adjust their predictions to follow the advice. On the other hand, subjects with more knowledge of movies are less affected by the advice, and their final decisions are close to their original predictions. The advances in predictive analytics of a BI system demonstrate a great potential to support increasingly complex business decisions.
This study shows how the designs of a BI system can play a role in influencing users' acceptance of the system-generated advice, and the findings provide valuable insights on how to leverage the advanced predictive analytics of the BI system in an organization's forecasting practices.

An Intelligent Intrusion Detection Model Based on Support Vector Machines and the Classification Threshold Optimization for Considering the Asymmetric Error Cost (비대칭 오류비용을 고려한 분류기준값 최적화와 SVM에 기반한 지능형 침입탐지모형)

  • Lee, Hyeon-Uk;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.157-173
    • /
    • 2011
  • With the recent explosion in Internet use, malicious attacks on and hacking of networked systems occur frequently. Such intrusions can cause fatal damage to government agencies, public offices, and companies that operate various systems. For these reasons, there is growing interest in and demand for intrusion detection systems (IDS), the security systems that detect, identify, and respond appropriately to unauthorized or abnormal activities. The intrusion detection models applied in conventional IDS are generally designed by modeling experts' implicit knowledge of network intrusions or of hackers' abnormal behaviors. These kinds of intrusion detection models perform well in normal situations. However, they perform poorly when they meet a new or unknown pattern of network attack. For this reason, several recent studies try to adopt various artificial intelligence techniques, which can proactively respond to unknown threats. In particular, artificial neural networks (ANNs) have been widely applied in prior studies because of their superior prediction accuracy. However, ANNs have some intrinsic limitations such as the risk of overfitting, the requirement of a large sample size, and the lack of insight into the prediction process (i.e., black box theory). As a result, the most recent studies on IDS have started to adopt the support vector machine (SVM), a classification technique that is more stable and powerful than ANNs. SVM is known for relatively high predictive power and generalization capability. Against this background, this study proposes a novel intelligent intrusion detection model that uses SVM as the classification model in order to improve the predictive ability of IDS. In addition, our model is designed to consider the asymmetric error cost by optimizing the classification threshold. Generally, there are two common forms of errors in intrusion detection.
The first error type is the False-Positive Error (FPE). An FPE, which wrongly flags normal activity as an intrusion, may result in unnecessary corrective action. The second error type is the False-Negative Error (FNE), which mainly misjudges malware as a normal program. Compared to FPE, FNE is more fatal. Thus, when considering the total cost of misclassification in IDS, it is more reasonable to assign heavier weights to FNE than to FPE. Therefore, we designed our proposed intrusion detection model to optimize the classification threshold in order to minimize the total misclassification cost. In this case, conventional SVM cannot be applied because it is designed to generate discrete output (i.e., a class). To resolve this problem, we used the revised SVM technique proposed by Platt (2000), which is able to generate a probability estimate. To validate the practical applicability of our model, we applied it to a real-world dataset for network intrusion detection. The experimental dataset was collected from the IDS sensor of an official institution in Korea from January to June 2010. We collected 15,000 log records in total and selected 1,000 samples from them by random sampling. In addition, the SVM model was compared with logistic regression (LOGIT), decision trees (DT), and ANN to confirm the superiority of the proposed model. LOGIT and DT were run using PASW Statistics v18.0, and ANN was run using Neuroshell 4.0. For SVM, LIBSVM v2.90, a freeware package for training SVM classifiers, was used. Empirical results showed that our proposed SVM-based model outperformed all the other comparative models in detecting network intrusions in terms of accuracy. They also showed that our model reduced the total misclassification cost compared to the ANN-based intrusion detection model.
As a result, it is expected that the intrusion detection model proposed in this paper would not only enhance the performance of IDS, but also lead to better management of FNE.
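
The threshold optimization described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes intrusion probabilities are already available from some calibrated classifier (for example an SVM with Platt-style probability outputs), and the cost weights of 1 for a false positive and 5 for a false negative are illustrative values, not figures from the paper. Function names are hypothetical.

```python
def total_cost(probs, labels, threshold, cost_fp=1.0, cost_fn=5.0):
    """Total misclassification cost when flagging an intrusion if prob >= threshold.

    labels: 1 = intrusion, 0 = normal. The cost weights are assumed values.
    """
    cost = 0.0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 0:
            cost += cost_fp   # false positive: normal traffic flagged
        elif predicted == 0 and y == 1:
            cost += cost_fn   # false negative: intrusion missed
    return cost

def best_threshold(probs, labels, cost_fp=1.0, cost_fn=5.0):
    """Grid-search thresholds 0.01..0.99 and return the first cheapest one."""
    candidates = [i / 100 for i in range(1, 100)]
    return min(candidates,
               key=lambda t: total_cost(probs, labels, t, cost_fp, cost_fn))

# Hypothetical validation data: predicted intrusion probabilities and true labels
probs = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]

default_cost = total_cost(probs, labels, 0.5)    # misses the 0.35 intrusion
tuned = best_threshold(probs, labels)
tuned_cost = total_cost(probs, labels, tuned)    # trades it for one false alarm
```

With a heavier weight on false negatives, the search moves the threshold below the conventional 0.5, accepting an extra false alarm in exchange for catching the low-probability intrusion.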