• Title/Summary/Keyword: Media

Search Result 30,011, Processing Time 0.063 seconds

A Study on the Various Attributes of E-Sport Influencing Flow and Identification (e-스포츠의 다양한 속성이 유동(flow)과 동일시에 미치는 영향에 관한 연구)

  • Suh, Mun-Shik;Ahn, Jin-Woo;Kim, Eun-Young;Um, Seong-Won
    • Journal of Global Scholars of Marketing Science
    • /
    • v.18 no.1
    • /
    • pp.59-80
    • /
    • 2008
  • Recently, e-sports are growing with potentiality as a new industry with conspicuous profit model. But studies that dealing with e-sports are not enough. Hence, proposes of this paper are both to establish basic model that is for the design of e-sport marketing strategy and to contribute toward future studies which are related to e-sports. Recently, the researches to explain sports-sponsorship through the identification theory have been discovered. Many researches say that somewhat proper identification is a requirement for most sponsors to improve the their images which is essential to sponsorship activity. Consequently, the research for sponsorship associated with identification in the e-sports, not in the physical sports is the core sector of this study. We extracted the variables from online's major characteristics and existing sport sponsorship researches. First, because e-sports mean the tournaments or leagues in the use of online game, the main event of the game is likely to call it online game. Online media's attributes are distinguished from those of offline. Especially, interactivity, anonymity, and expandibility as a e-sport game attributes are able to be mentioned. So, these inherent online attributes are examined on the relationship with flow. Second, in physical sports games, Fisher(1998) revealed that team similarity and team attractivity were positively related to team identification. Wann(1996) said that the result of former game influenced the evaluation of the next game, then in turn has an effect on the identification of team supporters. Considering these results in the e-sports side, e-sports gamer' attractivity, similarity, and match result seem to be important precedent variables of the identification with a gamer. So, these e-sport gamer attributes are examined on the relationship with both flow and identification with a gamer. Csikszentmihalyi(1988) defined the term flow as feeling status for him to be making current positive experience optimally. Hoffman and Novak(1996) also said that if a user experienced the flow he would visit a website without any reward. Therefore flow might be positively associated with user's identification with a gamer. And, Swanson(2003) disclosed that team identification influenced the positive results of sponsorship, which included attitude toward sponsors, sponsor patronage, and satisfaction with sponsors. That is, identification with a gamer expect to be connected with corporation identification significantly. According to the above, we can design the following research model. All variables used in this study(interactivity, anonymity, expandibility, attractivity, similarity, match result, flow, identification with a gamer, and identification with a sponsor) definitely were defined operationally underlying precedent researches. Sample collection was carried out to the person who has an experience to have enjoyed e-sports during June 2006. Much portion of samples is men because much more men than women enjoy e-sports in general. Two-step approach was used to test the hypotheses. First, confirmatory factor analysis was committed to guarantee the validity and reliability of variables. The results showed that all variables had not only intensive and discriminant validity, but also reliability. Then, research model was examined with fully structural equation using LISREL 8.3 version. The fitness of the suggested model mostly was at the acceptable level. Shortly speaking about the results, first of all, in e-sports game attributes, only interactivity which is called a basic feature in online situation affected flow positively. Secondly, in e-sports gamer's attributes, similarity with a gamer and match result influenced flow positively, but there was no significant effect in the relationship between the attractivity of a gamer and flow. And as expected, similarity had an effect on identification with a gamer significantly. But unexpectedly attractivity and match result did not influence identification with a gamer significantly. Just the same as the fact verified in the many precedent researches, flow greatly influenced identification with a gamer, and identification with a gamer continually had an influence on the identification with a sponsor significantly. There are some implications in these results. If the sponsor of e-sports supports the pro-game player who absolutely should have the superior ability to others and is similar to the user enjoying e-sports, many amateur gamers will feel much of the flow and identification with a pro-gamer, and then after all, feel the identification with a sponsor. Such identification with a sponsor leads people enjoying e-sports to have purchasing intention for products produced by the sponsor and to make a positive word-of-mouth for those products or the sponsor. For the future studies, we recommend a few ideas. Based on the results of this study, it is necessary to find new variables relating to the e-sports, which is not mentioned in this study. For this work to be possible, qualitative research seems to be needed to consider the inherent e-sport attributes. Finally, to generalize the results related to e-sports, a wide range of generations not a specific generation should be researched.

  • PDF

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.109-122
    • /
    • 2014
  • People are nowadays creating a tremendous amount of data on Social Network Service (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive amounts of data generation, thereby greatly influencing society. This is an unmatched phenomenon in history, and now we live in the Age of Big Data. SNS Data is defined as a condition of Big Data where the amount of data (volume), data input and output speeds (velocity), and the variety of data types (variety) are satisfied. If someone intends to discover the trend of an issue in SNS Big Data, this information can be used as a new important source for the creation of new values because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and established to meet the needs of analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) Provide the topic keyword set that corresponds to daily ranking; (2) Visualize the daily time series graph of a topic for the duration of a month; (3) Provide the importance of a topic through a treemap based on the score system and frequency; (4) Visualize the daily time-series graph of keywords by searching the keyword; The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including the removal of stop words, and noun extraction for processing various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to process rapidly a large amount of real-time data, such as the Hadoop distributed system or NoSQL, which is an alternative to relational database. We built TITS based on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from single node computing to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. In addition, MongoDB is an open source platform, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational database, there are no schema or tables with MongoDB, and its most important goal is that of data accessibility and data processing performance. In the Age of Big Data, the visualization of Big Data is more attractive to the Big Data community because it helps analysts to examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for the purpose of creating Data Driven Documents that bind document object model (DOM) and any data; the interaction between data is easy and useful for managing real-time data stream with smooth animation. In addition, TITS uses a bootstrap made of pre-configured plug-in style sheets and JavaScript libraries to build a web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system, and make the system available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.

Scalable Collaborative Filtering Technique based on Adaptive Clustering (적응형 군집화 기반 확장 용이한 협업 필터링 기법)

  • Lee, O-Joun;Hong, Min-Sung;Lee, Won-Jin;Lee, Jae-Dong
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.73-92
    • /
    • 2014
  • An Adaptive Clustering-based Collaborative Filtering Technique was proposed to solve the fundamental problems of collaborative filtering, such as cold-start problems, scalability problems and data sparsity problems. Previous collaborative filtering techniques were carried out according to the recommendations based on the predicted preference of the user to a particular item using a similar item subset and a similar user subset composed based on the preference of users to items. For this reason, if the density of the user preference matrix is low, the reliability of the recommendation system will decrease rapidly. Therefore, the difficulty of creating a similar item subset and similar user subset will be increased. In addition, as the scale of service increases, the time needed to create a similar item subset and similar user subset increases geometrically, and the response time of the recommendation system is then increased. To solve these problems, this paper suggests a collaborative filtering technique that adapts a condition actively to the model and adopts the concepts of a context-based filtering technique. This technique consists of four major methodologies. First, items are made, the users are clustered according their feature vectors, and an inter-cluster preference between each item cluster and user cluster is then assumed. According to this method, the run-time for creating a similar item subset or user subset can be economized, the reliability of a recommendation system can be made higher than that using only the user preference information for creating a similar item subset or similar user subset, and the cold start problem can be partially solved. Second, recommendations are made using the prior composed item and user clusters and inter-cluster preference between each item cluster and user cluster. In this phase, a list of items is made for users by examining the item clusters in the order of the size of the inter-cluster preference of the user cluster, in which the user belongs, and selecting and ranking the items according to the predicted or recorded user preference information. Using this method, the creation of a recommendation model phase bears the highest load of the recommendation system, and it minimizes the load of the recommendation system in run-time. Therefore, the scalability problem and large scale recommendation system can be performed with collaborative filtering, which is highly reliable. Third, the missing user preference information is predicted using the item and user clusters. Using this method, the problem caused by the low density of the user preference matrix can be mitigated. Existing studies on this used an item-based prediction or user-based prediction. In this paper, Hao Ji's idea, which uses both an item-based prediction and user-based prediction, was improved. The reliability of the recommendation service can be improved by combining the predictive values of both techniques by applying the condition of the recommendation model. By predicting the user preference based on the item or user clusters, the time required to predict the user preference can be reduced, and missing user preference in run-time can be predicted. Fourth, the item and user feature vector can be made to learn the following input of the user feedback. This phase applied normalized user feedback to the item and user feature vector. This method can mitigate the problems caused by the use of the concepts of context-based filtering, such as the item and user feature vector based on the user profile and item properties. The problems with using the item and user feature vector are due to the limitation of quantifying the qualitative features of the items and users. Therefore, the elements of the user and item feature vectors are made to match one to one, and if user feedback to a particular item is obtained, it will be applied to the feature vector using the opposite one. Verification of this method was accomplished by comparing the performance with existing hybrid filtering techniques. Two methods were used for verification: MAE(Mean Absolute Error) and response time. Using MAE, this technique was confirmed to improve the reliability of the recommendation system. Using the response time, this technique was found to be suitable for a large scaled recommendation system. This paper suggested an Adaptive Clustering-based Collaborative Filtering Technique with high reliability and low time complexity, but it had some limitations. This technique focused on reducing the time complexity. Hence, an improvement in reliability was not expected. The next topic will be to improve this technique by rule-based filtering.

A Study on the Function of Oral Medicine as the Secondary Clinic Based on Analysis on Admissive Channel and Case Features (내원경위 분석과 환자 특성 평가에 따른 2차 진료기관으로서 구강내과 역할에 대한 연구)

  • Lee, You-Mee;Lee, Jung-Hyun;Lim, Hyun-Dae
    • Journal of Oral Medicine and Pain
    • /
    • v.31 no.3
    • /
    • pp.199-210
    • /
    • 2006
  • The epidemiological researches on the inpatients hospitalized at the oral medicine ward have been continuously carried out since 1970, and most researches have been performed by centering around the oral medicine wards of college hospitals. Numerous specialists have been produced after the establishment of oral medicine, and they have been active in various fields. As dental clinics have gotten bigger, the function of oral medicine in the secondary clinics is being brought out. As admissive channel, case features, case composition and otherwise have not been researched for a long time, the related researches should be carried out from now on. Hereupon, this study was carried out by targeting the 100 inpatients hospitalized at the oral medicine ward of Sun Hospital located in Daejeon Korea, through questionnaire. As the result, the following results were derived. 1. The ages of the inpatients in Sun Hospital were $29.21{\pm}11.31$ on the average; 71 females' mean average was $29.63{\pm}11.29$ and 29 males' mean average was $28.17{\pm}11.48$. In regard of school career, the patients who finished high-school course or higher accounted for 78%; the patients' school career seemed to be relatively high. The patients who complained of temporomandibular pain accounted for the highest proportion with 65%. In motivation to visit this hospital, internet surfing was 11%, mass media was 10%, acquaintance's introduction was 38%. The patients, who were hospitalized at another hospital due to the same symptom, accounted for 56%. The dental clinics, which made the patients visit this hospital, accounted for 20%. The patients, who were previously aware that the present symptom should be treated by oral medicine, accounted for 38%. The patients, who were not aware of the fact in advance, were 62%. The respondents of 51% answered that they were aware of the fact one month or below before hospitalization. 2. The patients, who complained of craniocervical ache, accounted for 58%; the patients, whose ache aches affect dailylife, were 22%. Continuous ache was 14% and intermittent ache was 68%, and dull pain was 23%. 3. Life variations were compared with each other by using SRRS (Social Readjustment Rating Scale). In consequence, the variation within 3 years indicated a significant difference in the both groups but the variation within 6 months did not indicate any differences. 4. In regard of the questionnaire on the incidents happened for a week, the ache-group was compared with the group free from the ache. As the result, the number of strain arisen for a week, the decrease of favorite works and sudden fear indicated a significant difference. Pleasant feeling and the decrease of interests in looks did not indicate a significant difference, but came close to the significance. 5. In the questionnaire on impatience, the ache-group indicated higher value but there was not a significant difference. 6. In the questionnaire on the symptoms caused by stress, the two groups indicated significant differences in the item of 'the teethridge itches and feels a tooth rising' and 'the occiput or the nape is stiff.' In the item 'the inside of the cheek or the teethridge are widely peeled off, accompanied with ache and hemorrhage', 'the face has acne or pimple' and 'headache frequently attacks', a significant difference was not observed but the two groups came close to the significance.

Construction of Consumer Confidence index based on Sentiment analysis using News articles (뉴스기사를 이용한 소비자의 경기심리지수 생성)

  • Song, Minchae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.1-27
    • /
    • 2017
  • It is known that the economic sentiment index and macroeconomic indicators are closely related because economic agent's judgment and forecast of the business conditions affect economic fluctuations. For this reason, consumer sentiment or confidence provides steady fodder for business and is treated as an important piece of economic information. In Korea, private consumption accounts and consumer sentiment index highly relevant for both, which is a very important economic indicator for evaluating and forecasting the domestic economic situation. However, despite offering relevant insights into private consumption and GDP, the traditional approach to measuring the consumer confidence based on the survey has several limits. One possible weakness is that it takes considerable time to research, collect, and aggregate the data. If certain urgent issues arise, timely information will not be announced until the end of each month. In addition, the survey only contains information derived from questionnaire items, which means it can be difficult to catch up to the direct effects of newly arising issues. The survey also faces potential declines in response rates and erroneous responses. Therefore, it is necessary to find a way to complement it. For this purpose, we construct and assess an index designed to measure consumer economic sentiment index using sentiment analysis. Unlike the survey-based measures, our index relies on textual analysis to extract sentiment from economic and financial news articles. In particular, text data such as news articles and SNS are timely and cover a wide range of issues; because such sources can quickly capture the economic impact of specific economic issues, they have great potential as economic indicators. There exist two main approaches to the automatic extraction of sentiment from a text, we apply the lexicon-based approach, using sentiment lexicon dictionaries of words annotated with the semantic orientations. In creating the sentiment lexicon dictionaries, we enter the semantic orientation of individual words manually, though we do not attempt a full linguistic analysis (one that involves analysis of word senses or argument structure); this is the limitation of our research and further work in that direction remains possible. In this study, we generate a time series index of economic sentiment in the news. The construction of the index consists of three broad steps: (1) Collecting a large corpus of economic news articles on the web, (2) Applying lexicon-based methods for sentiment analysis of each article to score the article in terms of sentiment orientation (positive, negative and neutral), and (3) Constructing an economic sentiment index of consumers by aggregating monthly time series for each sentiment word. In line with existing scholarly assessments of the relationship between the consumer confidence index and macroeconomic indicators, any new index should be assessed for its usefulness. We examine the new index's usefulness by comparing other economic indicators to the CSI. To check the usefulness of the newly index based on sentiment analysis, trend and cross - correlation analysis are carried out to analyze the relations and lagged structure. Finally, we analyze the forecasting power using the one step ahead of out of sample prediction. As a result, the news sentiment index correlates strongly with related contemporaneous key indicators in almost all experiments. We also find that news sentiment shocks predict future economic activity in most cases. In almost all experiments, the news sentiment index strongly correlates with related contemporaneous key indicators. Furthermore, in most cases, news sentiment shocks predict future economic activity; in head-to-head comparisons, the news sentiment measures outperform survey-based sentiment index as CSI. Policy makers want to understand consumer or public opinions about existing or proposed policies. Such opinions enable relevant government decision-makers to respond quickly to monitor various web media, SNS, or news articles. Textual data, such as news articles and social networks (Twitter, Facebook and blogs) are generated at high-speeds and cover a wide range of issues; because such sources can quickly capture the economic impact of specific economic issues, they have great potential as economic indicators. Although research using unstructured data in economic analysis is in its early stages, but the utilization of data is expected to greatly increase once its usefulness is confirmed.

Suggestion of Urban Regeneration Type Recommendation System Based on Local Characteristics Using Text Mining (텍스트 마이닝을 활용한 지역 특성 기반 도시재생 유형 추천 시스템 제안)

  • Kim, Ikjun;Lee, Junho;Kim, Hyomin;Kang, Juyoung
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.3
    • /
    • pp.149-169
    • /
    • 2020
  • "The Urban Renewal New Deal project", one of the government's major national projects, is about developing underdeveloped areas by investing 50 trillion won in 100 locations on the first year and 500 over the next four years. This project is drawing keen attention from the media and local governments. However, the project model which fails to reflect the original characteristics of the area as it divides project area into five categories: "Our Neighborhood Restoration, Housing Maintenance Support Type, General Neighborhood Type, Central Urban Type, and Economic Base Type," According to keywords for successful urban regeneration in Korea, "resident participation," "regional specialization," "ministerial cooperation" and "public-private cooperation", when local governments propose urban regeneration projects to the government, they can see that it is most important to accurately understand the characteristics of the city and push ahead with the projects in a way that suits the characteristics of the city with the help of local residents and private companies. In addition, considering the gentrification problem, which is one of the side effects of urban regeneration projects, it is important to select and implement urban regeneration types suitable for the characteristics of the area. In order to supplement the limitations of the 'Urban Regeneration New Deal Project' methodology, this study aims to propose a system that recommends urban regeneration types suitable for urban regeneration sites by utilizing various machine learning algorithms, referring to the urban regeneration types of the '2025 Seoul Metropolitan Government Urban Regeneration Strategy Plan' promoted based on regional characteristics. There are four types of urban regeneration in Seoul: "Low-use Low-Level Development, Abandonment, Deteriorated Housing, and Specialization of Historical and Cultural Resources" (Shon and Park, 2017). In order to identify regional characteristics, approximately 100,000 text data were collected for 22 regions where the project was carried out for a total of four types of urban regeneration. Using the collected data, we drew key keywords for each region according to the type of urban regeneration and conducted topic modeling to explore whether there were differences between types. As a result, it was confirmed that a number of topics related to real estate and economy appeared in old residential areas, and in the case of declining and underdeveloped areas, topics reflecting the characteristics of areas where industrial activities were active in the past appeared. In the case of the historical and cultural resource area, since it is an area that contains traces of the past, many keywords related to the government appeared. Therefore, it was possible to confirm political topics and cultural topics resulting from various events. Finally, in the case of low-use and under-developed areas, many topics on real estate and accessibility are emerging, so accessibility is good. It mainly had the characteristics of a region where development is planned or is likely to be developed. Furthermore, a model was implemented that proposes urban regeneration types tailored to regional characteristics for regions other than Seoul. Machine learning technology was used to implement the model, and training data and test data were randomly extracted at an 8:2 ratio and used. In order to compare the performance between various models, the input variables are set in two ways: Count Vector and TF-IDF Vector, and as Classifier, there are 5 types of SVM (Support Vector Machine), Decision Tree, Random Forest, Logistic Regression, and Gradient Boosting. By applying it, performance comparison for a total of 10 models was conducted. The model with the highest performance was the Gradient Boosting method using TF-IDF Vector input data, and the accuracy was 97%. Therefore, the recommendation system proposed in this study is expected to recommend urban regeneration types based on the regional characteristics of new business sites in the process of carrying out urban regeneration projects."

Product Community Analysis Using Opinion Mining and Network Analysis: Movie Performance Prediction Case (오피니언 마이닝과 네트워크 분석을 활용한 상품 커뮤니티 분석: 영화 흥행성과 예측 사례)

  • Jin, Yu;Kim, Jungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.49-65
    • /
    • 2014
  • Word of Mouth (WOM) is a behavior used by consumers to transfer or communicate their product or service experience to other consumers. Due to the popularity of social media such as Facebook, Twitter, blogs, and online communities, electronic WOM (e-WOM) has become important to the success of products or services. As a result, most enterprises pay close attention to e-WOM for their products or services. This is especially important for movies, as these are experiential products. This paper aims to identify the network factors of an online movie community that impact box office revenue using social network analysis. In addition to traditional WOM factors (volume and valence of WOM), network centrality measures of the online community are included as influential factors in box office revenue. Based on previous research results, we develop five hypotheses on the relationships between potential influential factors (WOM volume, WOM valence, degree centrality, betweenness centrality, closeness centrality) and box office revenue. The first hypothesis is that the accumulated volume of WOM in online product communities is positively related to the total revenue of movies. The second hypothesis is that the accumulated valence of WOM in online product communities is positively related to the total revenue of movies. The third hypothesis is that the average of degree centralities of reviewers in online product communities is positively related to the total revenue of movies. The fourth hypothesis is that the average of betweenness centralities of reviewers in online product communities is positively related to the total revenue of movies. The fifth hypothesis is that the average of betweenness centralities of reviewers in online product communities is positively related to the total revenue of movies. To verify our research model, we collect movie review data from the Internet Movie Database (IMDb), which is a representative online movie community, and movie revenue data from the Box-Office-Mojo website. The movies in this analysis include weekly top-10 movies from September 1, 2012, to September 1, 2013, with in total. We collect movie metadata such as screening periods and user ratings; and community data in IMDb including reviewer identification, review content, review times, responder identification, reply content, reply times, and reply relationships. For the same period, the revenue data from Box-Office-Mojo is collected on a weekly basis. Movie community networks are constructed based on reply relationships between reviewers. Using a social network analysis tool, NodeXL, we calculate the averages of three centralities including degree, betweenness, and closeness centrality for each movie. Correlation analysis of focal variables and the dependent variable (final revenue) shows that three centrality measures are highly correlated, prompting us to perform multiple regressions separately with each centrality measure. Consistent with previous research results, our regression analysis results show that the volume and valence of WOM are positively related to the final box office revenue of movies. Moreover, the averages of betweenness centralities from initial community networks impact the final movie revenues. However, both of the averages of degree centralities and closeness centralities do not influence final movie performance. Based on the regression results, three hypotheses, 1, 2, and 4, are accepted, and two hypotheses, 3 and 5, are rejected. This study tries to link the network structure of e-WOM on online product communities with the product's performance. Based on the analysis of a real online movie community, the results show that online community network structures can work as a predictor of movie performance. The results show that the betweenness centralities of the reviewer community are critical for the prediction of movie performance. However, degree centralities and closeness centralities do not influence movie performance. As future research topics, similar analyses are required for other product categories such as electronic goods and online content to generalize the study results.

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.141-154
    • /
    • 2019
  • Rapid growth of internet technology and social media is progressing. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor or high-quality content through text data of products, and it has proliferated during text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined data categories as positive and negative. This has been studied in various directions in terms of accuracy from simple rule-based to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active researches in natural language processing and is widely studied in text mining. When real online reviews aren't available for others, it's not only easy to openly collect information, but it also affects your business. In marketing, real-world information from customers is gathered on websites, not surveys. Depending on whether the website's posts are positive or negative, the customer response is reflected in the sales and tries to identify the information. However, many reviews on a website are not always good, and difficult to identify. The earlier studies in this research area used the reviews data of the Amazon.com shopping mal, but the research data used in the recent studies uses the data for stock market trends, blogs, news articles, weather forecasts, IMDB, and facebook etc. However, the lack of accuracy is recognized because sentiment calculations are changed according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity analysis of sentiment analysis into positive and negative categories and increase the prediction accuracy of the polarity analysis using the pretrained IMDB review data set. First, the text classification algorithm related to sentiment analysis adopts the popular machine learning algorithms such as NB (naive bayes), SVM (support vector machines), XGboost, RF (random forests), and Gradient Boost as comparative models. Second, deep learning has demonstrated discriminative features that can extract complex features of data. Representative algorithms are CNN (convolution neural networks), RNN (recurrent neural networks), LSTM (long-short term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but does not consider sequential data attributes. RNN can handle well in order because it takes into account the time information of the data, but there is a long-term dependency on memory. To solve the problem of long-term dependence, LSTM is used. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although there are many parameters for the algorithms, we examined the relationship between numerical value and precision to find the optimal combination. And, we tried to figure out how the models work well for sentiment analysis and how these models work. This study proposes integrated CNN and LSTM algorithms to extract the positive and negative features of text analysis. The reasons for mixing these two algorithms are as follows. CNN can extract features for the classification automatically by applying convolution layer and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM has input, output, and forget gates that can be moved and controlled at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. Furthermore, when LSTM is used in CNN's pooling layer, it has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. In combination with CNN-LSTM, 90.33% accuracy was measured. This is slower than CNN, but faster than LSTM. The presented model was more accurate than other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can improve the weakness of each model, and there is an advantage of improving the learning by layer using the end-to-end structure of LSTM. Based on these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.

Interlaboratory Comparison of Blood Lead Determination in Some Occupational Health Laboratories in Korea (일부 산업보건기관들의 혈중연 분석치 비교)

  • Ahn, Kyu Dong;Lee, Byung Kook
    • Journal of Korean Society of Occupational and Environmental Hygiene
    • /
    • v.5 no.1
    • /
    • pp.8-15
    • /
    • 1995
  • The reliable measurement of metal in biological media in human body is one of critical indicators for the proper evaluation of its toxic effect on human health. Recently in Korea the necessity of quality assurance of measurement in occupational health and occupational hygiene fields brought out regulatory quality control program. Lead is often used as a standard metal for the program in both fields of occupational health and hygiene. During last 20 years lead poisoning was prevalent in Korea and still is one of main heavy metal poisoning and the capability of the measurement of blood lead is one of prerequisites for institute of specialized occupational health in Korea. Furthermore blood lead is most important indicator to evaluate lead burden of human exposure to lead and the reliable and accurate analysis is most needed whenever possible. To evaluate the extent of the interlaboratory differences of blood lead measurement in several well-known institute specialized in occupational health in Korea, authors prepared 68 blood samples from two storage battery industries and all samples were divided into samples with 2 ml. One set of 68 samples were analyzed by authors's laboratory(Soonchunhyang University Institute of Industrial Medicine: SIIM) and 40 samples of other set were analyzed by C University Institute of Industrial Medicine(CIIM) and the rest 28 samples of other set were analyzed by Japanese institute(K Occupational Health Center:KOHC). Authors also prepared test bovine samples which were obtained from Japanese Federation of Occupational Health Organization (JFOHO) for quality control. Authors selected 2 other well-known occupational health laboratories and one laboratory specialized for instrumental analysis. A total of 6 laboratories joined the interlaboratory comparison of blood lead measurement and the results obtained were as follows: 1. There was no significant difference in average blood lead between SIIM and CIIM in different group of blood lead concentration, and the relative standard deviation of two laboratories was less than 3.0%. On the other hand, there was also no significant difference of average blood lead between SIIM and KOHC with relative standard deviation of 6.84% as maximum. 2. Taking less than 15% difference of mean or less than 6 ug/dl difference in below 40 ug/dl in whole blood as a criteria of agreement of measurement between two laboratories, agreement rates were 87.5%(35/40) and 78.6%(22/28) between SIIM and CIIM, SIIM and KOHC respectively. 3. The correlation of blood lead between SIIM and CIIM was 0.975 (p=0.0001) and the regression equation was SIIM = 2.19 + 0.9243 ClIM, whereas the correlation between SUM and KOHC was O.965(p=0.0001) with the equation of SIIM = 1.91 + 0.9794 KOHC. 4. Taking the reference value as a dependent variable and each of 6 laboratories's measurement value as a independent variable, the determination coefficient($R^2$) of simple regression equations of blood lead measurement for bovine test samples were very high($R^2>0.99$), and the regression coefficient(${\beta}$) was between 0.972 and 1.15 which indicated fairly good agreement of measurement results.

  • PDF

Spatial effect on the diffusion of discount stores (대형할인점 확산에 대한 공간적 영향)

  • Joo, Young-Jin;Kim, Mi-Ae
    • Journal of Distribution Research
    • /
    • v.15 no.4
    • /
    • pp.61-85
    • /
    • 2010
  • Introduction: Diffusion is process by which an innovation is communicated through certain channel overtime among the members of a social system(Rogers 1983). Bass(1969) suggested the Bass model describing diffusion process. The Bass model assumes potential adopters of innovation are influenced by mass-media and word-of-mouth from communication with previous adopters. Various expansions of the Bass model have been conducted. Some of them proposed a third factor affecting diffusion. Others proposed multinational diffusion model and it stressed interactive effect on diffusion among several countries. We add a spatial factor in the Bass model as a third communication factor. Because of situation where we can not control the interaction between markets, we need to consider that diffusion within certain market can be influenced by diffusion in contiguous market. The process that certain type of retail extends is a result that particular market can be described by the retail life cycle. Diffusion of retail has pattern following three phases of spatial diffusion: adoption of innovation happens in near the diffusion center first, spreads to the vicinity of the diffusing center and then adoption of innovation is completed in peripheral areas in saturation stage. So we expect spatial effect to be important to describe diffusion of domestic discount store. We define a spatial diffusion model using multinational diffusion model and apply it to the diffusion of discount store. Modeling: In this paper, we define a spatial diffusion model and apply it to the diffusion of discount store. To define a spatial diffusion model, we expand learning model(Kumar and Krishnan 2002) and separate diffusion process in diffusion center(market A) from diffusion process in the vicinity of the diffusing center(market B). The proposed spatial diffusion model is shown in equation (1a) and (1b). Equation (1a) is the diffusion process in diffusion center and equation (1b) is one in the vicinity of the diffusing center. $$\array{{S_{i,t}=(p_i+q_i{\frac{Y_{i,t-1}}{m_i}})(m_i-Y_{i,t-1})\;i{\in}\{1,{\cdots},I\}\;(1a)}\\{S_{j,t}=(p_j+q_j{\frac{Y_{j,t-1}}{m_i}}+{\sum\limits_{i=1}^I}{\gamma}_{ij}{\frac{Y_{i,t-1}}{m_i}})(m_j-Y_{j,t-1})\;i{\in}\{1,{\cdots},I\},\;j{\in}\{I+1,{\cdots},I+J\}\;(1b)}}$$ We rise two research questions. (1) The proposed spatial diffusion model is more effective than the Bass model to describe the diffusion of discount stores. (2) The more similar retail environment of diffusing center with that of the vicinity of the contiguous market is, the larger spatial effect of diffusing center on diffusion of the vicinity of the contiguous market is. To examine above two questions, we adopt the Bass model to estimate diffusion of discount store first. Next spatial diffusion model where spatial factor is added to the Bass model is used to estimate it. Finally by comparing Bass model with spatial diffusion model, we try to find out which model describes diffusion of discount store better. In addition, we investigate the relationship between similarity of retail environment(conceptual distance) and spatial factor impact with correlation analysis. Result and Implication: We suggest spatial diffusion model to describe diffusion of discount stores. To examine the proposed spatial diffusion model, 347 domestic discount stores are used and we divide nation into 5 districts, Seoul-Gyeongin(SG), Busan-Gyeongnam(BG), Daegu-Gyeongbuk(DG), Gwan- gju-Jeonla(GJ), Daejeon-Chungcheong(DC), and the result is shown

    . In a result of the Bass model(I), the estimates of innovation coefficient(p) and imitation coefficient(q) are 0.017 and 0.323 respectively. While the estimate of market potential is 384. A result of the Bass model(II) for each district shows the estimates of innovation coefficient(p) in SG is 0.019 and the lowest among 5 areas. This is because SG is the diffusion center. The estimates of imitation coefficient(q) in BG is 0.353 and the highest. The imitation coefficient in the vicinity of the diffusing center such as BG is higher than that in the diffusing center because much information flows through various paths more as diffusion is progressing. A result of the Bass model(II) shows the estimates of innovation coefficient(p) in SG is 0.019 and the lowest among 5 areas. This is because SG is the diffusion center. The estimates of imitation coefficient(q) in BG is 0.353 and the highest. The imitation coefficient in the vicinity of the diffusing center such as BG is higher than that in the diffusing center because much information flows through various paths more as diffusion is progressing. In a result of spatial diffusion model(IV), we can notice the changes between coefficients of the bass model and those of the spatial diffusion model. Except for GJ, the estimates of innovation and imitation coefficients in Model IV are lower than those in Model II. The changes of innovation and imitation coefficients are reflected to spatial coefficient(${\gamma}$). From spatial coefficient(${\gamma}$) we can infer that when the diffusion in the vicinity of the diffusing center occurs, the diffusion is influenced by one in the diffusing center. The difference between the Bass model(II) and the spatial diffusion model(IV) is statistically significant with the ${\chi}^2$-distributed likelihood ratio statistic is 16.598(p=0.0023). Which implies that the spatial diffusion model is more effective than the Bass model to describe diffusion of discount stores. So the research question (1) is supported. In addition, we found that there are statistically significant relationship between similarity of retail environment and spatial effect by using correlation analysis. So the research question (2) is also supported.

  • PDF