Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)
-
- Journal of Intelligence and Information Systems
- /
- v.25 no.4
- /
- pp.141-154
- /
- 2019
Rapid growth of internet technology and social media is progressing. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor or high-quality content through text data of products, and it has proliferated during text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined data categories as positive and negative. This has been studied in various directions in terms of accuracy from simple rule-based to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active researches in natural language processing and is widely studied in text mining. When real online reviews aren't available for others, it's not only easy to openly collect information, but it also affects your business. In marketing, real-world information from customers is gathered on websites, not surveys. Depending on whether the website's posts are positive or negative, the customer response is reflected in the sales and tries to identify the information. However, many reviews on a website are not always good, and difficult to identify. The earlier studies in this research area used the reviews data of the Amazon.com shopping mal, but the research data used in the recent studies uses the data for stock market trends, blogs, news articles, weather forecasts, IMDB, and facebook etc. However, the lack of accuracy is recognized because sentiment calculations are changed according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity analysis of sentiment analysis into positive and negative categories and increase the prediction accuracy of the polarity analysis using the pretrained IMDB review data set. First, the text classification algorithm related to sentiment analysis adopts the popular machine learning algorithms such as NB (naive bayes), SVM (support vector machines), XGboost, RF (random forests), and Gradient Boost as comparative models. Second, deep learning has demonstrated discriminative features that can extract complex features of data. Representative algorithms are CNN (convolution neural networks), RNN (recurrent neural networks), LSTM (long-short term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but does not consider sequential data attributes. RNN can handle well in order because it takes into account the time information of the data, but there is a long-term dependency on memory. To solve the problem of long-term dependence, LSTM is used. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although there are many parameters for the algorithms, we examined the relationship between numerical value and precision to find the optimal combination. And, we tried to figure out how the models work well for sentiment analysis and how these models work. This study proposes integrated CNN and LSTM algorithms to extract the positive and negative features of text analysis. The reasons for mixing these two algorithms are as follows. CNN can extract features for the classification automatically by applying convolution layer and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM has input, output, and forget gates that can be moved and controlled at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. Furthermore, when LSTM is used in CNN's pooling layer, it has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. In combination with CNN-LSTM, 90.33% accuracy was measured. This is slower than CNN, but faster than LSTM. The presented model was more accurate than other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can improve the weakness of each model, and there is an advantage of improving the learning by layer using the end-to-end structure of LSTM. Based on these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
The thermal analysis by mathematical model simulation makes it possible to reasonably predict heating and/or cooling requirements of certain greenhouses located under various geographical and climatic environment. It is another advantages of model simulation technique to be able to make it possible to select appropriate heating system, to set up energy utilization strategy, to schedule seasonal crop pattern, as well as to determine new greenhouse ranges. In this study, the control pattern for greenhouse microclimate is categorized as cooling and heating. Dynamic model was adopted to simulate heating requirements and/or energy conservation effectiveness such as energy saving by night-time thermal curtain, estimation of Heating Degree-Hours(HDH), long time prediction of greenhouse thermal behavior, etc. On the other hand, the cooling effects of ventilation, shading, and pad ||||&|||| fan system were partly analyzed by static model. By the experimental work with small size model greenhouse of 1.2m
Background: In patients with obstructive sleep apnea syndrome(OSAS), there are several factors increasing upper airway resistance and there is a predisposition to compromised respiratory function during waking and sleep related to constitutional factors including a tendency to obesity. Several recent studies have suggested a possible relationship between sleep apnea(SA) and systemic hypertension. But the possible pathophysiologic link between SA and hypertension is still unclear. In this study, we have examined the relationship among age, body mass index(BMI), pulmonary function parameters and polysomnographic data in patients with OSAS. And also we tried to know the difference among these parameters between hypertensive OSAS and normotensive OSAS patients. Methods: Patients underwent a full night of polysomnography and measured pulmonary function during waking. OSAS was diagnosed if patients had more than 5 apneas per hour(apnea index, AI). A careful history of previously known or present hypertension was obtained from each patient, and patients with systolic blood pressure
In this study, the early suicide prevention program was applied to the elementary school students and compared the prior & post effect of the program, and verified the status of psychology change like emotional status, or temptation to take a suicide, and presented the possibility as a suicide prevention program. The period of adolescence is the very unstable period in the process of growth being cognitively immature, emotionally impulsive period. It is the period emotionally unstable and unpredictable possible to select the method of suicide as an extreme method to escape the reality, or impulsive problem solving against small conflict or dispute situation. Many stress of the student such as recent nuclear family, expectation of parents to their children, education problem, socio-environmental elements, individual psychological factor lead students to the extreme activity of suicide in recent days. In this study, the scope of stress experienced in the elementary school as well as idea and degree of temptation regarding suicide by the suicide prevention program were identified, and through prevention program such as meditation training, breath training and through experience of anger control, emotion-expression, self overcome and establish positive self-identity and make understanding Self-control, Self-esteem & preciousness of life based on which the effect to suicide prevention was analyzed. The study was made targeting 51 students of 2 classes of 6th grade of elementary school of Goyang-si and processed 30 minutes every morning focused on through experience & activity of the principle & method of brain science. The data was collected for 20 times before starting morning class by using Suicide Probability Scale(herein SPS-A) designed to predict effectively suicide Probability, suicide risk prediction scale, surveyed by 7 areas such as Positive outlook, Within the family closeness, Impulsivity, Interpersonal hostility, Hopelessness, Hopelessness syndrome, suicide accident. Analytical methods and validation was used the Wilcoxon's signed rank test using SPSS Program. Though the process of program in short period, but there was a effective and positive results in the 7 areas in the average comparison. But in the t-test result, there was a different outcome. It indicated changes in the 3 questionnaires (No.7, No.14, No.19) out of 31 SPS-A questionnaires, and there was a no change to the rest item. It also indicated more changes of the students in the class A than class B. And in case of the class A students, psychological changes were verified in the areas of Hopelessness syndrome, suicide accident among 7 areas after the program was processed. Through this study, it could be verified that different results could be derived depending on the Student tendency, program professional(teacher in charge, processing lecturer). The suicide prevention program presented in this article can be a help in learning and suicide prevention with consistent systematization, activation through emotion and impulse control based on emotional stress relief and positive self-identity recovery, stabilization of brain waves, and let the short period program not to be died out but to be continued connecting from childhood to adolescence capable to make surrounding environment for spiritual, physical healthy growth for which this could be an effective program for suicide prevention of the social problem.
The wall shear stress in the vicinity of end-to end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer(FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE: graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor of ANFH formation and the graft failure. The present study suggests a correlation between regions of the low wall shear stress and the development of anastomotic neointimal fibrous hyperplasia(ANPH) in end-to-end anastomoses. 30523 T00401030523 ^x Air pressure decay(APD) rate and ultrafiltration rate(UFR) tests were performed on new and saline rinsed dialyzers as well as those roused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline rinsed and reused dialyzers showed considerable amount of decay. C-DAH dialyzers had a larger APD(11.70
The wall shear stress in the vicinity of end-to end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer(FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE: graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor of ANFH formation and the graft failure. The present study suggests a correlation between regions of the low wall shear stress and the development of anastomotic neointimal fibrous hyperplasia(ANPH) in end-to-end anastomoses. 30523 T00401030523 ^x Air pressure decay(APD) rate and ultrafiltration rate(UFR) tests were performed on new and saline rinsed dialyzers as well as those roused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline rinsed and reused dialyzers showed considerable amount of decay. C-DAH dialyzers had a larger APD(11.70
Background : Tumor markers have been used in diagnosis, predicting the extent of disease, monitoring recurrence after therapy and prediction of prognosis. But the utility of markers in lung cancer has been limited by low sensitivity and specificity. TPA-M is recently developed marker using combined monoclonal antibody of Cytokeratin 8, 18, and 19. This study was conducted to evaluate the efficacy of new tumor marker, TPA-M by comparing the estabilished markers SCC, CEA, Cyfra21-1 in lung cancer. Method : An immunoradiometric assay of serum CEA, sec, Cyfra21-1, and TPA-M was performed in 49 pathologically confirmed lung cancer patients who visited Keimyung University Hospital from April 1996 to August 1996, and 29 benign lung diseases. Commercially available kits, Ab bead CEA (Eiken) to CEA, SCC RIA BEAD (DAINABOT) to SCC, CA2H (TFB) to Cyfra2H. and TPA-M (DAIICHI) to TPA-M were used for this study. Results : The mean serum values of lung cancer group and control group were
Social media has become one of the most popular media in web and mobile application. In 2011, social networks and blogs are still the top destination of online users, according to a study from Nielsen Company. In their studies, nearly 4 in 5active users visit social network and blog. Social Networks and Blogs sites rule Americans' Internet time, accounting to 23 percent of time spent online. Facebook is the main social network that the U.S internet users spend time more than the other social network services such as Yahoo, Google, AOL Media Network, Twitter, Linked In and so on. In recent trend, most of the companies promote their products in the Facebook by creating the "Facebook Page" that refers to specific product. The "Like" option allows user to subscribed and received updates their interested on from the page. The film makers which produce a lot of films around the world also take part to market and promote their films by exploiting the advantages of using the "Facebook Page". In addition, a great number of streaming service providers allows users to subscribe their service to watch and enjoy movies and TV program. They can instantly watch movies and TV program over the internet to PCs, Macs and TVs. Netflix alone as the world's leading subscription service have more than 30 million streaming members in the United States, Latin America, the United Kingdom and the Nordics. As the matter of facts, a million of movies and TV program with different of genres are offered to the subscriber. In contrast, users need spend a lot time to find the right movies which are related to their interest genre. Recent years there are many researchers who have been propose a method to improve prediction the rating or preference that would give the most related items such as books, music or movies to the garget user or the group of users that have the same interest in the particular items. One of the most popular methods to build recommendation system is traditional Collaborative Filtering (CF). The method compute the similarity of the target user and other users, which then are cluster in the same interest on items according which items that users have been rated. The method then predicts other items from the same group of users to recommend to a group of users. Moreover, There are many items that need to study for suggesting to users such as books, music, movies, news, videos and so on. However, in this paper we only focus on movie as item to recommend to users. In addition, there are many challenges for CF task. Firstly, the "sparsity problem"; it occurs when user information preference is not enough. The recommendation accuracies result is lower compared to the neighbor who composed with a large amount of ratings. The second problem is "cold-start problem"; it occurs whenever new users or items are added into the system, which each has norating or a few rating. For instance, no personalized predictions can be made for a new user without any ratings on the record. In this research we propose a clustering method according to the users' genre interest extracted from social network service (SNS) and user's movies rating information system to solve the "cold-start problem." Our proposed method will clusters the target user together with the other users by combining the user genre interest and the rating information. It is important to realize a huge amount of interesting and useful user's information from Facebook Graph, we can extract information from the "Facebook Page" which "Like" by them. Moreover, we use the Internet Movie Database(IMDb) as the main dataset. The IMDbis online databases that consist of a large amount of information related to movies, TV programs and including actors. This dataset not only used to provide movie information in our Movie Rating Systems, but also as resources to provide movie genre information which extracted from the "Facebook Page". Formerly, the user must login with their Facebook account to login to the Movie Rating System, at the same time our system will collect the genre interest from the "Facebook Page". We conduct many experiments with other methods to see how our method performs and we also compare to the other methods. First, we compared our proposed method in the case of the normal recommendation to see how our system improves the recommendation result. Then we experiment method in case of cold-start problem. Our experiment show that our method is outperform than the other methods. In these two cases of our experimentation, we see that our proposed method produces better result in case both cases.