Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)
Journal of Intelligence and Information Systems, v.25 no.4, pp.141-154, 2019
Internet technology and social media are growing rapidly, and data mining technology has evolved to support unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor from high-quality content through text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning them to predefined categories such as positive and negative. It has been studied in various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research topics in natural language processing and is widely studied in text mining. Because real online reviews are openly available, they are easy to collect, and they also affect business. In marketing, real-world information from customers is gathered from websites rather than surveys. Whether a website's posts are positive or negative is reflected in sales, so companies try to identify this information. However, many reviews on a website are not always good and are difficult to identify. Earlier studies in this research area used review data from the Amazon.com shopping mall, but recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, and so on. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity of sentiment into positive and negative categories and to increase the prediction accuracy of polarity analysis using the IMDB review data set.
First, as comparative models for text classification in sentiment analysis, we adopt popular machine learning algorithms such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). A CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider the sequential attributes of the data. An RNN handles ordered data well because it takes the time information of the data into account, but it suffers from the long-term dependency problem. LSTM is used to solve this problem. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to the classical machine learning algorithms, CNN, LSTM, and the integrated model were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination, and we tried to understand how, and how well, the models work for sentiment analysis. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features of text. The reasons for combining these two algorithms are as follows. A CNN can automatically extract features for classification by applying convolution layers and massively parallel processing, whereas an LSTM is not capable of highly parallel processing. Like faucets, the input, output, and forget gates of the LSTM can be opened and closed at the desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can compensate for the CNN's inability to capture long-term dependencies.
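The faucet-like behavior of the input, output, and forget gates can be illustrated with a single-unit LSTM step. This is a minimal sketch in plain Python, not the model used in the study: the scalar state, the function names (`lstm_step`, `run_sequence`), and all weight values are assumptions chosen for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a single-unit LSTM cell with scalar input and state.

    w maps each gate name to (input weight, recurrent weight, bias).
    The weights are illustrative assumptions, not trained values.
    """
    def pre(name):
        wx, wh, b = w[name]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value to write into the cell
    c = f * c_prev + i * g            # updated memory (cell state)
    h = o * math.tanh(c)              # updated hidden state
    return h, c

def run_sequence(xs, w):
    """Feed a sequence through the cell, starting from a zero state."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, w)
    return h, c

# Hypothetical weights for demonstration only.
weights = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.6, 0.2, 0.0),
    "output": (0.4, 0.3, 0.0),
    "candidate": (0.9, 0.5, 0.0),
}
h_final, c_final = run_sequence([1.0, -0.5, 0.25], weights)
```

When the forget gate is fully open and the input gate fully closed, the cell state passes through a step unchanged, which is how the memory block can carry information across long spans.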
Furthermore, when an LSTM is placed after the CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be learned simultaneously. The combined CNN-LSTM model achieved an accuracy of 90.33%. It is slower than CNN but faster than LSTM, and it was more accurate than the other models. In addition, the word embedding layer can be improved as the kernels are trained step by step. CNN-LSTM can compensate for the weaknesses of each model, and its end-to-end structure has the advantage of improving learning layer by layer. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
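One way to picture the end-to-end structure is to trace the shape of the data through an embedding layer, a 1-D convolution, max pooling, and an LSTM. The sketch below assumes that the LSTM consumes the pooled feature sequence; the layer order, the function names, and every hyperparameter value are illustrative assumptions, since the abstract does not report them.

```python
def conv1d_out_len(seq_len, kernel_size, stride=1):
    """Output length of a 1-D convolution without padding."""
    return (seq_len - kernel_size) // stride + 1

def pool_out_len(seq_len, pool_size):
    """Output length of non-overlapping max pooling."""
    return seq_len // pool_size

def cnn_lstm_shapes(seq_len, embed_dim, filters, kernel_size, pool_size, lstm_units):
    """Return (layer name, output shape) pairs for a CNN-LSTM stack."""
    conv_len = conv1d_out_len(seq_len, kernel_size)
    pooled_len = pool_out_len(conv_len, pool_size)
    return [
        ("embedding", (seq_len, embed_dim)),   # one vector per token
        ("conv1d", (conv_len, filters)),       # local n-gram features
        ("max_pool", (pooled_len, filters)),   # shorter feature sequence
        ("lstm", (lstm_units,)),               # summary of the whole sequence
        ("sigmoid_output", (1,)),              # positive vs. negative score
    ]

# Hypothetical settings: 400-token reviews, 128-d embeddings, 64 filters.
for name, shape in cnn_lstm_shapes(400, 128, 64, 5, 4, 70):
    print(name, shape)
```

Because pooling shortens the sequence before it reaches the LSTM, the recurrent part processes fewer time steps than an LSTM applied directly to the tokens, which is consistent with the combined model being faster than a plain LSTM.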
Until recently, research in real estate has focused on the real estate market and market analysis, while studies on developing training programs that help real estate agents improve their job performance have been relatively few. This study therefore presents an empirical analysis of the need for training programs for real estate agents in Cheonan to improve their job performance. The results are as follows. First, when asked what educational content they need in order to improve their job performance, most respondents indicated a need for house valuation, legal knowledge, real estate management, accounting, real estate marketing, and an understanding of real estate policy. This is because they are well aware that training programs are the best way to respond to clients' changing needs. Second, when asked about real estate marketing strategies, most respondents showed awareness of new strategies to meet clients' needs. This is because new forms of marketing, including internet advertising, are needed in the field as paradigms such as information technology change. Third, when asked about the need for real estate-related training programs, 92% of the respondents answered that they need real estate education programs run by the continuing education centers of universities. The survey also showed a need for retraining programs that utilize the resources of local universities. Beyond this, for effective and efficient training, respondents called for a training system that uses the human resources of universities, under the name of a department of 'Real Estate Contract', to support real estate agents' job performance. Fourth, the survey revealed that real estate management (44.2%) and real estate marketing (42.3%) were the most frequently chosen subjects that respondents want to take in a regular course for improving their job performance.
This shows their will to understand clients' needs from the perspective of real estate management and real estate marketing. The survey also showed that they prefer training programs offered as irregular courses to those offered as regular ones. Despite the above results, this study selected subjects only in Cheonan, so further research is needed in more diverse areas. The need for programs that improve real estate agents' job performance should be analyzed empirically, targeting real estate agents not just in Cheonan but also in cities such as Pyeongchon, Ilsan, and Bundang, where the real estate business is booming, as well as undergraduate and graduate students majoring in real estate studies. Such studies will provide information that helps develop customized training programs by evaluating the elements that real estate agents need in order to satisfy clients and improve their job performance. Many of the program development variables identified through these studies can be incorporated into the curriculum of real estate studies and used practically as information for developing the field in this fast-changing era.
1. Introduction
Today the Internet is recognized as an important channel for transactions of products and services. According to data from the National Statistical Office, on-line transactions in 2007 totaled 15.7656 trillion won, a 17.1% (2.3060 trillion won) increase over the previous year; of these, B2C transactions increased by 12.0% (10.2258 trillion won). Because the entry barrier of the Korean on-line market is low, many retailers can easily enter it. As a result, the larger the market grows, the tougher the competition becomes. In particular, the Internet and IT innovation have turned the existing market into a perfectly competitive one (Srinivasan, Rolph & Kishore, 2002). In the early years of on-line business, firms believed that moderate prices were the main reason for success, but under tough competition they have come to recognize the importance of on-line service quality. If customers are not sure whether a Web site can provide what they want, or whether they can trust the products they have bought, they doubt its viability (Parasuraman, Zeithaml & Malhotra, 2005). Customers can directly reserve and issue air tickets at the Web sites of travel agencies or airlines irrespective of place and time, but empirical studies of such Web sites are insufficient. This study therefore has the following specific objectives. The first is to measure the service quality and service recovery of Web sites for reserving and issuing air tickets. The second is to examine whether on-line service quality and on-line service recovery affect overall service quality. The third is to investigate the relationship between overall service quality and customer satisfaction, and between customer satisfaction and loyalty intention.
2. Theoretical Background
2.1 On-line Service Quality
Barnes & Vidgen (2000; 2001a; 2001b; 2002) developed a tool for measuring Web site quality, called WebQual, in four iterations. WebQual 1.0, the first step, developed measurement items for information quality based on QFD and was verified with students of a UK business school. WebQual 2.0, the second step, was developed for interaction quality and was evaluated by customers of on-line bookshops. WebQual 3.0, the third step, consolidated WebQual 1.0 for information quality and WebQual 2.0 for interaction quality. It includes three quality dimensions (information quality, interaction quality, and site design) and was assessed and confirmed on auction sites (eBay, Amazon, QXL). Furthermore, drawing on the earlier empirical studies, the authors replaced site quality with usability, judging that usability is a concept describing how customers interact with or perceive Web sites and that it is widely used for assessing Web sites. Through this process WebQual 4.0 was developed; it consists of three quality dimensions (information quality, interaction quality, and usability) and 22 items. However, because WebQual 4.0 focuses on technical aspects, it is useful for the design of a Web site but not for the pleasant-experience aspect of a Web site. Parasuraman, Zeithaml & Malhotra (2002; 2005) developed measures of on-line service quality in 2002 and 2005. The 2002 study divided on-line service quality into 5 dimensions, but these were not well organized and needed to be studied again in full. Parasuraman, Zeithaml & Malhotra (2005) therefore reworked the on-line service quality measure based on the 2002 study and developed E-S-QUAL. After developing a preliminary measure of on-line service quality, they surveyed customers who had purchased at amazon.com and walmart.com and reassessed the measure.
They then finalized E-S-QUAL, which consists of 4 dimensions (efficiency, system availability, fulfillment, and privacy) and 22 items. Efficiency measures access to and usability of the site, among other things; system availability measures the correct technical functioning of the site; fulfillment measures the promptness of product delivery and the sufficiency of goods; and privacy measures the degree to which customer data are protected.
2.2 Service Recovery
Service industries tend to minimize losses by coping with service failures promptly. These responses of service providers to service failure are called service recovery (Kelly & Davis, 1994). Bitner (1990) studied, from the customers' point of view, the behavior of service providers that leads customers to recognize satisfaction or dissatisfaction at the service encounter. According to that study, successful management of service failure requires accurate recognition of the service problem, an apology, a sufficient explanation of the failure, and some tangible compensation. Parasuraman, Zeithaml & Malhotra (2005) approached service recovery from the standpoint of how to measure it rather than how to manage it, moved from the off-line to the on-line market, and developed E-RecS-QUAL, a tool for measuring on-line service recovery.
2.3 Customer Satisfaction
Definitions of customer satisfaction can be divided into two points of view. The first approaches customer satisfaction as an outcome of consumption. Howard & Sheth (1969) defined satisfaction as 'a cognitive condition feeling being rewarded properly or improperly for their sacrifice.' and Westbrook & Reilly (1983) defined customer satisfaction/dissatisfaction as 'a psychological reaction to the behavior pattern of shopping and purchasing, the display condition of retail store, outcome of purchased goods and service as well as whole market.' The second approaches customer satisfaction as a process.
Engel & Blackwell (1982) defined satisfaction as 'an assessment of a consistency in chosen alternative proposal and their belief they had with them.' Tse & Wilton (1988) defined customer satisfaction as 'a customers' reaction to discordance between advance expectation and ex post facto outcome.' That is, the process view holds that the important factor in customer satisfaction is the process of comparing and assessing what consumers expected against the outcome. Unlike the outcome-oriented approach, the process-oriented approach has many advantages. Because it deals with customers' whole consumption experience, it examines the main process by measuring, one by one, each factor that plays an essential role at each step. This approach also makes it possible to examine the perceptual and psychological process that forms customer satisfaction. Because of these advantages, many studies now adopt the process-oriented approach (Yi, 1995).
2.4 Loyalty Intention
Loyalty has been studied through behavioral approaches, attitudinal approaches, and complex approaches (Dekimpe et al., 1997). Early studies defined loyalty with a focus on the behavioral concept; behavioral approaches regard customer loyalty as "a tendency to purchase periodically within a certain period of time at specific retail store." However, because behavioral approaches focus only on the outcome of customer behavior, some have pointed out the limitation that customers' decision-making situations and processes are neglected (Enis & Paul, 1970; Raj, 1982; Lee, 2002). The attitudinal approaches were therefore suggested. They consider loyalty to contain cognitive, emotional, and voluntary factors (Oliver, 1997) and define customer loyalty as "friendly behaviors for specific retail stores." However, although the attitudinal approaches can explain how customer loyalty forms and changes, they cannot say with certainty whether it will carry over into actual purchasing in the future.
This is a shortcoming (Oh, 1995).
3. Research Design
3.1 Research Model
Based on the objectives of this study, the research model derived is
Recently, consumption patterns have rapidly diversified and individualized through Internet-based web and mobile devices. As this happens, the efficient operation of the offline store, the traditional distribution channel, has become more important. To raise both sales and profits, stores need to supply and sell the products most attractive to consumers in a timely manner. However, there is a lack of research on which SKUs, out of many products, can increase the probability of sales and reduce inventory costs. In particular, if a company sells products through multiple in-store shops across multiple locations, recommending SKUs that appeal to customers would help increase the sales and profitability of those stores. In this study, the recommender system (such as collaborative filtering and hybrid filtering), which has been used for personalized recommendation, is proposed as a store-level SKU recommendation method for a distribution company that handles a homogeneous brand through a plurality of sales stores by country and region. We calculated the similarity between stores using the purchase data for the items each store handles, applied collaborative filtering according to each store's sales history for each SKU, and finally recommended individual SKUs to each store. In addition, the stores were classified into four clusters through PCA (Principal Component Analysis) and cluster analysis using store profile data. The recommendation system was implemented with a hybrid filtering method that applies collaborative filtering within each cluster, and the performance of both methods was measured on actual sales data. Most existing recommendation systems have been studied for recommending items such as movies and music to users, and industrial applications have also become popular.
Meanwhile, there has been little research on recommending SKUs for each store by applying these recommendation systems, which have mainly been dealt with in the field of personalization services, to the store units of distributors handling similar brands. Whereas existing recommendation methodology has targeted the individual, this study extends the scope beyond the individual domain to the store, dealing with the store units of a distribution company that handles the same brand SKUs through a plurality of sales stores by country and region, and proposes a recommendation method for them. In addition, whereas existing recommendation systems have been limited to online use, this study extends the range of use to offline and applies data mining techniques to develop an algorithm suited to the store level rather than to analysis of individuals. The significance of this study is that a personalization recommendation algorithm is applied to a plurality of sales outlets handling the same brand; a meaningful result is derived, and a concrete methodology that actual companies can build and use as a system is proposed. It is also meaningful as the first attempt to extend academic research on recommendation systems, which has focused on the personalization domain, to the sales stores of a company handling the same brand. From 05 to 03 in 2014, the stores' sales volumes of the top 100 SKUs were restricted to the 52 SKUs recommended by collaborative filtering and by the hybrid filtering method. We compared the performance of the two recommendation methods by totaling the sales results.
The two recommendation methods are compared because this study defines offline collaborative filtering as the reference model, in order to demonstrate higher performance than the existing recommendation method. The results of this model are compared with those of the hybrid filtering method, a model that reflects the characteristics of offline stores. The proposed method showed higher performance than the existing recommendation method, and this was verified using actual sales data from large Korean apparel companies. In this study, we propose a method for extending recommendation systems from the individual level to the group level and for approaching this efficiently. Beyond the theoretical framework, this is of great value.
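The store-level collaborative filtering described in this abstract can be sketched as follows: stores take the place of users, similarity between stores is computed from their SKU sales vectors, and SKUs that a store does not yet carry are scored by the similarity-weighted sales of other stores. Cosine similarity, the scoring rule, the function names, and the toy sales figures are all assumptions for illustration; the abstract does not specify them.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend_skus(sales, target, top_n=2):
    """Rank SKUs the target store does not sell by similarity-weighted sales."""
    skus = sorted({sku for row in sales.values() for sku in row})
    vec = {store: [row.get(sku, 0) for sku in skus] for store, row in sales.items()}
    scores = {}
    for other, row in sales.items():
        if other == target:
            continue
        sim = cosine(vec[target], vec[other])
        for sku, qty in row.items():
            if sales[target].get(sku, 0) == 0:
                scores[sku] = scores.get(sku, 0.0) + sim * qty
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [sku for sku, _ in ranked[:top_n]]

# Made-up sales quantities per store and SKU, for demonstration only.
sales = {
    "store_A": {"sku1": 10, "sku2": 5},
    "store_B": {"sku1": 9, "sku2": 4, "sku3": 7},
    "store_C": {"sku4": 8, "sku2": 1},
}
print(recommend_skus(sales, "store_A"))
```

For the hybrid variant, the same function could be called with `sales` restricted to the stores in the target store's cluster, so that similarity is only computed among stores with comparable profiles.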
As social data come into the spotlight, mainstream web search engines provide data indicating how many people searched for a specific keyword: web search traffic data. Web search traffic information is the collective record of the crowd that searches for a specific keyword. In various areas, web search traffic can be used as a useful variable representing the attention of ordinary users to specific interests. Many studies use web search traffic data to nowcast or forecast social phenomena, such as epidemic prediction, consumer pattern analysis, product life cycles, and financial investment modeling. Web search traffic data have also begun to be applied to predicting inbound tourists. Proper demand prediction is needed because tourism is a high value-added industry that increases employment and foreign exchange earnings. Among inbound tourists, Chinese tourists (Youke) in particular are continuously growing; Youke have been the largest inbound tourist group in Korean tourism for many years, and tourism profit per Youke is the highest as well. Research into proper approaches for predicting Youke demand is important in both the public and private sectors, because accurate tourism demand prediction is important for efficient decision making with limited resources. This study suggests an improved model that reflects the latest social issues by representing the attention of groups of individuals. A trip abroad is generally a high-involvement activity, so potential tourists are likely to search deeply for information about their own trip. Web search traffic data represent tourists' attention while they prepare their journey in an instantaneous and dynamic way. This study therefore attempted to select the keywords that potential Chinese tourists are likely to search for on the internet. Baidu, the largest Chinese web search engine with a share of over 80%, provides users with access to web search traffic data.
Qualitative interviews with potential tourists helped us to understand information search behavior before a trip and to identify the keywords for this study. The selected keywords were categorized into three levels according to how directly they relate to "Korean Tourism". Classifying them this way helps to find out which keywords can explain Youke inbound demand, from the closest category to the farthest. Web search traffic data for each keyword were gathered by a web crawler developed to crawl web search data from the Baidu Index. Using the automatically gathered variables, a linear model was built by multiple regression analysis, which is suitable for operational use in decision and policy making because the relationships between variables are easy to explain. After the linear regression models were built, a model composed of traditional variables was compared, in terms of significance and R squared, with a model that adds web search traffic variables to the traditional model. After comparing the performance of the models, the final model was composed. The final regression model has improved explanatory power and offers real-time immediacy and convenience compared with the traditional model. Furthermore, this study demonstrates a system that is intuitively visualized for general use: the Youke Mining solution has several functions for tourism decision making, including the embedded final regression model. The Youke Mining solution has an algorithm based on data science and a well-designed, simple interface. Finally, this research has three significant implications, in theoretical, practical, and policy terms. Theoretically, the Youke Mining system and the model in this research are a first step toward Youke inbound prediction using an interactive and instant variable: web search traffic information, which represents tourists' attention while they prepare their trip. Baidu accounts for more than 80% of the web search engine market.
Practically, Baidu data can therefore represent, in real time, the attention of potential tourists who are preparing their own tour. Finally, in policy terms, the Chinese tourist demand prediction model based on web search traffic can be used in tourism decision making for efficient management of resources and for optimizing opportunities for successful policy.
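The comparison between a model of traditional variables and a model that adds web search traffic can be sketched with ordinary least squares and R squared. The code below is a minimal illustration in plain Python: the variable names, the solver, and the toy numbers are assumptions, and the data are constructed so that the target depends on both inputs; they are not the study's data.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_r2(features, y):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, R squared)."""
    x = [[1.0] + list(row) for row in features]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

# Toy data: a traditional variable, a search-traffic index, and arrivals.
traditional = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
search_index = [2.0, 1.0, 4.0, 3.0, 6.0, 4.0]
arrivals = [2.0 + t + 3.0 * s for t, s in zip(traditional, search_index)]

_, r2_base = ols_r2([[t] for t in traditional], arrivals)
beta_aug, r2_aug = ols_r2(list(zip(traditional, search_index)), arrivals)
print(r2_base < r2_aug)
```

On nested models R squared never decreases when a variable is added, which is why the abstract also checks significance; an adjusted R squared or a held-out test would be a natural complement.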
The advent of 5G mobile communications, which is expected in 2020, will provide many services such as Internet of Things (IoT) and vehicle-to-infra/vehicle/nomadic (V2X) communication. There are many requirements for realizing these services: reduced latency, high data rate and reliability, and real-time service. In particular, a high level of reliability and delay sensitivity with an increased data rate are very important for M2M, IoT, and Factory 4.0. Around the world, 5G standardization organizations have considered these services and grouped them to finally derive the technical requirements and service scenarios. The first scenario is broadcast services that use a high data rate for multiple cases of sporting events or emergencies. The second scenario is support for e-Health, car reliability, etc.; the third scenario is related to VR games with delay sensitivity and real-time techniques. Recently, these groups have been forming agreements on the requirements for such scenarios and the target level. Various techniques are being studied to satisfy such requirements and are being discussed in the context of software-defined networking (SDN) as the next-generation network architecture. SDN is being standardized by the ONF and basically refers to a structure that separates the signals of the control plane from the packets of the data plane. One of the best examples for low latency and high reliability is an intelligent traffic system (ITS) using V2X. Because a car passes through a small cell of the 5G network very rapidly, the messages to be delivered in the event of an emergency have to be transported in a very short time. This is a typical example requiring high delay sensitivity. 5G has to support high reliability and delay sensitivity requirements for V2X in the field of traffic control. For these reasons, V2X is a major delay-critical application. V2X (vehicle-to-infra/vehicle/nomadic) represents all types of communication methods applicable to roads and vehicles.
It refers to a connected or networked vehicle. V2X can be divided into three kinds of communications. First is the communication between a vehicle and infrastructure (vehicle-to-infrastructure; V2I). Second is the communication between a vehicle and another vehicle (vehicle-to-vehicle; V2V). Third is the communication between a vehicle and mobile equipment (vehicle-to-nomadic devices; V2N). This will be added in the future in various fields. Because the SDN structure is under consideration as the next-generation network architecture, the SDN architecture is significant. However, the centralized architecture of SDN can be considered as an unfavorable structure for delay-sensitive services because a centralized architecture is needed to communicate with many nodes and provide processing power. Therefore, in the case of emergency V2X communications, delay-related control functions require a tree supporting structure. For such a scenario, the architecture of the network processing the vehicle information is a major variable affecting delay. Because it is difficult to meet the desired level of delay sensitivity with a typical fully centralized SDN structure, research on the optimal size of an SDN for processing information is needed. This study examined the SDN architecture considering the V2X emergency delay requirements of a 5G network in the worst-case scenario and performed a system-level simulation on the speed of the car, radius, and cell tier to derive a range of cells for information transfer in SDN network. In the simulation, because 5G provides a sufficiently high data rate, the information for neighboring vehicle support to the car was assumed to be without errors. Furthermore, the 5G small cell was assumed to have a cell radius of 50-100 m, and the maximum speed of the vehicle was considered to be 30-200 km/h in order to examine the network architecture to minimize the delay.
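The delay budget implied by these simulation parameters can be estimated with simple arithmetic: the longest time a vehicle stays in one cell is the cell diameter divided by its speed. The sketch below assumes a straight path through the cell center and ignores handover overlap; the function name is hypothetical and the calculation is an illustration, not the system-level simulation of the study.

```python
def cell_dwell_time_s(cell_radius_m, speed_kmh):
    """Longest time in seconds a vehicle spends in a cell, crossing the full diameter."""
    speed_ms = speed_kmh * 1000.0 / 3600.0   # convert km/h to m/s
    return 2.0 * cell_radius_m / speed_ms

# Corner cases of the ranges stated above: 50-100 m radius, 30-200 km/h.
for radius in (50, 100):
    for speed in (30, 200):
        print(radius, "m,", speed, "km/h ->", round(cell_dwell_time_s(radius, speed), 2), "s")
```

With a 50 m radius and a speed of 200 km/h the crossing takes about 1.8 s, so any emergency message that must reach a vehicle before it leaves the cell, including the round trip to an SDN controller, has to fit well within that window.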
In this study, we propose a choice-based diffusion model with switching cost, which can be used to forecast dynamic substitution and competition among previous and new products at both the individual and the aggregate level, especially when market data for new products are insufficient. Additionally, we apply the proposed model to the empirical case of substitution and competition among Analog Cable TV, which represents the previous fixed-charge broadcasting service, and Digital Cable TV and Internet Protocol TV (IPTV), which are the new ones; verify the validity of the proposed model; and derive related empirical implications. For the empirical application, we obtained data from a survey conducted as follows. The survey was administered by Dongseo Research to 1,000 adults aged 20 to 60 living in Seoul, Korea, in May 2007, under the title 'Demand analysis of next generation fixed interactive broadcasting services'. A conjoint survey, modified as follows, was used. First, following the traditional approach in conjoint analysis, we extracted 16 hypothetical alternative cards from an orthogonal design using the important attributes and levels of next generation interactive broadcasting services, which were determined from a review of the previous literature and experts' comments. We then divided the 16 conjoint cards into 4 groups, composing 4 choice sets with 4 alternatives each, so that each respondent faces 4 different hypothetical choice situations. In addition, we made two modifications. First, we asked the respondents to include the status-quo broadcasting service they subscribe to as another alternative in each choice set. As a result, in each of the 4 choice sets respondents choose their most preferred alternative among 5 alternatives: 1 alternative representing the current subscription and 4 hypothetical alternatives.
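The construction of the choice sets can be sketched in a few lines: 16 cards are split into 4 groups of 4, and the respondent's current subscription is appended to each group as a fifth alternative. How the cards were actually assigned to groups is not stated, so the sequential split, the card labels, and the function name below are assumptions for illustration.

```python
def build_choice_sets(cards, n_sets=4, status_quo="status_quo"):
    """Split cards into n_sets equal groups and add the status-quo alternative to each."""
    size = len(cards) // n_sets
    return [cards[i * size:(i + 1) * size] + [status_quo] for i in range(n_sets)]

# Hypothetical labels for the 16 cards from the orthogonal design.
cards = [f"card_{n:02d}" for n in range(1, 17)]
choice_sets = build_choice_sets(cards)
for number, alternatives in enumerate(choice_sets, start=1):
    print(number, alternatives)
```

Each printed set holds 4 stated-preference alternatives plus 1 revealed-preference alternative, which is what allows switching cost to be estimated alongside the attribute effects.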
Modification of traditional conjoint survey in this way enabled us to estimate the factors related to switching cost or switching threshold in addition to the effects of attributes. Also, by using both revealed preference data(1 alternative with current subscription) and stated preference data (4 hypothetical alternatives), additional advantages in terms of the estimation properties and more conservative and realistic forecast, can be achieved. Second, we asked the respondents to choose the most preferred alternative while considering their expected adoption timing or switching timing. Respondents are asked to report their expected adoption or switching timing among 14 half-year points after the introduction of next generation broadcasting services. As a result, for each respondent, 14 observations with 5 alternatives for each period, are obtained, which results in panel-type data. Finally, this panel-type data consisting of
Every company wants to know its customers' requirements and makes an effort to meet them. Because of this, communication between customer and company has become a core competency of business, and its importance is continuously increasing. There are several strategies for finding customers' needs, but VOC (Voice of Customer) is one of the most powerful communication tools, and gathering VOC through channels such as telephone, post, e-mail, and websites is highly meaningful. Accordingly, almost every company gathers VOC and operates a VOC system. VOC is important not only to business organizations but also to public organizations such as government, educational institutes, and medical centers, which should raise public service quality and customer satisfaction. They therefore build systems for gathering and analyzing VOC and use them to create and upgrade products and services. In recent years, innovations in the internet and ICT have created diverse channels, such as SNS, mobile, websites, and call centers, for collecting VOC data. Although a large amount of VOC data is collected through these channels, proper utilization is still difficult, because VOC data consist of highly emotional content expressed in informal voice or text, and the volume of the data is very large. Such unstructured big data are difficult for humans to store and analyze. Organizations therefore need a system that automatically collects, stores, classifies, and analyzes unstructured big VOC data. This study proposes an intelligent VOC analyzing system based on opinion mining that classifies unstructured VOC data automatically and determines the polarity as well as the type of each VOC. The basis of the VOC opinion analyzing system, called the domain-oriented sentiment dictionary, is then created, and the corresponding stages are presented in detail.
An experiment was conducted with 4,300 VOC records collected from a medical website to measure the effectiveness of the proposed system. The records were used to develop the sentiment dictionary by determining the domain-specific sentiment vocabulary and its polarity values in the medical domain. The experiment showed that positive terms such as "칭찬 (praise), 친절함 (kindness), 감사 (thanks), 무사히 (safely), 잘해 (does well), 감동 (moved), 미소 (smile)" have high positive opinion values, and negative terms such as "퉁명 (curt), 뭡니까 ("what is this?"), 말하더군요 ("that is what they said"), 무시하는 (dismissive)" have strong negative opinion values. These terms are in general use, and the experimental results appear to indicate opinion polarity with high probability. Furthermore, the accuracy of the proposed VOC classification model was compared, and the highest classification accuracy of 77.8% was confirmed at an opinion classification threshold of -0.50. With the proposed intelligent VOC analyzing system, the real-time opinion classification and response priority of VOC can be predicted. Ultimately, the system is expected to catch customer complaints at an early stage and handle them quickly with fewer staff operating the VOC system, freeing up human resources and time in the customer service department. Above all, this study is a new attempt at automatically analyzing unstructured VOC data using opinion mining, and it shows that the system could be used to classify the positive or negative polarity of VOC opinions. It is expected to suggest a practical framework for VOC analysis for diverse uses, and the model can serve as a real VOC analyzing system once it is implemented. Despite the experimental results and expectations, this study has several limitations. First of all, the sample data was collected only from a single hospital website. This means that the sentiment dictionary built from the sample data may lean too heavily toward that hospital and website.
Therefore, future research should cover several channels such as call centers and SNS, and other domains such as government, financial companies, and educational institutes.
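The dictionary-and-threshold idea described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the polarity values, the averaging rule, the function names `opinion_score` and `classify_voc`, and the decision to label a record negative when its score is at or below the threshold. The abstract only reports that a threshold of -0.50 gave the best accuracy; it does not say how scores are computed or applied.

```python
from typing import Dict, List

# Hypothetical polarity values for illustration only; the abstract does not
# publish the study's actual dictionary scores.
SENTIMENT_DICT: Dict[str, float] = {
    "칭찬": 1.0,       # praise
    "감사": 0.8,       # thanks
    "퉁명": -0.9,      # curt
    "무시하는": -1.0,  # dismissive
}


def opinion_score(tokens: List[str], lexicon: Dict[str, float]) -> float:
    """Mean polarity of the tokens found in the lexicon (0.0 if none match)."""
    values = [lexicon[t] for t in tokens if t in lexicon]
    return sum(values) / len(values) if values else 0.0


def classify_voc(tokens: List[str], lexicon: Dict[str, float],
                 threshold: float = -0.50) -> str:
    """Assumed rule: negative when the score is at or below the threshold."""
    return "negative" if opinion_score(tokens, lexicon) <= threshold else "positive"


# Tokens outside the lexicon (e.g. "직원", staff) are simply ignored.
print(classify_voc(["직원", "퉁명", "무시하는"], SENTIMENT_DICT))
print(classify_voc(["의사", "칭찬", "감사"], SENTIMENT_DICT))
```

A real system would also need morphological analysis to tokenize Korean text before the lookup, which this sketch leaves out.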
With the development of the Internet, consumers can easily check product information through E-Commerce. Product reviews used in the purchasing process are based on user experience, allowing consumers to act as producers of information as well as to consult it. This can increase the efficiency of purchasing decisions from the consumer's perspective, and from the seller's point of view, it can help develop products and strengthen competitiveness. However, it takes a lot of time and effort for consumers to grasp the overall assessment, and the assessment dimensions they consider important, by reading the vast number of product reviews that E-Commerce offers for the products they want to compare. This is because product reviews are unstructured information, and it is difficult to read the sentiment and assessment dimension of a review immediately. For example, consumers who want to purchase a laptop would like to check how the products under comparison are assessed on each dimension, such as performance, weight, delivery, speed, and design. Therefore, this paper proposes a method to automatically generate multi-dimensional product assessment scores from the reviews of the products to be compared. The method consists largely of two phases: a preparation phase and an individual product scoring phase. In the preparation phase, a dimensioned classification model and a sentiment analysis model are created from reviews of the large-category product group. By combining word embedding and association analysis, the dimensioned classification model compensates for a limitation of existing studies, in which word embedding methods for finding the relevance between dimensions and words consider only the distance between words in sentences.
The sentiment analysis model is a CNN model trained on data tagged as positive or negative at the phrase level for accurate polarity detection. The individual product scoring phase then applies these prepared models to reviews split into phrases. Each phrase judged to describe a specific dimension is grouped under that dimension, and multi-dimensional assessment scores are obtained by aggregating the proportions of the reviews grouped this way by assessment dimension. In the experiment, approximately 260,000 reviews of the large-category product group were collected to build the dimensioned classification model and the sentiment analysis model. In addition, reviews of laptops from companies S and L sold through E-Commerce were collected and used as experimental data. The dimensioned classification model classified individual product reviews, broken down into phrases, into six assessment dimensions, combining the existing word embedding method with an association analysis that indicates the frequency between words and dimensions. Combining word embedding and association analysis increased the accuracy of the model by 13.7%. The sentiment analysis model analyzed assessments more closely when trained on phrases rather than on sentences, and its accuracy was confirmed to be 29.4% higher than that of the sentence-based model. Through this study, both sellers and consumers can expect more efficient decision making in purchasing and product development, because they can make multi-dimensional comparisons of products. In addition, text reviews, which are unstructured data, were transformed into objective values such as frequency and morpheme, and analysed together using word embedding and association analysis to improve the objectivity of a more precise multi-dimensional analysis.
This will be an attractive analysis model, not only because it enables more effective service deployment amid the evolving E-Commerce market and fierce competition, but also because it satisfies both sellers and consumers.
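The aggregation step in the scoring phase described above can be sketched as follows. This is only an illustration under stated assumptions: each phrase is taken to have already been labeled with a dimension and a polarity by the two models, and the score of a dimension is assumed to be the share of its phrases that are positive. The function name `dimension_scores` and the label strings are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def dimension_scores(phrases: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Share of positive phrases per dimension.

    phrases: (dimension, polarity) pairs, where polarity is "pos" or "neg".
    Both labels are assumed to come from upstream classifiers.
    """
    positive = defaultdict(int)
    total = defaultdict(int)
    for dimension, polarity in phrases:
        total[dimension] += 1
        if polarity == "pos":
            positive[dimension] += 1
    return {d: positive[d] / total[d] for d in total}


# Hypothetical labeled phrases for one laptop.
labeled = [
    ("performance", "pos"),
    ("performance", "neg"),
    ("weight", "pos"),
    ("design", "neg"),
]
print(dimension_scores(labeled))
```

Computing the same dictionary for two products gives a per-dimension comparison of the kind the abstract describes.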
Purpose: Uterine cervix cancer is one of the most prevalent cancers among women in Korea. We analysed papers published in Korea and compared them with Patterns of Care Study (PCS) articles from the United States and Japan for the purpose of developing and processing a Korean PCS. Materials and Methods: We searched for PCS-related foreign papers on the PCS homepage (212 articles and abstracts) and in PubMed to identify the Structure and Process of the PCS. To compare these studies with Korean papers, we used the internet site 'Korean Pub Med' to search 99 articles on uterine cervix cancer and radiation therapy. We analysed the Korean papers by comparing them with selected PCS papers with respect to Structure, Process and Outcome, and compared their items between the period before the 1980's and the 1990's. Results: The evaluable papers dealing with cervix PCS items numbered 28 from the United States, 10 from Japan and 73 from Korea. PCS papers from the United States and Japan commonly stratified into
shows, Step 1 and Step 2 are significant, and mediation variable has a significant effect on dependent variables and so does independent variables at Step 3, too. And there needs to prove the partial mediation effect, independent variable's estimate ability at Step 3(Standardized coefficient
SKU recommender system for retail stores that carry identical brands using collaborative filtering and hybrid filtering
(협업 필터링 및 하이브리드 필터링을 이용한 동종 브랜드 판매 매장간(間) 취급 SKU 추천 시스템)
Development of Yóukè Mining System with Yóukè's Travel Demand and Insight Based on Web Search Traffic Information
(웹검색 트래픽 정보를 활용한 유커 인바운드 여행 수요 예측 모형 및 유커마이닝 시스템 개발)
End to End Model and Delay Performance for V2X in 5G
(5G에서 V2X를 위한 End to End 모델 및 지연 성능 평가)
Forecasting Substitution and Competition among Previous and New products using Choice-based Diffusion Model with Switching Cost: Focusing on Substitution and Competition among Previous and New Fixed Charged Broadcasting Services
(전환 비용이 반영된 선택 기반 확산 모형을 통한 신.구 상품간 대체 및 경쟁 예측: 신.구 유료 방송서비스간 대체 및 경쟁 사례를 중심으로)
Intelligent VOC Analyzing System Using Opinion Mining
(오피니언 마이닝을 이용한 지능형 VOC 분석시스템)
Multi-Dimensional Analysis Method of Product Reviews for Market Insight
(마켓 인사이트를 위한 상품 리뷰의 다차원 분석 방안)
Literature Analysis of Radiotherapy in Uterine Cervix Cancer for the Processing of the Patterns of Care Study in Korea
(한국에서 자궁경부알 방사선치료의 Patterns of Care Study 진행을 위한 문헌 비교 연구)