• Title/Summary/Keyword: structured modeling

Search Result 328, Processing Time 0.034 seconds

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

  • Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.125-148
    • /
    • 2018
  • Recently, as the demand for big data analysis increases, cases of analyzing unstructured data and using the results are also increasing. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields. In addition, many analysts are interested in the amount of data is very large and relatively easy to collect compared to other unstructured and structured data. Among the various text analysis applications, document classification which classifies documents into predetermined categories, topic modeling which extracts major topics from a large number of documents, sentimental analysis or opinion mining that identifies emotions or opinions contained in texts, and Text Summarization which summarize the main contents from one document or several documents have been actively studied. Especially, the text summarization technique is actively applied in the business through the news summary service, the privacy policy summary service, ect. In addition, much research has been done in academia in accordance with the extraction approach which provides the main elements of the document selectively and the abstraction approach which extracts the elements of the document and composes new sentences by combining them. However, the technique of evaluating the quality of automatically summarized documents has not made much progress compared to the technique of automatic text summarization. Most of existing studies dealing with the quality evaluation of summarization were carried out manual summarization of document, using them as reference documents, and measuring the similarity between the automatic summary and reference document. Specifically, automatic summarization is performed through various techniques from full text, and comparison with reference document, which is an ideal summary document, is performed for measuring the quality of automatic summarization. Reference documents are provided in two major ways, the most common way is manual summarization, in which a person creates an ideal summary by hand. Since this method requires human intervention in the process of preparing the summary, it takes a lot of time and cost to write the summary, and there is a limitation that the evaluation result may be different depending on the subject of the summarizer. Therefore, in order to overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. On the other hand, as a representative attempt to overcome these limitations, a method has been recently devised to reduce the size of the full text and to measure the similarity of the reduced full text and the automatic summary. In this method, the more frequent term in the full text appears in the summary, the better the quality of the summary. However, since summarization essentially means minimizing a lot of content while minimizing content omissions, it is unreasonable to say that a "good summary" based on only frequency always means a "good summary" in its essential meaning. In order to overcome the limitations of this previous study of summarization evaluation, this study proposes an automatic quality evaluation for text summarization method based on the essential meaning of summarization. Specifically, the concept of succinctness is defined as an element indicating how few duplicated contents among the sentences of the summary, and completeness is defined as an element that indicating how few of the contents are not included in the summary. In this paper, we propose a method for automatic quality evaluation of text summarization based on the concepts of succinctness and completeness. In order to evaluate the practical applicability of the proposed methodology, 29,671 sentences were extracted from TripAdvisor 's hotel reviews, summarized the reviews by each hotel and presented the results of the experiments conducted on evaluation of the quality of summaries in accordance to the proposed methodology. It also provides a way to integrate the completeness and succinctness in the trade-off relationship into the F-Score, and propose a method to perform the optimal summarization by changing the threshold of the sentence similarity.

Quantitative Microbial Risk Assessment Model for Staphylococcus aureus in Kimbab (김밥에서의 Staphylococcus aureus에 대한 정량적 미생물위해평가 모델 개발)

  • Bahk, Gyung-Jin;Oh, Deog-Hwan;Ha, Sang-Do;Park, Ki-Hwan;Joung, Myung-Sub;Chun, Suk-Jo;Park, Jong-Seok;Woo, Gun-Jo;Hong, Chong-Hae
    • Korean Journal of Food Science and Technology
    • /
    • v.37 no.3
    • /
    • pp.484-491
    • /
    • 2005
  • Quantitative microbial risk assessment (QMRA) analyzes potential hazard of microorganisms on public health and offers structured approach to assess risks associated with microorganisms in foods. This paper addresses specific risk management questions associated with Staphylococcus aureus in kimbab and improvement and dissemination of QMRA methodology, QMRA model was developed by constructing four nodes from retail to table pathway. Predictive microbial growth model and survey data were combined with probabilistic modeling to simulate levels of S. aureus in kimbab at time of consumption, Due to lack of dose-response models, final level of S. aureus in kimbeb was used as proxy for potential hazard level, based on which possibility of contamination over this level and consumption level of S. aureus through kimbab were estimated as 30.7% and 3.67 log cfu/g, respectively. Regression sensitivity results showed time-temperature during storage at selling was the most significant factor. These results suggested temperature control under $10^{\circ}C$ was critical control point for kimbab production to prevent growth of S. aureus and showed QMRA was useful for evaluation of factors influencing potential risk and could be applied directly to risk management.

Structural Relationships Among Factors to Adoption of Telehealth Service (원격의료서비스 수용요인의 구조적 관계 실증연구)

  • Kim, Sung-Soo;Ryu, See-Won
    • Asia pacific journal of information systems
    • /
    • v.21 no.3
    • /
    • pp.71-96
    • /
    • 2011
  • Within the traditional medical delivery system, patients residing in medically vulnerable areas, those with body movement difficulties, and nursing facility residents have had limited access to good healthcare services. However, Information and Communication Technology (ICT) provides us with a convenient and useful means of overcoming distance and time constraints. ICT is integrated with biomedical science and technology in a way that offers a new high-quality medical service. As a result, rapid technological advancement is expected to play a pivotal role bringing about innovation in a wide range of medical service areas, such as medical management, testing, diagnosis, and treatment; offering new and improved healthcare services; and effecting dramatic changes in current medical services. The increase in aging population and chronic diseases has caused an increase in medical expenses. In response to the increasing demand for efficient healthcare services, a telehealth service based on ICT is being emphasized on a global level. Telehealth services have been implemented especially in pilot projects and system development and technological research. With the service about to be implemented in earnest, it is necessary to study its overall acceptance by consumers, which is expected to contribute to the development and activation of a variety of services. In this sense, the study aims at positively examining the structural relationship among the acceptance factors for telehealth services based on the Technology Acceptance Model (TAM). Data were collected by showing audiovisual material on telehealth services to online panels and requesting them to respond to a structured questionnaire sheet, which is known as the information acceleration method. Among the 1,165 adult respondents, 608 valid samples were finally chosen, while the remaining were excluded because of incomplete answers or allotted time overrun. In order to test the reliability and validity of the assessment scale items, we carried out reliability and factor analyses, and in order to explore the causal relation among potential variables, we conducted a structural equation modeling analysis using AMOS 7.0 and SPSS 17.0. The research outcomes are as follows. First, service quality, innovativeness of medical technology, and social influence were shown to affect perceived ease of use and perceived usefulness of the telehealth service, which was statistically significant, and the two factors had a positive impact on willingness to accept the telehealth service. In addition, social influence had a direct, significant effect on intention to use, which is paralleled by the TAM used in previous research on technology acceptance. This shows that the research model proposed in the study effectively explains the acceptance of the telehealth service. Second, the research model reveals that information privacy concerns had a insignificant impact on perceived ease of use of the telehealth service. From this, it can be gathered that the concerns over information protection and security are reduced further due to advancements in information technology compared to the initial period in the information technology industry, and thus the improvement in quality of medical services appeared to ensure that information privacy concerns did not act as a prohibiting factor in the acceptance of the telehealth service. Thus, if other factors have an enormous impact on ease of use and usefulness, concerns over these results in the initial period of technology acceptance may become irrelevant. However, it is clear that users' information privacy concerns, as other studies have revealed, is a major factor affecting technology acceptance. Thus, caution must be exercised while interpreting the result, and further study is required on the issue. Numerous information technologies with outstanding performance and innovativeness often attract few consumers. A revised bill for those urgently in need of telehealth services is about to be approved in the national assembly. As telemedicine is implemented between doctors and patients, a wide range of systems that will improve the quality of healthcare services will be designed. In this sense, the study on the consumer acceptance of telehealth services is meaningful and offers strong academic evidence. Based on the implications, it can be expected to contribute to the activation of telehealth services. Further study is needed to assess the acceptance factors for telehealth services, such as motivation to remain healthy, health care involvement, knowledge on health, and control of health-related behavior, in order to develop unique services according to the categorization of customers based on health factors. In addition, further study may focus on various theoretical cognitive behavior models other than the TAM, such as the health belief model.

Mapping Categories of Heterogeneous Sources Using Text Analytics (텍스트 분석을 통한 이종 매체 카테고리 다중 매핑 방법론)

  • Kim, Dasom;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.193-215
    • /
    • 2016
  • In recent years, the proliferation of diverse social networking services has led users to use many mediums simultaneously depending on their individual purpose and taste. Besides, while collecting information about particular themes, they usually employ various mediums such as social networking services, Internet news, and blogs. However, in terms of management, each document circulated through diverse mediums is placed in different categories on the basis of each source's policy and standards, hindering any attempt to conduct research on a specific category across different kinds of sources. For example, documents containing content on "Application for a foreign travel" can be classified into "Information Technology," "Travel," or "Life and Culture" according to the peculiar standard of each source. Likewise, with different viewpoints of definition and levels of specification for each source, similar categories can be named and structured differently in accordance with each source. To overcome these limitations, this study proposes a plan for conducting category mapping between different sources with various mediums while maintaining the existing category system of the medium as it is. Specifically, by re-classifying individual documents from the viewpoint of diverse sources and storing the result of such a classification as extra attributes, this study proposes a logical layer by which users can search for a specific document from multiple heterogeneous sources with different category names as if they belong to the same source. Besides, by collecting 6,000 articles of news from two Internet news portals, experiments were conducted to compare accuracy among sources, supervised learning and semi-supervised learning, and homogeneous and heterogeneous learning data. It is particularly interesting that in some categories, classifying accuracy of semi-supervised learning using heterogeneous learning data proved to be higher than that of supervised learning and semi-supervised learning, which used homogeneous learning data. This study has the following significances. First, it proposes a logical plan for establishing a system to integrate and manage all the heterogeneous mediums in different classifying systems while maintaining the existing physical classifying system as it is. This study's results particularly exhibit very different classifying accuracies in accordance with the heterogeneity of learning data; this is expected to spur further studies for enhancing the performance of the proposed methodology through the analysis of characteristics by category. In addition, with an increasing demand for search, collection, and analysis of documents from diverse mediums, the scope of the Internet search is not restricted to one medium. However, since each medium has a different categorical structure and name, it is actually very difficult to search for a specific category insofar as encompassing heterogeneous mediums. The proposed methodology is also significant for presenting a plan that enquires into all the documents regarding the standards of the relevant sites' categorical classification when the users select the desired site, while maintaining the existing site's characteristics and structure as it is. This study's proposed methodology needs to be further complemented in the following aspects. First, though only an indirect comparison and evaluation was made on the performance of this proposed methodology, future studies would need to conduct more direct tests on its accuracy. That is, after re-classifying documents of the object source on the basis of the categorical system of the existing source, the extent to which the classification was accurate needs to be verified through evaluation by actual users. In addition, the accuracy in classification needs to be increased by making the methodology more sophisticated. Furthermore, an understanding is required that the characteristics of some categories that showed a rather higher classifying accuracy of heterogeneous semi-supervised learning than that of supervised learning might assist in obtaining heterogeneous documents from diverse mediums and seeking plans that enhance the accuracy of document classification through its usage.

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.3
    • /
    • pp.143-163
    • /
    • 2016
  • The demographics of Internet users are the most basic and important sources for target marketing or personalized advertisements on the digital marketing channels which include email, mobile, and social media. However, it gradually has become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although the marketing department is able to get the demographics using online or offline surveys, these approaches are very expensive, long processes, and likely to include false statements. Clickstream data is the recording an Internet user leaves behind while visiting websites. As the user clicks anywhere in the webpage, the activity is logged in semi-structured website log files. Such data allows us to see what pages users visited, how long they stayed there, how often they visited, when they usually visited, which site they prefer, what keywords they used to find the site, whether they purchased any, and so forth. For such a reason, some researchers tried to guess the demographics of Internet users by using their clickstream data. They derived various independent variables likely to be correlated to the demographics. The variables include search keyword, frequency and intensity for time, day and month, variety of websites visited, text information for web pages visited, etc. The demographic attributes to predict are also diverse according to the paper, and cover gender, age, job, location, income, education, marital status, presence of children. A variety of data mining methods, such as LSA, SVM, decision tree, neural network, logistic regression, and k-nearest neighbors, were used for prediction model building. However, this research has not yet identified which data mining method is appropriate to predict each demographic variable. Moreover, it is required to review independent variables studied so far and combine them as needed, and evaluate them for building the best prediction model. The objective of this study is to choose clickstream attributes mostly likely to be correlated to the demographics from the results of previous research, and then to identify which data mining method is fitting to predict each demographic attribute. Among the demographic attributes, this paper focus on predicting gender, age, marital status, residence, and job. And from the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of predictive model building is compose of 4 steps. In the first step, we create user profiles which include 64 clickstream attributes and 5 demographic attributes. The second step performs the dimension reduction of clickstream variables to solve the curse of dimensionality and overfitting problem. We utilize three approaches which are based on decision tree, PCA, and cluster analysis. We build alternative predictive models for each demographic variable in the third step. SVM, neural network, and logistic regression are used for modeling. The last step evaluates the alternative models in view of model accuracy and selects the best model. For the experiments, we used clickstream data which represents 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for our prediction process, and the 5-fold cross validation was conducted to enhance the reliability of our experiments. As the experimental results, we can verify that there are a specific data mining method well-suited for each demographic variable. For example, age prediction is best performed when using the decision tree based dimension reduction and neural network whereas the prediction of gender and marital status is the most accurate by applying SVM without dimension reduction. We conclude that the online behaviors of the Internet users, captured from the clickstream data analysis, could be well used to predict their demographics, thereby being utilized to the digital marketing.

The Determination of Trust in Franchisor-Franchisee Relationships in China (중국 프랜차이즈 시스템에서의 본부와 가맹점간 신뢰의 영향요인)

  • Shin, Geon-Cheol;Ma, Yaokun
    • Journal of Global Scholars of Marketing Science
    • /
    • v.18 no.2
    • /
    • pp.65-88
    • /
    • 2008
  • Since the implementation of economic reforms in 1978, the Chinese economy grows rapidly at an average annul growth rate of 9% over the post two decades. Franchising has been widely recognized as an important source of entrepreneurial activity. Trust is important in that it facilitates relational exchanges by permits partners to transcend short-run inequities or risks to concentrate on long-term profits or gains. In the relationship between the franchisors and franchisees, trust has been described as an important source of competitive advantage. However, little research has been done on the factors affecting trust in Chinese franchisor-franchisee relationships. The purpose of this study is to investigate what factors affect the trust in the franchise system in China, and to provide guidelines and insights to franchisors which enter Chinese market. In this study, according to Morgan and Hunt (1994), trust is defined as the extending when one party has confidence in an exchange partner's reliability and integrity. We offered a conceptual model of the empirical study. The model shows that the factors affecting the trust include franchisor's supports, communication, satisfaction with previous outcome and conflict. We also suggested the franchisor's supports and communication like to enhance the franchisee's satisfaction with previous outcome, and the franchisor's supports, communication and he franchisee's satisfaction with previous outcome tend to decrease conflict. Before the formal study, a pretest involving exploratory interviews with owners from three franchisees was conducted to make sure the questionnaire was relevant and clear to the respondents. The data were collected using trained interviewers to carry out personal interviews with the aid of an unidentified, muti-page, structured questionnaire. The respondents comprised of owners, managers, and owner managers of franchisee-owned food service franchises located in Beijing, China. Even though a total of 256 potential franchises were initially contacted, the finally usable sample consisted of 125 respondents. As expected, the sampling method was successful in soliciting respondents with waried personal and firm characteristics. Self-administrated questionnaires were used for all measures. And established scales were used to measure the latent constructs in this study. The measures tapped the franchisees' perceptions of the relationship with the referent franchisor. Five-point Likert-type scales ranging from "strongly disagree" (=1) to "strongly agree" (=7) were used throughout the constructs (trust, eight items; support, five items; communication, four items; satisfaction, six items; conflict, three items). The reliability measurements traditionally employed, such as the Cronbach's alpha, were used. All the reliabilities were greater than.80. The proposed measurement model was estimated using SPSS 12.0 and AMOS 5.0 analysis package. We conducted A series of exploratory factor analyses and confirmatory factor analyses to assess the convergent validity, discriminant validity, and reliability. The results indicate reasonable overall fits between the model and the observed data. The overall fit of measurement model were $X^2$= 159.699, p=0.004, d.f. = 116, GFI =.879, NFI =.898, CFI =.969, IFI =.970, TLI =.959, RMR =.058. The results demonstrated that the data reasonably fitted the model. We also examined construct reliability and reliability and average variance extracted (AVE). The construct reliability of each construct was greater than.80 and the AVE of each construct was greater than.50. According to the analysis of Structure Equation Modeling (SEM), the results of path model indicated an adequate fit of the model: $X^2$= 142.126, p = 0.044, d.f. = 115, GFI =.892, NFI =.909, CFI =.981, IFI =.981, TLI =.974, RMR =.057. As hypothesized, the results showed that it is strategically important to establish trust in a franchise system, and the franchisor's supports, communication and satisfaction with previous outcome tend to reinforce franchisee's trust. The results also showed trust seems to decrease as the experience of conflict episodes increases. And we also noticed that franchisor's supports and communication tend to enhance the franchisee's satisfaction with previous outcome, and communication tend to decrease conflict. If the trust between the franchisor and franchisee can be established in a franchise system, franchising offers many benefits and reduces many costs. To manage a mutual trust of relationship with their franchisees, franchisor's should provide support effectively to their franchisees. Effective assistant services have direct effect on franchisees' satisfaction with previous outcome and trust in franchisor. Especially, franchise sales process, orientation, and training in the start-up period are key elements for success of the franchise system. Franchisor's support is an accumulated separate satisfaction evaluation with different kind of service provided by the franchisor. And providing support definitely can improve the trustworthy image of the franchisor. In the franchise system, conflicts of interests and exertions of different power sources are very common. The experience of conflict episodes seems to negatively relate to trust. Therefore, it is important to reduce the negative side of the relationship conflicts. Communication actually plays a broader role in reducing conflict and establish mutual trust in franchisor-franchisee relationship. And effective communication between franchisors and franchisees can improve franchisees' satisfaction toward the franchise system. As the diversification of Chinese markets, both franchisors and franchisees must keep the relevant, timely, and reliable communication. And it is very important to improve the quality of communication. Satisfaction with precious outcomes seems to positively relate to trust. Franchisors and franchisees that are highly satisfied with the previous outcomes that flow from their relationship will perceive their partner as advancing their goal achievement. Therefore, it is necessary for both franchisor and their franchisees to make the welfare of partner with effort. Little literature has focused on what factors affect the trust between franchisors and their franchisees in China. This study developed the hypotheses regarding the factors affecting trust in the transaction relationship. The results of data analysis supported the hypotheses strongly. There are certain limitations in this study. First, we may point out that some other factors missed in this study could be significantly important. Second, the context of this study, food service industry, limits its potential generalizability for all franchise systems. More studies in different categories of franchise system are needed to broaden its generalizability. Third, the model was tested empirically in a sample in Beijing, more empirical tests of the proposed model in other Chinese areas are needed. Finally, the analysis in this study was solely based on the perception of franchisees and the opinions of franchisors were not included.

  • PDF

How Enduring Product Involvement and Perceived Risk Affect Consumers' Online Merchant Selection Process: The 'Required Trust Level' Perspective (지속적 관여도 및 인지된 위험이 소비자의 온라인 상인선택 프로세스에 미치는 영향에 관한 연구: 요구신뢰 수준 개념을 중심으로)

  • Hong, Il-Yoo B.;Lee, Jung-Min;Cho, Hwi-Hyung
    • Asia pacific journal of information systems
    • /
    • v.22 no.1
    • /
    • pp.29-52
    • /
    • 2012
  • Consumers differ in the way they make a purchase. An audio mania would willingly make a bold, yet serious, decision to buy a top-of-the-line home theater system, while he is not interested in replacing his two-decade-old shabby car. On the contrary, an automobile enthusiast wouldn't mind spending forty thousand dollars to buy a new Jaguar convertible, yet cares little about his junky component system. It is product involvement that helps us explain such differences among individuals in the purchase style. Product involvement refers to the extent to which a product is perceived to be important to a consumer (Zaichkowsky, 2001). Product involvement is an important factor that strongly influences consumer's purchase decision-making process, and thus has been of prime interest to consumer behavior researchers. Furthermore, researchers found that involvement is closely related to perceived risk (Dholakia, 2001). While abundant research exists addressing how product involvement relates to overall perceived risk, little attention has been paid to the relationship between involvement and different types of perceived risk in an electronic commerce setting. Given that perceived risk can be a substantial barrier to the online purchase (Jarvenpaa, 2000), research addressing such an issue will offer useful implications on what specific types of perceived risk an online firm should focus on mitigating if it is to increase sales to a fullest potential. Meanwhile, past research has focused on such consumer responses as information search and dissemination as a consequence of involvement, neglecting other behavioral responses like online merchant selection. For one example, will a consumer seriously considering the purchase of a pricey Guzzi bag perceive a great degree of risk associated with online buying and therefore choose to buy it from a digital storefront rather than from an online marketplace to mitigate risk? Will a consumer require greater trust on the part of the online merchant when the perceived risk of online buying is rather high? We intend to find answers to these research questions through an empirical study. This paper explores the impact of enduring product involvement and perceived risks on required trust level, and further on online merchant choice. For the purpose of the research, five types or components of perceived risk are taken into consideration, including financial, performance, delivery, psychological, and social risks. A research model has been built around the constructs under consideration, and 12 hypotheses have been developed based on the research model to examine the relationships between enduring involvement and five components of perceived risk, between five components of perceived risk and required trust level, between enduring involvement and required trust level, and finally between required trust level and preference toward an e-tailer. To attain our research objectives, we conducted an empirical analysis consisting of two phases of data collection: a pilot test and main survey. The pilot test was conducted using 25 college students to ensure that the questionnaire items are clear and straightforward. Then the main survey was conducted using 295 college students at a major university for nine days between December 13, 2010 and December 21, 2010. The measures employed to test the model included eight constructs: (1) enduring involvement, (2) financial risk, (3) performance risk, (4) delivery risk, (5) psychological risk, (6) social risk, (7) required trust level, (8) preference toward an e-tailer. The statistical package, SPSS 17.0, was used to test the internal consistency among the items within the individual measures. Based on the Cronbach's ${\alpha}$ coefficients of the individual measure, the reliability of all the variables is supported. Meanwhile, the Amos 18.0 package was employed to perform a confirmatory factor analysis designed to assess the unidimensionality of the measures. The goodness of fit for the measurement model was satisfied. Unidimensionality was tested using convergent, discriminant, and nomological validity. The statistical evidences proved that the three types of validity were all satisfied. Now the structured equation modeling technique was used to analyze the individual paths along the relationships among the research constructs. The results indicated that enduring involvement has significant positive relationships with all the five components of perceived risk, while only performance risk is significantly related to trust level required by consumers for purchase. It can be inferred from the findings that product performance problems are mostly likely to occur when a merchant behaves in an opportunistic manner. Positive relationships were also found between involvement and required trust level and between required trust level and online merchant choice. Enduring involvement is concerned with the pleasure a consumer derives from a product class and/or with the desire for knowledge for the product class, and thus is likely to motivate the consumer to look for ways of mitigating perceived risk by requiring a higher level of trust on the part of the online merchant. Likewise, a consumer requiring a high level of trust on the merchant will choose a digital storefront rather than an e-marketplace, since a digital storefront is believed to be trustworthier than an e-marketplace, as it fulfills orders by itself rather than acting as an intermediary. The findings of the present research provide both academic and practical implications. The first academic implication is that enduring product involvement is a strong motivator of consumer responses, especially the selection of a merchant, in the context of electronic shopping. Secondly, academicians are advised to pay attention to the finding that an individual component or type of perceived risk can be used as an important research construct, since it would allow one to pinpoint the specific types of risk that are influenced by antecedents or that influence consequents. Meanwhile, our research provides implications useful for online merchants (both online storefronts and e-marketplaces). Merchants may develop strategies to attract consumers by managing perceived performance risk involved in purchase decisions, since it was found to have significant positive relationship with the level of trust required by a consumer on the part of the merchant. One way to manage performance risk would be to thoroughly examine the product before shipping to ensure that it has no deficiencies or flaws. Secondly, digital storefronts are advised to focus on symbolic goods (e.g., cars, cell phones, fashion outfits, and handbags) in which consumers are relatively more involved than others, whereas e- marketplaces should put their emphasis on non-symbolic goods (e.g., drinks, books, MP3 players, and bike accessories).

  • PDF

An Analytical Approach Using Topic Mining for Improving the Service Quality of Hotels (호텔 산업의 서비스 품질 향상을 위한 토픽 마이닝 기반 분석 방법)

  • Moon, Hyun Sil;Sung, David;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.21-41
    • /
    • 2019
  • Thanks to the rapid development of information technologies, the data available on Internet have grown rapidly. In this era of big data, many studies have attempted to offer insights and express the effects of data analysis. In the tourism and hospitality industry, many firms and studies in the era of big data have paid attention to online reviews on social media because of their large influence over customers. As tourism is an information-intensive industry, the effect of these information networks on social media platforms is more remarkable compared to any other types of media. However, there are some limitations to the improvements in service quality that can be made based on opinions on social media platforms. Users on social media platforms represent their opinions as text, images, and so on. Raw data sets from these reviews are unstructured. Moreover, these data sets are too big to extract new information and hidden knowledge by human competences. To use them for business intelligence and analytics applications, proper big data techniques like Natural Language Processing and data mining techniques are needed. This study suggests an analytical approach to directly yield insights from these reviews to improve the service quality of hotels. Our proposed approach consists of topic mining to extract topics contained in the reviews and the decision tree modeling to explain the relationship between topics and ratings. Topic mining refers to a method for finding a group of words from a collection of documents that represents a document. Among several topic mining methods, we adopted the Latent Dirichlet Allocation algorithm, which is considered as the most universal algorithm. However, LDA is not enough to find insights that can improve service quality because it cannot find the relationship between topics and ratings. To overcome this limitation, we also use the Classification and Regression Tree method, which is a kind of decision tree technique. Through the CART method, we can find what topics are related to positive or negative ratings of a hotel and visualize the results. Therefore, this study aims to investigate the representation of an analytical approach for the improvement of hotel service quality from unstructured review data sets. Through experiments for four hotels in Hong Kong, we can find the strengths and weaknesses of services for each hotel and suggest improvements to aid in customer satisfaction. Especially from positive reviews, we find what these hotels should maintain for service quality. For example, compared with the other hotels, a hotel has a good location and room condition which are extracted from positive reviews for it. In contrast, we also find what they should modify in their services from negative reviews. For example, a hotel should improve room condition related to soundproof. These results mean that our approach is useful in finding some insights for the service quality of hotels. That is, from the enormous size of review data, our approach can provide practical suggestions for hotel managers to improve their service quality. In the past, studies for improving service quality relied on surveys or interviews of customers. However, these methods are often costly and time consuming and the results may be biased by biased sampling or untrustworthy answers. The proposed approach directly obtains honest feedback from customers' online reviews and draws some insights through a type of big data analysis. So it will be a more useful tool to overcome the limitations of surveys or interviews. Moreover, our approach easily obtains the service quality information of other hotels or services in the tourism industry because it needs only open online reviews and ratings as input data. Furthermore, the performance of our approach will be better if other structured and unstructured data sources are added.