• Title/Summary/Keyword: BLOGs

A Study on Analyzing Sentiments on Movie Reviews by Multi-Level Sentiment Classifier (영화 리뷰 감성분석을 위한 텍스트 마이닝 기반 감성 분류기 구축)

  • Kim, Yuyoung;Song, Min
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.71-89 / 2016
  • Sentiment analysis identifies the emotions or sentiments embedded in user-generated data such as customer reviews from blogs and social network services. Research fields such as computer science and business management can use it to analyze customer-generated opinions. Previous studies treated the star rating of a review as equivalent to the sentiment embedded in its text. However, the star rating does not always correspond to the sentiment polarity of the text, and this assumption limited the accuracy of those studies. To address this issue, the present study uses a supervised sentiment classification model to measure sentiment polarity more accurately. This study aims to propose an advanced sentiment classifier and to discover the correlation between movie reviews and box-office success. The classifier is based on two supervised machine learning techniques, the Support Vector Machine (SVM) and the Feedforward Neural Network (FNN). The sentiment scores of the movie reviews are measured by the classifier, and the statistical correlation between movie reviews and box-office success is then analyzed. Movie reviews were collected along with their star ratings. The dataset consists of 1,258,538 reviews of 175 films gathered from the Naver Movie website (movie.naver.com). The results show that the proposed sentiment classifier outperforms a Naive Bayes (NB) classifier, with accuracy about 6% higher. Furthermore, the results indicate a positive correlation between the star rating and the audience size, which can be regarded as the box-office success of a movie. The study also shows a mild positive correlation between the sentiment scores estimated by the classifier and the audience size. To verify the applicability of the sentiment scores, an independent-samples t-test was conducted.
For this, the movies were divided into two groups using the average sentiment score. The two groups differ significantly in their star ratings.
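
The group split and t-test described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say which t-test variant was used, so Welch's unequal-variance statistic is an assumption here, and the function names and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def split_by_mean_sentiment(movies):
    """Split (sentiment_score, star_rating) pairs into two lists of star
    ratings: movies below the average sentiment score, and the rest."""
    cutoff = mean(score for score, _ in movies)
    low = [rating for score, rating in movies if score < cutoff]
    high = [rating for score, rating in movies if score >= cutoff]
    return low, high


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical (sentiment score, star rating) pairs, one per movie.
movies = [(0.2, 6.1), (0.3, 6.5), (0.4, 6.0), (0.7, 8.2), (0.8, 8.6), (0.9, 8.0)]
low, high = split_by_mean_sentiment(movies)
t_stat = welch_t(high, low)
print(low, high, round(t_stat, 2))
```

A large positive statistic would indicate that the high-sentiment group also has higher star ratings; a p-value would additionally require the t distribution, which is left out of this sketch.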

Analyzing the Effect of Online media on Overseas Travels: A Case study of Asian 5 countries (해외 출국에 영향을 미치는 온라인 미디어 효과 분석: 아시아 5개국을 중심으로)

  • Lee, Hea In;Moon, Hyun Sil;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.53-74 / 2018
  • Because South Korea's economy depends heavily on overseas markets, tourism is considered a very important industry for the national economy: it improves the country's balance of payments and increases income and employment. Accordingly, the need for more accurate forecasting of tourism demand has been raised in order to promote the industry. Related research has used economic variables such as exchange rate and income as variables influencing tourism demand. As information technology has become widely used, some researchers have also analyzed the effect of media on tourism demand. These studies show that the media has a considerable influence on travelers' decision making, such as the choice of an outbound destination. Furthermore, with online information search and two-way communication in social media, travelers can obtain up-to-date travel information more quickly than before. Information in online media such as blogs can naturally create a word-of-mouth effect through the sharing of useful information, which is called eWOM. Like all other service industries, the tourism industry is characterized by the difficulty of evaluating its value before it is experienced directly. Moreover, most travelers tend to search for information in advance from various sources to reduce the perceived risk of the destination, so they can also be influenced by online media such as online news. In this study, we propose that the number of online media postings, which produces the word-of-mouth effect, may affect the number of outbound travelers. We divided online media into public and private media according to their characteristics, and selected online news as the public medium and blogs, one of the most popular social media for tourist information, as the private medium.
Based on previous studies of the eWOM effects of online news and blogs, we analyzed the relationship between the volume of eWOM and outbound tourism demand using a panel model. To this end, we collected data on the number of national outbound travelers from 2007 to 2015 provided by the Korea Tourism Organization. According to these statistics, the destinations with the highest outbound tourism demand from Korea are China, Japan, Thailand, Hong Kong, and the Philippines, and these five were selected for the dependent variable in this study. To measure the volume of eWOM, we collected online news and blog postings for the same period from Naver, the largest portal site in South Korea. A panel model was established to analyze the effect of online media on the demand of Korean outbound travelers and to determine whether the influence of online media differs significantly across time and countries. The results can be summarized as follows. First, the impact of online news and blog eWOM on the number of outbound travelers was significant. The number of online news and blog postings influences the number of outbound travelers; in particular, the experimental results suggest that postings in both the month of departure and the three months before departure have an effect. This shows that online news and blogs are online media with a significant influence on outbound tourism demand. Next, we found that an increased volume of eWOM in online news has a negative effect on departures, while an increase in blog eWOM has a positive effect. The country-specific models gave the same result. This paper shows that online media can be used as a new variable in tourism demand by examining the influence of the eWOM effect of online media.
Also, we found that both social media and news media have an important role in predicting and managing the Korean tourism demand and that the influence of those two media appears different depending on the country.
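
To make the idea of a panel model concrete, here is a minimal sketch of a fixed-effects (within) estimator with a single regressor. The abstract does not state whether a fixed-effects or random-effects specification was used, so the fixed-effects form y_it = a_i + b * x_it + e_it is an assumption, and the function name and all numbers below are hypothetical.

```python
from statistics import mean


def within_estimator(panel):
    """Estimate the slope b in y_it = a_i + b * x_it + e_it.

    `panel` maps each country to a list of (x, y) observations over time.
    Demeaning within each country removes the country-specific intercept a_i.
    """
    numerator = 0.0
    denominator = 0.0
    for observations in panel.values():
        x_bar = mean(x for x, _ in observations)
        y_bar = mean(y for _, y in observations)
        for x, y in observations:
            numerator += (x - x_bar) * (y - y_bar)
            denominator += (x - x_bar) ** 2
    return numerator / denominator


# Hypothetical data: x = blog postings, y = outbound travelers (arbitrary units).
panel = {
    "China": [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0)],
    "Japan": [(1.0, 7.0), (2.0, 9.0), (4.0, 13.0)],
}
slope = within_estimator(panel)
print(slope)
```

The two countries in the toy data have different baselines but the same slope, which is the situation a fixed-effects model is designed to handle.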

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

  • Wong, William Xiu Shun;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.21-44 / 2018
  • In recent years, the rapid development of Internet technology and the popularization of smart devices have produced massive amounts of text data. These text data are produced and distributed through various media platforms such as the World Wide Web, Internet news feeds, microblogs, and social media. However, this enormous amount of easily obtained information lacks organization, and the problem of managing it has attracted the interest of many researchers. It also calls for professionals capable of classifying relevant information, and hence text classification was introduced. Text classification, a challenging task in modern data analysis, assigns a text document to one or more predefined categories or classes. Various techniques are available in the field, such as K-Nearest Neighbor, Naïve Bayes, Support Vector Machine, Decision Tree, and Artificial Neural Network. However, when dealing with huge amounts of text data, model performance and accuracy become a challenge. The performance of a text classification model can vary with the type of words used in the corpus and the type of features created for classification. Most previous attempts have proposed a new algorithm or modified an existing one, and this line of research can be said to have reached its limits for further improvement. In this study, rather than proposing a new algorithm or modifying an existing one, we focus on finding a way to modify the use of data. It is widely known that classifier performance is influenced by the quality of the training data on which the classifier is built. Real-world datasets usually contain noise, and this noisy data can affect the decisions made by classifiers built from them.
In this study, we consider that data from different domains, that is, heterogeneous data, may have the characteristics of noise, which can be utilized in the classification process. Machine learning algorithms build a classifier on the assumption that the characteristics of the training data and the target data are the same or very similar. However, in the case of unstructured data such as text, the features are determined by the vocabulary included in the documents. If the viewpoints of the training data and the target data differ, the features of the two may also differ. In this study, we attempt to improve classification accuracy by strengthening the robustness of the document classifier through artificially injecting noise into the process of constructing it. Data coming from various kinds of sources are likely to be formatted differently. This causes difficulties for traditional machine learning algorithms, because they were not developed to recognize different types of data representation at one time and to combine them into the same generalization. Therefore, in order to utilize heterogeneous data in the learning process of the document classifier, we apply semi-supervised learning. However, unlabeled data may degrade the performance of the document classifier. We therefore propose a method called the Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA), which selects only the documents that contribute to improving the accuracy of the classifier. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data. The most confident classification rules are selected and applied for the final decision making.
In this paper, three different types of real-world data sources were used: news, Twitter, and blogs.

Issue tracking and voting rate prediction for 19th Korean president election candidates (댓글 분석을 통한 19대 한국 대선 후보 이슈 파악 및 득표율 예측)

  • Seo, Dae-Ho;Kim, Ji-Ho;Kim, Chang-Ki
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.199-219 / 2018
  • With the everyday use of the Internet and the spread of various smart devices, users have become able to communicate in real time, and the existing communication style has changed. As the Internet changed who produces information, data became more massive, giving rise to the very large body of information called big data. Big data is seen as a new opportunity to understand social issues. In particular, text mining explores patterns in unstructured text data to find meaningful information. Since text data exists in various places such as newspapers, books, and the web, it is diverse and large in volume, and is therefore suitable for understanding social reality. In recent years, there have been an increasing number of attempts to analyze texts from the web, such as SNS and blogs, where the public can communicate freely. Such analysis is recognized as a useful method for grasping public opinion immediately, so it can be used for research on political, social, and cultural issues. Text mining has received much attention as a way to investigate the public's view of candidates and to predict the voting rate in place of polling. This is because many people question the credibility of surveys, and people tend to refuse to answer or to withhold their real intentions when asked to respond to a poll. This study collected comments from the largest Internet portal site in Korea and conducted research on the 19th Korean presidential election in 2017. We collected 226,447 comments from April 29, 2017 to May 7, 2017, a span that includes the period just before election day in which public opinion polls are prohibited. We analyzed frequencies, associative emotional words, topic emotions, and candidate voting rates. Through frequency analysis, we identified the words that represented the most important issues each day. In particular, following the presidential debate, the candidate who became an issue ranked at the top of the frequency analysis.
Through the analysis of associative emotional words, we were able to identify the issues most relevant to each candidate. The topic emotion analysis was used to identify each candidate's topics and to express the public's emotions on those topics. Finally, we estimated the voting rate by combining the volume of comments and the sentiment score. In this way, we explored the issues for each candidate and predicted the voting rate. The analysis showed that news comments are an effective tool for tracking the issues of presidential candidates and for predicting the voting rate. In particular, this study presented daily issues and a quantitative index of sentiment. It also predicted the voting rate for each candidate and precisely matched the ranking of the top five candidates. Each candidate will be able to grasp public opinion objectively and reflect it in election strategy. Candidates can use positive issues more actively in their election strategies and try to correct negative issues. In particular, candidates should be aware that their reputation can be severely damaged if they face a moral problem. Voters can look objectively at the issues and public opinion about each candidate and make more informed decisions when voting. If they refer to the results of this study before voting, they will be able to see the opinions of the public in the big data and vote for a candidate from a more objective perspective. If candidates campaign with reference to big data analysis, the public will be more active on the web, recognizing that their wants are being reflected. Political views can be expressed in various places on the web, and this can contribute to political participation by the people.
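
The abstract says the voting rate was estimated by combining comment volume with a sentiment score, but it does not give the formula. The sketch below shows one possible combination, purely as an assumption: each candidate's score is the comment count multiplied by a mean sentiment in [0, 1], and shares are the scores normalized to sum to 1. The function name, candidate labels, and numbers are hypothetical.

```python
def estimate_vote_share(stats):
    """Turn per-candidate (comment_count, mean_sentiment) into vote shares.

    Assumed scoring rule: score = comment_count * mean_sentiment,
    then normalize the scores so that the shares sum to 1.
    """
    scores = {
        name: count * sentiment
        for name, (count, sentiment) in stats.items()
    }
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# Hypothetical candidates with (number of comments, mean sentiment in [0, 1]).
stats = {"A": (1000, 0.6), "B": (800, 0.5), "C": (500, 0.4)}
shares = estimate_vote_share(stats)
ranking = sorted(shares, key=shares.get, reverse=True)
print(ranking, shares)
```

Under this rule a candidate with many comments but mostly negative sentiment can rank below a less discussed but better received one, which is the reason for combining the two signals rather than using volume alone.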

A Qualitative Study on Facilitating Factors of User-Created Contents: Based on Theories of Folklore (사용자 제작 콘텐츠의 활성화 요인에 대한 정성적 연구: 구비문학 이론을 중심으로)

  • Jung, Seung-Ki;Lee, Ki-Ho;Lee, In-Seong;Kim, Jin-Woo
    • Asia Pacific Journal of Information Systems / v.19 no.2 / pp.43-72 / 2009
  • Recently, user-created content (UCC) has emerged as a popular medium of online participation among users. The Internet environment has been constantly evolving, attracting active participation and information sharing among ordinary users. This tendency is a significant departure from the earlier use of the Internet as a one-way information channel through which users passively received information or content from content providers. Thanks to UCC, online users can now generate and exchange content more freely; therefore, identifying the critical factors that affect content-generating activities has become an increasingly important issue. This paper proposes a set of critical factors for stimulating content generation and sharing activities by Internet users. These factors were derived from theories of folklore such as tales and songs. Based on some shared traits of folklore and UCC, we found four critical elements that should be heeded in constructing UCC: context of culture, context of situation, skill of generator, and response of audience. In addition, we selected three major UCC websites: a specialized content portal, a general Internet portal, and an official content service site. They have different use environments, user interfaces, and service policies. To identify critical factors for generating, sharing, and transferring UCC, we traced user activities, interactions, and flows of content in the three UCC websites. Moreover, we conducted extensive interviews with users and operators as well as policy makers at each site. Based on qualitative and quantitative analyses of the data, this research identifies nine critical factors that facilitate content generation and sharing activities among users.
In the context of culture, we suggest voluntary community norms, proactive use of copyrights, strong user relationships, and a fair monetary reward system as critical elements in facilitating content generation and sharing activities. Norms established by users themselves regulate user behavior and influence content format. Strong relationships among users stimulate content generation by enhancing collaborative content generation. In particular, users generate content through collaboration with others, based on their enhanced relationships and specialized skills. They send and receive content by leaving messages on websites or blogs, or by using instant messengers or SMS. This is an interesting and important phenomenon, because the quality of content can be constantly improved and revised, depending on the specialized abilities of those engaged with a particular piece of content. In this process, the reward system is an essential driving factor. Yet monetary reward should be considered only after some fair criterion is established. In terms of the context of situation, the quality of the content uploading system was proposed to have a strong influence on content-generating activities. Generators' specialized skills and the involvement of users were also proposed as influential factors. In addition, the audience response, especially the effective development of shared interests as well as feedback, was suggested to have a significant influence on content generation activities. Content generators usually reflect the shared interests of others. Shared interest is a distinct characteristic of UCC and was observed in all three websites, in which common interest is formed by the "threads" embedded with content. Through such threads of information and content, users discuss and share ideas while continuously extending and updating shared content in the process.
Evidently, UCC is a new paradigm representing the next generation of the Internet. To fully utilize this innovative paradigm, we need to understand how users take advantage of this medium in generating content and what affects their content generation activities. Based on these findings, UCC service providers should design their websites as a common playground where users freely interact and share their common interests. As such, this paper makes an important first step toward a better understanding of this new communication paradigm created by UCC.

consumers' purchasing behavior of functional cosmetics and Inula based functional cosmetics merchandising research (국내 소비자의 기능성화장품 구매행태 및 선복화 활용 기능성화장품 상품화를 위한 연구)

  • Han, Do-Kyung;Lee, Hyun-Jun;Lee, Eun-Hee;Paik, Hyun-Dong;Shin, Dong-Kyoo;Park, Dae-Sub;Hwang, Hye-Seon;Hong, Wan-Soo
    • Journal of the Korea Academia-Industrial Cooperation Society / v.17 no.8 / pp.236-250 / 2016
  • This study was conducted to provide baseline data on functional cosmetics so that Inula-based cosmetics can become more competitive in the market, and to understand current trends so that demand can be anticipated in future product development. For this research, general consumers over the age of 20 residing in Seoul and the Gyeonggi district were surveyed. The results show that consumers preferred serum-type products among the various types of cosmetics, and that they purchased these once every 1-3 months. Consumers also preferred these products in less than 10-30ml capacity, and at costs of less than 30,000-50,000 KRW. For whitening functional cosmetics, consumers also preferred the serum type, in less than 30-50ml capacity and priced at less than 30,000-50,000 KRW. Consumers preferred to purchase functional cosmetics in single units. The major purchasing location, with a high preference rate, was cosmetic stores, and the major sources of information, also with high preference rates, were 'experienced reviews from family, friends and acquaintances' and 'TV advertisements'. Respondents selected 'over 50,000 KRW' the most for all items when responding to 'Purchase Intent for Functional Cosmetics containing Inula', and responded that they were willing to pay 10%-30% more for functional cosmetics containing Inula than for standard functional cosmetics. These results show that businesses in the cosmetics industry need to take consumer demand into account when developing new functional cosmetic products, and to establish plans to create specialized spaces that provide better quality service and to increase the word-of-mouth effect through better utilization of various types of offline media, social media, and blogs.
The study also shows a need for businesses to develop products fully utilizing the Inula flower, which has been shown to be effective as a natural skin whitener, wrinkle reducer and skin moisturizer, to appeal to the increasing number of customers interested in health and beauty.

An Analysis of the Roles of Experience in Information System Continuance (정보시스템의 지속적 사용에서 경험의 역할에 대한 분석)

  • Lee, Woong-Kyu
    • Asia Pacific Journal of Information Systems / v.21 no.4 / pp.45-62 / 2011
  • The notion of information systems (IS) continuance has recently emerged as one of the most important research issues in the field of IS. A great deal of research has been conducted thus far on the basis of theories adapted from various disciplines including consumer behaviors and social psychology, in addition to theories regarding information technology (IT) acceptance. This previous body of knowledge provides a robust research framework that can already account for the determination of IS continuance; however, this research points to other, thus-far-unelucidated determinant factors such as habit, which were not included in traditional IT acceptance frameworks, and also re-emphasizes the importance of emotion-related constructs such as satisfaction in addition to conscious intention with rational beliefs such as usefulness. Experiences should also be considered one of the most important factors determining the characteristics of information system (IS) continuance and the features distinct from those determining IS acceptance, because more experienced users may have more opportunities for IS use, which would allow them more frequent use than would be available to less experienced or non-experienced users. Interestingly, experience has dual features that may contradictorily influence IS use. On one hand, attitudes predicated on direct experience have been shown to predict behavior better than attitudes from indirect experience or without experience; as more information is available, direct experience may render IS use a more salient behavior, and may also make IS use more accessible via memory. 
Therefore, experience may serve to intensify the relationship between IS use and conscious intention with evaluations. On the other hand, experience may culminate in the formation of habits: greater experience may also imply more frequent performance of the behavior, which may lead to the formation of habits. Hence, as experience grows, users' activation of an IS may be more dependent on habit (that is, unconscious automatic use without deliberation regarding the IS) and less dependent on conscious intentions. Furthermore, experiences can provide basic information necessary for satisfaction with the use of a specific IS, thus spurring the formation of both conscious intentions and unconscious habits. Whereas IT adoption is a one-time decision, IS continuance may be a series of users' decisions and evaluations based on satisfaction with IS use. Moreover, habits also cannot be formed without satisfaction, even when a behavior is carried out repeatedly. Thus, experiences also play a critical role in satisfaction, as satisfaction is the consequence of direct experiences of actual behaviors. In particular, emotional experiences such as enjoyment can become as influential on IS use as utilitarian experiences such as usefulness; this is especially true in light of the modern increase in membership-based hedonic systems, including online games, web-based social network services (SNS), blogs, and portals, all of which attempt to provide users with self-fulfilling value. Therefore, in order to understand more clearly the role of experiences in IS continuance, analysis must be conducted under a research framework that includes intentions, habits, and satisfaction, as experience may not only have duration-based moderating effects on the relationship between both intention and habit and the activation of IS use, but may also have content-based positive effects on satisfaction.
This is consistent with the basic assumptions regarding the determining factors in IS continuance as suggested by Ortiz de Guinea and Markus: consciousness, emotion, and habit. The principal objective of this study was to explore and assess the effects of experiences in IS continuance, with special consideration given to conscious intentions and unconscious habits, as well as satisfaction. In service of this goal, along with a review of the relevant literature regarding the effects of experiences and habit on continuous IS use, this study suggested a research model that represents the roles of experience: its moderating role in the relationships of IS continuance with both conscious intention and unconscious habit, and its antecedent role in the development of satisfaction. To validate this research model, Korean university student users of 'Cyworld', one of the most influential social network services in South Korea, were surveyed, and the data were analyzed via partial least squares (PLS) analysis. As a result, all but one of the hypotheses in our research model were statistically supported. Although one hypothesis was not supported, the study's findings provide some important implications. First, the role of experience in IS continuance differs from its role in IS acceptance. Second, the use of IS was explained by the dynamic balance between habit and intention. Third, the importance of satisfaction was confirmed from the perspective of IS continuance with experience.

Learning Material Bookmarking Service based on Collective Intelligence (집단지성 기반 학습자료 북마킹 서비스 시스템)

  • Jang, Jincheul;Jung, Sukhwan;Lee, Seulki;Jung, Chihoon;Yoon, Wan Chul;Yi, Mun Yong
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.179-192 / 2014
  • In line with recent changes in the information technology environment, online learning environments that support the participation of multiple users, such as MOOCs (Massive Open Online Courses), have become important. One of the largest professional associations in information technology, the IEEE Computer Society, announced that "Supporting New Learning Styles" is a crucial trend in 2014. Popular MOOC services such as Coursera and edX have continued to build active learning environments with a large number of lectures accessible anywhere using smart devices, and have been used by an increasing number of users. In addition, collaborative web services (e.g., blogs and Wikipedia) also support the creation of various user-uploaded learning materials, resulting in a vast amount of new lectures and learning materials being created every day in the online space. However, it is difficult for an online educational system to sustain a learner's motivation, because learning occurs remotely and the capability to share knowledge among learners is limited. Thus, it is essential to understand which materials each learner needs and how to motivate learners to participate actively in an online learning system. To overcome these issues, leveraging constructivism theory and collective intelligence, we have developed a social bookmarking system called WeStudy, which supports learning material sharing among users and provides personalized learning material recommendations. Constructivism theory argues that knowledge is constructed while learners interact with the world.
Collective intelligence can be separated into two types: (1) collaborative collective intelligence, which is built on direct collaboration among the participants (e.g., Wikipedia), and (2) integrative collective intelligence, which produces new forms of knowledge by combining independent and distributed information through highly advanced technologies and algorithms (e.g., Google PageRank, recommender systems). A recommender system, one example of integrative collective intelligence, uses the online activities of users to recommend what they may be interested in. Our system includes both collaborative and integrative collective intelligence functions. We analyzed well-known web services based on collective intelligence, such as Wikipedia, Slideshare, and Videolectures, to identify the main design factors that support collective intelligence. Based on this analysis, in addition to sharing online resources through social bookmarking, we selected three essential functions for our system: 1) multimodal visualization of learning materials in two forms (list and graph), 2) personalized recommendation of learning materials, and 3) explicit designation by learners of their interests. After developing the web-based WeStudy system, we conducted usability testing through the heuristic evaluation method with seven heuristic indices: features and functionality, cognitive page, navigation, search and filtering, control and feedback, forms, and context and text. We recruited 10 experts who majored in Human Computer Interaction and worked in the same field, and requested both quantitative and qualitative evaluation of the system. The evaluation results show that, relative to the other functions evaluated, the list/graph page produced higher scores on all indices except context and text. For context and text, the learning material page produced the best score among the functions.
In general, the explicit designation of learners of their interests, one of the distinctive functions, received lower scores on all usability indices because of its unfamiliar functionality to the users. In summary, the evaluation results show that our system has achieved high usability with good performance with some minor issues, which need to be fully addressed before the public release of the system to large-scale users. The study findings provide practical guidelines for the design and development of various systems that utilize collective intelligence.

Mapping Categories of Heterogeneous Sources Using Text Analytics (텍스트 분석을 통한 이종 매체 카테고리 다중 매핑 방법론)

  • Kim, Dasom;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.193-215
    • /
    • 2016
  • In recent years, the proliferation of diverse social networking services has led users to use many media simultaneously, depending on their individual purposes and tastes. In addition, when collecting information about particular themes, they usually employ various media such as social networking services, Internet news, and blogs. However, in terms of management, each document circulated through these media is placed in different categories on the basis of each source's policy and standards, hindering any attempt to conduct research on a specific category across different kinds of sources. For example, documents containing content on "Application for a foreign travel" can be classified into "Information Technology," "Travel," or "Life and Culture" according to the particular standard of each source. Likewise, because sources differ in how they define categories and how finely they specify them, similar categories can be named and structured differently in each source. To overcome these limitations, this study proposes a method for mapping categories between different sources across various media while maintaining the existing category system of each medium as it is. Specifically, by re-classifying individual documents from the viewpoint of diverse sources and storing the results of this classification as extra attributes, this study proposes a logical layer through which users can search for a specific document from multiple heterogeneous sources with different category names as if they belonged to the same source. In addition, 6,000 news articles were collected from two Internet news portals, and experiments were conducted to compare accuracy among sources, between supervised and semi-supervised learning, and between homogeneous and heterogeneous learning data.
It is particularly interesting that, in some categories, the classification accuracy of semi-supervised learning using heterogeneous learning data proved to be higher than that of supervised learning and semi-supervised learning using homogeneous learning data. This study is significant in the following respects. First, it proposes a logical plan for establishing a system to integrate and manage heterogeneous media with different classification systems while maintaining the existing physical classification systems as they are. In particular, the results exhibit very different classification accuracies depending on the heterogeneity of the learning data; this is expected to spur further studies on enhancing the performance of the proposed methodology through analysis of the characteristics of each category. In addition, with increasing demand for searching, collecting, and analyzing documents from diverse media, the scope of Internet search is no longer restricted to one medium. However, since each medium has a different category structure and naming, it is very difficult to search for a specific category across heterogeneous media. The proposed methodology is also significant in that, when users select a desired site, it lets them query all documents according to that site's category standards, while maintaining the existing sites' characteristics and structures as they are. The proposed methodology needs to be further complemented in the following aspects. First, because only an indirect comparison and evaluation of the proposed methodology's performance was made, future studies need to test its accuracy more directly. That is, after re-classifying documents of the object source on the basis of the category system of the existing source, the accuracy of the classification needs to be verified through evaluation by actual users.
In addition, classification accuracy needs to be increased by making the methodology more sophisticated. Furthermore, understanding the characteristics of the categories in which heterogeneous semi-supervised learning showed higher classification accuracy than supervised learning might help in obtaining heterogeneous documents from diverse media and in devising ways to use them to enhance the accuracy of document classification.
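
The logical layer described in this abstract (re-classify each document from the viewpoint of every source, store the results as extra attributes, then search across sources by one source's category names) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the source names `portal_a` and `portal_b`, the sample texts, and the function names are hypothetical, and the word-overlap classifier is a toy stand-in, not the supervised or semi-supervised learners evaluated in the study.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (toy stand-in for real preprocessing)."""
    return text.lower().split()


def train_profiles(labeled_docs):
    """Build one word-count profile per category from (text, category) pairs."""
    profiles = defaultdict(Counter)
    for text, category in labeled_docs:
        profiles[category].update(tokenize(text))
    return dict(profiles)


def classify(text, profiles):
    """Pick the category whose profile overlaps most with the text.

    Ties are broken alphabetically because categories are scanned in sorted order.
    """
    words = tokenize(text)
    return max(sorted(profiles), key=lambda c: sum(profiles[c][w] for w in words))


def add_mapped_categories(documents, profiles_by_source):
    """Store, as an extra attribute, each document's category under every source.

    The original "category" field is left untouched; documents are updated in place.
    """
    for doc in documents:
        doc["mapped"] = {
            source: classify(doc["text"], profiles)
            for source, profiles in profiles_by_source.items()
        }
    return documents


def search(documents, source, category):
    """Return documents from any source that fall under `category` of `source`."""
    return [d for d in documents if d["mapped"][source] == category]


# Hypothetical training data: each source labels documents with its own categories.
profiles_by_source = {
    "portal_a": train_profiles([
        ("flight booking app release", "IT"),
        ("smartphone app update", "IT"),
        ("beach resort holiday tips", "Travel"),
        ("hotel and flight deals", "Travel"),
    ]),
    "portal_b": train_profiles([
        ("mobile app for booking flight tickets", "Life"),
        ("family holiday resort guide", "Life"),
        ("chip maker earnings report", "Economy"),
        ("stock market earnings", "Economy"),
    ]),
}

docs = add_mapped_categories(
    [
        {"id": 1, "source": "portal_a", "category": "IT",
         "text": "new app update for smartphone"},
        {"id": 2, "source": "portal_b", "category": "Life",
         "text": "holiday resort guide"},
    ],
    profiles_by_source,
)

# Query with portal_a's category names, regardless of where a document came from.
travel_ids = [d["id"] for d in search(docs, "portal_a", "Travel")]
life_ids = [d["id"] for d in search(docs, "portal_b", "Life")]
print(travel_ids, life_ids)
```

Keeping the mapped categories in a separate attribute, rather than overwriting the original one, mirrors the abstract's point that each source's existing category system is maintained as it is.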

Case Study on the Enterprise Microblog Usage: Focusing on Knowledge Management Strategy (기업용 마이크로블로그의 사용행태에 대한 사례연구: 지식경영전략을 중심으로)

  • Kang, Min Su;Park, Arum;Lee, Kyoung-Jun
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.47-63
    • /
    • 2015
  • As knowledge gains attention as a new production factor that generates added value, studies continue to apply knowledge management to the business environment. In addition, the adoption of ICT (Information and Communication Technology) in the business environment has increased the task efficiency and productivity of individual workers. Accordingly, the way a business achieves its goals has changed to one in which individual members willingly take part in the organization and share information to create new value (Han, 2003), and studies on systems and services to support this transition are being carried out. Recently, a new concept called 'Enterprise 2.0' has appeared. It extends Web 2.0 and its technologies, which focus on participation, sharing, and openness, to the work environment of a business (Jung, 2013). Enterprise 2.0 is used as a collaborative tool to support individual creativity and group brain power by combining Web 2.0 technologies such as blogs, wikis, RSS, and tags with business software (McAfee, 2006). As Twitter became popular, the Enterprise Microblog (EMB), an example of Enterprise 2.0, was developed as a business equivalent of Twitter, and SaaS (Software as a Service) offerings such as Yammer were introduced. Studies of EMB mainly focus on demonstrating its usability in terms of intra-firm communication and knowledge management. However, existing studies lean too much towards large companies and certain departments, rather than a company as a whole. As a result, few studies have been conducted on small and medium-sized companies, which have difficulty preparing separate resources and dedicated staff to introduce knowledge management. In this respect, the present study focused its analysis on small companies that actually use EMB in order to understand how they use it.
Based on the findings, this study examined their knowledge management strategies for EMB from the perspectives of codification and personalization. The hypothesis, "as a company grows, it shifts its EMB strategy from codification to personalization," was established on the basis of a review of previous studies and literature. To test the hypothesis, this study analyzed the usage of EMB by small companies that have used it since their foundation. For the case study, the duration of use was divided into two spans, and longitudinal analysis was employed to examine the contents of the blogs. Using the key findings of the analysis, this study aims to propose practical implications for the operation of knowledge management in small companies and for the suitable application of a knowledge management system. Knowledge management strategy can be classified into codification strategy and personalization strategy (Hansen et al., 1999), and how to manage the two strategies has long been studied. Also, current studies of knowledge management strategy have mostly targeted major companies, resulting in a lack of studies on how it can be applied to SMEs. This research takes the Enterprise Microblog (EMB) as a knowledge management tool suited to SMEs and reviews its application to SMEs' knowledge management from the perspectives of codification and personalization strategies. Through a review of previous research on knowledge management strategy and EMB, the hypothesis is set that "depending on the development of the company, the main application of EMB shifts from codification strategy to personalization strategy". To check the hypothesis, an SME that has used the EMB called 'Yammer' was analyzed from the date of its foundation until today. The case study used longitudinal analysis, which divides the period of EMB use into three stages and analyzes the contents.
The results of the study suggest substantial implications for applying a knowledge management strategy, and a corresponding knowledge management system, suitable for SMEs.