• Title/Summary/Keyword: 지능정보 기반

Search Result 4,487, Processing Time 0.03 seconds

Investigating Dynamic Mutation Process of Issues Using Unstructured Text Analysis (비정형 텍스트 분석을 활용한 이슈의 동적 변이과정 고찰)

  • Lim, Myungsu;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.1-18
    • /
    • 2016
  • Owing to the extensive use of Web media and the development of the IT industry, a large amount of data has been generated, shared, and stored. Nowadays, various types of unstructured data such as image, sound, video, and text are distributed through Web media. Therefore, many attempts have been made in recent years to discover new value through an analysis of these unstructured data. Among these types of unstructured data, text is recognized as the most representative method for users to express and share their opinions on the Web. In this sense, demand for obtaining new insights through text analysis is steadily increasing. Accordingly, text mining is increasingly being used for different purposes in various fields. In particular, issue tracking is being widely studied not only in the academic world but also in industries because it can be used to extract various issues from text such as news, (SocialNetworkServices) to analyze the trends of these issues. Conventionally, issue tracking is used to identify major issues sustained over a long period of time through topic modeling and to analyze the detailed distribution of documents involved in each issue. However, because conventional issue tracking assumes that the content composing each issue does not change throughout the entire tracking period, it cannot represent the dynamic mutation process of detailed issues that can be created, merged, divided, and deleted between these periods. Moreover, because only keywords that appear consistently throughout the entire period can be derived as issue keywords, concrete issue keywords such as "nuclear test" and "separated families" may be concealed by more general issue keywords such as "North Korea" in an analysis over a long period of time. This implies that many meaningful but short-lived issues cannot be discovered by conventional issue tracking. Note that detailed keywords are preferable to general keywords because the former can be clues for providing actionable strategies. To overcome these limitations, we performed an independent analysis on the documents of each detailed period. We generated an issue flow diagram based on the similarity of each issue between two consecutive periods. The issue transition pattern among categories was analyzed by using the category information of each document. In this study, we then applied the proposed methodology to a real case of 53,739 news articles. We derived an issue flow diagram from the articles. We then proposed the following useful application scenarios for the issue flow diagram presented in the experiment section. First, we can identify an issue that actively appears during a certain period and promptly disappears in the next period. Second, the preceding and following issues of a particular issue can be easily discovered from the issue flow diagram. This implies that our methodology can be used to discover the association between inter-period issues. Finally, an interesting pattern of one-way and two-way transitions was discovered by analyzing the transition patterns of issues through category analysis. Thus, we discovered that a pair of mutually similar categories induces two-way transitions. In contrast, one-way transitions can be recognized as an indicator that issues in a certain category tend to be influenced by other issues in another category. For practical application of the proposed methodology, high-quality word and stop word dictionaries need to be constructed. In addition, not only the number of documents but also additional meta-information such as the read counts, written time, and comments of documents should be analyzed. A rigorous performance evaluation or validation of the proposed methodology should be performed in future works.

Development of Market Growth Pattern Map Based on Growth Model and Self-organizing Map Algorithm: Focusing on ICT products (자기조직화 지도를 활용한 성장모형 기반의 시장 성장패턴 지도 구축: ICT제품을 중심으로)

  • Park, Do-Hyung;Chung, Jaekwon;Chung, Yeo Jin;Lee, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.1-23
    • /
    • 2014
  • Market forecasting aims to estimate the sales volume of a product or service that is sold to consumers for a specific selling period. From the perspective of the enterprise, accurate market forecasting assists in determining the timing of new product introduction, product design, and establishing production plans and marketing strategies that enable a more efficient decision-making process. Moreover, accurate market forecasting enables governments to efficiently establish a national budget organization. This study aims to generate a market growth curve for ICT (information and communication technology) goods using past time series data; categorize products showing similar growth patterns; understand markets in the industry; and forecast the future outlook of such products. This study suggests the useful and meaningful process (or methodology) to identify the market growth pattern with quantitative growth model and data mining algorithm. The study employs the following methodology. At the first stage, past time series data are collected based on the target products or services of categorized industry. The data, such as the volume of sales and domestic consumption for a specific product or service, are collected from the relevant government ministry, the National Statistical Office, and other relevant government organizations. For collected data that may not be analyzed due to the lack of past data and the alteration of code names, data pre-processing work should be performed. At the second stage of this process, an optimal model for market forecasting should be selected. This model can be varied on the basis of the characteristics of each categorized industry. As this study is focused on the ICT industry, which has more frequent new technology appearances resulting in changes of the market structure, Logistic model, Gompertz model, and Bass model are selected. A hybrid model that combines different models can also be considered. The hybrid model considered for use in this study analyzes the size of the market potential through the Logistic and Gompertz models, and then the figures are used for the Bass model. The third stage of this process is to evaluate which model most accurately explains the data. In order to do this, the parameter should be estimated on the basis of the collected past time series data to generate the models' predictive value and calculate the root-mean squared error (RMSE). The model that shows the lowest average RMSE value for every product type is considered as the best model. At the fourth stage of this process, based on the estimated parameter value generated by the best model, a market growth pattern map is constructed with self-organizing map algorithm. A self-organizing map is learning with market pattern parameters for all products or services as input data, and the products or services are organized into an $N{\times}N$ map. The number of clusters increase from 2 to M, depending on the characteristics of the nodes on the map. The clusters are divided into zones, and the clusters with the ability to provide the most meaningful explanation are selected. Based on the final selection of clusters, the boundaries between the nodes are selected and, ultimately, the market growth pattern map is completed. The last step is to determine the final characteristics of the clusters as well as the market growth curve. The average of the market growth pattern parameters in the clusters is taken to be a representative figure. Using this figure, a growth curve is drawn for each cluster, and their characteristics are analyzed. Also, taking into consideration the product types in each cluster, their characteristics can be qualitatively generated. We expect that the process and system that this paper suggests can be used as a tool for forecasting demand in the ICT and other industries.

An Efficient Estimation of Place Brand Image Power Based on Text Mining Technology (텍스트마이닝 기반의 효율적인 장소 브랜드 이미지 강도 측정 방법)

  • Choi, Sukjae;Jeon, Jongshik;Subrata, Biswas;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.2
    • /
    • pp.113-129
    • /
    • 2015
  • Location branding is a very important income making activity, by giving special meanings to a specific location while producing identity and communal value which are based around the understanding of a place's location branding concept methodology. Many other areas, such as marketing, architecture, and city construction, exert an influence creating an impressive brand image. A place brand which shows great recognition to both native people of S. Korea and foreigners creates significant economic effects. There has been research on creating a strategically and detailed place brand image, and the representative research has been carried out by Anholt who surveyed two million people from 50 different countries. However, the investigation, including survey research, required a great deal of effort from the workforce and required significant expense. As a result, there is a need to make more affordable, objective and effective research methods. The purpose of this paper is to find a way to measure the intensity of the image of the brand objective and at a low cost through text mining purposes. The proposed method extracts the keyword and the factors constructing the location brand image from the related web documents. In this way, we can measure the brand image intensity of the specific location. The performance of the proposed methodology was verified through comparison with Anholt's 50 city image consistency index ranking around the world. Four methods are applied to the test. First, RNADOM method artificially ranks the cities included in the experiment. HUMAN method firstly makes a questionnaire and selects 9 volunteers who are well acquainted with brand management and at the same time cities to evaluate. Then they are requested to rank the cities and compared with the Anholt's evaluation results. TM method applies the proposed method to evaluate the cities with all evaluation criteria. TM-LEARN, which is the extended method of TM, selects significant evaluation items from the items in every criterion. Then the method evaluates the cities with all selected evaluation criteria. RMSE is used to as a metric to compare the evaluation results. Experimental results suggested by this paper's methodology are as follows: Firstly, compared to the evaluation method that targets ordinary people, this method appeared to be more accurate. Secondly, compared to the traditional survey method, the time and the cost are much less because in this research we used automated means. Thirdly, this proposed methodology is very timely because it can be evaluated from time to time. Fourthly, compared to Anholt's method which evaluated only for an already specified city, this proposed methodology is applicable to any location. Finally, this proposed methodology has a relatively high objectivity because our research was conducted based on open source data. As a result, our city image evaluation text mining approach has found validity in terms of accuracy, cost-effectiveness, timeliness, scalability, and reliability. The proposed method provides managers with clear guidelines regarding brand management in public and private sectors. As public sectors such as local officers, the proposed method could be used to formulate strategies and enhance the image of their places in an efficient manner. Rather than conducting heavy questionnaires, the local officers could monitor the current place image very shortly a priori, than may make decisions to go over the formal place image test only if the evaluation results from the proposed method are not ordinary no matter what the results indicate opportunity or threat to the place. Moreover, with co-using the morphological analysis, extracting meaningful facets of place brand from text, sentiment analysis and more with the proposed method, marketing strategy planners or civil engineering professionals may obtain deeper and more abundant insights for better place rand images. In the future, a prototype system will be implemented to show the feasibility of the idea proposed in this paper.

Building a Korean Sentiment Lexicon Using Collective Intelligence (집단지성을 이용한 한글 감성어 사전 구축)

  • An, Jungkook;Kim, Hee-Woong
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.2
    • /
    • pp.49-67
    • /
    • 2015
  • Recently, emerging the notion of big data and social media has led us to enter data's big bang. Social networking services are widely used by people around the world, and they have become a part of major communication tools for all ages. Over the last decade, as online social networking sites become increasingly popular, companies tend to focus on advanced social media analysis for their marketing strategies. In addition to social media analysis, companies are mainly concerned about propagating of negative opinions on social networking sites such as Facebook and Twitter, as well as e-commerce sites. The effect of online word of mouth (WOM) such as product rating, product review, and product recommendations is very influential, and negative opinions have significant impact on product sales. This trend has increased researchers' attention to a natural language processing, such as a sentiment analysis. A sentiment analysis, also refers to as an opinion mining, is a process of identifying the polarity of subjective information and has been applied to various research and practical fields. However, there are obstacles lies when Korean language (Hangul) is used in a natural language processing because it is an agglutinative language with rich morphology pose problems. Therefore, there is a lack of Korean natural language processing resources such as a sentiment lexicon, and this has resulted in significant limitations for researchers and practitioners who are considering sentiment analysis. Our study builds a Korean sentiment lexicon with collective intelligence, and provides API (Application Programming Interface) service to open and share a sentiment lexicon data with the public (www.openhangul.com). For the pre-processing, we have created a Korean lexicon database with over 517,178 words and classified them into sentiment and non-sentiment words. In order to classify them, we first identified stop words which often quite likely to play a negative role in sentiment analysis and excluded them from our sentiment scoring. In general, sentiment words are nouns, adjectives, verbs, adverbs as they have sentimental expressions such as positive, neutral, and negative. On the other hands, non-sentiment words are interjection, determiner, numeral, postposition, etc. as they generally have no sentimental expressions. To build a reliable sentiment lexicon, we have adopted a concept of collective intelligence as a model for crowdsourcing. In addition, a concept of folksonomy has been implemented in the process of taxonomy to help collective intelligence. In order to make up for an inherent weakness of folksonomy, we have adopted a majority rule by building a voting system. Participants, as voters were offered three voting options to choose from positivity, negativity, and neutrality, and the voting have been conducted on one of the largest social networking sites for college students in Korea. More than 35,000 votes have been made by college students in Korea, and we keep this voting system open by maintaining the project as a perpetual study. Besides, any change in the sentiment score of words can be an important observation because it enables us to keep track of temporal changes in Korean language as a natural language. Lastly, our study offers a RESTful, JSON based API service through a web platform to make easier support for users such as researchers, companies, and developers. Finally, our study makes important contributions to both research and practice. In terms of research, our Korean sentiment lexicon plays an important role as a resource for Korean natural language processing. In terms of practice, practitioners such as managers and marketers can implement sentiment analysis effectively by using Korean sentiment lexicon we built. Moreover, our study sheds new light on the value of folksonomy by combining collective intelligence, and we also expect to give a new direction and a new start to the development of Korean natural language processing.

Analyzing the Effect of Online media on Overseas Travels: A Case study of Asian 5 countries (해외 출국에 영향을 미치는 온라인 미디어 효과 분석: 아시아 5개국을 중심으로)

  • Lee, Hea In;Moon, Hyun Sil;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.53-74
    • /
    • 2018
  • Since South Korea has an economic structure that has a characteristic which market-dependent on overseas, the tourism industry is considered as a very important industry for the national economy, such as improving the country's balance of payments or providing income and employment increases. Accordingly, the necessity of more accurate forecasting on the demand in the tourism industry has been raised to promote its industry. In the related research, economic variables such as exchange rate and income have been used as variables influencing tourism demand. As information technology has been widely used, some researchers have also analyzed the effect of media on tourism demand. It has shown that the media has a considerable influence on traveler's decision making, such as choosing an outbound destination. Furthermore, with the recent availability of online information searches to obtain the latest information and two-way communication in social media, it is possible to obtain up-to-date information on travel more quickly than before. The information in online media such as blogs can naturally create the Word-of-Mouth effect by sharing useful information, which is called eWOM. Like all other service industries, the tourism industry is characterized by difficulty in evaluating its values before it is experienced directly. And furthermore, most of the travelers tend to search for more information in advance from various sources to reduce the perceived risk to the destination, so they can also be influenced by online media such as online news. In this study, we suggested that the number of online media posting, which causes the effects of Word-of-Mouth, may have an effect on the number of outbound travelers. We divided online media into public media and private media according to their characteristics and selected online news as public media and blog as private media, one of the most popular social media in tourist information. Based on the previous studies about the eWOM effects on online news and blog, we analyzed a relationship between the volume of eWOM and the outbound tourism demand through the panel model. To this end, we collected data on the number of national outbound travelers from 2007 to 2015 provided by the Korea Tourism Organization. According to statistics, the highest number of outbound tourism demand in Korea are China, Japan, Thailand, Hong Kong and the Philippines, which are selected as a dependent variable in this study. In order to measure the volume of eWOM, we collected online news and blog postings for the same period as the number of outbound travelers in Naver, which is the largest portal site in South Korea. In this study, a panel model was established to analyze the effect of online media on the demand of Korean outbound travelers and to identify that there was a significant difference in the influence of online media by each time and countries. The results of this study can be summarized as follows. First, the impact of the online news and blog eWOM on the number of outbound travelers was significant. We found that the number of online news and blog posting have an influence on the number of outbound travelers, especially the experimental result suggests that both the month that includes the departure date and the three months before the departure were found to have an effect. It is shown that online news and blog are online media that have a significant influence on outbound tourism demand. Next, we found that the increased volume of eWOM in online news has a negative effect on departure, while the increase in a blog has a positive effect. The result with the country-specific models would be the same. This paper shows that online media can be used as a new variable in tourism demand by examining the influence of the eWOM effect of the online media. Also, we found that both social media and news media have an important role in predicting and managing the Korean tourism demand and that the influence of those two media appears different depending on the country.

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

  • Wong, William Xiu Shun;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.21-44
    • /
    • 2018
  • In recent years, the rapid development of internet technology and the popularization of smart devices have resulted in massive amounts of text data. Those text data were produced and distributed through various media platforms such as World Wide Web, Internet news feeds, microblog, and social media. However, this enormous amount of easily obtained information is lack of organization. Therefore, this problem has raised the interest of many researchers in order to manage this huge amount of information. Further, this problem also required professionals that are capable of classifying relevant information and hence text classification is introduced. Text classification is a challenging task in modern data analysis, which it needs to assign a text document into one or more predefined categories or classes. In text classification field, there are different kinds of techniques available such as K-Nearest Neighbor, Naïve Bayes Algorithm, Support Vector Machine, Decision Tree, and Artificial Neural Network. However, while dealing with huge amount of text data, model performance and accuracy becomes a challenge. According to the type of words used in the corpus and type of features created for classification, the performance of a text classification model can be varied. Most of the attempts are been made based on proposing a new algorithm or modifying an existing algorithm. This kind of research can be said already reached their certain limitations for further improvements. In this study, aside from proposing a new algorithm or modifying the algorithm, we focus on searching a way to modify the use of data. It is widely known that classifier performance is influenced by the quality of training data upon which this classifier is built. The real world datasets in most of the time contain noise, or in other words noisy data, these can actually affect the decision made by the classifiers built from these data. In this study, we consider that the data from different domains, which is heterogeneous data might have the characteristics of noise which can be utilized in the classification process. In order to build the classifier, machine learning algorithm is performed based on the assumption that the characteristics of training data and target data are the same or very similar to each other. However, in the case of unstructured data such as text, the features are determined according to the vocabularies included in the document. If the viewpoints of the learning data and target data are different, the features may be appearing different between these two data. In this study, we attempt to improve the classification accuracy by strengthening the robustness of the document classifier through artificially injecting the noise into the process of constructing the document classifier. With data coming from various kind of sources, these data are likely formatted differently. These cause difficulties for traditional machine learning algorithms because they are not developed to recognize different type of data representation at one time and to put them together in same generalization. Therefore, in order to utilize heterogeneous data in the learning process of document classifier, we apply semi-supervised learning in our study. However, unlabeled data might have the possibility to degrade the performance of the document classifier. Therefore, we further proposed a method called Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA) to select only the documents that contributing to the accuracy improvement of the classifier. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data. The most confident classification rules will be selected and applied for the final decision making. In this paper, three different types of real-world data sources were used, which are news, twitter and blogs.

The Effect of Mobile Advertising Platform through Big Data Analytics: Focusing on Advertising, and Media Characteristics (빅데이터 분석을 통한 모바일 광고플랫폼의 광고효과 연구: 광고특성, 매체특성을 중심으로)

  • Bae, Seong Deok;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.37-57
    • /
    • 2018
  • With the spread of smart phones, interest in mobile media is on the increase as useful media recently. Mobile media is assessed as having differentiated advantages from existing media in that not only can they provide consumers with desired information anytime and anywhere but also real-time interaction is possible in them. So far, studies on mobile advertising were mostly researches analyzing satisfaction with, and acceptance of, mobile advertising based on survey, researches focusing on the factors affecting acceptance of mobile advertising messages and researches verifying the effect of mobile advertising on brand recall, advertising attitude and brand attitude through experiments. Most of the domestic mobile advertising studies related to advertisement effect and advertisement attitude have been conducted through experiments and surveys. The advertising effectiveness measure of the mobile ad used the attitude of the advertisement, purchase intention, etc. To date, there have been few studies on the effects of mobile advertising on actual advertising data to prove the characteristics of the advertising platform and to prove the relationship between the factors influencing the advertising effect and the factors. In order to explore advertising effect of mobile advertising platform currently commercialized, this study defined advertising characteristics and media characteristics from the perspective of advertiser, advertising platform and publisher and analyzed the influence of each characteristic on advertising effect. As the advertisement characteristics, we classified advertisement format classified by bar type and floating type, and advertisement material classified by image and text. We defined advertisement characteristics of advertisement platform as Hedonic and Utilitarian media characteristics. As a dependent variable, we use CTR, which is the ratio of response (click) to ad exposure. The theoretical background and the analysis of the mobile advertising business, the hypothesis that the advertisement effect is different according to the advertisement specification, the advertisement material, In the ad standard, bar ads are classified as static framing, Floating ads can be categorized as dynamic framing, and the hypothetical definition of floating advertisements, which are high-profile dynamic framing ads, is highly responsive. In advertising, images with high salience are defined to have higher ad response than text. In the media characteristics classified as practical / hedonic type, it is defined that the hedonic type media has a more relaxed tendency than the practical media, and there is a high possibility of receiving various information because there is no clear target. In addition, image material and hedonic media are defined to be highly effective in the interaction between advertisement specification and advertisement material, advertisement specifications and media characteristics, and advertisement material and media characteristics. As the result of regression analysis on each characteristic, material standard, which is a characteristic of mobile advertisement, and media characteristics separated into 'Hedonic' and 'Utilitarian' had significant influence on advertisement effect and mutual interaction effect was also confirmed. In the mobile advertising standard, the advertising effect of the floating advertisement is higher than that of the bar advertisement, Floating ads were more effective than text ads for image ads. In addition, it was confirmed that the advertising effect is higher in the practical media than the hedonic media. The research was carried out with the big data collected from the mobile advertising platform, and it was possible to grasp the advertising effect of the measure index standard which is used in the practical work which could not be grasped in the previous research. In other words, the study was conducted using the CTR, which is a measure of the effectiveness of the advertisement used in the online advertisement and the mobile advertisement, which are not dependent on the attitude of the ad, the attitude of the brand, and the purchase intention. This study suggests that CTR is used as a dependent variable of advertising effect based on actual data of mobile ad platform accumulated over a long period of time. The results of this study is expected to contribute to establishment of optimum advertisement strategy such as creation of advertising materials and planning of media which suit advertised products at the time of mobile advertisement.

A Dynamic Management Method for FOAF Using RSS and OLAP cube (RSS와 OLAP 큐브를 이용한 FOAF의 동적 관리 기법)

  • Sohn, Jong-Soo;Chung, In-Jeong
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.2
    • /
    • pp.39-60
    • /
    • 2011
  • Since the introduction of web 2.0 technology, social network service has been recognized as the foundation of an important future information technology. The advent of web 2.0 has led to the change of content creators. In the existing web, content creators are service providers, whereas they have changed into service users in the recent web. Users share experiences with other users improving contents quality, thereby it has increased the importance of social network. As a result, diverse forms of social network service have been emerged from relations and experiences of users. Social network is a network to construct and express social relations among people who share interests and activities. Today's social network service has not merely confined itself to showing user interactions, but it has also developed into a level in which content generation and evaluation are interacting with each other. As the volume of contents generated from social network service and the number of connections between users have drastically increased, the social network extraction method becomes more complicated. Consequently the following problems for the social network extraction arise. First problem lies in insufficiency of representational power of object in the social network. Second problem is incapability of expressional power in the diverse connections among users. Third problem is the difficulty of creating dynamic change in the social network due to change in user interests. And lastly, lack of method capable of integrating and processing data efficiently in the heterogeneous distributed computing environment. The first and last problems can be solved by using FOAF, a tool for describing ontology-based user profiles for construction of social network. However, solving second and third problems require a novel technology to reflect dynamic change of user interests and relations. In this paper, we propose a novel method to overcome the above problems of existing social network extraction method by applying FOAF (a tool for describing user profiles) and RSS (a literary web work publishing mechanism) to OLAP system in order to dynamically innovate and manage FOAF. We employed data interoperability which is an important characteristic of FOAF in this paper. Next we used RSS to reflect such changes as time flow and user interests. RSS, a tool for literary web work, provides standard vocabulary for distribution at web sites and contents in the form of RDF/XML. In this paper, we collect personal information and relations of users by utilizing FOAF. We also collect user contents by utilizing RSS. Finally, collected data is inserted into the database by star schema. The system we proposed in this paper generates OLAP cube using data in the database. 'Dynamic FOAF Management Algorithm' processes generated OLAP cube. Dynamic FOAF Management Algorithm consists of two functions: one is find_id_interest() and the other is find_relation (). Find_id_interest() is used to extract user interests during the input period, and find-relation() extracts users matching user interests. Finally, the proposed system reconstructs FOAF by reflecting extracted relationships and interests of users. For the justification of the suggested idea, we showed the implemented result together with its analysis. We used C# language and MS-SQL database, and input FOAF and RSS as data collected from livejournal.com. The implemented result shows that foaf : interest of users has reached an average of 19 percent increase for four weeks. In proportion to the increased foaf : interest change, the number of foaf : knows of users has grown an average of 9 percent for four weeks. As we use FOAF and RSS as basic data which have a wide support in web 2.0 and social network service, we have a definite advantage in utilizing user data distributed in the diverse web sites and services regardless of language and types of computer. By using suggested method in this paper, we can provide better services coping with the rapid change of user interests with the automatic application of FOAF.

Determinants of Mobile Application Use: A Study Focused on the Correlation between Application Categories (모바일 앱 사용에 영향을 미치는 요인에 관한 연구: 앱 카테고리 간 상관관계를 중심으로)

  • Park, Sangkyu;Lee, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.157-176
    • /
    • 2016
  • For a long time, mobile phone had a sole function of communication. Recently however, abrupt innovations in technology allowed extension of the sphere in mobile phone activities. Development of technology enabled realization of almost computer-like environment even on a very small device. Such advancement yielded several forms of new high-tech devices such as smartphone and tablet PC, which quickly proliferated. Simultaneously with the diffusion of the mobile devices, mobile applications for those devices also prospered and soon became deeply penetrated in consumers' daily lives. Numerous mobile applications have been released in app stores yielding trillions of cumulative downloads. However, a big majority of the applications are disregarded from consumers. Even after the applications are purchased, they do not survive long in consumers' mobile devices and are soon abandoned. Nevertheless, it is imperative for both app developers and app-store operators to understand consumer behaviors and to develop marketing strategies aiming to make sustainable business by first increasing sales of mobile applications and by also designing surviving strategy for applications. Therefore, this research analyzes consumers' mobile application usage behavior in a frame of substitution/supplementary of application categories and several explanatory variables. Considering that consumers of mobile devices use multiple apps simultaneously, this research adopts multivariate probit models to explain mobile application usage behavior and to derive correlation between categories of applications for observing substitution/supplementary of application use. The research adopts several explanatory variables including sociodemographic data, user experiences of purchased applications that reflect future purchasing behavior of paid applications as well as consumer attitudes toward marketing efforts, variables representing consumer attitudes toward rating of the app and those representing consumer attitudes toward app-store promotion efforts (i.e., top developer badge and editor's choice badge). Results of this study can be explained in hedonic and utilitarian framework. Consumers who use hedonic applications, such as those of game and entertainment-related, are of young age with low education level. However, consumers who are old and have received higher education level prefer utilitarian application category such as life, information etc. There are disputable arguments over whether the users of SNS are hedonic or utilitarian. In our results, consumers who are younger and those with higher education level prefer using SNS category applications, which is in a middle of utilitarian and hedonic results. Also, applications that are directly related to tangible assets, such as banking, stock and mobile shopping, are only negatively related to experience of purchasing of paid app, meaning that consumers who put weights on tangible assets do not prefer buying paid application. Regarding categories, most correlations among categories are significantly positive. This is because someone who spend more time on mobile devices tends to use more applications. Game and entertainment category shows significant and positive correlation; however, there exists significantly negative correlation between game and information, as well as game and e-commerce categories of applications. Meanwhile, categories of game and SNS as well as game and finance have shown no significant correlations. This result clearly shows that mobile application usage behavior is quite clearly distinguishable - that the purpose of using mobile devices are polarized into utilitarian and hedonic purpose. This research proves several arguments that can only be explained by second-hand real data, not by survey data, and offers behavioral explanations of mobile application usage in consumers' perspectives. This research also shows substitution/supplementary patterns of consumer application usage, which then explain consumers' mobile application usage behaviors. However, this research has limitations in some points. Classification of categories itself is disputable, for classification is diverged among several studies. Therefore, there is a possibility of change in results depending on the classification. Lastly, although the data are collected in an individual application level, we reduce its observation into an individual level. Further research will be done to resolve these limitations.

An Analysis of IT Trends Using Tweet Data (트윗 데이터를 활용한 IT 트렌드 분석)

  • Yi, Jin Baek;Lee, Choong Kwon;Cha, Kyung Jin
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.143-159
    • /
    • 2015
  • Predicting IT trends has been a long and important subject for information systems research. IT trend prediction makes it possible to acknowledge emerging eras of innovation and allocate budgets to prepare against rapidly changing technological trends. Towards the end of each year, various domestic and global organizations predict and announce IT trends for the following year. For example, Gartner Predicts 10 top IT trend during the next year, and these predictions affect IT and industry leaders and organization's basic assumptions about technology and the future of IT, but the accuracy of these reports are difficult to verify. Social media data can be useful tool to verify the accuracy. As social media services have gained in popularity, it is used in a variety of ways, from posting about personal daily life to keeping up to date with news and trends. In the recent years, rates of social media activity in Korea have reached unprecedented levels. Hundreds of millions of users now participate in online social networks and communicate with colleague and friends their opinions and thoughts. In particular, Twitter is currently the major micro blog service, it has an important function named 'tweets' which is to report their current thoughts and actions, comments on news and engage in discussions. For an analysis on IT trends, we chose Tweet data because not only it produces massive unstructured textual data in real time but also it serves as an influential channel for opinion leading on technology. Previous studies found that the tweet data provides useful information and detects the trend of society effectively, these studies also identifies that Twitter can track the issue faster than the other media, newspapers. Therefore, this study investigates how frequently the predicted IT trends for the following year announced by public organizations are mentioned on social network services like Twitter. IT trend predictions for 2013, announced near the end of 2012 from two domestic organizations, the National IT Industry Promotion Agency (NIPA) and the National Information Society Agency (NIA), were used as a basis for this research. The present study analyzes the Twitter data generated from Seoul (Korea) compared with the predictions of the two organizations to analyze the differences. Thus, Twitter data analysis requires various natural language processing techniques, including the removal of stop words, and noun extraction for processing various unrefined forms of unstructured data. To overcome these challenges, we used SAS IRS (Information Retrieval Studio) developed by SAS to capture the trend in real-time processing big stream datasets of Twitter. The system offers a framework for crawling, normalizing, analyzing, indexing and searching tweet data. As a result, we have crawled the entire Twitter sphere in Seoul area and obtained 21,589 tweets in 2013 to review how frequently the IT trend topics announced by the two organizations were mentioned by the people in Seoul. The results shows that most IT trend predicted by NIPA and NIA were all frequently mentioned in Twitter except some topics such as 'new types of security threat', 'green IT', 'next generation semiconductor' since these topics non generalized compound words so they can be mentioned in Twitter with other words. To answer whether the IT trend tweets from Korea is related to the following year's IT trends in real world, we compared Twitter's trending topics with those in Nara Market, Korea's online e-Procurement system which is a nationwide web-based procurement system, dealing with whole procurement process of all public organizations in Korea. The correlation analysis show that Tweet frequencies on IT trending topics predicted by NIPA and NIA are significantly correlated with frequencies on IT topics mentioned in project announcements by Nara market in 2012 and 2013. The main contribution of our research can be found in the following aspects: i) the IT topic predictions announced by NIPA and NIA can provide an effective guideline to IT professionals and researchers in Korea who are looking for verified IT topic trends in the following topic, ii) researchers can use Twitter to get some useful ideas to detect and predict dynamic trends of technological and social issues.