Search | Korea Science

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

Park, Jiae;Cho, Yoonho
- Journal of Intelligence and Information Systems
- /
- v.22 no.3
- /
- pp.143-163
- /
- 2016
The demographics of Internet users are the most basic and important sources for target marketing or personalized advertisements on the digital marketing channels which include email, mobile, and social media. However, it gradually has become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although the marketing department is able to get the demographics using online or offline surveys, these approaches are very expensive, long processes, and likely to include false statements. Clickstream data is the recording an Internet user leaves behind while visiting websites. As the user clicks anywhere in the webpage, the activity is logged in semi-structured website log files. Such data allows us to see what pages users visited, how long they stayed there, how often they visited, when they usually visited, which site they prefer, what keywords they used to find the site, whether they purchased any, and so forth. For such a reason, some researchers tried to guess the demographics of Internet users by using their clickstream data. They derived various independent variables likely to be correlated to the demographics. The variables include search keyword, frequency and intensity for time, day and month, variety of websites visited, text information for web pages visited, etc. The demographic attributes to predict are also diverse according to the paper, and cover gender, age, job, location, income, education, marital status, presence of children. A variety of data mining methods, such as LSA, SVM, decision tree, neural network, logistic regression, and k-nearest neighbors, were used for prediction model building. However, this research has not yet identified which data mining method is appropriate to predict each demographic variable. Moreover, it is required to review independent variables studied so far and combine them as needed, and evaluate them for building the best prediction model. The objective of this study is to choose clickstream attributes mostly likely to be correlated to the demographics from the results of previous research, and then to identify which data mining method is fitting to predict each demographic attribute. Among the demographic attributes, this paper focus on predicting gender, age, marital status, residence, and job. And from the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of predictive model building is compose of 4 steps. In the first step, we create user profiles which include 64 clickstream attributes and 5 demographic attributes. The second step performs the dimension reduction of clickstream variables to solve the curse of dimensionality and overfitting problem. We utilize three approaches which are based on decision tree, PCA, and cluster analysis. We build alternative predictive models for each demographic variable in the third step. SVM, neural network, and logistic regression are used for modeling. The last step evaluates the alternative models in view of model accuracy and selects the best model. For the experiments, we used clickstream data which represents 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for our prediction process, and the 5-fold cross validation was conducted to enhance the reliability of our experiments. As the experimental results, we can verify that there are a specific data mining method well-suited for each demographic variable. For example, age prediction is best performed when using the decision tree based dimension reduction and neural network whereas the prediction of gender and marital status is the most accurate by applying SVM without dimension reduction. We conclude that the online behaviors of the Internet users, captured from the clickstream data analysis, could be well used to predict their demographics, thereby being utilized to the digital marketing.
https://doi.org/10.13088/jiis.2016.22.3.143 인용 PDF KSCI

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
- Asia pacific journal of information systems
- /
- v.21 no.1
- /
- pp.103-122
- /
- 2011
Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful for them. For this reason, some on-line documents are accompanied by a list of keywords specified by the authors in an effort to guide the users by facilitating the filtering process. In this way, a set of keywords is often considered a condensed version of the whole document and therefore plays an important role for document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask the authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could not benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, the implementation itself is the obstacle; manually assigning keywords to all documents is a daunting task, or even impractical in that it is extremely tedious and time-consuming requiring a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are mainly two approaches to achieving this aim: keyword assignment approach and keyword extraction approach. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given set of vocabulary, and the aim is to match them to the texts. In other words, the keywords assignment approach seeks to select the words from a controlled vocabulary that best describes a document. Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. On the other hand, in the latter approach, the aim is to extract keywords with respect to their relevance in the text without prior vocabulary. In this approach, automatic keyword generation is treated as a classification task, and keywords are commonly extracted based on supervised learning techniques. Thus, keyword extraction algorithms classify candidate keywords in a document into positive or negative examples. Several systems such as Extractor and Kea were developed using keyword extraction approach. Most indicative words in a document are selected as keywords for that document and as a result, keywords extraction is limited to terms that appear in the document. Therefore, keywords extraction cannot generate implicit keywords that are not included in a document. According to the experiment results of Turney, about 64% to 90% of keywords assigned by the authors can be found in the full text of an article. Inversely, it also means that 10% to 36% of the keywords assigned by the authors do not appear in the article, which cannot be generated through keyword extraction algorithms. Our preliminary experiment result also shows that 37% of keywords assigned by the authors are not included in the full text. This is the reason why we have decided to adopt the keyword assignment approach. In this paper, we propose a new approach for automatic keyword assignment namely IVSM(Inverse Vector Space Model). The model is based on a vector space model. which is a conventional information retrieval model that represents documents and queries by vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on the term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating keywords that have high similarity scores. Two keyword generation systems were implemented applying IVSM: IVSM system for Web-based community service and stand-alone IVSM system. Firstly, the IVSM system is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers, and, indeed, it has been tested through a number of academic papers including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. According to our experiment, the precisions of IVSM applied to Web-based community service and academic journals were 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability. Also, IVSM shows comparable performance to Extractor that is a representative system of keyword extraction approach developed by Turney. As electronic documents increase, we expect that IVSM proposed in this paper can be applied to many electronic documents in Web-based community and digital library.
PDF KSCI

Understanding User Motivations and Behavioral Process in Creating Video UGC: Focus on Theory of Implementation Intentions (Video UGC 제작 동기와 행위 과정에 관한 이해: 구현의도이론 (Theory of Implementation Intentions)의 적용을 중심으로)

Kim, Hyung-Jin;Song, Se-Min;Lee, Ho-Geun
- Asia pacific journal of information systems
- /
- v.19 no.4
- /
- pp.125-148
- /
- 2009
UGC(User Generated Contents) is emerging as the center of e-business in the web 2.0 era. The trend reflects changing roles of users in production and consumption of contents on websites and helps us to understand new strategies of websites such as web portals and social network websites. Nowadays, we consume contents created by other non-professional users for both utilitarian (e.g., knowledge) and hedonic values (e.g., fun). Also, contents produced by ourselves (e.g., photo, video) are posted on websites so that our friends, family, and even the public can consume those contents. This means that non-professionals, who used to be passive audience in the past, are now creating contents and share their UGCs with others in the Web. Accessible media, tools, and applications have also reduced difficulty and complexity in the process of creating contents. Realizing that users create plenty of materials which are very interesting to other people, media companies (i.e., web portals and social networking websites) are adjusting their strategies and business models accordingly. Increased demand of UGC may lead to website visits which are the source of benefits from advertising. Therefore, they put more efforts into making their websites open platforms where UGCs can be created and shared among users without technical and methodological difficulties. Many websites have increasingly adopted new technologies such as RSS and openAPI. Some have even changed the structure of web pages so that UGC can be seen several times to more visitors. This mainstream of UGCs on websites indicates that acquiring more UGCs and supporting participating users have become important things to media companies. Although those companies need to understand why general users have shown increasing interest in creating and posting contents and what is important to them in the process of productions, few research results exist in this area to address these issues. Also, behavioral process in creating video UGCs has not been explored enough for the public to fully understand it. With a solid theoretical background (i.e., theory of implementation intentions), parts of our proposed research model mirror the process of user behaviors in creating video contents, which consist of intention to upload, intention to edit, edit, and upload. In addition, in order to explain how those behavioral intentions are developed, we investigated influences of antecedents from three motivational perspectives (i.e., intrinsic, editing software-oriented, and website's network effect-oriented). First, from the intrinsic motivation perspective, we studied the roles of self-expression, enjoyment, and social attention in forming intention to edit with preferred editing software or in forming intention to upload video contents to preferred websites. Second, we explored the roles of editing software for non-professionals to edit video contents, in terms of how it makes production process easier and how it is useful in the process. Finally, from the website characteristic-oriented perspective, we investigated the role of a website's network externality as an antecedent of users' intention to upload to preferred websites. The rationale is that posting UGCs on websites are basically social-oriented behaviors; thus, users prefer a website with the high level of network externality for contents uploading. This study adopted a longitudinal research design; we emailed recipients twice with different questionnaires. Guided by invitation email including a link to web survey page, respondents answered most of questions except edit and upload at the first survey. They were asked to provide information about UGC editing software they mainly used and preferred website to upload edited contents, and then asked to answer related questions. For example, before answering questions regarding network externality, they individually had to declare the name of the website to which they would be willing to upload. At the end of the first survey, we asked if they agreed to participate in the corresponding survey in a month. During twenty days, 333 complete responses were gathered in the first survey. One month later, we emailed those recipients to ask for participation in the second survey. 185 of the 333 recipients (about 56 percentages) answered in the second survey. Personalized questionnaires were provided for them to remind the names of editing software and website that they reported in the first survey. They answered the degree of editing with the software and the degree of uploading video contents to the website for the past one month. To all recipients of the two surveys, exchange tickets for books (about 5,000~10,000 Korean Won) were provided according to the frequency of participations. PLS analysis shows that user behaviors in creating video contents are well explained by the theory of implementation intentions. In fact, intention to upload significantly influences intention to edit in the process of accomplishing the goal behavior, upload. These relationships show the behavioral process that has been unclear in users' creating video contents for uploading and also highlight important roles of editing in the process. Regarding the intrinsic motivations, the results illustrated that users are likely to edit their own video contents in order to express their own intrinsic traits such as thoughts and feelings. Also, their intention to upload contents in preferred website is formed because they want to attract much attention from others through contents reflecting themselves. This result well corresponds to the roles of the website characteristic, namely, network externality. Based on the PLS results, the network effect of a website has significant influence on users' intention to upload to the preferred website. This indicates that users with social attention motivations are likely to upload their video UGCs to a website whose network size is big enough to realize their motivations easily. Finally, regarding editing software characteristic-oriented motivations, making exclusively-provided editing software more user-friendly (i.e., easy of use, usefulness) plays an important role in leading to users' intention to edit. Our research contributes to both academic scholars and professionals. For researchers, our results show that the theory of implementation intentions is well applied to the video UGC context and very useful to explain the relationship between implementation intentions and goal behaviors. With the theory, this study theoretically and empirically confirmed that editing is a different and important behavior from uploading behavior, and we tested the behavioral process of ordinary users in creating video UGCs, focusing on significant motivational factors in each step. In addition, parts of our research model are also rooted in the solid theoretical background such as the technology acceptance model and the theory of network externality to explain the effects of UGC-related motivations. For practitioners, our results suggest that media companies need to restructure their websites so that users' needs for social interaction through UGC (e.g., self-expression, social attention) are well met. Also, we emphasize strategic importance of the network size of websites in leading non-professionals to upload video contents to the websites. Those websites need to find a way to utilize the network effects for acquiring more UGCs. Finally, we suggest that some ways to improve editing software be considered as a way to increase edit behavior which is a very important process leading to UGC uploading.
PDF KSCI

The Influence of Organizational Commitment, Job Commitment and Job Satisfaction on Professionalism Perceived by Radiotechnologists Working in the Department of Radiation Oncology (방사선종양학과에 근무하는 방사선사의 조직몰입, 직무몰입, 직무만족이 전문 직업성에 미치는 영향)

Gim, Yang-Soo;Lee, Sun-Young;Lee, Joon-Seong;Gwak, Geun-Tak;Pak, Ju-Gyeong;Lee, Seung-Hoon;Hwang, Ho-In;Cha, Seok-Yong
- The Journal of Korean Society for Radiation Therapy
- /
- v.24 no.2
- /
- pp.67-75
- /
- 2012
Purpose: The study is to check the specialty of radiotherapists working in the department of radiation oncology and find job satisfaction, organizational commitment and job commitment having an effect on professional parts. After making analysis of the mutual relation, it is to provide radiotechnologists with making progress in the future. Materials and Methods: From March 2 to March 30, we had carried out a survey with email. It is possible to have 272 questionnaires answered in the survey. We make use of SPSS 13.0 for Windows to analyze the data collected for study. Frequency and a percentage are meant to show general characteristics, and t-test and ANOVA to do the difference between general properties and professionalism. Pearson's correlation coefficient also is meant to do the correlation of professionalism, organizational job commitment and job satisfaction, and multiple regression analysis to do the factor for a relevant variable to affect professionalism. Results: There are subdivisions in the professionalism informing us of the self-regulation $17.74{\pm}2.32/3.55{\pm}.46$, a sense of calling $17.58{\pm}2.63/3.52{\pm}.53$, reference of the professional $17.14{\pm}2.39/3.43{\pm}.48$, service to the public $15.97{\pm}2.48/3.19{\pm}50$, and autonomy $15.68{\pm}2.28/3.14{\pm}46$. Grand mean turns out to be $83.89{\pm}7.63$(Summation of items)/$3.37{\pm}0.49$ (Numbers of items). When it comes to a statistical relation between general characteristics and professionalism, the statistics have it that these come within age (P<.001), period of employment (P<.001), education status (P<.05), a monthly income (P<.001), radiotherapists who get a special license (P<.001), the position (P<.001), and an opportunity for developing (P<.001). As a result of organizational commitment, job commitment, and job satisfaction, grand mean in organizational commitment proves to be $80.10{\pm}8.15/3.34{\pm}.34$. There are subvisions showing affective commitment $28.64{\pm}4.61$/3.58, continuance commitment $27.54{\pm}4.22/3.44{\pm}.53$, and normative commitment $23.95{\pm}2.94/2.99{\pm}.37$ in order of precedence. The average grade in job commitment is $32.47{\pm}5.77/3.30{\pm}.60$ and that in job satisfaction is $63.39{\pm}10.16/3.17{\pm}.51$, respectively. We find the positive relationship between professionalism and organizational commitment (r=.522, P<.05), between professionalism and job commitment (r=.444, P<.05), and between professionalism and job satisfaction (r=.507, P<.05). And we also get the positive relationship between organizational commitment and job commitment (r=.549, P<.05), between organizational commitment and job satisfaction (r=.433, P<.05), and between job commitment and job satisfaction (r=.462, P<.05). To catch the factors influencing the professionalism of radiotherapists, we used multiple regression analysis. According to the final model, it appears affective commitment (B=.755, P<.05), normative commitment (B=.305, P<.05), job satisfaction (B=.092, P<.05), an opportunity for developing (B=-1.505, P<.05), and the position (B=-1.155, P<.05) in order of precedence. It seems that explaining influece on $R^2$ is 0.504. Conclusion: The results of the factors that influence professionalism working as radiotherapists in the department of radiation oncology have it that the more affective commitment, normative commitment, and job satisfaction we feel, the more professionalism we recognize. We think that the focus of professionalism is increased if getting the chances for radiotherapists to have little to do with developing opportunities given.
PDF

Search Result 374, Processing Time 0.02 seconds

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

Understanding User Motivations and Behavioral Process in Creating Video UGC: Focus on Theory of Implementation Intentions (Video UGC 제작 동기와 행위 과정에 관한 이해: 구현의도이론 (Theory of Implementation Intentions)의 적용을 중심으로)

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)