Determinants of Mobile Application Use: A Study Focused on the Correlation between Application Categories (모바일 앱 사용에 영향을 미치는 요인에 관한 연구: 앱 카테고리 간 상관관계를 중심으로)

  • Park, Sangkyu;Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.157-176 / 2016
  • For a long time, mobile phone had a sole function of communication. Recently however, abrupt innovations in technology allowed extension of the sphere in mobile phone activities. Development of technology enabled realization of almost computer-like environment even on a very small device. Such advancement yielded several forms of new high-tech devices such as smartphone and tablet PC, which quickly proliferated. Simultaneously with the diffusion of the mobile devices, mobile applications for those devices also prospered and soon became deeply penetrated in consumers' daily lives. Numerous mobile applications have been released in app stores yielding trillions of cumulative downloads. However, a big majority of the applications are disregarded from consumers. Even after the applications are purchased, they do not survive long in consumers' mobile devices and are soon abandoned. Nevertheless, it is imperative for both app developers and app-store operators to understand consumer behaviors and to develop marketing strategies aiming to make sustainable business by first increasing sales of mobile applications and by also designing surviving strategy for applications. Therefore, this research analyzes consumers' mobile application usage behavior in a frame of substitution/supplementary of application categories and several explanatory variables. Considering that consumers of mobile devices use multiple apps simultaneously, this research adopts multivariate probit models to explain mobile application usage behavior and to derive correlation between categories of applications for observing substitution/supplementary of application use. 
The research adopts several explanatory variables, including sociodemographic data, user experience of purchased applications (which reflects future purchasing behavior for paid applications), and consumer attitudes toward marketing efforts: variables representing consumer attitudes toward app ratings and toward app-store promotion efforts (i.e., the top developer badge and the editor's choice badge). The results of this study can be explained in a hedonic and utilitarian framework. Consumers who use hedonic applications, such as game and entertainment applications, tend to be young and have a low education level. In contrast, consumers who are older and have a higher education level prefer utilitarian application categories such as life and information. Whether SNS users are hedonic or utilitarian is disputed. In our results, consumers who are younger and those with a higher education level prefer SNS applications, which places this category between the utilitarian and hedonic results. Also, applications that are directly related to tangible assets, such as banking, stock, and mobile shopping, are only negatively related to the experience of purchasing paid apps, meaning that consumers who put weight on tangible assets prefer not to buy paid applications. Regarding categories, most correlations among categories are significantly positive. This is because someone who spends more time on mobile devices tends to use more applications. The game and entertainment categories show a significant positive correlation; however, there is a significant negative correlation between game and information, as well as between game and e-commerce applications. Meanwhile, game and SNS, as well as game and finance, show no significant correlation. 
This result shows that mobile application usage behavior is clearly distinguishable: the purposes of using mobile devices are polarized into utilitarian and hedonic. This research supports several arguments that can be explained only with second-hand real data, not with survey data, and offers behavioral explanations of mobile application usage from the consumer's perspective. This research also shows substitution/supplementary patterns of consumer application usage, which in turn explain consumers' mobile application usage behavior. However, this research has some limitations. The classification of categories is itself disputable, since classifications differ among studies. Therefore, the results may change depending on the classification. Lastly, although the data are collected at the individual application level, we reduce the observations to the individual level. Further research will be done to resolve these limitations.

Job Preference Analysis and Job Matching System Development for the Middle Aged Class (중장년층 일자리 요구사항 분석 및 인력 고용 매칭 시스템 개발)

  • Kim, Seongchan;Jang, Jincheul;Kim, Seong Jung;Chin, Hyojin;Yi, Mun Yong
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.247-264 / 2016
  • With the rapid acceleration of the low birth rate and population aging, the employment of neglected groups of people, including the middle aged class, is a crucial issue in South Korea. In particular, in the 2010s, the number of middle aged people who want to find a new job after retirement age is increasing significantly as the baby boom generation (born 1955-1963) reaches retirement. Despite the importance of matching jobs to this emerging middle aged class, neither private job portals nor the Korean government provide any online job service tailored for them. A gigantic amount of job information is available online; however, the current recruiting systems do not meet the demand of the middle aged class because their primary targets are young workers. We are in dire need of a specially designed recruiting system for the middle aged. Meanwhile, when users search for desired occupations on the Worknet website, provided by the Korean Ministry of Employment and Labor, they have difficulty finding similar jobs because Worknet filters search results on the basis of exact matches of a preferred job code. Besides, according to our Worknet data analysis, only about 24% of job seekers landed a job position consistent with their initial preferred job code, while the rest landed a position different from their initial preference. To improve the situation, particularly for the middle aged class, we investigate a soft job matching technique by performing the following: 1) we review the user behavior logs of Worknet, which is a public job recruiting system set up by the Korean government, and point out key system design implications for the middle aged. 
Specifically, we analyze the job postings that include preferential tags for the middle aged in order to disclose what types of jobs favor the middle aged; 2) we develop a new occupation classification scheme for the middle aged, Korea Occupation Classification for the Middle-aged (KOCM), based on the similarity between jobs, by reorganizing and modifying a general occupation classification scheme. When viewed from the perspective of job placement, an occupation classification scheme is a way to connect enterprises and job seekers and a basic mechanism for job placement. The key features of KOCM include establishing the Simple Labor category, which is the category most requested by enterprises; and 3) we design MOMA (Middle-aged Occupation Matching Algorithm), which is a hybrid job matching algorithm comprising constraint-based reasoning and case-based reasoning. MOMA incorporates KOCM to expand a query so that similar jobs in the database are also searched. MOMA uses cosine similarity between the user requirement and each job posting to rank a set of postings in terms of preferred job code, salary, distance, and job type. The developed system using MOMA demonstrates about a 20-fold improvement over hard matching performance. In implementing the algorithm for a web-based recruiting application for the middle aged, we also considered usability, that is, making the system easier to use, which is especially important for this particular class of users. We wanted to improve the usability of the system during the job search process by asking middle aged users to enter only a few simple, core pieces of information, such as preferred job (job code), salary, and (allowable) distance to the workplace, enabling them to find a job suitable to their needs efficiently. The Web site implemented with MOMA should be able to contribute to improving job search for the middle aged class. 
We also expect the overall approach to be applicable to other groups of people for the improvement of job matching results.
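
The cosine-similarity ranking step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' MOMA implementation: the feature encoding (one score in [0, 1] each for job code, salary, distance, and job type), the posting names, and the function names are assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as no similarity
    return dot / (norm_a * norm_b)


def rank_postings(user_vector, postings):
    """Return posting ids ordered from most to least similar to the user."""
    scored = [
        (cosine_similarity(user_vector, vector), posting_id)
        for posting_id, vector in postings.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [posting_id for _, posting_id in scored]


# Hypothetical encoding: [job-code fit, salary fit, distance fit, job-type fit].
user = [1.0, 1.0, 1.0, 1.0]
postings = {
    "posting_a": [1.0, 0.9, 0.8, 1.0],
    "posting_b": [0.0, 0.5, 0.2, 1.0],
    "posting_c": [0.5, 1.0, 1.0, 0.0],
}
ranking = rank_postings(user, postings)
print(ranking)
```

In a soft-matching setting, the job-code component would presumably come from a similarity between occupation categories rather than an exact-match flag, which is what lets near-miss postings still rank highly.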

A Study on Design of Agent based Nursing Records System in Attending System (에이전트기반 개방병원 간호기록시스템 설계에 관한 연구)

  • Kim, Kyoung-Hwan
    • Journal of Intelligence and Information Systems / v.16 no.2 / pp.73-94 / 2010
  • The attending system is a medical system that allows doctors in clinics to use the extra equipment in hospitals (beds, laboratory, operating room, etc.) for their patients' care under a contract between the doctors and hospitals. Therefore, the system is very beneficial in terms of the efficiency of the usage of medical resources. However, it is necessary to develop a strong support system to strengthen its weaknesses and supplement its merits. If doctors use hospital beds under the attending system of hospitals, they would be able to check a patient's condition often and provide them with nursing care services. However, the current attending system lacks delivery and assistance support. Thus, for the successful performance of the attending system, a networking system should be developed to facilitate communication between the doctors and nurses. In particular, the nursing records in the attending system could help doctors monitor the patient's condition and provision of nursing care services. A nursing record is the formal documentation associated with nursing care. It is merely a data repository that helps nurses to track their activities; nursing records thus represent a resource of primary information that can be reused. In order to maximize their usefulness, nursing records have been introduced as part of computerized patient records. However, nursing records are internal data that are not disclosed by hospitals. Moreover, the lack of standardization of the record list makes it difficult to share nursing records. Under the attending system, nurses would want to minimize the amount of effort they have to put in for the maintenance of additional records. Hence, they would try to maintain the current level of nursing records in the form of record lists and record attributes, while doctors would require more detailed and real-time information about their patients in order to monitor their condition. 
Therefore, this study developed a system for assisting in the maintenance and sharing of the nursing records under the attending system. In contrast to previous research on the functionality of computer-based nursing records, we have emphasized the practical usefulness of nursing records from the viewpoint of the actual implementation of the attending system. We suggested that nurses could design a nursing record dictionary for their convenience, and that doctors and nurses could confirm the definitions that they looked up in the dictionary through negotiations with intelligent agents. Such an agent-based system could facilitate networking among medical institutes. Multi-agent systems are a widely accepted paradigm for the distribution and sharing of computation workloads in the scientific community. Agent-based systems have been developed with differences in functional cooperation, coordination, and negotiation. To increase such communication, a framework for a multi-agent based system is proposed in this study. The agent-based approach is useful for developing a system that promotes trade-offs between transactions involving multiple attributes. A brief summary of our contributions follows. First, we propose an efficient and accurate utility representation and acquisition mechanism based on a preference scale while minimizing user interactions with the agent. Trade-offs between various transaction attributes can also be easily computed. Second, by providing a multi-attribute negotiation framework based on the attribute utility evaluation mechanism, we allow both the doctors in charge and nurses to negotiate over various transaction attributes in the nursing record lists that are defined by the latter. Third, we have designed the architecture of the nursing record management server and a system of agents that provides support to the doctors and nurses with regard to the framework and mechanisms proposed above. 
A formal protocol has also been developed to create and control the communication required for negotiations. We verified the realization of the system by developing a web-based prototype. The system was implemented using ASP and IIS5.1.

A Study on the Effect of Using Sentiment Lexicon in Opinion Classification (오피니언 분류의 감성사전 활용효과에 대한 연구)

  • Kim, Seungwoo;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.133-148 / 2014
  • Recently, with the advent of various information channels, the amount of available data has continued to grow. The main cause of this phenomenon can be found in the significant increase of unstructured data, as the use of smart devices enables users to create data in the form of text, audio, images, and video. Among the various types of unstructured data, users' opinions and a variety of information are clearly expressed in text data such as news, reports, papers, and various articles. Thus, active attempts have been made to create new value by analyzing these texts. The representative techniques used in text analysis are text mining and opinion mining. These share certain important characteristics; for example, they not only use text documents as input data, but also use many natural language processing techniques such as filtering and parsing. Therefore, opinion mining is usually recognized as a sub-concept of text mining, or, in many cases, the two terms are used interchangeably in the literature. Suppose that the purpose of a certain classification analysis is to predict a positive or negative opinion contained in some documents. If we focus on the classification process, the analysis can be regarded as a traditional text mining case. However, if we observe that the target of the analysis is a positive or negative opinion, the analysis can be regarded as a typical example of opinion mining. In other words, two methods (i.e., text mining and opinion mining) are available for opinion classification. Thus, in order to distinguish between the two, a precise definition of each method is needed. In this paper, we found that it is very difficult to distinguish between the two methods clearly with respect to the purpose of analysis and the type of results. We conclude that the most definitive criterion to distinguish text mining from opinion mining is whether an analysis utilizes any kind of sentiment lexicon. 
We first established two prediction models, one based on opinion mining and the other on text mining. Next, we compared the main processes used by the two prediction models. Finally, we compared their prediction accuracy. For this comparison, we analyzed 2,000 movie reviews. The results revealed that the prediction model based on opinion mining showed higher average prediction accuracy than the text mining model. Moreover, in the lift chart generated by the opinion mining based model, the prediction accuracy for the documents with strong certainty was higher than that for the documents with weak certainty. Most of all, opinion mining has a meaningful advantage in that it can reduce learning time dramatically, because a sentiment lexicon generated once can be reused in a similar application domain. Additionally, the classification results can be clearly explained by using a sentiment lexicon. This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small number of movie reviews. Second, various parameters in the parsing and filtering steps of the text mining may have affected the accuracy of the prediction models. Nevertheless, this research contributes a performance comparison of text mining analysis and opinion mining analysis for opinion classification. In future research, a more precise evaluation of the two methods should be made through intensive experiments.
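
The distinguishing criterion named in this abstract, whether an analysis uses a sentiment lexicon, can be made concrete with a small Python sketch of lexicon-based classification. The lexicon entries, the whitespace tokenizer, and the function names below are hypothetical; the abstract does not describe the authors' lexicon or scoring rule.

```python
# Toy sentiment lexicon (hypothetical entries): word -> polarity.
SENTIMENT_LEXICON = {
    "great": 1,
    "enjoyable": 1,
    "brilliant": 1,
    "boring": -1,
    "dull": -1,
    "awful": -1,
}


def lexicon_score(text, lexicon=SENTIMENT_LEXICON):
    """Sum the polarity of every lexicon word found in the text."""
    tokens = text.lower().split()
    return sum(lexicon.get(token.strip(".,!?"), 0) for token in tokens)


def classify(text):
    """Label a review by the sign of its lexicon score."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("A brilliant and enjoyable film!"))
print(classify("Dull plot, awful acting."))
```

Because the label follows directly from the matched words, each decision can be traced back to specific lexicon entries, which is the explainability advantage the abstract points to. A lexicon built once can also be reused on similar review data without retraining.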

Case Study on the Enterprise Microblog Usage: Focusing on Knowledge Management Strategy (기업용 마이크로블로그의 사용행태에 대한 사례연구: 지식경영전략을 중심으로)

  • Kang, Min Su;Park, Arum;Lee, Kyoung-Jun
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.47-63 / 2015
  • As knowledge gains attention as a new production factor that generates added value, studies continue to apply knowledge management to the business environment. In addition, as ICT (Information Communication Technology) has been incorporated into the business environment, it has increased the task efficiency and productivity of individual workers. Accordingly, the way that a business achieves its goals has changed to one in which its individual members are willing to take part in the organization and share information to create new value (Han, 2003), and studies on systems and services to support this transition are being carried out. Of late, a new concept called 'Enterprise 2.0' has appeared. It is the extension of Web 2.0 and its technology, which focus on participation, sharing, and openness, to the work environment of a business (Jung, 2013). Enterprise 2.0 is being used as a collaborative tool to support individual creativity and collective intelligence by combining Web 2.0 technologies such as blogs, wikis, RSS, and tags with business software (McAfee, 2006). As Twitter has become popular, the Enterprise Microblog (EMB), which is an example of Enterprise 2.0, has been developed as the business equivalent of Twitter, and SaaS (Software as a Service) products such as Yammer have been introduced. Studies of EMB mainly focus on demonstrating its usability in terms of intra-firm communication and knowledge management. However, existing studies lean too much towards large companies and certain departments, rather than a company as a whole. Therefore, few studies have been conducted on small and medium-sized companies, which have difficulty preparing separate resources and dedicating staff to introduce knowledge management. In this respect, the present study placed its analytic focus on small companies actually equipped with EMB to learn how they use it. 
Based on the findings, this study examined their knowledge management strategies for EMB from the viewpoint of codification and personalization. The hypothesis, "as a company grows, it shifts EMB strategy from codification to personalization", was established on the basis of a review of previous studies and literature. To test the hypothesis, this study analyzed the usage of EMB by small companies that have used it since their foundation. For the case study, the duration of use was divided into 2 spans, and longitudinal analysis was employed to examine the contents of the blogs. Using the key findings of the analysis, this study aims to propose practical implications for the operation of knowledge management in small companies and for the suitable application of a knowledge management system. Knowledge Management Strategy can be classified into codification strategy and personalization strategy (Hansen et al., 1999), and how to manage the two strategies has long been studied. Also, current studies of knowledge management strategy have mostly targeted major companies, resulting in a lack of studies on how it can be applied to SMEs. This research takes the Enterprise Microblog (EMB) as a knowledge management tool suited for SMEs and reviews its application to SMEs' Knowledge Management Strategy from the perspective of Codification and Personalization Strategies. Based on previous research on Knowledge Management Strategy and EMB, the hypothesis is set that "Depending on the development of the company, the main application of EMB alters from Codification Strategy to Personalization Strategy". To check the hypothesis, an SME that has used the EMB called 'Yammer' was analyzed from the date of its foundation until today. The case study implemented a longitudinal analysis that divides the period when the EMB was used into three stages and analyzes the contents. 
The results of the study suggest substantial implications for the application of a Knowledge Management Strategy and a Knowledge Management System suitable for SMEs.

Improving Performance of Recommendation Systems Using Topic Modeling (사용자 관심 이슈 분석을 통한 추천시스템 성능 향상 방안)

  • Choi, Seongi;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.101-116 / 2015
  • Recently, due to the development of smart devices and social media, vast amounts of information in various forms have accumulated. In particular, considerable research efforts are being directed towards analyzing unstructured big data to resolve various social problems. Accordingly, the focus of data-driven decision-making is moving from structured data analysis to unstructured data analysis. Also, in the field of recommendation systems, which is a typical area of data-driven decision-making, the need to use unstructured data to improve system performance has steadily increased. Approaches to improving the performance of recommendation systems fall into two groups: improving algorithms and acquiring useful, high-quality data. Traditionally, most efforts to improve the performance of recommendation systems have taken the former approach, while the latter has attracted relatively little attention. In this sense, efforts to utilize unstructured data from various sources are very timely and necessary. In particular, as the interests of users are directly connected with their needs, identifying the interests of the user through unstructured big data analysis can be a clue for improving the performance of recommendation systems. Accordingly, this study proposes a methodology for improving recommendation systems by measuring the interests of the user. Specifically, this study proposes a method to quantify the interests of the user by analyzing the user's internet usage patterns, and to predict the user's repurchase based upon the discovered preferences. There are two important modules in this study. The first module predicts the repurchase probability of each category by analyzing users' purchase history. We include the first module in our research scope in order to compare the accuracy of the traditional purchase-based prediction model with that of our new model presented in the second module. This procedure extracts the purchase history of users. 
The core part of our methodology is in the second module. This module extracts users' interests by analyzing news articles the users have read. The second module constructs a correspondence matrix between topics and news articles by performing topic modeling on real-world news articles. The module then analyzes users' news access patterns and constructs a correspondence matrix between articles and users. After that, by merging the results of the previous processes in the second module, we can obtain a correspondence matrix between users and topics. This matrix describes users' interests in a structured manner. Finally, by using the matrix, the second module builds a model for predicting the repurchase probability of each category. In this paper, we also provide experimental results of our performance evaluation. The outline of the data used in our experiments is as follows. We acquired web transaction data of 5,000 panels from a company that specializes in analyzing the rankings of internet sites. First, we extracted 15,000 URLs of news articles published from July 2012 to June 2013 from the original data and crawled the main contents of the news articles. After that, we selected 2,615 users who had read at least one of the extracted news articles. Among the 2,615 users, we found that the number of target users who purchased at least one item from our target shopping mall 'G' was 359. In the experiments, we analyzed the purchase history and news access records of these 359 internet users. From the performance evaluation, we found that our prediction model using both users' interests and purchase history outperforms a prediction model using only users' purchase history in terms of misclassification ratio. In detail, our model outperformed the traditional one in the appliance, beauty, computer, culture, digital, fashion, and sports categories when artificial neural network based models were used. 
Similarly, our model outperformed the traditional one in beauty, computer, digital, fashion, food, and furniture categories when decision tree based models were used although the improvement is very small.
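
The merging step in the second module, which combines a user-article matrix with an article-topic matrix to obtain a user-topic matrix, can be sketched in Python as follows. The abstract does not state how the merge is performed; a plain matrix product followed by row normalization is assumed here, and all values and names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def normalize_rows(matrix):
    """Scale each row to sum to 1 so rows read as interest distributions."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([value / total if total else 0.0 for value in row])
    return result


# Rows: users, columns: articles (1 = the user read the article).
user_article = [
    [1, 1, 0],
    [0, 0, 1],
]

# Rows: articles, columns: topics (topic weights, e.g. from topic modeling).
article_topic = [
    [0.8, 0.2],
    [0.6, 0.4],
    [0.1, 0.9],
]

# Rows: users, columns: topics (accumulated topic weight of articles read).
user_topic = normalize_rows(matmul(user_article, article_topic))
print(user_topic)
```

Each row of `user_topic` can then serve as a structured feature vector describing one user's interests, for example as input to a repurchase prediction model alongside purchase history.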

A Match-Making System Considering Symmetrical Preferences of Matching Partners (상호 대칭적 만족성을 고려한 온라인 데이트시스템)

  • Park, Yoon-Joo
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.177-192 / 2012
  • This is a study of match-making systems that considers the mutual satisfaction of matching partners. Recently, recommendation systems have been applied to people recommendation, such as recommending new friends, employees, or dating partners. One of the prominent domain areas is match-making systems that recommend suitable dating partners to customers. A match-making system, however, is different from a product recommender system. First, a match-making system needs to satisfy the recommended partners as well as the customer, whereas a product recommender system only needs to satisfy the customer. Second, match-making systems need to include as many participants in a matching pool as possible for their recommendation results, even with unpopular customers. In other words, recommendations should not be focused only on a limited number of popular people; unpopular people should also be listed on someone else's matching results. In product recommender systems, it is acceptable to recommend the same popular items to many customers, since these items can easily be additionally supplied. However, in match-making systems, there are only a few popular people, and they may become overburdened with too many recommendations. Also, a successful match could cause a customer to drop out of the matching pool. Thus, match-making systems should provide recommendation services equally to all customers without favoring popular customers. The suggested match-making system, called Mutually Beneficial Matching (MBM), considers the reciprocal satisfaction of both the customer and the matched partner and also considers the number of customers who are excluded in the matching. A brief outline of the MBM method is as follows: First, it collects a customer's profile information, his/her preferable dating partner's profile information and the weights that he/she considers important when selecting dating partners. 
Then, it calculates the preference score of a customer to certain potential dating partners on the basis of the difference between them. The preference score of a certain partner to a customer is also calculated in this way. After that, the mutual preference score is produced by the two preference values calculated in the previous step using the proposed formula in this study. The proposed formula reflects the symmetry of preferences as well as their quantities. Finally, the MBM method recommends the top N partners having high mutual preference scores to a customer. The prototype of the suggested MBM system is implemented by JAVA and applied to an artificial dataset that is based on real survey results from major match-making companies in Korea. The results of the MBM method are compared with those of the other two conventional methods: Preference-Based Matching (PBM), which only considers a customer's preferences, and Arithmetic Mean-Based Matching (AMM), which considers the preferences of both the customer and the partner (although it does not reflect their symmetry in the matching results). We perform the comparisons in terms of criteria such as average preference of the matching partners, average symmetry, and the number of people who are excluded from the matching results by changing the number of recommendations to 5, 10, 15, 20, and 25. The results show that in many cases, the suggested MBM method produces average preferences and symmetries that are significantly higher than those of the PBM and AMM methods. Moreover, in every case, MBM produces a smaller pool of excluded people than those of the PBM method.
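
The difference between averaging two preference scores and rewarding their symmetry can be illustrated with a short Python sketch. The abstract does not give the MBM formula, so the harmonic mean below is only a stand-in chosen because it penalizes lopsided pairs; the arithmetic mean corresponds to the AMM baseline as described. The candidate names and scores are hypothetical.

```python
def arithmetic_mean(p_ab, p_ba):
    """AMM-style score: average of the two one-way preference scores."""
    return (p_ab + p_ba) / 2


def harmonic_mean(p_ab, p_ba):
    """Stand-in mutual score (an assumption, not the paper's formula).

    It is high only when both one-way scores are high and similar.
    """
    if p_ab + p_ba == 0:
        return 0.0
    return 2 * p_ab * p_ba / (p_ab + p_ba)


def top_n(candidates, n, score_fn):
    """Return the n candidate names with the highest mutual score."""
    ranked = sorted(
        candidates,
        key=lambda name: score_fn(*candidates[name]),
        reverse=True,
    )
    return ranked[:n]


# name -> (customer's preference for partner, partner's preference for customer)
candidates = {
    "partner_x": (0.9, 0.1),  # strongly one-sided
    "partner_y": (0.5, 0.5),  # moderate but symmetric
    "partner_z": (0.6, 0.3),
}

print(top_n(candidates, 2, harmonic_mean))
```

Under the arithmetic mean, the one-sided pair `partner_x` scores the same as the symmetric `partner_y`, whereas the symmetry-sensitive score ranks `partner_x` last. This is the kind of behavior the abstract attributes to a formula that reflects the symmetry of preferences as well as their quantities.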

A Study on the Intelligent Quick Response System for Fast Fashion (IQRS-FF) (패스트 패션을 위한 지능형 신속대응시스템(IQRS-FF)에 관한 연구)

  • Park, Hyun-Sung;Park, Kwang-Ho
    • Journal of Intelligence and Information Systems / v.16 no.3 / pp.163-179 / 2010
  • Recently, the concept of fast fashion has been drawing attention as customer needs diversify and supply lead times get shorter in the fashion industry. As competition has intensified, how quickly and efficiently customer needs are satisfied is emphasized as one of the critical success factors in the fashion industry. Because fast fashion is inherently susceptible to trends, it is very important for fashion retailers to make quick decisions regarding the items to launch, the quantity based on demand prediction, and the time to respond. The planning decisions must also be executed through the business processes of procurement, production, and logistics in real time. In order to adapt to this trend, the fashion industry urgently needs support from an intelligent quick response (QR) system. However, the traditional functions of QR systems have not been able to completely satisfy these demands of the fast fashion industry. This paper proposes an intelligent quick response system for fast fashion (IQRS-FF). We present models for the QR process, QR principles and execution, and QR quantity and timing computation. IQRS-FF models support decision makers by providing useful information with automated, rule-based algorithms. If the predefined conditions of a rule are satisfied, the actions defined in the rule are automatically taken or reported to the decision makers. In IQRS-FF, QR decisions are made in two stages: pre-season and in-season. In pre-season, master demand prediction is first performed based on macro-level analysis of factors such as the local and global economy, fashion trends, and competitors. The prediction feeds into master production and procurement planning. After checking the availability and delivery of materials for production, decision makers must make reservations or request procurements. For outsourced materials, they must check the availability and capacity of partners. 
With the master plans, the performance of QR during the in-season is greatly enhanced, and the decision to select QR items is made with full consideration of the availability of materials in the warehouse as well as partners' capacity. During in-season, the decision makers must find the right time for QR as the actual sales occur in stores. They then decide which items to QR based not only on qualitative criteria, such as opinions from salespersons, but also on quantitative criteria, such as sales volume, the recent sales trend, inventory level, the remaining period, the forecast for the remaining period, and competitors' performance. To calculate QR quantity in IQRS-FF, two calculation methods are designed: QR Index based calculation and attribute similarity based calculation using demographic clusters. In the early period of a new season, the attribute similarity based QR quantity calculation is preferable because there are not enough historical sales data. By analyzing the sales trends of categories or items that have similar attributes, the QR quantity can be computed. On the other hand, when there is enough information to analyze sales trends or to forecast, the QR Index based calculation method can be used. Having defined the models for QR decision making, we design KPIs (Key Performance Indicators) to test the reliability of the models in critical decision making: the difference in sales volume between QR items and non-QR items; the accuracy rate of QR; and the lead time spent on QR decision-making. To verify the effectiveness and practicality of the proposed models, a case study was performed for a representative fashion company that recently developed and launched the IQRS-FF. The case study shows that the average sales rate of QR items increased by 15%, the difference in sales rate between QR items and non-QR items increased by 10%, the QR accuracy was 70%, and the lead time for QR decreased dramatically from 120 hours to 8 hours.

Finding Weighted Sequential Patterns over Data Streams via a Gap-based Weighting Approach (발생 간격 기반 가중치 부여 기법을 활용한 데이터 스트림에서 가중치 순차패턴 탐색)

  • Chang, Joong-Hyuk
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.55-75
    • /
    • 2010
  • Sequential pattern mining aims to discover interesting sequential patterns in a sequence database, and it is one of the essential data mining tasks, widely used in application fields such as Web access pattern analysis, customer purchase pattern analysis, and DNA sequence analysis. General sequential pattern mining considers only the generation order of data elements in a sequence, so it can easily find simple sequential patterns but is limited in finding the more interesting sequential patterns needed in real-world applications. One of the essential research topics for overcoming this limit is weighted sequential pattern mining, in which not only the generation order of data elements but also their weights are considered in order to obtain more interesting sequential patterns. Recently, as data has increasingly taken the form of continuous data streams rather than finite stored data sets in various application fields, the database research community has begun focusing its attention on data stream processing. A data stream is a massive, unbounded sequence of data elements continuously generated at a rapid rate. In data stream processing, each data element should be examined at most once, and the memory used for the analysis must remain bounded even though new data elements are continuously generated. Moreover, newly generated data elements should be processed as fast as possible to produce an up-to-date analysis result, so that it can be used immediately upon request. To satisfy these requirements, data stream processing sacrifices some correctness of its analysis result by allowing a certain amount of error. Considering these changes in the form of data generated in real-world application fields, much research has been actively performed to find various kinds of knowledge embedded in data streams. 
These studies mainly focus on efficient mining of frequent itemsets and sequential patterns over data streams, which have proven useful in conventional data mining for finite data sets. In addition, mining algorithms have been proposed to efficiently reflect changes in data streams over time in their mining results. However, they have targeted naively interesting patterns, such as frequent patterns and simple sequential patterns, which are found intuitively, and have taken no interest in mining novel interesting patterns that better express the characteristics of the target data streams. Therefore, defining novel interesting patterns and developing a mining method to find them is a valuable research topic in data stream mining, and such patterns can be used effectively to analyze recent data streams. This paper proposes a gap-based weighting approach for sequential patterns and a mining method for weighted sequential patterns over sequence data streams that uses this approach. A gap-based weight of a sequential pattern can be computed from the gaps between data elements in the sequential pattern without any pre-defined weight information. That is, the gaps between data elements in each sequential pattern, as well as their generation order, are used to obtain the weight of the sequential pattern, which helps to find more interesting and useful sequential patterns. Recently, most computer application fields generate data in the form of data streams rather than finite data sets. Considering this change in data, the proposed method mainly focuses on sequence data streams.
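
To make the idea of a gap-based weight concrete, the sketch below derives a weight purely from the gaps between the elements of one pattern occurrence, with no pre-defined weight table. The abstract does not state the actual weighting formula, so the choice used here (the mean of 1/gap over consecutive elements, so that tightly spaced patterns score higher) and the function name `gap_based_weight` are assumptions made for illustration only.

```python
from typing import Sequence


def gap_based_weight(positions: Sequence[int]) -> float:
    """Weight one occurrence of a sequential pattern from its element gaps.

    `positions` holds the (strictly increasing) positions or timestamps at
    which the pattern's elements occurred. The formula is an assumed,
    illustrative one: the average of 1/gap over consecutive elements.
    """
    if len(positions) < 2:
        # A single-element pattern has no gaps; treat its weight as 1 (assumption).
        return 1.0
    gaps = [later - earlier for earlier, later in zip(positions, positions[1:])]
    if any(gap <= 0 for gap in gaps):
        raise ValueError("positions must be strictly increasing")
    return sum(1.0 / gap for gap in gaps) / len(gaps)


if __name__ == "__main__":
    print(gap_based_weight([1, 2, 3]))  # adjacent elements, smallest possible gaps
    print(gap_based_weight([1, 3, 7]))  # wider gaps give a lower weight
```

Under this assumed formula the weight lies in (0, 1] for integer positions, which makes it easy to combine with a support threshold.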

A Topic Modeling-based Recommender System Considering Changes in User Preferences (고객 선호 변화를 고려한 토픽 모델링 기반 추천 시스템)

  • Kang, So Young;Kim, Jae Kyeong;Choi, Il Young;Kang, Chang Dong
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.43-56
    • /
    • 2020
  • Recommender systems help users make the best choice among various options. They play a particularly important role on internet sites, where vast amounts of digital information are generated every second. Many studies on recommender systems have focused on recommendation accuracy. However, there are some problems to overcome before a recommender system can be commercially successful. First, recommender systems lack transparency; that is, users cannot know why products are recommended. Second, a recommender system cannot immediately reflect changes in user preferences; that is, although a user's preferences for products change over time, the recommender system must rebuild its model to reflect them. Therefore, in this study, we propose a recommendation methodology that applies topic modeling and sequential association rule mining to review data to solve these problems. Product reviews provide useful information for recommendations because they include not only product ratings but also varied content such as user experiences and emotional states. Reviews therefore imply user preferences for products, and topic modeling is useful for explaining why items are recommended to users. In addition, sequential association rule mining is useful for identifying changes in user preferences. The proposed methodology is divided into two phases. The first phase creates a user profile based on topic modeling: after topics are extracted from user reviews of products, a user profile over the topics is created. The second phase recommends products using sequential rules that appear in users' buying behaviors as time passes. The buying behaviors are derived from changes in each user's topics. 
A collaborative filtering-based recommendation system was developed as a benchmark, and we compared the performance of the proposed methodology with that of the benchmark using Amazon's review dataset. Accuracy, recall, precision, and F1 were used as evaluation metrics. For topic modeling, collapsed Gibbs sampling was conducted, and 15 topics were extracted. Among the main topics, topic 1, topic 3, topic 4, topic 7, topic 9, topic 13, and topic 14 are related to "comedy shows", "high-teen drama series", "crime investigation drama", "horror theme", "British drama", "medical drama", and "science fiction drama", respectively. In the comparative analysis, the proposed methodology outperformed the collaborative filtering-based recommendation system. From the results, we found that the time just prior to the recommendation was very important for inferring changes in user preference. Therefore, the proposed methodology can not only secure the transparency of the recommender system but also reflect user preferences that change over time. However, the proposed methodology has some limitations. It cannot make fine-grained product recommendations when the number of products included in a topic is large. In addition, the number of sequential patterns is small because the number of topics is too small. Future research needs to address these limitations.
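
For reference, three of the evaluation metrics named in this abstract (precision, recall, and F1) can be computed for a single user's recommendation list as sketched below. This uses the standard set-based definitions; the function name `precision_recall_f1` and the sample item IDs are hypothetical, and the abstract does not say how the authors aggregated the metrics across users.

```python
from typing import Iterable, Set, Tuple


def precision_recall_f1(
    recommended: Iterable[str], relevant: Set[str]
) -> Tuple[float, float, float]:
    """Standard set-based precision, recall, and F1 for one user."""
    recommended_set = set(recommended)
    hits = len(recommended_set & relevant)  # recommended items the user actually chose
    precision = hits / len(recommended_set) if recommended_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical item IDs: 4 items recommended, 3 actually purchased, 2 in common.
    p, r, f = precision_recall_f1(["a", "b", "c", "d"], {"a", "c", "e"})
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

F1 is the harmonic mean of precision and recall, so it rewards a recommender only when both are reasonably high.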