• Title/Summary/Keyword: similar operators

Search Result 124, Processing Time 0.021 seconds

A Collaborative Filtering System Combined with Users' Review Mining : Application to the Recommendation of Smartphone Apps (사용자 리뷰 마이닝을 결합한 협업 필터링 시스템: 스마트폰 앱 추천에의 응용)

  • Jeon, ByeoungKug;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.1-18 / 2015
  • The collaborative filtering (CF) algorithm has been widely used for recommender systems in both academic and practical applications. A general CF system compares users based on how similar they are, and builds recommendations from the items favored by other people with similar tastes. Thus, it is very important for CF to measure the similarities between users, because the recommendation quality depends on it. In most cases, only users' explicit numeric ratings of items (i.e., quantitative information) have been used to calculate the similarities between users in CF. However, several studies have indicated that qualitative information such as users' reviews of the items may help measure these similarities more accurately. Considering that many people are likely to share their honest opinions on items they purchased recently thanks to the advent of Web 2.0, users' reviews can be regarded as an informative source for identifying users' preferences accurately. Against this background, this study proposes a new hybrid recommender system that incorporates users' review mining. Our proposed system is based on conventional memory-based CF, but it is designed to use both a user's numeric ratings and his/her text reviews of the items when calculating similarities between users. Specifically, our system creates not only a user-item rating matrix but also a user-item review term matrix. It then calculates rating similarity and review similarity from each matrix, and calculates the final user-to-user similarity based on these two similarities (i.e., rating and review similarities). As methods for calculating review similarity between users, we proposed two alternatives: one uses the frequency of the commonly used terms, and the other uses the sum of the importance weights of the commonly used terms in users' reviews. 
In the case of the importance weights of terms, we proposed the use of average TF-IDF (Term Frequency - Inverse Document Frequency) weights. To validate the applicability of the proposed system, we applied it to the implementation of a recommender system for smartphone applications (hereafter, apps). At present, over a million apps are offered in each of the app stores operated by Google and Apple. Due to this information overload, users have difficulty selecting the apps that they really want. Furthermore, app store operators such as Google and Apple have accumulated a huge amount of users' reviews on apps. Thus, we chose smartphone app stores as the application domain of our system. To collect the experimental data set, we built and operated a Web-based data collection system for about two weeks. As a result, we obtained 1,246 valid responses (ratings and reviews) from 78 users. The experimental system was implemented using Microsoft Visual Basic for Applications (VBA) and SAS Text Miner. To avoid distortion due to human intervention, we did not apply any manual refinement during the review mining process. To examine the effectiveness of the proposed system, we compared its performance with that of a conventional CF system. The performance of the recommender systems was evaluated using the average MAE (mean absolute error). The experimental results showed that our proposed system (MAE = 0.7867 ~ 0.7881) slightly outperformed the conventional CF system (MAE = 0.7939). They also showed that calculating review similarity between users based on the TF-IDF weights (MAE = 0.7867) led to better recommendation accuracy than calculating it based on the frequency of the commonly used terms in reviews (MAE = 0.7881). 
The results of a paired-samples t-test showed that our proposed system with review similarity calculated from the frequency of the commonly used terms outperformed the conventional CF system at the 10% statistical significance level. Our study sheds light on the application of users' review information for facilitating electronic commerce by recommending proper items to users.
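
The similarity calculation described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (which used VBA and SAS Text Miner): the abstract does not say which rating similarity measure was used or how the two similarities are combined, so the cosine measure, the linear blend, and the `alpha` parameter below are hypothetical choices, as are all function names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts.

    Used here as a stand-in rating similarity (an assumption; the
    abstract does not name the measure).
    """
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def common_term_count(terms_a, terms_b):
    """Alternative 1: number of distinct terms both users wrote in reviews."""
    return len(set(terms_a) & set(terms_b))


def common_term_weight(terms_a, terms_b, weights):
    """Alternative 2: sum of importance weights (e.g. average TF-IDF)
    over the terms both users wrote. `weights` maps term -> weight."""
    return sum(weights.get(t, 0.0) for t in set(terms_a) & set(terms_b))


def hybrid_similarity(rating_sim, review_sim, alpha=0.5):
    """Blend the two similarities linearly (hypothetical combination rule).

    Assumes review_sim has already been scaled to the same range as
    rating_sim.
    """
    return alpha * rating_sim + (1 - alpha) * review_sim


def mae(actual, predicted):
    """Mean absolute error between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

In a memory-based CF setting, `hybrid_similarity` would replace the plain rating similarity when picking and weighting neighbors, and `mae` would be computed over held-out ratings.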

Semantic Process Retrieval with Similarity Algorithms (유사도 알고리즘을 활용한 시맨틱 프로세스 검색방안)

  • Lee, Hong-Joo;Klein, Mark
    • Asia pacific journal of information systems / v.18 no.1 / pp.79-96 / 2008
  • One of the roles of the Semantic Web services is to execute dynamic intra-organizational services including the integration and interoperation of business processes. Since different organizations design their processes differently, the retrieval of similar semantic business processes is necessary in order to support inter-organizational collaborations. Most approaches for finding services that have certain features and support certain business processes have relied on some type of logical reasoning and exact matching. This paper presents our approach of using imprecise matching for expanding results from an exact matching engine to query the OWL(Web Ontology Language) MIT Process Handbook. MIT Process Handbook is an electronic repository of best-practice business processes. The Handbook is intended to help people: (1) redesigning organizational processes, (2) inventing new processes, and (3) sharing ideas about organizational practices. In order to use the MIT Process Handbook for process retrieval experiments, we had to export it into an OWL-based format. We model the Process Handbook meta-model in OWL and export the processes in the Handbook as instances of the meta-model. Next, we need to find a sizable number of queries and their corresponding correct answers in the Process Handbook. Many previous studies devised artificial dataset composed of randomly generated numbers without real meaning and used subjective ratings for correct answers and similarity values between processes. To generate a semantic-preserving test data set, we create 20 variants for each target process that are syntactically different but semantically equivalent using mutation operators. These variants represent the correct answers of the target process. We devise diverse similarity algorithms based on values of process attributes and structures of business processes. 
We use simple similarity algorithms for text retrieval, such as TF-IDF and Levenshtein edit distance, to devise our approaches, and utilize a tree edit distance measure because semantic processes appear to have a graph structure. We also design similarity algorithms that consider the similarity of process structure, such as part process, goal, and exception. Since we can identify relationships between a semantic process and its subcomponents, this information can be utilized for calculating similarities between processes. Dice's coefficient and Jaccard similarity measures are utilized to calculate the proportion of overlap between processes in diverse ways. We perform retrieval experiments to compare the performance of the devised similarity algorithms. We measure the retrieval performance in terms of precision, recall, and the F measure, the harmonic mean of precision and recall. The tree edit distance shows the poorest performance on all measures. TF-IDF and the method combining the TF-IDF measure with Levenshtein edit distance perform better than the other devised methods. These two measures focus on the similarity between the names and descriptions of processes. In addition, we calculate a rank correlation coefficient, Kendall's tau b, between the number of process mutations and the ranking of similarity values among the mutation sets. In this experiment, similarity measures based on process structure, such as Dice's, Jaccard, and derivatives of these measures, show greater coefficients than measures based on values of process attributes. However, the Lev-TFIDF-JaccardAll measure, which considers process structure and attribute values together, shows reasonably better performance in these two experiments. For retrieving semantic processes, it therefore seems better to consider diverse aspects of process similarity, such as process structure and values of process attributes. 
We generate semantic process data and a data set for retrieval experiments from the MIT Process Handbook repository. We suggest imprecise query algorithms that expand retrieval results from an exact matching engine such as SPARQL, and compare the retrieval performance of the similarity algorithms. As for limitations and future work, we need to perform experiments with other data sets from other domains. In addition, since there are many similarity values from diverse measures, we may find better ways to identify relevant processes by applying these values simultaneously.
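
Several of the building blocks named in this abstract are standard and can be sketched in a few lines. The code below is only an illustration of the textbook definitions of Levenshtein edit distance, the Jaccard and Dice overlap measures, and precision, recall, and F measure; the function names are hypothetical, and how the paper applies these measures to process names, descriptions, and subcomponents is not shown here.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def jaccard(x, y):
    """Jaccard similarity: |X ∩ Y| / |X ∪ Y|."""
    x, y = set(x), set(y)
    union = x | y
    return len(x & y) / len(union) if union else 1.0


def dice(x, y):
    """Dice's coefficient: 2 |X ∩ Y| / (|X| + |Y|)."""
    x, y = set(x), set(y)
    total = len(x) + len(y)
    return 2 * len(x & y) / total if total else 1.0


def precision_recall_f(retrieved, relevant):
    """Precision, recall, and their harmonic mean (F measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

For example, `jaccard` and `dice` could be applied to the sets of part processes, goals, or exceptions of two processes, while `levenshtein` would compare their names.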

A Study on the Effects of the Dine-out Franchise Headquarter's Management and Support Policies and Franchise Business Operator's Managerial Characteristics on the Bilateral Relationship and Franchise Store's Satisfaction (외식 프랜차이즈 가맹본부의 관리 및 지원정책과 가맹점 사업자의 경영자적 특성이 양자간 관계와 가맹점의 만족에 미치는 영향에 관한 연구)

  • Seo, SangYun;Jang, JaeNam
    • Journal of Distribution Research / v.17 no.4 / pp.81-101 / 2012
  • A franchise system develops competitive products for a franchise store through the system established by the franchise head office. Therefore, it has advantages of expanding the marketing effect since the risk of failure is reduced for a founder and the franchise head office supports the overall sales, advertisement and promotional activities. Also, a franchise store has advantages of fulfilling necessary facilities and tools on advantageous terms, reducing expenses by purchasing in bulk, and getting a supply of products with stable qualities. However, aside from such advantages, franchise head offices are forcing franchise stores to make unnecessary investments in equipments and remodel the interior. Also, franchise business operators are being made to share the cost of marketing and multiple franchise stores are being approved within the same business district, and franchise business operators are suffering damages. Therefore, cases of shutting down a franchise store or not renewing the contract are frequent. From the position of a franchise head office, profits that are generated from franchise fees, interior remodeling fees and supplying facilities and materials will increase as the number of new franchise stores increases. However, franchise stores are faced with difficulties due to excessive competitions between similar types of businesses and the overlapping of business districts that come from increases in the number of stores, and they eventually end up shutting down. Therefore, in order for a franchise business operator and franchise head office to grow and develop continuously, opening new stores is important, but successfully renewing the contract by maintaining a relationship with an existing franchise business operator is desirable. 
In this aspect, a study that examines the elements that can affect the relationship between a franchise business operator and franchise head office is believed to be important for the development of the franchise industry and creating safe jobs for the public. With an emphasis on the relationship between a franchise head office and franchise store, this study attempted to examine the effect of characteristics of a franchise head office and franchise business operator on the bilateral relationship such as the faith and immersion, and wished to review the effects of such faith and immersion on the satisfaction of a franchise store, including an intention of renewing the contract. In particular, in the current situation of great uncertainties in the market, this study also wished to examine how uncertain market elements will affect the relationship between the characteristics of a franchise head office and franchise business operator, and the faith and immersion. The study revealed that among the characteristics of a franchise head office, the standardization management of a franchise head office hinders a franchise store's faith and immersion in a franchise head office. Also, a franchise head office's support was shown to increase a franchise store's faith and immersion. However, it was revealed that a franchise head office's regulation and incentive policies for a franchise store do not affect a franchise store's faith and immersion. Among characteristics of a franchise business operator, a franchise store's healthy financial status and entrepreneur spirits were shown to enhance the faith and immersion in a franchise head office. However, it was shown that excellent business abilities of a franchise business operator actually reduce the immersion for a franchise head office. Also, the faith and immersion in a franchise head office were shown to enhance the intention of renewing the contract by increasing the satisfaction for a franchise head office. 
In addition, it was originally believed that the effects of a franchise business operator's characteristics on the faith and immersion in a franchise head office will vary depending on the market uncertainty, but the effect of a franchise business operator's characteristics depending on the recognition of uncertainties was shown to be insignificant. Such findings show that instead of making a franchise store pay for equipment investments and marketing and obtaining profits by force, a franchise head office should actively support a franchise store so that a franchise store's business activities can be conducted well, which will bring profits to a franchise store and ultimately to a franchise head office. This is a more desirable direction for the development of both parties. Implications of such findings are summarized as follows. First, it was shown that a franchise head office's standardization management actually reduces a franchise store's faith and immersion. Therefore, it is believed that instead of conducting standardization managements for regulating and managing franchise stores, measures should be developed so that franchise stores can actually participate voluntarily. For this, a head office should put in efforts to develop and provide standardized manuals, and make sure that a self-review system takes root. Second, a franchise head office's incentives did not have significant effects on the faith and immersion, but the support was shown to be effective. Therefore, it can be seen that instead of taking post-measures for a franchise store, taking pre-measures of actively supporting is more effective in maintaining a franchise store. Third, among characteristics of a franchise head office, it was shown that a franchise store's healthy financial status increased the faith and immersion in a franchise head office. 
Therefore, when selecting a franchise business operator, instead of thoughtlessly opening up franchise stores for the profit of a head office, it is believed that reviewing a franchise business operator's financial firepower and credit status is necessary. As for academic implications, previous studies examined the relationship by focusing on the characteristics of a franchise head office and franchise store, but this study focused on the characteristics of a franchise business operator. Therefore, this study dealt with the importance of a franchise business operator's competence, and is significant because it revealed the fact that a franchise business operator's excellent commercialization ability can become an element that hinders the immersion in a franchise head office. It was originally believed that a franchise store's characteristics will have different effects on the faith and immersion depending on the market uncertainty, but it was shown that the effect of a franchise store's characteristics depending on the recognition of uncertainties was insignificant, and that is the limitation of this study.

A Study on the Regional Characteristics of Broadband Internet Termination by Coupling Type using Spatial Information based Clustering (공간정보기반 클러스터링을 이용한 초고속인터넷 결합유형별 해지의 지역별 특성연구)

  • Park, Janghyuk;Park, Sangun;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.45-67 / 2017
  • According to the Internet Usage Research performed in 2016, the number of internet users and the amount of internet usage have been increasing. The smartphone, compared to the computer, is taking a more dominant role as an internet access device. As the number of smart devices has been increasing, some take the view that the demand for high-speed internet will decrease; however, despite the increase in smart devices, the high-speed Internet market is expected to grow slightly for a while due to the speedup of Giga Internet and the growth of the IoT market. As the broadband Internet market saturates, telecom operators are competing excessively to win new customers, but if they know the causes of customer exit, they can be expected to reduce marketing costs through more effective marketing. In this study, we analyzed the relationship between the cancellation rates of telecommunication products and the factors affecting them by combining the data of a telecommunication company for three cities, Anyang, Gunpo, and Uiwang, with regional data from KOSIS (Korean Statistical Information Service). In particular, we focused on the assumption that neighboring areas affect the distribution of the cancellation rates by coupling type, so we conducted spatial cluster analysis on the three types of cancellation rates of each region using the spatial analysis tool SatScan, and analyzed the various relationships between the cancellation rates and the regional data. In the analysis phase, we first summarized the characteristics of the clusters derived by combining spatial information and the cancellation data. Next, based on the results of the cluster analysis, analysis of variance, correlation analysis, and regression analysis were used to analyze the relationship between the cancellation rate data and the regional data. Based on the results of the analysis, we proposed appropriate marketing methods for each region. 
Unlike previous studies on regional characteristics analysis, this study is academically distinctive in that it performs clustering based on spatial information, so that regions with similar cancellation types in adjacent areas are grouped together. In addition, few previous studies on the determinants of subscription to high-speed Internet services have considered regional characteristics; in this study, we tried to analyze the relationship between the clusters and the regional characteristics data, assuming that different factors apply depending on the region. We also sought a more efficient marketing method that considers the characteristics of each region in new subscriptions and customer management for high-speed internet. The analysis of variance confirmed that there were significant differences in regional characteristics among the clusters, the correlation analysis showed a stronger correlation within the clusters than across all regions, and regression analysis was used to analyze the relationship between the cancellation rate and the regional characteristics. As a result, we found that the cancellation rate differs depending on the regional characteristics, and that differentiated marketing can be targeted at each region. The biggest limitation of this study is that it was difficult to obtain enough data to carry out the analysis. In particular, it is difficult to find variables that represent regional characteristics at the Dong unit. In other words, most of the data were disclosed at the city level rather than the Dong unit, which limited detailed analysis. Data such as income, card usage information, and telecommunications company policies or characteristics that could affect cancellation were not available at the time. The most urgent need for a more sophisticated analysis is to obtain Dong-unit data on regional characteristics. 
Future studies should conduct target marketing based on these results. It would also be meaningful to analyze the effect of marketing by comparing the results before and after target marketing. Using clusters based on new subscription data as well as cancellation data would also be effective.
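
The correlation and regression steps mentioned in this abstract can be illustrated with a small sketch. It assumes a single regional variable paired with a cancellation rate per region, which is a simplification; the abstract does not list the actual variables or the regression specification, and the function names are hypothetical. No data from the study is reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ols_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x
```

Running `pearson` once over all regions and once within each spatial cluster would allow the kind of comparison the abstract describes, where correlations inside clusters are set against the correlation across all regions.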