• Title/Summary/Keyword: Similarity matrix

Incorporating Social Relationship discovered from User's Behavior into Collaborative Filtering (사용자 행동 기반의 사회적 관계를 결합한 사용자 협업적 여과 방법)

  • Thay, Setha;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.1-20 / 2013
  • Social networks have become a major communication platform that lets people connect with one another and share common interests, experiences, and daily activities. Users spend hours per day maintaining personal information and interacting with other people through posts, comments, messages, games, social events, and applications. As the amount of user information distributed across social networks grows, there is great potential to use this social data to improve the quality of recommender systems. Several studies in social network analysis have investigated how social networks can be used in the recommendation domain. Among these, we are interested in exploiting the interactions between a user and others in a social network, which can be identified and described as social relationships. Furthermore, users' purchase decisions mostly depend on suggestions from people who have either the same preferences or a closer relationship. For this reason, we believe that users' relationships in a social network can provide an effective way to improve how well a recommender system predicts users' interests. Social relationships between users found in a social network are therefore a useful factor for improving preference prediction in the conventional approach. Recommender systems are rapidly increasing in popularity and are currently used by many e-commerce sites such as Amazon.com, Last.fm, and eBay.com. Collaborative filtering (CF) is one of the essential and powerful techniques in recommender systems for suggesting appropriate items to a user by learning the user's preferences. CF focuses on user data and automatically generates predictions about a user's interests by gathering information from users who share similar backgrounds and preferences.
Specifically, the intention of CF is to find users who have similar preferences and to suggest to the target user items that were mostly preferred by those nearest-neighbor users. CF considers two basic units: the user and the item. Each user needs to rate items (e.g., movies, products, or books) to indicate interest in them. CF then uses the user-rating matrix to find a group of users whose ratings are similar to the target user's and predicts unknown rating values for items the target user has not rated. CF has been successfully applied in both information filtering and e-commerce applications. However, important challenges remain, such as cold start, data sparsity, and scalability, which affect the quality and accuracy of prediction. To overcome these challenges, many researchers have proposed variants of CF such as hybrid CF, trust-based CF, and social network-based CF. To improve the recommendation performance and prediction accuracy of standard CF, in this paper we propose a method that integrates the traditional CF technique with social relationships between users discovered from their behavior in a social network, namely Facebook. We identify a user's relationships from behavior such as posts and comments exchanged with friends on Facebook. We believe that social relationships implicitly inferred from user behavior can compensate for the limitations of the conventional approach. Therefore, we extract the posts and comments of each user using the Facebook Graph API and calculate a feature score for each term to obtain a feature vector for computing user similarity. We then combine the result with the similarity value computed using the traditional CF technique. Finally, our system provides a list of recommended items based on the neighbor users who have the largest total similarity to the target user.
To verify and evaluate the proposed method, we performed an experiment on data collected from our Movies Rating System. Prediction accuracy is evaluated in terms of mean absolute error (MAE) to show how correct the algorithm's recommendations are. Recommendation performance is then evaluated in terms of precision, recall, and F1-measure to show the effectiveness of the method. Coverage is also evaluated to assess the ability to generate recommendations. The experimental results show that the proposed method outperforms the benchmark methods and suggests items to users more accurately. Using users' behavior in the social network improves recommendation accuracy by up to 6%. Moreover, the performance experiment shows that incorporating social relationships observed from user behavior into CF is beneficial, yielding a 7% improvement in recommendation performance compared with benchmark methods. Finally, we confirm that interaction between users in a social network can enhance accuracy and yield better recommendations than the conventional approach.
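The combination step described in this abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or combination rule the authors use, so the cosine similarity, the weighted sum, and the `alpha` parameter below are all assumptions, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(ratings_a, ratings_b, terms_a, terms_b, alpha=0.5):
    """Blend rating-based and behavior-based similarity.

    ratings_*: item -> rating (the user-rating matrix, one row per user).
    terms_*:   term -> feature score from the user's posts and comments.
    alpha:     hypothetical weight; the abstract does not give one.
    """
    rating_sim = cosine(ratings_a, ratings_b)
    social_sim = cosine(terms_a, terms_b)
    return alpha * rating_sim + (1 - alpha) * social_sim


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

With a blend like this, a neighbor who rates items similarly but shares no vocabulary with the target user is ranked lower than one who is similar on both signals.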

Genetic Differences and Variations in Freshwater Crab(Eriocheir sinensis) and Swimming Crab(Portunus trituberculatus) (참게(Eriocheir sinensis)와 꽃게(Portunus trituberculatus)의 유전적 차이와 변이)

  • Yoon, Jong-Man
    • Development and Reproduction / v.10 no.1 / pp.19-32 / 2006
  • Genomic DNA isolated from two Korean crab species, the freshwater crab (Eriocheir sinensis) and the swimming crab (Portunus trituberculatus), was amplified several times by PCR. Seven arbitrarily selected primers (OPA-05, OPA-13, OPA-16, OPB-06, OPB-15, OPB-17, and OPD-10) were used to generate identical, polymorphic, and specific fragments. A total of 505 fragments were identified in the freshwater crab and 513 in the swimming crab from Buan, including 81 specific fragments (16.0%) in the freshwater crab and 100 (19.5%) in the swimming crab. In the freshwater crab, 165 identical fragments were observed, an average of 23.6 per primer. In the swimming crab, 66 fragments were identified, an average of 9.4 per primer. The numbers of polymorphic fragments in the freshwater crab and swimming crab were 50 and 14, respectively. The oligonucleotide decamer primer OPB-17 generated identical DNA fragments of approximately 300 bp in both species. Compared separately, the average genetic difference was higher in the swimming crab than in the freshwater crab. The average genetic difference between the freshwater crab and the swimming crab was $0.726 \pm 0.004$. The dendrogram obtained with the seven primers indicates four genetic clusters: cluster 1 (FRESHWATER 01), cluster 2 (FRESHWATER 02, 03, 04, 05 and 06), cluster 3 (FRESHWATER 07, 08, 09, 10 and 11), and cluster 4 (SWIMMING 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 and 22). The shortest genetic distance displaying a significant molecular difference was between individuals SWIMMING no. 18 and SWIMMING no. 17 of the swimming crab (0.096). Individual no. 02 of the freshwater crab was most distantly related to freshwater crab no. 03 (genetic distance = 0.770). These results demonstrate the potential of RAPD-PCR to identify diagnostic markers for distinguishing the two crab species.

A Study on Music Summarization (음악요약 생성에 관한 연구)

  • Kim Sung-Tak;Kim Sang-Ho;Kim Hoi-Rin;Choi Ji-Hoon;Lee Han-Kyu;Hong Jin-Woo
    • Journal of Broadcast Engineering / v.11 no.1 s.30 / pp.3-14 / 2006
  • Music summarization is a technique that automatically generates the most important and representative part or parts of music content. Music summarization techniques have been studied in two categories according to the characteristics of the summary. The first provides a repeated part as the music summary; the second provides a combination of segments with different characteristics in the music content. In this paper, we propose and evaluate two kinds of music summarization techniques. The algorithm using multi-level vector quantization, which provides a repeated part as a fixed-length music summary, is evaluated by the overlapping ratio between hand-made repeated parts and the automatically generated summary. The overlapping ratios of the conventional methods are 42.2% and 47.4%, whereas that of the proposed method with a fixed-length summary is 67.1%. The optimal-length music summary is evaluated by the portion of overlap between the summary and the repeated part, whose length differs according to the music content, and the result shows that the automatically generated summary with optimal length expresses a more effective part than the fixed-length summary. The cluster-based algorithm using a 2-D similarity matrix and the k-means algorithm provides the combined segments as the music summary. To evaluate this algorithm, we use a MOS test consisting of two questions (How many similar segments are in the summarized music? How many segments are included in the same structure?), and the results show good performance.
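As an illustration of the 2-D similarity matrix mentioned in this abstract, the sketch below compares every audio frame with every other frame. The abstract does not specify the features or the similarity measure, so the per-frame feature vectors and the cosine measure are assumptions, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def self_similarity_matrix(frames):
    """Return an n x n matrix whose (i, j) entry compares frame i with frame j.

    frames: list of per-frame feature vectors (assumed input, e.g. spectral
    features); repeated sections show up as high off-diagonal values.
    """
    n = len(frames)
    return [[cosine(frames[i], frames[j]) for j in range(n)] for i in range(n)]


# Toy input: frames 0 and 2 are identical, frame 1 is unrelated.
matrix = self_similarity_matrix([[1, 0], [0, 1], [1, 0]])
```

Rows of such a matrix could then be grouped with a clustering algorithm such as k-means so that frames with similar profiles fall into the same segment class.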

Genetic Variation and Phylogenetic Relationship of Taraxacum Based on Chloroplast DNA (trnL-trnF and rps16-trnK) Sequences (엽록체 DNA (trnL-trnF, rps16-trnK) 염기서열에 의한 국내 민들레속 유전자원의 유전적 변이와 유연관계 분석)

  • Ryu, Jaihyunk;Lyu, Jae-il;Bae, Chang-Hyu
    • Korean Journal of Plant Resources / v.30 no.5 / pp.522-534 / 2017
  • This study investigated genetic variation in 24 Taraxacum accessions from various regions in South Korea based on the sequences of two chloroplast DNA (cpDNA) regions (trnL-trnF and rps16-trnK). T. mongolicum, T. officinale, and T. laevigatum were triploid, and T. coreanum and T. coreanum var. flavescens were tetraploid. The trnL-trnF region in native Korean dandelions (T. mongolicum, T. coreanum, and T. coreanum var. flavescens) ranged from 931 to 935 bp in length, and that of naturalized dandelions ranged from 910 bp (T. officinale) to 975 bp (T. laevigatum). The rps16-trnK region in T. mongolicum, T. coreanum, T. coreanum var. flavescens, T. officinale, and T. laevigatum was 882-883 bp, 875-881 bp, 878-883 bp, 874-876 bp, and 847-876 bp in length, respectively. The sequence similarity matrix of the trnL-trnF region ranged from 0.860 to 1.000 with an average of 0.949, and that of the rps16-trnK region ranged from 0.919 to 1.000 with an average of 0.967. According to the phylogenetic analysis, the Korean native taxa and the naturalized taxa were divided into independent clades in both cpDNA regions. T. coreanum var. flavescens clustered only with T. coreanum, and there were no significant differences in their nucleotide sequences. The finding that two accessions (T. coreanum; Jogesan, T. mongolicum; Gangyang) had a high level of genetic variation suggests their utility as breeding materials.

Professional Baseball Viewing Culture Survey According to Corona 19 using Social Network Big Data (소셜네트워크 빅데이터를 활용한 코로나 19에 따른 프로야구 관람문화조사)

  • Kim, Gi-Tak
    • Journal of Korea Entertainment Industry Association / v.14 no.6 / pp.139-150 / 2020
  • The data processing in this study focuses on Textom and social media words about three areas: 'Corona 19 and professional baseball', 'Corona 19 and professional baseball', and 'Corona 19 and professional sports'. The data were collected and refined in a web environment and then batch-processed, and the UCINET 6 program was used to visualize them. Specifically, the data were collected from Naver, Daum, and Google channels, and the extracted words were narrowed down to 30 words through expert meetings for use in the final study. The 30 extracted words were visualized as a matrix, and a CONCOR analysis was performed to identify clusters of similar and common words. The analysis showed that the clusters related to Corona 19 and professional baseball consisted of one central cluster and five peripheral clusters, and that content related to the opening of the professional baseball season during the Corona 19 wave was mainly searched. The cluster related to Corona 19 and unrelated to professional baseball consisted of one central cluster and five peripheral clusters, and keywords on the position of professional baseball regarding professional baseball games under Corona 19 were mainly searched. The cluster related to Corona 19 and professional sports consisted of one central cluster and five peripheral clusters, and keywords related to the start of professional sports in the aftermath of Corona 19 were mainly searched.
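For readers unfamiliar with CONCOR (convergence of iterated correlations), the sketch below shows the general idea: the columns of a word matrix are correlated repeatedly until every entry approaches +1 or -1, and the signs then split the words into two blocks. This is a minimal illustration, not UCINET's implementation; the 4 x 4 matrix is made-up toy data, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def correlation_matrix(matrix):
    """Correlate every column of the matrix with every other column."""
    cols = list(zip(*matrix))
    return [[pearson(ci, cj) for cj in cols] for ci in cols]


def concor_split(matrix, max_iter=100, tol=1e-9):
    """Split items into two blocks by iterating correlations to +1 / -1."""
    corr = correlation_matrix(matrix)
    for _ in range(max_iter):
        if all(abs(abs(v) - 1.0) < tol for row in corr for v in row):
            break
        corr = correlation_matrix(corr)
    # Items positively correlated with item 0 form one block, the rest the other.
    first = [i for i, v in enumerate(corr[0]) if v > 0]
    second = [i for i, v in enumerate(corr[0]) if v <= 0]
    return first, second


# Toy word-by-word matrix (made-up values): words 0-1 and words 2-3 have similar profiles.
toy = [[5, 4, 0, 1], [4, 5, 1, 0], [0, 1, 5, 4], [1, 0, 4, 5]]
blocks = concor_split(toy)
```

Applying the same split again inside each block yields a finer partition, which is how a single matrix can end up divided into several clusters.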