• Title/Summary/Keyword: similarity weight

Search Result 376, Processing Time 0.028 seconds

Method of Related Document Recommendation with Similarity and Weight of Keyword (키워드의 유사도와 가중치를 적용한 연관 문서 추천 방법)

  • Lim, Myung Jin;Kim, Jae Hyun;Shin, Ju Hyun
    • Journal of Korea Multimedia Society / v.22 no.11 / pp.1313-1323 / 2019
  • With the development of the Internet and the spread of smartphones, services designed for user convenience are increasing, and users can check news in real time anytime and anywhere. However, online news is organized by media outlet and category and offers only a few related search terms, which makes it difficult to find news related to a given keyword. To solve this problem, we propose a method that recommends related documents more accurately by applying Doc2Vec similarity to the specific keywords of news articles and weighting the title and contents of the articles. We collect news articles from the Naver politics category by web crawling in a Java environment, preprocess them, extract topics using LDA modeling, and compute similarities using Doc2Vec. To supplement Doc2Vec, we apply TF-IDF to obtain TC (Title Contents) weights for the title and contents of the articles. We then combine the Doc2Vec similarity and the TC weight to produce a TC weight-similarity, and evaluate the similarity between words using the PMI technique to confirm the keyword associations.
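The final step of the abstract above, checking keyword association with PMI, can be illustrated with a minimal sketch. It uses the standard definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))), with probabilities estimated from document-level co-occurrence. The function name `pmi_table` and the toy documents are assumptions made for illustration; the abstract does not say how the authors estimated the probabilities.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(docs):
    """Return PMI (base 2) for every word pair that co-occurs in a document.

    Probabilities are estimated as document frequencies divided by the
    number of documents. This estimation choice is an assumption.
    """
    n = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words
    for doc in docs:
        words = sorted(set(doc))  # sorted, so each pair has one canonical order
        word_df.update(words)
        pair_df.update(combinations(words, 2))
    return {
        (x, y): math.log2((count / n) / ((word_df[x] / n) * (word_df[y] / n)))
        for (x, y), count in pair_df.items()
    }


# Hypothetical tokenized documents, not data from the paper.
docs = [
    ["election", "candidate", "vote"],
    ["election", "vote", "poll"],
    ["budget", "tax"],
    ["election", "candidate"],
]
scores = pmi_table(docs)
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 3))
```

Pairs that appear together more often than their individual frequencies would predict get a positive score. Pairs that never co-occur are simply absent from the table.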

Analysis Method for Revision and Addition of the Specification to Appraisal (감정 대상 규격서의 수정 및 추가에 대한 분석 방법)

  • Chun, Byung-Tae
    • Journal of Software Assessment and Valuation / v.16 no.2 / pp.37-44 / 2020
  • As the information society develops, various cases of copyright infringement have occurred. Software similarity appraisal plays a dominant role in many disputes between companies. This paper studies a method for calculating the similarity of specifications subject to appraisal, that is, for calculating the amount of revision and addition in the specification being assessed. The analysis method compares the tables of contents of the two specifications and finds the parts that are the same or similar. A similarity weight is then determined according to the degree of similarity, based on the expert's judgment of how similar the specifications are. If a part is completely new, the similarity weight is 1; if it is partially modified, the weight is 0.4; and if it is almost the same as before, the weight is 0.05. Applying this method, the similarity calculation for the specification yielded 21.2 pages.
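The weighting scheme in the abstract above can be sketched as a weighted page count. The three weights (1, 0.4, 0.05) come from the abstract. The assumption that each weight multiplies the page count of its section is an interpretation of the page-based result, and the section names and page counts are hypothetical.

```python
# Weights stated in the abstract: new = 1, partially modified = 0.4,
# almost the same as before = 0.05.
SIMILARITY_WEIGHTS = {"new": 1.0, "modified": 0.4, "same": 0.05}


def weighted_change_pages(sections):
    """Sum page counts scaled by the similarity weight of each section.

    `sections` is a list of (title, pages, kind) tuples, where `kind`
    is one of the keys of SIMILARITY_WEIGHTS.
    """
    return sum(pages * SIMILARITY_WEIGHTS[kind] for _, pages, kind in sections)


# Hypothetical table-of-contents comparison, not data from the paper.
sections = [
    ("1. Overview", 2.0, "same"),
    ("2. Interface", 5.0, "modified"),
    ("3. New module", 3.0, "new"),
]
total = weighted_change_pages(sections)
print(f"weighted amount of change: {total:.2f} pages")
```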

On the Study of Perfect Coverage for Recommender System

  • Lee, Hee-Choon;Lee, Seok-Jun
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1151-1160 / 2006
  • Pearson's correlation coefficient, a similarity weight used in recommender systems, has the weakness that it cannot produce a prediction for every target value. Vector similarity, another similarity weight, gives higher prediction coverage than Pearson's correlation coefficient but has the weakness of a high MAE. The purpose of this study is to suggest a way to raise the prediction coverage. The MAE and the prediction coverage of the suggested method were compared with those obtained using Pearson's correlation coefficient and using vector similarity. The MAE of the suggested method was higher than that obtained with Pearson's correlation coefficient but lower than that obtained with vector similarity. The prediction coverage of the suggested method was higher than that of both Pearson's correlation coefficient and vector similarity.
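The two evaluation measures compared in the abstract above can be sketched as follows. MAE is the mean absolute difference between actual and predicted ratings, and prediction coverage is the fraction of target ratings for which a prediction could be produced. Representing a missing prediction as `None` is an assumption made for this sketch, and the ratings are hypothetical.

```python
def mae_and_coverage(actual, predicted):
    """Return (MAE, coverage) for paired lists of ratings.

    `predicted` holds None where the recommender could not predict.
    MAE is computed only over the pairs that have a prediction.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if p is not None]
    coverage = len(pairs) / len(actual)
    if not pairs:
        return float("nan"), coverage
    mae = sum(abs(a - p) for a, p in pairs) / len(pairs)
    return mae, coverage


# Hypothetical test ratings; the second one could not be predicted.
actual = [4, 3, 5, 2]
predicted = [3.5, None, 4.0, 2.5]
mae, coverage = mae_and_coverage(actual, predicted)
print(f"MAE = {mae:.3f}, coverage = {coverage:.2f}")
```

Because MAE ignores the unpredictable cases, a method can have a low MAE and low coverage at the same time, which is why the abstract reports both.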

A relevance-based pairwise chromagram similarity for improving cover song retrieval accuracy (커버곡 검색 정확도 향상을 위한 적합도 기반 크로마그램 쌍별 유사도)

  • Jin Soo Seo
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.200-206 / 2024
  • Computing music similarity is an indispensable component in developing a music search service. To boost cover song identification accuracy, this paper proposes a relevance weight for each chromagram vector used in computing the music similarity function. We derive a music similarity function using the relevance weight based on the probabilistic relevance model, in which higher relevance weights are assigned to less frequently occurring, discriminant chromagram vectors and lower weights to more frequently occurring ones. Experiments performed on two cover music datasets show that the proposed music similarity improves cover song identification performance.

Comparative Evaluation of User Similarity Weight for Improving Prediction Accuracy in Personalized Recommender System (개인화 추천 시스템의 예측 정확도 향상을 위한 사용자 유사도 가중치에 대한 비교 평가)

  • Jung Kyung-Yong;Lee Jung-Hyun
    • Journal of the Institute of Electronics Engineers of Korea CI / v.42 no.6 / pp.63-74 / 2005
  • In electronic commerce, most recent personalized recommender systems apply the collaborative filtering technique. This method calculates a similarity weight among users with similar preferences in order to predict and recommend items that match a user's tastes. The Pearson correlation coefficient is commonly used for this purpose. However, a correlation can be calculated only when there are items that both users have rated in common, so the accuracy of prediction falls. The similarity weight affects not only the prediction of items that match a user's tastes but also the performance of the personalized recommender system. In this study, we examine the behavior of similarity weights based on vector similarity, entropy, inverse user frequency, and default voting from the information retrieval field, and verify the improvement in prediction accuracy through an experiment. The results show that combining the entropy-based similarity weight with default voting gave the best performance.
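The limitation described in the abstract above, that Pearson correlation needs co-rated items, can be seen in a minimal sketch of two user similarity weights. It assumes the usual collaborative-filtering formulations: Pearson correlation over co-rated items (with means taken over those items) and vector (cosine) similarity with unrated items treated as zero. The function names and ratings are hypothetical, and the entropy, inverse user frequency, and default voting variants are not shown.

```python
import math


def pearson_weight(ratings_a, ratings_b):
    """Pearson correlation over co-rated items, or None if undefined."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return None  # too few co-rated items to correlate
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    den = math.sqrt(var_a * var_b)
    return num / den if den else None  # zero variance: undefined


def vector_weight(ratings_a, ratings_b):
    """Cosine similarity of rating vectors; unrated items count as 0."""
    common = set(ratings_a) & set(ratings_b)
    num = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in ratings_a.values()))
    norm_b = math.sqrt(sum(v * v for v in ratings_b.values()))
    return num / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical users: item id -> rating.
alice = {"i1": 5, "i2": 3, "i3": 4}
bob = {"i1": 4, "i2": 2, "i3": 3, "i4": 5}
carol = {"i4": 2}

print(pearson_weight(alice, bob), vector_weight(alice, bob))
print(pearson_weight(alice, carol), vector_weight(alice, carol))
```

In this sketch Pearson returns `None` for a pair with no co-rated items, while vector similarity always returns a number, so more predictions remain possible.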

A Study on the Maximizing Coverage for Recommender System

  • Lee, Hee-Choon;Lee, Seok-Jun;Park, Ji-Won;Kim, Chul-Seoung
    • Korean Data and Information Science Society: Conference Proceedings / 2006.11a / pp.119-128 / 2006
  • Pearson's correlation coefficient, a similarity weight used in recommender systems, has the weakness that it cannot produce a prediction for every target value. Vector similarity, another similarity weight, gives higher prediction coverage than Pearson's correlation coefficient but has the weakness of a high MAE. The purpose of this study is to suggest a way to raise the prediction coverage. The MAE and the prediction coverage of the suggested method were compared with those obtained using Pearson's correlation coefficient and using vector similarity. The MAE of the suggested method was higher than that obtained with Pearson's correlation coefficient but lower than that obtained with vector similarity. The prediction coverage of the suggested method was higher than that of both Pearson's correlation coefficient and vector similarity.

On the Effect of Significance of Correlation Coefficient for Recommender System

  • Lee, Hee-Choon
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1129-1139 / 2006
  • Pearson's correlation coefficient and vector similarity are generally used as the user similarity weight in user-based recommender systems. This study examines how the correlation coefficient used as a similarity weight is affected by the number of paired responses and by the significance probability. Correlation coefficients are classified by a significance test and by the number of paired responses, and the change in MAE is studied by comparing the prediction accuracy of the two. The experimental results show that the change in MAE is related to the significance of the correlation coefficient and to the number of paired responses.

The evaluation of fabric on the Internet -The difference of cotton fabric texture perceived between on-line and off-line- (인터넷에서의 소재 평가에 대한 연구 -실물과 영상에서의 면직물 유사성 평가-)

  • 신혜원;이정순
    • Journal of the Korean Society of Clothing and Textiles / v.28 no.3_4 / pp.396-402 / 2004
  • The purpose of this study was to investigate the difference in cotton fabric texture as perceived on-line (fabric shown on screen) and off-line (real fabric), and to analyze the fabric characteristics that affect this difference. The perceived similarity between on-line and off-line was measured for 55 cotton fabrics by showing the real fabrics and their screen images simultaneously and using a 7-point scale questionnaire. Fabric characteristics such as weave structure, thickness, weight, fabric density, stiffness, Hunter's L, a, b, and hue were also measured. The cotton fabrics were classified into 3 groups by degree of similarity. There were no significant differences in weft density, stiffness, Hunter's L, a, b, or hue among the 3 groups, but there were significant differences in weave structure, thickness, weight, warp density, and the difference between warp and weft density. The fabrics with high similarity were thick and heavy, had a small warp density and a small difference between warp and weft density, and had a distinct surface texture. The group with medium similarity included fabrics of medium thickness and weight with a weak surface texture, a large warp density, and a large difference between warp and weft density. The group with low similarity, in which the differences between on-line and off-line were large, included thin and light fabrics with a smooth surface, a large warp density, and a large difference between warp and weft density.

Stability of the classifier based on fuzzy similarity in generalized Lukasiewicz Structure

  • Sampo, J.;Luukka, P.
    • Institute of Control, Robotics and Systems: Conference Proceedings / 2004.08a / pp.1324-1329 / 2004
  • In this article we test the stability of a classifier based on fuzzy similarity in the generalized Lukasiewicz structure. Two different stability tests were made: in one test, stability was checked with respect to the weight parameters, and the other test was carried out for the ideal vectors. The tests were made with three different classification problems.

Optimal Associative Neighborhood Mining using Representative Attribute (대표 속성을 이용한 최적 연관 이웃 마이닝)

  • Jung Kyung-Yong
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.4 s.310 / pp.50-57 / 2006
  • In electronic commerce, most recent personalized recommender systems apply the collaborative filtering technique. This method calculates a similarity weight among users with similar preferences in order to predict and recommend items that match a user's tastes. The Pearson correlation coefficient is commonly used for this purpose. However, a correlation can be calculated only when there are items that both users have rated in common, so the accuracy of prediction falls. The similarity weight affects not only the prediction of items that match a user's tastes but also the performance of the personalized recommender system. In this study, we examine the behavior of similarity weights based on vector similarity, entropy, inverse user frequency, and default voting from the information retrieval field, and verify the improvement in prediction accuracy through an experiment. The results show that combining the entropy-based similarity weight with default voting gave the best performance.