• Title/Summary/Keyword: Similarity measures


A Study on Relative Mutual Information Coefficients (상호정보량의 정규화에 대한 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.37 no.4
    • /
    • pp.178-198
    • /
    • 2003
  • Mutual information, as an association measure, has been used for various purposes as well as for calculating term similarity. There are, however, some limits to mutual information. It tends to overemphasize low-frequency terms because the marginal value of mutual information varies inversely with term frequency. To compensate for this limit, this study proposes relative mutual information (RMI) coefficients, which normalize mutual information, and examines their characteristics in some detail. The RMI coefficients also improve the effectiveness of global query expansion when applied to three different collections.
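The low-frequency bias described in this abstract can be illustrated with a minimal sketch. It assumes the common pointwise form of mutual information computed from document counts; the function name `pmi` and the counts are hypothetical, and the RMI normalization itself is not reproduced because the abstract does not give its formula.

```python
import math

def pmi(n_xy, n_x, n_y, n_docs):
    """Pointwise mutual information (base 2) from document counts.

    n_xy: documents containing both terms; n_x, n_y: documents containing
    each term; n_docs: collection size. Assumed textbook definition.
    """
    if n_xy == 0:
        return float("-inf")
    return math.log2((n_xy * n_docs) / (n_x * n_y))

# Two term pairs that co-occur perfectly, in a hypothetical 10,000-document collection.
rare_pair = pmi(2, 2, 2, 10_000)              # both terms appear in only 2 documents
common_pair = pmi(2000, 2000, 2000, 10_000)   # both terms appear in 2,000 documents

print(rare_pair, common_pair)
```

Both pairs are perfectly associated, yet the rare pair receives the much larger score, because the upper bound of this measure grows as term frequency falls.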

Developing a recommendation system for e-newspaper articles through personalizing digital contents

  • Ha Sung Ho;Yi Jae-Shin
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 2004.10a
    • /
    • pp.430-460
    • /
    • 2004
  • This study presents a personalization system based on a methodology that is applicable to digital content recommendation and can be run by Internet service providers. The system makes recommendations to users on the basis of their preferences, whereas most techniques for recommending digital content have focused on the similarity of content. In addition, the study develops an evaluation method to determine the priority of recommendations and adopts measures for selecting a set of recommendations. To test the feasibility and effectiveness of the presented methodology, a prototype system was developed and applied to an English newspaper on the Internet.


Traffic Modeling and Performance Analysis of Mobile Multimedia Data Services (이동통신 멀티미디어 데이터서비스의 트래픽 특성 모델링 및 성능분석)

  • 정용주;백천현;김후곤;최택진;양원석;황흥석
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.28 no.2
    • /
    • pp.139-155
    • /
    • 2003
  • The aim of this study is to identify the data traffic capacity of 3G mobile communication networks, especially cdma2000-1X networks. A three-layered ON/OFF traffic model is used to describe the dynamics of data traffic and the process of data transmission, such as packet scheduling. We construct a simulator that fully incorporates the packet handling process of the cdma2000-1X data network as well as the three-layered ON/OFF traffic model describing the behavior of source data traffic. To assess the influence of traffic parameters on performance measures, extensive simulations were performed for several data sets obtained from real trace data or previous studies. The experimental results show that the engineered throughput satisfying the QoS criteria is approximately 25% of total capacity. Finally, some proposals to improve the system capacity are presented.

Empirical Comparisons of Clustering Algorithms using Silhouette Information

  • Jun, Sung-Hae;Lee, Seung-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.10 no.1
    • /
    • pp.31-36
    • /
    • 2010
  • Many clustering algorithms have been used in diverse fields. When a given data set needs to be grouped into clusters, many clustering algorithms based on similarity or distance measures are considered. Most clustering work has been based on hierarchical and non-hierarchical clustering algorithms. Generally, researchers have chosen among these algorithms case by case, and they have had to determine the proper clustering method subjectively from their prior knowledge. In this paper, to address this subjectivity, we make empirical comparisons of popular hierarchical and non-hierarchical clustering algorithms using the silhouette measure. We use silhouette information to evaluate clustering results such as the number of clusters and the cluster variance. We verify our comparison study with experimental results on data sets from the UCI machine learning repository. As a result, clustering algorithms can be selected efficiently and objectively.
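The silhouette measure mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes the standard definition s(i) = (b - a) / max(a, b) with Euclidean distance; the function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def silhouette_scores(points, labels):
    """Per-point silhouette values s(i) = (b - a) / max(a, b).

    a: mean distance to the other members of the point's own cluster.
    b: smallest mean distance to the members of any other cluster.
    """
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # Collect distances from point i to every other point, grouped by cluster.
        by_cluster = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(other, []).append(math.dist(p, q))
        own = by_cluster.pop(lab, [])
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster or single cluster: use 0 by convention
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores

def mean_silhouette(points, labels):
    scores = silhouette_scores(points, labels)
    return sum(scores) / len(scores)

# Two well-separated groups, labelled sensibly and then badly.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = mean_silhouette(points, [0, 0, 1, 1])
bad = mean_silhouette(points, [0, 1, 0, 1])
print(good, bad)
```

A mean silhouette near 1 indicates compact, well-separated clusters, while a negative value indicates points that sit closer to another cluster than to their own, which is what makes the measure usable for comparing algorithms or cluster counts.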

Prediction Method for the Implicit Interpersonal Trust Between Facebook Users (페이스북 사용자간 내재된 신뢰수준 예측 방법)

  • Song, Hee Seok
    • Journal of Information Technology Applications and Management
    • /
    • v.20 no.2
    • /
    • pp.177-191
    • /
    • 2013
  • Social networks have been expected to increase the value of social capital through online user interactions that remove geographical boundaries. However, online users in social networks face the challenge of assessing whether an anonymous user and the information he or she provides are reliable, because they have limited experience with only a small number of users. Therefore, it is vital to provide a successful trust model that builds and maintains a web of trust. This study proposes a prediction method for interpersonal trust that measures the level of trust in an information provider on Facebook. To develop the prediction method, we first reviewed behavioral research on trust in social science and extracted five antecedents of trust: lenience, ability, steadiness, intimacy, and similarity. We then measured the antecedents from the history of interactive behavior and built prediction models using two decision trees and a computational model. We also applied the proposed method to predict interpersonal trust between Facebook users and evaluated the prediction accuracy. The predicted trust metric is dynamic and can be adjusted over time according to the interaction between two users.

Distance Measure for Images Using 2D Integra-Normalizer (2D 인테그라-노말라이저를 이용한 2D 영상간의 거리 측정방법)

  • Kim, Sung-Soo
    • The Transactions of the Korean Institute of Electrical Engineers A
    • /
    • v.48 no.4
    • /
    • pp.474-477
    • /
    • 1999
  • In this paper, a new method of measuring the distance between digital images, the 2D Integra-Normalizer, is proposed and compared with the grey block distance (GBD) to show its superiority. The 2D Integra-Normalizer removes the restriction that the images to be compared have dimension $2^{n}$, where n is a positive integer, which means that images of any dimension can be compared with the 2D Integra-Normalizer. In addition, the 2D Integra-Normalizer measures the distance between images in more detail than the GBD, with an interpretation that is closer to human intuitive understanding.


Comparing the Performance of Global Query Expansion according to Similarity Measures (유사계수에 따른 전역적 질의확장 검색 성능 비교)

  • 이재윤
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2003.10a
    • /
    • pp.526-528
    • /
    • 2003
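  • This study compares query expansion performance according to the characteristics of the similarity coefficients used to judge co-occurrence similarity in global query expansion retrieval based on co-occurrence frequency. First, an examination of the statistical characteristics of each similarity coefficient on a corpus and on retrieval test collections showed that the cosine and Jaccard coefficients tend to favor high-frequency terms, whereas mutual information and Yule's Y tend to favor low-frequency terms. In the query expansion retrieval experiments, coefficients that favor low-frequency terms performed better than those that favor high-frequency terms. In particular, when the document frequency (DF) of a query term is very low, close to 1, Yule's Y, unlike the other coefficients, favors high-frequency terms, which makes it more advantageous for query expansion retrieval than mutual information, which always favors low-frequency terms.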
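The four coefficients named in this abstract can be sketched from a 2x2 co-occurrence table. The definitions below are the common textbook forms and are an assumption; the paper may use variants. The cell names `a`, `b`, `c`, `d` and the function names are hypothetical.

```python
import math

# Hypothetical 2x2 table over documents for terms x and y:
# a: both terms present, b: only x, c: only y, d: neither.

def cosine(a, b, c, d):
    return a / math.sqrt((a + b) * (a + c))

def jaccard(a, b, c, d):
    return a / (a + b + c)

def yules_y(a, b, c, d):
    """Yule's Y (coefficient of colligation)."""
    ad, bc = math.sqrt(a * d), math.sqrt(b * c)
    return (ad - bc) / (ad + bc) if (ad + bc) > 0 else 0.0

def mutual_information(a, b, c, d):
    """Pointwise mutual information (base 2) of the co-occurrence cell."""
    n = a + b + c + d
    return math.log2(a * n / ((a + b) * (a + c))) if a > 0 else float("-inf")

table = (20, 30, 10, 940)          # x in 50 documents, y in 30, both in 20
independent = (10, 90, 90, 810)    # co-occurrence exactly at chance level

for name, f in [("cosine", cosine), ("jaccard", jaccard),
                ("yules_y", yules_y), ("mutual_information", mutual_information)]:
    print(name, f(*table), f(*independent))
```

Yule's Y and mutual information are both zero when the terms co-occur at chance level, while cosine and Jaccard stay positive whenever the terms share any document, which is one reason the coefficients rank candidate expansion terms differently.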


A Post-analysis of the Association Rule Mining Applied to Internet Shopping Mall

  • Kim, Jae-Kyeong;Song, Hee-Seok
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 2001.06a
    • /
    • pp.253-260
    • /
    • 2001
  • Understanding and adapting to changes in customer behavior is important for a company to survive in a continuously changing environment. The aim of this paper is to develop a methodology that automatically detects changes in customer behavior from customer profiles and sales data at different time snapshots. For this purpose, we first define three types of change: the emerging pattern, the unexpected change, and the added / perished rule. Then we develop similarity and difference measures for rule matching to detect all types of change. Finally, the degree of change is evaluated to detect significantly changed rules. The proposed methodology can evaluate the degree of change as well as detect all kinds of change automatically from data at different time snapshots. A case study for evaluation and the practical business implications of the methodology are also provided.


Non-Iterative Threshold based Recovery Algorithm (NITRA) for Compressively Sensed Images and Videos

  • Poovathy, J. Florence Gnana;Radha, S.
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.9 no.10
    • /
    • pp.4160-4176
    • /
    • 2015
  • Data compression, such as image and video compression, has come a long way since the introduction of Compressive Sensing (CS), which compresses sparse signals such as images and videos to very few samples, i.e. M < N measurements. At the receiver end, a robust and efficient recovery algorithm estimates the original image or video. Many prominent algorithms solve a least squares problem (LSP) iteratively to reconstruct the signal, and hence consume more processing time. In this paper, a non-iterative threshold based recovery algorithm (NITRA) is proposed for the recovery of images and videos without solving the LSP, claiming reduced complexity and better reconstruction quality. The elapsed time for images and videos using NITRA is in the ㎲ range, which is 100 times less than that of other existing algorithms. The peak signal to noise ratio (PSNR) is above 30 dB, and the structural similarity (SSIM) and structural content (SC) are 99%.
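The PSNR figure quoted in this abstract can be read against the usual definition, PSNR = 10 log10(MAX^2 / MSE). The sketch below assumes 8-bit pixel values (MAX = 255) stored as flat lists; the function name `psnr` and the sample values are hypothetical, and this is not the NITRA algorithm itself.

```python
import math

def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and the reconstruction.
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 4-pixel example: two pixels are off by one grey level.
example_psnr = psnr([100, 100, 100, 100], [101, 99, 100, 100])
print(example_psnr)
```

Higher values mean the reconstruction is closer to the original, and a perfect reconstruction gives an infinite PSNR.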

Fingerprint Pattern Recognition Algorithm (지문 Pattern 인식 Algorithm)

  • 김정규;김봉일
    • Korean Journal of Remote Sensing
    • /
    • v.3 no.1
    • /
    • pp.25-39
    • /
    • 1987
  • The purpose of this research is to develop an automatic fingerprint verification system that runs on a digital computer, specifically at the PC level. Fingerprints are used as a means of personal identity verification because of their high reliability and safety. The fingerprint pattern recognition algorithm consists of three stages: preprocessing, feature extraction, and recognition. The preprocessing stage includes smoothing, binarization, thinning, and restoration. The feature extraction stage includes the extraction of minutiae and their features. The recognition stage includes registration and the calculation of a matching score that measures the similarity between two images. Tests with 325 pairs of fingerprints resulted in 100% separation, which indicates the reliability of this algorithm.