• Title/Summary/Keyword: Page-Rank

A Folksonomy Ranking Framework: A Semantic Graph-based Approach (폭소노미 사이트를 위한 랭킹 프레임워크 설계: 시맨틱 그래프기반 접근)

  • Park, Hyun-Jung;Rho, Sang-Kyu
    • Asia Pacific Journal of Information Systems / v.21 no.2 / pp.89-116 / 2011
  • In collaborative tagging systems such as Delicious.com and Flickr.com, users assign keywords, or tags, to their uploaded resources, such as bookmarks and pictures, for future use or sharing. The collection of resources and tags generated by a user is called a personomy, and the collection of all personomies constitutes the folksonomy. The most significant need of folksonomy users is to efficiently find useful resources or experts on specific topics. An excellent ranking algorithm would assign higher rankings to more useful resources or experts. What resources are considered useful in a folksonomic system? Does a standard superior to frequency or freshness exist? A resource recommended by more users with more expertise should be worthy of attention. This ranking paradigm can be implemented through a graph-based ranking algorithm. Two well-known representatives of such a paradigm are PageRank by Google and HITS (Hypertext Induced Topic Selection) by Kleinberg. Both PageRank and HITS assign a higher evaluation score to pages linked to a greater number of higher-scored pages. HITS differs from PageRank in that it utilizes two kinds of scores: authority and hub scores. The ranking objects of these algorithms are limited to Web pages, whereas the ranking objects of a folksonomic system are somewhat heterogeneous (i.e., users, resources, and tags). Therefore, uniformly applying the link-based voting notion of PageRank and HITS to a folksonomy would be unreasonable. In a folksonomic system, each link corresponding to a property can have an opposite direction, depending on whether the property is expressed in the active or the passive voice. The current research stems from the idea that a graph-based ranking algorithm could be applied to the folksonomic system using the concept of mutual interactions between entities, rather than the voting notion of PageRank or HITS.
The concept of mutual interactions, proposed for ranking Semantic Web resources, enables the calculation of importance scores of various resources unaffected by link directions. The weights of a property representing the mutual interaction between classes are assigned depending on the relative significance of the property to the resource importance of each class. This class-oriented approach is based on the fact that, in the Semantic Web, there are many heterogeneous classes; thus, applying a different appraisal standard for each class is more reasonable. This is similar to the way humans evaluate, where different items are assigned specific weights, which are then combined into a weighted average. We can check for missing properties more easily with this approach than with other predicate-oriented approaches. A user of a tagging system usually assigns more than one tag to the same resource, and there can be more than one tag with the same subject and object. When many users assign similar tags to the same resource, grading the users differently depending on the assignment order becomes necessary. This idea comes from studies in psychology in which expertise involves the ability to select the most relevant information for achieving a goal. An expert should be someone who not only has a large collection of documents annotated with a particular tag, but also tends to add documents of high quality to his or her collections. Such documents are identified by the number, as well as the expertise, of users who have the same documents in their collections. In other words, there is a relationship of mutual reinforcement between the expertise of a user and the quality of a document. In addition, there is a need to rank entities related more closely to a certain entity. Because the popularity of a topic on social media is temporary, recent data should have more weight than old data.
We propose a comprehensive folksonomy ranking framework that addresses all these considerations and can be easily customized to each folksonomy site for ranking purposes. To examine the validity of our ranking algorithm and show the mechanism of adjusting property, time, and expertise weights, we first use a dataset designed for analyzing the effect of each ranking factor independently. We then show the ranking results of a real folksonomy site, with the ranking factors combined. Because the ground truth of a given dataset is not known when it comes to ranking, we inject simulated data whose ranking results can be predicted into the real dataset and compare the ranking results of our algorithm with those of a previous HITS-based algorithm. Our semantic ranking algorithm based on the concept of mutual interaction seems to be preferable to the HITS-based algorithm as a flexible folksonomy ranking framework. Some concrete points of difference are as follows. First, with the time concept applied to the property weights, our algorithm shows superior performance in lowering the scores of older data and raising the scores of newer data. Second, by applying the time concept to the expertise weights as well as to the property weights, our algorithm controls the conflicting influence of expertise weights and enhances the overall consistency of time-valued ranking. The expertise weights of the previous study can act as an obstacle to time-valued ranking because the number of followers increases as time goes on. Third, many new properties and classes can be included in our framework. The previous HITS-based algorithm, based on the voting notion, loses ground where the domain consists of more than two classes, or where other important properties, such as "sent through twitter" or "registered as a friend," are added to the domain. Fourth, there is a big difference in calculation time and memory use between the two kinds of algorithms.
While the previous HITS-based algorithm has to execute the multiplication of two matrices twice, this is unnecessary with our algorithm. In our ranking framework, various folksonomy ranking policies can be expressed by combining the ranking factors, and our approach can work even if the folksonomy site is not implemented with Semantic Web languages. Above all, the time weight proposed in this paper will be applicable to various domains, including social media, where time value is considered important.
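
The authority and hub scores mentioned in this abstract can be illustrated with a minimal sketch. It is a textbook-style HITS iteration in Python, not the authors' ranking framework and not the HITS-based baseline they compare against; the function name `hits`, the toy edge list, and the fixed iteration count are assumptions made purely for illustration.

```python
import math


def hits(edges, iterations=50):
    """Return (authority, hub) score dicts for a list of directed (src, dst) links."""
    nodes = sorted({node for edge in edges for node in edge})
    authority = {node: 1.0 for node in nodes}
    hub = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        # Authority update: add up the hub scores of the pages linking to each node.
        new_authority = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_authority[dst] += hub[src]

        # Hub update: add up the fresh authority scores of the pages each node links to.
        new_hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_hub[src] += new_authority[dst]

        # Normalize both vectors to unit length so the scores stay bounded.
        a_norm = math.sqrt(sum(v * v for v in new_authority.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        authority = {node: v / a_norm for node, v in new_authority.items()}
        hub = {node: v / h_norm for node, v in new_hub.items()}

    return authority, hub


# Hypothetical toy graph: "a" and "b" link to "c"; "a" and "c" link to "d".
edges = [("a", "c"), ("b", "c"), ("a", "d"), ("c", "d")]
authority, hub = hits(edges)
for node in sorted(authority):
    print(node, round(authority[node], 3), round(hub[node], 3))
```

Each round makes two passes over the links, one for authorities and one for hubs; in matrix form this is one multiplication by the adjacency matrix and one by its transpose per iteration.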

New Evaluation Method of Patents by National R&D Program with Patent Citation Network Analysis (특허 인용 네트워크 분석을 활용한 국가연구개발사업 특허의 평가 방안)

  • Lim, Hongrae
    • Journal of Technology Innovation / v.27 no.4 / pp.1-19 / 2019
  • This study presents a new method for evaluating patents from public R&D programs using patent citation network analysis. I used forward citations, degree centrality, betweenness centrality, and PageRank as the dependent variables, which represent the quality of patents. The primary independent variable was a dummy for public R&D programs, and I controlled for patent characteristics, applicant characteristics, technological characteristics, and year effects. The empirical results show that patents from public R&D programs are superior to other patents with regard to the number of forward citations, degree centrality, betweenness centrality, and PageRank. This implies that patents from public R&D programs connect technologies directly and effectively, and that they connect important technologies.

A Comparative Study on the Centrality Measures for Analyzing Research Collaboration Networks (공동연구 네트워크 분석을 위한 중심성 지수에 대한 비교 연구)

  • Lee, Jae Yun
    • Journal of the Korean Society for Information Management / v.31 no.3 / pp.153-179 / 2014
  • This study explores the characteristics of centrality measures for analyzing researchers' impact and structural positions in research collaboration networks. We investigate four binary network centrality measures (degree centrality, closeness centrality, betweenness centrality, and PageRank) and seven existing weighted network centrality measures (triangle betweenness centrality, mean association, weighted PageRank, collaboration h-index, collaboration hs-index, complex degree centrality, and c-index) for research collaboration networks. We also propose SSR, a new weighted centrality measure for collaboration networks. Using research collaboration data from three different research domains (architecture, library and information science, and marketing), these twelve centrality measures are calculated and compared with each other. The results indicate that weighted network centrality measures are needed to consider collaboration strength as well as collaboration range in research collaboration networks. We also recommend that, when considering both collaboration strength and range, triangle betweenness centrality and SSR be applied to investigate global centrality and local centrality in collaboration networks.

The Distinct Impact Dimensions of the Prestige Indices in Author Citation Networks (저자 인용 네트워크에서 명망성 지표의 차별된 영향력 측정기준에 관한 연구)

  • Ahn, Hyerim;Park, Ji-Hong
    • Journal of the Korean Society for Information Management / v.33 no.2 / pp.61-76 / 2016
  • This study proposes three prestige indices (closeness prestige, input domain, and proximity prestige) as useful measures of the impact of a particular node in citation networks. It compares these prestige indices with other impact indices, as it is still unknown what dimensions of impact these indices actually measure. The prestige indices enable us to distinguish the most prominent actors in a directed network, similar to the centrality indices in undirected networks. Correlation analysis and principal component analysis were conducted on an author citation network to identify how the implications of the three prestige indices differ from those of the existing impact indices. As the existing impact measures for comparison, we selected simple citation counting, h-index, PageRank, and three kinds of centrality indices that assume undirected networks. The results indicate that the prestige indices capture an impact dimension distinct from that of the other impact indices: the prestige indices reflect indirect impact, while the others reflect direct impact.

Implementation of Efficient Power Method on CUDA GPU (CUDA 기반 GPU에서 효율적인 Power Method의 구현)

  • Kim, Jung-Hwan;Kim, Jin-Soo
    • Journal of the Korea Society of Computer and Information / v.16 no.2 / pp.9-16 / 2011
  • GPU computing is emerging in high-performance application areas because it can exploit massive parallelism cost-effectively. The power method, which finds the dominant eigenvector of a given matrix, is widely used in applications such as PageRank, which calculates the importance of web pages. In this research, we parallelized the power method efficiently on a GPU and also suggested how its performance can be improved further. The power method consists mainly of matrix-vector products, which are easily parallelized. However, it must also test the eigenvector for convergence and subsequently rescale the vector. Such operations incur several calls to GPU kernels and data movement between host and GPU memory. We improved the performance of the power method by reducing the number of GPU kernel calls, optimizing thread allocation, and enhancing the convergence test.
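
The three steps this abstract names (matrix-vector product, scaling, and convergence test) can be seen in a minimal CPU-only sketch of the power method. This is plain Python, not the authors' CUDA implementation; the function name `power_method`, the tolerance, and the 2x2 example matrix are assumptions chosen for illustration.

```python
def power_method(matrix, tol=1e-10, max_iter=1000):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix."""
    n = len(matrix)
    vec = [1.0] * n
    eigenvalue = 0.0

    for _ in range(max_iter):
        # Step 1: matrix-vector product (the part that parallelizes easily).
        product = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]

        # Step 2: scaling by the entry of largest magnitude keeps values bounded.
        scale = max(product, key=abs)
        if scale == 0.0:
            raise ValueError("iteration collapsed to the zero vector")
        new_vec = [x / scale for x in product]

        # Step 3: convergence test on the largest change in any component.
        diff = max(abs(a - b) for a, b in zip(new_vec, vec))
        vec, eigenvalue = new_vec, scale
        if diff < tol:
            break

    return eigenvalue, vec


# Hypothetical symmetric test matrix.
matrix = [[2.0, 1.0], [1.0, 3.0]]
eigenvalue, eigenvector = power_method(matrix)
print(eigenvalue, eigenvector)
```

In a naive GPU port, each of the three steps would typically be a separate kernel launch or reduction per iteration, which is the overhead the abstract says the authors reduce.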

User Reputation Evaluation Using Co-occurrence Feature and Collective Intelligence (동시출현 자질과 집단 지성을 이용한 지식검색 문서 사용자 명성 평가)

  • Lee, Hyun-Woo;Han, Yo-Sub;Kim, Lae-Hyun;Cha, Jeong-Won
    • Korean Journal of Cognitive Science / v.19 no.4 / pp.459-476 / 2008
  • Users' need to find answers to their questions is growing fast on services that draw on collective intelligence. Previous research has shown that non-textual information, such as view counts, the number of referrers, and the number of answers, is useful in evaluating answers. There has also been much work on evaluating answers using various kinds of word dictionaries. In this work, we propose a new method that evaluates answers to questions effectively using user reputation estimated from social activity. We use a modified PageRank algorithm to estimate user reputation. We also use the similarity between the question and the answer. Experimental results on the Naver GisikiN corpus show that the proposed method gives meaningful performance in complementing the answer selection rate.

A Research for Web Documents Genre Classification using STW (STW를 이용한 웹 문서 장르 분류에 관한 연구)

  • Ko, Byeong-Kyu;Oh, Kun-Seok;Kim, Pan-Koo
    • Journal of Information Technology and Architecture / v.9 no.4 / pp.413-422 / 2012
  • Many researchers have studied how to make human natural language understandable to machines, using text-based, PageRank-based, and other approaches. In particular, URL and HTML tag information in web documents is again attracting attention as a way to analyze huge numbers of web documents automatically. In this paper, we propose an STW (Semantic Term Weight) approach based on the syntactic and linguistic structure of web documents in order to classify their genres. For the evaluation, we analyzed more than 1,000 documents from the 20-Genre-collection corpus for training with the SVM algorithm. Afterwards, we tested on the KI-04 corpus to evaluate the performance of our proposed method. We measured accuracy in two experiments, one using STW and one without STW. As a result, the proposed STW-based approach showed approximately 10.2% higher accuracy than the approach without STW.

Global Technical Knowledge Flow Analysis in Intelligent Information Technology : Focusing on South Korea (지능정보기술 분야에서의 글로벌 기술 지식 경쟁력 분석 : 한국을 중심으로)

  • Kwak, Gihyun;Yoon, Jungsub
    • The Journal of the Korea Contents Association / v.21 no.1 / pp.24-38 / 2021
  • This study aims to measure Korea's global competitiveness in intelligent information technology, the core technology of the Fourth Industrial Revolution. For the analysis, we collect from PATSTAT Online the patents in each field, together with the prior patents they cite, that were filed with the U.S. Patent and Trademark Office (USPTO) between 2010 and 2018. A global knowledge transfer network was established by grouping citing and cited relationships at the national level. First, in-degree centrality is used to evaluate technology acceptance, which indicates the process of absorbing existing technological knowledge to create new knowledge in each field. Second, out-degree centrality is investigated to evaluate the impact of existing technological knowledge on the creation of new knowledge. Third, we apply the PageRank algorithm to investigate, qualitatively and quantitatively, the importance of the relationships between countries. As a result, all the indicators confirm that the AI sector is currently the least competitive.
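
The in-degree and out-degree measures described in this abstract can be sketched as follows. The sketch assumes that edges point in the direction of knowledge flow, from the cited country to the citing country, which is one reading consistent with in-degree meaning "absorbing" knowledge; the abstract does not state the orientation explicitly. The function name `knowledge_flow_degrees` and the citation pairs are hypothetical toy data, not results from the study.

```python
from collections import Counter


def knowledge_flow_degrees(citations):
    """Count in- and out-degrees from (citing_country, cited_country) pairs.

    Assumption: knowledge flows from the cited country to the citing country,
    so a citation adds to the citing country's in-degree (absorption) and to
    the cited country's out-degree (impact).
    """
    in_degree = Counter()
    out_degree = Counter()
    for citing, cited in citations:
        in_degree[citing] += 1
        out_degree[cited] += 1
    return in_degree, out_degree


# Hypothetical country-level citation pairs: (citing, cited).
citations = [("KR", "US"), ("KR", "JP"), ("US", "JP"), ("JP", "US"), ("US", "KR")]
in_degree, out_degree = knowledge_flow_degrees(citations)
for country in sorted(set(in_degree) | set(out_degree)):
    print(country, in_degree[country], out_degree[country])
```

Raw counts are used here for simplicity; dividing by the number of other countries would give a normalized degree centrality.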

Development of an Impact Identification Program in Mathematical Education Research Using Machine Learning and Network (기계학습과 네트워크를 이용한 수학교육 연구의 영향력 판별 프로그램 개발)

  • Oh, Se Jun;Kwon, Oh Nam
    • Communications of Mathematical Education / v.37 no.1 / pp.21-45 / 2023
  • This study presents a machine learning program designed to identify impactful papers in the field of mathematics education. To achieve this objective, we examined the impact of papers from a scientific econometrics perspective, developed a mathematics education research network, and defined the impact of mathematics education research using PageRank, a network centrality index. We developed a machine learning model to determine the impact of mathematics education research and identified the journals with the highest percentage of impactful articles to be the Journal for Research in Mathematics Education (25.66%), Educational Studies in Mathematics (22.12%), Zentralblatt für Didaktik der Mathematik (8.46%), Journal of Mathematics Teacher Education (5.8%), and Journal of Mathematical Behaviour (5.51%). The results of the machine learning program were similar to the findings of previous studies that were read and evaluated qualitatively by experts in mathematics education. Significantly, the AI-assisted impact evaluation of mathematics education research, which typically requires significant human resources and time, was carried out efficiently in this study.
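
Since this abstract defines impact through PageRank on a research network, a minimal PageRank sketch may help readers unfamiliar with the index. It is a generic iterative version in Python, not the program developed in the study; the function name `pagerank`, the damping factor of 0.85, the even redistribution of rank from papers that cite nothing, and the toy citation list are all assumptions for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Return PageRank scores for a list of directed (citing, cited) links."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Every node passes its rank in equal shares to the nodes it links to.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank

    return rank


# Hypothetical citation links: p2, p3, and p4 cite p1; p4 also cites p2.
citations = [("p2", "p1"), ("p3", "p1"), ("p4", "p1"), ("p4", "p2")]
scores = pagerank(citations)
for paper in sorted(scores, key=scores.get, reverse=True):
    print(paper, round(scores[paper], 4))
```

The scores sum to one, so each value can be read as the share of total importance a paper receives from the papers citing it.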

Snippet Extraction Method using Fuzzy Implication Operator and Relevance Feedback (연관 피드백과 퍼지 함의 연산자를 이용한 스니핏 추출 방법)

  • Park, Sun;Shim, Chun-Sik;Lee, Seong-Ro
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.3 / pp.424-431 / 2012
  • In information retrieval, a search engine provides the user with a ranking of web pages and a summary of each page. A snippet is the summary information that represents a web page. Whether the user visits a web page is affected by its snippet. Users relying on snippets sometimes visit pages that do not match their intention. It is difficult for a snippet extraction method to comprehend the user's intention accurately. To solve this problem, this paper proposes a new snippet extraction method using a fuzzy implication operator and relevance feedback. The proposed method uses relevance feedback to expand the user's query. It then applies the fuzzy implication operator between the expanded query and the web pages to extract snippets that better reflect the semantics of the user's intention. The experimental results demonstrate that the proposed method achieves better snippet extraction performance than the other methods.