• Title/Summary/Keyword: Page-Rank

Discipline Bias of Document Citation Impact Indicators: Analyzing Articles in Korean Citation Index (논문 인용 영향력 측정 지수의 편향성에 대한 연구: KCI 수록 논문을 대상으로)

  • Lee, Jae Yun;Choi, Sanghee
    • Journal of the Korean Society for Information Management / v.32 no.4 / pp.205-221 / 2015
  • The impact of a journal is commonly used as a proxy for the impact of an individual paper within that journal. Because it is problematic to interpret a journal's impact as the impact of a single paper, several studies have sought to measure a single paper's impact from its own citation counts. This study applied eight impact indicators to the Korean Citation Index (KCI) database and examined the discipline bias of each indicator. The indicators analyzed are simple citation counts, PageRank, f-value, CCI, c-index, single publication h-index, single publication hs-index, and cl-index. PageRank showed the least discipline bias among highly ranked papers and the least journal bias within a discipline. In contrast, simple citation counts were strongly biased toward a particular discipline or journal. The KCI database currently provides only simple citation counts. It should also show PageRank (a global indicator) so that influential papers can be discovered in diverse areas. Furthermore, it should consider providing the best of the local indicators. Local indicators can be calculated using only the papers in a user's search results because they use the citation counts of citing papers and the number of references. They are therefore more efficient than global indicators, which must explore the whole database. KCI should also consider providing the cl-index (a local indicator).
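
As a rough illustration of why PageRank is a global indicator, the following minimal Python sketch runs power iteration over a whole citation graph. It is not the procedure used in the study: the damping factor of 0.85, the even redistribution of score from papers without references, and the toy graph are all assumptions made for illustration.

```python
def pagerank(citations, damping=0.85, iterations=200, tol=1e-12):
    """Power-iteration PageRank on a citation graph.

    `citations` maps each paper to the list of papers it cites.
    The damping value and the dangling-node handling are illustrative assumptions.
    """
    papers = set(citations)
    for cited_list in citations.values():
        papers.update(cited_list)
    papers = sorted(papers)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        # Papers with no references (dangling nodes) spread their score evenly.
        dangling = sum(rank[p] for p in papers if not citations.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in papers}
        for citing, cited_list in citations.items():
            if cited_list:
                share = damping * rank[citing] / len(cited_list)
                for cited in cited_list:
                    new_rank[cited] += share
        change = sum(abs(new_rank[p] - rank[p]) for p in papers)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical toy graph: A and B cite C, and C cites D.
toy_citations = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
scores = pagerank(toy_citations)
```

Every update touches every paper in the graph, which is the cost the abstract contrasts with local indicators computed from a search result alone.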

Analyzing the Main Paths and Intellectual Structure of the Data Literacy Research Domain (데이터 리터러시 연구 분야의 주경로와 지적구조 분석)

  • Jae Yun Lee
    • Journal of the Korean Society for Information Management / v.40 no.4 / pp.403-428 / 2023
  • This study investigates the development path and intellectual structure of data literacy research, aiming to identify emerging topics in the field. A comprehensive search for data literacy-related articles on the Web of Science reveals that the field is primarily concentrated in Education & Educational Research and Information Science & Library Science, accounting for nearly 60% of the total. Citation network analysis, employing the PageRank algorithm, identifies key papers with high citation impact across various topics. To accurately trace the development path of data literacy research, an enhanced PageRank main path algorithm is developed, which overcomes the limitations of existing methods confined to the Education & Educational Research field. Keyword bibliographic coupling analysis is employed to unravel the intellectual structure of data literacy research. Utilizing the PNNC algorithm, the detailed structure and clusters of the derived keyword bibliographic coupling network are revealed, including two large clusters, one with two smaller clusters and the other with five smaller clusters. The growth index and mean publishing year of each keyword and cluster are measured to pinpoint emerging topics. The analysis highlights the emergence of critical data literacy for social justice in higher education amidst the ongoing pandemic and the rise of AI chatbots. The enhanced PageRank main path algorithm, developed in this study, demonstrates its effectiveness in identifying parallel research streams developing across different fields.

Identification of Key Nodes in Microblog Networks

  • Lu, Jing;Wan, Wanggen
    • ETRI Journal / v.38 no.1 / pp.52-61 / 2016
  • A microblog is a service typically offered by online social networks, such as Twitter and Facebook. From the perspective of information dissemination, we define the concept behind a spreading matrix. A new WeiboRank algorithm for identifying key nodes in microblog networks is proposed, taking into account parameters such as a user's direct appeal, a user's influence region, and a user's global influence power. To investigate how measures for ranking influential users in a network correlate, we compare the relative influence ranks of the top 20 microblog users of a university network. The proposed algorithm is compared with other algorithms (PageRank, Betweenness Centrality, Closeness Centrality, and Out-degree) using a new tweet propagation model, the Ignorants-Spreaders-Rejecters model. The comparison results show that key nodes obtained from the WeiboRank algorithm have a wider transmission range and better influence.

A Study on Document Citation Indicators Based on Citation Network Analysis (인용 네트워크 분석에 근거한 문헌 인용 지수 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Library and Information Science / v.45 no.2 / pp.119-143 / 2011
  • This study identifies the characteristics of recent citation-based indicators for assessing a single paper in the context of their correlations. Five previously defined indicators were examined together with three variants of the h-index introduced in this study; the former are PageRank, SCEAS Rank, CCI, f-value, and single paper h-index, and the latter are $h_S$-index, h1-index, and $h_S$1-index. Correlation analysis and cluster analysis were performed to group the indicators by common characteristics, with the indicators calculated on a dataset from the KSCI DB. The results show statistical evidence that distinguishes h-index type indicators from the others. The characteristics of the indicators were verified against citation frequency factors using correlation analysis. Finally, the implications for applications and further studies are discussed.
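
The single paper h-index named above can be sketched in a few lines of Python. This assumes the commonly used definition (the h-index of the set of papers citing the target paper), which the abstract itself does not spell out. The function name and the sample counts are hypothetical.

```python
def single_paper_h_index(citing_paper_citation_counts):
    """Largest h such that h citing papers each have at least h citations.

    Assumed definition: the h-index computed over the papers that cite the
    target paper, using each citing paper's own citation count.
    """
    counts = sorted(citing_paper_citation_counts, reverse=True)
    h = 0
    for position, count in enumerate(counts, start=1):
        if count >= position:
            h = position
        else:
            break
    return h


# Hypothetical paper cited by five papers with these citation counts.
example_h = single_paper_h_index([10, 4, 3, 1, 0])
```

Only the citation counts of the citing papers are needed, so the value can be computed without traversing the full citation network.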

A Query Language for Quantitative Analysis on Graph Databases (그래프 데이터베이스의 양적 분석을 위한 질의 언어)

  • Park, Sung-Chan;Lee, Sang-Goo
    • Proceedings of the Korean Information Science Society Conference / 2011.06a / pp.77-80 / 2011
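  • Graphs are one of the central topics in computer science and are drawing more attention as the World Wide Web and social networks grow in importance. Research on query models for graph databases has likewise been treated as important. However, these studies have dealt mainly with queries based on pattern matching. Applying graph data to applications such as recommendation or search requires quantitative analysis of the link structure within a graph, as PageRank provides. In addition, various quantitative analysis measures such as SimRank and Random Walk with Restart are being proposed. Accordingly, this study presents a query language that supports flexible quantitative analysis of graphs based on random walks. We also demonstrate the usefulness and extensibility of the query model by showing how existing quantitative analysis measures can be expressed in it.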
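
To make one of the random-walk measures named in this abstract concrete, here is a minimal Python sketch of Random Walk with Restart. It is not the query language proposed in the paper. The restart probability of 0.15, the rule that a walker at a dead end returns to the seed, and the toy graph are assumptions for illustration.

```python
def random_walk_with_restart(graph, seed, restart=0.15, iterations=200, tol=1e-12):
    """Relevance of every node to `seed` under a random walk with restart.

    `graph` maps each node to the list of nodes it links to.
    At each step the walker jumps back to `seed` with probability `restart`.
    Dead-end handling (return to seed) is an illustrative assumption.
    """
    nodes = sorted(set(graph) | {v for targets in graph.values() for v in targets})
    score = {v: 0.0 for v in nodes}
    score[seed] = 1.0

    for _ in range(iterations):
        new_score = {v: 0.0 for v in nodes}
        new_score[seed] = restart
        for u in nodes:
            targets = graph.get(u, [])
            if targets:
                share = (1.0 - restart) * score[u] / len(targets)
                for v in targets:
                    new_score[v] += share
            else:
                # A walker stuck at a dead end goes back to the seed.
                new_score[seed] += (1.0 - restart) * score[u]
        change = sum(abs(new_score[v] - score[v]) for v in nodes)
        score = new_score
        if change < tol:
            break
    return score


# Hypothetical toy graph; node "d" links in but is never linked to.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
relevance = random_walk_with_restart(toy_graph, seed="a")
```

Unlike PageRank, which restarts uniformly over all nodes, the restart here always returns to a single seed, so the scores measure proximity to that seed.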

A Distribution-Free Rank Test for Ordered Alternatives in a Randomized Block Design

  • Kim, Dong-Hee;Song, Moon-Sup;Kim, Woo-Chul
    • Journal of the Korean Statistical Society / v.15 no.1 / pp.9-25 / 1986
  • In this paper we propose a distribution-free rank test for ordered alternatives in a randomized block design and investigate the properties of the proposed test. The proposed test is an extension of the Page test that allows replications in each cell. Some asymptotic properties, including asymptotic relative efficiencies (AREs), are investigated. A Monte Carlo study was performed to compare the powers of the tests considered in this paper for small samples. The results show that the proposed test is robust and efficient in the case of equally spaced treatment effects.
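
For orientation, the classical Page statistic that this paper extends can be sketched as follows. The sketch covers only the standard case of one observation per cell, not the replicated-cell extension proposed in the paper, and the data values are made up for illustration.

```python
def rank_within_block(values):
    """Rank one block's observations from 1 to k, averaging ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of observations tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for position in range(i, j + 1):
            ranks[order[position]] = average_rank
        i = j + 1
    return ranks


def page_l_statistic(blocks):
    """Page's L = sum over treatments j of j * R_j.

    `blocks` is a list of blocks; each block lists one observation per
    treatment, with treatments ordered as the alternative hypothesis predicts.
    R_j is the sum of within-block ranks for treatment j.
    """
    k = len(blocks[0])
    rank_sums = [0.0] * k
    for block in blocks:
        for j, rank in enumerate(rank_within_block(block)):
            rank_sums[j] += rank
    return sum((j + 1) * rank_sums[j] for j in range(k))


# Hypothetical data: 3 blocks, 3 treatments in the hypothesized increasing order.
example_blocks = [[1.2, 2.3, 3.1], [0.9, 1.8, 2.0], [1.5, 1.4, 2.9]]
example_l = page_l_statistic(example_blocks)
```

Large values of L support the hypothesized ordering of the treatments; the statistic reaches its maximum when every block ranks the treatments in exactly that order.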

ANALYZING RELATIONSHIPS THE AMONG WEB LINK STRUCTURE, WEBPAGE KEYWORD, AND POPULAR RANK : Travel Industry (웹링크 구조, 키워드, 사이트인기도 간의 관계성 분석에 관한 연구 : 관광산업을 중심으로)

  • Joun, Hyo-Jae;Cho, Nam-Jae
    • Journal of Information Technology Applications and Management / v.13 no.4 / pp.167-180 / 2006
  • Websites on the Internet form an uncontrollable domain, and their varied content shapes people's activities and thoughts as well as future business paradigms. These phenomena stem from the expansion of social networks, driven by the continual growth of information technology. Websites are composed of many links, and they communicate and expand their virtual area through inbound, outbound, onsite, and offsite links. Research and practice in digital information on the web have focused on finding and measuring artifacts, factors, and attributes of web structure and content, from the perspective that information is a resource and a property of products and services. Website links are one of the core artifacts for understanding the virtual area. This study identifies the role of web link structure and webpage keywords as artifacts and examines their relationships with webpage rank by a minimal hub, treated as performance, in business websites that serve tourism information. Discovering relationships among links provides managerial insights into organizations' virtual activities and a systematic understanding of digitalized organizational information in the information use environment.

A Ranking Algorithm for Semantic Web Resources: A Class-oriented Approach (시맨틱 웹 자원의 랭킹을 위한 알고리즘: 클래스중심 접근방법)

  • Rho, Sang-Kyu;Park, Hyun-Jung;Park, Jin-Soo
    • Asia Pacific Journal of Information Systems / v.17 no.4 / pp.31-59 / 2007
  • We frequently use search engines to find relevant information on the Web but still end up with too much information. To address this problem of information overload, ranking algorithms have been applied to various domains. As more information becomes available in the future, ranking search results effectively and efficiently will become more critical. In this paper, we propose a ranking algorithm for Semantic Web resources, specifically RDF resources. Traditionally, the importance of a particular Web page is estimated based on the number of keywords found in the page, which is subject to manipulation. In contrast, link analysis methods such as Google's PageRank capitalize on the information inherent in the link structure of the Web graph. PageRank considers a page highly important if it is referred to by many other pages. The degree of importance also increases if the importance of the referring pages is high. Kleinberg's algorithm is another link-structure-based ranking algorithm for Web pages. Unlike PageRank, Kleinberg's algorithm uses two kinds of scores: the authority score and the hub score. If a page has a high authority score, it is an authority on a given topic and many pages refer to it. A page with a high hub score links to many authoritative pages. As mentioned above, link-structure-based ranking has played an essential role in the World Wide Web (WWW), and its effectiveness and efficiency are now widely recognized. Meanwhile, because the Resource Description Framework (RDF) data model forms the foundation of the Semantic Web, any information in the Semantic Web can be expressed as an RDF graph, which makes ranking algorithms for RDF knowledge bases greatly important. The RDF graph consists of nodes and directional links, similar to the Web graph. As a result, link-structure-based ranking seems highly applicable to ranking Semantic Web resources.
However, the information space of the Semantic Web is more complex than that of WWW. For instance, WWW can be considered as one huge class, i.e., a collection of Web pages, which has only a recursive property, i.e., a 'refers to' property corresponding to the hyperlinks. However, the Semantic Web encompasses various kinds of classes and properties, and consequently, ranking methods used in WWW should be modified to reflect the complexity of the information space in the Semantic Web. Previous research addressed the ranking problem of query results retrieved from RDF knowledge bases. Mukherjea and Bamba modified Kleinberg's algorithm in order to apply their algorithm to rank the Semantic Web resources. They defined the objectivity score and the subjectivity score of a resource, which correspond to the authority score and the hub score of Kleinberg's, respectively. They concentrated on the diversity of properties and introduced property weights to control the influence of a resource on another resource depending on the characteristic of the property linking the two resources. A node with a high objectivity score becomes the object of many RDF triples, and a node with a high subjectivity score becomes the subject of many RDF triples. They developed several kinds of Semantic Web systems in order to validate their technique and showed some experimental results verifying the applicability of their method to the Semantic Web. Despite their efforts, however, there remained some limitations which they reported in their paper. First, their algorithm is useful only when a Semantic Web system represents most of the knowledge pertaining to a certain domain. In other words, the ratio of links to nodes should be high, or overall resources should be described in detail, to a certain degree for their algorithm to properly work. 
Second, the Tightly-Knit Community (TKC) effect, the phenomenon in which pages that are less important yet densely connected receive higher scores than pages that are more important but sparsely connected, remains problematic. Third, a resource may have a high score not because it is actually important, but simply because it is very common and consequently has many links pointing to it. In this paper, we examine such ranking problems from a novel perspective and propose a new algorithm that can solve the problems left by the previous studies. Our proposed method is based on a class-oriented approach. In contrast to the predicate-oriented approach adopted by previous research, under our approach a user determines the weight of a property by comparing its significance relative to the other properties when evaluating the importance of resources in a specific class. This approach stems from the idea that most queries are meant to find resources belonging to the same class in the Semantic Web, which consists of many heterogeneous classes in RDF Schema. This approach closely reflects the way people evaluate things in the real world, and will turn out to be superior to the predicate-oriented approach for the Semantic Web. Our proposed algorithm can resolve the TKC effect and can further shed light on other limitations posed by the previous research. In addition, we propose two ways to incorporate data-type properties, which have not previously been employed even when they bear on resource importance. We designed an experiment to show the effectiveness of our proposed algorithm and the validity of the ranking results, which had not been attempted in previous research. We also conducted a comprehensive mathematical analysis, which was overlooked in previous research. The mathematical analysis enabled us to simplify the calculation procedure.
Finally, we summarize our experimental results and discuss further research issues.
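
The authority and hub scores of Kleinberg's algorithm, described in this abstract, can be sketched in Python as below. This is the plain Web-graph version only, not the class-oriented algorithm proposed in the paper or the Mukherjea and Bamba variant. The fixed iteration count, the Euclidean normalization, and the toy link graph are illustrative assumptions.

```python
import math


def hits(links, iterations=50):
    """Kleinberg-style authority and hub scores by repeated mutual updates.

    `links` maps each page to the list of pages it links to.
    Authority of a page = sum of hub scores of pages linking to it.
    Hub of a page = sum of authority scores of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    hub = {p: 1.0 for p in pages}
    authority = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update from the current hub scores.
        authority = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                authority[target] += hub[source]
        norm = math.sqrt(sum(v * v for v in authority.values())) or 1.0
        authority = {p: v / norm for p, v in authority.items()}

        # Hub update from the new authority scores.
        hub = {p: sum(authority[t] for t in links.get(p, [])) for p in pages}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {p: v / norm for p, v in hub.items()}

    return authority, hub


# Hypothetical toy Web graph: three hub-like pages pointing at two targets.
toy_links = {"h1": ["a1", "a2"], "h2": ["a1", "a2"], "h3": ["a1"], "a1": [], "a2": []}
authority_scores, hub_scores = hits(toy_links)
```

Because the two scores reinforce each other, a small, densely interlinked group of pages can dominate the result, which is the TKC effect the abstract identifies as a limitation.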

Measuring the Prestige of Domestic Journals in Korean Journal Citation Network (국내 학술지의 인용 네트워크 지수 측정)

  • Lee, Jae Yun;Choi, Seon-Heui
    • Proceedings of the Korean Society for Information Management Conference / 2010.08a / pp.15-20 / 2010
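  • The Eigenfactor score and Article Influence Score recently introduced in Web of Science, and the SJR indicator introduced in Scopus, are citation indicators based on network analysis, like Google's PageRank algorithm. Because Korean citation index databases have a high proportion of citation links pointing outside the database and a high self-citation rate, it is difficult to apply existing methods for calculating network citation indicators as they are. This study experimentally measured journal PageRank, a representative network citation indicator, on a Korean citation index database and explored improvements that take the situation of Korean journals into account.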
