Search | Korea Science

Analysis and Improvement of Ranking Algorithm for Web Mining System on the Hierarchical Web Environment

Heebyung Yoon;Lee, Kil-Seup;Kim, Hwa-Soo
- Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.455-458 / 2003
A variety of document ranking algorithms have been developed to provide efficient mining results for users' queries on the web. Typical ranking algorithms include the text-based Vector-Space Model, the hyperlink-based PageRank and HITS algorithms, and several improved variants. All of these aim to serve the user's convenience and preferences. However, these algorithms are usually developed for horizontal, non-hierarchical web environments and are not suitable for hierarchical web environments such as enterprise and defense networks. Thus, we must consider the special environmental factors in order to improve the ranking algorithms. In this paper, we analyze several typical algorithms that use hyperlink structures on the web. We then suggest a configuration of the hierarchical web environment and describe the relations between the agents of the web mining system. Next, we propose an improved ranking algorithm suited to this kind of special environment. The proposed algorithm considers both the hyperlink structure of the documents and the location of the user in the hierarchical web.
PDF

A Ranking Algorithm for Semantic Web Resources: A Class-oriented Approach (시맨틱 웹 자원의 랭킹을 위한 알고리즘: 클래스중심 접근방법)

Rho, Sang-Kyu;Park, Hyun-Jung;Park, Jin-Soo
- Asia pacific journal of information systems / v.17 no.4 / pp.31-59 / 2007
We frequently use search engines to find relevant information on the Web but still end up with too much information. In order to solve this problem of information overload, ranking algorithms have been applied to various domains. As more information will be available in the future, effectively and efficiently ranking search results will become more critical. In this paper, we propose a ranking algorithm for Semantic Web resources, specifically RDF resources. Traditionally, the importance of a particular Web page is estimated based on the number of keywords found in the page, which is subject to manipulation. In contrast, link analysis methods such as Google's PageRank capitalize on the information inherent in the link structure of the Web graph. PageRank considers a page highly important if it is referred to by many other pages. The degree of importance also increases if the importance of the referring pages is high. Kleinberg's algorithm is another link-structure based ranking algorithm for Web pages. Unlike PageRank, Kleinberg's algorithm utilizes two kinds of scores: the authority score and the hub score. If a page has a high authority score, it is an authority on a given topic and many pages refer to it. A page with a high hub score links to many authoritative pages. As mentioned above, the link-structure based ranking method has been playing an essential role in the World Wide Web (WWW), and nowadays many people recognize its effectiveness and efficiency. On the other hand, as the Resource Description Framework (RDF) data model forms the foundation of the Semantic Web, any information in the Semantic Web can be expressed as an RDF graph, making ranking algorithms for RDF knowledge bases greatly important. The RDF graph consists of nodes and directional links, similar to the Web graph. As a result, the link-structure based ranking method seems to be highly applicable to ranking Semantic Web resources.
However, the information space of the Semantic Web is more complex than that of the WWW. For instance, the WWW can be considered as one huge class, i.e., a collection of Web pages, which has only a recursive property, i.e., a 'refers to' property corresponding to the hyperlinks. The Semantic Web, in contrast, encompasses various kinds of classes and properties, and consequently, ranking methods used in the WWW should be modified to reflect the complexity of the information space in the Semantic Web. Previous research addressed the ranking problem of query results retrieved from RDF knowledge bases. Mukherjea and Bamba modified Kleinberg's algorithm in order to apply it to ranking Semantic Web resources. They defined the objectivity score and the subjectivity score of a resource, which correspond to Kleinberg's authority score and hub score, respectively. They concentrated on the diversity of properties and introduced property weights to control the influence of a resource on another resource depending on the characteristic of the property linking the two resources. A node with a high objectivity score becomes the object of many RDF triples, and a node with a high subjectivity score becomes the subject of many RDF triples. They developed several kinds of Semantic Web systems in order to validate their technique and showed some experimental results verifying the applicability of their method to the Semantic Web. Despite their efforts, however, there remained some limitations which they reported in their paper. First, their algorithm is useful only when a Semantic Web system represents most of the knowledge pertaining to a certain domain. In other words, the ratio of links to nodes should be high, or overall resources should be described in detail to a certain degree, for their algorithm to work properly.
Second, the Tightly-Knit Community (TKC) effect, the phenomenon in which pages that are less important but densely connected receive higher scores than pages that are more important but sparsely connected, remains problematic. Third, a resource may have a high score, not because it is actually important, but simply because it is very common and as a consequence has many links pointing to it. In this paper, we examine such ranking problems from a novel perspective and propose a new algorithm which can solve the problems identified in the previous studies. Our proposed method is based on a class-oriented approach. In contrast to the predicate-oriented approach taken by the previous research, a user, under our approach, determines the weight of a property by comparing its relative significance to the other properties when evaluating the importance of resources in a specific class. This approach stems from the idea that most queries are supposed to find resources belonging to the same class in the Semantic Web, which consists of many heterogeneous classes in RDF Schema. This approach closely reflects the way that people, in the real world, evaluate something, and will turn out to be superior to the predicate-oriented approach for the Semantic Web. Our proposed algorithm can resolve the TKC effect, and can further shed light on other limitations posed by the previous research. In addition, we propose two ways to incorporate data-type properties, which have not been employed even when they have some significance for resource importance. We designed an experiment to show the effectiveness of our proposed algorithm and the validity of the ranking results, which had never been attempted in previous research. We also conducted a comprehensive mathematical analysis, which was overlooked in previous research. The mathematical analysis enabled us to simplify the calculation procedure.
Finally, we summarize our experimental results and discuss further research issues.
PDF KSCI
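For readers unfamiliar with the authority and hub scores that the abstract above attributes to Kleinberg's algorithm, the following is a minimal Python sketch of the basic iteration on a plain Web graph. It is not the class-oriented RDF algorithm proposed in the paper, nor Mukherjea and Bamba's weighted variant; the function name `hits`, the dictionary-based graph format, and the fixed iteration count are assumptions made for illustration.

```python
from math import sqrt


def hits(links, iterations=50):
    """Basic authority/hub iteration (illustrative sketch only).

    `links` maps each page to the list of pages it links to
    (a hypothetical input format chosen for this example).
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    authority = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority of a page: sum of the hub scores of the pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_auth[q] += hub[p]
        # Hub score of a page: sum of the authority scores of the pages it links to.
        new_hub = {p: sum(new_auth[q] for q in links.get(p, [])) for p in pages}
        # Normalize both vectors so the scores stay bounded.
        a_norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        authority = {p: v / a_norm for p, v in new_auth.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}
    return authority, hub
```

Calling `hits({"a": ["c"], "b": ["c"], "c": []})` gives page `c` the only non-zero authority score, while `a` and `b` share the hub weight equally, which matches the description that a good hub links to authoritative pages.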

Fast Contingency Ranking Algorithm of Power Equipment (전력설비의 신속한 상정사고 선택 앨고리즘)

박규홍;정재길
- Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.12 no.1 / pp.20-25 / 1998
This paper presents an algorithm for contingency ranking using line outage distribution factors (LODF), which are established by generation shift distribution factors (GSDF) from DC load flow solutions. Using the LODF, the line flows after a contingency can be calculated as modifications of the base load flow. To obtain a faster contingency ranking, only lines loaded above 35% (60% at 154 kV) are included in the computation of the Performance Index (PI). The proposed algorithm has been validated in tests on a 6-bus test system.
PDF
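The screening idea in the abstract above can be sketched in Python as follows. The post-outage update (base flow plus LODF times the pre-outage flow of the tripped line) is the standard use of LODFs; the unweighted squared-loading form of the PI, the function names, and the input dictionaries are assumptions for illustration, since the abstract does not give the paper's exact PI definition.

```python
def post_outage_flows(base_flows, lodf, outaged):
    """Flow on each remaining line after `outaged` trips.

    Uses the usual LODF relation: f_l' = f_l + LODF[l][k] * f_k,
    where k is the outaged line and f_k its pre-outage flow.
    """
    f_k = base_flows[outaged]
    return {
        line: flow + lodf[line][outaged] * f_k
        for line, flow in base_flows.items()
        if line != outaged
    }


def performance_index(flows, limits, threshold=0.35, exponent=2):
    """Sum of (loading ** exponent) over lines loaded above `threshold`.

    The threshold mirrors the 35% screening level in the abstract;
    the unweighted form of the index is an assumption of this sketch.
    """
    pi = 0.0
    for line, flow in flows.items():
        loading = abs(flow) / limits[line]
        if loading > threshold:
            pi += loading ** exponent
    return pi


def rank_contingencies(base_flows, lodf, limits, threshold=0.35):
    """Return line outages ordered from most to least severe by PI."""
    scores = {
        k: performance_index(post_outage_flows(base_flows, lodf, k), limits, threshold)
        for k in base_flows
    }
    return sorted(scores, key=scores.get, reverse=True)
```

With hypothetical data such as `base_flows = {"L1": 60.0, "L2": 30.0, "L3": 10.0}`, limits of 100 on every line, and a made-up LODF table, the lightly loaded lines drop out of each PI sum, which is where the speed-up described in the abstract would come from.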

Object Detection in a Still FLIR Image using Intensity Ranking Feature (밝기순위 특징을 이용한 적외선 정지영상 내 물체검출기법)

Park Jae-Hee;Choi Hak-Hun;Kim Seong-Dae
- Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.2 s.302 / pp.37-48 / 2005
In this paper, a new object detection method for FLIR images is proposed. The proposed method consists of an intensity ranking feature and a classification algorithm using the feature. The intensity ranking feature is a representation of an image in which the intensity distribution is regularized. Each object candidate region is classified as object or non-object by the proposed classification algorithm, which is based on the intensity ranking similarity between the candidate and object training images. Using the proposed algorithm, pixel-wise detection results can be obtained without any additional candidate selection algorithm. The experimental results show that the proposed ranking feature is appropriate for object detection in a FLIR image, and some vehicle detection results under noise, scale variation, and rotation of the objects are presented.
PDF KSCI
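The abstract above does not define the intensity ranking feature in detail. As a loose illustration of the general idea of replacing raw intensities by their ranks, the Python sketch below applies a generic rank transform to a flattened patch and compares two patches by their ranks. Both functions, their names, and the similarity measure are assumptions for illustration and should not be read as the authors' feature or classifier.

```python
def intensity_ranks(pixels):
    """Map each intensity to its normalized rank in [0, 1].

    Tied intensities share the average of their rank positions.
    `pixels` is a flat list of intensities (hypothetical input format).
    """
    n = len(pixels)
    order = sorted(range(n), key=lambda i: pixels[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of equal intensities starting at position i.
        j = i
        while j + 1 < n and pixels[order[j + 1]] == pixels[order[i]]:
            j += 1
        average_position = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = average_position / (n - 1) if n > 1 else 0.0
        i = j + 1
    return ranks


def rank_similarity(patch_a, patch_b):
    """1 minus the mean absolute rank difference of two equal-sized patches."""
    ranks_a = intensity_ranks(patch_a)
    ranks_b = intensity_ranks(patch_b)
    return 1.0 - sum(abs(x - y) for x, y in zip(ranks_a, ranks_b)) / len(ranks_a)
```

Because only the ordering of intensities matters, two patches that differ by a monotonic brightness change, such as `[10, 50, 30]` and `[110, 250, 130]`, receive identical rank vectors, which is one plausible sense in which the intensity distribution is regularized.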

The Study on the Ranking Algorithm of Web-based Searching Using Hyperlink Structure (하이퍼링크 구조를 이용한 웹 검색의 순위 알고리즘에 관한 연구)

Kim, Sung-Hee;O, Gun-Teak
- Journal of Information Management / v.37 no.2 / pp.33-50 / 2006
In this paper, after reviewing hyperlink-based ranking methods, we examined various other parameters that affect ranking. Then, we analyzed the PageRank and HITS (Hypertext Induced Topic Search) algorithms, two popular methods that use eigenvector computations to rank results, in terms of their characteristics. Finally, the Google and Ask.com search engines were examined as examples of applying those methods. The results showed that the use of hyperlink structure can be useful for the efficiency of web site search.
https://doi.org/10.1633/JIM.2006.37.2.033 Cite PDF
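The eigenvector computation mentioned in the abstract above can be illustrated, for PageRank, with a short power iteration in Python. This is a textbook-style sketch rather than the procedure analyzed in the paper or used by any search engine; the damping factor of 0.85, the even redistribution of rank from pages without out-links, and the function name `pagerank` are assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a small link graph (illustrative sketch).

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Every page passes an equal share of its rank to each page it links to.
        for p in pages:
            targets = links.get(p, [])
            for q in targets:
                new_rank[q] += damping * rank[p] / len(targets)
        rank = new_rank
    return rank
```

For the graph `{"a": ["b", "c"], "b": ["c"], "c": ["a"]}`, page `c` ends up with the highest score because it is linked from both other pages, and the scores sum to one.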

An Estimated Closeness Centrality Ranking Algorithm for Large-Scale Workflow Affiliation Networks (대규모 워크플로우 소속성 네트워크를 위한 근접 중심도 랭킹 알고리즘)

Lee, Do-kyong;Ahn, Hyun;Kim, Kwang-hoon Pio
- Journal of Internet Computing and Services / v.17 no.1 / pp.47-53 / 2016
A workflow affiliation network is a specialized type of social network that represents the associative relations between actors and activities. Many centrality measures can be applied to a workflow affiliation network, such as degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality. In particular, we are interested in closeness centrality measurements on workflow affiliation networks discovered from enterprise workflow models, where time complexity becomes a problem as the size of the network increases. This paper proposes an estimated ranking algorithm and analyzes the accuracy and average computation time of the proposed algorithm. As a result, we show that the accuracy improves by 47.5% and 29.44% with respect to the network size and the sampling rate, respectively. In addition, the average computation time of the estimated ranking algorithm improves by more than 82.40% compared with the original algorithm when the network size is 2400 and the sampling rate is 30%.
https://doi.org/10.7472/jksii.2016.17.1.47 Cite PDF KSCI
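One common way to trade accuracy for speed in closeness centrality, which may or may not match the estimation used in the paper above, is to run breadth-first searches from a random sample of nodes only and rank nodes by their total distance to that sample. The Python sketch below assumes a connected, undirected, unweighted graph given as an adjacency dictionary; the function names and the `sampling_rate` and `seed` parameters are hypothetical.

```python
import random
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def estimated_closeness_ranking(graph, sampling_rate, seed=0):
    """Rank nodes by total distance to a random sample of pivot nodes.

    A smaller total distance means a more central node. With
    sampling_rate = 1.0 every node is a pivot and the ranking is exact.
    """
    nodes = sorted(graph)
    sample_size = max(1, round(sampling_rate * len(nodes)))
    pivots = random.Random(seed).sample(nodes, sample_size)
    total = {v: 0 for v in nodes}
    for pivot in pivots:
        for v, d in bfs_distances(graph, pivot).items():
            total[v] += d
    # Ties are broken by node name to keep the output deterministic.
    return sorted(nodes, key=lambda v: (total[v], v))
```

The cost falls from one search per node to one search per sampled node, so a 30% sampling rate performs roughly 30% of the searches; the accuracy and timing figures quoted in the abstract belong to the paper's own algorithm and are not reproduced by this sketch.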

Ranking Query Processing in Multimedia Databases

Kim, Byung-Gon;Han, Jong-Woon;Lee, Jaeho;Haechull Lim
- Proceedings of the IEEK Conference / 2000.07a / pp.294-297 / 2000
Among multi-dimensional query types, a ranking query is needed when we want to retrieve objects one by one until we are satisfied with the result. For multi-dimensional indexing structures such as the R-tree and its variants, few methods have been introduced in this area. In this paper, we introduce a new ranking query processing algorithm that uses a filtering mechanism in the R-tree variants.
PDF
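To make the notion of a ranking query concrete, the Python sketch below returns points one by one in order of increasing distance from a query point, using a best-first search over bounding rectangles. It is a simplified stand-in with a single level of leaf groups rather than a real R-tree, and it does not implement the filtering mechanism proposed in the paper; all names and the input format are assumptions.

```python
import heapq
from itertools import count


def min_dist(rect, q):
    """Smallest Euclidean distance from point q to an axis-aligned rectangle."""
    x1, y1, x2, y2 = rect
    dx = max(x1 - q[0], 0.0, q[0] - x2)
    dy = max(y1 - q[1], 0.0, q[1] - y2)
    return (dx * dx + dy * dy) ** 0.5


def bounding_rect(points):
    """Minimum bounding rectangle (x1, y1, x2, y2) of a group of 2-D points."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))


def ranked_points(leaves, q):
    """Lazily yield (point, distance) pairs, nearest to q first.

    `leaves` is a list of point groups standing in for R-tree leaf nodes.
    A group is opened only when its rectangle could contain the next
    nearest point, so a caller that stops early skips far-away groups.
    """
    tie = count()  # unique tie-breaker so the heap never compares payloads
    heap = [(min_dist(bounding_rect(pts), q), next(tie), False, pts) for pts in leaves]
    heapq.heapify(heap)
    while heap:
        dist, _, is_point, item = heapq.heappop(heap)
        if is_point:
            yield item, dist
        else:
            for p in item:
                point_rect = (p[0], p[1], p[0], p[1])
                heapq.heappush(heap, (min_dist(point_rect, q), next(tie), True, p))
```

Because `ranked_points` is a generator, a caller can take results with `next()` until satisfied, which corresponds to the "one by one" retrieval described in the abstract.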

A Similarity Ranking Algorithm for Image Databases (이미지 데이터베이스 유사도 순위 매김 알고리즘)

Cha, Guang-Ho
- Journal of KIISE:Databases / v.36 no.5 / pp.366-373 / 2009
In this paper, we propose a similarity search algorithm for image databases. One of the central problems in content-based image retrieval (CBIR) is the semantic gap between the low-level features computed automatically from images and the human interpretation of image content. Many search algorithms used in CBIR have used the Minkowski metric (or $L_p$-norm) to measure similarity between image pairs. However, those functions cannot adequately capture either the characteristics of the human visual system or the nonlinear relationships in contextual information. Our new search algorithm tackles this problem by employing new similarity measures and ranking strategies that reflect the nonlinearity of human perception and contextual information. Our search algorithm yields superior experimental results on a real handwritten digit image database and demonstrates its effectiveness.
PDF KSCI
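For reference, the Minkowski metric that the abstract above treats as the conventional baseline is $L_p(x, y) = \left(\sum_i |x_i - y_i|^p\right)^{1/p}$. A minimal Python sketch follows; it shows only this baseline, not the new similarity measures proposed in the paper, and the function name is an assumption.

```python
def minkowski_distance(x, y, p=2.0):
    """Minkowski (L_p) distance between two equal-length feature vectors.

    p = 1 gives the Manhattan distance and p = 2 the Euclidean distance.
    """
    if p < 1:
        raise ValueError("p must be at least 1 for a valid metric")
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)
```

Every feature dimension contributes independently and in the same way to this distance, which is one reason such a function cannot express the nonlinear, context-dependent judgments the abstract refers to.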

Ranking Quality Evaluation of PageRank Variations (PageRank 변형 알고리즘들 간의 순위 품질 평가)

Pham, Minh-Duc;Heo, Jun-Seok;Lee, Jeong-Hoon;Whang, Kyu-Young
- Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.5 / pp.14-28 / 2009
The PageRank algorithm is an important component for ranking Web pages in Google and other search engines. While many improvements for the original PageRank algorithm have been proposed, it is unclear which variations (and their combinations) provide the "best" ranked results. In this paper, we evaluate the ranking quality of the well-known variations of the original PageRank algorithm and their combinations. In order to do this, we first classify the variations into link-based approaches, which exploit the link structure of the Web, and knowledge-based approaches, which exploit the semantics of the Web. We then propose algorithms that combine the ranking algorithms in these two approaches and implement both the variations and their combinations. For our evaluation, we perform extensive experiments using a real data set of one million Web pages. Through the experiments, we find the algorithms that provide the best ranked results from either the variations or their combinations.
PDF KSCI

Contingency Ranking Using A Line Outage Distribution Factor (선로사고분배계수를 이용한 상정사고 선택)

Park, K.H.;Yoo, H.J.;Chung, J.K.;Kang, Y.M.
- Proceedings of the KIEE Conference / 1996.07b / pp.760-763 / 1996
This paper presents an algorithm for contingency ranking in a power system. The method utilizes line outage distribution factors (LODF), which are established from DC load flow solutions. The LODF are formulated using changes in network power generations to simulate the outage of a line from the network. To obtain a better ranking, lines loaded over 60% can be taken into account in the computation of the PI. The proposed algorithm has been validated in tests on a 6-bus test system.
PDF
