• Title/Summary/Keyword: URL list search (URL 목록 검색)

Search Results: 7, Processing Time: 0.019 seconds

Fast URL Lookup Using URL Prefix Hash Tree (URL Prefix 해시 트리를 이용한 URL 목록 검색 속도 향상)

  • Park, Chang-Wook;Hwang, Sun-Young
    • Journal of KIISE: Information Networking / v.35 no.1 / pp.67-75 / 2008
  • In this paper, we propose an efficient URL lookup algorithm for URL list-based web content filtering systems. The proposed algorithm converts a URL list into URL prefix form, builds a hash tree representation of the prefixes, and performs tree searches for URL lookups, which eliminates the redundant searches of the hash table method. Experimental results show that the proposed algorithm is 62% to 210% faster than the conventional hash table method, depending on the number of URL segments.
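
The lookup idea above can be illustrated with a small sketch: split each URL into segments, store the listed prefixes in a tree of hash maps, and walk that tree once per lookup instead of hashing every possible prefix separately. This is only a minimal illustration assuming Python dicts as the per-node hash tables; it is not the paper's implementation.

```python
# Minimal sketch of a URL-prefix tree lookup (illustrative only; not the
# paper's implementation). Each node is a dict keyed by one URL segment,
# so a lookup walks the tree once instead of probing a flat hash table
# with every possible prefix of the URL.
from urllib.parse import urlparse

def url_segments(url):
    """Split a URL into host + path segments, e.g.
    'http://example.com/a/b' -> ['example.com', 'a', 'b']."""
    parts = urlparse(url)
    return [parts.netloc] + [s for s in parts.path.split("/") if s]

def build_prefix_tree(url_list):
    root = {}
    for url in url_list:
        node = root
        for seg in url_segments(url):
            node = node.setdefault(seg, {})
        node["$"] = True          # mark the end of a listed prefix
    return root

def lookup(tree, url):
    """Return True if any listed prefix matches the URL."""
    node = tree
    for seg in url_segments(url):
        if "$" in node:           # a shorter listed prefix already matched
            return True
        node = node.get(seg)
        if node is None:
            return False
    return "$" in node

blocked = build_prefix_tree(["http://bad.example.com/ads"])
print(lookup(blocked, "http://bad.example.com/ads/banner1"))  # True
```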

Constructing a Metadata DB to Facilitate Retrieval of Faculty Syllabi on the Internet (인터넷 대학강의안의 검색을 위한 Metadata DB 구축)

  • Oh, Sam-Gyun
    • Journal of the Korean Society for Information Management / v.16 no.2 / pp.149-164 / 1999
  • The purpose of this paper is to introduce and discuss a newly constructed metadata database system that facilitates the retrieval of faculty syllabi available on the Internet. This gateway system aims to provide users with one-stop access to syllabi posted by the faculty of post-secondary institutions from all around the world. Several elements of the Dublin Core (DC) and other supplementary elements were used for cataloging the syllabi. The conceptual schema of all the selected elements of the syllabi was developed following the entity-relationship model, and the metadata of the syllabi was then stored in a relational database system. Various searching and browsing interfaces were implemented to facilitate effective retrieval. The prototype, named Gateway to Faculty Syllabi (GFS), is available at http://lis.skku.ac.kr/gfs/.
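
The storage side described above (Dublin Core-style elements kept in a relational database) can be sketched roughly as follows. The table and column names, and the sample row, are illustrative assumptions, not the actual GFS schema.

```python
# Illustrative sketch of storing Dublin Core-style syllabus metadata in a
# relational database (simplified; not the actual GFS schema).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE syllabus (
    id         INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,     -- DC.Title
    creator    TEXT,              -- DC.Creator (instructor)
    subject    TEXT,              -- DC.Subject
    date       TEXT,              -- DC.Date (term/year)
    identifier TEXT UNIQUE        -- DC.Identifier (syllabus URL)
);
CREATE INDEX idx_subject ON syllabus(subject);
""")

# Hypothetical sample record, for illustration only.
conn.execute(
    "INSERT INTO syllabus (title, creator, subject, date, identifier) "
    "VALUES (?, ?, ?, ?, ?)",
    ("Information Retrieval", "Oh, Sam-Gyun", "Library Science",
     "1999", "http://lis.skku.ac.kr/syllabi/ir.html"),
)

for row in conn.execute(
        "SELECT title, creator FROM syllabus WHERE subject = ?",
        ("Library Science",)):
    print(row)
```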

A Study on LibraryLookup Services Using Bookmarklets (북마크릿을 활용한 LibraryLookup 서비스 제공방안에 관한 연구)

  • Gu, Jung-Eok;Lee, Eung-Bong
    • Journal of the Korean Society for Information Management / v.23 no.3 s.61 / pp.49-68 / 2006
  • It is necessary to enhance the value of the ISBN as a tool for book search, identification, and browsing, and to improve the accessibility and search capability of library OPACs. A bookmarklet is a small JavaScript program that can be saved as a URL in a web browser bookmark or a web page hyperlink. Open-source bookmarklets can extract an ISBN from a web page and search for the book in a library OPAC using that ISBN, so they are recognized as simple but powerful search tools. In other countries, commercial library system vendors, libraries, OCLC, and others provide bookmarklets that allow a user to search library holdings and loan information in real time while browsing an online bookshop web page. This paper therefore compares and analyzes bookmarklet applications abroad and proposes a LibraryLookup service in which library OPACs and online bookshops can make use of bookmarklets.
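
A real LibraryLookup bookmarklet is a one-line JavaScript URL stored in the browser, but the core logic it performs can be sketched in Python as below. The ISBN pattern and the OPAC search URL are illustrative assumptions.

```python
# Sketch of the logic a LibraryLookup bookmarklet performs: pull an ISBN
# out of the page the user is viewing (many online bookshops embed it in
# the URL) and build a library OPAC search URL for it. Real bookmarklets
# do this in one line of JavaScript; the OPAC base URL below is hypothetical.
import re

OPAC_SEARCH = "https://library.example.ac.kr/search?isbn={isbn}"  # hypothetical

def extract_isbn(text):
    """Return the first 10- or 13-digit ISBN-like token found, else None."""
    m = re.search(r"\b(97[89][-\s]?)?(\d[-\s]?){9}[\dXx]\b", text)
    return re.sub(r"[-\s]", "", m.group(0)) if m else None

def lookup_url(bookshop_page_url):
    isbn = extract_isbn(bookshop_page_url)
    return OPAC_SEARCH.format(isbn=isbn) if isbn else None

print(lookup_url("https://www.amazon.com/dp/0131103628"))
# -> https://library.example.ac.kr/search?isbn=0131103628
```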

An EFASIT model considering the emotion criteria in Knowledge Monitoring System (지식모니터링시스템에서 감성기준을 고려한 EFASIT 모델)

  • Ryu, Kyung-Hyun;Pi, Su-Young
    • Journal of Internet Computing and Services / v.12 no.4 / pp.107-117 / 2011
  • The appearance of the Web has brought a substantial revolution to all fields of society, such as knowledge management and business transactions, as well as traditional information retrieval. In this paper, we propose an EFASIT (Extended Fuzzy AHP and SImilarity Technology) model that takes emotion analysis into account. We combine the Extended Fuzzy AHP Method (EFAM) with SImilarity Technology (SIT) based on domain corpus information in order to retrieve documents on the Web efficiently. The proposed EFASIT model can generate more definite rules by integrating the fuzzy knowledge of various decision-makers and can support decision-making, as confirmed through experiments.
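
As a rough illustration of combining criterion weights (as a fuzzy AHP step would supply) with per-criterion document scores into a single ranking, consider the toy sketch below. The criteria, weights, and scores are made-up placeholders and do not reproduce the EFASIT model.

```python
# Illustrative sketch only: rank documents by a weighted sum of
# per-criterion similarity scores, where the weights stand in for the
# output of a fuzzy-AHP pairwise-comparison step. All numbers are
# made-up placeholders, not values from the EFASIT paper.
criterion_weights = {"relevance": 0.5, "recency": 0.2, "emotion": 0.3}

doc_scores = {
    "doc_a": {"relevance": 0.9, "recency": 0.4, "emotion": 0.7},
    "doc_b": {"relevance": 0.6, "recency": 0.9, "emotion": 0.5},
}

def aggregate(scores, weights):
    """Weighted sum of the per-criterion scores."""
    return sum(weights[c] * scores.get(c, 0.0) for c in weights)

ranked = sorted(doc_scores,
                key=lambda d: aggregate(doc_scores[d], criterion_weights),
                reverse=True)
print(ranked)  # ['doc_a', 'doc_b']
```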

Hidden Markov Model-based Extraction of Internet Information (은닉 마코브 모델을 이용한 인터넷 정보 추출)

  • Park, Dong-Chul
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.3 / pp.8-14 / 2009
  • A Hidden Markov Model (HMM)-based information extraction method is proposed in this paper and applied to the extraction of product prices. The input of the proposed IESHMM is the URLs of a search engine's interface, which contain the names of the product types. The output of the system is the list of extracted slots for each product: name, price, image, and URL. With the observation data set, the Maximum Likelihood and Baum-Welch algorithms are used to train the HMM, and the Viterbi algorithm is then applied to find the state sequence of maximal probability that matches the observation block sequence. When applied to practical problems, the proposed HMM-based system shows improved results over a conventional method, PEWEB, in terms of recall ratio and accuracy.
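
The decoding step described above can be illustrated with a toy Viterbi run over a short sequence of page blocks. The state set, observation symbols, and probabilities are made-up placeholders, not the trained IESHMM parameters.

```python
# Toy Viterbi decoding sketch (not the paper's IESHMM): label a sequence of
# page "blocks" with hidden states such as NAME / PRICE / OTHER using an
# HMM's parameters. The probabilities below are made-up placeholders; in the
# paper they would come from Maximum-Likelihood / Baum-Welch training.
import numpy as np

states = ["NAME", "PRICE", "OTHER"]
obs_symbols = {"text_block": 0, "currency_block": 1, "image_block": 2}

start = np.array([0.5, 0.2, 0.3])                 # initial state probabilities
trans = np.array([[0.1, 0.6, 0.3],                # P(next state | state)
                  [0.2, 0.1, 0.7],
                  [0.4, 0.3, 0.3]])
emit = np.array([[0.8, 0.1, 0.1],                 # P(observation | state)
                 [0.2, 0.7, 0.1],
                 [0.3, 0.2, 0.5]])

def viterbi(observations):
    """Return the most probable hidden-state sequence for the observations."""
    T, N = len(observations), len(states)
    delta = np.zeros((T, N))
    back = np.zeros((T, N), dtype=int)
    delta[0] = start * emit[:, observations[0]]
    for t in range(1, T):
        for j in range(N):
            scores = delta[t - 1] * trans[:, j]
            back[t, j] = scores.argmax()
            delta[t, j] = scores.max() * emit[j, observations[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):                 # backtrack
        path.append(int(back[t, path[-1]]))
    return [states[s] for s in reversed(path)]

blocks = [obs_symbols[b] for b in ("text_block", "currency_block", "image_block")]
print(viterbi(blocks))   # ['NAME', 'PRICE', 'OTHER']
```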

Development of an Intelligent Illegal Gambling Site Detection Model Based on Tag2Vec (Tag2vec 기반의 지능형 불법 도박 사이트 탐지 모형 개발)

  • Song, ChanWoo;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.211-227 / 2022
  • Illegal gambling through online gambling sites has become a significant social problem. The development of Internet technology and the spread of smartphones have led to the proliferation of illegal gambling sites, so illegal online gambling has now become accessible to anyone. In order to mitigate its negative effects, the Korean government is trying to detect illegal gambling sites by using self-monitoring agents or reporting systems such as 'Nuricops.' However, it is difficult to detect all illegal sites due to limitations such as a lack of staffing. Accordingly, several scholars have proposed intelligent illegal gambling site detection techniques. Xu et al. (2019) found that fake or illegal websites generally have unique features in their HTML tag structure, which implies that the HTML tag structure can be important for detecting illegal sites. However, prior studies that improve a detection model's performance by utilizing the HTML tag structure are rare. Against this background, our study aims to improve model performance by utilizing the HTML tag structure and proposes Tag2Vec, a modified version of Doc2Vec, as a methodology to vectorize the HTML tag structure properly. To validate the proposed model, we perform an empirical analysis using a data set consisting of harmful sites listed on 'The Cheat' and normal sites collected through Google search. As a result, the Tag2Vec-based detection model proposed in this study showed better classification accuracy, recall, and F1-score than the URL-based detection model used for comparison. The proposed model is expected to be effectively utilized to improve the health of our society through intelligent technology.
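
The core idea of Tag2Vec can be illustrated by embedding a page's HTML tag sequence with gensim's Doc2Vec and using the resulting vectors as classifier features. The sketch below is a rough stand-in under that assumption; the pages, settings, and tag extraction are illustrative, not the paper's pipeline.

```python
# Illustrative sketch of the general idea behind Tag2Vec: treat the HTML
# tag sequence of a page as a "document" and embed it with Doc2Vec, then
# feed the vectors to a classifier. Data and settings here are toy
# placeholders, not the paper's actual pipeline.
from bs4 import BeautifulSoup
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

def tag_sequence(html):
    """Return the page's tag names in document order, e.g. ['html', 'body', 'div', ...]."""
    soup = BeautifulSoup(html, "html.parser")
    return [t.name for t in soup.find_all(True)]

pages = [
    ("<html><body><div><a></a><img></div></body></html>", "normal"),
    ("<html><body><marquee><a></a><a></a></marquee></body></html>", "gambling"),
]

corpus = [TaggedDocument(words=tag_sequence(html), tags=[str(i)])
          for i, (html, _label) in enumerate(pages)]

model = Doc2Vec(vector_size=32, min_count=1, epochs=50)
model.build_vocab(corpus)
model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)

# Vectors for a downstream classifier (e.g. logistic regression or a CNN):
vectors = [model.infer_vector(doc.words) for doc in corpus]
print(len(vectors), len(vectors[0]))   # 2 32
```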

Constructing a Metadata Database to Enhance Internet Retrieval of Educational Materials

  • Oh Sam-Gyun
    • Journal of the Korean Society for Library and Information Science / v.32 no.3 / pp.143-156 / 1998
  • This paper reports on the GEM (Gateway to Educational Materials) project, whose goal is to develop an operational framework to provide K-12 teachers around the world with 'one-stop/any-stop' access to thousands of lesson plans, curriculum units, and other Internet-based educational resources. To the 15-element Dublin Core base package, the GEM project added an 8-element, domain-specific GEM package. The project employed the conceptual data modeling approach to design the GEM database, used the Sybase relational database management system (RDBMS) to construct the backend database for storing the metadata of educational resources, and employed Active Server Pages (ASP) technology to provide Web interfaces to that database. The consortium members catalog lesson plans and other Internet-based educational resources using a cataloging module that produces HTML meta tags. A harvest program collects these meta tags across the Internet and outputs an ASCII file that conforms to the standard agreed upon by the consortium members. A parser program processes this file to enter the meta tags automatically into the appropriate relational tables in the Sybase database. The conceptual/logical schemas of the Dublin Core and GEM profile are presented, and the advantages of the conceptual modeling approach for managing metadata are discussed. A prototype system that provides access to the GEM metadata is available at http://lis.skku.ac.kr/gem/.
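
The harvest-and-parse step described above can be sketched as reading 'DC.'- and 'GEM.'-prefixed meta tags out of a cataloged HTML page and turning them into rows for the relational tables. The sample page and element names below are illustrative.

```python
# Illustrative sketch of the harvest/parse step: collect the <meta> tags a
# cataloging module would embed in an HTML page and map them to metadata
# fields ready for insertion into relational tables. The sample page is
# made up; element names follow the common "DC."/"GEM." prefix convention.
from html.parser import HTMLParser

class MetaTagHarvester(HTMLParser):
    """Collect name/content pairs from <meta> tags prefixed with 'DC.' or 'GEM.'."""
    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name, content = attrs.get("name", ""), attrs.get("content", "")
        if name.startswith(("DC.", "GEM.")):
            self.records.append((name, content))

sample_page = """
<html><head>
<meta name="DC.Title"   content="Fractions with Pizza">
<meta name="DC.Creator" content="A. Teacher">
<meta name="GEM.Grade"  content="4">
</head><body>...</body></html>
"""

harvester = MetaTagHarvester()
harvester.feed(sample_page)
for name, content in harvester.records:
    print(f"{name:12s} -> {content}")   # rows ready for the relational tables
```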
