• Title/Summary/Keyword: relevant information retrieval

A Search Efficiency Improvement Method using Internal Contiguity in Query Terms (질의 내부 단어 인접도를 이용한 검색 효율 향상 기법)

  • Yoon, Soung-Woong;Chae, Jin-Ki;Lee, Sang-Hoon
    • Journal of KIISE:Databases / v.35 no.2 / pp.192-198 / 2008
  • It is difficult to find relevant information in the vast amount of Web data. Search engines summarize and store Web information and present ranked lists for user queries, with rankings influenced by relative importance and user adaptation. However, they are limited in their ability to place user-intended information at the top of the list. User intention is generally expressed within the query itself. In this paper, we propose a method that selectively promotes user-intended search results by weighting the internal contiguity of query terms. In our experiments, this simple method alone finds user-intended results with 75.8% probability, and the proposed reranking outperforms the ordinary case by 13 to 20%.
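
The abstract above does not give the weighting formula, so the following is only a minimal sketch of the general idea of reranking by query-term contiguity. It assumes that contiguity is measured as the fraction of adjacent query-term pairs that also appear adjacent, in the same order, in a result's text. The names `contiguity_score`, `rerank`, and the `weight` parameter are hypothetical and do not come from the paper.

```python
def contiguity_score(query_terms, doc_tokens):
    """Fraction of adjacent query-term pairs found adjacent, in order, in the document.

    Assumption: this pair-based measure stands in for the paper's
    (unspecified) internal-contiguity weighting.
    """
    pairs = list(zip(query_terms, query_terms[1:]))
    if not pairs:
        return 0.0  # a one-word query has no internal contiguity
    doc_pairs = set(zip(doc_tokens, doc_tokens[1:]))
    hits = sum(1 for pair in pairs if pair in doc_pairs)
    return hits / len(pairs)


def rerank(query, results, weight=0.5):
    """Re-sort (text, base_score) results after adding a contiguity bonus.

    `weight` is a hypothetical tuning knob, not a value from the paper.
    """
    terms = query.lower().split()
    scored = []
    for text, base_score in results:
        bonus = contiguity_score(terms, text.lower().split())
        scored.append((base_score + weight * bonus, text))
    # sorted() is stable, so results with equal scores keep their original order
    scored.sort(key=lambda item: item[0], reverse=True)
    return [text for _, text in scored]
```

With this sketch, a result that contains the query words as a contiguous phrase can be promoted over a result with a higher base score in which the same words are scattered.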

Analyzing empirical performance of correlation based feature selection with company credit rank score dataset - Emphasis on KOSPI manufacturing companies -

  • Nam, Youn Chang;Lee, Kun Chang
    • Journal of the Korea Society of Computer and Information / v.21 no.4 / pp.63-71 / 2016
  • This paper applies an efficient data mining method that improves score calculation and the building of a credit ranking score system. The main idea is to achieve these objectives by applying Correlation-based Feature Selection, which can also be used to quickly verify the appropriateness of existing rank scores. This study selected 2047 manufacturing companies on the KOSPI market during the period 2009 to 2013 that have credit rank scores assigned by the NICE information service agency. As for the relevant financial variables, a total of 80 variables were collected from KIS-Value and DART (Data Analysis, Retrieval and Transfer System). If correlation-based feature selection can select the more important variables, the required information and cost would be reduced significantly. Through analysis, this study shows that the proposed correlation-based feature selection method improves the selection and classification process of the credit rank system, so that accuracy and credibility increase while the cost of building the system decreases.
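
The abstract does not spell out how feature subsets are scored. As a rough illustration, the sketch below uses the merit heuristic commonly associated with correlation-based feature selection, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-to-target correlation and r_ff the mean feature-to-feature correlation. Using absolute Pearson correlation as the correlation measure is an assumption made here for simplicity; the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / (sd_x * sd_y)


def cfs_merit(features, target):
    """Merit of a feature subset: high target correlation, low redundancy.

    Assumption: absolute Pearson correlation is used for both r_cf and r_ff.
    """
    k = len(features)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(f, target)) for f in features) / k
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    r_ff = (
        sum(abs(pearson(features[i], features[j])) for i, j in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)
```

Under this heuristic, adding a feature that merely duplicates one already selected does not raise the merit, which is why such a method could reduce the number of financial variables that need to be collected.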

Applications of Transaction Log Analysis for the Web Searching Field (웹 검색 분야에서의 로그 분석 방법론의 활용도)

  • Park, So-Yeon;Lee, Joon-Ho
    • Journal of the Korean Society for Library and Information Science / v.41 no.1 / pp.231-242 / 2007
  • Transaction logs capture the interactions between online information retrieval systems and their users. Given the nature of the Web and Web users, transaction logs appear to be a reasonable and relevant means of collecting and investigating the information searching behaviors of a large number of Web users. Based on a series of research studies that analyzed Naver transaction logs, this study examines how transaction log analysis can be applied to, and contribute to, the field of Web searching, and suggests future implications for the field. It is expected that this study could contribute to the development and implementation of more effective Web search systems and services.

A Methodology for Performance Evaluation of Web Robots (웹 로봇의 성능 평가를 위한 방법론)

  • Kim, Kwang-Hyun;Lee, Joon-Ho
    • The KIPS Transactions:PartD / v.11D no.3 / pp.563-570 / 2004
  • As the use of the Internet becomes more popular, a huge amount of information is published on the Web, and users can access the information effectively with Web search services. Since Web search services retrieve relevant documents from those collected by Web robots, we need to improve the crawling quality of Web robots. In this paper, we suggest evaluation criteria for Web robots such as efficiency, continuity, freshness, coverage, silence, uniqueness, and safety, and present various functions to improve the performance of Web robots. We also investigate the functions implemented in the conventional Web robots of NAVER, Google, AltaVista, etc. It is expected that this study could contribute to the development of more effective Web robots.

Hierarchical Organization of Neural Agents for Distributed Information Retrieval (분산 정보 검색을 위한 신경망 에이전트의 계층적 구성)

  • Choi, Yong S.
    • The Journal of Korean Association of Computer Education / v.8 no.6 / pp.113-121 / 2005
  • Since documents on the Web are naturally partitioned into many document databases, an efficient information retrieval (IR) process requires identifying the document databases that are most likely to provide documents relevant to the query and then querying the identified databases. We first introduce a neural net agent for such efficient IR, and then propose a hierarchically organized multi-agent IR system in order to scale our agent to a large number of document databases. In this system, the hierarchical organization of neural net agents reduced the total training cost to an acceptable level without degrading IR effectiveness in terms of precision and recall. In the experiment, we introduce two neural net IR systems, based on a single-agent approach and a multi-agent approach respectively, and evaluate their performance by comparing their experimental results to those of conventional statistical systems.

Development of Computer Vision System for Individual Recognition and Feature Information of Cow (I) - Individual recognition using the speckle pattern of cow - (젖소의 개체인식 및 형상 정보화를 위한 컴퓨터 시각 시스템 개발 (I) - 반문에 의한 개체인식 -)

  • 이종환
    • Journal of Biosystems Engineering / v.27 no.2 / pp.151-160 / 2002
  • Cow image processing would be useful not only for recognizing individuals but also for building an image database and analyzing the shape of cows. A Holstein cow usually has a unique speckle pattern. In this study, individual cows were recognized using the speckle pattern and a content-based image retrieval technique. Sixty images of 16 cows were captured under outdoor illumination; the images were complicated by shadows, obstacles, and the walking posture of the cows. Sixteen images were selected as the reference images, one for each cow, and 44 query images were used to evaluate the efficiency of individual recognition by matching them against each reference image. Run-lengths and positions of runs across the speckle area were calculated from 40 horizontal line profiles of the ROI (region of interest) in a cow body image after 3 passes of 5×5 median filtering. A similarity measure for recognizing individual cows was calculated using the Euclidean distance of the normalized G-frame histogram (GH), the normalized speckle run-length (BRL), and the normalized x and y positions (BRX, BRY) of speckle runs. This study evaluated the efficiency of individual recognition using Recall (success rate) and AVRR (average rank of relevant images). The success rate of individual recognition was 100% when GH, BRL, BRX, and BRY were used as image query indices. It was concluded that the histogram, as a global property, and the speckle-run information, as local properties, were good image features for individual recognition, and that the developed recognition system was reliable.
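
The matching step described above can be illustrated with a small sketch. It covers only the histogram part (GH) of the similarity measure and assumes one reference histogram per cow, so that the "rank of the relevant image" for a query is simply the 1-based position of the correct cow in the distance-sorted reference list. The run-length and position features (BRL, BRX, BRY) are omitted, and all function names are hypothetical.

```python
import math


def normalize(hist):
    """Scale a histogram so its bins sum to 1 (left unchanged if empty)."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_references(query_hist, references):
    """Return reference ids ordered from most to least similar to the query.

    `references` maps a cow id to its reference histogram.
    """
    q = normalize(query_hist)
    return sorted(references, key=lambda cid: euclidean(q, normalize(references[cid])))


def average_rank_of_relevant(queries, references):
    """Mean 1-based rank of the correct reference over (true_id, histogram) queries.

    Assumption: exactly one relevant reference image exists per query.
    """
    ranks = [rank_references(hist, references).index(true_id) + 1
             for true_id, hist in queries]
    return sum(ranks) / len(ranks)
```

In this simplified setting, an average rank of 1.0 means every query image was matched to its own cow first, which corresponds to a 100% success rate.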

A Development of Ontology-Based Law Retrieval System: Focused on Railroad R&D Projects (온톨로지 기반 법령 검색시스템의 개발: 철도·교통 분야 연구개발사업을 중심으로)

  • Won, Min-Jae;Kim, Dong-He;Jung, Hae-Min;Lee, Sang Keun;Hong, June Seok;Kim, Wooju
    • The Journal of Society for e-Business Studies / v.20 no.4 / pp.209-225 / 2015
  • Research and development projects in the railroad domain differ from those in other domains in their close relationship with laws. Cases have been reported in which new technologies from R&D projects could not be industrialized because relevant laws restricted them. This problem arises because researchers do not know exactly which laws can affect the results of their R&D projects. To deal with this problem, we suggest a model for a law retrieval system that researchers on railroad R&D projects can use to find related legislation. The input to the system is a research plan describing the main contents of a project. The system then returns the laws related to the R&D project along with their rankings, which are assigned by scores we developed. The ranking of a law indicates its priority for review. By using this system, researchers can search for laws that may affect their R&D projects throughout all stages of the project cycle. Thus, using our system model, researchers can obtain a list of laws to consider before the project they participate in ends. As a result, they can adjust the direction of their project by checking the list of laws and keep their carefully developed projects from becoming useless.

An Improved Approach to Ranking Web Documents

  • Gupta, Pooja;Singh, Sandeep K.;Yadav, Divakar;Sharma, A.K.
    • Journal of Information Processing Systems / v.9 no.2 / pp.217-236 / 2013
  • Ranking thousands of web documents so that they match a user query is a challenging task. For this purpose, search engines apply different ranking mechanisms to apparently related web documents to decide the order in which the documents should be displayed. Existing ranking mechanisms decide the order of a web page based on the number and popularity of the links pointing to it and emerging from it. Sometimes search engines place less relevant documents in the top positions in response to a user query. There is a strong need to improve the ranking strategy. In this paper, a novel ranking mechanism is proposed that ranks web documents by considering both the HTML structure of a page and the contextual senses of the keywords present within it and its back-links. The approach has been tested on data sets of URLs and their back-links in relation to different topics. The experimental results show that the overall search results, in response to user queries, are improved. The ordering of the links obtained is compared with the ordering produced by the page rank score. The results show that the proposed mechanism places more contextually related web pages in the top positions than the page rank score does.

Disambiguation of Korean Names in References

  • Kim, Sungwon
    • Journal of Information Science Theory and Practice / v.6 no.2 / pp.62-70 / 2018
  • One of the characteristics of academic writing is the inclusion of citations and references. As the development of reference styles used for international scholarly communication has mostly been led by Western academic societies, the reference styles developed in Western nations do not reflect the characteristics of Korean names. As a result, it is hard to distinguish Korean authors through citations based on Western reference styles, which in turn decreases the retrieval efficiency of relevant authors and ultimately the efficiency of scholarly communication. This paper analyzes the name disambiguation of Korean authors cited according to Western reference styles. It aims to show the need for better name disambiguation of Korean authors and for the revision of reference styles. Its ultimate goal is to increase the efficiency of scholarly communication by improving the name disambiguation of Korean authors. For this purpose, this study collected and analyzed name data of Korean researchers and compared the name disambiguation of authors by reference style. Based on the results, this study confirmed the need to revise reference styles to improve author name disambiguation and suggested the need for further research into plans for revision.

Web Crawler Service Implementation for Information Retrieval based on Big Data Analysis (빅데이터 분석 기반의 정보 검색을 위한 웹 크롤러 서비스 구현)

  • Kim, Hye-Suk;Han, Na;Lim, Suk-Ja
    • Journal of Digital Contents Society / v.18 no.5 / pp.933-942 / 2017
  • In this paper, we propose a web crawler service method for efficiently collecting information about college students' and job seekers' external activities, competitions, and scholarships. The proposed web crawler service uses Jsoup tree analysis and a JSON-format data transmission method to avoid the problem of duplicated crawling while crawling at high speed. After collecting relevant information for 24 hours, we were able to confirm that the web crawler service runs with an accuracy of 100%. It is expected that the web crawler service can be applied to various web sites in the future to further improve the service.
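
The abstract does not explain how duplicated crawling is avoided, and the paper itself works with Jsoup in Java. As a generic illustration only, and not the authors' method, the sketch below shows one common way to skip duplicate URLs: canonicalize each URL and keep a set of those already seen. The functions `canonical` and `dedupe` are hypothetical names.

```python
from urllib.parse import urldefrag, urlsplit, urlunsplit


def canonical(url):
    """Return a simplified canonical form of a URL.

    Assumption: dropping the fragment, lowercasing the scheme and host,
    and treating an empty path as "/" is enough to detect duplicates here.
    """
    url, _fragment = urldefrag(url)
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def dedupe(urls):
    """Keep the first occurrence of each canonical URL, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = canonical(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

A crawler built this way would check each discovered link against the seen set before fetching it, so a page reachable through several equivalent URLs is fetched only once.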