Document Filtering Algorithm for Efficient Preprocessing of XML Information Retrieval

Kong Yong-Hae; Kim Myung-Sook

한국산학기술학회논문지 (Journal of the Korea Academia-Industrial cooperation Society)

Vol. 6, No. 1 / pp. 1-11 / 2005 / pISSN 1975-4701 / eISSN 2288-4688

한국산학기술학회 (The Korea Academia-Industrial cooperation Society)

XML 정보검색의 효율적 전처리를 위한 문서여과 알고리즘


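Kong Yong-Hae (Division of Information Technology Engineering, Soonchunhyang University);
Kim Myung-Sook (Division of Information Technology Engineering, Soonchunhyang University)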

Published: 2005.02.01


Abstract

This paper proposes a preprocessing method for efficient query processing in XML information retrieval over large collections of XML documents. Conventional preprocessing methods filter XML documents either by parsing each document for the keywords of a query or by comparing a signature generated from the query with signatures generated from the XML documents. These methods are query-dependent and become very inefficient when the number of XML documents is large. To address this, we use an ontology of a domain to generate a universal DTD that applies to XML documents containing information from the same domain even when their structures and attributes differ. Using the universal DTD, we filter out unnecessary XML documents that fall outside the search domain. We test the performance of the proposed document filtering algorithm on example XML documents.
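The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, since the abstract does not give its details. The ontology table, the reverse index standing in for the universal DTD, the coverage measure, the 0.5 threshold, and the function names `domain_coverage` and `filter_documents` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical ontology for a "book" domain: each concept lists the names
# that differently structured documents might use for it.
ONTOLOGY = {
    "book": {"book", "publication"},
    "title": {"title", "name"},
    "author": {"author", "writer"},
    "price": {"price", "cost"},
}

# Reverse index (name -> concept); a simplified stand-in for a universal DTD.
TAG_TO_CONCEPT = {
    tag: concept for concept, tags in ONTOLOGY.items() for tag in tags
}


def domain_coverage(xml_text):
    """Return the fraction of ontology concepts found among the
    element and attribute names of one XML document."""
    root = ET.fromstring(xml_text)
    found = set()
    for elem in root.iter():
        # Concepts may appear as elements or as attributes.
        for name in [elem.tag, *elem.attrib]:
            concept = TAG_TO_CONCEPT.get(name.lower())
            if concept is not None:
                found.add(concept)
    return len(found) / len(ONTOLOGY)


def filter_documents(docs, threshold=0.5):
    """Keep the names of documents whose coverage meets the assumed threshold."""
    return [
        name for name, text in docs.items()
        if domain_coverage(text) >= threshold
    ]


docs = {
    "a.xml": "<book><title>XML</title><author>Kim</author></book>",
    "b.xml": "<publication name='IR' writer='Kong'><cost>10</cost></publication>",
    "c.xml": "<weather><city>Asan</city><temp>21</temp></weather>",
}
print(filter_documents(docs))  # ['a.xml', 'b.xml']
```

The filter never looks at a query, so it can run once over the collection before any query arrives, which is the query independence the abstract argues for. Documents `a.xml` and `b.xml` use different element and attribute names but map to the same concepts, so both are kept.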
