The Refinement Effect of Foreign Word Transliteration Query on Meta Search

Lee, Jae-Sung

doi:10.3745/KIPSTB.2008.15-B.2.171

The KIPS Transactions: Part B (정보처리학회논문지B)

Vol. 15B, No. 2 / pp. 171-178 / 2008 / pISSN 1598-284X

Korea Information Processing Society (한국정보처리학회)

Korean title: 메타 검색에서 외래어 질의 정제 효과

Lee, Jae-Sung (Department of Computer Education, Chungbuk National University)

Published: 2008.04.30

Abstract

Foreign word transliterations are not used consistently in documents: the same word appears in several variant spellings. This makes it difficult for retrieval systems that rely on exact term matching to find all relevant documents for a transliterated query. This paper proposes a meta search method that automatically expands a single transliteration query into variant queries with the same meaning and then refines them, so that more relevant documents can be retrieved with little effort. In the first step, the method expands the original query into variant transliterations using a statistical method. In the second step, it submits each variant to the search engines and checks two things: whether the variant occurs at least a certain number of times in the retrieved documents, and whether it is used in a context similar to that of the original query. Variants that pass these checks are judged to be valid transliterations with the same meaning and are used for retrieval. In the experiments, querying with the first-step variants, which served as the baseline, gave an average F-measure of 38%, whereas querying with the variants refined in the second step, the proposed method, gave an average F-measure of 81%.
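The second (refinement) step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the thresholds, the context window, or the similarity measure, so `min_count`, `min_sim`, `window`, the bag-of-words context, and the cosine similarity are all assumptions. The `toy_search` function and its tiny index are hypothetical stand-ins for real search engine calls, and the first-step statistical expansion is assumed to have already produced the candidate variants.

```python
import math
from collections import Counter

def occurrences(snippets, term):
    """Count exact-token occurrences of term in whitespace-tokenised snippets."""
    return sum(text.split().count(term) for text in snippets)

def context_vector(snippets, term, window=3):
    """Bag of words found within `window` tokens of each occurrence of term."""
    counts = Counter()
    for text in snippets:
        tokens = text.split()
        for i, tok in enumerate(tokens):
            if tok == term:
                counts.update(tokens[max(0, i - window):i])
                counts.update(tokens[i + 1:i + 1 + window])
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def refine_variants(original, variants, search, min_count=2, min_sim=0.3):
    """Keep variants that occur often enough and share context with the original.

    `search` is any callable mapping a query term to a list of text snippets.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    base = context_vector(search(original), original)
    valid = []
    for variant in variants:
        snippets = search(variant)
        if occurrences(snippets, variant) < min_count:
            continue  # too rare in the retrieved documents
        if cosine(base, context_vector(snippets, variant)) < min_sim:
            continue  # used in a different context from the original query
        valid.append(variant)
    return valid

# Hypothetical toy index standing in for search engine results.
TOY_INDEX = {
    "디지털": ["최신 디지털 카메라 가격 비교", "디지털 카메라 렌즈 추천"],
    "디지탈": ["중고 디지탈 카메라 가격 문의", "디지탈 카메라 렌즈 판매"],
    "디지틀": ["디지틀 이라는 이름의 강아지"],
    "디지톨": ["디지톨 약국 영업 시간", "디지톨 약국 위치 안내"],
}

def toy_search(term):
    return TOY_INDEX.get(term, [])

if __name__ == "__main__":
    candidates = ["디지탈", "디지틀", "디지톨", "디기탈"]
    print(refine_variants("디지털", candidates, toy_search))
```

In this toy setup only the variant that is both frequent enough and surrounded by similar words survives the filter. With real search engines, `search` would return result snippets or page text, and tokenisation would need a Korean morphological analyser rather than whitespace splitting.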
