Search | Korea Science

Lee Duk-Ryong;Kim Woo-Youn;Oh Il-Seok
- The Journal of the Korea Contents Association
- /
- v.5 no.2
- /
- pp.229-242
- /
- 2005
We constructed a full-text retrieval system for scanned Hangul document images. The system consists of three components: preprocessing, recognition, and retrieval. The retrieval algorithm uses recognition results up to rank k. The algorithm is not only insensitive to recognition errors but also lets the user control recall and precision. For objective performance evaluation, we used scanned images of the Journal of Korea Information Science Society provided by KISTI. The system was shown to be practical through the evaluation of recognition and retrieval rates.
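As a rough illustration of retrieval over recognition results up to rank k, the sketch below matches a query against ranked candidate lists. It is not the authors' algorithm: the `find_matches` function, the candidate-list layout, and the sample data are all assumptions made for the example, and it assumes that k is the parameter through which the user trades recall against precision.

```python
from typing import List, Sequence


def find_matches(candidates: Sequence[Sequence[str]], query: str, k: int) -> List[int]:
    """Return start positions where every query character appears among the
    top-k recognition candidates of the corresponding character position."""
    if k < 1 or not query:
        return []
    hits = []
    for start in range(len(candidates) - len(query) + 1):
        # A position matches if the query character is any of its k best candidates.
        if all(ch in candidates[start + i][:k] for i, ch in enumerate(query)):
            hits.append(start)
    return hits


# Hypothetical recognizer output: one ranked candidate list per character position.
ocr = [
    ["검", "겸", "걸"],
    ["섹", "색", "샐"],  # the correct character "색" is only the rank-2 candidate
    ["시", "사", "지"],
]

strict = find_matches(ocr, "검색", k=1)   # misses the word because of the rank-1 error
relaxed = find_matches(ocr, "검색", k=2)  # finds it by also accepting rank-2 candidates
```

In this sketch a larger k tolerates more recognition errors (higher recall) at the cost of more false matches (lower precision).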

Kwon, Young-Sook
- Journal of Information Management
- /
- v.30 no.3
- /
- pp.44-54
- /
- 1999
A description of SGML (Standard Generalized Markup Language) is given together with a detailed description of WIPO Standard ST.32. The benefits of using SGML are highlighted: its system independence and its flexibility in building publication systems and full-text databases. The structure of patent content based on WIPO Standard ST.32 is defined by a DTD (document type definition) written in ST.32, and the full text itself is described with generalized markup according to the DTD. This article explains how to represent document structure: hierarchical structures such as an entire document, a specific sub-document, or a paragraph, and non-hierarchical structures such as tables, drawings, or chemical structures. The merits of SGML in patent document processing are also discussed.

사공철;서경주
- Journal of the Korean Society for Information Management
- /
- v.13 no.2
- /
- pp.19-37
- /
- 1996
The development of information retrieval from the 1950s to the 1990s is described chronologically. For each decade, the following information retrieval systems are examined: post-coordinate and KWIC indexing methods for the 1950s; off-line and experimental on-line systems for the 1960s; on-line and full-text retrieval systems for the 1970s; full-text databases, on-line interfaces, and overseas and domestic on-line databases for the 1980s; and finally, for the 1990s, CD-ROM, multimedia, hypertext, and the Internet. Prospects for the future are also discussed.

김상준
- Journal of the Korean Society for Information Management
- /
- v.13 no.1
- /
- pp.119-141
- /
- 1996
The purpose of this study is to determine how to acquire full text effectively using LHI (Local Holdings Information) when searching a bibliographic DB. For this study, a preliminary investigation of LHI within online and CD-ROM DBs was made using domestic and overseas literature. A questionnaire survey was conducted among librarians and users who have used CD-ROM. The finding of this study is that LHI is not very widely used in libraries; thus, more study of LHI within online and CD-ROM DBs is required to provide high-quality information service.

Lee, Hye-Young;Kwak, Seung-Jin
- Journal of the Korean Society for Information Management
- /
- v.25 no.1
- /
- pp.191-210
- /
- 2008
Subject terms, such as those assigned in subject indexing, are generally used for searching and accessing documents. There should therefore be some relationship between a document's full text and its subject terms. This study starts from that question. Master's theses in science and technology were used because their full text is relatively well formatted. The study examines where subject terms occur in a thesis and how they are distributed across the sections of the full text: 'Contents', 'Introduction', 'Theory', 'Main subject', 'Conclusion', and 'References'. A thesis contained 1226.3 terms on average, and its subject terms comprised 12 to 13 terms on average. As a result, 'Contents' and 'Introduction' showed the highest frequency of subject terms.
https://doi.org/10.3743/KOSIM.2008.25.1.191

Kim, Dong-Joo;Kim, Han-Woo
- Proceedings of the KIEE Conference
- /
- 2006.10c
- /
- pp.521-523
- /
- 2006
This paper proposes an algorithm and data structure to support real-time full-text search over streamed or broadcast multimedia data containing real-time stenographic text. Because the traditional indexing methods used in information retrieval rely on linguistic information, they are costly. We therefore propose an algorithm and data structure based on the suffix array, a simple data structure with low space complexity. Suffix arrays are often useful for searching huge texts. However, the subtitle text of multimedia data grows over time, so a suffix array must be reconstructed as the text continually changes. We propose a data structure called a prefix array and a search algorithm that uses it.
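For readers unfamiliar with suffix arrays, the following minimal sketch shows the standard technique the abstract builds on: a naive suffix array construction and a binary search for a pattern. It does not implement the authors' prefix array, whose details are not given here; the function names `build_suffix_array` and `search` and the sample strings are assumptions made for illustration.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Naive construction: sort suffix start positions by the suffix they begin."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def search(text: str, sa: List[int], pattern: str) -> List[int]:
    """Return the sorted start positions of pattern, found by binary search on sa."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # lower bound: first suffix whose m-character prefix >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    hi = len(sa)
    while lo < hi:  # upper bound: first suffix whose m-character prefix > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[first:lo])


text = "banana"
sa = build_suffix_array(text)
positions = search(text, sa, "ana")

# When streamed text is appended, this plain suffix array has to be rebuilt,
# which is the cost the abstract points to for continually growing subtitles.
text += "s"
sa_after_append = build_suffix_array(text)
```

The naive construction here is chosen for brevity rather than speed; faster construction algorithms exist, but any of them still has to be rerun or patched when the text grows.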