• Title/Summary/Keyword: subject thesaurus

Search Result 39, Processing Time 0.021 seconds

Automatic Korean Text Categorization by Subject Thesaurus (분야별 관련어사전에 의한 한글 웹문서 자동분류)

  • Kim, Young;Chae, Soo-Hoan
    • Proceedings of the Korea Information Processing Society Conference / 2005.05a / pp.771-774 / 2005
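  • As the Internet has become widespread and the amount of text information available online has grown rapidly, effective information management and retrieval for scattered documents is required. Automatic document classification is the task of automatically assigning documents to predefined categories based on their content, and it enables efficient information management and retrieval. In particular, despite the importance of Korean language information processing, collecting and classifying materials in related fields remains difficult. This paper therefore proposes, among the steps of automatic categorization of Korean web documents, building a dictionary for each subject field and classifying documents by field more effectively through the removal of duplicate words.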
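
The idea in this abstract (per-field dictionaries with duplicate words removed) can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the toy dictionaries, the function names `remove_shared_terms` and `classify`, and the simple match-count scoring are hypothetical and are not the authors' method.

```python
from collections import Counter

def remove_shared_terms(domain_terms):
    """Drop every term that appears in more than one field dictionary."""
    counts = Counter(t for terms in domain_terms.values() for t in set(terms))
    return {
        domain: {t for t in terms if counts[t] == 1}
        for domain, terms in domain_terms.items()
    }

def classify(tokens, domain_terms):
    """Return the field whose dictionary matches the most tokens, or None."""
    scores = {
        domain: sum(1 for t in tokens if t in terms)
        for domain, terms in domain_terms.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

# Hypothetical toy dictionaries; "score" occurs in both fields.
DOMAIN_TERMS = {
    "sports": ["game", "team", "score", "player"],
    "economy": ["market", "stock", "score", "price"],
}
UNIQUE_TERMS = remove_shared_terms(DOMAIN_TERMS)

if __name__ == "__main__":
    print(classify(["the", "team", "won", "the", "game"], UNIQUE_TERMS))
```

In this sketch a word shared by several fields carries no evidence for any one of them, so a document containing only shared words is left unclassified.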


Personalized E-Mail Classification System Using Dynamic Thesaurus and Genetic Algorithm (동적 시소러스와 GA을 이용한 개별화된 E-Mail 분류시스템 (PECS))

  • 안희국;노희영
    • Proceedings of the Korean Information Science Society Conference / 2002.04b / pp.472-474 / 2002
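  • This paper proposes a structure for classifying e-mail by user suitability (preference). Classification is divided into a primary and a secondary stage. In the primary stage, a dynamic thesaurus is built from user-related information to judge user suitability, and comparison against this thesaurus determines whether a message is useful to the user. In the secondary stage, a message is assigned to a specific folder by comparing its suitability with keywords that a genetic algorithm extracts from the user thesaurus, centered on the folder keywords specified by the user. For testing, a mail information extraction agent that includes HAM (Hangul Analysis Module) was used to extract mail information words (Mail Information Word), and the 16 word entries extracted from the subject and body of a message, the thesaurus suitability information, and the classification suitability information were used as a single data structure. Using this integrated system structure and data structure, it was confirmed that the two-stage classification can classify mail in a way that approaches the user's preferences.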


A Study on Hangul Qualifier for Homographic Descriptors (동형이의어의 구별을 위한 한글한정어 사용에 관한 연구)

  • 김태수;최석두
    • Journal of the Korean Society for Information Management / v.14 no.1 / pp.107-124 / 1997
  • The main aim of this study is to discriminate the conceptual relationships between homographic descriptors. The roles of qualifiers and the problems of recent qualifier usage in Hangul, Hanja, and foreign languages, which is based largely on dictionaries, subject heading lists, and thesauri, are analyzed within the framework of our test thesaurus, developed as a macro-thesaurus. Finally, we propose some new ideas that must be integrated into the Hangul qualifier to make it generally applicable within the field of dictionaries, together with the method of representation, selection principles, and priority of Hangul qualifiers.


Query Processing Model Using Two-level Fuzzy Knowledge Base (2단계 퍼지 지식베이스를 이용한 질의 처리 모델)

  • Lee, Ki-Young;Kim, Young-Un
    • Journal of the Korea Society of Computer and Information / v.10 no.4 s.36 / pp.1-16 / 2005
  • When Web-based specialized retrieval systems for scientific fields severely restrict how users can express their information requests, the process of analyzing information content and the process of acquiring information become inconsistent. Accordingly, this study suggests a re-ranking retrieval model that reflects the content-based similarity between the user's query terms and index words by capturing the knowledge structure of documents. To accomplish this, the former constructs a thesaurus and a similarity relation matrix to provide a subject analysis mechanism, and the latter proposes an algorithm that establishes a search model, such as query expansion, to analyze the user's demands. The algorithm suggested in this study, which uses the information structure of the retrieval system, can therefore serve as a content-based retrieval mechanism that establishes a two-step search model to preserve recall and improve accuracy, which was a weak point of the previous fuzzy retrieval model.
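
As a rough illustration of query expansion over a term-similarity relation followed by re-ranking, the following sketch may help. It is not the authors' algorithm: the similarity values, the 0.5 threshold, the function names `expand_query` and `rerank`, and the additive scoring rule are all assumptions chosen for a small runnable example.

```python
def expand_query(terms, similarity, threshold=0.5):
    """Step 1: add related terms whose similarity meets the threshold.

    Returns a dict mapping each query term to a weight in [0, 1];
    original terms get weight 1.0, expanded terms get their similarity.
    """
    expanded = {}
    for term in terms:
        expanded[term] = 1.0
        for related, weight in similarity.get(term, {}).items():
            if weight >= threshold:
                expanded[related] = max(expanded.get(related, 0.0), weight)
    return expanded

def rerank(documents, weighted_query):
    """Step 2: score each document by the summed weights of matched index
    terms and return matching document ids, best first."""
    scored = []
    for doc_id, index_terms in documents.items():
        score = sum(w for t, w in weighted_query.items() if t in index_terms)
        if score > 0:
            scored.append((score, doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]

# Hypothetical similarity relation and index, for illustration only.
SIMILARITY = {"thesaurus": {"vocabulary": 0.8, "index": 0.4}}
DOCUMENTS = {
    "d1": {"vocabulary", "retrieval"},
    "d2": {"thesaurus", "vocabulary"},
    "d3": {"index"},
}

if __name__ == "__main__":
    query = expand_query(["thesaurus"], SIMILARITY)
    print(rerank(DOCUMENTS, query))
```

Expansion lets d1 be retrieved even though it never mentions the original query term, which is how recall is kept up, while the weighted score still ranks the document that matches the original term first.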


A Study on metadata structuralization for context representation of women's oral life history (여성구술생애기록물 맥락 표현을 위한 메타데이터 구조화에 관한 연구)

  • Lee, Jung Yeon;Lee, Jung Yeoun;Ryoo, Jong Duk;Lee, Jong Yoon
    • The Korean Journal of Archival Studies / no.30 / pp.57-88 / 2011
  • Oral history is the work of recording verbal content recreated from the memories of survivors. Oral history recording is accomplished through the collaboration of the interviewee, the interviewer, the cameraman, the recorder, the transcriber, and others. It is therefore important that the context at the time of production be expressed, and the planning, collection, preservation, and maintenance of oral records should be managed systematically. Starting from this problem, this study designed a conceptual metadata model that reflects the contextual characteristics of women's oral life-history records and extracted metadata elements from it. The whole records management process for women's oral life-history records, from planning, production, preservation, and management through to use, was classified into a hierarchy. The study also proposed a system that can express the characteristics of 'gender' through authority records and a subject thesaurus.

A Comparative Analysis of Subject Headings Related to Korea in the CCT and NDLSH (『중국분류주제사표(中国分类主题词表)』와 『국립국회도서관건명표목표(国立国会図書館件名標目表)』에 나타난 한국 관련 주제명표목에 대한 비교 분석)

  • Moon, Ji-Hyun
    • Journal of Korean Library and Information Science Society / v.43 no.3 / pp.121-141 / 2012
  • This study compares and analyzes the numbers and characteristics of Korea-related subjects included in the 2008 Japanese edition of the National Diet Library Subject Headings (NDLSH) and the 2nd edition of the Chinese Classified Thesaurus (CCT). The analysis shows that 258 subjects were found in NDLSH, approximately twice as many as the 137 subjects in CCT. CCT has more pure subjects, excluding references, than NDLSH. On the other hand, NDLSH has many more subjects when personal names, corporate headings, and subjects combined with detailed headings are included. Meanwhile, relatively more CCT subjects fall in the fields of politics, diplomacy, and the military, because CCT is characterized by socialism and by a pro-North Korea stance. Moreover, a considerable number of subjects reflecting North Korea's viewpoint are included in CCT. NDLSH only recently changed the names of South and North Korea to the "Republic of Korea" and the "Democratic People's Republic of Korea", respectively. CCT, on the other hand, uses "Joseon" more frequently than "Korea", and the distinction between the names is unclear. CCT thoroughly supports the stance of the developed country, directly involved in the disputable subjects between two countries such as "Dokdo", "the East Sea", "Dumangang", and "Baekdusan". Both heading lists consider "Balhae" as part of Chinese history in CCT, which has ignored the position of Korea.

A Study on Improving of Access to School Library Collection through Elementary School Students' DLS Search Behavior Analysis (초등학생의 학교도서관 자료 검색 행태 분석을 통한 독서로DLS의 자료 접근성 향상 방안 고찰)

  • Bongsuk Kang;Jeonghoon Lim
    • Journal of the Korean Society for Library and Information Science / v.58 no.2 / pp.317-342 / 2024
  • The purpose of this study is to explore ways to improve accessibility to school library materials through analysis of elementary school students' information search behavior in DLS. Accordingly, the DLS search process was recorded for 26 students attempting a DLS search in the school library, and data were collected through a pre-search questionnaire on overall information needs and a post-search questionnaire on the search process and results. The analysis found that satisfaction was low when the main purpose of DLS use was simple leisure reading, when the search time was long, when many search words were used, and when there were too many search results. Accordingly, it was emphasized that curriculum subject-related metadata elements should be developed and that a curriculum subject-specific thesaurus should be built and used to build lists and support user searches. In addition, it was suggested that the basic functions provided in external searches should be included, and that a foundation should be laid in terms of resources and curriculum to systematically provide information utilization education to elementary school students who lack the ability to select search terms and judge the suitability of results after the search. It was also proposed to provide an integrated search service with external resources and a personalized book recommendation service.

A Study on the Retrieval Effectiveness of KoreaMed using MeSH Search Filter and Word-Proximity Search (검색용 MeSH 필터와 단어인접탐색 기법을 활용한 KoreaMed 검색 효율성 향상 연구)

  • Jeong, So-Na;Jeong, Ji-Na
    • Journal of the Korea Academia-Industrial Cooperation Society / v.18 no.5 / pp.596-607 / 2017
  • This study examined a method for adding terms related to "stomach neoplasms" as filters to the Medical Subject Headings (MeSH) for search, as well as a method for improving search efficiency through a word-proximity search based on the measured distance between co-occurring terms. A total of 8,625 articles published between 2007 and 2016 with the major topic term "stomach neoplasms" were downloaded from PubMed article titles, and the vocabulary to be added to the MeSH for search was analyzed. Search efficiency was verified on 277 articles that had "Stomach Neoplasms" indexed as MEDLINE MeSH in KoreaMed. As a result, 973 terms were selected as candidate vocabulary. "Gastric Cancer" (2,780 appearances) was the most frequent term, and 7,376 compound words (88.51%) combined the histological terms of "stomach" and "neoplasm", such as "gastric adenocarcinoma" and "gastric MALT lymphoma". A total of 5,234 compound words (70.95%), in which the co-occurring distance was two words, were found. The matching rate through the MEDLINE MeSH and KoreaMed MeSH Indexer was 209 articles (75.5%). The search efficiency improved to 263 articles (94.9%) when the search filters were added, and to 268 articles (96.7%) when the 13 word-proximity search technique of the co-occurring terms was applied. This study showed that using a thesaurus to improve search efficiency in a natural language search can maintain the advantages of controlled vocabulary. Search accuracy can be improved by using the word-proximity search instead of a Boolean search.
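
To make the contrast between a Boolean search and a word-proximity search concrete, here is a minimal sketch. It is only an illustration under stated assumptions: the tokenizer, the function names `boolean_and` and `within_proximity`, the example titles, and the definition of distance as the difference in token positions (adjacent words have distance 1) are hypothetical and are not taken from the study or from KoreaMed.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def boolean_and(text, term_a, term_b):
    """Boolean AND: both terms occur anywhere in the text."""
    tokens = set(tokenize(text))
    return term_a in tokens and term_b in tokens

def within_proximity(text, term_a, term_b, max_distance=2):
    """Proximity match: the terms occur within max_distance token positions."""
    tokens = tokenize(text)
    pos_a = [i for i, t in enumerate(tokens) if t == term_a]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b]
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)

if __name__ == "__main__":
    near = "Gastric MALT lymphoma in adults"
    far = "Gastric surgery outcomes for lymphoma"
    for title in (near, far):
        print(title,
              boolean_and(title, "gastric", "lymphoma"),
              within_proximity(title, "gastric", "lymphoma"))
```

Both titles satisfy the Boolean AND, but only the first passes the proximity test, which is the sense in which a proximity constraint can be stricter than a Boolean match.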

A Curricular Study on AI & ES in Library and Information Science (문헌정보학에서의 인공지능과 전문가시스템 교육과정 연구)

  • Koo Bon-Young;Park Mi-Young
    • Journal of the Korean Society for Library and Information Science / v.32 no.2 / pp.211-232 / 1998
  • The purpose of this study is to specify the Library and Information Science content needed to train information professionals to meet changes in the technology and systems environment. Recognizing the need for Artificial Intelligence and Expert Systems (AI and ES) created by the changing environment of the latest information technology, this work also aims to provide fundamental data and a way of deciding which AI and ES content to introduce into Library and Information Science. The results in brief are as follows. 1. Given rapid changes in advanced information technology and computer applications, the most important points are, in order of importance: finding available network sources, indexing online databases, analyzing and designing information systems, and computer application ability. 2. Within AI and ES, the most important training areas for Library and Information Science are database handling, thesauri, natural language processing, and knowledge representation. 3. Library and information science professors recognize that more Library and Information Science students need to be educated in artificial intelligence and expert systems. 4. In the coming age, it is increasingly recognized that artificial intelligence and expert systems improve the work of information professionals in reference service, cataloging, classification, information retrieval, and document delivery. 5. Given the importance that library and information science professors attach to AI and ES, curricula on AI and ES should be introduced into library and information science curricula nationwide, in the order of importance given in 1. above.
