• Title/Summary/Keyword: 자동 주제 분류 (automatic subject classification)

A Experimental Study on the Development of a Book Recommendation System Using Automatic Classification, Based on the Personality Type (자동분류기반 성격 유형별 도서추천시스템 개발을 위한 실험적 연구)

  • Cho, Hyun-Yang
    • Journal of Korean Library and Information Science Society / v.48 no.2 / pp.215-236 / 2017
  • The purpose of this study is to develop an automatic classification system that recommends books appropriate to each of the nine Enneagram personality types, using book information reviewed by librarians. The data are book reviews of 501 titles recommended for children and young adults by the National Library for Children and Young Adults. The study assumes that people prefer different types of books depending on their preferences or personality type. Performance was tested for two types of machine learning model, one with a nonlinear kernel and one with a linear kernel, comprising 360 clustering models built from 6 types of index term weighting and feature selection and 10 feature selection thresholds. LIBLINEAR performed better than LibSVM (RBF kernel). Although the performance of the developed system fell somewhat below expectations, the result is meaningful as an early-stage experiment, given the high difficulty of classification by personality type.

A Study on Organizing the Web Using Facet Analysis (패싯 분석을 이용한 웹 자원의 조직)

  • Yoo, Yeong-Jun
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.15 no.1 / pp.23-41 / 2004
  • Two basic methods have been used to index and organize Web resources: automatic indexing by keyword extraction, and library classification schemes or the subject directories of search engines. However, both methods have failed to satisfy users' information needs, owing to the lack of standard criteria and the irrationality of their structural systems. In this paper I examine the structural limits of library classification schemes and the problems arising from the nature of Web resources, such as specificity and exhaustivity. I also explain the rationale for organizing Web resources by facet analysis, together with its strengths and limitations. In doing so, I propose three specific ways of using facet analysis: first, an indexing system based on facet analysis; second, the transformation of an enumerative classification scheme into a faceted classification scheme; and finally, a facet model for the subject directory of a domestic search engine. After examining the three methods, the study concludes that a controlled vocabulary built by facet analysis can be a useful method for organizing Web resources.

An Analysis of the Applicable Fields of UDC (UDC의 적용분야에 관한 연구)

  • Lee, Chang-Soo
    • Journal of Korean Library and Information Science Society / v.35 no.4 / pp.1-21 / 2004
  • The purpose of this study is to investigate the historical background, maintenance, revision, and application areas of the UDC (Universal Decimal Classification) in order to understand its current issues systematically. Since 1905, it has been extensively developed and is now administered by the UDC Consortium (UDCC). The UDCC updates the MRF (Master Reference File), an electronic form of the UDC schedules, once a year. The UDC is updated and published in standard, extended, and abridged editions, according to the degree of abridgement, and is available on the web. The UDC is now applicable to collection arrangement, SDI (Selective Dissemination of Information) services, searching subject bibliographies, use as a switching language, subject gateways and metadata on the Internet, and automatic classification.

A study on the use of DDC scheme in directory search engine for research information resources on internet (인터넷 학술정보자원의 디렉토리 서비스 설계에 있어서 DDC 분류체계의 활용에 관한 연구)

  • 최재황
    • Journal of the Korean Society for Information Management / v.15 no.2 / pp.47-68 / 1998
  • Although research information resources on the Internet are spread across thousands of computers, it is not always easy to obtain them at the right time and in the right manner. The purpose of this study is to use the DDC (Dewey Decimal Classification) scheme in a subject-based directory search engine for research information resources, to aid retrieval on the Internet. For the design of the classification code, the study followed the 'systematic order' of DDC, arranging subjects from the general to the specific in a logical order; for the design of the classification dictionary, the 'Relative Index' of DDC was used to bring together the various aspects of a subject.
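
The general-to-specific ordering and the relative index described in this abstract can be illustrated with a minimal Python sketch. It assumes DDC notations of three digits with an optional decimal extension and treats every added digit as one level of specificity, which is a simplification. The names `ddc_path`, `RELATIVE_INDEX`, and `directory_entries` are hypothetical and are not taken from the paper.

```python
def ddc_path(number):
    """Return the chain of notations from the main class down to `number`.

    Assumes `number` looks like "025.04": three digits, optionally
    followed by a decimal point and further digits.
    """
    base, _, decimals = number.partition(".")
    if len(base) != 3 or not base.isdigit():
        raise ValueError(f"not a DDC notation: {number!r}")
    if decimals and not decimals.isdigit():
        raise ValueError(f"not a DDC notation: {number!r}")
    # Main class (x00), division (xy0), section (xyz).
    path = [base[0] + "00", base[:2] + "0", base]
    # Each further decimal digit is treated as one more specific level.
    for i in range(1, len(decimals) + 1):
        path.append(f"{base}.{decimals[:i]}")
    # Drop repeats, e.g. "500" is its own main class, division, and section.
    deduped = []
    for code in path:
        if code not in deduped:
            deduped.append(code)
    return deduped


# Hypothetical relative-index entries: one subject term can point to
# several class numbers, bringing its different aspects together.
RELATIVE_INDEX = {
    "computers": ["004", "621.39"],
}


def directory_entries(term):
    """Directory paths for every class number listed under `term`."""
    return [ddc_path(n) for n in RELATIVE_INDEX.get(term.lower(), [])]


if __name__ == "__main__":
    print(ddc_path("025.04"))
    print(directory_entries("Computers"))
```

A directory built this way would place a resource classed at 025.04 under the successively narrower nodes returned by `ddc_path`, while a lookup in the index dictionary collects the scattered locations of one subject.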

A Study on the Musical Theme Clustering for Searching Note Sequences (음렬 탐색을 위한 주제소절 자동분류에 관한 연구)

  • 심지영;김태수
    • Journal of the Korean Society for Information Management / v.19 no.3 / pp.5-30 / 2002
  • In this paper, classification features are selected with a focus on musical content, namely note sequence patterns. Similarity between note sequences is measured and clusters of similar note sequences are constructed, so that users of a CBMR system can search more easily by being shown similar note sequences alongside the search results. The experimental data were the theme bars indexed in "A Dictionary of Musical Themes", which focuses on classical music, obtained as kern-type files. Humdrum Toolkit version 1.0 was used as the tool for processing note sequences. Hierarchical clustering was carried out in stages on four types of similarity matrix, distinguished by whether the note sequences were segmented and by where the starting point was. To evaluate the results, the WACS standard was used where a manual classification existed, and, for note sequences starting from any point in the sequence, the distribution of common feature patterns within the resulting clusters was used. According to the results, clustering with segmented features, regardless of the starting point, performed distinctly better than clustering with non-segmented features.
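
As a rough illustration of clustering themes by note-sequence similarity, the Python sketch below converts each theme to pitch intervals, compares themes by the Jaccard overlap of their interval n-grams, and groups themes whose similarity reaches a threshold. Cutting a single-link hierarchy at a similarity threshold yields the same groups. The interval representation, the Jaccard measure, the threshold, and all function names are assumptions made for this example; the abstract does not specify the paper's similarity measure, and the Humdrum Toolkit is not used here.

```python
from itertools import combinations


def intervals(pitches):
    """Convert MIDI pitch numbers to successive semitone intervals."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def ngrams(seq, n=3):
    """Set of length-n windows over seq, starting at every position."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def similarity(p, q, n=3):
    """Jaccard overlap of interval n-grams; 0.0 when both sets are empty."""
    a, b = ngrams(intervals(p), n), ngrams(intervals(q), n)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def single_link_clusters(themes, threshold=0.5, n=3):
    """Group themes linked by any pair with similarity >= threshold."""
    names = list(themes)
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            x = parent[x]
        return x

    for x, y in combinations(names, 2):
        if similarity(themes[x], themes[y], n) >= threshold:
            parent[find(x)] = find(y)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(c) for c in clusters.values())


if __name__ == "__main__":
    themes = {
        "A": [60, 62, 64, 65, 67],
        "B": [62, 64, 66, 67, 69],  # A transposed up two semitones
        "C": [60, 59, 57, 55, 53],
    }
    print(single_link_clusters(themes))
```

Working on intervals rather than absolute pitches makes the comparison insensitive to transposition, and taking n-grams from every position is one simple way to match sequences regardless of where they start.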

Semantic Topic Selection Method of Document for Classification (문서분류를 위한 의미적 주제선정방법)

  • Ko, Kwang-Sup;Kim, Pan-Koo;Lee, Chang-Hoon;Hwang, Myung-Gwon
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.1 / pp.163-172 / 2007
  • The Web, as a global network, includes text documents, video, sound, and so on, and connects distributed information through links. As the Web has developed, it has accumulated abundant information, most of it in text-based documents. Most users use the Web to retrieve the information they want, so much research has been done on retrieving text documents with methods based on probability, statistics, vector similarity, Bayesian approaches, and so on. These studies, however, could not consider both the subject and the semantics of documents, so users have to search again by hand. Korean documents are especially hard to find because research on Korean document classification is insufficient. To overcome these problems, we propose a Korean document classification method for semantic retrieval. The method first extracts the TF value and RV value of the concepts contained in a document and maps them onto U-WIN, a Korean vocabulary dictionary, to select the topic of the document. The method can classify documents semantically, and its efficiency was shown through experiments.
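
A minimal Python sketch of topic selection by term frequency and concept mapping follows. The `CONCEPT_MAP` dictionary is a tiny hypothetical stand-in for a lexical resource such as U-WIN, whose real structure is not described in the abstract. The RV value is omitted because its definition is not given, and English tokens split on whitespace are used only to keep the example short.

```python
from collections import Counter

# Hypothetical term-to-concept mapping standing in for a lexical resource.
CONCEPT_MAP = {
    "goal": "sports", "match": "sports", "striker": "sports",
    "stock": "finance", "bank": "finance", "interest": "finance",
}


def select_topic(text):
    """Pick the concept whose mapped terms have the highest total TF.

    Returns None when no term in the text maps to a known concept.
    """
    tf = Counter(text.lower().split())  # raw term frequencies
    scores = Counter()
    for term, count in tf.items():
        concept = CONCEPT_MAP.get(term)
        if concept is not None:
            scores[concept] += count
    if not scores:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return max(sorted(scores), key=scores.get)


if __name__ == "__main__":
    print(select_topic("The striker scored a goal and the match ended"))
```

Scoring at the concept level rather than the word level lets different words that share a concept reinforce one another, which is the intuition behind mapping terms onto a vocabulary network before choosing a topic.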

An Automatic Classification of Discourse Relations in the Arguing Structure of Korean Texts (한국어 텍스트의 논증 구조 내 담화 관계의 자동 분류 연구)

  • Lee, Sana;Shin, Hyopil
    • Annual Conference on Human and Language Technology / 2015.10a / pp.59-64 / 2015
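  • Recently, work on analyzing public opinion using online text data has been actively pursued. Such work requires identifying the argumentative structure and key content of subjectively oriented texts, and as the volume and diversity of data grow rapidly, automating that process is becoming unavoidable. In this study, we built a Korean text corpus consisting of opinions for and against policies and defined the discourse relations between the basic units that make up a text. The relations between units are predicted using machine learning and rule-based methods, and the results are combined into a tree structure corresponding to a single text. We also sought to obtain the key content of a text by extracting the sentences or clauses that directly support the topic sentence in the structure of the text.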

Debatable SNS Post Detection using 2-Phase Convolutional Neural Network (2-Phase CNN을 이용한 SNS 글의 논쟁 유발성 판별)

  • Heo, Sang-Min;Lee, Yeon-soo;Lee, Ho-Yeop
    • 한국어정보학회: Conference Proceedings / 2016.10a / pp.171-175 / 2016
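  • This study aims to automatically detect whether an SNS post is likely to provoke debate. Because debate-provoking potential is perceived through abstract features such as a post's topic, style, and nuance, it is difficult to handle with existing document classification techniques that rely on lexical features and simply look at n-grams. To learn abstract features that appear globally across the whole document, we propose a debatability detection model based on a 2-phase CNN. In experiments on posts collected from SNS, the proposed model showed a very large performance improvement over SVM, the most widely used method in conventional document classification, and a considerable improvement over a plain CNN.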

Debatable SNS Post Detection using 2-Phase Convolutional Neural Network (2-Phase CNN을 이용한 SNS 글의 논쟁 유발성 판별)

  • Heo, Sang-Min;Lee, Yeon-soo;Lee, Ho-Yeop
    • Annual Conference on Human and Language Technology / 2016.10a / pp.171-175 / 2016
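  • This study aims to automatically detect whether an SNS post is likely to provoke debate. Because debate-provoking potential is perceived through abstract features such as a post's topic, style, and nuance, it is difficult to handle with existing document classification techniques that rely on lexical features and simply look at n-grams. To learn abstract features that appear globally across the whole document, we propose a debatability detection model based on a 2-phase CNN. In experiments on posts collected from SNS, the proposed model showed a very large performance improvement over SVM, the most widely used method in conventional document classification, and a considerable improvement over a plain CNN.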

A Study on Varieties of Subject Access and Usabilities of the National Library of Korea Subject Headings (주제 접근의 다양성과 국립중앙도서관 주제명 표목의 활용가능성에 관한 연구)

  • Chung, Yeon Kyoung
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.25 no.4 / pp.171-185 / 2014
  • The purposes of this study are to examine the various methods of subject access in a rapidly changing environment and to suggest the future of subject access at the National Library of Korea (NLK). First, the current status and problems of the Library of Congress Subject Headings, a representative subject heading list worldwide, and ways of improving the effectiveness of subject retrieval are discussed. As ways of improving subject access, social bookmarking, folksonomy, tagging, facet applications, automatic keyword assignment, thesauri, classification systems, and auto-assigned search boxes are suggested. Finally, the current status of the NLK subject headings and ways of improving their use for subject access are presented.