• Title/Summary/Keyword: 주제연관성 (topic relevance)

Assessing the Utilization and Interrelatedness of Scopus Subject Categories (Scopus에 설정된 주제분류 활용도 및 상호 연관성에 대한 고찰)

  • Kim, Eungi
    • Journal of Korean Library and Information Science Society / v.50 no.1 / pp.251-272 / 2019
  • This study investigated the utilization and interrelatedness of Scopus subject categories, using the major and minor subject categories of journals listed in the 2017 Scopus index. The results showed varying degrees of interrelatedness among subject categories. At the major subject category level, utilization was highest in Medicine, while Social Sciences showed a greater degree of interrelatedness than Medicine. At the minor subject level, however, 2700 General Medicine was particularly dominant in both utilization and interrelatedness. Co-occurrences of minor subject categories also showed varying degrees of interrelatedness between pairs of minor subject categories. These pairs fell into the following types: a) two subject categories with identical or nearly identical descriptions, b) two different categories interrelated by subject area, and c) one category conceptually encompassing the other. Because utilization and interrelatedness vary among subject categories, minor subject categories that may greatly influence the major subject categories should be investigated in detail when conducting research studies.

Keyword-based Document Clustering Algorithm (주제어 기반 문서 클러스터링 알고리즘)

  • 장성호;강승식
    • Proceedings of the Korean Information Science Society Conference / 2002.04b / pp.469-471 / 2002
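  • In document clustering, which groups highly related documents together, extracting the keywords that reveal the relatedness between documents is a key problem, and the frequency-based keyword extraction used in typical information retrieval systems offers limited room for performance improvement. Document clustering also consumes considerable time and space on the similarity computations needed to determine the relatedness between documents. This paper proposes a new document clustering algorithm that applies a keyword extraction technique and groups documents by keyword relatedness.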

Time Analysis of Structural Element and Theme Association of Television News Imagery (텔레비전 뉴스 영상의 구조적 요소와 주제연관성 시계열 분석)

  • Park, Dug-Chun
    • The Journal of the Korea Contents Association / v.11 no.7 / pp.100-109 / 2011
  • This study is a content analysis of whether the proportions of structural elements and theme association in television news imagery, which can serve as indices of scene-based, realistic reporting, differ with historical background, and of what such differences mean. Most research on television news consists of horizontal studies of a single period and neglects vertical studies that reflect change over time. This study therefore analyzed 729 news items comprising 11,945 shots from MBC Newsdesk between 1987 and 2007, sampled by systematic random sampling at five-year intervals. The analysis found a high proportion of scene-based, realistic reporting elements such as 'sound-bite', 'event footage', and 'direct matching' in 1987 and 2007, and a high proportion of 'corroboration shot', 'file footage', 'indirect reference', and 'literal matching only' in 1997, indicating that reality-based reporting was not faithfully carried out in 1997.

A Study on the Topical Associations of Simultaneously Borrowed Books in Public Libraries (공공도서관 동시 대출 도서의 주제 연관성 분석 연구)

  • Woojin Kang;In Yeong Jeong;Jongwook Lee
    • Journal of Korean Library and Information Science Society / v.54 no.3 / pp.33-55 / 2023
  • There has been research to understand users' information behaviors using book circulation data of public libraries. In this study, we examined the subject areas of books simultaneously borrowed by users of public libraries and aimed to identify the relationships among the subject areas. To accomplish this, we utilized the Korean Decimal Classification codes of 984,790 loaned books in 2019 to transform the lists of concurrently borrowed books, totaling 22,443,699 records, by the same users on the same day, into vectors using the ITEM2VEC technique. Next, we extracted ten highly related classification codes for each classification code, utilizing a total of 522 classification codes to create a network. We identified 15 communities within this network and examined the characteristics of each community. Among the 15 communities, those consisting of two or more main classes allowed us to identify meaningful thematic associations. This study, grounded in users' book usage behaviors, has suggested the topics of books that could be borrowed together. The findings offer valuable insights for library collection development and placement, recommending related subject materials, and revising classification systems.
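
The step of finding highly related classification codes from concurrently borrowed books can be sketched roughly as follows. This is not the study's method: the study embedded codes with ITEM2VEC, whereas this minimal Python sketch swaps in plain co-occurrence counts as a simpler stand-in. The function name `top_related` and the loan lists are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

def top_related(baskets, k=10):
    """For each code, list up to k codes most often borrowed together with it."""
    pair_counts = defaultdict(Counter)
    for basket in baskets:
        # Count each unordered pair of distinct codes once per basket.
        for a, b in combinations(sorted(set(basket)), 2):
            pair_counts[a][b] += 1
            pair_counts[b][a] += 1
    return {code: [c for c, _ in counts.most_common(k)]
            for code, counts in pair_counts.items()}

# Hypothetical same-user, same-day loan lists of KDC codes (not study data).
baskets = [
    ["813", "810", "330"],
    ["813", "810"],
    ["330", "320"],
    ["330", "320"],
]
related = top_related(baskets, k=1)
```

Each code and its related codes could then be treated as the edges of a network, which is the input a community detection step would need.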

Subject Association Analysis of Big Data Studies: Using Co-citation Networks (빅데이터 연구 논문의 주제 분야 연관관계 분석: 동시 인용 관계를 적용하여)

  • Kwak, Chul-Wan
    • Journal of the Korean Society for Information Management / v.35 no.1 / pp.13-32 / 2018
  • The purpose of this study is to analyze the associations among the subject areas of big data research papers. Subject groups, the units of analysis, were extracted by applying co-citation networks; association rules were analyzed using the Apriori algorithm in R and visualized using the arulesViz package. As a result, 22 subject areas were extracted and divided into three clusters. The association types of the subjects were classified as 'professional type', 'general type', or 'expanded type' depending on the complexity of association. The professional type included library and information science and journalism. The general type included politics & diplomacy, trade, and tourism. The expanded type included other humanities, general social sciences, and general tourism. These association networks show a tendency to cite other relevant subject areas when citing a subject field, and libraries should consider academic information services that make use of these associations.
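
The measures behind association rules of this kind can be illustrated with a minimal Python sketch. The study itself used the Apriori algorithm in R with arulesViz; the code below is not that implementation but a brute-force computation of support, confidence, and lift for pairs only. The function name `pair_rules` and the sample subject-area sets are assumptions made for illustration.

```python
from itertools import permutations

def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets))
    # Support of a single item: fraction of transactions that contain it.
    single = {i: sum(i in s for s in sets) / n for i in items}
    rules = []
    for a, b in permutations(items, 2):
        both = sum(a in s and b in s for s in sets) / n
        if both < min_support:
            continue  # prune infrequent pairs
        confidence = both / single[a]   # estimate of P(b | a)
        lift = confidence / single[b]   # > 1 suggests positive association
        rules.append((a, b, both, confidence, lift))
    return rules

# Hypothetical co-cited subject areas per paper (not data from the study).
transactions = [
    ["library science", "journalism"],
    ["library science", "journalism", "tourism"],
    ["library science", "trade"],
    ["journalism", "tourism"],
]
rules = pair_rules(transactions, min_support=0.5)
```

Unlike this sketch, a full Apriori run also grows itemsets beyond pairs and prunes candidates level by level.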

Application Plans of Thematic Overlap Function in Electronic Cultural Atlas (전자문화지도에서의 주제별 중첩 기능 활용 방안)

  • Lee, Dong-yul;Kang, Ji-hoon;Moon, Sang-ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.445-447 / 2014
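  • As interest in electronic cultural atlases has grown recently, atlases based on a variety of themes are being researched and developed. However, because most existing electronic cultural atlases are produced around a single theme, it is difficult to analyze the relationships among themes, and because the atlases are not linked to one another, their use from multiple perspectives remains limited. To address these problems, this paper presents a way to represent multiple themes in a single electronic cultural atlas using a layer function. On this basis, it also proposes ways to use the thematic overlap function of the electronic cultural atlas to grasp the links among themes efficiently and to derive new knowledge from the associations among diverse themes.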

An Automated Topic Specific Web Crawler Calculating Degree of Relevance (연관도를 계산하는 자동화된 주제 기반 웹 수집기)

  • Seo Hae-Sung;Choi Young-Soo;Choi Kyung-Hee;Jung Gi-Hyun;Noh Sang-Uk
    • Journal of Internet Computing and Services / v.7 no.3 / pp.155-167 / 2006
  • It is desirable for users surfing the Internet to find Web pages that match their interests as closely as possible. To this end, this paper presents a topic-specific Web crawler that computes the degree of relevance, collects a cluster of pages for a given topic, and refines the preliminary set of related Web pages using term frequency/document frequency, entropy, and compiled rules. In the experiments, we tested our topic-specific crawler in terms of classification accuracy, crawling efficiency, and crawling consistency. First, the classification accuracy obtained with the set of rules compiled by CN2 was the best, compared with C4.5 and back propagation learning algorithms. Second, we measured the classification efficiency to determine the best threshold value affecting the degree of relevance. In the third experiment, the consistency of our topic-specific crawler was measured in terms of the number of resulting URLs that overlapped across different starting URLs. The experimental results imply that our topic-specific crawler was fairly consistent, regardless of the randomly chosen starting URLs.

A Topic Related Word Extraction Method Using Deep Learning Based News Analysis (딥러닝 기반의 뉴스 분석을 활용한 주제별 최신 연관단어 추출 기법)

  • Kim, Sung-Jin;Kim, Gun-Woo;Lee, Dong-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.873-876 / 2017
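  • Recently, to make information retrieval more efficient, there has been active research on analyzing data to extract and recommend the related words that best represent it. Existing studies propose generating related words by analyzing data with occurrence frequency or with machine learning techniques such as LDA. Machine learning techniques require experts to design by hand the features used to find the result, and considerable time is needed to find appropriate features that yield good results. They also have the drawback that parameters must be set manually, which demands much time and effort. To overcome these drawbacks, deep learning, which analyzes data by arranging artificial neural networks in multiple layers, has recently attracted attention. This paper uses deep learning to overcome the limitations of related-word extraction studies that rely on conventional machine learning techniques. First, Word2Vec, a neural-network-based word vector generator, is used to learn from a variety of text data and build a lookup table. Then, based on the lookup table, we propose a system that uses a convolutional neural network, a type of artificial neural network, to analyze recent news data related to a topic word entered by the user and extract the latest related words for each topic. We also evaluated performance by measuring the precision of the related words generated by the proposed system.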

A Study on Collection Use of an Public Libraries Focused of the Clustering Analysis of Circulation Statistics of the Seoul Borough A Library Users (공공도서관의 주제별 자료 이용 현황 분석: 서울특별시 A구 산하 공공도서관을 중심으로)

  • Kim, Wan-Jong
    • Journal of the Korean Society for Information Management / v.31 no.3 / pp.353-369 / 2014
  • The goal of this study is to analyze use patterns of library collections using circulation statistics from the users of nine public libraries in Seoul borough "A". For this study, 2,723,115 circulation records generated between June 2006 and June 2014 at the nine public libraries located in borough "A" were collected. All circulation records were divided, according to the Korean Decimal Classification (KDC), into 10 categories from general works (000) to history (900) and 100 divisions from general works (000) to biography (990); frequencies were analyzed by category, and a cluster analysis based on thematic relevance was performed.

Document Summarization Based on Sentence Clustering Using Graph Division (그래프 분할을 이용한 문장 클러스터링 기반 문서요약)

  • Lee Il-Joo;Kim Min-Koo
    • The KIPS Transactions: Part B / v.13B no.2 s.105 / pp.149-154 / 2006
  • The main purpose of document summarization is to reduce the complexity of documents that consist of sub-themes and to create a summary that covers those sub-themes. This paper proposes a summarization system that extracts salient sentences for each sub-theme by using graph division. A document can be represented as a graph of representative terms chosen through term relativity analysis based on co-occurrence information. This graph is then subdivided, using its connection information, to represent sub-themes. The divided graphs serve as sentence clusters whose members are closely related. Extracting salient sentences from the divided graphs yields a summary composed of the core sentences of each sub-theme, which improves summarization quality.
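
The idea of dividing a term co-occurrence graph into sub-theme clusters can be sketched as below. The abstract does not specify how terms are chosen or how the graph is divided, so this minimal Python sketch assumes each sentence is already reduced to its representative terms and treats each connected component as one sub-theme. The function name `cluster_sentences` and the sample terms are hypothetical.

```python
from itertools import combinations

def cluster_sentences(sentence_terms):
    """Group sentence indices by connected components of the term graph."""
    # Undirected graph: terms that share a sentence are linked.
    graph = {}
    for terms in sentence_terms:
        for t in terms:
            graph.setdefault(t, set())
        for a, b in combinations(set(terms), 2):
            graph[a].add(b)
            graph[b].add(a)
    # Label every term with a component id via iterative depth-first search.
    component = {}
    next_id = 0
    for start in graph:
        if start in component:
            continue
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component[node] = next_id
            stack.extend(n for n in graph[node] if n not in component)
        next_id += 1
    # All terms of one sentence share a component, so the first term suffices.
    clusters = {}
    for idx, terms in enumerate(sentence_terms):
        if terms:
            clusters.setdefault(component[terms[0]], []).append(idx)
    return sorted(clusters.values())

# Hypothetical representative terms per sentence (assumed input).
sentence_terms = [
    ["graph", "division"],
    ["division", "cluster"],
    ["summary", "sentence"],
    ["sentence", "salient"],
]
clusters = cluster_sentences(sentence_terms)
```

A summarizer built on this would then pick one or more salient sentences from each cluster; how salience is scored is left open here.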