• Title/Abstract/Keywords: topic

Analysis of Municipal Ordinances for Smart Cities of Municipal Governments: Using Topic Modeling

  • 서형준
    • 정보화정책 / Vol. 30, No. 1 / pp. 41-66 / 2023
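  • This study examines 74 smart city ordinances from 72 local governments to identify the direction of municipal smart city ordinances, using topic modeling to identify the main keywords of the ordinances and to classify the ordinances into topics by keyword. The analysis shows that keywords concerning the composition and operation of smart city committees appeared with high frequency in the ordinances. Latent Dirichlet Allocation (LDA) topic modeling of the ordinances yielded eight topics according to the related keywords: Topic 1 (security of smart city implementation matters), Topic 2 (smart city industry promotion), Topic 3 (formation of smart city resident councils), Topic 4 (support for the smart city implementation system), Topic 5 (personal information management), Topic 6 (smart city data utilization), Topic 7 (realization of intelligent informatized administration), and Topic 8 (smart city publicity). Topic 6, Topic 4, and Topic 1 had the largest shares, in that order. By region, Topics 5, 6, and 8 had larger shares in the capital area, while Topics 2, 3, and 4 had larger shares in non-capital regions. The capital area thus emphasized topics related to the actual operation of smart cities, whereas non-capital regions emphasized topics related to the preparatory stage of smart city implementation.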

Company Name Discrimination in Tweets using Topic Signatures Extracted from News Corpus

  • Hong, Beomseok;Kim, Yanggon;Lee, Sang Ho
    • Journal of Computing Science and Engineering / Vol. 10, No. 4 / pp. 128-136 / 2016
  • It is impossible for any human being to analyze the more than 500 million tweets that are generated per day. Lexical ambiguities on Twitter make it difficult to retrieve the desired data and relevant topics. Most of the solutions for the word sense disambiguation problem rely on knowledge base systems. Unfortunately, it is expensive and time-consuming to manually create a knowledge base system, resulting in a knowledge acquisition bottleneck. To solve the knowledge-acquisition bottleneck, a topic signature is used to disambiguate words. In this paper, we evaluate the effectiveness of various features of newspapers on the topic signature extraction for word sense discrimination in tweets. Based on our results, topic signatures obtained from a snippet feature exhibit higher accuracy in discriminating company names than those from the article body. We conclude that topic signatures extracted from news articles improve the accuracy of word sense discrimination in the automated analysis of tweets.

Topic Modeling with Deep Learning-based Sentiment Filters

  • 최병설;김남규
    • 한국정보시스템학회지:정보시스템연구 / Vol. 28, No. 4 / pp. 271-291 / 2019
  • Purpose: The purpose of this study is to propose a methodology to derive positive keywords and negative keywords through deep learning to classify reviews into positive reviews and negative ones, and then refine the results of topic modeling using these keywords. Design/methodology/approach: In this study, we extracted topic keywords by performing LDA-based topic modeling. At the same time, we performed attention-based deep learning to identify positive and negative keywords. Finally, we refined the topic keywords using these keywords as filters. Findings: We collected and analyzed about 6,000 English reviews of Gyeongbokgung, a representative tourist attraction in Korea, from Tripadvisor, a representative travel site. Experimental results show that the proposed methodology properly identifies positive and negative keywords describing major topics.

Generative probabilistic model with Dirichlet prior distribution for similarity analysis of research topic

  • Milyahilu, John;Kim, Jong Nam
    • 한국멀티미디어학회논문지 / Vol. 23, No. 4 / pp. 595-602 / 2020
  • We propose a generative probabilistic model with a Dirichlet prior distribution for topic modeling and text similarity analysis. It assigns a topic and calculates text correlation between documents within a corpus. It also provides posterior probabilities that are assigned to each topic of a document based on the prior distribution in the corpus. We then present a Gibbs sampling algorithm for inference about the posterior distribution and compute text correlation among 50 abstracts from papers published by IEEE. We also conduct supervised learning to set a benchmark that justifies the performance of LDA (Latent Dirichlet Allocation). The experiments show that the accuracy of topic assignment to a given document is 76% for LDA. The results for supervised learning show an accuracy of 61%, a precision of 93%, and an F1-score of 96%. A discussion of the experimental results provides a thorough justification based on probabilities, distributions, evaluation metrics, and correlation coefficients with respect to topic assignment.

On the Distribution of ‘-(N)un’ in Korean

  • 염재일
    • 한국언어정보학회지:언어와정보 / Vol. 5, No. 2 / pp. 57-74 / 2001
  • In this paper, I propose syntactic, semantic and pragmatic restrictions on the distribution of the contrastive topic marker ‘-(n)un’ in Korean. A contrastive topic is associated with another focus. The association with focus is subject to syntactic islands. On the other hand, there is no syntactic restriction between a phrase attached with ‘-(n)un’ and a focused expression within the ‘-(n)un’ phrase itself. In this area there is a semantic requirement that the alternatives generated by a focused expression be maintained up to the phrase attached with ‘-(n)un’. Finally, when ‘-(n)un’ is used in an embedded clause, the whole sentence becomes natural when the contrastive topic introduced by ‘-(n)un’ and its alternative contrastive topic, which is presupposed by the contrastive topic marker, jointly constitute a more complex topic which is related to the whole context. And exclusiveness facilitates the formation of the whole complex context.

A Study on MARC Based Topic Map

  • 장화수;고일주
    • 한국컴퓨터정보학회:학술대회논문집 / 한국컴퓨터정보학회 2008 38th Summer Conference Proceedings, Vol. 16, No. 1 / pp. 309-315 / 2008
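  • MARC, a standardization tool for bibliographic information processing, has attempted to adopt a web-based XML standard format because of problems with its own format and with organizing metadata for diverse web resources. It has been converted into MARCXML and is used interoperably between systems, but MARCXML merely converts the representation of MARC records into an XML structure without considering the semantic characteristics of bibliographic information or the expression of metadata. Topic Map, which is emerging as a core semantic technology, supports the distributed management of information and knowledge using XTM, the XML-based standard description language of ISO. Using XTM, the Topic Map language, when building databases of scholarly information resources makes it easy to integrate previously developed metadata in one place while providing flexibility and extensibility. However, building a new Topic Map on an existing system is difficult, requiring considerable cost and time. This study proposes that using the database schema and the metadata defined in MARC as information sources for extracting elements reusable in a Topic Map from already-built scholarly databases is an efficient development approach: it secures the inter-system interoperability characteristic of XML while expressing the complex conceptual structure of relationships in basic scholarly materials, the material types, and the semantic correlations among materials.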

Topic Performance: A Cross-Cultural Study of Korean and American 3-Year-Old Children

  • 이순형;성미영
    • 아동학회지 / Vol. 18, No. 2 / pp. 121-130 / 1997
  • This study investigated differences in the topic performance of 3-year-old Korean and American children. Sixteen mother-child dyads (8 American and 8 Korean) were tape-recorded during naturally occurring conversations. The tape-recorded data were transcribed using the observational checklist of Kertoy and Vetter (1995). Korean children engaged in topic performance nearly twice as often as American children. Korean children engaged in topic termination/initiation and continuation more often than American children. Korean children also engaged in topic collaboration and incorporation more often than American children, but there was no difference in off-topic.

Topics and Trends in Metadata Research

  • Oh, Jung Sun;Park, Ok Nam
    • Journal of Information Science Theory and Practice / Vol. 6, No. 4 / pp. 39-53 / 2018
  • While the body of research on metadata has grown substantially, there has been a lack of systematic analysis of the field of metadata. In this study, we attempt to fill this gap by examining metadata literature spanning the past 20 years. With the combination of a text mining technique, topic modeling, and network analysis, we analyzed 2,713 scholarly papers on metadata published between 1995 and 2014 and identified main topics and trends in metadata research. As the result of topic modeling, 20 topics were discovered and, among those, the most prominent topics were reviewed in detail. In addition, the changes over time in the topic composition, in terms of both the relative topic proportions and the structure of topic networks, were traced to find past and emerging trends in research. The results show that a number of core themes in metadata research have been established over the past decades and the field has advanced, embracing and responding to the dynamic changes in information environments as well as new developments in the professional field.

Trend Analysis of Research Topics in Ecological Research

  • Suntae Kim
    • Proceedings of the National Institute of Ecology of the Republic of Korea / Vol. 4, No. 1 / pp. 43-48 / 2023
  • This study analyzed research trends in the field of ecological research. Data were collected based on a keyword search of the SCI, SSCI, and A&HCI databases from January 2002 to September 2022. The seven keywords, including biodiversity, ecology, ecotourism, species, climate change, ecosystem, restoration, and wildlife, were recommended by ecological research experts. Word clouds were created for each of the searched keywords, and topic map analysis was performed. Topic map analysis using biodiversity, climate change, ecology, ecosystem, and restoration each generated 10 topics; topic map analysis using the ecotourism keyword generated 5 topics; and topic map analysis using the wildlife keyword generated 4 topics. Each topic contained six keywords.

Non face-to-face News Articles Keyword Using Topic Modeling

  • Shin, Ari;Hwangbo, Jun Kwon
    • 한국정보통신학회논문지 / Vol. 26, No. 11 / pp. 1751-1754 / 2022
  • News articles collected with the keyword "non face-to-face" were analyzed through topic modeling using the LDA algorithm. In this study, the collected articles were divided into two periods according to their publication dates: period 1 (the beginning of the COVID-19 spread) and period 2 (the end of the COVID-19 spread). The main topic words in the period 1 articles were support for non-face-to-face treatment, smart libraries, the beginning of the online financial era, non-face-to-face entrance exams and employment, and stock investment. The main topic words in the period 2 articles were conversion to non-face-to-face classes, increasing unmanned stores, online finance, the education industry, and home treatment. Further issues were discussed through visualization of the topic words. These results provide evidence that education and unmanned business are growing among non-face-to-face industries.