• Title/Summary/Keyword: 의미적 토픽 (semantic topic)

Topic Model Augmentation and Extension Method using LDA and BERTopic (LDA와 BERTopic을 이용한 토픽모델링의 증강과 확장 기법 연구)

  • Kim, SeonWook;Yang, Kiduk
    • Journal of the Korean Society for Information Management / v.39 no.3 / pp.99-132 / 2022
  • The purpose of this study is to propose AET (Augmented and Extended Topics), a novel method for synthesizing LDA and BERTopic results, and to apply it experimentally to recently published LIS articles. To this end, 55,442 abstracts from 85 LIS journals in the WoS database, spanning January 2001 to October 2021, were analyzed. AET first constructs a Word2Vec-based cosine similarity matrix between the LDA and BERTopic results, then extracts AT (Augmented Topics) by repeating matrix reordering and segmentation for as long as the semantic relations remain valid, and finally determines ET (Extended Topics) by removing any LDA-related residual subtopics from the matrix and ordering the rest by F1 (BERTopic topic size rank, inverse cosine similarity rank). Compared with the baseline LDA result, AT effectively concretized the original LDA topic model, and ET discovered new meaningful topics that LDA did not. In the qualitative performance evaluation, AT performs better than LDA, while ET shows similar performance except in a few cases.
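
The first AET step, a cosine similarity matrix between LDA and BERTopic topics, might be sketched as follows. This is only an illustration under stated assumptions: the toy embeddings stand in for trained Word2Vec vectors, representing a topic by the mean of its top-word vectors is an assumption the abstract does not confirm, and the helper names (`topic_vector`, `similarity_matrix`) are hypothetical.

```python
import math

# Hypothetical toy embeddings standing in for trained Word2Vec vectors.
TOY_VECTORS = {
    "library": [0.9, 0.1, 0.0],
    "catalog": [0.8, 0.2, 0.1],
    "citation": [0.1, 0.9, 0.1],
    "impact": [0.2, 0.8, 0.0],
    "twitter": [0.0, 0.1, 0.9],
}


def topic_vector(words, vectors):
    """Represent a topic as the mean of its top-word vectors (an assumption)."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        raise ValueError("topic has no words with a known vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_matrix(lda_topics, bert_topics, vectors):
    """Rows are LDA topics, columns are BERTopic topics."""
    lda_vecs = [topic_vector(t, vectors) for t in lda_topics]
    bert_vecs = [topic_vector(t, vectors) for t in bert_topics]
    return [[cosine(l, b) for b in bert_vecs] for l in lda_vecs]


# Toy topics, each given as a list of top words.
lda_topics = [["library", "catalog"], ["citation", "impact"]]
bert_topics = [["catalog", "library"], ["impact"], ["twitter"]]
matrix = similarity_matrix(lda_topics, bert_topics, TOY_VECTORS)
```

The reordering, segmentation, and F1-based ranking steps are not shown, since the abstract does not describe them in enough detail to reproduce.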

'Korean Wave' News Analysis Using News Big Data ('한류' 경향에 관한 국내 언론 기사 빅데이터 분석 연구)

  • Hwang, Seo-I;Park, Jeong-Bae
    • Journal of Korea Entertainment Industry Association / v.14 no.5 / pp.1-14 / 2020
  • This study applied agenda-setting theory to conduct topic modeling and semantic network analysis of the 'Korean Wave' and its meaning in Korean society from 2000 to 2019. For this purpose, a total of 197,992 newspaper articles reporting on 'Korean Wave' issues were analyzed. The results are as follows. First, in the Korean press the term 'Korean Wave' appeared mainly in connection with Korea-related regions, culture, and the economy. Second, a total of nine topics related to Korean Wave issues emerged. These were, in order, 'broadcast', 'export', 'domestic and foreign affairs', 'education', 'beauty and fashion', 'music and performance', 'tourism', 'media (platform)', and 'region'. Lastly, the Korean Wave was discussed mainly in the cultural and economic areas. In addition, it was clustered into five characteristics: 'cultural hallyu', 'business hallyu', 'education', 'environment', and 'geography'.

Tweets analysis using a Dynamic Topic Modeling : Focusing on the 2019 Koreas-US DMZ Summit (트윗의 타임 시퀀스를 활용한 DTM 분석 : 2019 남북미정상회동 이벤트를 중심으로)

  • Ko, EunJi;Choi, SunYoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.2 / pp.308-313 / 2021
  • In this study, tweets about the 2019 Koreas-US DMZ Summit were collected together with their time sequence and analyzed using a sequential topic modeling method, Dynamic Topic Modeling (DTM). On microblogging services such as Twitter, unstructured data mixing news and opinion about a single event is generated simultaneously and on a large scale, and information and reactions are produced in the same message format. Therefore, to grasp a topic trend, the contextual meaning can be found only through pattern analysis that reflects the characteristics of sequential data. DTM was computed after obtaining topic coherence scores and evaluating Latent Dirichlet Allocation (LDA); 30 topics related to news reports and opinions were derived, and the occurrence probability and keywords of each topic evolved dynamically over time. In conclusion, the study found that DTM is a suitable model for analyzing the trend of integrated topics in a specific event over time.
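
As an illustration of how a topic coherence score can be computed, here is a minimal sketch of the UMass measure. The abstract does not say which coherence measure the authors used, so UMass is an assumed stand-in, and the function name and toy tweets below are hypothetical.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence: sum over word pairs of log((D(w_m, w_l) + 1) / D(w_l)).

    top_words is assumed to be ordered from most to least probable;
    D(...) counts the documents that contain all of the given words.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                continue  # skip words that never occur in the corpus
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / denom)
    return score


# Hypothetical tokenized tweets, for illustration only.
docs = [
    ["summit", "dmz", "trump"],
    ["summit", "dmz", "kim"],
    ["weather", "rain"],
    ["summit", "trump"],
]
coherent = umass_coherence(["summit", "dmz"], docs)
incoherent = umass_coherence(["summit", "rain"], docs)
```

Word pairs that co-occur in many documents push the score toward zero, while pairs that never co-occur make it more negative, so comparing average coherence across candidate topic counts is one way to choose the number of topics.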

Ontology based Retrieval System for Shopping Sites Customer (온톨로지 기반의 쇼핑 사이트 고객을 위한 검색 시스템)

  • Gu Mi-Sug;Hwang Jeong-Hee;Ryu Keun-Ho
    • Annual Conference of KIPS / 2004.11a / pp.51-54 / 2004
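  • Unlike the existing Web, the Semantic Web defines the meaning of information and supports semantic links among pieces of information, and it has therefore recently emerged as the next-generation Web. Such semantic links require ontologies, the foundation of the Semantic Web. An ontology defines metadata about resources so that they can be semantically linked, which enables efficient information retrieval. To improve retrieval efficiency, this paper proposes an information retrieval system based on ontologies, the core of the Semantic Web. Its aim is to examine users' purchase patterns and recommend suitable information to customers, for effective marketing on shopping sites. The ontology was built with topic maps based on XTM. On top of the ontology, data mining techniques were used to find users' purchase patterns and deliver accurate information. User patterns were analyzed with a multi-level, multi-dimensional frequent pattern mining algorithm based on the frequent-pattern tree technique, making information retrieval more efficient.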

Topic modeling and topic change trend analysis for advanced construction technologies (건설신기술에 대한 토픽 모델링 및 토픽 변화추이 분석)

  • Jeong, Seong Yun;Kim, Nam Gon
    • Smart Media Journal / v.10 no.4 / pp.102-110 / 2021
  • An advanced construction technology endorsement system is currently operated to promote the development of domestic construction technology. We examined the implicit meanings inherent in the advanced construction technologies endorsed through this system by analyzing the relationships among highly important emerging vocabularies associated with them. For this purpose, 918 cases of advanced construction technology information were collected. Based on the endorsement year and summary of each technology, the importance of the emerging vocabularies was measured for each advanced construction technology. Then, based on the LDA model, the degree of influence between related vocabularies was evaluated for each of the four topic areas, and topics were analyzed according to technical application field. The trend of changes in highly influential vocabularies from 1990 to 2021 was inferred for each topic. Future changes in the degree of influence were predicted for the topics of environment, machinery, facilities, and maintenance and reinforcement of structures, and for related technology fields.

Topic Modeling and Network Analysis of Peace Education and Unification Education Based on Big Data Analysis (빅데이터 분석에 기반한 평화교육과 통일교육의 토픽 모델링 및 네트워크 분석)

  • Kim, Byung-Man
    • Journal of Convergence for Information Technology / v.12 no.3 / pp.25-37 / 2022
  • The purpose of this study is to comprehensively examine trends in policies, discourses, educational directions and contents, and social issues by deriving the subjective characteristics of peace education and unification education based on big data analysis. The results of this study are as follows. First, 'peace', 'unification', 'education', 'research', 'student', 'school', 'teacher', 'target', and 'Korean Peninsula' were important keywords common to peace education and unification education. Second, the top topic of peace education was 'peace education and civic education', and the top topic of unification education was 'sympathy and participation in unification education'. Third, the topics that showed an upward trend by regime were 'World Peace and Human Rights' and 'Object and Direction of Peace Education' in peace education, and 'Subject of Unification Education' in unification education. Fourth, in peace education the centrality of 'peace', 'education', 'student', 'school', and 'peace education' was high, and in unification education that of 'unification', 'education', 'unification', 'school', and 'teacher' was high. Based on these results, the study aims to broaden the understanding of peace education and unification education and to provide meaningful implications for establishing policies and conducting follow-up studies.

WV-BTM: A Technique on Improving Accuracy of Topic Model for Short Texts in SNS (WV-BTM: SNS 단문의 주제 분석을 위한 토픽 모델 정확도 개선 기법)

  • Song, Ae-Rin;Park, Young-Ho
    • Journal of Digital Contents Society / v.19 no.1 / pp.51-58 / 2018
  • As the number of SNS users and the volume of SNS data have increased explosively, research based on SNS big data has become active. In social mining, Latent Dirichlet Allocation (LDA), a typical topic modeling technique, is used to identify the similarity of texts in large volumes of unclassified SNS text and to extract trends from them. However, LDA has difficulty inferring high-level topics from short texts, because infrequent word occurrences make such data semantically sparse. BTM addressed this limitation of LDA by modeling combinations of two words. However, BTM has its own limitation: because it is influenced more by the high-frequency word in each word pair, it cannot compute weights that take into account the relation to each topic. In this paper, we propose a technique that improves the accuracy of the existing BTM by reflecting semantic relations between words.
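
The "combination of two words" that BTM relies on can be illustrated with a short sketch of biterm extraction. It covers only the step of collecting unordered word pairs from each short text, not BTM inference or the WV-BTM weighting proposed in the paper; the function names and toy posts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(corpus):
    """Count biterms over a corpus of tokenized short texts."""
    counts = Counter()
    for tokens in corpus:
        counts.update(extract_biterms(tokens))
    return counts


# Hypothetical tokenized SNS posts, for illustration only.
corpus = [["coffee", "seoul", "cafe"], ["seoul", "cafe"]]
counts = count_biterms(corpus)
```

Because biterms are pooled over the whole corpus rather than per document, co-occurrence evidence accumulates even when each individual text is very short.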

Topic Modeling on Fine Dust Issues Using LDA Analysis (LDA 기법을 이용한 미세먼지 이슈의 토픽모델링 분석)

  • Yoon, soonuk;Kim, Minchul
    • Journal of Energy Engineering / v.29 no.2 / pp.23-29 / 2020
  • In this study, news data on fine dust from the last 10 years were collected, and 80 topics were selected through LDA analysis. Weather-related information made up the main words of the topics, and fine dust was found to become a major issue at temperatures below 10 degrees Celsius. The frequency of media exposure and the maximum concentration of fine dust are positively correlated. The major topics concerned fine dust reduction measures and the government's comprehensive measures over the past decade, products related to fine dust such as air purifiers, policies protecting vulnerable people from fine dust, and fine dust reduction through R&D. Measures against fine dust as a social issue appear to be closely related to government policy.

Structure Modeling Techniques for the 3D Architecture using Topic Maps (토픽맵을 이용한 3D 건축물의 구조모델링 기법 연구)

  • Kim, So-Young;Lim, Soon-Bum;Woo, Sung-Ho;Choy, Yoon-Chul
    • Annual Conference of KIPS / 2006.11a / pp.223-226 / 2006
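  • Virtual reality technology focuses on visualizing three-dimensional virtual spaces and objects. As a result, users cannot make full use of the 3D data and end up using it only as visual material. To address this, research is needed that makes effective use of not only visual information but also information about structure and relationships. In this study, XML-based topic maps were therefore applied to 3D architecture in order to give meaning to internal structure and relationships as well as external appearance. The gongpo (bracket set) of a traditional building was modeled so that each object could be manipulated by the user, the structure and connection relationships of the objects were analyzed, and a topic map was written based on the defined structure and relationships. The topic map was then converted using DOM so that it could be applied to the modeling data. Through this study, structural information could be readily grasped even for highly complex structures, and hierarchical connection relationships could also be easily identified.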

Conflict Detection and Resolution Method for Merging of Ontologies based on Decision Support Tree (온톨로지 병합을 위한 의사지원트리 기반 충돌 탐지 및 해결 기법)

  • Jeong, Hyeon-Suk;Kim, Jeong-Min;Lee, Seong-Ju
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.04a / pp.147-150 / 2007
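  • This paper defines, as a tree structure, the types of conflicts that can arise when merging topic-map-based ontologies, and proposes a method for merging two ontologies into one through conflict detection and resolution. Based on the similarity values of semantically corresponding elements, merge conflicts are classified into a tree structure of element-based, structure-based, and temporary-based conflicts, and this conflict tree is used to detect and resolve merge conflicts between two mapped elements. For the experiments, the topic map query language tolog was used to compare query results before and after merging Eastern and Western philosophy ontologies and German literature ontologies, and merge performance was evaluated in terms of precision and recall. The results showed that lossless merging is possible.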
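
Precision and recall over query results, the two measures used in the evaluation above, can be computed as in the following sketch. The topic identifiers are hypothetical placeholders, not data from the paper, and the function name is an assumption for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items.

    precision = |retrieved ∩ relevant| / |retrieved|
    recall    = |retrieved ∩ relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query results against a merged ontology (placeholder IDs).
expected = {"topic-a", "topic-b", "topic-c"}
after_merge = {"topic-a", "topic-b", "topic-c"}
precision, recall = precision_recall(after_merge, expected)
```

In this toy case both values are 1.0, the situation in which a merge loses no query answers and introduces no spurious ones.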
