• Title/Summary/Keyword: 토픽모델

A Multi-Strategic Mapping Approach for Distributed Topic Maps (분산 토픽맵의 다중 전략 매핑 기법)

  • Kim Jung-Min;Shin Hyo-phil;Kim Hyoung-Joo
    • Journal of KIISE:Software and Applications
    • /
    • v.33 no.1
    • /
    • pp.114-129
    • /
    • 2006
  • Ontology mapping is the task of finding semantic correspondences between two ontologies. To improve the effectiveness of ontology mapping, we need to consider the characteristics and constraints of the data models used to implement ontologies. Earlier approaches to ontology mapping, however, have proven inefficient because they transform the input ontologies into graphs and take into account all the nodes and edges of those graphs, which requires a great amount of processing time. In this paper, we propose a multi-strategic mapping approach that finds correspondences between ontologies based on the syntactic or semantic characteristics and constraints of topic maps. The approach includes topic name-based, topic property-based, hierarchy-based, and association-based mapping. It also uses a hybrid method in which a combined similarity is derived from the results of the individual mapping approaches. In addition, we do not need to generate all cross-pairs of topics from the ontologies, because unmatched pairs of topics can be removed using the characteristics and constraints of topic maps. For our experiments, we used oriental philosophy ontologies, western philosophy ontologies, the Yahoo western philosophy dictionary, and the Yahoo German literature dictionary as input ontologies. Our experiments show that the automatically generated mapping results conform to the outputs generated manually by domain experts, which is promising for further work.
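
The hybrid step in the abstract above, where one combined similarity is derived from the individual mapping strategies, might look roughly like the following sketch. The abstract does not say how the scores are combined, so the weighted average, the function name `combined_similarity`, the equal weights, and the example scores are all assumptions made for illustration.

```python
def combined_similarity(scores, weights):
    """Weighted average of per-strategy similarity scores, each in [0, 1]."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical scores for one candidate topic pair, one per mapping strategy.
scores = {"name": 0.9, "property": 0.6, "hierarchy": 0.5, "association": 0.4}

# Equal weights are an assumption; the abstract does not state the weighting.
weights = {"name": 1.0, "property": 1.0, "hierarchy": 1.0, "association": 1.0}

combined = combined_similarity(scores, weights)
print(f"combined similarity: {combined:.2f}")
```

A pair would then be accepted as a correspondence when its combined score exceeds some threshold, which is again a design choice not given in the abstract.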

Comparison of policy perceptions between national R&D projects and standing committees using topic modeling analysis : focusing on the ICT field (토픽모델링 분석을 활용한 국가연구개발사업과제와 국회 상임위원회 사이의 정책 인식 비교 : ICT 분야를 중심으로)

  • Song, Byoungki;Kim, Sangung
    • Journal of Industrial Convergence
    • /
    • v.20 no.7
    • /
    • pp.1-11
    • /
    • 2022
  • In this paper, numerical values are derived using topic modeling, one of the data-based evaluation methodologies discussed by various research institutes. We focus on the ICT field to see whether there is a difference in policy perception between national R&D projects and standing committees. First, we create a model for classifying ICT documents by training a HAN model on R&D project data. We then perform LDA topic modeling on the ICT documents classified by that model and compare the distributions of the topics derived from the R&D project data and from the proceedings of standing committees. In total, 26 topics were derived. The R&D project data contained specialized topics, whereas the standing committees discussed relatively social and popular issues. Because the difference in perception can be confirmed numerically, this work can serve as a basic study on indicators for future policy or project evaluation.

A Study on Technology Trend of Power Semiconductor Packaging using Topic model (토픽모델을 이용한 전력반도체 패키징 기술 동향 연구)

  • Park, Keunseo;Choi, Gyunghyun
    • Journal of the Microelectronics and Packaging Society
    • /
    • v.27 no.2
    • /
    • pp.53-58
    • /
    • 2020
  • Power semiconductor packaging technology for electric vehicles was analyzed. Valid patents were collected, and topic modeling using the LDA technique was performed on them. The patents were classified into 20 topics, and the technology behind each topic was defined through the words extracted for it. To analyze the trend of each topic, hot and cold topics were derived through regression analysis on frequency by year. Package structure technology according to withstand voltage, input/output-related control technology, and heat dissipation technology were derived as hot topics, and inductance reduction technology was derived as a cold topic.
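
As a rough illustration of the hot/cold step described above, the sketch below fits an ordinary least-squares line to a topic's yearly frequency and labels the topic by the sign of the slope. This is a simplification under stated assumptions: studies of this kind usually also test the slope for statistical significance, which is omitted here, and the function names and yearly counts are hypothetical.

```python
def slope(years, counts):
    """Least-squares slope of counts regressed on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    return sxy / sxx


def label_topic(years, counts):
    """Label a topic 'hot' if its frequency rises over time, 'cold' if it falls."""
    s = slope(years, counts)
    if s > 0:
        return "hot"
    if s < 0:
        return "cold"
    return "flat"


# Hypothetical yearly patent counts for two topics.
years = [2015, 2016, 2017, 2018, 2019]
rising = [2, 4, 6, 8, 10]
falling = [9, 7, 6, 4, 2]

print(label_topic(years, rising), label_topic(years, falling))
```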

Investigation of Research Topic and Trends of National ICT Research-Development Using the LDA Model (LDA 토픽모델링을 통한 ICT분야 국가연구개발사업의 주요 연구토픽 및 동향 탐색)

  • Woo, Chang Woo;Lee, Jong Yun
    • Journal of the Korea Convergence Society
    • /
    • v.11 no.7
    • /
    • pp.9-18
    • /
    • 2020
  • This research investigates the main research topics and trends in the information and communication technology (ICT) field in Korea using LDA (Latent Dirichlet Allocation), one of the topic modeling techniques. The experimental dataset of 5,200 ICT research and development (R&D) projects was acquired by downloading the R&D project dataset for the most recent five years from NTIS (National Science and Technology Information Service) and matching it with the EZone system of IITP. We found that the majority of research topics were intelligent information technologies such as AI, big data, and IoT, and that the main research trend was hyper-realistic media. Finally, the results of topic modeling on the national R&D dataset are expected to provide useful information for establishing plans and strategies for future research and development in the ICT field.

COVID-19 and Korean Family Life on Social Media: A Topic Model Approach (소셜 빅데이터로 알아본 코로나19와 가족생활: 토픽모델 접근)

  • Park, Sunyoung;Lee, Jaerim
    • The Journal of the Korea Contents Association
    • /
    • v.21 no.3
    • /
    • pp.282-300
    • /
    • 2021
  • The purpose of this study was to explore what social media posts tell us about family life during the COVID-19 pandemic by examining the keywords and topics underlying posts on blogs and online forums. Our criteria for web crawling were (a) blog and forum posts on Naver and Daum, the top portal sites in Korea, (b) posts between February 23 and April 19, 2020, the period of the first heightened social distancing orders, and (c) inclusion of "COVID" and "family" or "COVID" and "home." We analyzed 351,734 posts using TF-IDF values and topic modeling based on latent Dirichlet allocation. We identified and named 22 topics including COVID-19 prevention, family infection, family health, dietary life and changes, religious life, stuck at home, postponed school year, family events, travel and vacations, concerns about family and friends, anxiety and stress, disaster and damage, COVID-19 warning text messages, family support policies, Shin-cheon-ji and Daegu. The results show that COVID-19 impacted various domains of family life including health, food, housing, religion, child care, education, rituals, and leisure as well as relationships and emotions.

Accelerated Learning of Latent Topic Models by Incremental EM Algorithm (점진적 EM 알고리즘에 의한 잠재토픽모델의 학습 속도 향상)

  • Chang, Jeong-Ho;Lee, Jong-Woo;Eom, Jae-Hong
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.12
    • /
    • pp.1045-1055
    • /
    • 2007
  • Latent topic models are statistical models that automatically capture salient patterns or correlations among features underlying a data collection in a probabilistic way. They are gaining popularity as an effective tool for automatic semantic feature extraction from text corpora, multimedia data analysis including image data, and bioinformatics. An important issue in applying latent topic models to massive data sets is efficient learning of the model. This paper proposes an accelerated learning technique for the PLSA model, one of the popular latent topic models, that uses an incremental EM algorithm instead of the conventional EM algorithm. The incremental EM algorithm is characterized by a series of partial E-steps performed on corresponding subsets of the entire data collection, unlike the conventional EM algorithm, where one batch E-step is done for the whole data set. By replacing a single batch E-step and M-step with a series of partial E-steps and M-steps, the inference result for the previous data subset is directly reflected in the next inference process, which can enhance the learning speed for the entire data set. The algorithm is also advantageous in that it is guaranteed to converge to a local maximum solution and can be implemented with only slight modification of an existing algorithm based on conventional EM. We present the basic application of the incremental EM algorithm to the learning of PLSA and empirically evaluate the acceleration performance with several possible data partitioning methods for practical application. Experimental results on a real-world news data set show that the proposed approach achieves a meaningful improvement in the convergence rate when learning latent topic models. Additionally, we present an interesting result that supports a possible synergistic effect of combining the incremental EM algorithm with parallel computing.
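
The idea of alternating partial E-steps and M-steps can be sketched for PLSA as below. This is not the authors' implementation: it is a minimal pure-Python illustration that keeps per-block expected topic-word counts, swaps in a block's fresh statistics after its partial E-step, and re-estimates the parameters immediately. The round-robin partitioning of documents into blocks, the function names, the random initialization, and the toy count matrix are all assumptions.

```python
import random


def normalize(values):
    """Scale non-negative values to sum to 1; fall back to uniform if all are zero."""
    total = sum(values)
    if total <= 0.0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def incremental_em_plsa(counts, num_topics, num_blocks, sweeps, seed=0):
    """Fit PLSA on a document-word count matrix with block-wise (incremental) EM."""
    rng = random.Random(seed)
    num_docs, num_words = len(counts), len(counts[0])

    def zeros():
        return [[0.0] * num_words for _ in range(num_topics)]

    # Random starting points for P(w|z) and P(z|d).
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(num_words)])
             for _ in range(num_topics)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(num_topics)])
             for _ in range(num_docs)]

    # Assumed partitioning: documents are dealt round-robin into blocks.
    blocks = [list(range(b, num_docs, num_blocks)) for b in range(num_blocks)]
    block_stats = [zeros() for _ in range(num_blocks)]  # per-block expected counts
    total_stats = zeros()                               # sum over all blocks

    for _ in range(sweeps):
        for b, docs in enumerate(blocks):
            # Partial E-step: posteriors P(z|d,w) for this block only.
            new_stats = zeros()
            for d in docs:
                doc_topic = [0.0] * num_topics
                for w, n_dw in enumerate(counts[d]):
                    if n_dw == 0:
                        continue
                    posterior = normalize(
                        [p_z_d[d][z] * p_w_z[z][w] for z in range(num_topics)])
                    for z in range(num_topics):
                        new_stats[z][w] += n_dw * posterior[z]
                        doc_topic[z] += n_dw * posterior[z]
                p_z_d[d] = normalize(doc_topic)  # M-step for P(z|d)

            # Replace this block's old contribution with the new one.
            for z in range(num_topics):
                for w in range(num_words):
                    total_stats[z][w] += new_stats[z][w] - block_stats[b][z][w]
            block_stats[b] = new_stats

            # M-step for P(w|z), run right after each partial E-step.
            p_w_z = [normalize([max(v, 0.0) for v in row]) for row in total_stats]

    return p_w_z, p_z_d


# Toy corpus: 4 documents over a 4-word vocabulary (hypothetical counts).
counts = [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 6, 3], [0, 1, 4, 5]]
p_w_z, p_z_d = incremental_em_plsa(counts, num_topics=2, num_blocks=2, sweeps=20)
```

Setting `num_blocks=1` reduces the loop to conventional batch EM, which makes the two variants easy to compare on the same data.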

Mapping of Characteristics and Hierarchy between Heterogeneous Ontology Languages (이형 온톨로지 언어의 속성 및 계층구조 매핑)

  • Hong, Hyeun-Sool
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2007.10b
    • /
    • pp.131-136
    • /
    • 2007
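  • Topic Maps share many similarities with OWL, which is based on RDF, but the two differ in history, technology, and intended purpose. Topic Maps is an ISO standard, whereas OWL is the W3C's standard language for ontology development, and each has its own constraint language, data model, and set of syntaxes. However, both Topic Maps and OWL are ontology languages for representing knowledge, both are grounded in predicate logic, and both use an XML format, so they can be mapped to each other. The purpose of this paper is to approach the sharing, exchange, and integration of ontology information resources from the metamodels of Topic Maps and OWL. Accordingly, we extract the main elements from each metamodel and perform the mapping so that no semantic or structural elements are lost.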

A Study on Mapping Users' Topic Interest for Question Routing for Community-based Q&A Service (커뮤니티 기반 Q&A서비스에서의 질의 할당을 위한 이용자의 관심 토픽 분석에 관한 연구)

  • Park, Jong Do
    • Journal of the Korean Society for Information Management
    • /
    • v.32 no.3
    • /
    • pp.397-412
    • /
    • 2015
  • The main goal of this study is to investigate how to route a question to some relevant users who have interest in the topic of the question based on users' topic interest. In order to assess users' topic interest, archived question-answer pairs in the community were used to identify latent topics in the chosen categories using LDA. Then, these topic models were used to identify users' topic interest. Furthermore, the topics of newly submitted questions were analyzed using the topic models in order to recommend relevant answerers to the question. This study introduces the process of topic modeling to investigate relevant users based on their topic interest.
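
The routing step described above can be illustrated with a small sketch that ranks users by how closely their topic-interest vectors match the topic distribution of a new question. The abstract does not name a similarity measure, so the use of cosine similarity is an assumption, as are the function names, the user names, and the topic vectors, which stand in for distributions that an LDA model would produce.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def route_question(question_topics, user_interests, top_k=2):
    """Return the top_k users whose topic interests best match the question."""
    ranked = sorted(
        user_interests,
        key=lambda user: cosine(question_topics, user_interests[user]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical topic-interest vectors, e.g. averaged over each user's past answers.
user_interests = {
    "alice": [0.8, 0.1, 0.1],
    "bob": [0.1, 0.8, 0.1],
    "carol": [0.3, 0.3, 0.4],
}

# Hypothetical topic distribution inferred for a newly submitted question.
question_topics = [0.7, 0.2, 0.1]

recommended = route_question(question_topics, user_interests)
print(recommended)
```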

Topic and Sentiment Analysis on COVID19 Research in Korea Using Text Analysis (텍스트 분석을 이용한 코로나19 관련 국내논문의 토픽 및 감성연구)

  • Heo, Seong-Min;Yang, Ji-Yeon
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2021.07a
    • /
    • pp.329-331
    • /
    • 2021
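  • This study explores the research topics of COVID-19-related research papers and reviews their trends. It also uses sentiment analysis to identify topics whose strongly negative tone serves as a warning. Using latent Dirichlet allocation (LDA), we found a total of eight topics and, by comparing them with structural topic modeling (STM), confirmed that the results are relatively stable. In addition, we used the k-means clustering algorithm to find detailed research themes within each topic and visualized them using principal component analysis. Through sentiment analysis, we examined the positive and negative words for each topic and computed sentiment scores to identify the dominant tone of the papers. A strongly negative tone appeared in particular in research related to biomedicine, international power dynamics, and psychological effects, so these areas call for attention and interest. The results may serve as basic data for researchers exploring research directions and for policymakers deciding on research support programs.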

Analysis regarding Complaints of Courier Consumers and Workers in the Parcel Delivery Service by using Topic Model (토픽모델을 활용한 택배 서비스 소비자와 종사자의 불만 사항 분석)

  • Shin, Jin Gyu
    • Journal of Convergence for Information Technology
    • /
    • v.10 no.2
    • /
    • pp.39-48
    • /
    • 2020
  • Many studies have analyzed the factors that affect customer satisfaction and service quality improvement in the parcel delivery industry. Most of these studies rely on methods such as surveys and interviews and therefore have a limited number of respondents. This study aims to supplement the shortcomings of previous studies by identifying and analyzing the major topics common to the complaints raised by consumers and suppliers in the parcel delivery service, using consumer counseling cases and articles that reflect the complaints of workers in the industry. In addition, by analyzing the trend of these topics, we attempted to discover new topics and suggest implications. In conclusion, topics such as delayed, lost, or wrong deliveries, as well as the fierce competition in the parcel delivery industry, turned out to be central. The topic trend analysis shows that talks with international couriers have recently increased and that many conflicts related to apartment parcel delivery have been dealt with. The topics presented in this study mainly overlap with the contents of previous studies, but we expect that new and valuable topics can be derived by adding other data and analysis methods, such as internal counseling records and academic literature.