• Title/Summary/Keyword: Multi-topic

Search Result 178, Processing Time 0.035 seconds

Relational Database Structure for Preserving Multi-role Topics in Topic Map (토픽맵의 다중역할 토픽 보존을 위한 관계형 데이터베이스 구조)

  • Jung, Yoonsoo;Y., Choon;Kim, Namgyu
    • The Journal of Information Systems
    • /
    • v.18 no.3
    • /
    • pp.327-349
    • /
    • 2009
  • Traditional keyword-based searching methods suffer from low accuracy and high complexity due to the rapid growth in the amount of information. Accordingly, many researchers attempt to implement a so-called semantic search which is based on the semantics of the user's query. Semantic information can be described using a semantic modeling language, such as Topic Map. In this paper, we propose a new method to map a topic map to a traditional Relational Database (RDB) without any information loss. Although there have been a few attempts to map topic maps to RDB, they have paid scant attention to handling multi-role topics. In this paper, we propose a new storage structure to map multi-role topics to traditional RDB. The proposed structure consists of a mapping table, role tables, and content tables. Additionally, we devise a query translator to convert a user's query to one appropriate to the proposed structure.

  • PDF

Exploratory Study of Developing a Synchronization-Based Approach for Multi-step Discovery of Knowledge Structures

  • Yu, So Young
    • Journal of Information Science Theory and Practice
    • /
    • v.2 no.2
    • /
    • pp.16-32
    • /
    • 2014
  • As Topic Modeling has been applied in increasingly various domains, the difficulty in naming and characterizing topics also has been recognized more. This study, therefore, explores an approach of combining text mining with network analysis in a multi-step approach. The concept of synchronization was applied to re-assign the top author keywords in more than one topic category, in order to improve the visibility of the topic-author keyword network, and to increase the topical cohesion in each topic. The suggested approach was applied using 16,548 articles with 2,881 unique author keywords in construction and building engineering indexed by KSCI. As a result, it was revealed that the combined approach could improve both the visibility of the topic-author keyword map and topical cohesion in most of the detected topic categories. There should be more cases of applying the approach in various domains for generalization and advancement of the approach. Also, more sophisticated evaluation methods should also be necessary to develop the suggested approach.

Jointly Image Topic and Emotion Detection using Multi-Modal Hierarchical Latent Dirichlet Allocation

  • Ding, Wanying;Zhu, Junhuan;Guo, Lifan;Hu, Xiaohua;Luo, Jiebo;Wang, Haohong
    • Journal of Multimedia Information System
    • /
    • v.1 no.1
    • /
    • pp.55-67
    • /
    • 2014
  • Image topic and emotion analysis is an important component of online image retrieval, which nowadays has become very popular in the widely growing social media community. However, due to the gaps between images and texts, there is very limited work in literature to detect one image's Topics and Emotions in a unified framework, although topics and emotions are two levels of semantics that often work together to comprehensively describe one image. In this work, a unified model, Joint Topic/Emotion Multi-Modal Hierarchical Latent Dirichlet Allocation (JTE-MMHLDA) model, which extends previous LDA, mmLDA, and JST model to capture topic and emotion information at the same time from heterogeneous data, is proposed. Specifically, a two level graphical structured model is built to realize sharing topics and emotions among the whole document collection. The experimental results on a Flickr dataset indicate that the proposed model efficiently discovers images' topics and emotions, and significantly outperform the text-only system by 4.4%, vision-only system by 18.1% in topic detection, and outperforms the text-only system by 7.1%, vision-only system by 39.7% in emotion detection.

  • PDF

Topic-based Multi-document Summarization Using Non-negative Matrix Factorization and K-means (비음수 행렬 분해와 K-means를 이용한 주제기반의 다중문서요약)

  • Park, Sun;Lee, Ju-Hong
    • Journal of KIISE:Software and Applications
    • /
    • v.35 no.4
    • /
    • pp.255-264
    • /
    • 2008
  • This paper proposes a novel method using K-means and Non-negative matrix factorization (NMF) for topic -based multi-document summarization. NMF decomposes weighted term by sentence matrix into two sparse non-negative matrices: semantic feature matrix and semantic variable matrix. Obtained semantic features are comprehensible intuitively. Weighted similarity between topic and semantic features can prevent meaningless sentences that are similar to a topic from being selected. K-means clustering removes noises from sentences so that biased semantics of documents are not reflected to summaries. Besides, coherence of document summaries can be enhanced by arranging selected sentences in the order of their ranks. The experimental results show that the proposed method achieves better performance than other methods.

Differences and Multi-dimensionality of the Perception of Career Success among Korean Employees: A Topic Modeling Approach (기업근로자 경력성공 인식의 다차원성과 차이: 토픽모델링의 적용)

  • Lee, Jaeeun;Chae, Chungil
    • The Journal of the Korea Contents Association
    • /
    • v.19 no.6
    • /
    • pp.58-71
    • /
    • 2019
  • The purpose of this study is to explore the multi-dimensionality and the differences of the career success that is revealed by the employee's perception. In order to fulfill the research purpose, LDA topic modeling has applied to extract latent topics of career success from 126 Korean employees' open-end survey questionnaires. The extracted latent topics are social recognition, continuing service within an organization, expertise, financial rewards, and pursuing personal meaning. The occurrence probability of each topic was different by individual characteristics such as gender, education, position. Study findings showed there is multi-dimensionality in career success, and there are differences of topic occurrence probability by demographic characteristics. Additionally, this study showed how to apply the recently developed machine learning approach in order to reduce the researcher's bias by adapting the LDA topic modeling to the qualitative open-ended survey data.

A method of Multi-Layer Visualizations for XTM (XTM을 위한 다층적 시각화 방법)

  • 박영조;박호병;조용윤;유재우
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2004.10b
    • /
    • pp.529-531
    • /
    • 2004
  • 웹 상에는 많은 자원들과 정보들이 존재한다. XML은 이러한 자원들과 정보들을 구조화하기 위해서 개발되었다. XTM(XML Topic Maps)은 XML의 형태로 자원들과 정보들에 의미를 부여할 수 있는 언어이다. XTM은 Topic과 Association을 이용해서 자원들과 정보들이 가진 의미를 표현한다 XTM상에서 나타나는 Topic과 Association은 매우 거대하고 다양하기 때문에 모든 Topic과 Association을 한꺼번에 표현하기 어렵다 또한, 사용자가 수백만개의 Topic과 Association에서 원하는 Topic과 Association을 찾기 어렵다. 따라서 이러한 문제점을 해결하기 위해서 다양한 시각화 방법이 연구되었다. 현재 Topic Maps을 표현할 때 트리, 그래프, 맵 등 하나의 구조를 이용해서 표현한다. 하지만 추상화정도에 따라 시각화 방법은 장ㆍ단점을 지닌다. 본 논문에서는 웹 상의 자원, 정보들과 의미 사이에 여러 계층이 존재하는 다층적 시각화를 제안한다. 각 계층은 독립적인 표현구조로 나타내어 추상화정도에 따라 최적화된 구조를 사용한다. 사용자는 자신이 원하는 Topic과 Association을 점진적 접근을 통해서 원하는 Topic과 Association을 검색할 수 있다. 또한 Topic이 Association의 member처럼 사용되는 경우, 시각적으로 Topic이 표현되면 Topic은 연결된 Association과 직접적인 연결을 갖는다.

  • PDF

Multi-Topic Sentiment Analysis using LDA for Online Review (LDA를 이용한 온라인 리뷰의 다중 토픽별 감성분석 - TripAdvisor 사례를 중심으로 -)

  • Hong, Tae-Ho;Niu, Hanying;Ren, Gang;Park, Ji-Young
    • The Journal of Information Systems
    • /
    • v.27 no.1
    • /
    • pp.89-110
    • /
    • 2018
  • Purpose There is much information in customer reviews, but finding key information in many texts is not easy. Business decision makers need a model to solve this problem. In this study we propose a multi-topic sentiment analysis approach using Latent Dirichlet Allocation (LDA) for user-generated contents (UGC). Design/methodology/approach In this paper, we collected a total of 104,039 hotel reviews in seven of the world's top tourist destinations from TripAdvisor (www.tripadvisor.com) and extracted 30 topics related to the hotel from all customer reviews using the LDA model. Six major dimensions (value, cleanliness, rooms, service, location, and sleep quality) were selected from the 30 extracted topics. To analyze data, we employed R language. Findings This study contributes to propose a lexicon-based sentiment analysis approach for the keywords-embedded sentences related to the six dimensions within a review. The performance of the proposed model was evaluated by comparing the sentiment analysis results of each topic with the real attribute ratings provided by the platform. The results show its outperformance, with a high ratio of accuracy and recall. Through our proposed model, it is expected to analyze the customers' sentiments over different topics for those reviews with an absence of the detailed attribute ratings.

A Study on the Imjin War's Historical Materials with Multi-layer Network Analysis and Topic Modeling (다중 네트워크 분석과 토픽 모델링을 이용한 임진왜란 시기 사료에 관한 연구)

  • Cho, HyunChul;Song, Min
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.33 no.1
    • /
    • pp.167-198
    • /
    • 2022
  • Convergence science research is activated, and digital humanities research is also encouraged in humanities. Therefore, this study attempted to propose a experimental study that applies Text mining and Entitymetrics methods to historical materials. Annals of King Seonjo, revised Annals of King Seonjo, Miscellaneous Record of the War and Writings on Imjin War were used, also network analysis and DMR topic models were used to explore topic changes and common entities in historical sources. Through the results, it was possible to propose the availability of quantitative analysis for text data, presenting a timing change of a specific topic, and an undiscovered relationship between person entities.

Combining Ego-centric Network Analysis and Dynamic Citation Network Analysis to Topic Modeling for Characterizing Research Trends (자아 중심 네트워크 분석과 동적 인용 네트워크를 활용한 토픽모델링 기반 연구동향 분석에 관한 연구)

  • Yu, So-Young
    • Journal of the Korean Society for information Management
    • /
    • v.32 no.1
    • /
    • pp.153-169
    • /
    • 2015
  • The combined approach of using ego-centric network analysis and dynamic citation network analysis for refining the result of LDA-based topic modeling was suggested and examined in this study. Tow datasets were constructed by collecting Web of Science bibliographic records of White LED and topic modeling was performed by setting a different number of topics on each dataset. The multi-assigned top keywords of each topic were re-assigned to one specific topic by applying an ego-centric network analysis algorithm. It was found that the topical cohesion of the result of topic modeling with the number of topic corresponding to the lowest value of perplexity to the dataset extracted by SPLC network analysis was the strongest with the best values of internal clustering evaluation indices. Furthermore, it demonstrates the possibility of developing the suggested approach as a method of multi-faceted research trend detection.

Research Trends Analysis of Big Data: Focused on the Topic Modeling (빅데이터 연구동향 분석: 토픽 모델링을 중심으로)

  • Park, Jongsoon;Kim, Changsik
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.15 no.1
    • /
    • pp.1-7
    • /
    • 2019
  • The objective of this study is to examine the trends in big data. Research abstracts were extracted from 4,019 articles, published between 1995 and 2018, on Web of Science and were analyzed using topic modeling and time series analysis. The 20 single-term topics that appeared most frequently were as follows: model, technology, algorithm, problem, performance, network, framework, analytics, management, process, value, user, knowledge, dataset, resource, service, cloud, storage, business, and health. The 20 multi-term topics were as follows: sense technology architecture (T10), decision system (T18), classification algorithm (T03), data analytics (T17), system performance (T09), data science (T06), distribution method (T20), service dataset (T19), network communication (T05), customer & business (T16), cloud computing (T02), health care (T14), smart city (T11), patient & disease (T04), privacy & security (T08), research design (T01), social media (T12), student & education (T13), energy consumption (T07), supply chain management (T15). The time series data indicated that the 40 single-term topics and multi-term topics were hot topics. This study provides suggestions for future research.