• Title/Summary/Keyword: Topic Feature


Topicality and Focality of Contrastive Topic (대조주제의 주제성과 초점성)

  • Wee, Hae-Kyung
    • Language and Information / v.14 no.2 / pp.47-70 / 2010
  • This study investigates the semantic and prosodic properties of the so-called contrastive topic. We posit two informational primitives, namely, a topical feature [±T] and a focal feature [±F], which yield four informational categories: [+T, +F], [+T, -F], [-T, +F], and [-T, -F]. It is proposed that the informational category of contrastive topic has the focal property [+F] as well as the topical property [+T]. Based on the semantic approach that regards the function of [+F] as identificational predication and that of [+T] as forming a semantic conditional clause, it is shown that the semantic function of contrastive topic, which is specified as [+T, +F], is the combination of these two functions, i.e., identificational predication in a semantic conditional clause. This is supported by a close examination of the prosodic pattern of English contrastive topic.


Comments Classification System using Topic Signature (Topic Signature를 이용한 댓글 분류 시스템)

  • Bae, Min-Young;Cha, Jeong-Won
    • Journal of KIISE:Software and Applications / v.35 no.12 / pp.774-779 / 2008
  • In this work, we describe a comment classification system that uses topic signatures. Topic signatures are widely used for feature selection in document classification and summarization. Comments are short and contain many word-spacing errors and special characters. We first convert each comment into 7-grams and treat each 7-gram as a sentence. We then convert each 7-gram into 3-grams and treat each 3-gram as a word. We select key features using topic signatures and classify new inputs with the naive Bayes method. The experimental results show that the proposed method outperforms previous methods.
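
The 7-gram and 3-gram conversion followed by naive Bayes classification can be sketched roughly as below. This is only an illustration under stated assumptions: the n-grams are taken over characters, whitespace is dropped before slicing, add-one smoothing is used, and the topic-signature feature selection step is omitted. The names `char_ngrams`, `comment_to_features`, and `NaiveBayes` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n):
    """Return overlapping character n-grams (the whole text if it is shorter than n)."""
    if len(text) <= n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def comment_to_features(comment):
    """Cut a comment into 7-gram 'sentences', then each of those into 3-gram 'words'."""
    compact = "".join(comment.split())  # assumption: spacing is unreliable, so drop it
    features = []
    for sentence in char_ngrams(compact, 7):
        features.extend(char_ngrams(sentence, 3))
    return features


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over 3-gram features."""

    def fit(self, comments, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for comment, label in zip(comments, labels):
            feats = comment_to_features(comment)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, comment):
        feats = [f for f in comment_to_features(comment) if f in self.vocab]
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in feats:
                score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    model = NaiveBayes().fit(
        ["buy cheap pills now", "cheap pills for sale",
         "great article thanks", "thanks for the great post"],
        ["spam", "spam", "ham", "ham"],
    )
    print(model.predict("cheappills"))
```

Because the features are built from characters rather than whitespace-delimited tokens, a comment with missing or extra spaces still produces mostly the same 3-grams as its correctly spaced form.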

A Study of Research on Methods of Automated Biomedical Document Classification using Topic Modeling and Deep Learning (토픽모델링과 딥 러닝을 활용한 생의학 문헌 자동 분류 기법 연구)

  • Yuk, JeeHee;Song, Min
    • Journal of the Korean Society for Information Management / v.35 no.2 / pp.63-88 / 2018
  • This research evaluated differences in classification performance across feature selection methods using the LDA topic model and Doc2Vec, which is based on word embedding using deep learning, as well as across feature corpus sizes and classification algorithms. In addition, to find the feature corpus with high classification performance, an experiment was conducted using feature corpora composed differently according to location within the document and by adjusting the size of the feature corpus. Finally, the deep learning experiments evaluated training frequency and specifically considered information for context inference. This study constructed a biomedical document dataset, Disease-35083, which consists of biomedical scholarly documents provided by PMC and categorized by disease category. Throughout the study, this research verifies which type and size of feature corpus produces the highest performance and also suggests feature corpora that can be extended to specific features by demonstrating efficiency in training time. Additionally, this research compares deep learning with existing methods and suggests an appropriate method for each classification environment.

Feature selection for text data via topic modeling (토픽 모형을 이용한 텍스트 데이터의 단어 선택)

  • Woosol, Jang;Ye Eun, Kim;Won, Son
    • The Korean Journal of Applied Statistics / v.35 no.6 / pp.739-754 / 2022
  • Usually, text data consists of many variables, and some of them are closely correlated. Such multi-collinearity often results in inefficient or inaccurate statistical analysis. For supervised learning, one can select features by examining the relationship between target variables and explanatory variables. On the other hand, for unsupervised learning, since target variables are absent, one cannot use such a feature selection procedure as in supervised learning. In this study, we propose a word selection procedure that employs topic models to find latent topics. We substitute topics for the target variables and select terms which show high relevance for each topic. Applying the procedure to real data, we found that the proposed word selection procedure can give clear topic interpretation by removing high-frequency words prevalent in various topics. In addition, we observed that, by applying the selected variables to the classifiers such as naïve Bayes classifiers and support vector machines, the proposed feature selection procedure gives results comparable to those obtained by using class label information.
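
One way to picture the idea of treating topics as surrogate targets is the sketch below. It assumes a topic model has already produced per-topic word probabilities, and it scores each term by lift, p(word | topic) / p(word), which is an illustrative relevance measure chosen here and not necessarily the one used in the paper. The function name `select_terms` and the toy probabilities are hypothetical.

```python
def select_terms(topic_word, topic_weight, top_n=2):
    """Pick the top_n terms per topic by lift = p(word | topic) / p(word).

    topic_word:   {topic: {word: p(word | topic)}} from any fitted topic model
    topic_weight: {topic: p(topic)}
    """
    vocab = set().union(*(dist.keys() for dist in topic_word.values()))
    # Marginal word probability, mixing the topics by their weights.
    marginal = {
        w: sum(topic_weight[k] * topic_word[k].get(w, 0.0) for k in topic_word)
        for w in vocab
    }
    selected = {}
    for topic, dist in topic_word.items():
        lift = {w: p / marginal[w] for w, p in dist.items() if marginal[w] > 0}
        # Highest lift first; ties are broken alphabetically for determinism.
        ranked = sorted(lift, key=lambda w: (-lift[w], w))
        selected[topic] = ranked[:top_n]
    return selected


if __name__ == "__main__":
    toy_topics = {
        "sports": {"game": 0.4, "team": 0.4, "said": 0.2},
        "finance": {"stock": 0.4, "bank": 0.4, "said": 0.2},
    }
    print(select_terms(toy_topics, {"sports": 0.5, "finance": 0.5}))
```

A word such as "said" that is equally likely under every topic has a lift of 1 and is ranked below topic-specific words, which mirrors the removal of high-frequency words prevalent in various topics.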

Research on Community Knowledge Modeling of Readers Based on Interest Labels

  • Kai, Wang;Wei, Pan;Xingzhi, Chen
    • Journal of Information Processing Systems / v.19 no.1 / pp.55-66 / 2023
  • Community portraits can explore the characteristics of community structures in depth and describe the personalized knowledge needs of community users, which is of great practical significance for improving community recommendation services and the accuracy of resource push. Current community portraits generally suffer from weak perception of interest characteristics and a low degree of integration of topic information. To resolve these problems, a reader community portrait method based on the thematic and timeliness characteristics of interest labels (UIT) is proposed. First, community opinion leaders are identified based on multi-feature calculations, and then the topic features of their texts are identified based on the LDA topic model. On this basis, a semantic mapping of "reader community-opinion leader-text content" is established. Second, the readers' interest similarity of the labels is dynamically updated, and two kinds of tag parameters are integrated, namely, the intensity of interest labels and the stability of interest labels. Finally, the similarity distance between the opinion leader and the topic of interest is calculated to obtain the dynamic interest set of the opinion leaders. Experimental analysis was conducted on real data from the Douban reading community. The experimental results show that the UIT has the highest average F value (0.551) compared to the state-of-the-art approaches, which indicates that the UIT has better performance in the smooth time dimension.

Company Name Discrimination in Tweets using Topic Signatures Extracted from News Corpus

  • Hong, Beomseok;Kim, Yanggon;Lee, Sang Ho
    • Journal of Computing Science and Engineering / v.10 no.4 / pp.128-136 / 2016
  • It is impossible for any human being to analyze the more than 500 million tweets that are generated per day. Lexical ambiguities on Twitter make it difficult to retrieve the desired data and relevant topics. Most of the solutions for the word sense disambiguation problem rely on knowledge base systems. Unfortunately, it is expensive and time-consuming to manually create a knowledge base system, resulting in a knowledge acquisition bottleneck. To solve the knowledge-acquisition bottleneck, a topic signature is used to disambiguate words. In this paper, we evaluate the effectiveness of various features of newspapers on the topic signature extraction for word sense discrimination in tweets. Based on our results, topic signatures obtained from a snippet feature exhibit higher accuracy in discriminating company names than those from the article body. We conclude that topic signatures extracted from news articles improve the accuracy of word sense discrimination in the automated analysis of tweets.
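
Topic signatures are commonly built by ranking terms with a log-likelihood ratio that compares a target text collection against a background collection. The sketch below shows that general technique only; the abstract does not state which statistic or threshold the authors used, and the function names and toy token lists are hypothetical.

```python
import math
from collections import Counter


def log_likelihood_ratio(a, b, c, d):
    """G^2 statistic for a 2x2 table.

    a: term count in target      b: term count in background
    c: other tokens in target    d: other tokens in background
    """
    n = a + b + c + d
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    g2 = 0.0
    for observed, row_total, col_total in cells:
        if observed > 0:
            expected = row_total * col_total / n
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


def topic_signature(target_tokens, background_tokens, top_n=3):
    """Return the top_n terms most strongly associated with the target tokens."""
    target, background = Counter(target_tokens), Counter(background_tokens)
    n_t, n_b = sum(target.values()), sum(background.values())
    scores = {}
    for term, a in target.items():
        b = background[term]
        # Keep only terms that are over-represented in the target collection.
        if a / n_t > b / n_b:
            scores[term] = log_likelihood_ratio(a, b, n_t - a, n_b - b)
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


if __name__ == "__main__":
    company_news = ["apple", "iphone", "mac", "iphone", "store", "the"]
    other_text = ["apple", "fruit", "pie", "the", "the", "tree", "fruit", "the"]
    print(topic_signature(company_news, other_text))
```

In a setting like the one described above, the target tokens would come from news text about the company sense of a name and the background tokens from unrelated text, so that a tweet sharing many signature terms can be assigned to the company sense.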

A Ghost in the Shell? Influences of AI Features on Product Evaluations of Smart Speakers with Customer Reviews (A Ghost in the Shell? 고객 리뷰를 통한 스마트 스피커의 인공지능 속성이 평가에 미치는 영향 연구)

  • Lee, Hong Joo
    • Journal of Information Technology Services / v.17 no.2 / pp.191-205 / 2018
  • With the advancement of artificial intelligence (AI) techniques, many consumer products have adopted AI features to provide proactive and personalized services to customers. One of the most prominent products featuring AI techniques is the smart speaker. At its core, a smart speaker is a portable, wireless, Internet-connected speaker, a kind of product that already existed in the consumer market. By applying AI techniques, smart speakers can recognize human voices and communicate with users. In addition, they can control other connected devices and provide offline services. The goal of this study is to identify the impact of AI techniques on customer ratings of the products. We compared customer reviews of portable speakers without AI features with those of a smart speaker. Amazon Echo was used as the smart speaker, and JBL Flip 4 Bluetooth Speaker and Ultimate Ears BOOM 2 Panther Limited Edition were used for the comparison. These products are in the same price range ($50 to $100) and were selected as featured products on Amazon.com. All reviews for the products were collected, and words common to all products and words unique to the smart speaker were identified. Information gain values were calculated to identify the influence of words on positive or negative ratings. Positive and negative words across all the products and for Amazon Echo alone were also identified. Topic modeling was applied to the customer reviews of Amazon Echo, and the importance of each topic was measured by summing the information gain values of each topic. This study provides a way of identifying customer responses to the AI feature and measuring the importance of that feature among the diverse features of the products.
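
Information gain for a word with respect to a positive or negative rating, and the summation of those values over the words of a topic, can be sketched as follows. The data layout (each review as a set of words plus a boolean label), the way words are assigned to a topic, and the function names are assumptions made for illustration; the toy reviews are invented.

```python
import math


def entropy(labels):
    """Binary entropy (in bits) of a list of True/False labels."""
    if not labels:
        return 0.0
    positives = sum(labels)
    h = 0.0
    for count in (positives, len(labels) - positives):
        if count:
            p = count / len(labels)
            h -= p * math.log2(p)
    return h


def information_gain(reviews, word):
    """Reduction in rating entropy from knowing whether `word` occurs in a review.

    reviews: list of (set_of_words, is_positive) pairs.
    """
    all_labels = [pos for _, pos in reviews]
    with_word = [pos for words, pos in reviews if word in words]
    without_word = [pos for words, pos in reviews if word not in words]
    n = len(reviews)
    return (
        entropy(all_labels)
        - len(with_word) / n * entropy(with_word)
        - len(without_word) / n * entropy(without_word)
    )


def topic_importance(reviews, topic_words):
    """Sum the information gain of every word assigned to a topic."""
    return sum(information_gain(reviews, w) for w in topic_words)


if __name__ == "__main__":
    toy_reviews = [
        ({"alexa", "great", "speaker"}, True),
        ({"alexa", "love"}, True),
        ({"broken", "speaker"}, False),
        ({"bad", "sound"}, False),
    ]
    for word in ("alexa", "speaker"):
        print(word, information_gain(toy_reviews, word))
```

A word that appears only in positive reviews separates the ratings well and receives a high value, while a word spread evenly across positive and negative reviews receives a value near zero.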

Research Trend on Diabetes Mobile Applications: Text Network Analysis and Topic Modeling (당뇨병 모바일 앱 관련 연구동향: 텍스트 네트워크 분석 및 토픽 모델링)

  • Park, Seungmi;Kwak, Eunju;Kim, Youngji
    • Journal of Korean Biological Nursing Science / v.23 no.3 / pp.170-179 / 2021
  • Purpose: The aim of this study was to identify core keywords and topic groups in the 'Diabetes mellitus and mobile applications' field of research to better understand research trends over the past 20 years. Methods: This was a text-mining and topic modeling study comprising four steps: 'collecting abstracts', 'extracting and cleaning semantic morphemes', 'building a co-occurrence matrix', and 'analyzing network features and clustering topic groups'. Results: A total of 789 papers published between 2002 and 2021 were found in databases (Springer). Among them, 435 words were extracted from 118 articles selected according to the condition 'analyzed by text network analysis and topic modeling'. The core keywords were 'self-management', 'intervention', 'health', 'support', 'technique' and 'system'. Through the topic modeling analysis, four themes were derived: 'intervention', 'blood glucose level control', 'self-management' and 'mobile health'. The main topic of this study was 'self-management'. Conclusion: While more recent work has investigated mobile applications, the most prominent feature was related to self-management in diabetes care and prevention. Nursing interventions utilizing mobile applications are expected not only to be effective and powerful tools for glycemic control and self-management but also to be usable for patient-driven lifestyle modification.
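
The 'building a co-occurrence matrix' step, and a simple network feature derived from it, might look like the sketch below. It assumes each abstract has already been reduced to a list of cleaned keywords, counts a pair at most once per abstract, and uses weighted degree as a stand-in for the network features the study analyzed. The function names and the toy keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Count, for each unordered keyword pair, the number of documents containing both."""
    counts = Counter()
    for words in docs:
        # Deduplicate and sort so each pair has one canonical (a, b) key.
        for a, b in combinations(sorted(set(words)), 2):
            counts[(a, b)] += 1
    return counts


def weighted_degree(counts):
    """Sum of co-occurrence weights on the edges touching each keyword."""
    degree = Counter()
    for (a, b), weight in counts.items():
        degree[a] += weight
        degree[b] += weight
    return degree


if __name__ == "__main__":
    toy_abstracts = [
        ["diabetes", "app", "self-management"],
        ["app", "self-management", "intervention"],
        ["diabetes", "intervention"],
    ]
    pairs = cooccurrence(toy_abstracts)
    print(pairs.most_common(3))
    print(weighted_degree(pairs).most_common(3))
```

Keywords with a high weighted degree are the ones that co-occur with many others, which is one simple way to surface candidate core keywords before any topic clustering.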

Recognizing Actions from Different Views by Topic Transfer

  • Liu, Jia
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.4 / pp.2093-2108 / 2017
  • In this paper, we describe a novel method for recognizing human actions from different views via view knowledge transfer. Our approach is characterized by two aspects: 1) We propose an unsupervised topic transfer model (TTM) to model two view-dependent vocabularies, where the original bag of visual words (BoVW) representation can be transferred into a bag of topics (BoT) representation. The higher-level BoT features, which can be shared across views, can connect action models for different views. 2) Our features make it possible to obtain a discriminative model of action under one view and categorize actions in another view. We tested our approach on the IXMAS data set, and the results are promising for such a simple approach. In addition, we demonstrate a supervised topic transfer model (STTM), which can combine transfer feature learning and discriminative classifier learning into one framework.

A Process-Centered Knowledge Model for Analysis of Technology Innovation Procedures

  • Chun, Seungsu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.3 / pp.1442-1453 / 2016
  • Worldwide economic networks are expanding rapidly in the information society, and they require structural social change through technology innovation. This paper therefore attempts to formally define a process-centered knowledge model for analyzing policy-making procedures on technology innovation. The eventual goal of the proposed knowledge model is to analyze a topic network built from composite keywords in documents written in natural language during technology innovation procedures. The knowledge model is turned into a topic network by compositing keywords derived through text mining from the natural-language documents. We also show how to analyze the knowledge model and how to automatically generate feature keywords and relation properties for topic networks.