• Title/Summary/Keyword: Topic distribution

A Study of Slow Fashion on YouTube Through Big Data Analysis (유튜브에 나타난 슬로우 패션의 빅데이터 분석)

  • Sen Bin;Haejung Yum
    • Journal of Fashion Business / v.27 no.4 / pp.50-66 / 2023
  • The purpose of this study was to examine in detail the word distribution and topic distribution of slow fashion as it appears on YouTube, and to identify characteristics and aspects related to fashion design, using big data analysis and content analysis. The specific results were as follows. First, in the word distribution analysis, "item" appeared most often, 203 times. "One-piece" was also notable as the specific item with the highest frequency. Second, five topics were defined in the topic distribution analysis: topic 1 was "vintage products," topic 2 was "fashion items," topic 3 was "eco-friendly," topic 4 was "life quality emphasis," and topic 5 was "prudent consumption." Third, considering the relationship between the word distribution and the topic distribution, Korean slow fashion on YouTube actively selected design elements that express vintage images in everyday dress regardless of trends. In addition, there was a tendency to pursue a variety of basic, high-quality items. Basic items also tended to be reinterpreted in various ways through styling methods matched to the vintage image. Lastly, the tendency toward slow, small-volume production appeared to emphasize handicrafts and the cultural values of fashion products.

Generative probabilistic model with Dirichlet prior distribution for similarity analysis of research topic

  • Milyahilu, John;Kim, Jong Nam
    • Journal of Korea Multimedia Society / v.23 no.4 / pp.595-602 / 2020
  • We propose a generative probabilistic model with a Dirichlet prior distribution for topic modeling and text similarity analysis. It assigns topics and calculates the text correlation between documents within a corpus. It also provides the posterior probabilities assigned to each topic of a document, based on the prior distribution in the corpus. We then present a Gibbs sampling algorithm for inference on the posterior distribution and compute the text correlation among 50 abstracts from papers published by IEEE. We also conduct supervised learning to set a benchmark that justifies the performance of LDA (Latent Dirichlet Allocation). The experiments show that the accuracy of topic assignment to a given document is 76% for LDA. The supervised learning results show an accuracy of 61%, a precision of 93%, and an F1-score of 96%. The discussion of the experimental results provides a thorough justification based on probabilities, distributions, evaluation metrics, and correlation coefficients with respect to topic assignment.
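
The Gibbs sampling inference mentioned in this abstract can be illustrated with a minimal sketch of collapsed Gibbs sampling for standard LDA. This is not the authors' implementation: the function name `lda_gibbs`, the hyperparameter defaults (`alpha`, `beta`), and the toy corpus are all assumptions chosen for illustration.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for standard LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids in
    range(vocab_size). Returns (theta, assign), where theta[d][k] is the
    estimated probability of topic k in document d and assign[d][i] is
    the topic sampled for the i-th word of document d.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # counts n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # counts n(k, w)
    topic_total = [0] * num_topics                              # counts n(k)

    # Start from a random topic assignment for every word token.
    assign = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Posterior mean estimate of each document's topic distribution.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return theta, assign


if __name__ == "__main__":
    toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 2, 3]]  # hypothetical word ids
    theta, _ = lda_gibbs(toy_docs, num_topics=2, vocab_size=4, iters=100)
    for row in theta:
        print([round(p, 3) for p in row])
```

Each row of `theta` sums to 1 and could feed a document-similarity measure, such as a correlation between two documents' topic vectors.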

Topic Extraction and Classification Method Based on Comment Sets

  • Tan, Xiaodong
    • Journal of Information Processing Systems / v.16 no.2 / pp.329-342 / 2020
  • In recent years, emotional text classification has become an essential research topic in natural language processing. It has been widely used in sentiment analysis of reviews of commodities such as hotels and of other commentary corpora. This paper proposes an improved W-LDA (weighted latent Dirichlet allocation) topic model to address the shortcomings of traditional LDA topic models. During Gibbs sampling of topic words and the calculation of the expected word distribution in the W-LDA topic model, an average weighted value is adopted to keep topic-related words from being submerged by high-frequency words and thereby improve the distinctiveness of topics. The method further integrates a support vector machine classification algorithm, based on the extracted high-quality document-topic distribution and topic-word vectors. Finally, an efficient integrated method is constructed for the analysis and extraction of emotional words, topic distribution calculation, and sentiment classification. Tests on real teaching evaluation data and on a test set from a public comment set show that the proposed method has distinct advantages over two other typical algorithms in terms of topic differentiation, classification precision, and F1-measure.

The Impact of Topic Distribution on Review Sentiment: A Comparative Study between South Korea and the U.S.

  • Cho, Mina;Hwang, Dugmee;Jeon, Seongmin
    • 한국벤처창업학회:학술대회논문집 (Korean Society of Business Venturing: Conference Proceedings) / 2022.04a / pp.123-126 / 2022
  • Online reviews offer valuable information to businesses by reflecting consumer experiences with their products and services. Two important aspects of online reviews are, first, the topics consumers choose to address and, second, the sentiments expressed in their reviews. Building upon previous literature that shows online reviews are context-dependent, we examine the impact of topic distribution on review sentiment in South Korea and the U.S. during pre- and post-pandemic periods. After performing topic modeling on Airbnb app review data, we measure the contribution of each topic to review sentiment using SHAP values. Our results indicate variations in topic distribution trends between 2018 and 2021. Also, the order and magnitude of topics' impact on review sentiment change between pre- and post-pandemic periods for both countries. This study can help businesses understand how topics and sentiments associated with their products and services changed after the pandemic, and also help them identify areas of improvement.

Impact of Topic Distribution on Review Sentiment: A Comparative Study between South Korea and the U.S.

  • Mina Cho;Dugmee Hwang;SeongMin Jeon
    • Asia Pacific Journal of Information Systems / v.32 no.3 / pp.514-536 / 2022
  • Online reviews offer valuable information to businesses by reflecting consumer experiences with their products and services. Two crucial aspects of online reviews are the topics consumers choose to address and the sentiments expressed in their reviews. Building upon previous literature that shows online reviews are context-dependent, we employ the Expectation-Confirmation Theory (ECT) to examine the impact of topic distribution on review sentiment in South Korea and the U.S. during pre- and post-pandemic periods. After applying topic modeling to Airbnb app review data, we measure the contribution of each topic to review sentiment using SHAP values. Our results indicate variations in topic distribution trends between 2018 and 2021. In addition, the order and magnitude of topics' impact on review sentiment change between pre- and post-pandemic periods for both countries. This study can help businesses understand how topics and sentiments associated with their products and services changed after the pandemic and thus identify areas of improvement.

Semantic Dependency Link Topic Model for Biomedical Acronym Disambiguation (의미적 의존 링크 토픽 모델을 이용한 생물학 약어 중의성 해소)

  • Kim, Seonho;Yoon, Juntae;Seo, Jungyun
    • Journal of KIISE / v.41 no.9 / pp.652-665 / 2014
  • Many important terms in biomedical text are expressed as abbreviations or acronyms. We propose a semantic link topic model, based on the concepts of topic and dependency link, to disambiguate biomedical abbreviations and to cluster long-form variants of abbreviations that refer to the same sense. This model is a generative model inspired by the latent Dirichlet allocation (LDA) topic model, in which each document is viewed as a mixture of topics, with each topic characterized by a distribution over words. Thus, the words of a document are generated from the hidden topic structure of the document, and the topic structure is inferred from the observable word sequences of document collections. In this study, we allow two distinct modes of word generation in order to incorporate semantic dependencies between words, particularly between expansions (long forms) of abbreviations and the words that co-occur with them in a sentence. Besides topic information, the semantic dependency between words is defined as a link, and a new random parameter for link presence is assigned to each word. As a result, the most probable expansions of the abbreviations in a given abstract are decided by the word-topic distribution, document-topic distribution, and word-link distribution estimated from the document collection through the semantic dependency link topic model. The abstracts retrieved from the MEDLINE Entrez interface by queries relating to 22 abbreviations and their 186 expansions were used as the data set. The link topic model correctly predicted expansions of abbreviations with an accuracy of 98.30%.
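
For reference, the standard LDA generative process that this abstract builds on can be written as below. This is the textbook formulation, not the paper's link-extended model: the symbols (K topics, D documents, N_d words in document d, hyperparameters alpha and beta) follow common convention and are assumptions here, and the link-presence variable described in the abstract is not modeled.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), & n &= 1,\dots,N_d \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Categorical}(\phi_{z_{d,n}})
\end{align*}
```

Here \theta_d is the document-topic distribution and \phi_k is the topic-word distribution, matching the "mixture of topics" and "distribution over words" in the abstract.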

On the Distribution of ‘-(N)un’ in Korean (‘-은/는’의 분포에 대하여)

  • 염재일
    • Language and Information / v.5 no.2 / pp.57-74 / 2001
  • In this paper, I propose syntactic, semantic, and pragmatic restrictions on the distribution of the contrastive topic marker ‘-(n)un’ in Korean. A contrastive topic is associated with another focus. The association with focus is subject to syntactic islands. On the other hand, there is no syntactic restriction between a phrase attached with ‘-(n)un’ and a focused expression within the ‘-(n)un’ phrase itself. In this area there is a semantic requirement that the alternatives generated by a focused expression be maintained up to the phrase attached with ‘-(n)un’. Finally, when ‘-(n)un’ is used in an embedded clause, the whole sentence becomes natural when the contrastive topic introduced by ‘-(n)un’ and its alternative contrastive topic, which is presupposed by the contrastive topic marker, jointly constitute a more complex topic that is related to the whole context. Exclusiveness also facilitates the formation of the whole complex context.

Analysis of Global Media Reporting Trends for K-fashion -Applying Dynamic Topic Modeling- (K 패션에 대한 글로벌 미디어 보도 경향 분석 -다이내믹 토픽 모델링(Dynamic Topic Modeling)의 적용-)

  • Hyosun An;Jiyoung Kim
    • Journal of the Korean Society of Clothing and Textiles / v.46 no.6 / pp.1004-1022 / 2022
  • This study seeks to investigate K-fashion's external image by examining the trends in global media reporting. It applies Dynamic Topic Modeling (DTM), which captures the evolution of topics in a sequentially organized corpus of documents, and consists of text preprocessing, the determination of the number of topics, and a time-series analysis of the probability distribution of words within topics. The data set comprised 551 online media articles on 'Korean fashion' or 'K-fashion' published on Google News between 2010 and 2021. The analysis identifies seven topics: 'brand look and style,' 'lifestyle,' 'traditional style,' 'Seoul Fashion Week (SFW) event,' 'model size,' 'K-pop,' and 'fashion market,' as well as annual topic proportion trends. It also explores annual word changes within each topic and indicates increasing and decreasing word patterns. In most topics, the probability of the word 'brand' is confirmed to be increasing, while 'digital,' 'platform,' and 'virtual' have newly appeared in the 'SFW event' topic. Moreover, this study confirms the transition of each K-fashion topic over the past 12 years, along with various factors related to Hallyu content, traditional culture, government support, and digital technology innovation.

Exploration of Research Trends in The Journal of Distribution Science Using Keyword Analysis

  • YANG, Woo-Ryeong
    • The Journal of Industrial Distribution & Business / v.10 no.8 / pp.17-24 / 2019
  • Purpose - The purpose of this study is to identify research directions in distribution and in convergence fields for domestic and foreign researchers conducting related academic research, by confirming research trends in the Journal of Distribution Science (JDS). Research Design, Data, and Methodology - To do this, I used keywords from 904 papers published in the JDS, excluding 19 of the 923 papers that did not provide keywords. The analysis used word clouds, topic modeling, and weighted frequency analysis in R. Results - In the word cloud analysis, customer satisfaction was the most frequently used keyword. Topic modeling yielded ten topics: distribution channels, communication, supply chain, brand, business, customer, comparative study, performance, KODISA journal, and trade. The weighted frequency analysis by year group confirmed that only the service quality area increased. Conclusion - The results of this study confirm that the JDS has developed from past studies limited to the field of distribution into a range of convergence and integration research. However, the JDS's identity is based on distribution. Therefore, it is also necessary to keep establishing that identity through special editions in fields related to distribution.

Too Much Information - Trying to Help or Deceive? An Analysis of Yelp Reviews

  • Hyuk Shin;Hong Joo Lee;Ruth Angelie Cruz
    • Asia Pacific Journal of Information Systems / v.33 no.2 / pp.261-281 / 2023
  • The proliferation of online customer reviews has completely changed how consumers purchase. Consumers now depend heavily on authentic experiences shared by previous customers. However, deceptive reviews that aim to manipulate customer decision-making to promote or defame a product or service pose a risk to businesses and buyers. Studies investigating consumer perception of deceptive reviews have found that one of the important cues is review content. This study investigates the impact of the amount of information in a review on its truthfulness. It adopts Information Manipulation Theory (IMT) as an overarching theory, which asserts that violations of one or more of the Gricean maxims are deceptive behaviors. Delivering less information than required, or more, is regarded as a quantity violation, that is, an attempt at deception. A topic modeling algorithm is implemented to reveal the distribution of each topic embedded in a text. This study measures information amount as topic diversity based on the results of topic modeling, where topic diversity shows how heterogeneous a text review is. Two datasets of restaurant reviews on Yelp.com, which have Filtered (deceptive) and Unfiltered (genuine) reviews, were used to test the hypotheses. Reviews that contain more diverse topics tend to be truthful. However, excessive topic diversity produces an inverted U-shaped relationship with truthfulness. Moreover, we find an interaction effect between topic diversity and reviews' ratings. This result suggests that the impact of topic diversity is strengthened when deceptive reviews have lower ratings. This study contributes to the existing literature on IMT by building the connection between topic diversity in a review and its truthfulness. In addition, the empirical results show that topic diversity is a reliable measure for gauging the information amount of reviews.
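
The abstract does not state which formula is used for topic diversity. As one plausible illustration only, the sketch below assumes Shannon entropy of a review's topic distribution. The function name `topic_diversity` and the example vectors are hypothetical.

```python
import math


def topic_diversity(topic_weights):
    """Shannon entropy (in nats) of a review's topic distribution.

    topic_weights holds non-negative topic weights for one review, for
    example the document-topic vector produced by a topic model. Using
    entropy as the diversity measure is an assumption of this sketch,
    not a detail given in the abstract.
    """
    total = sum(topic_weights)
    if total <= 0 or any(w < 0 for w in topic_weights):
        raise ValueError("topic weights must be non-negative with a positive sum")
    probs = [w / total for w in topic_weights]
    # Zero-probability topics contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)


if __name__ == "__main__":
    focused = [0.97, 0.01, 0.01, 0.01]  # review dominated by one topic
    mixed = [0.25, 0.25, 0.25, 0.25]    # review spread evenly over topics
    print(topic_diversity(focused) < topic_diversity(mixed))
```

Under this assumption a single-topic review scores 0 and an even spread over K topics scores log K. An inverted U-shaped relationship like the one reported could then be probed by adding a squared diversity term to a regression on truthfulness.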