• Title/Abstract/Keywords: Topic Distribution

Search results: 298 items (processing time: 0.028 seconds)

유튜브에 나타난 슬로우 패션의 빅데이터 분석 (A Study of Slow Fashion on YouTube Through Big Data Analysis)

  • 빈삼;염혜정
    • 패션비즈니스 / Vol. 27, No. 4 / pp. 50-66 / 2023
  • The purpose of this study was to examine in detail the word distribution and topic distribution of slow fashion on YouTube and to identify the characteristics and aspects related to fashion design through big data analysis and content analysis. The specific results were as follows. First, in the word distribution analysis, "item" appeared most often, 203 times. "One-piece" was also notable, as it was the item with the highest frequency. Second, five topics were defined in the topic distribution analysis: topic 1 was "vintage products," topic 2 was "fashion items," topic 3 was "eco-friendly," topic 4 was "life quality emphasis," and topic 5 was "prudent consumption." Third, considering the relationship between the word distribution and the topic distribution, Korean slow fashion on YouTube actively selected design elements that express vintage images in clothing life regardless of trends. In addition, there was a tendency to pursue various basic and high-quality items. Basic items also tended to be reinterpreted in various ways through styling methods matched to the vintage image. Lastly, the tendency toward slow, small-volume production appeared to emphasize handicrafts and the cultural values of fashion products.

Generative probabilistic model with Dirichlet prior distribution for similarity analysis of research topic

  • Milyahilu, John;Kim, Jong Nam
    • 한국멀티미디어학회논문지 / Vol. 23, No. 4 / pp. 595-602 / 2020
  • We propose a generative probabilistic model with a Dirichlet prior distribution for topic modeling and text similarity analysis. It assigns a topic and calculates the text correlation between documents within a corpus. It also provides the posterior probabilities assigned to each topic of a document based on the prior distribution in the corpus. We then present a Gibbs sampling algorithm for inference about the posterior distribution and compute the text correlation among 50 abstracts from papers published by IEEE. We also conduct supervised learning to set a benchmark that justifies the performance of LDA (Latent Dirichlet Allocation). The experiments show that the accuracy of topic assignment to a document is 76% for LDA. The supervised learning results show an accuracy of 61%, a precision of 93%, and an F1-score of 96%. A discussion of the experimental results provides a thorough justification based on probabilities, distributions, evaluation metrics, and correlation coefficients with respect to topic assignment.
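The Gibbs sampling inference mentioned in this abstract can be illustrated with a minimal collapsed Gibbs sampler for standard LDA. This is a sketch under stated assumptions, not the authors' implementation: the function name `lda_gibbs`, the hyperparameter defaults, and the toy corpus of integer word ids are all hypothetical.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (hypothetical helper).

    docs is a list of documents, each a list of integer word ids.
    Returns one estimated topic distribution per document.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # count n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count n(k, w)
    topic_total = [0] * num_topics                              # count n(k)
    assign = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Full conditional p(z = t | all other assignments), unnormalized.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimate of each document's topic distribution.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return theta


# Toy corpus (assumed data): word ids 0-1 and 2-3 tend to co-occur.
toy_docs = [[0, 1, 0, 1, 0], [0, 1, 1, 0], [2, 3, 2, 3], [3, 2, 3, 2, 2]]
for row in lda_gibbs(toy_docs, num_topics=2, vocab_size=4, iters=50):
    print([round(p, 3) for p in row])
```

Each returned row is a probability vector over topics, so document similarity of the kind the abstract describes could then be computed on these vectors, for example with a correlation or cosine measure.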

Topic Extraction and Classification Method Based on Comment Sets

  • Tan, Xiaodong
    • Journal of Information Processing Systems / Vol. 16, No. 2 / pp. 329-342 / 2020
  • In recent years, emotional text classification has become one of the essential research topics in the field of natural language processing. It has been widely used in the sentiment analysis of reviews of commodities such as hotels and of other commentary corpora. This paper proposes an improved W-LDA (weighted latent Dirichlet allocation) topic model to address the shortcomings of traditional LDA topic models. In the Gibbs sampling process of the W-LDA topic model, during topic-word sampling and the calculation of the word distribution expectation, an average weighted value is adopted to keep topic-related words from being submerged by high-frequency words and thus improve the distinction between topics. The method further integrates a support vector machine classification algorithm based on the extracted high-quality document-topic distributions and topic-word vectors. Finally, an efficient integrated method is constructed for the analysis and extraction of emotional words, topic distribution calculation, and sentiment classification. Tests on real teaching evaluation data and on a test set from a public comment set show that the proposed method has distinct advantages over two other typical algorithms in terms of topic differentiation, classification precision, and F1-measure.

The Impact of Topic Distribution on Review Sentiment: A Comparative Study between South Korea and the U.S.

  • Cho, Mina;Hwang, Dugmee;Jeon, Seongmin
    • 한국벤처창업학회:학술대회논문집 / 한국벤처창업학회 2022 Spring Conference / pp. 123-126 / 2022
  • Online reviews offer valuable information to businesses by reflecting consumer experiences with their products and services. Two important aspects of online reviews are, first, the topics consumers choose to address and, second, the sentiments expressed in their reviews. Building upon previous literature showing that online reviews are context-dependent, we examine the impact of topic distribution on review sentiment in South Korea and the U.S. during pre- and post-pandemic periods. After performing topic modeling on Airbnb app review data, we measure the contribution of each topic to review sentiment using SHAP values. Our results indicate variations in topic distribution trends between 2018 and 2021. Also, the order and magnitude of topics' impact on review sentiment change between pre- and post-pandemic periods for both countries. This study can help businesses understand how the topics and sentiments associated with their products and services changed after the pandemic, and can also help them identify areas for improvement.

Impact of Topic Distribution on Review Sentiment: A Comparative Study between South Korea and the U.S.

  • Mina Cho;Dugmee Hwang;SeongMin Jeon
    • Asia pacific journal of information systems / Vol. 32, No. 3 / pp. 514-536 / 2022
  • Online reviews offer valuable information to businesses by reflecting consumer experiences with their products and services. Two crucial aspects of online reviews are the topics consumers choose to address and the sentiments expressed in their reviews. Building upon previous literature showing that online reviews are context-dependent, we employ the Expectation-Confirmation Theory (ECT) to examine the impact of topic distribution on review sentiment in South Korea and the U.S. during pre- and post-pandemic periods. After applying topic modeling to Airbnb app review data, we measure the contribution of each topic to review sentiment using SHAP values. Our results indicate variations in topic distribution trends between 2018 and 2021. In addition, the order and magnitude of topics' impact on review sentiment change between pre- and post-pandemic periods for both countries. This study can help businesses understand how the topics and sentiments associated with their products and services changed after the pandemic and thus identify areas for improvement.
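This abstract measures each topic's contribution to review sentiment with SHAP values. As a rough illustration of the idea behind such attributions, the sketch below computes exact Shapley values for a toy model against a single baseline input. It is not the `shap` library and not the authors' model: `shapley_values`, the linear `toy_sentiment` function, and the topic proportions are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features, so this
    is only practical for a handful of topics.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_sentiment(topic_props):
    """Hypothetical sentiment score as a linear function of three topic proportions."""
    return 0.5 + 2.0 * topic_props[0] - 1.0 * topic_props[1] + 0.0 * topic_props[2]


review = [0.6, 0.3, 0.1]    # assumed topic proportions of one review
average = [0.2, 0.5, 0.3]   # assumed average topic proportions (baseline)
print(shapley_values(toy_sentiment, review, average))
```

The attributions sum to the difference between the score for the review and the score for the baseline, which is what makes it possible to rank topics by the order and magnitude of their impact.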

의미적 의존 링크 토픽 모델을 이용한 생물학 약어 중의성 해소 (Semantic Dependency Link Topic Model for Biomedical Acronym Disambiguation)

  • 김선호;윤준태;서정연
    • 정보과학회 논문지 / Vol. 41, No. 9 / pp. 652-665 / 2014
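  • Acronyms are frequent in the biomedical domain, and named entities that carry important meaning in a document are often expressed as acronyms. This study uses topic and link information to resolve acronym ambiguity and attempts to group the variant forms of acronym long forms that share the same meaning. To this end, we propose a semantic dependency link topic model based on LDA (latent Dirichlet allocation). The model is a kind of generative model: it assumes that the words appearing in each document of a collection are generated by the document's topic distribution and the per-topic word distributions, and it infers the hidden topic structure latent in the documents from the observable words of the collection, thereby connecting word generation with the topic parameters. In addition to topic information, this study defines the semantic dependency between words as a link and parameterizes the link information between words, in particular the links between a long form and the words that co-occur with it in a sentence, for use in disambiguation. As a result, the most likely long form of an acronym appearing in a given document is determined by the word-topic, document-topic, and word-link probabilities inferred with the model. The proposed model was trained and tested on documents retrieved from MEDLINE abstracts through the Entrez interface, using queries generated from a set of 22 acronyms and 186 possible long forms. In the experiment, which predicted the long form of the acronym appearing in a given document, the model showed high performance with an accuracy of 98.3%.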
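The generative assumption described in this abstract is the standard LDA one and can be written out as a short worked formula. The symbols below ($\theta_d$, $\phi_k$, $z_{d,n}$, $w_{d,n}$, $K$, $\alpha$, $\beta$) are conventional LDA notation chosen here for illustration and are not taken from the paper. The link component of the proposed model is not shown because the abstract does not specify its form.

```latex
\begin{align*}
\theta_d &\sim \mathrm{Dirichlet}(\alpha), &
\phi_k &\sim \mathrm{Dirichlet}(\beta), \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), &
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Categorical}(\phi_{z_{d,n}}), \\
p(w_{d,n} = v \mid \theta_d, \phi) &= \sum_{k=1}^{K} \theta_{d,k}\, \phi_{k,v}.
\end{align*}
```

Here $\theta_d$ is the topic distribution of document $d$ and $\phi_k$ is the word distribution of topic $k$, so the last line says that the probability of seeing word $v$ in document $d$ is a topic-weighted mixture of the per-topic word probabilities.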

‘-은/는’의 분포에 대하여 (On the Distribution of ‘-(N)un’ in Korean)

  • 염재일
    • 한국언어정보학회지:언어와정보 / Vol. 5, No. 2 / pp. 57-74 / 2001
  • In this paper, I propose syntactic, semantic and pragmatic restrictions on the distribution of the contrastive topic marker ‘-(n)un’ in Korean. A contrastive topic is associated with another focus. The association with focus is subject to syntactic islands. On the other hand, there is no syntactic restriction between a phrase attached with ‘-(n)un’ and a focused expression within the ‘-(n)un’ phrase itself. In this area there is a semantic requirement that the alternatives generated by a focused expression be maintained up to the phrase attached with ‘-(n)un’. Finally, when ‘-(n)un’ is used in an embedded clause, the whole sentence becomes natural when the contrastive topic introduced by ‘-(n)un’ and its alternative contrastive topic, which is presupposed by the contrastive topic marker, jointly constitute a more complex topic which is related to the whole context. And exclusiveness facilitates the formation of the whole complex context.

K 패션에 대한 글로벌 미디어 보도 경향 분석 -다이내믹 토픽 모델링(Dynamic Topic Modeling)의 적용- (Analysis of Global Media Reporting Trends for K-fashion -Applying Dynamic Topic Modeling-)

  • 안효선;김지영
    • 한국의류학회지 / Vol. 46, No. 6 / pp. 1004-1022 / 2022
  • This study investigates K-fashion's external image by examining trends in global media reporting. It applies Dynamic Topic Modeling (DTM), which captures the evolution of topics in a sequentially organized corpus of documents; the procedure consists of text preprocessing, determination of the number of topics, and a time-series analysis of the probability distribution of words within topics. The data set comprised 551 online media articles on 'Korean fashion' or 'K-fashion' published on Google News between 2010 and 2021. The analysis identifies seven topics: 'brand look and style,' 'lifestyle,' 'traditional style,' 'Seoul Fashion Week (SFW) event,' 'model size,' 'K-pop,' and 'fashion market,' as well as annual topic proportion trends. It also explores annual word changes within each topic and indicates increasing and decreasing word patterns. In most topics, the probability of the word 'brand' is confirmed to be increasing, while 'digital,' 'platform,' and 'virtual' newly appear in the 'SFW event' topic. Moreover, this study confirms the transition of each K-fashion topic over the past 12 years, along with various factors related to Hallyu content, traditional culture, government support, and digital technology innovation.

Exploration of Research Trends in The Journal of Distribution Science Using Keyword Analysis

  • YANG, Woo-Ryeong
    • 산경연구논집 / Vol. 10, No. 8 / pp. 17-24 / 2019
  • Purpose - The purpose of this study is to identify research directions in distribution and convergence fields for the many domestic and foreign researchers carrying out related academic research, by confirming research trends in the Journal of Distribution Science (JDS). Research Design, Data, and Methodology - I used keywords from 904 of the 923 papers published in the JDS, excluding the 19 papers that did not list keywords. The analysis used word clouds, topic modeling, and weighted frequency analysis in the R program. Results - In the word cloud analysis, customer satisfaction was the most frequently used keyword. Topic modeling produced ten topics: distribution channels, communication, supply chain, brand, business, customer, comparative study, performance, KODISA journal, and trade. The weighted frequency analysis by year group confirmed that only the service quality category increased. Conclusion - The results confirm that the JDS has developed from past studies limited to the field of distribution into diverse convergence and integration research. However, the JDS's identity is based on distribution. It is therefore also necessary to keep establishing that identity through special editions in fields related to distribution.

Too Much Information - Trying to Help or Deceive? An Analysis of Yelp Reviews

  • Hyuk Shin;Hong Joo Lee;Ruth Angelie Cruz
    • Asia pacific journal of information systems / Vol. 33, No. 2 / pp. 261-281 / 2023
  • The proliferation of online customer reviews has completely changed how consumers purchase. Consumers now depend heavily on authentic experiences shared by previous customers. However, deceptive reviews that aim to manipulate customer decision-making to promote or defame a product or service pose a risk to businesses and buyers. Studies investigating consumer perception of deceptive reviews have found that one of the important cues is review content. This study investigates the impact of the amount of information in a review on its truthfulness. It adopts Information Manipulation Theory (IMT) as an overarching theory, which asserts that violations of one or more of the Gricean maxims are deceptive behaviors. Delivering less information than required, or more, is regarded as a quantity violation, that is, an attempt at deception. A topic modeling algorithm is implemented to reveal the distribution of each topic embedded in a text. This study measures information amount as topic diversity based on the results of topic modeling; topic diversity shows how heterogeneous a text review is. Two datasets of restaurant reviews on Yelp.com, which contain Filtered (deceptive) and Unfiltered (genuine) reviews, were used to test the hypotheses. Reviews that contain more diverse topics tend to be truthful. However, excessive topic diversity produces an inverted U-shaped relationship with truthfulness. Moreover, we find an interaction effect between topic diversity and reviews' ratings. This result suggests that the impact of topic diversity is strengthened when deceptive reviews have lower ratings. This study contributes to the existing literature on IMT by building the connection between topic diversity in a review and its truthfulness. In addition, the empirical results show that topic diversity is a reliable measure for gauging the information amount of reviews.
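The abstract does not state which formula is used for topic diversity. One common way to quantify how heterogeneous a topic distribution is, assumed here purely for illustration, is Shannon entropy. The function name `topic_diversity` and the example distributions are hypothetical.

```python
import math


def topic_diversity(topic_probs):
    """Shannon entropy (in nats) of a review's topic distribution.

    Assumption: entropy stands in for the paper's diversity measure.
    Higher values mean the review spreads over more topics.
    """
    total = sum(topic_probs)
    if total <= 0:
        raise ValueError("topic_probs must contain positive mass")
    entropy = 0.0
    for p in topic_probs:
        q = p / total  # normalize in case the inputs do not sum to 1
        if q > 0:
            entropy -= q * math.log(q)
    return entropy


focused_review = [0.90, 0.05, 0.03, 0.02]  # assumed topic proportions
diverse_review = [0.25, 0.25, 0.25, 0.25]  # assumed topic proportions
print(topic_diversity(focused_review), topic_diversity(diverse_review))
```

Under this assumed measure a review concentrated on one topic scores near zero and a review spread evenly over K topics scores log K, which gives a bounded scale on which a non-linear (for example inverted U-shaped) relationship with truthfulness could be tested.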