• Title/Summary/Keyword: Dirichlet distribution

A Study on Leadership Trends from the Perspective of Domestic Researcher's Using BERTopic and LDA

  • Sung-Su, SHIN;Hoe-Chang, Yang
    • East Asian Journal of Business Economics (EAJBE) / v.11 no.1 / pp.53-71 / 2023
  • Purpose - This study explores how domestic researchers have studied leadership and organizes the leadership theories examined to date, in order to find clues for a direction of leadership development suited to the current situation. Research design, data, and methodology - The search returned a total of 7,425 papers, of which the 5,810 with English abstracts were used for analysis. Word frequency analysis, word cloud generation, and co-occurrence analysis were carried out using Python 3.7. In addition, after topics related to research trends were classified through BERTopic and LDA, trends were identified through dynamic topic modeling and OLS regression analysis. Result - BERTopic yielded 14 topics, such as 'Leadership management and performance' and 'Sports leadership'. LDA on the 1,976 outliers yielded five topics. Trend analysis of topics by year confirmed that five topics, such as 'military police leadership', received relative attention. Conclusion - Based on these results, the study proposes research on the reinterpretation of past leadership studies, research on LMX from an expanded perspective, and research on integrated leadership sub-factors of modern leadership theory.

User Information Collection of Weibo Network Public Opinion under Python

  • Changhua Liu;Yanlin Han
    • Journal of Information Processing Systems / v.19 no.3 / pp.310-322 / 2023
  • Although the network environment is gradually improving, the network remains virtual in nature, which has a great influence on the supervision of public opinion dissemination on Weibo. To reduce this influence, the user information involved in Weibo public opinion dissemination is studied using Python. Specifically, the 2019 "Ethiopian air crash" event was taken as the research subject, the relevant data were collected using Python, and the data from March 10, 2019 to June 20, 2019 were modeled using the latent Dirichlet allocation (LDA) topic model and the naive Bayes classifier. The user identity graph model of Weibo public opinion on the "Ethiopian air crash" as of June 20 showed that ordinary netizens accounted for the highest proportion of public opinion users and were easily influenced by media public opinion users. This influence is not limited to ordinary netizens: media public opinion users also influence other types of public opinion users. That is to say, in the public opinion space of the "Ethiopian air crash," media public opinion users play an important role in the dissemination of public opinion information. This research can lay a foundation for the classification and identification of user identity types across different public opinion life cycles. Future research can start from the supervision of public opinion and the type of user identity to improve the scientific management and control of user information dissemination in Weibo public opinion.

Nonparametric Bayesian Statistical Models in Biomedical Research (생물/보건/의학 연구를 위한 비모수 베이지안 통계모형)

  • Noh, Heesang;Park, Jinsu;Sim, Gyuseok;Yu, Jae-Eun;Chung, Yeonseung
    • The Korean Journal of Applied Statistics / v.27 no.6 / pp.867-889 / 2014
  • Nonparametric Bayesian (np Bayes) statistical models are popularly used in a variety of research areas because of their flexibility and computational convenience. This paper reviews the np Bayes models focusing on biomedical research applications. We review key probability models for np Bayes inference while illustrating how each of the models is used to answer different types of research questions using biomedical examples. The examples are chosen to highlight the problems that are challenging for standard parametric inference but can be solved using nonparametric inference. We discuss np Bayes inference in four topics: (1) density estimation, (2) clustering, (3) random effects distribution, and (4) regression.

Time dependent finite element analysis of steel-concrete composite beams considering partial interaction

  • Dias, Maiga M.;Tamayo, Jorge L.P.;Morsch, Inacio B.;Awruch, Armando M.
    • Computers and Concrete / v.15 no.4 / pp.687-707 / 2015
  • In the present work, a finite element computer code for short-term analysis of steel-concrete composite structures is extended to study long-term effects under service loads. Long-term effects are important in engineering design because they influence the stress and strain distribution of the structural system and therefore contribute to increased deflections in these structures. For creep analysis, a rheological model based on a Kelvin chain, with elements placed in series, was employed. The parameters of the Kelvin chain were obtained using Dirichlet series. The creep and shrinkage models proposed by CEB FIP 90 were used. The shear-lag phenomenon that takes place in the concrete slab is usually neglected or not properly taken into account in the formulation of beam-column finite elements. Therefore, in this work, a three-dimensional numerical model based on an assemblage of shell finite elements representing the steel beam and the concrete slab is used. Stud shear connectors are represented by special beam-column elements to simulate the partial interaction at the slab-beam interface. The two-dimensional representation of the concrete slab makes it possible to capture the non-uniform shear stress distribution in the horizontal plane of the slab due to the shear-lag phenomenon. The model is validated against experimental results for two full-scale continuous composite beams previously studied by other authors. Results are given in terms of displacements, bending moments, and cracking patterns in order to show the influence of long-term effects on the structural response and the potential of the present numerical code.

Topic Modeling of News Article Related to Franchise Regulation Using LDA (LDA 를 이용한 '프랜차이즈 규제' 관련 뉴스기사 토픽모델링)

  • YANG, Woo-Ryeong;YANG, Hoe Chang
    • The Korean Journal of Franchise Management / v.13 no.4 / pp.1-12 / 2022
  • Purpose: In 2020, the franchise industry achieved significant growth compared to the previous year, as the number of franchise companies increased by 9.0% while the number of franchise brands increased by 12.5%. Despite this growth in size, the Korean franchise industry underwent many negative incidents, such as sales of franchise ownership to private equity funds, that led to the deterioration of businesses. From this point of view, this study aims to make various proposals to help policy makers develop franchise industry policies by analyzing trends in the franchise policies and regulations of the current and previous presidential administrations using newspaper articles. Research design, data and methodology: A total of 7,439 articles registered in the Naver API from February 25, 2013 to November 29, 2021 were extracted. Among them, 34 unrelated video articles were deleted, and a total of 7,405 articles from both administrations were used for analysis. R was used for word frequency analysis, word cloud generation, word correlation analysis, and LDA (Latent Dirichlet Allocation) topic modeling. Results: The keyword frequency analysis shows that the most frequently mentioned keywords during the previous administration include 'no-brand', 'major company', 'bill', 'business field', and 'SMEs', and those mentioned during the current administration include 'industry' and 'policy'. LDA topic modeling yielded 9 topics, such as 'global startups' and 'job creation', for the previous administration and 10 topics, such as 'franchise business' and 'distribution industry', for the current administration. The results of LDAvis showed that the previous administration operated a policy based on mutual growth of large and small businesses rather than hostile regulation of the franchise business, whereas the current administration extended regulation of the franchise business to the employment sector. Conclusions: Based on the analysis of the past two administrations' franchise policies, it can be suggested that franchisors and franchisees may complement each other in developing the Fair Transactions in Franchise Business Act and achieving balanced growth. Moreover, political support is needed for the sound development of franchisors. Limitations and future research suggestions are presented at the end of this study.

Futures Price Prediction based on News Articles using LDA and LSTM (LDA와 LSTM를 응용한 뉴스 기사 기반 선물가격 예측)

  • Jin-Hyeon Joo;Keun-Deok Park
    • Journal of Industrial Convergence / v.21 no.1 / pp.167-173 / 2023
  • Research has been published on predicting future data using regression analysis or artificial intelligence as a method of analyzing economic indicators. In this study, we designed a system that predicts prospective futures prices using artificial intelligence applied to topic probability data obtained from past news articles through topic modeling. Topic probability distribution data for each news article were obtained using the Latent Dirichlet Allocation (LDA) method, which can extract the topics of a document from past news articles via unsupervised learning. The topic probability distribution data were then used as the input to a Long Short-Term Memory (LSTM) network, a type of Recurrent Neural Network (RNN), in order to predict prospective futures prices. The method proposed in this study was able to predict the trend of futures prices. In the future, this method may also be able to predict price trends for derivative products such as options. However, because statistical errors occurred for certain data, further research is required to improve accuracy.

A Study on Human-Robot Interaction Trends Using BERTopic (BERTopic을 활용한 인간-로봇 상호작용 동향 연구)

  • Jeonghun Kim;Kee-Young Kwahk
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.185-209 / 2023
  • With the advent of the 4th industrial revolution, various technologies have received much attention. Technologies related to the 4th industrial revolution include the Internet of Things (IoT), big data, artificial intelligence, virtual reality (VR), 3D printers, and robotics, and these technologies are often converged. In particular, the robotics field is combined with technologies such as big data, artificial intelligence, VR, and digital twins. Accordingly, much research using robotics is being conducted and applied to the distribution, airport, hotel, restaurant, and transportation fields. In this situation, research on human-robot interaction is attracting attention, but it has not yet reached the level of user satisfaction. However, research on robots capable of perfect communication is steadily being conducted, and such robots are expected to be able to replace human emotional labor. Therefore, it is necessary to discuss whether current human-robot interaction technology can be applied to business. To this end, this study first examines trends in human-robot interaction technology. Second, we compare the LDA (Latent Dirichlet Allocation) and BERTopic topic modeling methods. As a result, we found that the concept of human-robot interaction and basic interaction were discussed in studies from 1992 to 2002. From 2003 to 2012, many studies on social expression were conducted, along with studies related to judgment, such as face detection and recognition. In studies from 2013 to 2022, service topics such as elderly nursing, education, and autism treatment appeared, and research on social expression continued. However, the technology does not yet seem to have reached a level that can be applied to business. The comparison of LDA and BERTopic topic modeling confirmed that BERTopic is a superior method to LDA.

Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.183-203 / 2018
  • News articles are the most suitable medium for examining events occurring at home and abroad. In particular, as the development of information and communication technology has brought about various kinds of online news media, the volume of news about events occurring in society has increased greatly. Automatically summarizing key events from massive amounts of news data will therefore help users survey many events at a glance. In addition, building and providing an event network based on the relevance of events can greatly help readers understand current events. In this study, we propose a method for extracting event networks from large news text data. To this end, we first collected Korean political and social articles from March 2016 to March 2017, then, through preprocessing using NPMI and Word2Vec, retained only meaningful words and merged synonyms. Latent Dirichlet allocation (LDA) topic modeling was used to calculate the topic distribution by date, and events were detected by finding the peaks of each topic distribution. A total of 32 topics were extracted, and the time of occurrence of each event was inferred from the point at which each topic distribution surged. As a result, a total of 85 events were detected, which were filtered down to a final 16 events using Gaussian smoothing. We also calculated a relevance score between the detected events to construct the event network. Using the cosine coefficient between co-occurring events, we calculated the relevance between events and connected them. Finally, we set up the event network by treating each event as a vertex and assigning the relevance score between events to the edges connecting the vertices.
The event network constructed by our method helped us sort out, in chronological order, the major political and social events that occurred in Korea over the past year and, at the same time, identify which events are related to a given event. Our approach differs from existing event detection methods in that LDA topic modeling makes it possible to easily analyze large amounts of data and to identify relevance between events that was difficult to detect with existing methods. We applied various text mining techniques, including Word2Vec, in text preprocessing to improve the accuracy of extracting proper nouns and compound nouns, which has been difficult in the analysis of Korean texts. The event detection and network construction techniques in this study have the following advantages in practical application. First, LDA topic modeling, which is unsupervised, can easily extract topics, topic words, and their distributions from huge amounts of data. Also, by using the date information of the collected news articles, the distribution of each topic can be expressed as a time series. Second, by calculating relevance scores and constructing an event network from the co-occurrence of topics, we can present the connections between events, which are difficult to grasp with existing event detection, in a summarized form. This can be seen from the fact that the inter-event relevance-based event network proposed in this study was actually constructed in order of occurrence time. It is also possible to identify, through the event network, which event served as the starting point for a series of events. A limitation of this study is that LDA topic modeling produces different results depending on the initial parameters and the number of topics, and that the topic and event names in the analysis results must be assigned by the subjective judgment of the researcher.
Also, since each topic is assumed to be exclusive and independent, the model does not take into account the relevance between topics. Subsequent studies need to calculate the relevance between events not covered in this study or events that belong to the same topic.

A Fast Bayesian Detection of Change Points Long-Memory Processes (장기억 과정에서 빠른 베이지안 변화점검출)

  • Kim, Joo-Won;Cho, Sin-Sup;Yeo, In-Kwon
    • The Korean Journal of Applied Statistics / v.22 no.4 / pp.735-744 / 2009
  • In this paper, we introduce a fast approach for Bayesian detection of change points in long-memory processes. Since heavy computation is needed to evaluate the likelihood function of long-memory processes, a method for simplifying the computation is required to implement Bayesian inference efficiently. Instead of estimating the parameter, we consider selecting an element from a set of possible parameter values obtained by categorizing the parameter space. This approach simplifies the detection algorithm and reduces the computational time needed to detect change points. Since the parameter space is (0, 0.5), there is no big difference between the results of parameter estimation and selection under a proper partition of the parameter space. An analysis of Nile river data demonstrated the validity of the proposed method.

Bayesian analysis of finite mixture model with cluster-specific random effects (군집 특정 변량효과를 포함한 유한 혼합 모형의 베이지안 분석)

  • Lee, Hyejin;Kyung, Minjung
    • The Korean Journal of Applied Statistics / v.30 no.1 / pp.57-68 / 2017
  • Clustering algorithms attempt to find a partition of a finite set of objects into a potentially predetermined number of nonempty subsets. Gibbs sampling of a normal mixture of linear mixed regressions with a Dirichlet prior distribution calculates posterior probabilities when the number of clusters is known. Our approach provides simultaneous partitioning and parameter estimation with the computation of classification probabilities. A Monte Carlo study of curve estimation results showed that the model was useful for function estimation. Examples are given to show how these models perform on real data.