Search | Korea Science

Learning Probabilistic Kernel from Latent Dirichlet Allocation

Lv, Qi;Pang, Lin;Li, Xiong
- KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.6 / pp.2527-2545 / 2016
Measuring the similarity of given samples is a key problem in recognition, clustering, retrieval and related applications. A number of approaches, e.g. kernel methods and metric learning, have been proposed for this problem. The challenge of similarity learning is to find a similarity measure that is robust to intra-class variance and at the same time selective to inter-class characteristics. We observe that the similarity measure can be improved if the data distribution and hidden semantic information are exploited in a more sophisticated way. In this paper, we propose a similarity learning approach for retrieval and recognition. The approach, termed LDA-FEK, derives a free energy kernel (FEK) from Latent Dirichlet Allocation (LDA). First, it trains LDA and constructs the kernel using the parameters and variables of the trained model. Then, the unknown kernel parameters are learned by a discriminative learning approach. The main contributions of the proposed method are twofold: (1) the method is computationally efficient and scalable, since the kernel parameters are determined in a staged way; (2) the method exploits the data distribution and semantic-level hidden information by means of LDA. To evaluate the performance of LDA-FEK, we apply it to image retrieval on two data sets and to text categorization on four popular data sets. The results show the competitive performance of our method.
https://doi.org/10.3837/tiis.2016.06.005

Topics and Sentiment Analysis Based on Reviews of Omni-Channel Retailing

KIM, Soon-Hong;YOO, Byong-Kook
- Journal of Distribution Science / v.19 no.4 / pp.25-35 / 2021
Purpose: This study aims to analyze the factors affecting customer satisfaction in customer reviews of omni-channel shopping posted on Internet blogs, cafes, and YouTube, using text mining analysis. Research, data, and methodology: In this study, frequency analysis is performed and LDA (Latent Dirichlet Allocation) is used to analyze social big data on reviewers' reactions to the omni-channel shopping recently opened by L Shopping Company. Additionally, based on the topic analysis, we conduct a sentiment analysis on purchase reviews and analyze, for each topic, the positive or negative sentiments of omni-channel app users. Results: The topic analysis yields four main topics: delivery and events, economic value, recommendations and convenience, and product quality and brand awareness. The sentiment analysis reveals that reviewers give many positive evaluations of price policy and product promotion, but negative evaluations of app use, delivery, and product quality. Conclusions: Retailers can establish customized marketing strategies by identifying customers' major interests through text mining analysis. Additionally, sentiment analysis by topic becomes an important indicator for developing the products and services that customers want, by identifying areas that satisfy customers and areas that evoke negative reactions.
https://doi.org/10.15722/jds.19.4.202104.25

Semiparametric Bayesian Hierarchical Selection Models with Skewed Elliptical Distribution (왜도 타원형 분포를 이용한 준모수적 계층적 선택 모형)

정윤식;장정훈
- The Korean Journal of Applied Statistics / v.16 no.1 / pp.101-115 / 2003
Lately there has been much theoretical and applied interest in linear models with non-normal, heavy-tailed error distributions. Starting with Zellner's (1976) study, many authors have explored the consequences of non-normality and heavy-tailed error distributions. We consider hierarchical models, including selection models, under a skewed heavy-tailed error distribution originally proposed by Chen, Dey and Shao (1999) and Branco and Dey (2001), with a Dirichlet process prior (Ferguson, 1973), for use in meta-analysis. A general class of skewed elliptical distributions is reviewed and developed. We also consider the detailed computational scheme under skew normal and skew t distributions using the MCMC method. Finally, we introduce an example from Johnson's (1993) real data and apply our proposed methodology.
https://doi.org/10.5351/KJAS.2003.16.1.101

Stochastic Time Duration Model with Gamma-Dirichlet Distribution for Global and Local Duration of HMM (Gamma-Dirichlet 분포에 의한 HMM의 전역 및 지역 시간지속 모델)

Sin, Bong-Kee
- Proceedings of the Korean Information Science Society Conference / 2008.06c / pp.517-521 / 2008
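We propose a new stochastic global+local time duration segment model (GL-STDM) that improves on the state duration distribution, a known weakness of HMMs. Specifically, we propose a global pattern (shape or long-term dependency) that represents the global temporal information of a time-series signal and captures the correlation between the per-state duration models and the duration information of each state. Because the proposed model breaks the Markov assumption, however, it cannot retain the simplicity and efficiency that dynamic programming offers. We nevertheless present a way to solve the problem effectively using Monte Carlo sampling, a method that has recently gained prominence. This paper describes the concept and definition of the proposed GL-STDM, along with its inference and model evaluation methods.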

Automatic TV Program Recommendation using LDA based Latent Topic Inference (LDA 기반 은닉 토픽 추론을 이용한 TV 프로그램 자동 추천)

Kim, Eun-Hui;Pyo, Shin-Jee;Kim, Mun-Churl
- Journal of Broadcast Engineering / v.17 no.2 / pp.270-283 / 2012
With the advent of multi-channel TV, IPTV and smart TV services, an excessive amount of TV program content has become available to users, which makes it very difficult for TV viewers to find and consume their preferred TV programs. Automatic TV recommendation, which improves access to viewers' preferred TV content, is therefore an important issue for future intelligent TV services. In this paper, we present a recommendation model based on statistical machine learning that uses a collaborative filtering concept, taking into account both public and personal preferences on TV program content. For this, users' preference for TV programs is modeled as a latent topic variable using LDA (Latent Dirichlet Allocation), which has recently been applied in various application domains. To apply LDA to TV recommendation appropriately, the topics that TV viewers are interested in are regarded as latent topics in LDA, and an asymmetric Dirichlet distribution is applied to the LDA, which can reveal the diversity of TV viewers' interests in topics based on an analysis of real TV usage history data. The experimental results show that, for the TV usage history data of user groups with similar tastes, the proposed LDA-based TV recommendation method yields an average precision of 66.5% for the top 5 ranked TV programs in weekly recommendation and an average precision of 77.9% for the top 5 ranked TV programs in bimonthly recommendation.
https://doi.org/10.5909/JEB.2012.17.2.270

Noise reduction algorithm for an image using nonparametric Bayesian method (비모수 베이지안 방법을 이용한 영상 잡음 제거 알고리즘)

Woo, Ho-young;Kim, Yeong-hwa
- The Korean Journal of Applied Statistics / v.31 no.5 / pp.555-572 / 2018
Noise reduction processes that reduce or eliminate noise (caused by a variety of reasons) in noise-contaminated images are an important theme in image processing. Many studies are being conducted on noise removal because of the importance of distinguishing between noise added to a pure image and the unique characteristics of the original image. The adaptive filter and the sigma filter are typical filters used to reduce or eliminate noise; however, their effectiveness depends on accurate noise estimation. This study generates the distribution of the noise contaminating an image based on a Dirichlet normal mixture model and presents a Bayesian approach to distinguish the characteristics of an image from the noise. In particular, to distinguish the distribution of noise from the distribution of image characteristics, we propose algorithms that carry out Bayesian inference and remove the noise contained in an image.
https://doi.org/10.5351/KJAS.2018.31.5.555

Feature Expansion based on LDA Word Distribution for Performance Improvement of Informal Document Classification (비격식 문서 분류 성능 개선을 위한 LDA 단어 분포 기반의 자질 확장)

Lee, Hokyung;Yang, Seon;Ko, Youngjoong
- Journal of KIISE / v.43 no.9 / pp.1008-1014 / 2016
Data such as Twitter, Facebook, and customer reviews belong to the informal document group, whereas newspapers, which go through a grammar correction step, belong to the formal document group. Finding consistent rules or patterns is more difficult in informal documents than in formal documents. Hence, additional approaches are needed to improve informal document analysis. In this study, we classified Twitter data, a representative type of informal document, into ten categories. To improve performance, we revised and expanded features based on the LDA (Latent Dirichlet Allocation) word distribution. Using the top-ranked LDA words, the other words were separated or bundled, and the feature set was thus expanded repeatedly. Finally, we conducted document classification with the expanded features. Experimental results indicated that the proposed method improved the micro-averaged F1-score by 7.11 percentage points compared to the results before the feature expansion step.
https://doi.org/10.5626/JOK.2016.43.9.1008

Development of Simulation Method of Doppler Power Spectrum and Raw Time Series Signal Using Average Moments of Radar Wind Profiler (윈드프로파일러의 평균모멘트 값을 이용한 도플러 파워 스펙트럼 및 시계열 원시신호 시뮬레이션기법 개발)

Lee, Sang-Yun;Lee, Gyu-Won
- The Journal of the Korea institute of electronic communication sciences / v.15 no.6 / pp.1037-1044 / 2020
Since radar wind profilers (RWP) provide wind field data with high temporal and spatial resolution in all weather conditions, verification of their accuracy and quality is essential. Simultaneous wind measurements from rawinsondes are commonly used to evaluate wind vectors from RWP. In this study, a simulation algorithm that produces the spectrum and raw time series (I/Q) data from the average values of the moments is presented as a step-by-step verification method for the signal processing algorithm. The feasibility of the simulation algorithm was also confirmed through comparison with raw data from the LAP-3000. The Doppler power spectrum was generated by assuming the density function of the skew-normal distribution and using the moment values as its parameters. The simulated spectrum was generated through random numbers. In addition, the coherently averaged I/Q data was generated using a random phase and the inverse discrete Fourier transform, and the raw I/Q data was generated through the Dirichlet distribution.
https://doi.org/10.13067/JKIECS.2020.15.6.1037

A Bayes Reliability Estimation from Life Test in a Stress-Strength Model

Park, Sung-Sub;Kim, Jae-Joo
- Journal of the Korean Statistical Society / v.12 no.1 / pp.1-9 / 1983
A stress-strength model is formulated for an s-out-of-k system of identical components. We consider the estimation of system reliability from survival count data from a Bayesian viewpoint. We assume a quadratic loss and a Dirichlet prior distribution. It is shown that a Bayes sequential procedure can be established. The Bayes estimator is compared with the UMVUE obtained by Bhattacharyya and with an estimator based on the Mann-Whitney statistic.

A Penalized Likelihood Method for Model Complexity

Ahn, Sung M.
- Communications for Statistical Applications and Methods / v.8 no.1 / pp.173-184 / 2001
We present an algorithm for reducing the complexity of a general Gaussian mixture model by using a penalized likelihood method. One of our important assumptions is that we begin with a model that is overfitted in terms of the number of components, so our main goal is to eliminate redundant components from the overfitted model. As shown in the section on simulation results, the algorithm works well with the selected densities.

