• Title/Summary/Keyword: nonparametric Bayesian

Phrase-based Topic and Sentiment Detection and Tracking Model using Incremental HDP

  • Chen, YongHeng;Lin, YaoJin;Zuo, WanLi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.12 / pp.5905-5926 / 2017
  • Sentiments can profoundly affect individual behavior and decision-making. Given the ever-increasing volume of review information available online, an effective sentiment model is needed to detect and organize that information, improve understanding, and present it to consumers in a more constructive way. This study develops a unified phrase-based topic and sentiment detection model, combined with a tracking model that uses incremental hierarchical Dirichlet allocation (PTSM_IHDP), to discover the evolutionary trend of topic-based sentiments in online reviews. First, PTSM_IHDP assumes that each review document is composed of a series of independent phrases, each of which carries both topic and sentiment information. Second, PTSM_IHDP relies on an improved time-dependent non-parametric Bayesian model, integrating incremental hierarchical Dirichlet allocation, to estimate the optimal number of topics by incrementally building an up-to-date model. We evaluated the model on a collected dataset and compared its results with the predictions of traditional models. The results demonstrate the effectiveness and advantages of our model compared with several state-of-the-art methods.
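
The abstract's point that a non-parametric Bayesian model estimates the number of topics rather than fixing it in advance can be illustrated with the Chinese restaurant process, the clustering scheme underlying Dirichlet-process models. The sketch below is not the authors' PTSM_IHDP model; the function name and the concentration parameter `alpha` are illustrative assumptions.

```python
import random

def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Sample cluster assignments from a Chinese restaurant process.

    alpha (> 0) is the concentration parameter; it is an illustrative
    value here, not one taken from the paper.
    Returns (assignments, table_sizes).
    """
    rng = random.Random(seed)
    assignments = []
    table_sizes = []
    for _ in range(n_customers):
        # An existing table k is chosen with weight table_sizes[k];
        # a new table is opened with weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)   # open a new table (new cluster)
        else:
            table_sizes[k] += 1     # join an existing table
        assignments.append(k)
    return assignments, table_sizes
```

Calling `chinese_restaurant_process(200, alpha=1.0)` returns one cluster index per item. The number of clusters is not an input: it is whatever `len(table_sizes)` turns out to be, and larger `alpha` tends to produce more clusters.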

Deep Image Annotation and Classification by Fusing Multi-Modal Semantic Topics

  • Chen, YongHeng;Zhang, Fuquan;Zuo, WanLi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.392-412 / 2018
  • Because of the semantic gap between different modalities, automatic retrieval of multimedia information remains a major challenge. An effective joint model is needed to bridge this gap and organize the relationships between modalities. In this work, we develop a model for deep image annotation and classification by fusing multi-modal semantic topics (DAC_mmst). It finds visual and non-visual topics by jointly modeling an image and its loosely related text for deep image annotation, while simultaneously learning and predicting the class label. More specifically, DAC_mmst relies on a non-parametric Bayesian model to estimate the number of visual topics that best explains the image. To evaluate the proposed algorithm, we collected a real-world dataset and conducted various experiments. The experimental results show that DAC_mmst performs favorably in perplexity, image annotation, and classification accuracy compared with several state-of-the-art methods.

A Study on Enhancing Outdoor Pedestrian Positioning Accuracy Using Smartphone and Double-Stacked Particle Filter (스마트폰과 Double-Stacked 파티클 필터를 이용한 실외 보행자 위치 추정 정확도 개선에 관한 연구)

  • Kwangjae Sung
    • Journal of the Semiconductor & Display Technology / v.22 no.2 / pp.112-119 / 2023
  • In urban environments, Global Positioning System (GPS) signals can be blocked and reflected by tall buildings, large vehicles, and complex components of the road network. As a result, the performance of a positioning system that uses a GPS module can degrade in urban areas because the GPS signals needed for position estimation are lost. To deal with this issue, various localization schemes that use inertial measurement unit (IMU) sensors, such as the gyroscope and accelerometer, and Bayesian filters, such as the Kalman filter (KF) and particle filter (PF), have been designed to enhance GPS-based positioning. Among Bayesian filters, the PF has been widely used for target tracking and vehicle navigation, since it can provide superior performance in estimating the state of a dynamic system under nonlinear/non-Gaussian conditions. This paper presents a positioning system that uses the double-stacked particle filter (DSPF) together with the accelerometer, gyroscope, and GPS receiver on a smartphone to provide higher pedestrian positioning accuracy in urban environments. The DSPF employs a nonparametric technique (Parzen window) to create a multimodal target distribution that approximates the posterior distribution. Experimental results show that the DSPF-based positioning system significantly improves pedestrian position estimation in urban environments.
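
As a rough illustration of the Parzen-window technique named in the abstract, the sketch below builds a density estimate from weighted particles by placing a Gaussian kernel on each one. It is a one-dimensional toy under assumed names (`parzen_density`, `bandwidth`), not the DSPF itself, whose details the abstract does not give.

```python
import math

def parzen_density(x, particles, weights, bandwidth):
    """Gaussian Parzen-window estimate of a 1-D density at x.

    particles: sample positions; weights: nonnegative particle weights
    (normalized internally); bandwidth: kernel width, an assumed tuning
    parameter.
    """
    total = sum(weights)
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    return sum(
        (w / total) * math.exp(-0.5 * ((x - p) / bandwidth) ** 2) / norm
        for p, w in zip(particles, weights)
    )
```

With particles clustered around two separate positions, the estimate has two modes, which is how a kernel estimate can represent a multimodal posterior that a single Gaussian cannot.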

Spatial Analysis for Mean Annual Precipitation Based On Neural Networks (신경망 기법을 이용한 연평균 강우량의 공간 해석)

  • Sin, Hyeon-Seok;Park, Mu-Jong
    • Journal of Korea Water Resources Association / v.32 no.1 / pp.3-13 / 1999
  • This study presents the Spatial-Analysis Neural-Network (SANN), an alternative spatial analysis method to conventional methods such as the Thiessen method, the inverse distance method, and kriging. It is based on neural network modeling and provides a nonparametric mean estimator as well as estimators of higher-order statistics such as standard deviation and skewness. In addition, it provides a decision-making tool that includes an estimator of the posterior probability that a spatial variable at a given point belongs to each of several classes representing the severity of the problem of interest, and a Bayesian classifier that defines the boundaries of the subregions belonging to those classes. In this paper, the SANN is used to analyze a mean annual precipitation field and to classify the field into dry, normal, and wet subregions. As an example, it is applied to the whole area of South Korea using 39 precipitation sites. Several useful results on the spatial variability of mean annual precipitation over South Korea were obtained, such as an interpolated field, a standard deviation field, and probability maps. In addition, the whole of South Korea was classified into dry, normal, and wet regions.

Multiple Comparisons for a Bivariate Exponential Populations Based On Dirichlet Process Priors

  • Cho, Jang-Sik
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.553-560 / 2007
  • In this paper, we consider a two-component system whose lifetimes follow Freund's bivariate exponential model with equal failure rates. We propose a Bayesian multiple comparisons procedure for the failure rates of I Freund's bivariate exponential populations based on Dirichlet process priors (DPP). The family of DPP is applied in the form of a baseline prior and likelihood combination to provide the comparisons. Because analytic evaluation is intractable, the posterior probabilities of all possible hypotheses are computed with a Markov chain Monte Carlo (MCMC) method, namely Gibbs sampling. The whole multiple comparisons procedure for the failure rates of bivariate exponential populations is illustrated through a numerical example.

Probabilistic real-time updating for geotechnical properties evaluation

  • Ng, Iok-Tong;Yuen, Ka-Veng;Dong, Le
    • Structural Engineering and Mechanics / v.54 no.2 / pp.363-378 / 2015
  • Estimation of geotechnical properties is an essential but challenging task, since these properties are major components governing the safety and reliability of the entire structural system. However, because of time and budget constraints, reliable estimation of geotechnical properties with the traditional site characterization approach is difficult. An alternative, efficient, and cost-effective approach that addresses the overall uncertainty is therefore needed to facilitate economical, safe, and reliable geotechnical design. In this paper, a probabilistic approach is proposed for real-time updating by incorporating new geotechnical information from the underlying project site. The updated model obtained from the proposed method is advantageous because it incorporates information from both the existing database and the site of concern. An application using real data from a site in Hong Kong is presented to demonstrate the proposed method.

Nonparametric Bayesian Approach for Multichannel based Semantic Segmentation of TV Dramas (멀티채널 기반 드라마 동영상 의미 분절화를 위한 비모수 베이지안 방법)

  • Seok, Ho-Sik;Lee, Ba-Do;Zhang, Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference / 2012.06b / pp.474-476 / 2012
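  • This paper introduces a multichannel-based nonparametric Bayesian method for semantic segmentation of drama videos. Existing methods either attempt segmentation using only very limited features, or analyze the data with methods that are valid only on a single channel such as the image channel or the audio channel. They are therefore difficult to apply to stream data that changes unpredictably, such as TV dramas. To overcome these shortcomings, we developed a method that splits a given video into single-modality channels, attempts segmentation on each channel, and dynamically combines the per-channel segmentation results to approximate the semantic segmentation of the video. The proposed method was applied to semantic segmentation of real TV videos, and its performance was verified by comparison with the semantic change intervals identified by human evaluators.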

Simultaneous outlier detection and variable selection via difference-based regression model and stochastic search variable selection

  • Park, Jong Suk;Park, Chun Gun;Lee, Kyeong Eun
    • Communications for Statistical Applications and Methods / v.26 no.2 / pp.149-161 / 2019
  • In this article, we suggest the following approach to simultaneous variable selection and outlier detection. First, we determine possible outlier candidates using properties of an intercept estimator in a difference-based regression model, and reflect the outlier information in the multiple regression model by adding mean shift parameters. Second, we select the best model from the model that includes the outlier candidates as predictors, using stochastic search variable selection. Finally, we evaluate our method with simulations and real data analysis, which yield promising results. In future work, we need to develop our method further to obtain robust estimates. We will also extend it to the nonparametric regression model for simultaneous outlier detection and variable selection.

Pliable regression spline estimator using auxiliary variables

  • Oh, Jae-Kwon;Jhong, Jae-Hwan
    • Communications for Statistical Applications and Methods / v.28 no.5 / pp.537-551 / 2021
  • We studied a regression spline estimator with a few pre-specified auxiliary variables. To implement the proposed estimator, we adapted a coordinate descent algorithm that exploits the structure of the residual sum of squares objective function determined by the B-spline and auxiliary coefficients. We also considered an efficient stepwise knot selection algorithm based on the Bayesian information criterion, so that the knots, and hence the smoothness of the estimator, are selected adaptively from the data. Numerical studies using both simulated and real data sets were conducted to illustrate the proposed method's performance. An R software package psav is available.
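
To make the idea of selecting a model by the Bayesian information criterion concrete, here is a minimal sketch assuming the common Gaussian-error form BIC = n log(RSS/n) + k log(n). The abstract does not state the exact criterion or the stepwise procedure the authors use, so the functions `bic` and `select_by_bic` and their inputs are hypothetical.

```python
import math

def bic(rss, n, k):
    """BIC under Gaussian errors: n * log(RSS / n) + k * log(n).

    rss: residual sum of squares, n: sample size,
    k: number of fitted coefficients.
    """
    return n * math.log(rss / n) + k * math.log(n)

def select_by_bic(candidates, n):
    """Return the name of the candidate fit with the smallest BIC.

    candidates maps a label (for example a knot configuration) to a
    (rss, k) pair; the labels are hypothetical, chosen by the caller.
    """
    return min(candidates, key=lambda name: bic(candidates[name][0], n, candidates[name][1]))
```

The `k log(n)` term penalizes each additional coefficient, so an extra knot is kept only when it reduces the residual sum of squares enough to offset the penalty.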

A comparison of synthetic data approaches using utility and disclosure risk measures (유용성과 노출 위험성 지표를 이용한 재현자료 기법 비교 연구)

  • Seongbin An;Trang Doan;Juhee Lee;Jiwoo Kim;Yong Jae Kim;Yunji Kim;Changwon Yoon;Sungkyu Jung;Dongha Kim;Sunghoon Kwon;Hang J Kim;Jeongyoun Ahn;Cheolwoo Park
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.141-166 / 2023
  • This paper investigates synthetic data generation methods and their evaluation measures. There has been increasing demand for releasing various types of data to the public for different purposes. At the same time, there are unavoidable concerns about leaking critical or sensitive information. Many synthetic data generation methods have been proposed over the years to address these concerns, and they have been implemented in some countries, including Korea. The current study introduces and compares three representative synthetic data generation approaches: sequential regression, nonparametric Bayesian multiple imputation, and deep generative models. Several evaluation metrics that measure the utility and disclosure risk of synthetic data are also reviewed. We provide empirical comparisons of the three approaches with respect to various evaluation measures. The findings of this work will help practitioners better understand the advantages and disadvantages of these synthetic data methods.