• Title/Summary/Keyword: Dirichlet distribution (디리클레 분포)

Classification and Allocation method of e-mail using possibility distribution and prediction (확률 분포와 추론에 의한 이메일 분류 및 정리 방법)

  • Go, Nam-Hyeon;Kim, Ji-Yun;Choi, Man-Kyu
    • Proceedings of the Korean Society of Computer Information Conference / 2016.07a / pp.95-96 / 2016
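  • This paper proposes a method for classifying and organizing e-mail using the Dirichlet distribution and a Bayesian inference model. E-mail classification began with the detection of spam, that is, unwanted advertising e-mail, but the steady growth in the volume of mail sent and received and the diversification of its content have blurred the line between advertising and informational mail. What is needed is a method that classifies mail automatically by topic rather than by a binary scheme such as spam detection. The proposed technique provides a way to predict the kinds of topics that e-mail content may cover, so that the topic of each sent or received e-mail can be assigned automatically. Beyond e-mail classification, the technique is expected to be applicable to business and market trend analysis and, in information security, to malware classification.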

Ensemble trading algorithm Using Dirichlet distribution-based model contribution prediction (디리클레 분포 기반 모델 기여도 예측을 이용한 앙상블 트레이딩 알고리즘)

  • Jeong, Jae Yong;Lee, Ju Hong;Choi, Bum Ghi;Song, Jae Won
    • Smart Media Journal / v.11 no.3 / pp.9-17 / 2022
  • Algorithmic trading, which uses algorithms to trade financial products, suffers from unstable results because many factors affect the market. Ensemble techniques that combine trading algorithms have been proposed to alleviate this problem, but they have two shortcomings. First, the trading algorithms may not be selected so as to satisfy the minimum performance requirement (better than random) that every algorithm in an ensemble must meet. Second, there is no guarantee that an ensemble model that performed well in the past will perform well in the future. To address these problems, we propose the following method for selecting the trading algorithms included in the ensemble model. Based on past data, we measure the contribution of the trading algorithms included in high-performing ensemble models. Because the past data are limited and contributions computed from them alone do not reflect their uncertainty, the contribution distribution is approximated with a Dirichlet distribution, and contribution values are sampled from it to reflect that uncertainty. A Transformer is then trained on the contribution distributions obtained from past data to predict future contributions. Trading algorithms with high predicted future contributions are selected and included in the ensemble model. Experiments showed that the proposed ensemble method outperformed existing ensemble methods.
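
The sampling step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of concentration parameters (observed counts plus a uniform prior of 1), and the example counts are all assumptions made for illustration. A Dirichlet draw is obtained by normalizing independent Gamma draws, which needs only the Python standard library.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def contribution_samples(counts, n_samples=1000, prior=1.0, seed=0):
    """Sample contribution vectors from Dirichlet(counts + prior).

    Assumption: the concentration parameters are observed counts plus a
    uniform prior; the abstract does not say how they are chosen.
    """
    rng = random.Random(seed)
    alpha = [c + prior for c in counts]
    return [sample_dirichlet(alpha, rng) for _ in range(n_samples)]

# Hypothetical data: how often each of four trading algorithms appeared in
# high-performing ensembles over a short history.
counts = [12, 7, 3, 0]
samples = contribution_samples(counts)

# Monte Carlo mean contribution per algorithm; the spread across samples
# reflects how uncertain a short history leaves each contribution.
mean = [sum(s[i] for s in samples) / len(samples) for i in range(len(counts))]
print([round(m, 3) for m in mean])
```

With few observations the sampled vectors vary widely from draw to draw; as the counts grow, the draws concentrate around the observed proportions, which is the sense in which sampling "reflects the uncertainty" of limited past data.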

Noise reduction algorithm for an image using nonparametric Bayesian method (비모수 베이지안 방법을 이용한 영상 잡음 제거 알고리즘)

  • Woo, Ho-young;Kim, Yeong-hwa
    • The Korean Journal of Applied Statistics / v.31 no.5 / pp.555-572 / 2018
  • Noise reduction, which reduces or eliminates noise (arising from a variety of causes) in a noise-contaminated image, is an important theme in image processing. Many studies address noise removal because it is important to distinguish noise added to a pure image from the inherent characteristics of the original image. The adaptive filter and the sigma filter are typical noise reduction filters; however, their effectiveness depends on accurate noise estimation. This study models the distribution of the noise contaminating an image with a Dirichlet normal mixture model and presents a Bayesian approach for distinguishing the characteristics of the image from the noise. In particular, to separate the distribution of noise from the distribution of image characteristics, we propose algorithms that perform Bayesian inference and remove the noise contained in an image.

Estimating the Number of Seats in Local Constituencies of a Party Using Exit Polls in the General Election (총선 출구조사에서 정당별 지역구 의석수 추정)

  • Kim, Ji-Hyun
    • The Korean Journal of Applied Statistics / v.26 no.1 / pp.59-70 / 2013
  • Exit polls failed to estimate the number of National Assembly seats won by each party in the 2012 General Election, even though they gave interval estimates. Three major broadcasting companies jointly carried out the exit polls but made their projections independently, and the exact projection methods were not publicly released. This paper proposes confidence intervals for the number of seats in local constituencies based on exit poll results and assesses the performance of these confidence intervals through simulation studies. The proposed confidence intervals were applied to real data from the 2012 General Election.

Development of Simulation Method of Doppler Power Spectrum and Raw Time Series Signal Using Average Moments of Radar Wind Profiler (윈드프로파일러의 평균모멘트 값을 이용한 도플러 파워 스펙트럼 및 시계열 원시신호 시뮬레이션기법 개발)

  • Lee, Sang-Yun;Lee, Gyu-Won
    • The Journal of the Korea institute of electronic communication sciences / v.15 no.6 / pp.1037-1044 / 2020
  • Because a radar wind profiler (RWP) provides wind field data with high temporal and spatial resolution in all weather conditions, verifying its accuracy and quality is essential. Simultaneous wind measurements from rawinsondes are commonly used to evaluate wind vectors from an RWP. This study presents a simulation algorithm that produces the spectrum and raw time series (I/Q) data from the average values of the moments, as a step-by-step verification method for the signal processing algorithm. The feasibility of the simulation algorithm was confirmed through comparison with raw data from the LAP-3000. The Doppler power spectrum was generated by assuming a skew-normal density function and using the moment values as its parameters. The simulated spectrum was generated using random numbers. In addition, the coherently averaged I/Q data were generated using random phases and the inverse discrete Fourier transform, and the raw I/Q data were generated using the Dirichlet distribution.
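
The spectrum-to-time-series steps named in the abstract above (skew-normal spectral shape, random fluctuation, random phase, inverse discrete Fourier transform) can be sketched as follows. This is a sketch under stated assumptions, not the paper's algorithm: the function names and all parameter values are hypothetical, the skew-normal location, scale, and shape are passed in directly rather than derived from measured moments, and the random fluctuation is taken to be a unit-mean exponential multiplier, a common choice for simulating weather radar spectra that the abstract does not confirm. The final Dirichlet-based step that produces raw I/Q data is not sketched because the abstract does not describe it.

```python
import cmath
import math
import random

def skew_normal_pdf(v, loc, scale, shape):
    """Skew-normal density: (2/scale) * phi(z) * Phi(shape*z), z = (v-loc)/scale."""
    z = (v - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * big_phi

def simulate_iq(n_fft=64, v_nyquist=10.0, power=1.0, loc=2.0, scale=1.0,
                shape=0.0, noise=0.01, seed=0):
    """Return (velocities, ideal spectrum, fluctuating spectrum, I/Q series)."""
    rng = random.Random(seed)
    dv = 2.0 * v_nyquist / n_fft
    velocities = [-v_nyquist + k * dv for k in range(n_fft)]
    # Ideal spectrum: signal density scaled to total power plus a flat noise floor.
    ideal = [power * skew_normal_pdf(v, loc, scale, shape) * dv + noise / n_fft
             for v in velocities]
    # Assumed fluctuation model: unit-mean exponential multiplier per bin.
    spectrum = [s * -math.log(1.0 - rng.random()) for s in ideal]
    # Attach an independent uniform random phase to each spectral amplitude.
    amps = [cmath.rect(math.sqrt(s), rng.uniform(0.0, 2.0 * math.pi))
            for s in spectrum]
    # Inverse DFT with 1/sqrt(N) scaling. Bin k holds velocity index k, so the
    # frequency index is k - N/2 (the spectrum is stored centered on zero).
    half = n_fft // 2
    norm = 1.0 / math.sqrt(n_fft)
    iq = [norm * sum(a * cmath.exp(2j * math.pi * (k - half) * n / n_fft)
                     for k, a in enumerate(amps))
          for n in range(n_fft)]
    return velocities, ideal, spectrum, iq

velocities, ideal, spectrum, iq = simulate_iq()
# With this scaling, total time-domain power equals total spectral power.
print(sum(abs(x) ** 2 for x in iq), sum(spectrum))
```

The direct O(N²) transform keeps the example dependency-free; an FFT routine would be the usual choice for realistic record lengths. The mapping between Doppler frequency and radial velocity (sign and wavelength factor) is left out and would need to follow the radar's own convention.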

A Trend Analysis of Radiological Research in Korea using Topic Modeling (토픽모델링을 이용한 국내 방사선 학술연구 트렌드 분석)

  • Hong, Dong-Hee
    • Journal of the Korean Society of Radiology / v.16 no.3 / pp.343-349 / 2022
  • We use topic modeling to identify radiation-themed papers published from 1989 to 2022 and to analyze the relevance and weight between topics. To contribute to the revitalization of research in the field of radiation, this study analyzed topics derived from 717 domestic papers published up to 2022. Through text mining, overall research trends in the subject distribution of the studies were analyzed, and five topics were derived through topic modeling. First, after preprocessing the keywords of the 717 papers, a frequency analysis was performed on a total of 1,675 words. Second, analysis of the five topics based on the association of their constituent words showed that studies in the fields of radiation, imaging, and clinical CT focused on minimizing dose within the range that does not degrade image quality. In addition, various studies were conducted mainly on MRI, and ultrasound studies were actively attempted in various areas of disease analysis.

Automatic TV Program Recommendation using LDA based Latent Topic Inference (LDA 기반 은닉 토픽 추론을 이용한 TV 프로그램 자동 추천)

  • Kim, Eun-Hui;Pyo, Shin-Jee;Kim, Mun-Churl
    • Journal of Broadcast Engineering / v.17 no.2 / pp.270-283 / 2012
  • With the advent of multi-channel TV, IPTV, and smart TV services, an excessive amount of TV program content has become available to users, which makes it very difficult for viewers to find and consume their preferred programs. Automatic TV recommendation, which improves access to preferred content, is therefore an important issue for future intelligent TV services. In this paper, we present a recommendation model based on statistical machine learning that uses a collaborative filtering concept, taking into account both public and personal preferences for TV program content. Users' preferences for TV programs are modeled as latent topic variables using LDA (Latent Dirichlet Allocation), which has recently been applied in various domains. To apply LDA to TV recommendation appropriately, the topics that interest TV viewers are regarded as the latent topics in LDA, and an asymmetric Dirichlet distribution is applied to the LDA, which can reveal the diversity of viewers' topic interests based on an analysis of real TV usage history data. Experimental results show that, for the TV usage history data of user groups with similar tastes, the proposed LDA-based recommendation method yields an average precision of 66.5% with the top 5 ranked TV programs in weekly recommendation and 77.9% in bimonthly recommendation.
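
For orientation, the role of the asymmetric Dirichlet prior mentioned in the abstract above can be written out using the standard LDA generative model. The notation is an assumption and is not taken from the paper: u indexes viewers, n indexes the programs a viewer watched, K is the number of latent topics, and the abstract does not say how the prior parameters are set.

```latex
\begin{aligned}
\theta_u &\sim \mathrm{Dirichlet}(\alpha_1, \dots, \alpha_K), &
\phi_k &\sim \mathrm{Dirichlet}(\beta), \\
z_{u,n} \mid \theta_u &\sim \mathrm{Categorical}(\theta_u), &
w_{u,n} \mid z_{u,n} &\sim \mathrm{Categorical}(\phi_{z_{u,n}}), \\
\mathbb{E}[\theta_{u,k}] &= \frac{\alpha_k}{\sum_{j=1}^{K} \alpha_j}.
\end{aligned}
```

With a symmetric prior, all \alpha_k are equal and every topic has prior mean 1/K. With an asymmetric prior, the \alpha_k differ, so some topics are expected a priori to account for a larger share of a viewer's history than others.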