• Title/Abstract/Keywords: Conditional Probability

332 search results (processing time: 0.027 s)

A probabilistic information retrieval model by document ranking using term dependencies (용어간 종속성을 이용한 문서 순위 매기기에 의한 확률적 정보 검색)

  • You, Hyun-Jo;Lee, Jung-Jin
    • The Korean Journal of Applied Statistics
    • /
    • Vol. 32, No. 5
    • /
    • pp.763-782
    • /
    • 2019
  • This paper proposes a probabilistic document ranking model that incorporates term dependencies. Document ranking is a fundamental information retrieval task: sorting the documents in a collection by their relevance to the user query (Qin et al., Information Retrieval Journal, 13, 346-374, 2010). A probabilistic model computes the conditional probability that each document is relevant given a query. Most widely used models assume term independence because the joint probabilities of multiple terms are challenging to compute, even though words in natural language texts are highly correlated. In this paper, we assume a multinomial distribution model to calculate the relevance probability of a document while considering the dependency structure of words, and propose an information retrieval model that ranks documents by estimating this probability with the maximum entropy method. Ranking simulation experiments in various multinomial situations show better retrieval results than a model that assumes word independence. Document ranking experiments on the real-world LETOR OHSUMED dataset also show better retrieval results.
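
For orientation, the term-independence baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of a binary-independence-style log-odds score, not the authors' dependency model; the function names, the per-term probabilities `p_rel` and `p_non`, and the toy documents are all hypothetical.

```python
import math

def independence_score(query_terms, doc_terms, p_rel, p_non):
    """Log-odds of relevance when query terms are treated as independent.

    p_rel[t]: assumed P(term t present | relevant document)
    p_non[t]: assumed P(term t present | non-relevant document)
    """
    score = 0.0
    for t in query_terms:
        if t in doc_terms:
            score += math.log(p_rel[t] / p_non[t])
        else:
            score += math.log((1.0 - p_rel[t]) / (1.0 - p_non[t]))
    return score

def rank(query_terms, docs, p_rel, p_non):
    """Return document names sorted from most to least relevant."""
    return sorted(
        docs,
        key=lambda name: independence_score(query_terms, docs[name], p_rel, p_non),
        reverse=True,
    )

# Hypothetical toy data, for illustration only.
docs = {"d1": {"heart", "attack"}, "d2": {"heart"}, "d3": set()}
p_rel = {"heart": 0.8, "attack": 0.6}
p_non = {"heart": 0.3, "attack": 0.1}
ranking = rank(["heart", "attack"], docs, p_rel, p_non)
```

Because the score is a plain sum over terms, it cannot reward or penalize co-occurring terms jointly, which is the limitation that a dependency-aware model targets.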

Estimation of drought risk through the bivariate drought frequency analysis using copula functions (코플라 함수를 활용한 이변량 가뭄빈도해석을 통한 우리나라 가뭄 위험도 산정)

  • Yu, Ji Soo;Yoo, Ji Young;Lee, Joo-Heon;Kim, Tea-Woong
    • Journal of Korea Water Resources Association
    • /
    • Vol. 49, No. 3
    • /
    • pp.217-225
    • /
    • 2016
  • Drought is generally characterized by duration and severity, so a bivariate frequency analysis that considers drought duration and severity simultaneously is required. However, since a bivariate joint probability distribution function (JPDF) is defined over a 3-dimensional space, its results are difficult to interpret in practice. As a technical solution, this study employed copula functions to estimate a JPDF, then developed conditional JPDFs for various drought durations and estimated the critical severity corresponding to a given non-exceedance probability. Based on historical severe drought events, the hydrologic risks were investigated for various extreme droughts with 95% non-exceedance probability. For drought events with 10-month duration, the most hazardous areas were identified as Gwangju, Inje, and Uljin, which have 1.3-2.0 times higher drought occurrence probabilities than the national average. In addition, southern regions were observed to be much more drought-prone than northern and central regions.
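
The idea of conditioning a copula on duration and reading off a critical severity can be sketched as below. The abstract does not say which copula family was used, so the Clayton copula, the parameter `theta`, and the function names here are assumptions chosen only because the conditional distribution has a simple closed form. `u` and `v` are the marginal non-exceedance probabilities of duration and severity.

```python
def clayton_conditional_cdf(v, u, theta):
    """P(V <= v | U = u) for a Clayton copula with theta > 0."""
    a = u ** -theta + v ** -theta - 1.0
    return u ** (-theta - 1.0) * a ** (-(1.0 + theta) / theta)

def clayton_conditional_quantile(p, u, theta):
    """Solve P(V <= v | U = u) = p for v (closed-form inverse of the above)."""
    t = (p ** (-theta / (1.0 + theta)) - 1.0) * u ** -theta + 1.0
    return t ** (-1.0 / theta)

# Hypothetical inputs: duration at its 0.7 marginal quantile, theta = 2,
# and a 95% conditional non-exceedance probability for severity.
v_crit = clayton_conditional_quantile(0.95, 0.7, 2.0)
check = clayton_conditional_cdf(v_crit, 0.7, 2.0)
```

`v_crit` is still on the probability scale; turning it into a severity value would require the inverse of the fitted severity marginal, which is not shown here.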

Channel Input-Traffic Control of FH/SSMA Systems with a Centralized Controller (기지국이 있는 주파수 도약 대역확산 통신 시스템에서의 채널 입력 트래픽 제어)

  • 김석찬;김정곤;송익호;김형명
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • Vol. 21, No. 1
    • /
    • pp.175-186
    • /
    • 1996
  • An optimal channel input-traffic control (OCIC) policy is proposed for slotted frequency-hopped spread-spectrum multiple access communication systems. The conditional throughput of the OCIC policy is analyzed for the case in which the number of channel input packets is set to the optimal number. The state transition probability is derived, the steady-state performance is analyzed, and the mean packet delay is obtained. It is shown that the mean packet delay decreases considerably when transmission priority is given to backlogged users. The smaller the number of frequency slots, the larger the differences between the performance of the OCIC policy and that of the other policies.

Error Intensity Function Models for ML Estimation of Signal Parameter, Part I : Model Derivation (신호 파라미터의 ML 추정기법에 대한 에러 밀도 함수 모델에 관한 연구 I : 모델 정립)

  • Joong Kyu Kim
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • Vol. 30B, No. 12
    • /
    • pp.1-11
    • /
    • 1993
  • This paper concentrates on models useful for analyzing the error performance of ML (Maximum Likelihood) estimators of a single unknown signal parameter, namely error intensity models. We first develop the point process representation for the estimation error and the conditional distribution of the estimator, as well as the distribution of the error candidate point process. The error intensity function is then defined as the probability density of the estimate, and its general form is derived. We then develop several intensity models depending on how the candidate error locations are chosen. For each case, we compute the explicit form of the intensity function and discuss the trade-offs among models as well as the extendability to multiple parameter estimation.

Analysis of PN Code Acquisition Performance with Multiple Antennas in a UWB System (다중 안테나를 적용한 UWB 시스템의 PN 부호 포착 성능 분석)

  • Kim, Eun-Cheol;Kim, Jin-Young
    • Proceedings of the IEEK Conference
    • /
    • Institute of Electronics Engineers of Korea (IEEK) 2005 Fall Conference
    • /
    • pp.69-72
    • /
    • 2005
  • In this paper, pseudo noise (PN) code acquisition performance with multiple antennas in a UWB time hopping/code division multiple access system is analyzed. The closed form for the conditional probability is derived, using the Gauss-Hermite quadrature formula, when the signal with Gaussian distribution goes through the lognormal fading channel. The performance comparison of the above mentioned schemes shows that the code acquisition performance with a diversity combining technique, especially when increasing the number of antennas, is more robust than that using no diversity.
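
Averaging a conditional probability over lognormal fading with the Gauss-Hermite quadrature formula can be sketched as follows. This is an illustration of the quadrature step only, not the paper's derivation: the 5-point rule, the function names, and the placeholder `cond_prob` are assumptions. The fading gain is modeled as exp(X) with X ~ N(mu, sigma^2).

```python
import math

# 5-point Gauss-Hermite nodes and weights (weight function exp(-x^2)).
_NODES = (-2.0201828704560856, -0.9585724646138185, 0.0,
          0.9585724646138185, 2.0201828704560856)
_WEIGHTS = (0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
            0.3936193231522412, 0.01995324205904591)

def expect_over_normal(f, mu, sigma):
    """Approximate E[f(X)] for X ~ N(mu, sigma^2) via the substitution
    x = mu + sqrt(2) * sigma * t, which maps the integral to exp(-t^2) form."""
    total = sum(w * f(mu + math.sqrt(2.0) * sigma * t)
                for t, w in zip(_NODES, _WEIGHTS))
    return total / math.sqrt(math.pi)

def average_over_lognormal(cond_prob, mu, sigma):
    """Average cond_prob(gain) over a lognormal gain exp(X), X ~ N(mu, sigma^2).

    cond_prob is a hypothetical user-supplied conditional probability
    (for example, detection probability given the channel gain).
    """
    return expect_over_normal(lambda x: cond_prob(math.exp(x)), mu, sigma)
```

A five-point rule is exact for polynomial integrands up to degree nine; sharply varying conditional probabilities or large `sigma` would likely need more nodes.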

Numerical Investigations of Turbulent Stratified Premixed Flames (난류 성층 예혼합 화염장의 상세구조 해석)

  • Jeon, Sangtae;Kim, Namsu;Kim, Yongmo
    • Korean Society of Combustion: Conference Proceedings
    • /
    • Korean Society of Combustion 2014, 49th KOSCO SYMPOSIUM Abstracts
    • /
    • pp.183-184
    • /
    • 2014
  • The multi-environment probability density function (PDF) model has been applied to simulate turbulent stratified premixed flames. The direct quadrature method of moments (DQMOM) has been adopted to solve the PDF transport equation because of its computational efficiency and robustness. Computations are made for the non-swirling turbulent stratified premixed flames SWB1, SWB5 and SWB9. The numerical results obtained in this study are compared in detail with experimental data in terms of axial velocity and the unconditional and conditional means of scalar fields, including temperature and species mass fractions.

Improved Exact Inference in Logistic Regression Model

  • Kim, Donguk;Kim, Sooyeon
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 10, No. 2
    • /
    • pp.277-289
    • /
    • 2003
  • We propose modified exact inferential methods for the logistic regression model. The exact conditional distribution in the logistic regression model is often highly discrete, and ordinary exact inference is conservative because of this discreteness. For exact inference in the logistic regression model we utilize the modified P-value. The modified P-value cannot exceed the ordinary P-value, so the test of size $\alpha$ based on the modified P-value is less conservative. The modified exact confidence interval maintains at least a fixed confidence level but tends to be much narrower. The approach inverts the results of a test with a modified P-value, utilizing the test statistic and table probabilities in the logistic regression model.

A DATA COMPRESSION METHOD USING ADAPTIVE BINARY ARITHMETIC CODING AND FUZZY LOGIC

  • Jou, Jer-Min;Chen, Pei-Yin
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • Korea Fuzzy Logic and Intelligent Systems Society 1998, The Third Asian Fuzzy Systems Symposium
    • /
    • pp.756-761
    • /
    • 1998
  • This paper describes an in-line lossless data compression method using adaptive binary arithmetic coding. To achieve better compression efficiency, we employ an adaptive fuzzy-tuning modeler, which uses fuzzy inference to deal with the problem of conditional probability estimation. The design is simple, fast and suitable for VLSI implementation because we adopt the table-look-up approach. Compared with the outcomes of other lossless coding schemes, our results are good and satisfactory for various types of source data.
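
The role of conditional probability estimation in adaptive binary arithmetic coding can be illustrated with a plain count-based modeler. This sketch swaps in simple frequency counts for the paper's fuzzy-tuning modeler and omits the arithmetic coder itself; it only reports the ideal code length, i.e. the sum of -log2 P(bit | context). The class name, function name, and `order` parameter are hypothetical.

```python
import math
from collections import defaultdict

class AdaptiveBitModel:
    """Estimates P(next bit = 1 | context) from running counts."""

    def __init__(self):
        # One pseudo-count per symbol so no probability is ever zero.
        self.counts = defaultdict(lambda: [1, 1])

    def p_one(self, context):
        c0, c1 = self.counts[context]
        return c1 / (c0 + c1)

    def update(self, context, bit):
        self.counts[context][bit] += 1

def ideal_code_length(bits, order=2):
    """Total bits an ideal arithmetic coder would spend on `bits`,
    using the previous `order` bits as the conditioning context."""
    model = AdaptiveBitModel()
    history = (0,) * order
    total = 0.0
    for b in bits:
        p1 = model.p_one(history)
        total += -math.log2(p1 if b == 1 else 1.0 - p1)
        model.update(history, b)  # adapt only after coding the bit
        history = (history + (b,))[-order:] if order else ()
    return total

# A strongly patterned source compresses far below one bit per symbol.
pattern = [0, 1] * 100
length = ideal_code_length(pattern)
```

Updating the counts only after each bit is coded keeps an encoder and decoder in step, since both see the same history at every position.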

A distance metric of nominal attribute based on conditional probability (조건부 확률에 기반한 범주형 자료의 거리 측정)

  • 이재호;우종하;오경환
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • Korea Fuzzy Logic and Intelligent Systems Society 2003 Fall Conference Proceedings
    • /
    • pp.53-56
    • /
    • 2003
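  • Similarity, or the notion of distance between data items, is an important measure used in many machine learning algorithms. However, when the attributes of the input data include nominal attributes with no defined order, measuring the similarity or distance between data items becomes difficult. Non-distance-based algorithms such as C4.5 and CART can operate without measuring distance, but distance-based algorithms cannot be applied effectively because nominal attributes lack distance information. This paper proposes a method for measuring the distance between such nominal data that takes full account of the characteristics of the data set. This requires prior information about the data set. We present a distance measure based on conditional probability, which serves as this prior information, and attempt to optimize the distance measurement between attributes through error feedback. If two different nominal values show similar distributions with respect to the target attribute in a given data set, they are assigned a relatively small distance. The learning stage proceeds on the basis of the distances determined in this way, and feedback is applied to the errors that occur. Experimental results on data from the UCI Machine Learning Repository confirm the strong performance of the proposed distance measure.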
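
The core idea, that two nominal values are close when they induce similar class distributions, can be sketched as below. This is a minimal illustration assuming an L1 difference between conditional class distributions; it omits the paper's error-feedback optimization, and the function names and toy data are hypothetical.

```python
from collections import Counter, defaultdict

def class_conditionals(values, labels):
    """Estimate P(class | nominal value) from paired observations."""
    counts = defaultdict(Counter)
    for v, y in zip(values, labels):
        counts[v][y] += 1
    return {
        v: {y: n / sum(c.values()) for y, n in c.items()}
        for v, c in counts.items()
    }

def value_distance(cond, a, b):
    """Sum over classes of |P(class | a) - P(class | b)|."""
    classes = set(cond[a]) | set(cond[b])
    return sum(abs(cond[a].get(y, 0.0) - cond[b].get(y, 0.0)) for y in classes)

# Hypothetical toy data: "red" and "crimson" behave alike, "blue" does not.
values = ["red", "red", "crimson", "crimson", "blue", "blue"]
labels = ["pos", "neg", "pos", "neg", "neg", "neg"]
cond = class_conditionals(values, labels)
d_similar = value_distance(cond, "red", "crimson")
d_different = value_distance(cond, "red", "blue")
```

The resulting distances range from 0 (identical class distributions) to 2 (disjoint ones), so they can be dropped into a distance-based learner in place of a simple match/mismatch rule.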

Derivation of Design Flood Using Multisite Rainfall Simulation Technique and Continuous Rainfall-Runoff Model

  • Kwon, Hyun-Han
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • Korea Water Resources Association 2009 Conference Abstracts
    • /
    • pp.540-544
    • /
    • 2009
  • Hydrologic patterns under climate change have drawn attention as one of the most important issues in the hydrologic science community. Rainfall and runoff are key elements of the Earth's hydrological cycle and are associated with many different aspects, such as water supply, flood prevention and river restoration. In this regard, the main objective of this study is to evaluate the design flood using simulation techniques that can consider a full spectrum of uncertainty. Here we utilize a weather-state-based stochastic multivariate model as a conditional probability model for simulating the rainfall field. A major premise of this study is that large-scale climatic patterns are a major driver of persistent year-to-year changes in rainfall probabilities. Uncertainty analysis in estimating the design flood is needed to examine the reliability of the estimated results. To this end, this study applies a Bayesian Markov Chain Monte Carlo scheme to the widely used NWS-PC rainfall-runoff model, and a case study is performed for the Soyang Dam watershed in Korea. A comprehensive discussion of the design flood under climate change is provided.
