• Title/Abstract/Keywords: negative sampling

649 search results (processing time: 0.022 s)

On Some Distributions Generated by Riff-Shuffle Sampling

  • Son M.S.;Hamdy H.I.
    • International Journal of Contents
    • /
    • Vol. 2, No. 2
    • /
    • pp.17-24
    • /
    • 2006
  • The work presented in this paper is divided into two parts. The first part presents finite urn problems which generate truncated negative binomial random variables. Some combinatorial identities that arose from the negative binomial sampling and truncated negative binomial sampling are established. These identities are constructed and serve important roles when we deal with these distributions and their characteristics. Other important results including cumulants and moments of the distributions are given in somewhat simple forms. Second, the distributions of the maximum of two chi-square variables and the distributions of the maximum correlated F-variables are then derived within the negative binomial sampling scheme. Although multinomial theory applied to order statistics and standard transformation techniques can be used to derive these distributions, the negative binomial sampling approach provides more information and deeper insight regarding the nature of the relationship between the sampling vehicle and the probability distributions of these functions of chi-square variables. We also provide an algorithm to compute the percentage points of these distributions. We supplement our findings with exact simple computational methods where no interpolations are involved.


Improving passage retrieval via negative sampling from semantic feature space

  • 이정두;홍범석;최원석;한영섭;전병기;나승훈
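    • KIISE Special Interest Group on Language Engineering: Conference Proceedings (Hangul and Korean Information Processing)
    • /
    • Proceedings of the 34th Annual Conference on Hangul and Korean Information Processing (2022), KIISE Special Interest Group on Language Engineering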
    • /
    • pp.146-149
    • /
    • 2022
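  • Recently, methods for obtaining good negative samples have been applied to retrieval tasks and have yielded large performance gains. However, most methods for obtaining good negative samples incur a high computational cost. In this paper, to obtain effective negative samples at low computational cost, we use Mixed Gaussian Recurrent Chain (MGRC) sampling to obtain semantically similar features in the feature space and use them as negative samples, achieving better performance than the existing baseline model.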


The Role of Negative Binomial Sampling In Determining the Distribution of Minimum Chi-Square

  • Hamdy H.I.;Bentil Daniel E.;Son M.S.
    • International Journal of Contents
    • /
    • Vol. 3, No. 1
    • /
    • pp.1-8
    • /
    • 2007
  • The distribution of the minimum correlated F-variable arises in many applied statistical problems, including simultaneous analysis of variance (SANOVA), equality of variance, selection and ranking of populations, and reliability analysis. In this paper, the negative binomial sampling technique is employed to derive the distributions of the minimum of chi-square variables and hence the distributions of the minimum correlated F-variables. The work presented in this paper is divided into two parts. The first part is devoted to developing some combinatorial identities arising from negative binomial sampling. These identities are constructed and justified because they serve an important purpose when we deal with these distributions or their characteristics. Other important results, including cumulants and moments of these distributions, are also given in somewhat simple forms. Second, the distributions of the minimum chi-square variable, and hence the distribution of the minimum correlated F-variables, are derived within the negative binomial sampling framework. Although multinomial theory applied to order statistics and standard transformation techniques can be used to derive these distributions, the negative binomial sampling approach provides more information regarding the nature of the relationship between the sampling vehicle and the probability distributions of these functions of chi-square variables. We also provide an algorithm to compute the percentage points of the distributions. The computational methods we adopted are exact, and no interpolations are involved.

Evaluation of Sample Testing Scheme for Designated Aquatic Animals

  • 박선일
    • Journal of Veterinary Clinics
    • /
    • Vol. 29, No. 1
    • /
    • pp.58-62
    • /
    • 2012
  • To protect the aquatic animal health of importing countries from the potential risks associated with exotic diseases introduced through international trade in live aquatic animals, inspection of designated commodities at ports of entry is a critical component of the safeguarding system. The only way to be 100% confident that no fish in a shipment are infected with a specific agent is to test every fish in the imported commodity with a perfect diagnostic test. In the majority of cases this is unrealistic, since the group of interest may be very large, particularly for aquatic animals, and often only imperfect tests are available. It is, therefore, more common to test a fixed proportion of a group using preplanned sampling schemes. However, decisions based on the results of testing a sample carry a considerable chance that infected groups are misclassified as uninfected, depending on the sampling strategy employed. The objective of this study was to determine the probability that one or more fish in an imported group are infected but test negative after the samples are inspected. This question is critical for government authorities examining whether a sampling plan is sufficient to achieve its intended purpose. At a fixed population size, the maximum number of infected fish when all tests are negative decreased as the sampling fraction increased. The probability of including at least one undetected but infected fish in a group with negative tests increased with the number of fish tested or with the true prevalence. The risk was much lower when a high-sensitivity test was assumed; when test sensitivity was increased from 0.9 to 0.99, this risk was dramatically reduced to about a tenth or a fourth for prevalences ranging from 2 to 10%, given sample sizes ranging from 10 to 200. Based on this preliminary analysis, the author concluded that the current sampling plan, which tests 4-8% of the import proposal for human consumption, can still yield a high rate of false negative results. Therefore, from the quarantine inspection point of view, an enforced commodity-specific sampling design that accounts for the cost of testing with an imperfect test at the specified design prevalence is urgently needed.
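
The kind of risk described in this abstract can be illustrated with a minimal sketch. It is not the author's model: it assumes a large consignment (binomial approximation), independent test results, and perfect test specificity, and all function names and parameter values are hypothetical.

```python
def prob_all_negative(prevalence: float, sensitivity: float, n_tested: int) -> float:
    """Probability that every one of n_tested fish tests negative.

    Assumption: each sampled fish is infected independently with
    probability `prevalence`, and the test gives no false positives.
    """
    return (1.0 - prevalence * sensitivity) ** n_tested


def prob_infected_given_negative(prevalence: float, sensitivity: float) -> float:
    """Probability that a single fish that tested negative is infected."""
    missed = prevalence * (1.0 - sensitivity)   # infected, but test negative
    truly_negative = 1.0 - prevalence           # uninfected (specificity = 1)
    return missed / (missed + truly_negative)


def prob_undetected_in_sample(prevalence: float, sensitivity: float, n_tested: int) -> float:
    """Probability that at least one of n_tested all-negative fish is infected."""
    q = prob_infected_given_negative(prevalence, sensitivity)
    return 1.0 - (1.0 - q) ** n_tested


if __name__ == "__main__":
    # Hypothetical inputs: 2% prevalence, 10 fish tested, two sensitivities.
    for se in (0.9, 0.99):
        print(se, prob_undetected_in_sample(0.02, se, 10))
```

Under these assumptions the risk grows with the number of fish tested and with prevalence, and falls as sensitivity rises, which matches the direction of the findings reported above.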

Non-negative Unbiased MSE Estimation under Stratified Multi-stage Sampling

  • Kim, Kyuseong
    • Journal of the Korean Statistical Society
    • /
    • Vol. 30, No. 4
    • /
    • pp.637-644
    • /
    • 2001
  • We investigated two kinds of mean square error (MSE) estimators of the homogeneous linear estimator (HLE) for the population total under stratified multi-stage sampling. One is studied when the second-stage variance component is estimable, and the other is found in case it is not estimable. The proposed estimators are necessary forms of non-negative unbiased MSE estimators of the HLE.


An Optimal Scheme of Inclusion Probability Proportional to Size Sampling

  • Kim Sun Woong
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 12, No. 1
    • /
    • pp.181-189
    • /
    • 2005
  • This paper suggests a method of inclusion probability proportional to size sampling that provides a non-negative and stable variance estimator. The sampling procedure is quite simple and flexible, since a sampling design is easily obtained using mathematical programming. This scheme appears to be preferable to Nigam, Kumar and Gupta's (1984) method, which uses balanced incomplete block designs. A comparison is made with their method through an example in the literature.

Interval Estimation of Population Proportion in a Double Sampling Scheme

  • 이승천;최병수
    • The Korean Journal of Applied Statistics
    • /
    • Vol. 22, No. 6
    • /
    • pp.1289-1300
    • /
    • 2009
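  • Double sampling, which is commonly used to reduce sampling costs, is relatively difficult to analyze statistically because most of the samples are contaminated by two types of errors. In particular, for interval estimation, an important tool for inference on proportions, only the Wald method, which relies on the normal approximation of the likelihood estimator, has been known so far; however, several studies have confirmed that the Wald confidence interval has many problems, for example in how well it approximates the nominal coverage probability. In this study, we identify the problems of the Wald confidence interval in double sampling and propose an Agresti-Coull type confidence interval as an alternative.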
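
To make the contrast between the two interval types concrete, here is a minimal sketch of the standard Wald and Agresti-Coull intervals for a single binomial proportion. This is the textbook single-sample case, not the double-sampling version developed in the paper; the function names and the example counts are assumptions for illustration.

```python
import math
from statistics import NormalDist


def wald_interval(successes: int, n: int, confidence: float = 0.95):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def agresti_coull_interval(successes: int, n: int, confidence: float = 0.95):
    """Agresti-Coull interval: a Wald-style interval around an adjusted
    estimate that adds z^2 / 2 successes and z^2 / 2 failures."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2.0) / n_adj
    half_width = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    # Clip to [0, 1] so the interval stays inside the parameter space.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


if __name__ == "__main__":
    # Hypothetical data: 0 successes out of 20 trials.
    print("Wald:         ", wald_interval(0, 20))
    print("Agresti-Coull:", agresti_coull_interval(0, 20))
```

With zero observed successes the Wald interval collapses to a single point, while the Agresti-Coull interval keeps a positive width; this is one commonly cited weakness of the Wald method.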

A Probabilistic Sampling Method for Efficient Flow-based Analysis

  • Jadidi, Zahra;Muthukkumarasamy, Vallipuram;Sithirasenan, Elankayer;Singh, Kalvinder
    • Journal of Communications and Networks
    • /
    • Vol. 18, No. 5
    • /
    • pp.818-825
    • /
    • 2016
  • Network management and anomaly detection are challenging in high-speed networks because of the high volume of packets that has to be analysed. Flow-based analysis is a scalable method that reduces the high volume of network traffic by dividing it into flows. As sampling methods are used extensively in flow generators such as NetFlow, the impact of sampling on the performance of flow-based analysis needs to be investigated. Monitoring using sampled traffic is a well-studied research area; however, the impact of sampling on flow-based anomaly detection is poorly researched. This paper investigates flow sampling methods and shows that these methods have a negative impact on flow-based anomaly detection. Therefore, we propose an efficient probabilistic flow sampling method that can preserve the flow traffic distribution. The proposed sampling method takes into account two flow features: destination IP address and octet. The destination IP addresses are sampled based on the number of received bytes. Our method provides efficient sampled traffic that has the traffic features required for both flow-based anomaly detection and monitoring. The proposed sampling method is evaluated using a number of generated flow-based datasets. The results show an improvement in preserved malicious flows.
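
One possible reading of "destination IP addresses are sampled based on the number of received bytes" is sketched below. It is not the authors' algorithm: the flow record layout (`dst_ip`, `bytes`), the function name, and the choice of weighted sampling without replacement are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_flows_by_destination(flows, n_destinations, seed=None):
    """Keep all flows whose destination IP is selected, where destinations
    are drawn without replacement with probability proportional to the
    total bytes they received.

    `flows` is a list of dicts with hypothetical keys 'dst_ip' and 'bytes'.
    """
    rng = random.Random(seed)

    # Aggregate received bytes per destination IP.
    bytes_per_dst = defaultdict(int)
    for flow in flows:
        bytes_per_dst[flow["dst_ip"]] += flow["bytes"]

    candidates = dict(bytes_per_dst)
    selected = set()
    while candidates and len(selected) < n_destinations:
        ips = list(candidates)
        weights = [candidates[ip] for ip in ips]
        if sum(weights) <= 0:
            break  # nothing left with a positive byte count
        chosen = rng.choices(ips, weights=weights, k=1)[0]
        selected.add(chosen)
        del candidates[chosen]

    return [flow for flow in flows if flow["dst_ip"] in selected]


if __name__ == "__main__":
    example_flows = [
        {"dst_ip": "10.0.0.1", "bytes": 9000},
        {"dst_ip": "10.0.0.1", "bytes": 500},
        {"dst_ip": "10.0.0.2", "bytes": 300},
        {"dst_ip": "10.0.0.3", "bytes": 50},
    ]
    print(sample_flows_by_destination(example_flows, n_destinations=2, seed=1))
```

Keeping every flow of a selected destination, rather than sampling individual flows, is a design choice of this sketch and is not stated in the abstract.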

The Effect of Word-of-Mouth on Purchase Intention: A Case Study of Low-Cost Carriers in Indonesia

  • SOELASIH, Yasintha;SUMANI, Sumani
    • The Journal of Asian Finance, Economics and Business
    • /
    • Vol. 8, No. 4
    • /
    • pp.433-440
    • /
    • 2021
  • This study tests the effect of word-of-mouth (WOM) on purchase intention for low-cost carrier (LCC) flights in Indonesia, with positive and negative perceptions as mediators. WOM is one element of the communications mix available to airlines. WOM is a form of communication between passengers after they have taken a flight. Airlines hope that a positive perception of WOM will form. If a positive perception of WOM has formed, purchase intention will arise. The study population consisted of LCC flight passengers in Indonesia, with 387 respondents. For indicators and variables, validity and reliability tests were conducted using CFA, CR, and AVE. Sampling locations were Soekarno-Hatta and Kualanamu airports. The sample was collected through purposive sampling, and the analytical tool used was structural equation modeling (SEM) with Lisrel. The results showed that WOM influenced purchase intention through positive and negative perceptions of WOM. A positive perception of WOM has a direct effect, while a negative perception of WOM has the opposite effect. In conclusion, the mediation of perceptions influences purchase intention, whether in the same direction or the opposite one, and WOM is an antecedent of purchase intention.

Self-Sampling Versus Physicians' Sampling for Cervical Cancer Screening - Agreement of Cytological Diagnoses

  • Othman, Nor Hayati;Zaki, Fatma Hariati Mohamad;Hussain, Nik Hazlina Nik;Yusoff, Wan Zahanim Wan;Ismail, Pazuddin
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 17, No. 7
    • /
    • pp.3489-3494
    • /
    • 2016
  • Background: A major problem with cervical cancer screening in countries that have no organized national screening program for cervical cancer is sub-optimal participation. Implementation of a self-sampling method may increase coverage. Objective: We determined the agreement of cytological diagnoses made on samples collected by women themselves (self-sampling) versus samples collected by physicians (physician sampling). Materials and Methods: We invited women volunteers to undergo two procedures: cervical self-sampling using the Evalyn brush and physician sampling using a Cervex brush. The women were shown a video presentation on how to take their own cervical samples before the procedure. The samples taken by physicians were taken as per routine testing (gold standard). All samples were subjected to Thin Prep monolayer smears. The diagnoses were made according to the Bethesda classification. The results from these two sampling methods were analysed and compared. Results: A total of 367 women were recruited into the study, ranging from 22 to 65 years of age. There was significant good agreement of the cytological diagnoses made on the samples from the two sampling methods, with a Kappa value of 0.568 (p=0.040). Using the cytological smears taken by physicians as the gold standard, the sensitivity of self-sampling was 71.9% (95% CI: 70.9-72.8), the specificity was 86.6% (95% CI: 85.7-87.5), the positive predictive value was 74.2% (95% CI: 73.3-75.1) and the negative predictive value was 85.1% (95% CI: 84.2-86.0). Self-sampling smears (22.9%) allowed better detection of micro-organisms than physician-collected samples (18.5%). Conclusions: This study shows that samples taken by women themselves (self-sampling) and by physicians have good diagnostic agreement. Self-sampling could be the method of choice in countries in which the coverage of women attending clinics for cervical cancer screening is poor.
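
The agreement measures reported in this abstract (sensitivity, specificity, predictive values, Kappa) are all computed from a 2x2 table of test result against gold standard. The sketch below shows the standard formulas; the function name is an assumption and the counts in the usage example are hypothetical, not the study's data.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 diagnostic measures plus Cohen's kappa.

    tp/fp/fn/tn are counts of the index test (e.g. self-sampling)
    against the reference test (e.g. physician sampling).
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1.0 - expected),
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formulas.
    for name, value in diagnostic_metrics(tp=40, fp=10, fn=10, tn=40).items():
        print(f"{name}: {value:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can be noticeably lower than the percentage of matching diagnoses.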