• Title/Summary/Keyword: Probability Inferences


Development of an Item Selection Method for Test-Construction by using a Relationship Structure among Abilities

  • Kim, Sung-Ho;Jeong, Mi-Sook;Kim, Jung-Ran
    • Communications for Statistical Applications and Methods
    • /
    • v.8 no.1
    • /
    • pp.193-207
    • /
    • 2001
  • When designing a test set, we need to consider constraints on items that are deemed important by item developers or test specialists. The constraints are essentially on the components of the test domain or the abilities relevant to a given test set. If the test domain could be represented in a more refined form, test construction could therefore be carried out more efficiently. We assume that relationships among task abilities are representable by a causal model and that item response theory (IRT) is not fully available for them. In such a case we cannot apply traditional item selection methods that are based on IRT. In this paper, we use entropy as an uncertainty measure for making inferences on task abilities and develop an optimal item selection algorithm that reduces the entropy of task abilities the most when items are selected from an item pool.

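The entropy-based selection idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: it assumes a single discrete ability variable with a known prior and known correct-response probabilities per ability state, and it picks the item with the lowest expected posterior entropy. The function names, the two-state prior, and the item pool values are all hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy (in bits) of a discrete distribution given as a list."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def expected_posterior_entropy(prior, p_correct):
    """Expected entropy of the ability state after observing one item response.

    prior[a]     : current probability of ability state a (assumed known)
    p_correct[a] : probability of a correct response in state a (assumed known)
    """
    total = 0.0
    # Loop over the two possible responses: correct, then incorrect.
    for likelihood in (p_correct, [1 - p for p in p_correct]):
        joint = [pa * lik for pa, lik in zip(prior, likelihood)]
        p_response = sum(joint)
        if p_response > 0:
            posterior = [j / p_response for j in joint]
            total += p_response * entropy(posterior)
    return total


def select_item(prior, item_pool):
    """Return the name of the item that minimizes expected posterior entropy."""
    return min(item_pool,
               key=lambda name: expected_posterior_entropy(prior, item_pool[name]))


# Hypothetical two-state ability (low, high) and a two-item pool.
prior = [0.5, 0.5]
item_pool = {
    "item_A": [0.5, 0.5],  # same success rate in both states: uninformative
    "item_B": [0.2, 0.9],  # success rate depends strongly on the state
}
best = select_item(prior, item_pool)
print(best)
```

Repeating the selection after updating the prior with each observed response would give a greedy item sequence; a causal model over several related abilities, as in the paper, would replace the single prior with a joint distribution.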

Bayesian Multiple Change-Point Estimation and Segmentation

  • Kim, Jaehee;Cheon, Sooyoung
    • Communications for Statistical Applications and Methods
    • /
    • v.20 no.6
    • /
    • pp.439-454
    • /
    • 2013
  • This study presents a Bayesian multiple change-point detection approach to segment and classify observations that no longer come from an initial population after a certain time. Inferences are based on the multiple change-points in a sequence of random variables where the probability distribution changes. Bayesian multiple change-point estimation classifies each observation into a segment. We use a truncated Poisson distribution for the number of change-points and a conjugate prior for the exponential family distributions. The Bayesian method can lead to the unsupervised classification of discrete variables, continuous variables, and multivariate vectors based on latent class models; therefore, the solution for change-points corresponds to the stochastic partitions of observed data. We demonstrate segmentation with real data.

Efficient random number generation from extreme tail areas of a t-distribution (t 분포의 극단 꼬리부분으로부터의 효율적인 난수생성)

  • 오만숙;김나영
    • The Korean Journal of Applied Statistics
    • /
    • v.9 no.1
    • /
    • pp.165-177
    • /
    • 1996
  • It is often necessary to generate random numbers from truncated t-distributions to carry out Bayesian inferences, especially in Monte Carlo integration for estimation of posterior densities of constrained parameters. However, when the restricted area is an extreme tail area with a small probability, most existing random number generation methods are not efficient. In this paper, we propose an efficient acceptance-rejection method to generate random numbers from extreme tail areas of a t-distribution. Using some simulation results, we compare the proposed algorithm with other popular methods.

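To make the acceptance-rejection idea concrete, here is a minimal sketch of one possible tail sampler. It is not the algorithm proposed in the paper: it assumes a Pareto proposal with density proportional to x^-(df+1) on x >= a, for which the ratio of the t density to the proposal is bounded and the acceptance probability works out to (x^2 / (df + x^2))^((df+1)/2). The function name and parameters are hypothetical.

```python
import random


def sample_t_tail(df, a, n, rng):
    """Draw n values from a Student t distribution with df degrees of freedom,
    conditioned on the upper tail X >= a (a > 0), by acceptance-rejection."""
    if df <= 0 or a <= 0:
        raise ValueError("df and a must be positive")
    power = (df + 1) / 2
    samples = []
    while len(samples) < n:
        # Pareto proposal on [a, inf) via inverse-CDF sampling; u lies in (0, 1].
        u = 1.0 - rng.random()
        x = a * u ** (-1.0 / df)
        # Acceptance probability: t-density / proposal ratio divided by its supremum.
        accept_prob = (x * x / (df + x * x)) ** power
        if rng.random() < accept_prob:
            samples.append(x)
    return samples


# Example: 1000 draws from the tail beyond 6 of a t distribution with 5 df.
draws = sample_t_tail(df=5, a=6.0, n=1000, rng=random.Random(0))
print(min(draws), max(draws))
```

Under these assumptions the acceptance probability is at least (a^2 / (df + a^2))^((df+1)/2), so the sampler becomes more efficient as the truncation point moves further into the tail, whereas naive rejection from the full t-distribution becomes less efficient.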

Environmental Survey Data Analysis by Data Fusion Technique

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집)
    • /
    • 2006.11a
    • /
    • pp.21-27
    • /
    • 2006
  • Data fusion is generally defined as the use of techniques that combine data from multiple sources and gather that information in order to make inferences. Data fusion is also called data combination or data matching. It is divided into five types: exact matching, judgemental matching, probability matching, statistical matching, and data linking. Gyeongnam province currently conducts a social survey of its residents every year. However, analysis is limited because different surveys are conducted on a three-year cycle. In this paper, we study data fusion of environmental survey data using SAS macros. The data fusion outputs can be used for environmental preservation and environmental improvement.


Software Packages for Survey Data Analysis (조사 데이터 분석용 소프트웨어 패키지)

  • 성내경
    • Survey Research
    • /
    • v.1 no.1
    • /
    • pp.109-123
    • /
    • 2000
  • In order to make statistically valid inferences for survey data based on complex probability sample designs, survey researchers must incorporate the sample design in the data analysis. If this is not the case, the variance estimates of survey statistics derived under the usual assumptions of simple random sampling from an infinite population generally underestimate the true variance, which results in a high Type I error level. In this article we introduce new software packages dedicated to analyzing complex survey data. In particular, we summarize the analysis capabilities of SUDAAN Version 7.5 and SAS Version 8.


An Analysis of Middle School Student's Eye Movements in the Law of Large Numbers Simulation Activity (큰 수의 법칙 시뮬레이션에서 중학생의 안구 운동 분석)

  • Choi, In Yong;Cho, Han Hyuk
    • The Mathematical Education
    • /
    • v.56 no.3
    • /
    • pp.281-300
    • /
    • 2017
  • This study analyzed the difficulties of middle school students in a computer simulation of the law of large numbers through eye movement analysis. Some students did not attend to the simulation results and could not make meaningful inferences. Students were observed to keep their existing concepts even when they observed simulation results that were inconsistent with their misconceptions. Since probabilistic intuition influences students' thinking very strongly, it is necessary to design a task that allows students to clearly recognize the difference between their erroneous intuitions and the simulation results. In addition, we could confirm through eye movement analysis that students could not make meaningful observations and inferences if too much reasoning was needed, even though the simulation included a rich context. Rather than simply presenting experimental data, it is necessary to use visual representations such as graphs to provide immediate feedback and to encourage students to attend to the results in an intentional way so that they discover the underlying mathematical structure. Some students focused their attention on a visually salient feature of the experimental results and drew incorrect conclusions. The simulation should be designed so that the patterns in the experimental results that students must discover are not visually distorted, and so that students can perform a sufficient number of simulations. Based on the results of this study, we suggest that a cumulative relative frequency graph showing multiple results at the same time, and the term 'generally tends to get closer', should be used in learning the law of large numbers. In addition, it was confirmed that eye tracking is a useful tool for analyzing interaction in technology-based probabilistic learning.
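The cumulative relative frequency graph recommended in this abstract is built from a running proportion of successes. The following is a minimal sketch of that computation, assuming a fair coin and three independent runs; the function name and the seeds are hypothetical, and this is not the simulation software used in the study.

```python
import random


def cumulative_relative_frequency(n_trials, p_success, rng):
    """Return the running proportion of successes after each of n_trials trials."""
    freqs = []
    successes = 0
    for trial in range(1, n_trials + 1):
        if rng.random() < p_success:
            successes += 1
        freqs.append(successes / trial)
    return freqs


# Three independent runs, so that several results can be compared side by side.
runs = [cumulative_relative_frequency(10_000, 0.5, random.Random(seed))
        for seed in range(3)]
for freqs in runs:
    # Show the running proportion at a few checkpoints of increasing size.
    print([round(freqs[k - 1], 3) for k in (10, 100, 1_000, 10_000)])
```

Plotting each list in `runs` against the trial number gives the kind of graph the abstract describes: the curves fluctuate widely at first and generally tend to get closer to 0.5 as the number of trials grows, without any guarantee of moving closer at every step.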

Statistical Applications for the Prediction of White Hispanic Breast Cancer Survival

  • Khan, Hafiz Mohammad Rafiqullah;Saxena, Anshul;Gabbidon, Kemesha;Ross, Elizabeth;Shrestha, Alice
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.15 no.14
    • /
    • pp.5571-5575
    • /
    • 2014
  • Background: The ability to predict the survival time of breast cancer patients is important because of the potentially high morbidity and mortality associated with the disease. To develop a predictive inference for determining the survival of breast cancer patients, we applied a novel Bayesian method. In this paper, we propose the development of a data-based statistical probability model and application of the Bayesian method to predict future survival times for White Hispanic female breast cancer patients diagnosed in the US during 1973-2009. Materials and Methods: A stratified random sample of White Hispanic female patient survival data was selected from the Surveillance Epidemiology and End Results (SEER) database to derive statistical probability models. Four models were considered to identify the best-fit model. We used three standard model-building criteria, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the Deviance Information Criterion (DIC), to measure the goodness of fit. Furthermore, the Bayesian method was used to derive predictive inferences for future survival times. Results: The highest number of White Hispanic female breast cancer patients in this sample was from New Mexico and the lowest from Hawaii. The mean (SD) age at diagnosis (years) was 58.2 (14.2). The mean (SD) of survival time (months) for White Hispanic females was 72.7 (32.2). We found that the exponentiated Weibull model best fit the survival times compared to other widely known statistical probability models. The predictive inference for future survival times is presented using the Bayesian method. Conclusions: The findings are significant for treatment planning and health-care cost allocation. They should also contribute to further research on breast cancer survival issues.
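For reference, the three criteria named in this abstract have the following standard definitions. The notation is introduced here and is not taken from the paper: k is the number of model parameters, n the sample size, \hat{L} the maximized likelihood, D(\theta) = -2 \ln L(\theta) the deviance, \bar{D} its posterior mean, and \bar{\theta} the posterior mean of the parameters.

```latex
\begin{align}
\mathrm{AIC} &= 2k - 2\ln\hat{L}, \\
\mathrm{BIC} &= k\ln n - 2\ln\hat{L}, \\
\mathrm{DIC} &= \bar{D} + p_D, \qquad p_D = \bar{D} - D(\bar{\theta}).
\end{align}
```

For all three, a smaller value indicates a better trade-off between fit and model complexity, which is how a best-fit model is chosen among several candidates.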

An Analysis of the Probability Unit in the Middle School Textbook 8-B in the Aspect of Information Analysis and Utilization (정보 분석 및 활용 측면에서의 중학교 2학년 확률 단원 분석)

  • Lee, Young-Ha;Kwon, Se-Lim
    • School Mathematics
    • /
    • v.11 no.3
    • /
    • pp.389-413
    • /
    • 2009
  • This thesis assumes that the teaching objective of the Probability unit of the 8th grade textbook under the 7th National Curriculum is to enhance the ability to analyze and utilize information, and we examine whether this point of view is fully reflected. Based on the textbook analysis, the following are found. 1) It is necessary to place more emphasis on enumerating all possible cases and to induce formulae for counting the number of possible cases by organizing them. 2) Probability should be described more clearly as the likelihood of events and should be introduced and developed through students' various experiences and relative frequencies. Less emphasis on probability computations and more emphasis on probability comparisons of events are recommended. 3) The term "influential events" (a kind of stochastic correlation) is ambiguous. It is necessary to make clear what it means at the level of the 8th grade or to discard it, since it is to be learned again at the 10th grade. In particular, the contingency table is introduced at the 9th grade under the 7th National Curriculum. 4) Use of the likelihood principle in making a decision and in learning its reliability should be encouraged, and students should learn the hazard of transitive inferences in probability comparisons. As a consequence of the above, we feel that textbook authors and related stakeholders should take more seriously the behavioral changes in students that may come with the didactics of specific contents of school mathematics.


Analysis on the Changes of Choices according to the Conditions in the Realistic Probability Problem of the Elementary Gifted Students (확률 판단 문제에서 초등 수학영재들의 선택에 미친 요인 분석과 교육적 시사점)

  • Lee, Seung Eun;Song, Sang Hun
    • School Mathematics
    • /
    • v.15 no.3
    • /
    • pp.603-617
    • /
    • 2013
  • The major purpose of this article is to examine what kind of gap exists between mathematically gifted students' probability knowledge and their actual application of that knowledge, and then to analyze the cause of the gap. To this end, 23 elementary mathematically gifted students at the highest level from G region were given problem situations involving probability and expectation, presented as a series in which the conditions change one at a time. The study task is a gaming situation in which there is a mathematically most reasonable answer, but the choice may differ according to how much weight students give to a certain condition. To collect data, the students' individual worksheets were collected, all class procedures were recorded with a camcorder, and the researcher wrote a class observation report. The main reasons why the students did not make decisions solely on the basis of their mathematical knowledge were 'impracticality', one of the properties of probability, meaning that in reality not everything turns out according to the mathematical calculation and outcomes cannot be anticipated, and their own psychological disposition to 'avoid loss' of the entry fee they had paid. In order to provide desirable probability education, we should not limit ourselves to having learners master the probability knowledge in the textbook by solving problems based on algorithmic knowledge; we should also provide them with plenty of experience in applying probabilistic inference to make their own choices in diverse, contextualized situations.


Statistical Estimates from Black Non-Hispanic Female Breast Cancer Data

  • Khan, Hafiz Mohammad Rafiqullah;Ibrahimou, Boubakari;Saxena, Anshul;Gabbidon, Kemesha;Abdool-Ghany, Faheema;Ramamoorthy, Venkataraghavan;Ullah, Duff;Stewart, Tiffanie Shauna-Jeanne
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.15 no.19
    • /
    • pp.8371-8376
    • /
    • 2014
  • Background: The use of statistical methods has become imperative in breast cancer survival data analysis. The purpose of this study was to develop the best statistical probability model using the Bayesian method to predict future survival times for black non-Hispanic female breast cancer patients diagnosed during 1973-2009 in the U.S. Materials and Methods: We used a stratified random sample of black non-Hispanic female breast cancer patient data from the Surveillance Epidemiology and End Results (SEER) database. Survival analysis was performed using Kaplan-Meier and Cox proportional hazards regression methods. Four advanced types of statistical models, Exponentiated Exponential (EE), Beta Generalized Exponential (BGE), Exponentiated Weibull (EW), and Beta Inverse Weibull (BIW), were utilized for data analysis. The statistical model-building criteria, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the Deviance Information Criterion (DIC), were used to measure the goodness of fit. Furthermore, we used the Bayesian approach to obtain predictive survival inferences from the best-fit model, the exponentiated Weibull model. Results: We identified the highest number of black non-Hispanic female breast cancer patients in Michigan and the lowest in Hawaii. The mean (SD) age at diagnosis (years) was 58.3 (14.43). The mean (SD) survival time (months) for black non-Hispanic females was 66.8 (30.20). Non-Hispanic blacks had a significantly increased risk of death compared to Black Hispanics (hazard ratio: 1.96, 95% CI: 1.51-2.54). Compared to other statistical probability models, we found that the exponentiated Weibull model fits the survival times better. By making use of the Bayesian method, predictive inferences for future survival times were obtained. Conclusions: These findings will be of great significance in determining appropriate treatment plans and health-care cost allocation. Furthermore, the same approach should contribute to building future predictive models for any health-related diseases.
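For readers unfamiliar with the exponentiated Weibull model named in this abstract, one common parameterization is shown below. The symbols are assumptions introduced here, since the abstract does not state the parameterization used in the paper: \alpha > 0 and \theta > 0 are shape parameters and \sigma > 0 is a scale parameter, with survival time x > 0.

```latex
\begin{align}
F(x) &= \left[1 - \exp\!\left(-\left(\tfrac{x}{\sigma}\right)^{\alpha}\right)\right]^{\theta}, \\
f(x) &= \frac{\alpha\theta}{\sigma}\left(\tfrac{x}{\sigma}\right)^{\alpha-1}
        \exp\!\left(-\left(\tfrac{x}{\sigma}\right)^{\alpha}\right)
        \left[1 - \exp\!\left(-\left(\tfrac{x}{\sigma}\right)^{\alpha}\right)\right]^{\theta-1}.
\end{align}
```

Setting \theta = 1 recovers the ordinary Weibull distribution, and setting \alpha = 1 gives the exponentiated exponential model, which is also among the candidates listed in the abstract.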