• Title/Summary/Keyword: representativeness of samples

Active Learning on Sparse Graph for Image Annotation

  • Li, Minxian;Tang, Jinhui;Zhao, Chunxia
    • KSII Transactions on Internet and Information Systems (TIIS), v.6 no.10, pp.2650-2662, 2012
  • Due to the semantic gap, the performance of automatic image annotation is still far from satisfactory. Active learning approaches offer a possible solution by selecting the most effective samples for users to label for training. One of the key research questions in active learning is how to select the most effective samples. In this paper, we propose a novel active learning approach based on a sparse graph. Compared with existing active learning approaches, the proposed method selects samples based on two criteria: uncertainty and representativeness. Representativeness indicates the contribution of a sample's label propagating to the other samples; existing approaches do not take representativeness into consideration. Extensive experiments show that bringing the representativeness criterion into the sample selection process can significantly improve the effectiveness of active learning.
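
The two selection criteria named in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' formulas, so everything here is an assumption made for illustration: uncertainty is taken as the binary entropy of a classifier probability, representativeness as the sum of a sample's outgoing graph weights, and the two are combined by a simple product. The names `binary_entropy`, `select_sample`, `probs`, and `weights` are hypothetical.

```python
import math

def binary_entropy(p):
    """Uncertainty (in bits) of a binary prediction with positive-class probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def select_sample(probs, weights, unlabeled):
    """Return the unlabeled index with the highest uncertainty * representativeness.

    probs[i]      -- hypothetical classifier probability for sample i
    weights[i][j] -- hypothetical graph weight: how strongly i's label propagates to j
    """
    def score(i):
        # Assumed representativeness: total weight flowing from i to the other samples.
        representativeness = sum(w for j, w in enumerate(weights[i]) if j != i)
        return binary_entropy(probs[i]) * representativeness
    return max(unlabeled, key=score)

# Toy data (invented): sample 1 is well connected but confidently classified,
# sample 0 is uncertain but weakly connected, sample 2 is both uncertain and connected.
probs = [0.5, 0.9, 0.5]
weights = [[0.0, 0.1, 0.1],
           [0.5, 0.0, 0.5],
           [0.4, 0.4, 0.0]]
print(select_sample(probs, weights, unlabeled=[0, 1, 2]))
```

With this toy data the product favors sample 2, which neither criterion alone would single out as clearly.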

A study on sensitivity of representativeness indicator in survey sampling (표본 추출법에서 R-지수의 민감도에 관한 연구)

  • Lee, Yujin;Shin, Key-Il
    • The Korean Journal of Applied Statistics, v.30 no.1, pp.69-82, 2017
  • The R-indicator (representativeness indicator) is used to check the representativeness of samples when non-response occurs. Representativeness is related to the accuracy of a parameter estimator, and accuracy is related to the bias of the estimator; hence, an unbiased estimator yields high accuracy. Therefore, a high value of the R-indicator guarantees accurate parameter estimation with a small bias. The R-indicator is calculated from propensity scores obtained by logit or probit modeling. In this paper we investigate the degree of relation between the R-indicator and different non-response rates in strata using simulation studies. We also analyze modified Korea Economic Census data as a real-data example.
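
As a rough illustration of how an R-indicator is obtained from propensity scores, the sketch below uses the commonly cited definition R = 1 - 2·S(ρ), where S(ρ) is the standard deviation of the estimated response propensities. The abstract does not state the exact estimator used in the paper, so this form, the equal design weights, and the function name `r_indicator` are assumptions; the propensities are taken as given rather than fitted by a logit or probit model.

```python
import statistics

def r_indicator(propensities):
    """R = 1 - 2 * S(rho), with S the sample standard deviation of the propensities.

    Assumes equal design weights; propensities would normally come from a
    logit or probit response model fitted beforehand.
    """
    return 1.0 - 2.0 * statistics.stdev(propensities)

# Identical propensities: response is perfectly representative, R = 1.
print(r_indicator([0.5, 0.5, 0.5]))
# Strongly varying propensities (invented values): R drops well below 1.
print(r_indicator([0.2, 0.8, 0.3, 0.9]))
```

Under this definition R lies close to 1 when all units are about equally likely to respond and decreases as response propensities diverge across units or strata.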

Preservice Secondary Mathematics Teachers' Statistical Literacy in Understanding of Sample (중등수학 예비교사들의 통계적 소양 : 표본 개념에 대한 이해를 중심으로)

  • Tak, Byungjoo;Ku, Na-Young;Kang, Hyun-Young;Lee, Kyeong-Hwa
    • The Mathematical Education, v.56 no.1, pp.19-39, 2017
  • Taking samples of data and using samples to make inferences about unknown populations are at the core of statistical investigations. An understanding of the nature of a sample, as part of statistical thinking, therefore belongs to the area of statistical literacy, since a statistical investigation can turn out to be totally useless if we do not appreciate the part sampling plays. However, the conception of sampling is a scheme of interrelated ideas entailing many statistical notions such as repeatability, representativeness, randomness, variability, and distribution. This complexity leads many people, teachers as well as students, to reason about statistical inference by relying on incorrect intuitions without understanding samples comprehensively. Some research has investigated how the concept of a sample is understood not only by students but also by teachers or preservice teachers; in this study we aim to identify preservice secondary mathematics teachers' understanding of sample as statistical literacy through a qualitative analysis. We designed four items that asked preservice teachers to write down their understanding of sampling tasks involving representativeness and variability. We then categorized similar responses and compared these categories with Watson's statistical literacy hierarchy. As a result, many preservice teachers turned out to lie at a low level of statistical literacy, as they ignored context and critical thinking, especially about sampling variability rather than sample representativeness. Moreover, the experience of taking statistics courses at university did not seem to contribute to the development of their statistical literacy. These findings should be considered when designing preservice teacher education programs to promote statistics education.

Evaluation of the Measurement Uncertainty from the Standard Operating Procedures(SOP) of the National Environmental Specimen Bank (국가환경시료은행 생태계 대표시료의 채취 및 분석 표준운영절차에 대한 단계별 측정불확도 평가 연구)

  • Lee, Jongchun;Lee, Jangho;Park, Jong-Hyouk;Lee, Eugene;Shim, Kyuyoung;Kim, Taekyu;Han, Areum;Kim, Myungjin
    • Journal of Environmental Impact Assessment, v.24 no.6, pp.607-618, 2015
  • Five years have passed since the first set of environmental samples was taken in 2011 to represent various ecosystems, so that future generations can trace back the past environment. Those samples have been preserved cryogenically in the National Environmental Specimen Bank (NESB) at the National Institute of Environmental Research. Even though a strict standard operating procedure (SOP) governs the whole sampling procedure to ensure that each sample represents the sampling area, it has not been put to the test for validation. This question needs to be answered to clear any doubts about the representativeness and the quality of the samples. In order to address the question and verify the sampling practice set in the SOP, the steps leading to the measurement of a sample, from sampling in the field to chemical analysis in the lab, are broken down to evaluate the uncertainty at each level. Of the 8 species currently taken for cryogenic preservation in the NESB, pine tree samples from two different sites were selected for this study. Duplicate samples were taken from each site according to the sampling protocol, followed by duplicate analyses carried out on each discrete sample. The uncertainties were evaluated by robust ANOVA; two levels of uncertainty, one from the sampling practice and the other from the analytical process, were then combined to give the measurement uncertainty on a measured concentration of the measurand. As a result, it was confirmed that it is the sampling practice, not the analytical process, that accounts for most of the measurement uncertainty. Based on the top-down approach to measurement uncertainty, the efficient way to ensure the representativeness of the sample was to increase the quantity of each discrete sample used to make a composite sample, rather than to increase the number of discrete samples across the site.
Furthermore, a cost-effective way to raise the confidence level of the measurement is to lower the sampling uncertainty, not the analytical uncertainty. To test the representativeness of a composite sample of a sampling area, the variance within the site should be less than the difference from duplicate sampling. For that, a criterion was proposed: $s^2_{geochem}$ (across-the-site variance) < $s^2_{samp}$ (variance at the sampling location). In light of the criterion, the two representative samples for the two study areas passed the requirement. In contrast, whenever the variance among the sampling locations (i.e. across the site) is larger than the sampling variance, more sampling increments need to be added within the sampling area until the requirement for representativeness is met.
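
The duplicate-sample, duplicate-analysis design described in this abstract can be sketched as a variance decomposition. The authors report using robust ANOVA; the sketch below swaps in plain classical ANOVA for a balanced design (two samples per site, two analyses per sample), which is simpler but more sensitive to outliers. The function names and the concentration values are invented for illustration.

```python
def duplicate_anova(sites):
    """Classical variance decomposition for a balanced duplicate design.

    sites: list of ((a11, a12), (a21, a22)), i.e. two samples per site and
    two analyses per sample. Returns (s2_samp, s2_anal, s2_meas).
    """
    n_sites = len(sites)
    analysis_sum = 0.0
    between_sum = 0.0
    for sample1, sample2 in sites:
        for a, b in (sample1, sample2):
            # Half the squared difference of duplicate analyses estimates s2_anal.
            analysis_sum += (a - b) ** 2 / 2.0
        m1 = sum(sample1) / 2.0
        m2 = sum(sample2) / 2.0
        # Half the squared difference of sample means estimates s2_samp + s2_anal / 2.
        between_sum += (m1 - m2) ** 2 / 2.0
    s2_anal = analysis_sum / (2 * n_sites)
    s2_samp = max(0.0, between_sum / n_sites - s2_anal / 2.0)
    return s2_samp, s2_anal, s2_samp + s2_anal

def passes_criterion(s2_geochem, s2_samp):
    """Criterion from the abstract: across-the-site variance below sampling variance."""
    return s2_geochem < s2_samp

# Invented concentrations for one site: sample 1 analysed as (10, 12), sample 2 as (14, 16).
print(duplicate_anova([((10.0, 12.0), (14.0, 16.0))]))
```

In this toy case the sampling variance dominates the analytical variance, which mirrors the qualitative finding reported in the abstract; the numbers themselves carry no meaning.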

A Study on the Concept of Sample by a Historical Analysis (표본 개념에 대한 고찰: 역사적 분석을 중심으로)

  • Tak, Byungjoo;Ku, Na Young;Kang, Hyun-Young;Lee, Kyeong-Hwa
    • School Mathematics, v.16 no.4, pp.727-743, 2014
  • The concepts of sample and sampling are central to statistical thinking and are foundations of statistical literacy, so their importance needs to be emphasized in statistics education. However, many studies dealing with samples only analyze textbooks or students' responses. In this study, the concept of sample is addressed through a historical consideration, which is one aspect of didactical analysis. Moreover, the development of the concept of sample is analyzed based on preceding studies on statistical literacy, considering sample representativeness and sampling variability. The results show that the historical development of the concept of sample can be divided into three steps: understanding sample representativeness; the appearance of sample variance; and recognizing sampling variability. Above all, it is important to be aware of and control sampling variability, but many related studies may not have considered sampling variability. Therefore, the awareness and control of sampling variability need to be reflected in the teaching and learning of sample to develop students' statistical literacy.

A Study on the Statistical Representativeness of Samples taken from Radioactive Soil (방사성 토양폐기물 시료의 통계적 대표성에 관한 연구)

  • Cho Han-Seok;Kim T.K.;Lee K.M.;Ahn S.J.;Shon J.S.
    • Proceedings of the Korean Radioactive Waste Society Conference, 2005.06a, pp.151-157, 2005
  • For the regulatory clearance of soils, a procedure for analyzing radionuclides and radioactivity concentration is under development. A soil sampling strategy, including random sampling after homogenization and standardization, was set up. Statistical representativeness is considered not only for the sampling strategy but also for the sample size. In this study, the sample size was designed using the confidence interval and error bound of the soil, based on pilot samples taken following the sampling strategy.
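
One textbook way to design a sample size from a confidence level, an error bound, and pilot samples is the normal-approximation formula n = (z·s / E)², where s is the pilot standard deviation and E the allowed error bound. The abstract does not say which formula the authors used, so this is only a sketch under that assumption; `required_sample_size` and the pilot values are hypothetical.

```python
import math
import statistics

def required_sample_size(pilot, error_bound, confidence=0.95):
    """Smallest n with z * s / sqrt(n) <= error_bound (normal approximation).

    pilot       -- measurements from the pilot samples (invented values below)
    error_bound -- allowed half-width of the confidence interval, same units
    """
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    s = statistics.stdev(pilot)
    return math.ceil((z * s / error_bound) ** 2)

# Pilot activity concentrations (invented) with standard deviation 2.0.
pilot = [8.0, 10.0, 12.0]
print(required_sample_size(pilot, error_bound=0.5))
```

Tightening the error bound or raising the confidence level increases the required n; with very few pilot samples a t-based iteration would be more conservative than this normal approximation.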

Pre-service Teachers' Understanding of Statistical Sampling (예비교사들의 통계적 표집에 대한 이해)

  • Ko, Eun-Sung;Lee, Kyeong-Hwa
    • Journal of Educational Research in Mathematics, v.21 no.1, pp.17-32, 2011
  • This study investigated pre-service teachers' understanding of statistical sampling. The researchers categorized major topics related to sampling into representativeness of samples, sampling variability, and sampling distribution, and selected concepts connected to each topic. The findings of this study are as follows: Even though most of the pre-service teachers considered random sampling, which brings unbiased outcomes, to be a proper sampling method, only 64% of them recognized that a sample is a quasi-proportional, small-scale version of the population; few pre-service teachers understood that what matters more is the size of the sample, not the proportion of the sample to the population, and half of them appreciated that the number of samplings has a more powerful effect on drawing reliable results than the size of the sample; few pre-service teachers understood that the sampling distribution is irrelevant to the shape of the population and has a symmetrical bell shape.

Study on the Levels of Informal Statistical Inference of the Middle and High School Students (중·고등학생들의 비형식적 통계적 추리의 수준 연구)

  • Lee, Jung Yeon;Lee, Kyeong Hwa
    • School Mathematics, v.19 no.3, pp.533-551, 2017
  • Statistics education researchers advise instructors to teach informal statistical inference, and they are paying close attention to the development of statistical inference in general. This study analyzed the levels, and the traits of each level, of middle and high school students' informal statistical inference when comparing samples of data and estimating the graph of a population. The results show that five levels of informal statistical inference were identified for comparing samples of data: responses that are distracted or misled by an irrelevant aspect, responses that focus on frequencies of individual data points and hold a local view of the sample data sets, responses in which the student's view of the data is transitioning from local to global, responses that hold a global view but do not clearly integrate multiple aspects of the distribution, and responses that integrate multiple aspects of the distribution. Another five levels of informal statistical inference were identified for estimating the graph of a population: responses that are distracted or misled by an irrelevant aspect, responses that focus only on representativeness, responses that consider both representativeness and variability and focus on one particular aspect of the distribution, responses that focus on multiple aspects of the distribution but do not clearly integrate them, and responses that integrate multiple aspects of the distribution.

Surveillance Evaluation of the National Cancer Registry in Sabah, Malaysia

  • Jeffree, Saffree Mohammad;Mihat, Omar;Lukman, Khamisah Awang;Ibrahim, Mohd Yusof;Kamaludin, Fadzilah;Hassan, Mohd Rohaizat;Kaur, Nirmal;Myint, Than
    • Asian Pacific Journal of Cancer Prevention, v.17 no.7, pp.3123-3129, 2016
  • Background: Cancer is the fourth leading cause of death in Sabah, Malaysia, with a reported age-standardized incidence rate of 104.9 per 100,000 in 2007. The incidence rate depends on non-mandatory notification to the registry. Under-reporting will give a false picture of the effectiveness of the cancer control program. The present study aimed to evaluate the performance of the cancer registry system in terms of representativeness, data quality, simplicity, acceptability and timeliness, and to provide recommendations for improvement. Materials and Methods: The evaluation was conducted among key informants in the National Cancer Registry (NCR) and reporting facilities from Feb-May 2012 and was based on US CDC guidelines. Representativeness was assessed by matching cancer cases in the Health Information System (HIS) and state pathology records with those in the NCR. Data quality was measured through case finding and re-abstracting of medical records by independent auditors. The re-abstracting portion comprised 15 data items. Self-administered questionnaires were used to assess simplicity and acceptability. Timeliness was measured from date of diagnosis to date of notification received and data dissemination. Results: Of 4613 cancer cases reported in HIS, 83.3% were matched with the cancer registry. In the state pathology centre, 99.8% were notified to the registry. Duplication of notification was 3%. Data completeness calculated for 104 samples was 63.4%. Registrars perceived simplicity in coding diagnosis as moderate. The notification process was moderately acceptable. Median duration of interval 1 was 5.7 months. Conclusions: The performance of the registry's attributes is fairly positive in terms of simplicity, case reporting sensitivity, and predictive value positive. It is moderately acceptable, data completeness and inflexible. The usefulness of the registry is the area of concern for achieving registry objectives.
Timeliness of reporting is within the international standard, whereas the time to data dissemination was longer, up to 4 years. Integration between the existing HIS and the national registration department will improve data quality.

Mediating Effects of Emotional Venting via Instant Messaging (IM) and Positive Emotion in the Relationship between Negative Emotion and Depression (부정적 정서와 우울의 관계에서 인스턴트 메시징(Instant Messaging)을 통한 감정 표출과 긍정적 정서의 매개효과)

  • Lee, Hannah;An, Soontae
    • Research in Community and Public Health Nursing, v.30 no.4, pp.571-580, 2019
  • Purpose: The purpose of this study is to examine the mediating effects of emotional venting via instant messaging (IM) and positive emotion in the relationship between negative emotion and depression. Methods: An online survey was conducted in Korea between 2 April and 7 April 2019. To obtain a representative sample, data were gathered by a professional research firm. A total of 250 Koreans participated in this study. The collected data were analyzed using descriptive statistics, Pearson's correlation coefficients, and the SPSS PROCESS macro to test the mediating effects. Results: This study analyzed the direct/indirect effects of negative emotion on emotional venting via IM, in the relationship between positive emotion and depression. Negative emotion had indirect effects on depression through emotional venting via IM and positive emotion. Both emotional venting via IM and positive emotion had dual mediating effects in the influence of negative emotion on depression. Conclusion: These results suggest that it is important to manage negative emotion to prevent depression. Also, this study confirmed that emotional venting via IM is a powerful factor influencing emotional recovery.