• Title/Summary/Keyword: data sampling

A Cost Effective Reference Data Sampling Algorithm Using Fractal Analysis (프랙탈 분석을 통한 비용효과적인 기준 자료추출알고리즘에 관한 연구)

  • 김창재
    • Spatial Information Research / v.8 no.1 / pp.171-182 / 2000
  • Random or systematic sampling is commonly used to assess the accuracy of classification results. In remote sensing, these sampling methods require much time and tedious work to acquire sufficient ground truth data, so a more effective sampling method that retains the characteristics of the population is needed. In this study, fractal analysis is adopted as an index for reference sampling. The fractal dimensions of the whole study area and of its sub-regions are calculated in order to choose the sub-regions whose dimensionality is most similar to that of the whole area. The classification accuracy of the whole area is then compared with that of each selected sub-region, and it is verified that the accuracies of the selected sub-regions are similar to that of the whole area. Based on this procedure, a new reference sampling method is proposed. The results show that the sampling area and sample size can be reduced while maintaining the same accuracy-test results as existing methods. Thus, the proposed method is shown to be cost-effective for reference data sampling.
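
The sub-region selection step described in this abstract can be sketched as follows. The abstract does not say how the fractal dimension is computed, so this sketch assumes a box-counting estimate on a square binary grid (for example, one class of a classified image). The function names `box_count`, `box_counting_dimension`, and `most_similar_subregion` are hypothetical and are not taken from the paper.

```python
import math

def box_count(grid, size):
    """Count boxes of side `size` that contain at least one occupied cell."""
    n = len(grid)
    count = 0
    for r in range(0, n, size):
        for c in range(0, n, size):
            if any(grid[i][j]
                   for i in range(r, min(r + size, n))
                   for j in range(c, min(c + size, n))):
                count += 1
    return count

def box_counting_dimension(grid, sizes=(1, 2, 4, 8, 16)):
    """Least-squares slope of log N(s) against log(1/s).

    Assumes the grid is square and has at least one occupied cell.
    """
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(grid, s)) for s in sizes]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def most_similar_subregion(whole, subregions):
    """Index of the sub-region whose dimension is closest to the whole area's."""
    target = box_counting_dimension(whole)
    return min(range(len(subregions)),
               key=lambda k: abs(box_counting_dimension(subregions[k]) - target))

# Toy grids: a filled square (dimension 2) and a diagonal line (dimension 1).
filled_32 = [[1] * 32 for _ in range(32)]
diagonal_32 = [[1 if i == j else 0 for j in range(32)] for i in range(32)]
filled_16 = [[1] * 16 for _ in range(16)]
diagonal_16 = [[1 if i == j else 0 for j in range(16)] for i in range(16)]
```

With these toy grids, `most_similar_subregion(diagonal_32, [filled_16, diagonal_16])` picks the diagonal sub-region, because its estimated dimension matches that of the whole grid.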

The systematic sampling for inferring the survey indices of Korean groundfish stocks

  • Hyun, Saang-Yoon;Seo, Young IL
    • Fisheries and Aquatic Sciences / v.21 no.8 / pp.24.1-24.9 / 2018
  • The Korean bottom trawl survey has been deployed on a regular basis for about the last decade as part of groundfish stock assessments. The regularity means that groundfish are sampled once per grid cell, whose sides are half a degree of latitude and half a degree of longitude, respectively, and whose interior is further divided into nine nested grids. Unless there is a special reason (e.g., running into a rocky bottom), the sample location is the center grid of the nine nested grids. Given the data collected by the survey, we intended to show how to appropriately estimate not only the survey index of a fish stock but also its uncertainty. Because of this regularity, we applied systematic sampling theory for these purposes and compared its results with a reference based on simple random sampling. Using the survey data on 11 fish stocks collected by the spring and fall surveys in 2014, the survey indices estimated under systematic sampling were overall more precise than those under simple random sampling. For the survey indices in number, the standard errors under systematic sampling were 0.23 to 27.44% lower than those under simple random sampling, while for the survey indices in weight they were 0.04 to 31.97% lower. In terms of bias, systematic sampling was the same as simple random sampling. Our paper is the first to formally show how to apply systematic sampling theory to the actual data collected by the Korean bottom trawl surveys.
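
To illustrate why the two approaches can report different standard errors for the same sample mean, here is a minimal sketch. The abstract does not state which systematic-sampling variance estimator was used, so this sketch assumes the common successive-difference estimator and ignores the finite population correction. Both function names are hypothetical.

```python
import math

def srs_standard_error(values):
    """Standard error of the mean under simple random sampling (no fpc)."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(s2 / n)

def successive_difference_standard_error(values):
    """Standard error of the mean from squared differences of neighbours.

    `values` must be in the order in which the systematic sample was taken.
    """
    n = len(values)
    d2 = sum((values[i] - values[i - 1]) ** 2 for i in range(1, n))
    return math.sqrt(d2 / (2 * n * (n - 1)))

# Toy catches with a smooth spatial trend along the survey track.
catches = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
se_srs = srs_standard_error(catches)
se_sys = successive_difference_standard_error(catches)
reduction_percent = 100 * (se_srs - se_sys) / se_srs
```

When neighbouring stations are similar, as in the toy data, the successive-difference estimate is smaller because the trend is not counted as sampling noise. The point estimate (the sample mean) is identical under both approaches.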

An Effective Design of Process Mean Control Chart in Subgroups Based on Cluster Sampling Type

  • Nam, Ho-Soo
    • Journal of the Korean Data and Information Science Society / v.14 no.4 / pp.939-950 / 2003
  • Control charts are a very useful tool for monitoring process characteristics. This paper discusses the design of control limits when the subgroups are formed by cluster-type sampling. As an alternative method for designing control limits, the XbBar chart is proposed, which bases the control limits on the variation between subgroups instead of the classical variation within subgroups. Two examples illustrate the reasonable design of control limits and the conditions of subgroups based on cluster sampling. Through these examples, guidelines for setting proper control limits are proposed.
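
The contrast between within-subgroup and between-subgroup control limits can be sketched as below. The abstract does not give the paper's formulas, so this is only an assumed simplification: the between-subgroup limits use the sample standard deviation of the subgroup means, and the within-subgroup limits use a pooled variance rather than the classical range-based constants. Function names are hypothetical.

```python
import math
import statistics

def between_subgroup_limits(subgroups, k=3.0):
    """Limits from the spread of the subgroup means themselves."""
    means = [statistics.mean(g) for g in subgroups]
    center = statistics.mean(means)
    sigma_between = statistics.stdev(means)
    return center - k * sigma_between, center, center + k * sigma_between

def within_subgroup_limits(subgroups, k=3.0):
    """Limits from pooled within-subgroup variance (equal subgroup sizes)."""
    means = [statistics.mean(g) for g in subgroups]
    center = statistics.mean(means)
    n = len(subgroups[0])
    pooled_var = statistics.mean([statistics.variance(g) for g in subgroups])
    sigma_mean = math.sqrt(pooled_var / n)
    return center - k * sigma_mean, center, center + k * sigma_mean

# Toy cluster-type subgroups: items within a subgroup are nearly identical,
# while the subgroup means differ noticeably.
subgroups = [
    [10.0, 10.1, 9.9],
    [12.0, 12.1, 11.9],
    [8.0, 8.1, 7.9],
    [11.0, 11.1, 10.9],
]
lcl_b, center_b, ucl_b = between_subgroup_limits(subgroups)
lcl_w, center_w, ucl_w = within_subgroup_limits(subgroups)
```

In this toy case the within-subgroup limits are far too narrow, so nearly every subgroup mean would signal, which is the situation the abstract describes for cluster-type subgroups.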

Modified Multi-Level Skip-Lot Sampling Plans

  • Cho, Gyo-Young;Choi, Eun-Jung
    • Journal of the Korean Data and Information Science Society / v.14 no.4 / pp.915-927 / 2003
  • This paper generalizes the modified two-level skip-lot sampling plan (MTSkSP1) to n levels. General formulas for the operating characteristic (OC) function, average sample number (ASN), and average outgoing quality (AOQ) of the plan are derived using Markov chain properties. The OC curves, ASNs, and AOQs of a reference plan and of the modified two-level, three-level, and five-level skip-lot sampling plans are compared.

$\bar{X}$ Control Charts with Variable Sample Sizes and Variable Sampling Intervals

  • Lee, Jae-Heon
    • Journal of the Korean Data and Information Science Society / v.14 no.3 / pp.429-440 / 2003
  • Variable sampling rate (VSR) control charts vary the sampling interval and/or the sample size according to the value of the control statistic. It is known that $\bar{X}$ charts with a VSR scheme lead to large improvements in performance over those with a fixed sampling rate (FSR) scheme. In this paper, we studied $\bar{X}$ charts with several VSR schemes and compared their statistical performance with each other.

Bootstrap Confidence Intervals for a One Parameter Model using Multinomial Sampling

  • Jeong, Hyeong-Chul;Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / v.10 no.2 / pp.465-472 / 1999
  • We considered a bootstrap method for constructing confidence intervals for a one-parameter model using multinomial sampling. The convergence rates of the proposed bootstrap method are calculated for model-based maximum likelihood estimators (MLE) using multinomial sampling. Monte Carlo simulation was used to compare the performance of bootstrap methods with normal approximations in terms of the average coverage probability criterion.
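
A bootstrap interval for a one-parameter multinomial model can be sketched as follows. The abstract does not name the model, so this sketch assumes, purely for illustration, a three-cell model with probabilities (θ², 2θ(1−θ), (1−θ)²), whose MLE has a closed form. It uses a parametric bootstrap with percentile limits, which may differ from the paper's procedure. All function names are hypothetical.

```python
import random

def mle_theta(counts):
    """Closed-form MLE of theta for cells (theta^2, 2*theta*(1-theta), (1-theta)^2)."""
    n_aa, n_ab, n_bb = counts
    n = n_aa + n_ab + n_bb
    return (2 * n_aa + n_ab) / (2 * n)

def cell_probs(theta):
    """Cell probabilities implied by the assumed one-parameter model."""
    return [theta ** 2, 2 * theta * (1 - theta), (1 - theta) ** 2]

def bootstrap_ci(counts, n_boot=2000, alpha=0.05, seed=0):
    """Parametric bootstrap percentile interval for theta."""
    rng = random.Random(seed)
    n = sum(counts)
    theta_hat = mle_theta(counts)
    probs = cell_probs(theta_hat)
    estimates = []
    for _ in range(n_boot):
        # Draw a multinomial sample of size n from the fitted model.
        draws = rng.choices(range(3), weights=probs, k=n)
        resampled = [draws.count(c) for c in range(3)]
        estimates.append(mle_theta(resampled))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return theta_hat, lower, upper

theta_hat, lower, upper = bootstrap_ci((30, 50, 20))
```

A coverage comparison of the kind the abstract mentions would repeat this over many simulated data sets and count how often the interval contains the true θ.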

Comparisons of the Modified Skip-Lot Sampling Inspection Plans

  • Yang, Chang-Soo;Cho, Gyo-Young
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1183-1189 / 2008
  • The general formulas of the operating characteristic (OC) function, average sample number (ASN), and average outgoing quality (AOQ) for the modified n-level skip-lot sampling plan (MMSkSP2) were derived using Markov chain properties by Cho (2008). In this paper, the OC curve, ASN, and AOQ of a reference plan and of the modified two-level, three-level, and five-level skip-lot sampling plans are compared.

Effect of Sampling for Multi-set Cardinality Estimation (멀티셋의 크기 추정 기법에서 샘플링의 효과)

  • Dao, DinhNguyen;Nyang, DaeHun;Lee, KyungHee
    • KIPS Transactions on Computer and Communication Systems / v.4 no.1 / pp.15-22 / 2015
  • Estimating the number of distinct values is a well-known problem in network data measurement, and many effective algorithms have been suggested. Recent works have built upon a technique called Linear Counting to solve the estimation problem for massive sets or spreaders in small memory. Sampling is used to reduce the measurement data, and it is generally assumed that sampling degrades accuracy. In this paper, however, we show that, for multi-set estimation, CSE with sampling sometimes gives better results in terms of accuracy and estimation range than MCSE, which examines all packets without sampling. To demonstrate this, we present a mathematical analysis, conduct experiments with real data, and compare the results of CSE, MCSE, and CSES.
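
The Linear Counting technique that this abstract builds on can be sketched as below. This shows only the basic estimator (hash each item into an m-bit bitmap, then estimate the distinct count as −m·ln(V), where V is the fraction of bits still empty). It does not implement CSE, MCSE, CSES, or the sampling step studied in the paper. The function name and the use of SHA-256 as the hash are assumptions made for the illustration.

```python
import hashlib
import math

def linear_counting_estimate(items, m=4096):
    """Estimate the number of distinct items with an m-bit bitmap."""
    bitmap = [False] * m
    for item in items:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        # Map the first 8 bytes of the hash to a bit position.
        bitmap[int.from_bytes(digest[:8], "big") % m] = True
    empty = bitmap.count(False)
    if empty == 0:
        raise ValueError("bitmap saturated; use a larger m")
    return -m * math.log(empty / m)

# Toy stream: 5000 packets drawn from 1000 distinct flow identifiers.
stream = [f"flow-{i % 1000}" for i in range(5000)]
estimate = linear_counting_estimate(stream)
```

Duplicates set the same bit again, so repeated packets do not inflate the estimate. The estimate degrades as the bitmap fills up, which is why m has to be chosen relative to the expected number of distinct items.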

A Study on Measuring the Similarity Among Sampling Sites in Lake Yongdam with Water Quality Data Using Multivariate Techniques (다변량기법을 활용한 용담호 수질측정지점 유사성 연구)

  • Lee, Yosang;Kwon, Sehyug
    • Journal of Environmental Impact Assessment / v.18 no.6 / pp.401-409 / 2009
  • Multivariate statistical approaches that classify sampling sites by measuring their similarity from water quality data, and that characterize the resulting clusters, are discussed with a view to an optimal water quality monitoring network. For the empirical study, two years of data (2005 and 2006) were collected in Yongdam reservoir at 9 sampling sites, combining 2 depth levels and 7 important water quality variables. The similarity among sampling sites is measured by the Euclidean distances between the water quality variables, and the sites are classified by hierarchical clustering. The clustered sites are discussed using principal component variables in terms of their geographical characteristics and of reducing the number of measuring sites. The nine sampling sites are clustered as follows. Sites 5, 6, and 7 form one cluster characterized by shallow water depth and the main stream. Sites 2 and 4 are grouped together by hydraulic characteristics that derive from the main stream, but their patterns of water quality change appear to differ because site 2 is near the dam. Sites 3, 8, and 9 each stand alone because they lie on different tributaries.

How Should We Randomly Sample Marine Fish Landed at Korea Ports to Represent a Length Frequency Distribution of Those Fish? (한국 연근해 어업에서 수집되는 어류 개체군 체장자료의 표집(sampling) 방법 제안)

  • Park, Min Gyou;Hyun, Saang-Yoon
    • Korean Journal of Fisheries and Aquatic Sciences / v.54 no.1 / pp.80-89 / 2021
  • In Korea, marine fish landed at ports are randomly sampled on a periodic basis (e.g., daily or weekly), and the body sizes (e.g., lengths and weights) of the sampled fish are measured. The motivation for our study is the question of whether such measurements reflect the size distribution, especially the length distribution, of the fish landed (i.e., the population), because such length measurements are key data for a length-based assessment model. The current method samples fish landed at ports by body size group (e.g., very small, small, medium, large, very large), using the number of boxes in each body size group as the sampling weights. In this study, we showed that length composition data from fish sampled by the current method did not represent the length frequency distribution of the fish landed, and we suggested an alternative sampling method that uses the number of fish landed in each body size group as the sampling weights. We also introduced a method for determining an appropriate sample size.
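
The effect of the two weighting choices can be illustrated with a small sketch. All numbers below are made up for the illustration and are not from the paper. The helper `combined_length_frequency` is hypothetical: it mixes the within-group length frequencies using whichever group weights are supplied.

```python
def combined_length_frequency(group_freqs, weights):
    """Mix per-group length frequencies using normalized group weights."""
    total = sum(weights.values())
    combined = {}
    for group, freq in group_freqs.items():
        share = weights[group] / total
        for length_bin, proportion in freq.items():
            combined[length_bin] = combined.get(length_bin, 0.0) + share * proportion
    return combined

# Hypothetical landing: equal numbers of boxes, but a box of small fish
# holds far more individuals than a box of large fish.
boxes = {"small": 10, "large": 10}
fish_per_box = {"small": 100, "large": 20}
fish_landed = {g: boxes[g] * fish_per_box[g] for g in boxes}

# Hypothetical within-group length frequencies (length in cm -> proportion).
group_freqs = {
    "small": {20: 0.5, 25: 0.5},
    "large": {40: 0.5, 45: 0.5},
}

by_boxes = combined_length_frequency(group_freqs, boxes)
by_fish = combined_length_frequency(group_freqs, fish_landed)
```

Here box-based weights give the 20 cm bin a share of 0.25, while fish-based weights give it about 0.42, because small fish make up 1000 of the 1200 individuals landed. This is the kind of mismatch the abstract attributes to the current method.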