• Title/Abstract/Keyword: data sampling

Search results: 5,056 items (processing time: 0.034 s)

네트워크를 이용한 실시간 분산제어시스템에서 데이터 샘플링 주기 결정 알고리듬 (An Algorithm of Determining Data Sampling Times in the Network-Based Real-Time Distributed Control Systems)

  • Seung Ho Hong
    • 전자공학회논문지B / Vol. 30B, No. 1 / pp. 18-28 / 1993
  • Processes in real-time distributed control systems share a network medium to exchange their data. The performance of feedback control loops in such systems is subject to network-induced delays from sensor to controller and from controller to actuator. These delays depend directly on the data sampling times of the control components that share the network medium. In this study, an algorithm for determining data sampling times is developed using the "window concept," where the sampling data from the control components dynamically share a limited number of windows. The scheduling algorithm is validated through simulation experiments.


A Comparative Study Between Light Extinction and Direct Sampling Methods for Measuring Volume Fractions of Twin-Hole Sprays Using Tomographic Reconstruction

  • Lee, Choong-Hoon
    • Journal of Mechanical Science and Technology / Vol. 17, No. 12 / pp. 1986-1993 / 2003
  • The spatially resolved spray volume fractions obtained from line-of-sight data of direct measuring cells and from a laser diffraction particle analyzer (LDPA) are each tomographically reconstructed by the convolution Fourier transform. Asymmetric sprays generated from a twin-hole injector are tested with 12 equiangular projections of measurements. For each projection angle, a line-of-sight integrated injection rate was measured using a direct sampling method, and a liquid volume fraction was obtained from a set of line-of-sight Fraunhofer diffraction measurements using a light extinction method. Interpolated data between the projection angles effectively increase the number of projections, significantly enhancing the signal-to-noise level in the reconstructed data. The reconstructed volume fractions from the direct sampling cells were used as reference data for evaluating the accuracy of the volume fractions from the LDPA.

상수도 관망 데이터의 사용목적에 관한 수집 주기 연구 (Study on the sampling rate for the purpose of use in water distribution network data)

  • 이경환;서정철;차헌주;송교신;최준모
    • 상하수도학회지 / Vol. 27, No. 2 / pp. 233-239 / 2013
  • The sampling rate of hydraulic pressure data is an important factor that depends on the intended use of the water distribution system. A short sampling interval makes the hydraulic data more useful, but it demands considerable maintenance expense. In this study, based on 2 kHz data from a water distribution system simulation, the optimal sampling rate is investigated using statistical techniques such as Student's t distribution and non-exceedance probability.

Radioactive waste sampling for characterisation - A Bayesian upgrade

  • Pyke, Caroline K.;Hiller, Peter J.;Koma, Yoshikazu;Ohki, Keiichi
    • Nuclear Engineering and Technology / Vol. 54, No. 1 / pp. 414-422 / 2022
  • Presented in this paper is a methodology for combining a Bayesian statistical approach with Data Quality Objectives (a structured decision-making method) to provide increased levels of confidence in analytical data when approaching a waste boundary. Sampling and analysis plans for the characterisation of radioactive waste often use a simple, one-pass statistical approach as underpinning for the sampling schedule. A Bayesian statistical approach instead introduces prior information, giving an adaptive sampling strategy based on previous knowledge. This aligns more closely with the iterative approach demanded by Data Quality Objectives, the most commonly used structured decision-making tool in this area, and has the potential to provide a more fully underpinned justification than the traditional statistical approach. The methodology was developed in a UK regulatory context but is applied here to a waste stream from the Fukushima Daiichi Nuclear Power Station to demonstrate how it can support decision making regarding the ultimate disposal option for radioactive waste in a more global context.
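The core Bayesian idea in the abstract, updating a prior belief with each new sampling round, can be illustrated with a conjugate Beta-Binomial model. This is a generic sketch, not the paper's actual methodology; the prior parameters and sample counts below are hypothetical.

```python
# Illustrative Beta-Binomial update: a prior belief about the fraction of
# assay results exceeding a waste boundary is refined with new sample data.
# The conjugate Beta prior keeps the posterior in closed form.

def update_beta(alpha_prior, beta_prior, exceedances, n_samples):
    """Return posterior Beta parameters after observing n_samples results."""
    return alpha_prior + exceedances, beta_prior + (n_samples - exceedances)

def beta_mean(alpha, beta):
    """Posterior mean of the exceedance probability."""
    return alpha / (alpha + beta)

# Prior from earlier campaigns (hypothetical): roughly 10% exceedance expected.
a, b = 2.0, 18.0
# New sampling round: 1 exceedance in 20 samples.
a, b = update_beta(a, b, exceedances=1, n_samples=20)
print(beta_mean(a, b))  # posterior mean exceedance probability: 0.075
```

Each round's posterior becomes the next round's prior, which is what makes the sampling strategy adaptive rather than one-pass.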

이분형 자료의 분류문제에서 불균형을 다루기 위한 표본재추출 방법 비교 (Comparison of resampling methods for dealing with imbalanced data in binary classification problem)

  • 박근우;정인경
    • 응용통계연구 / Vol. 32, No. 3 / pp. 349-374 / 2019
  • In binary classification, severe class imbalance can lead to poor classification results. Remedies such as transforming the training data are being actively studied. In this study, we compare resampling methods for handling imbalance in binary classification problems, seeking methods that detect the rare class more effectively. Through simulation, we compared a total of 20 methods: several oversampling methods, undersampling methods, and combinations of the two. Logistic regression, support vector machines, and random forests, which are widely used for classification, served as classifiers. In the simulations, the resampling method with accuracy above 0.5 and the highest sensitivity was random undersampling (RUS); the next highest sensitivity was achieved by the oversampling method ADASYN (adaptive synthetic sampling approach). This indicates that RUS is a suitable approach for detecting rare-class observations. Applications to several real data sets showed patterns similar to the simulation results.
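Random undersampling (RUS), the best-performing method in this comparison, simply discards majority-class rows until the classes are balanced. A minimal pure-Python sketch (the data and labels below are illustrative, not from the study):

```python
import random

def random_undersample(X, y, minority_label, seed=0):
    """Balance a binary dataset by randomly dropping majority-class rows
    until both classes have the same count (plain RUS)."""
    rng = random.Random(seed)
    minority = [(x, t) for x, t in zip(X, y) if t == minority_label]
    majority = [(x, t) for x, t in zip(X, y) if t != minority_label]
    kept = rng.sample(majority, len(minority))  # discard the rest
    data = minority + kept
    rng.shuffle(data)
    return [x for x, _ in data], [t for _, t in data]

# 90:10 imbalance becomes 10:10 after undersampling.
X = [[i] for i in range(100)]
y = [1] * 10 + [0] * 90
Xb, yb = random_undersample(X, y, minority_label=1)
print(sum(yb), len(yb))  # 10 20
```

In practice a library such as imbalanced-learn's `RandomUnderSampler` offers the same operation with more options; the trade-off of RUS is that discarded majority rows carry information the classifier never sees.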

종속적 비평형 다중표본 계획법의 연구 (A Study of Dependent Nonstationary Multiple Sampling Plans)

  • 김원경
    • 한국시뮬레이션학회논문지 / Vol. 9, No. 2 / pp. 75-87 / 2000
  • This paper discusses nonstationary multiple sampling plans, which are difficult to solve analytically when dependency exists between the sample data. An initial solution is found by a sequential sampling plan using the sequential probability ratio test. The acceptance and rejection numbers at each step of the multiple sampling plan are initially found by grouping the sequential plan's solution. The optimal multiple sampling plans are then found by simulation. Four search methods are developed to find the optimum sampling plans satisfying the Type I and Type II error probabilities; the performance of the sampling plans is measured and their algorithms are shown. To account for the nonstationary property of the dependent sampling plan, simulation is used to estimate the lot rejection and acceptance probability functions. As a numerical example, a Markov chain model is inspected, and the effects of the dependency factor and the search methods are compared by varying their parameters.
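The sequential probability ratio test (SPRT) used for the initial solution accumulates a log-likelihood ratio per item and stops when it crosses either threshold. A minimal sketch for a Bernoulli defect rate, assuming independent observations (the paper's contribution is precisely the dependent, nonstationary case, which this simple form does not handle):

```python
import math

def sprt(observations, p0, p1, alpha=0.05, beta=0.10):
    """Wald's SPRT deciding between defect rates p0 (H0) and p1 (H1).
    Returns ('accept' | 'reject' | 'continue', samples used)."""
    upper = math.log((1 - beta) / alpha)   # cross above: reject the lot
    lower = math.log(beta / (1 - alpha))   # cross below: accept the lot
    llr = 0.0
    for n, defective in enumerate(observations, start=1):
        if defective:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject", n
        if llr <= lower:
            return "accept", n
    return "continue", len(observations)

# A run with no defectives accepts the lot after a couple of dozen items.
decision, n = sprt([0] * 50, p0=0.01, p1=0.10)
print(decision, n)
```

Grouping such a sequential solution into fixed stages yields the acceptance and rejection numbers of a multiple sampling plan, which is the starting point the abstract describes.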


Experimental Analysis of Equilibrization in Binary Classification for Non-Image Imbalanced Data Using Wasserstein GAN

  • Wang, Zhi-Yong;Kang, Dae-Ki
    • International Journal of Internet, Broadcasting and Communication / Vol. 11, No. 4 / pp. 37-42 / 2019
  • In this paper, we examine three classic data augmentation methods and two oversampling methods based on generative models. The three classic data augmentation methods are random sampling (RANDOM), the Synthetic Minority Over-sampling Technique (SMOTE), and Adaptive Synthetic Sampling (ADASYN). The two generative-model-based oversampling methods are the Conditional Generative Adversarial Network (CGAN) and the Wasserstein Generative Adversarial Network (WGAN). In imbalanced data, the instances are divided into a majority class, which occupies most of the training set, and a minority class, which includes only a few instances. Generative models have an advantage when generating plausible samples that follow the distribution of the minority class. We also adopt CGAN to compare its data augmentation performance with the other methods. The experimental results show that the WGAN-based oversampling technique is more stable than the other approaches (RANDOM, SMOTE, ADASYN, and CGAN), even with very limited training data. However, when the imbalance ratio is too small, the generative-model-based approaches cannot outperform the conventional data augmentation techniques. These results suggest a direction for future research.
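SMOTE, one of the classic methods compared above, synthesizes minority samples by interpolating between a minority point and one of its nearest minority neighbours. A bare-bones sketch of that interpolation idea in pure Python (the toy points are illustrative; a production implementation such as imbalanced-learn's `SMOTE` adds neighbour indexing and edge handling):

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Generate synthetic minority samples by interpolating between a
    minority point and one of its k nearest minority neighbours
    (the core idea behind SMOTE)."""
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: dist2(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position on the segment between x and nb
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic

minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like(minority, n_new=4)
print(len(new_points))  # 4
```

Because each synthetic point is a convex combination of two real minority points, it stays inside the minority class's local region, unlike plain random duplication.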

불균형 데이터 분류를 위한 딥러닝 기반 오버샘플링 기법 (A Deep Learning Based Over-Sampling Scheme for Imbalanced Data Classification)

  • 손민재;정승원;황인준
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 8, No. 7 / pp. 311-316 / 2019
  • Classification is the problem of predicting the class of given input data, and a common approach is to train a machine-learning algorithm on a given dataset. A dataset whose class distribution is balanced is ideal, but when the distribution is imbalanced, the classifier may fail to classify properly. To address this problem, this paper proposes an oversampling scheme that balances class counts using Conditional Generative Adversarial Networks (CGAN). CGAN is a generative model derived from Generative Adversarial Networks (GAN); it learns the characteristics of data and can generate samples similar to real data. By training CGAN on the minority class and generating data for it, the imbalanced class ratio can be corrected, which in turn improves classification performance. Experiments with real-world data show that the CGAN-based oversampling scheme is effective and outperforms existing oversampling techniques.

Is Simple Random Sampling Better than Quota Sampling? An Analysis Based on the Sampling Methods of Three Surveys in South Korea

  • Cho, Sung Kyum;Jang, Deok-Hyun;LoCascio, Sarah Prusoff
    • Asian Journal for Public Opinion Research / Vol. 3, No. 4 / pp. 156-175 / 2016
  • This paper considers whether random sampling always produces more accurate survey results in the case of South Korea. We compare information from the 2010 census to the demographic variables of three public opinion surveys from South Korea: Gallup Korea's Omnibus Survey (Survey A) is conducted every two months by Gallup Korea; the annual Social Survey (Survey B) is conducted by Statistics Korea (KOSTAT); the Korean General Social Survey (KGSS or Survey C) is conducted annually by the Survey Research Center (SRC) at Sungkyunkwan University (SKKU). Survey A uses quota sampling after randomly selecting the neighborhood and initial addresses; Survey B uses random sampling, but allows replacements in some situations; Survey C uses simple random sampling. Data from more than one year was used for each survey. Our analysis suggests that Survey B is the most representative in most respects, and, in some respects, Survey A may be more representative than Survey C. Data from Survey C was the least stable in terms of representativeness by geographical area and age. Single-person households were underrepresented in both Surveys A and C, but the problem was more severe in Survey A. Four-person households and married persons were both over-represented in Survey A. Less educated people were under-represented in both Survey A and Survey C. There were differences in income level between Survey A and Survey C, but income data was not available for Survey B or the census, so it is difficult to ascertain which survey was more representative in this case.

Particle Swarm Optimization Using Adaptive Boundary Correction for Human Activity Recognition

  • Kwon, Yongjin;Heo, Seonguk;Kang, Kyuchang;Bae, Changseok
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 8, No. 6 / pp. 2070-2086 / 2014
  • As a kind of personal lifelog data, activity data have been considered among the most compelling information for understanding a user's habits and calibrating diagnoses. In this paper, we propose an algorithm robust to low sampling rates for human activity recognition, which identifies a user's activity using accelerations from a triaxial accelerometer in a smartphone. Although a high sampling rate is required for high accuracy, it is undesirable for actual smartphone usage, battery consumption, and storage occupancy. Activity recognition with well-known algorithms, including MLP, C4.5, and SVM, suffers a loss of accuracy when the accelerometer sampling rate decreases. We therefore start from particle swarm optimization (PSO), which has relatively good tolerance to declines in sampling rates, and propose PSO with an adaptive boundary correction (ABC) approach. PSO with ABC tolerates various sampling rates because it adjusts the classification boundaries of each activity. The experimental results show that PSO with ABC tolerates changes in the accelerometer sampling rate better than PSO without ABC and other methods. In particular, PSO with ABC is 6%, 25%, and 35% better than PSO without ABC for sitting, standing, and walking, respectively, at a sampling period of 32 seconds. PSO with ABC is the only algorithm that guarantees at least 80% accuracy for every activity at sampling periods of 8 seconds or shorter.
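For orientation, the PSO core that the paper builds on updates each particle's velocity toward its personal best and the swarm's global best. The sketch below is plain PSO with hard boundary clamping, shown on a toy objective; it is not the paper's adaptive boundary correction, and all parameter values are generic defaults.

```python
import random

def pso_minimize(f, dim, bounds, n_particles=20, iters=100, seed=1):
    """Plain particle swarm optimisation with boundary clamping
    (a simplified stand-in; the paper's ABC variant adapts boundaries)."""
    rng = random.Random(seed)
    lo, hi = bounds
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, and social weights
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pval = [f(x) for x in xs]
    gi = min(range(n_particles), key=lambda i: pval[i])
    gbest, gval = pbest[gi][:], pval[gi]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                # clamp positions that leave the search region
                xs[i][d] = min(hi, max(lo, xs[i][d] + vs[i][d]))
            val = f(xs[i])
            if val < pval[i]:
                pbest[i], pval[i] = xs[i][:], val
                if val < gval:
                    gbest, gval = xs[i][:], val
    return gbest, gval

# Minimise the 3-D sphere function, whose optimum value is 0 at the origin.
best, best_val = pso_minimize(lambda x: sum(xi * xi for xi in x),
                              dim=3, bounds=(-5.0, 5.0))
print(best_val)
```

In the recognition setting described above, the objective would instead score how well candidate class boundaries separate the accelerometer feature vectors at a given sampling rate.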