• Title/Summary/Keyword: data sampling

5,013 search results

An Algorithm of Determining Data Sampling Times in the Network-Based Real-Time Distributed Control Systems (네트워크를 이용한 실시간 분산제어시스템에서 데이터 샘플링 주기 결정 알고리듬)

  • Seung Ho Hong
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.30B no.1
    • /
    • pp.18-28
    • /
    • 1993
  • Processes in real-time distributed control systems share a network medium to exchange their data. The performance of feedback control loops in such systems is subject to the network-induced delays from sensor to controller and from controller to actuator, and these delays depend directly on the data sampling times of the control components sharing the network medium. In this study, an algorithm for determining data sampling times is developed using the "window concept," where the sampling data from the control components dynamically share a limited number of windows. The scheduling algorithm is validated through simulation experiments.
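
As a rough illustration of the "window concept", the hypothetical Python sketch below lets components whose sampling timers have expired compete for a limited number of transmission windows per network cycle. This is an assumption-laden toy model, not the paper's actual period-determination algorithm.

```python
import heapq

def window_schedule(periods, n_windows, cycle, horizon):
    """Toy 'window concept' simulation: each network cycle offers n_windows
    transmission slots, shared dynamically by whichever components' sampling
    timers have expired (hypothetical sketch, not the paper's algorithm)."""
    due = [(0.0, i) for i in range(len(periods))]  # (next due time, component id)
    heapq.heapify(due)
    log, t = [], 0.0
    while t < horizon:
        served = 0
        # grant this cycle's windows to the earliest-due pending components
        while served < n_windows and due and due[0][0] <= t:
            _, i = heapq.heappop(due)
            log.append((t, i))                     # component i transmits its sample
            heapq.heappush(due, (t + periods[i], i))
            served += 1
        t += cycle
    return log

# e.g. three components with 10/20/40 ms periods, 2 windows per 5 ms cycle:
# window_schedule([0.010, 0.020, 0.040], n_windows=2, cycle=0.005, horizon=0.1)
```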

A Comparative Study Between Light Extinction and Direct Sampling Methods for Measuring Volume Fractions of Twin-Hole Sprays Using Tomographic Reconstruction

  • Lee, Choong-Hoon
    • Journal of Mechanical Science and Technology
    • /
    • v.17 no.12
    • /
    • pp.1986-1993
    • /
    • 2003
  • The spatially resolved spray volume fractions from both the line-of-sight data of direct measuring cells and a laser diffraction particle analyzer (LDPA) are each tomographically reconstructed by the convolution Fourier transform method. Asymmetric sprays generated from a twin-hole injector are tested with 12 equiangular projections of measurements. For each projection angle, a line-of-sight integrated injection rate was measured using a direct sampling method, and a liquid volume fraction was obtained from a set of line-of-sight Fraunhofer diffraction measurements using a light extinction method. Interpolating data between the projection angles effectively increases the number of projections, significantly enhancing the signal-to-noise level in the reconstructed data. The reconstructed volume fractions from the direct sampling cells were used as reference data for evaluating the accuracy of the volume fractions from the LDPA.
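
The reconstruction step itself is standard; below is a minimal numpy sketch of parallel-beam filtered backprojection (the convolution Fourier transform family), assuming a sinogram of equiangular projections as in the paper. The measurement specifics (LDPA optics, interpolation between projection angles) are not reproduced.

```python
import numpy as np

def filtered_backprojection(sinogram, angles_deg, grid_size=64):
    """Parallel-beam filtered backprojection: ramp-filter each projection in
    the Fourier domain, then smear it back across the image grid."""
    n_angles, n_det = sinogram.shape
    ramp = np.abs(np.fft.fftfreq(n_det))            # ramp filter |f|
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))
    coords = np.linspace(-1.0, 1.0, grid_size)
    X, Y = np.meshgrid(coords, coords)
    det = np.linspace(-1.0, 1.0, n_det)             # detector coordinate axis
    recon = np.zeros((grid_size, grid_size))
    for proj, theta in zip(filtered, np.deg2rad(angles_deg)):
        s = X * np.cos(theta) + Y * np.sin(theta)   # detector position of each pixel
        recon += np.interp(s, det, proj)
    return recon * np.pi / n_angles

# twelve equiangular projections, as in the paper:
# recon = filtered_backprojection(sinogram, np.arange(0, 180, 15))
```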

A study on the sampling rate of water distribution network data according to its intended use (상수도 관망 데이터의 사용목적에 관한 수집 주기 연구)

  • Lee, Kyounghwan;Suh, JungChul;Cha, Hunjoo;Song, Kyosin;Choi, Junemo
    • Journal of Korean Society of Water and Wastewater
    • /
    • v.27 no.2
    • /
    • pp.233-239
    • /
    • 2013
  • The sampling rate of hydraulic pressure data is an important factor that depends on the intended use of the water distribution system. A short sampling interval makes the data more useful, but it demands considerable maintenance expense. In this study, based on 2 kHz data from a simulated water distribution system, the optimal sampling rate is investigated using statistical techniques such as the Student's t-distribution and the non-exceedance probability.
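
A hedged sketch of the rate-selection idea: downsample the 2 kHz reference series at candidate rates and keep the coarsest one whose subsampled mean stays, including a Student's t confidence half-width, within a tolerance of the full-rate mean. The acceptance criterion here is an assumption standing in for the paper's exact t-distribution / non-exceedance-probability procedure.

```python
import numpy as np
from scipy import stats

def coarsest_adequate_rate(pressure_2khz, candidate_hz, tol, conf=0.95):
    """Return the lowest candidate sampling rate (Hz) whose subsampled mean,
    padded by a t-based confidence half-width, stays within `tol` of the
    full 2 kHz mean (illustrative criterion, assumed not exact)."""
    full_mean = pressure_2khz.mean()
    for hz in sorted(candidate_hz):                 # try the slowest rates first
        sub = pressure_2khz[::int(2000 // hz)]
        n = len(sub)
        half_width = stats.t.ppf((1 + conf) / 2, n - 1) * sub.std(ddof=1) / np.sqrt(n)
        if abs(sub.mean() - full_mean) + half_width <= tol:
            return hz
    return max(candidate_hz)                        # fall back to the fastest rate

# e.g. coarsest_adequate_rate(pressure, candidate_hz=[1, 10, 100, 500], tol=0.05)
```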

Radioactive waste sampling for characterisation - A Bayesian upgrade

  • Pyke, Caroline K.;Hiller, Peter J.;Koma, Yoshikazu;Ohki, Keiichi
    • Nuclear Engineering and Technology
    • /
    • v.54 no.1
    • /
    • pp.414-422
    • /
    • 2022
  • Presented in this paper is a methodology for combining a Bayesian statistical approach with Data Quality Objectives (a structured decision-making method) to provide increased levels of confidence in analytical data when approaching a waste boundary. Development of sampling and analysis plans for the characterisation of radioactive waste often uses a simple, one-pass statistical approach as underpinning for the sampling schedule. Using a Bayesian statistical approach introduces the concept of prior information, giving an adaptive sampling strategy based on previous knowledge. This aligns more closely with the iterative approach demanded of the most commonly used structured decision-making tool in this area (Data Quality Objectives) and has the potential to provide a more fully underpinned justification than the more traditional statistical approach. The approach described has been developed in a UK regulatory context but is applied here to a waste stream from the Fukushima Daiichi Nuclear Power Station to demonstrate how the methodology can support decision making regarding the ultimate disposal option for radioactive waste in a more global context.
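
The Bayesian ingredient can be sketched with a conjugate normal-normal update of a waste stream's mean activity. This is a minimal illustration under assumed Gaussian models, not the paper's full Data Quality Objectives workflow; the variable names and the boundary check are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def bayes_update_mean(prior_mu, prior_sd, data, noise_sd):
    """Conjugate normal-normal update: combine prior information about the
    mean activity with new sample measurements (illustrative sketch)."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = len(data) / noise_sd ** 2
    post_prec = prior_prec + data_prec
    post_mu = (prior_prec * prior_mu + data_prec * np.mean(data)) / post_prec
    return post_mu, np.sqrt(1.0 / post_prec)

# probability that the true mean activity exceeds a waste boundary `limit`:
# mu, sd = bayes_update_mean(prior_mu=2.0, prior_sd=1.0, data=measurements, noise_sd=0.5)
# p_exceed = 1.0 - norm.cdf(limit, mu, sd)   # feeds the accept/characterise-further decision
```

Each sampling campaign's posterior becomes the next campaign's prior, which is what makes the strategy adaptive rather than one-pass.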

Comparison of resampling methods for dealing with imbalanced data in binary classification problem (이분형 자료의 분류문제에서 불균형을 다루기 위한 표본재추출 방법 비교)

  • Park, Geun U;Jung, Inkyung
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.3
    • /
    • pp.349-374
    • /
    • 2019
  • A class imbalance problem arises when one class outnumbers the other class by a large proportion in binary data. Studies such as transforming the learning data have been conducted to solve this imbalance problem. In this study, we compared resampling methods for dealing with imbalance in the classification problem, seeking a way to detect the minority class in the data more effectively. Through simulation, a total of 20 methods of over-sampling, under-sampling, and combined over- and under-sampling were compared. The logistic regression, support vector machine, and random forest models, which are commonly used in classification problems, were used as classifiers. The simulation results showed that the random under-sampling (RUS) method had the highest sensitivity with an accuracy over 0.5. The next most sensitive method was an adaptive synthetic sampling over-sampling approach. This revealed that the RUS method is suitable for finding minority class values. The results of applying the methods to some real data sets were similar to those of the simulation.
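
For reference, random under-sampling (the paper's best-performing method) simply discards majority-class rows until the classes balance. A plain-numpy sketch follows; libraries such as imbalanced-learn provide equivalent, more featureful implementations.

```python
import numpy as np

def random_under_sample(X, y, seed=0):
    """Random under-sampling (RUS): keep a random subset of each class equal
    in size to the minority class, discarding the surplus."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.where(y == c)[0], size=n_min, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)                              # avoid class-ordered blocks
    return X[keep], y[keep]
```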

A Study of Dependent Nonstationary Multiple Sampling Plans (종속적 비평형 다중표본 계획법의 연구)

  • 김원경
    • Journal of the Korea Society for Simulation
    • /
    • v.9 no.2
    • /
    • pp.75-87
    • /
    • 2000
  • In this paper, nonstationary multiple sampling plans are discussed, which are difficult to solve by analytical methods when there exists dependency between the sample data. The initial solution is found by a sequential sampling plan using the sequential probability ratio test. The acceptance and rejection numbers in each step of the multiple sampling plan are initially found by grouping the sequential sampling plan's solution. The optimal multiple sampling plans are then found by simulation. Four search methods are developed to find the optimum sampling plans satisfying the Type I and Type II error probabilities. The performance of the sampling plans is measured, and their algorithms are also shown. To account for the nonstationary property of the dependent sampling plan, a simulation method is used for finding the lot rejection and acceptance probability functions. As a numerical example, a Markov chain model is inspected. The effects of the dependency factor and the search methods are compared by changing their parameters and analyzing the sampling results.
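
The sequential probability ratio test used for the initial solution can be sketched in Wald's classic binomial form, where the decision thresholds follow directly from the Type I and Type II error probabilities (alpha and beta); the paper's dependent, nonstationary setting then requires the simulation layer described above.

```python
import math

def sprt_decision(defects, n, p0, p1, alpha=0.05, beta=0.10):
    """Wald's SPRT for a binomial lot with p0 < p1: accept when the
    log-likelihood ratio drops below log(beta/(1-alpha)), reject when it
    exceeds log((1-beta)/alpha), otherwise draw another sample."""
    llr = (defects * math.log(p1 / p0)
           + (n - defects) * math.log((1 - p1) / (1 - p0)))
    if llr <= math.log(beta / (1 - alpha)):
        return "accept"
    if llr >= math.log((1 - beta) / alpha):
        return "reject"
    return "continue"

# e.g. after 2 defects in 40 items: sprt_decision(2, 40, p0=0.02, p1=0.10)
```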

Experimental Analysis of Equilibrization in Binary Classification for Non-Image Imbalanced Data Using Wasserstein GAN

  • Wang, Zhi-Yong;Kang, Dae-Ki
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.11 no.4
    • /
    • pp.37-42
    • /
    • 2019
  • In this paper, we explore the details of three classic data augmentation methods and two generative-model-based oversampling methods. The three classic data augmentation methods are random sampling (RANDOM), the Synthetic Minority Over-sampling Technique (SMOTE), and Adaptive Synthetic Sampling (ADASYN). The two generative-model-based oversampling methods are the Conditional Generative Adversarial Network (CGAN) and the Wasserstein Generative Adversarial Network (WGAN). In imbalanced data, the instances are divided into a majority class, which occupies most of the instances in the training set, and a minority class, which includes only a few instances. Generative models have an advantage when used to generate more plausible samples that follow the distribution of the minority class. We also adopt CGAN to compare its data augmentation performance with the other methods. The experimental results show that the WGAN-based oversampling technique is more stable than the other approaches (RANDOM, SMOTE, ADASYN, and CGAN) even with very limited training datasets. However, when the imbalance ratio is too small, the generative-model-based approaches cannot achieve performance as satisfactory as the conventional data augmentation techniques. These results suggest a direction for future research.
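
Of the classic methods compared, SMOTE is the easiest to sketch: synthetic minority samples are linear interpolations between a minority point and one of its k nearest minority neighbours. The minimal numpy version below is an illustrative reimplementation, not the paper's code; ADASYN additionally weights points by local majority density.

```python
import numpy as np

def smote(X_min, n_new, k=5, seed=0):
    """Core SMOTE step: interpolate between minority points and their k
    nearest minority neighbours (requires len(X_min) > k)."""
    rng = np.random.default_rng(seed)
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                    # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]              # k nearest neighbour indices
    synth = np.empty((n_new, X_min.shape[1]))
    for j in range(n_new):
        i = rng.integers(len(X_min))               # random minority point
        nb = nn[i, rng.integers(k)]                # one of its neighbours
        lam = rng.random()                         # interpolation coefficient in [0, 1)
        synth[j] = X_min[i] + lam * (X_min[nb] - X_min[i])
    return synth
```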

A Deep Learning Based Over-Sampling Scheme for Imbalanced Data Classification (불균형 데이터 분류를 위한 딥러닝 기반 오버샘플링 기법)

  • Son, Min Jae;Jung, Seung Won;Hwang, Een Jun
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.8 no.7
    • /
    • pp.311-316
    • /
    • 2019
  • The classification problem is to predict the class to which an input data point belongs. One of the most popular approaches is to train a machine learning algorithm on the given dataset. In this case, the dataset should have a well-balanced class distribution for the best performance; when the class distribution is imbalanced, classification performance can be very poor. To overcome this problem, we propose an over-sampling scheme that balances the number of data by using Conditional Generative Adversarial Networks (CGAN). CGAN is a generative model developed from Generative Adversarial Networks (GAN) that can learn data characteristics and generate data similar to real data. Therefore, CGAN can generate data for a class that has only a small number of instances, so the problem induced by an imbalanced class distribution can be mitigated and classification performance can be improved. Experiments using actually collected data show that the over-sampling technique using CGAN is effective and superior to existing over-sampling techniques.
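
A minimal PyTorch sketch of the conditioning idea: the class label, one-hot encoded, is concatenated with the noise vector, so that after adversarial training the generator can be asked for minority-class samples on demand. The layer sizes are arbitrary assumptions and the training loop is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CGANGenerator(nn.Module):
    """Conditional generator: maps (noise, class label) to a synthetic sample."""
    def __init__(self, noise_dim, n_classes, data_dim, hidden=128):
        super().__init__()
        self.n_classes = n_classes
        self.net = nn.Sequential(
            nn.Linear(noise_dim + n_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, data_dim),
        )

    def forward(self, z, labels):
        onehot = F.one_hot(labels, self.n_classes).float()  # condition on the class
        return self.net(torch.cat([z, onehot], dim=1))

# oversampling the minority class (label 1) after adversarial training:
# g = CGANGenerator(noise_dim=32, n_classes=2, data_dim=10)
# fake_minority = g(torch.randn(500, 32), torch.ones(500, dtype=torch.long))
```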

Is Simple Random Sampling Better than Quota Sampling? An Analysis Based on the Sampling Methods of Three Surveys in South Korea

  • Cho, Sung Kyum;Jang, Deok-Hyun;LoCascio, Sarah Prusoff
    • Asian Journal for Public Opinion Research
    • /
    • v.3 no.4
    • /
    • pp.156-175
    • /
    • 2016
  • This paper considers whether random sampling always produces more accurate survey results in the case of South Korea. We compare information from the 2010 census to the demographic variables of three public opinion surveys from South Korea: Gallup Korea's Omnibus Survey (Survey A) is conducted every two months; the annual Social Survey (Survey B) is conducted by Statistics Korea (KOSTAT); the Korean General Social Survey (KGSS or Survey C) is conducted annually by the Survey Research Center (SRC) at Sungkyunkwan University (SKKU). Survey A uses quota sampling after randomly selecting the neighborhood and initial addresses; Survey B uses random sampling, but allows replacements in some situations; Survey C uses simple random sampling. Data from more than one year were used for each survey. Our analysis suggests that Survey B is the most representative in most respects, and, in some respects, Survey A may be more representative than Survey C. Data from Survey C were the least stable in terms of representativeness by geographical area and age. Single-person households were underrepresented in both Surveys A and C, but the problem was more severe in Survey A. Four-person households and married persons were both over-represented in Survey A. Less educated people were under-represented in both Survey A and Survey C. There were differences in income level between Survey A and Survey C, but income data were not available for Survey B or the census, so it is difficult to ascertain which survey was more representative in this case.
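
The core contrast can be illustrated with a toy simulation in which quota interviewing under-contacts single-person households. The 27% population share and the contact probabilities below are assumptions chosen for illustration, not figures from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical population of households; the single-person share is assumed
pop_single = rng.random(100_000) < 0.27

# simple random sample of 1,500 households
srs = rng.choice(pop_single, size=1500, replace=False)

# quota-style sample: quotas are filled from whoever is contactable, and
# single-person households are assumed to be reachable only half as often
p = np.where(pop_single, 0.5, 1.0)
idx = rng.choice(len(pop_single), size=1500, replace=False, p=p / p.sum())
quota = pop_single[idx]

print(f"population share of singles: {pop_single.mean():.3f}")
print(f"SRS estimate:                {srs.mean():.3f}")    # unbiased on average
print(f"quota estimate:              {quota.mean():.3f}")  # biased low for singles
```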

Particle Swarm Optimization Using Adaptive Boundary Correction for Human Activity Recognition

  • Kwon, Yongjin;Heo, Seonguk;Kang, Kyuchang;Bae, Changseok
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.6
    • /
    • pp.2070-2086
    • /
    • 2014
  • As a kind of personal lifelog data, activity data have been considered one of the most compelling sources of information for understanding a user's habits and calibrating diagnoses. In this paper, we propose an algorithm for human activity recognition that is robust to sampling rates, identifying a user's activity using accelerations from a triaxial accelerometer in a smartphone. Although a high sampling rate is required for high accuracy, it is not desirable for actual smartphone usage because of battery consumption and storage occupancy. Activity recognition with well-known algorithms, including MLP, C4.5, and SVM, suffers from a loss of accuracy when the accelerometer sampling rate decreases. Thus, we start from particle swarm optimization (PSO), which has relatively better tolerance to declines in sampling rates, and propose PSO with an adaptive boundary correction (ABC) approach. PSO with ABC is tolerant of various sampling rates in that it identifies all data by adjusting the classification boundaries of each activity. The experimental results show that PSO with ABC tolerates changes in accelerometer sampling rates better than PSO without ABC and other methods. In particular, PSO with ABC is 6%, 25%, and 35% better than PSO without ABC for sitting, standing, and walking, respectively, at a sampling period of 32 seconds. PSO with ABC is the only algorithm that guarantees at least 80% accuracy for every activity at sampling periods of 8 seconds or shorter.
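
For orientation, a plain PSO loop is sketched below: particles track personal and global bests and are clamped to the search boundary. The paper's adaptive boundary correction, which adjusts per-activity classification boundaries, is its own extension and is not reproduced here; the inertia and acceleration weights are conventional assumed values.

```python
import numpy as np

def pso_minimize(f, dim, n_particles=30, iters=200, bounds=(-1.0, 1.0), seed=0):
    """Vanilla particle swarm optimization over a box-bounded search space."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))       # particle positions
    v = np.zeros_like(x)                              # particle velocities
    pbest = x.copy()                                  # personal bests
    pbest_val = np.apply_along_axis(f, 1, x)
    g = pbest[pbest_val.argmin()].copy()              # global best
    w, c1, c2 = 0.7, 1.5, 1.5                         # inertia / acceleration weights
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)                    # clamp to the search boundary
        val = np.apply_along_axis(f, 1, x)
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], val[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g, pbest_val.min()

# e.g. g, best = pso_minimize(lambda p: ((p - 0.3) ** 2).sum(), dim=4)
```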