• Title/Summary/Keyword: Sampling technique


Experimental Analysis of Equilibrization in Binary Classification for Non-Image Imbalanced Data Using Wasserstein GAN

  • Wang, Zhi-Yong;Kang, Dae-Ki
    • International Journal of Internet, Broadcasting and Communication / v.11 no.4 / pp.37-42 / 2019
  • In this paper, we explore the details of three classic data augmentation methods and two generative-model-based oversampling methods. The three classic data augmentation methods are random sampling (RANDOM), Synthetic Minority Over-sampling Technique (SMOTE), and Adaptive Synthetic Sampling (ADASYN). The two generative-model-based oversampling methods are Conditional Generative Adversarial Network (CGAN) and Wasserstein Generative Adversarial Network (WGAN). In imbalanced data, the instances are divided into a majority class and a minority class, where the majority class accounts for most of the instances in the training set and the minority class includes only a few. Generative models have an advantage in that they can generate more plausible samples that follow the distribution of the minority class. We also adopt CGAN to compare its data augmentation performance with that of the other methods. The experimental results show that the WGAN-based oversampling technique is more stable than the other approaches (RANDOM, SMOTE, ADASYN and CGAN), even with very limited training datasets. However, when the imbalance ratio is too small, generative-model-based approaches cannot match the performance of the conventional data augmentation techniques. These results suggest a direction for future research.
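
    The interpolation idea behind SMOTE, one of the classic methods named in this abstract, can be sketched as follows. This is a minimal illustration only, not the authors' experimental code: the function name, the toy minority points, and the parameter k are assumptions made for the example.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (SMOTE-style interpolation)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of base by squared Euclidean distance,
        # excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, other)))
    return synthetic

# Hypothetical minority class: the four corners of the unit square
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like_oversample(minority, n_new=5)
```

    Because every synthetic point is a convex combination of two existing minority points, the new points stay inside the region already covered by the minority class.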

Application of Random Over Sampling Examples(ROSE) for an Effective Bankruptcy Prediction Model (효과적인 기업부도 예측모형을 위한 ROSE 표본추출기법의 적용)

  • Ahn, Cheolhwi;Ahn, Hyunchul
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.525-535 / 2018
  • If the frequency of a particular class is excessively higher than that of the other classes in a classification problem, data imbalance occurs, which distorts machine learning. Corporate bankruptcy prediction often suffers from data imbalance, since the ratio of insolvent companies is generally very low whereas the ratio of solvent companies is very high. To mitigate this problem, a proper sampling technique must be applied. Until now, oversampling techniques, which adjust the class distribution of a data set by sampling the minority class with replacement, have been widely used. However, they carry a risk of overfitting. Against this background, this study applies the ROSE (Random Over Sampling Examples) technique, proposed by Menardi and Torelli in 2014, to effective corporate bankruptcy prediction. The ROSE technique creates new training samples by synthesizing them from the existing training samples, so it leads to better prediction accuracy of the classifiers while avoiding the risk of overfitting. Specifically, our study proposes combining the ROSE method with the SVM (support vector machine), which is known as a strong binary classifier. We applied the proposed method to a real-world bankruptcy prediction case from a major Korean bank and compared its performance with that of other sampling techniques. Experimental results showed that ROSE improved the prediction accuracy of SVM in bankruptcy prediction compared to the other techniques, with statistical significance. These results suggest that ROSE can be a good alternative for resolving data imbalance in prediction problems in social science areas beyond bankruptcy prediction.
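
    ROSE is generally described as a smoothed bootstrap: a new sample is drawn from a kernel centred on a randomly chosen existing sample of the class. The sketch below assumes a Gaussian kernel with one fixed, hand-picked bandwidth per call, whereas the published method derives the smoothing matrix from the data. The function name and toy data are hypothetical, and this is not the code used in the study.

```python
import random

def rose_like_sample(class_points, n_new, bandwidth=0.1, seed=0):
    """Draw n_new synthetic points from Gaussian kernels centred on
    randomly chosen existing points (a smoothed bootstrap).
    bandwidth is a fixed standard deviation, an assumption of this sketch."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        centre = rng.choice(class_points)
        # perturb every feature of the chosen point with Gaussian noise
        synthetic.append(tuple(rng.gauss(x, bandwidth) for x in centre))
    return synthetic

# Hypothetical minority class (e.g. insolvent firms) with two features
insolvent = [(0.2, 1.5), (0.3, 1.7), (0.25, 1.4)]
extra = rose_like_sample(insolvent, n_new=10)
```

    Unlike sampling with replacement, the generated points are not exact copies of existing ones, which is the property the abstract links to a lower risk of overfitting.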

Application of Sampling Theories to Data from Bottom Trawl Surveys Along the Korean Coastal Areas for Inferring the Relative Size of a Fish Population (한반도 연근해 저층 트롤 조사 자료에 표본론을 적용한 개체군의 상대적 크기 추정)

  • Lee, Hyotae;Hyun, Saang-Yoon
    • Korean Journal of Fisheries and Aquatic Sciences / v.50 no.5 / pp.594-604 / 2017
  • The Korean National Institute of Fisheries Science (NIFS) has conducted a bottom trawl survey along the coastal areas twice a year (in spring and fall) for the last decade, taking samples on a regular basis (i.e., systematic sampling). Despite the availability of the survey data, NIFS has neither officially reported estimates of groundfish population sizes nor evaluated the uncertainty of such estimates. The objectives of our study were to infer the relative size of a fish population by applying two different sampling techniques (namely simple and stratified sampling) with different observation units to the NIFS survey data, and to compare the two techniques in terms of bias and precision. For demonstration purposes, we used data on Pacific cod (Gadus macrocephalus) collected by the 2011-2015 surveys. The results of simple and stratified sampling showed that the point estimates and precision varied by observation unit as well as by sampling technique.
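
    The contrast between the two estimators can be sketched with the textbook formulas for the mean and the variance of the mean. The catch values and the equal stratum weights below are made up for illustration, the finite-population correction is ignored, and this is not the NIFS data or the authors' code.

```python
from statistics import mean, variance

def simple_estimate(samples):
    """Mean and variance of the mean, treating all hauls as one
    simple random sample (no finite-population correction)."""
    return mean(samples), variance(samples) / len(samples)

def stratified_estimate(strata):
    """strata: list of (weight, samples) pairs whose weights sum to 1.
    Returns the weighted mean and the variance of that weighted mean."""
    est = sum(w * mean(s) for w, s in strata)
    var = sum(w ** 2 * variance(s) / len(s) for w, s in strata)
    return est, var

# Hypothetical catch per haul in two depth strata of equal area
shallow = [10.0, 12.0, 14.0]
deep = [2.0, 4.0, 6.0]

srs_mean, srs_var = simple_estimate(shallow + deep)
str_mean, str_var = stratified_estimate([(0.5, shallow), (0.5, deep)])
```

    With these toy numbers both point estimates equal 8.0, but the stratified variance is smaller because the between-stratum difference no longer contributes to it.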

Complex Bandpass Sampling for SDR front-end (SDR front-end를 위한 Complex Bandpass Sampling)

  • Wang, Hong-Mei;Kim, Jae-Hyung;Kim, Hyung-Jung
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.8 / pp.1805-1812 / 2011
  • The bandpass sampling (BPS) technique has the advantage of using a sampling frequency lower than that required by the Nyquist criterion. However, special care is required in choosing the sampling frequency to avoid self-image overlapping in the first Nyquist region. Recently, second-order BPS techniques have been proposed that can suppress possible self-images by using an additional ADC and digital signal processing. This paper addresses a complex-BPS-based SDR front-end. Unlike general second-order BPS, it needs only a simple FIR filter to compensate for the delay in the second ADC. We show a method for finding proper sampling frequencies to down-convert RF signals selected by a tunable RF filter operating in an arbitrary frequency range.
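
    The care needed in choosing the sampling frequency can be illustrated with the classical first-order condition for a single real band occupying [f_L, f_H]: a rate f_s avoids aliasing when 2 f_H / n <= f_s <= 2 f_L / (n - 1) for some integer n between 1 and floor(f_H / (f_H - f_L)). The sketch below lists those ranges. It is the textbook first-order rule, not the complex BPS method of this paper, and the function name and the example band are assumptions.

```python
import math

def valid_bps_ranges(f_low, f_high):
    """Return a list of (n, fs_min, fs_max) tuples giving the alias-free
    sampling-rate ranges for a real band [f_low, f_high].
    For n = 1 the upper bound is unbounded (ordinary Nyquist sampling)."""
    bandwidth = f_high - f_low
    n_max = math.floor(f_high / bandwidth)
    ranges = []
    for n in range(1, n_max + 1):
        fs_min = 2.0 * f_high / n
        fs_max = math.inf if n == 1 else 2.0 * f_low / (n - 1)
        ranges.append((n, fs_min, fs_max))
    return ranges

# Hypothetical RF band from 20 MHz to 25 MHz (5 MHz wide)
ranges = valid_bps_ranges(20e6, 25e6)
lowest_rate = min(fs_min for _, fs_min, _ in ranges)
```

    For this example band the lowest admissible rate is 10 MHz, twice the bandwidth, but the n = 5 range collapses to a single point, which is why practical designs leave a guard band.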

Sampling Error Variation due to Rainfall Seasonality

  • Yoo, Chulsang
    • Proceedings of the Korea Water Resources Association Conference / 2001.05a / pp.7-14 / 2001
  • In this study, we characterized the variation of sampling errors using the Waymire-Gupta-Rodriguez-Iturbe multi-dimensional rainfall model (WGR model). The parameters used for this study are those derived by Jung et al. (2000) for the Han River Basin using a genetic algorithm technique. The sampling error problems considered in this study are those for a raingauge network, for satellite observation, and for both combined. The sampling errors were characterized for each month, and separately for the downstream plain area and the upstream mountain area. From the results of the study we conclude: (1) The pattern of the estimated sampling errors is obviously different from the seasonal pattern of monthly rainfall amounts. This result may be understood from the fact that the sampling error is estimated not simply by considering the rainfall amounts, but by considering all the mechanisms controlling the rainfall propagation along with its generation and decay. As the major mechanism supplying moisture to the Korean Peninsula clearly differs from month to month, it seems rather normal for the pattern of sampling errors to differ from that of monthly rainfall amounts. (2) The sampling errors estimated for the upstream mountain area are about twice as high as those for the downstream plain area. This is believed to be because the variability of rainfall is higher in the upstream mountain area than in the downstream plain area.


Policies for Improving the Survey of Research and Development in Science and Technology: The Case of Industrial Sector (과학기술연구개발활동조사의 개선방안 -기업부문을 중심으로-)

  • 유승훈;문혜선
    • Journal of Korea Technology Innovation Society / v.5 no.2 / pp.228-244 / 2002
  • The survey of research and development (R&D) in science and technology (S&T) covers the current status of R&D activities in S&T in Korea and provides a basis for decision making regarding S&T policy. Continuous improvement of the survey is needed to present reliable national basic statistics. Therefore, the purpose of the study is two-fold: to introduce a sample survey method for the industrial sector and to develop a statistical technique for dealing with non-response data from the industrial sector. To these ends, case studies of the United States and Japan are first illustrated. A new sampling design for the R&D survey is proposed, and the implementation of a stratified random sampling scheme is suggested. Moreover, statistical analysis of the non-response data is addressed. Based on several screening criteria, we develop a new imputation method suitable for the R&D survey and also provide a more detailed implementation plan. Various solutions to problems arising from item non-response are also presented. Finally, some implications of the results are discussed.


A Searching Algorithm for Minimum Bandpass Sampling Frequency in Simultaneous Down-Conversion of Multiple RF Signals

  • Bae, Jung-Hwa;Park, Jin-Woo
    • Journal of Communications and Networks / v.10 no.1 / pp.55-62 / 2008
  • Bandpass sampling (BPS) techniques for the direct down-conversion of RF bandpass signals have become essential for software defined radio (SDR), owing to their advantage of minimizing the dependency on radio frequency (RF) front-end hardware. This paper proposes an algorithm for finding the minimum BPS frequency for simultaneously down-converting multiple RF signals through full permutation over all the valid sampling ranges found for the multiple RF signals. We also present a scheme for reducing the computational complexity resulting from the large scale of the permutation calculation involved in searching for the minimum BPS frequency. In addition, we investigate the BPS frequency that allows for a guard band between adjacent down-converted signals, which helps relax the severe requirements in practical implementations. The performance of the proposed method is compared with that of previously reported methods to demonstrate its effectiveness.

Ripple Free Multirate Controller Design Using Lifting Technique (리프팅 기법을 이용한 리플 제거 멀티레이트 제어기 설계)

  • Jeong, Dong-Seul;Cho, Kyu-Nam;Chung, Chung-Choo
    • Journal of Institute of Control, Robotics and Systems / v.13 no.11 / pp.1040-1047 / 2007
  • This paper presents a method for removing the ripple that can occur in multirate controller design. The conventional multirate input controller has the problem that ripple occurs in track-following because of the chattering phenomenon in the control input signal. To resolve this problem, eliminating the ripple with a feedforward compensator has been proposed. This paper explains the problems of the conventional ripple-free multirate controller and introduces a multirate controller design method that applies the lifting technique based on current estimators in state space. For the ripple-free multirate controller, we show that chattering does not occur in the control input signal by applying the final value theorem from the viewpoint of discrete-time transformation. This study also proves that the ripple of the proposed controller decreases as the sampling frequency increases and that, when the sampling frequency is fixed, it decreases as the control input period increases.

Volume Rendering Technique for 3-D Visualization and Its Performance Improvements (물체의 3차원적 도시를 위한 입체묘사기법의 성능향상 및 그 응용)

  • Lee, Min-Seop;Cheon, Gang-Uk;Ra, J.B
    • Journal of Biomedical Engineering Research / v.12 no.2 / pp.79-88 / 1991
  • The semi-transparent volume rendering technique can provide good 3-D visualization through voxel-level processing and alleviates segmentation artifacts compared with the surface rendering technique. In this paper, we consider several new schemes that can improve the performance of volume rendering. A directional interpolation method is proposed to reduce the artifacts caused by the anisotropic resolution of X-ray CT data. The computation time for rendering is shortened by using the depth information of the 3-D object. We also reduce the quantization artifacts in the rendering by introducing an opacity-dependent sampling interval for sampling in ray tracing.


Periodic Sampled-Data Control for Fuzzy Systems;Intelligent Digital Redesign Approach

  • Kim, D.W.;Joo, Y.H.;Park, J.B.
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2005.06a / pp.1492-1495 / 2005
  • This paper presents a new linear-matrix-inequality-based intelligent digital redesign (LMI-based IDR) technique to match the states of the analog and the digital T-S fuzzy control systems at the intersampling instants as well as at the sampling instants. The main features of the proposed technique are: 1) an affine control scheme is employed to increase the degrees of freedom; 2) fuzzy-model-based periodic control is employed, and the control input is changed n times during one sampling period; 3) the proposed IDR technique is based on an approximately discretized version of the T-S fuzzy system, but its discretization error vanishes as n approaches infinity; and 4) some sufficient conditions for the state matching and the stability of the closed-loop discrete-time system can be formulated as LMIs.
