• Title/Summary/Keyword: Oversampling method

Churn Prediction Model using Logistic Regression (Logistic Regression을 이용한 이탈고객예측모형)

  • Jeong, Han-Na;Park, Hye-Jin;Kim, Nam-Hyeong;Jeon, Chi-Hyeok;Lee, Jae-Uk
    • Proceedings of the Korean Operations and Management Science Society Conference / 2008.10a / pp.324-328 / 2008
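  • In the financial industry, the customer churn rate affects expected revenue and therefore needs to be predicted. As accurate prediction has recently come to drive cost management, churn prediction has emerged as an important problem. However, insurance customer data are large in volume and have imbalanced output values, so conventional methods are ill-suited to building a prediction model. This study focuses on predicting churning customers through logistic regression with the Trust-region Newton method, which is known to be effective for large-scale data. To improve prediction accuracy on imbalanced data, it uses Oversampling, Clustering, Boosting, and related techniques to propose a churn prediction model suited to the customer data.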

A Microcontroller-Based Lock-In Amplifier for Capacitive Sensors (용량형 센서를 위한 마이크로컨트롤러에 기반을 둔 록인 증폭기)

  • Kim, Cheong-Worl
    • Journal of Sensor Science and Technology / v.23 no.1 / pp.24-28 / 2014
  • A lock-in amplifier is proposed for capacitive sensor applications. The amplifier is based on a general-purpose microcontroller and uses a charge amplifier as its only analog circuit. All other lock-in amplifier functions are implemented with firmware and the internal resources of the microcontroller. A rectangular signal generated by the microcontroller is used as the sensor-driving signal instead of a conventional sinusoidal signal. This allows the phase comparison circuit of the lock-in amplifier to be built from an analog-to-digital converter, a timer, and an interrupt controller. Using the oversampling method and the rectangular driving signal makes it easy to implement peak detection in software and to sample the peak-to-peak signal at the charge amplifier output. The charge amplifier is designed to cancel out the base capacitance that is structurally present in capacitive sensors. The experimental results show that the lock-in amplifier, operating at a supply voltage of 3.0 V, cancels out the base capacitance and has good linearity.

A study on data mining techniques for soil classification methods using cone penetration test results

  • Junghee Park;So-Hyun Cho;Jong-Sub Lee;Hyun-Ki Kim
    • Geomechanics and Engineering / v.35 no.1 / pp.67-80 / 2023
  • Due to the nature of the conjunctive Cone Penetration Test (CPT), which does not verify the actual sample directly, geotechnical engineers commonly classify underground geomaterials from CPT results using classification diagrams proposed by various researchers. However, such classification diagrams may fail to reflect local geotechnical characteristics, potentially resulting in misclassification that does not align with the actual stratification in regions with strong local features. To address this, this paper presents an objective method for deriving more accurate local CPT soil classification criteria, using C4.5 decision tree models trained with CPT results from the clay-dominant southern coast of Korea and the sand-dominant region in South Carolina, USA. The results and analyses demonstrate that the C4.5 algorithm, in conjunction with oversampling, outlier removal, and pruning methods, can enhance and optimize the decision tree-based CPT soil classification model.

Application of Random Over Sampling Examples(ROSE) for an Effective Bankruptcy Prediction Model (효과적인 기업부도 예측모형을 위한 ROSE 표본추출기법의 적용)

  • Ahn, Cheolhwi;Ahn, Hyunchul
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.525-535 / 2018
  • If the frequency of a particular class is excessively higher than the frequency of the other classes in a classification problem, a data imbalance problem occurs, which distorts machine learning. Corporate bankruptcy prediction often suffers from data imbalance because the ratio of insolvent companies is generally very low, whereas the ratio of solvent companies is very high. Mitigating this problem requires a proper sampling technique. Until now, oversampling techniques, which adjust the class distribution of a data set by sampling the minority class with replacement, have been widely used. However, they carry a risk of overfitting. Against this background, this study applies the ROSE (Random Over Sampling Examples) technique, proposed by Menardi and Torelli in 2014, to effective corporate bankruptcy prediction. The ROSE technique creates new learning samples by synthesizing them from the existing training samples, so it leads to better prediction accuracy of the classifiers while avoiding the risk of overfitting. Specifically, our study proposes to combine the ROSE method with SVM (support vector machine), which is known as the best binary classifier. We applied the proposed method to a real-world bankruptcy prediction case from a major Korean bank and compared its performance with other sampling techniques. Experimental results showed that ROSE improved the prediction accuracy of SVM in bankruptcy prediction compared to the other techniques, with statistical significance. These results suggest that ROSE can be a good alternative for resolving data imbalance in social science prediction problems other than bankruptcy prediction.
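
The conventional technique that the abstract contrasts with ROSE, sampling the minority class with replacement, can be sketched as follows. This is a minimal illustration only: the function name `random_oversample` and the toy data are hypothetical, and the code shows plain random oversampling, not ROSE itself, which generates synthetic samples instead of duplicating existing ones.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class rows (drawn with replacement) until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # Rows of this class in the original data.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw the missing rows with replacement from that pool.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Hypothetical toy data: 8 solvent firms (label 0), 2 insolvent firms (label 1).
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2
X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
```

Because every added row is an exact copy of an existing minority row, a classifier can memorize those rows, which is the overfitting risk the abstract refers to.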

A Single-Bit 2nd-Order CIFF Delta-Sigma Modulator for Precision Measurement of Battery Current (배터리 전류의 정밀 측정을 위한 단일 비트 2차 CIFF 구조 델타 시그마 모듈레이터)

  • Bae, Gi-Gyeong;Cheon, Ji-Min
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.3 / pp.184-196 / 2020
  • In this paper, a single-bit 2nd-order delta-sigma modulator with a cascade-of-integrators feedforward (CIFF) architecture is proposed for precision measurement of the current flowing through a secondary cell battery in a battery management system (BMS). The proposed modulator implements two switched-capacitor integrators and a single-bit comparator, with peripheral circuits such as a non-overlapping clock generator and a bias circuit. The proposed structure is designed for the low-side current sensing method, which has a low common-mode input voltage. Using the low-side current measurement method has the advantage of reducing the burden on the circuit design. In addition, the ±30 mV input voltage is resolved by the ADC with 15-bit resolution, eliminating the need for an additional programmable gain amplifier (PGA). The proposed single-bit 2nd-order delta-sigma modulator has been implemented in a 350-nm CMOS process. It achieves a 95.46-dB signal-to-noise-and-distortion ratio (SNDR), a 96.01-dB spurious-free dynamic range (SFDR), and a 15.56-bit effective number of bits (ENOB) with an oversampling ratio (OSR) of 400 for a 5-kHz bandwidth. The area and power consumption of the delta-sigma modulator are 670×490 µm² and 414 µW, respectively.
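
The reported ENOB can be cross-checked against the reported SNDR with the standard relation ENOB = (SNDR − 1.76 dB) / 6.02 dB. The sketch below does that arithmetic. The sampling rate is not stated in the abstract; it is inferred here from the usual definition OSR = fs / (2 · BW), which is an assumption, and the function names are hypothetical.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (ideal-quantizer relation)."""
    return (sndr_db - 1.76) / 6.02


def sampling_rate(osr, bandwidth_hz):
    """Sampling rate implied by OSR = fs / (2 * BW). Assumed definition."""
    return 2 * osr * bandwidth_hz


enob = enob_from_sndr(95.46)    # SNDR reported in the abstract
fs = sampling_rate(400, 5_000)  # OSR and bandwidth reported in the abstract
print(f"ENOB = {enob:.2f} bits, implied fs = {fs / 1e6:.1f} MHz")
```

Rounded to two decimals, the computed ENOB agrees with the 15.56 bits quoted in the abstract.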

Discontinuous Grids and Time-Step Finite-Difference Method for Simulation of Seismic Wave Propagation (지진파 전파 모의를 위한 불균등 격자 및 시간간격 유한차분법)

  • 강태섭;박창업
    • Proceedings of the Earthquake Engineering Society of Korea Conference / 2003.03a / pp.50-58 / 2003
  • We have developed a locally variable time-step scheme matched to discontinuous grids in the finite-difference method for the efficient simulation of seismic wave propagation. The first-order velocity-stress formulations are used to obtain the spatial derivatives using finite-difference operators on a staggered grid. A grid three times coarser in the high-velocity region than in the low-velocity region is used to avoid spatial oversampling. Temporal steps corresponding to the spatial sampling ratio between the two regions are determined based on proper stability criteria. The wavefield in the margin of the region with the smaller time step is linearly interpolated in time using the values calculated in the region with the larger one. The accuracy of the proposed scheme is tested through comparisons with analytic solutions and with a conventional finite-difference scheme with constant grid spacing and time step. The use of the locally variable time-step scheme with discontinuous grids results in remarkable savings in computation time and memory requirement, with the efficiency depending on the simulation model. This implies that ground motion for realistic velocity structures including near-surface sediments can be modeled to high frequency (several Hz) without requiring severe computer memory

Blind frequency offset estimation method in OFDM systems (OFDM에서 블라인드 주파수 옵셋 추정 방법)

  • Jeon, Hyoung-Goo
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.4 / pp.823-832 / 2011
  • In this paper, an efficient blind carrier frequency offset (CFO) estimation method for orthogonal frequency division multiplexing (OFDM) systems is proposed. In the proposed method, we obtain two received OFDM symbols offset in time by using both the cyclic prefix and an oversampling technique, and a cost function is defined using the two OFDM symbols. We show that the cost function can be approximately expressed as a cosine function. Using a property of the cosine function, a formula for estimating the CFO is derived. The CFO estimator requires three independent cost function values calculated at three different frequency offset points. The proposed method is very efficient in computational complexity since no search for the minimum cost value is required. The proposed method reduces the amount of FFT computation by 97% compared with the ML method. Unlike conventional methods such as the ML method and the MUSIC method, the accuracy of the proposed method is independent of the search resolution since a closed-form solution exists. Computer simulation shows that the performance of the proposed method is superior to that of the MUSIC and ML methods.
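
The idea that three cost values are enough to locate the minimum of a cosine-shaped cost function without a search can be illustrated as follows. This is only a sketch under stated assumptions: the abstract does not give the actual cost function, trial points, or closed-form formula, so the model J(e) = B − A·cos(2π(e − e0)/P) with known period P, the equally spaced trial points, and the name `estimate_offset` are all hypothetical.

```python
import math


def estimate_offset(cost, period=1.0):
    """Closed-form location of the minimum of a cost assumed to have the form
    J(e) = B - A*cos(2*pi*(e - e0)/period), A > 0, from three evaluations."""
    thetas = [2 * math.pi * k / 3 for k in range(3)]
    trial_points = [period * k / 3 for k in range(3)]  # assumed trial offsets
    values = [cost(e) for e in trial_points]
    # Projecting onto cos/sin removes the constant B and isolates the phase:
    # c = -(3/2)*A*cos(phi), s = -(3/2)*A*sin(phi), phi = 2*pi*e0/period.
    c = sum(v * math.cos(t) for v, t in zip(values, thetas))
    s = sum(v * math.sin(t) for v, t in zip(values, thetas))
    phi = math.atan2(-s, -c)
    return period * phi / (2 * math.pi)


# Hypothetical cost with its minimum at a normalized offset of 0.2.
def toy_cost(e):
    return 5.0 - 2.0 * math.cos(2 * math.pi * (e - 0.2))


estimated = estimate_offset(toy_cost)
```

Three values suffice because the assumed model has exactly three unknowns (A, B, and e0), which is consistent with the abstract's point that accuracy does not depend on a search resolution.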

New Gain Optimization Method for Sigma-Delta A/D Converters Using CIC Decimation Filters (CIC 데시메이션 필터를 이용한 Sigma-Delta A/D 변환기 이득 최적화 방식)

  • Jang, Jin-Kyu;Jang, Young-Beom
    • Journal of the Institute of Electronics Engineers of Korea TC / v.47 no.4 / pp.1-8 / 2010
  • In this paper, we propose a new gain optimization technique for Sigma-Delta A/D converters. In the proposed scheme, multiple gain-set candidates showing maximum SNR in the modulator block are selected, and these candidates are then evaluated for minimum MSE in the decimation block. CIC decimation filter simulation shows that the candidate ranked second by SNR in the modulator block is the best gain set.

An Efficient Identification Algorithm in a Low SNR Channel (저 SNR을 갖는 채널에서 효율적인 인식 알고리즘)

  • Hwang, Jeewon;Cho, Juphil
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.4 / pp.790-796 / 2014
  • Identification of communication channels is a problem of considerable current theoretical and practical concern. Recently proposed solutions to this problem exploit the diversity induced by an antenna array or by time oversampling. Such a method resorts to an adaptive filter with a linear constraint. In this paper, an approach based on decomposition is proposed: the eigenvector corresponding to the minimum eigenvalue of the covariance matrix of the received signals contains the channel impulse response. We present an adaptive algorithm to solve this problem. The proposed technique shows better performance than an existing algorithm.

Malaria Epidemic Prediction Model by Using Twitter Data and Precipitation Volume in Nigeria

  • Nduwayezu, Maurice;Satyabrata, Aicha;Han, Suk Young;Kim, Jung Eon;Kim, Hoon;Park, Junseok;Hwang, Won-Joo
    • Journal of Korea Multimedia Society / v.22 no.5 / pp.588-600 / 2019
  • Each year, malaria affects over 200 million people worldwide. The African continent is hit particularly hard by this disease. According to many studies, the continent is an ideal place for Anopheles mosquitoes, which transmit malaria parasites, to thrive. Rainfall volume is one of the major factors favoring the development of these Anopheles mosquitoes in tropical Sub-Saharan Africa (SSA). However, the surveillance, monitoring, and reporting of this epidemic are still poor and purely bureaucratic. In this paper, we propose a method to quickly monitor and report malaria instances by using Social Network Systems (SNS) and precipitation volume in Nigeria. We used the Twitter search Application Programming Interface (API) to live-stream Twitter messages mentioning malaria, preprocessed those Tweets, classified them into malaria cases in Nigeria using the Support Vector Machine (SVM) classification algorithm, and compared those malaria cases with the average precipitation volume. The comparison yielded a correlation of 0.75 between malaria cases recorded using Twitter and average precipitation in Nigeria. To ensure the reliability of our classification algorithm, we used an oversampling technique to eliminate the imbalance in our training Tweets.