• Title/Summary/Keyword: Imbalance Problem

Differential Power Processing System for the Capacitor Voltage Balancing of Cost-effective Photovoltaic Multi-level Inverters

  • Jeon, Young-Tae;Kim, Kyoung-Tak;Park, Joung-Hu
    • Journal of Power Electronics / v.17 no.4 / pp.1037-1047 / 2017
  • The Differential Power Processing (DPP) converter is a promising multi-module inverter architecture recently proposed for photovoltaic systems. In this paper, a DPP converter architecture in which each PV panel has its own DPP converter in shunt performs distributed maximum power point tracking (DMPPT) control. It maintains a high energy conversion efficiency, even under partial shading conditions. The architecture processes only the power differences among the PV panels, which reduces the required power capacity of the converters. Therefore, DPP systems can easily overcome the disadvantages of conventional power conditioning system (PCS) topologies such as the centralized, string, and module integrated converter (MIC) topologies. Among the various types of DPP systems, the feed-forward method has been selected for both its voltage balancing and its power transfer to a modified H-bridge inverter that needs charge balancing of the input capacitors. The modified H-bridge multi-level inverter has advantages such as a low part count and cost competitiveness compared to conventional multi-level inverters. Therefore, it is frequently used in photovoltaic (PV) PCS. However, its simplified switching network draws input current asymmetrically, so the series-connected input capacitors suffer from charge imbalance. This paper validates the operating principle and feasibility of the proposed topology through simulation and experimental results. They show that the input-capacitor voltages remain balanced while the PV MPPT control operates on a 140-W hardware prototype.

Using Malmquist Analysis to evaluate productivity of relocating Public Institution (맘퀴스트 분석을 이용한 지방이전 공공기관 생산성 평가)

  • Kim, Ju-Young;Hong, Jong-Yi
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.9 no.1 / pp.395-404 / 2019
  • South Korea's rapid economic growth brought about a concentration of resources in the metropolitan area. The provinces, by contrast, have a narrow economic base compared to the metropolitan area, and underpopulation has been identified as a problem. Thus, the government announced the 'National Balanced Development Law' in 2004 to reduce regional imbalances. The study collected data on 95 relocated public institutions and, unlike previous studies, used Malmquist analysis to compare the productivity of each public institution before and after its relocation. The study focused on how relocation affected the productivity of public institutions. The results of the analysis show that productivity declined after relocation for most public institutions. However, a two-independent-samples t-test found the difference in the respective yearly averages to be nonsignificant.
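
For readers unfamiliar with the method, the Malmquist productivity index is commonly written as the geometric mean of two distance-function ratios, as in the formula below. This is the standard textbook form, not necessarily the exact specification used in the paper; the symbols $x$ (inputs), $y$ (outputs), and $D^t$ (distance function relative to the period-$t$ frontier) are notation assumed here for illustration.

```latex
M\left(x^{t+1}, y^{t+1}, x^{t}, y^{t}\right)
  = \left[
      \frac{D^{t}\left(x^{t+1}, y^{t+1}\right)}{D^{t}\left(x^{t}, y^{t}\right)}
      \cdot
      \frac{D^{t+1}\left(x^{t+1}, y^{t+1}\right)}{D^{t+1}\left(x^{t}, y^{t}\right)}
    \right]^{1/2}
```

With output distance functions, a value above 1 indicates a productivity gain between periods $t$ and $t+1$, and a value below 1 indicates a decline, which is the direction the abstract reports for most institutions.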

Abnormal signal detection based on parallel autoencoders (병렬 오토인코더 기반의 비정상 신호 탐지)

  • Lee, Kibae;Lee, Chong Hyun
    • The Journal of the Acoustical Society of Korea / v.40 no.4 / pp.337-346 / 2021
  • Because of data imbalance, abnormal signals are generally detected using mainly the features of normal signals. This paper proposes an efficient method for abnormal signal detection using a parallel AutoEncoder (AE) that can also use the features of abnormal signals. The proposed Parallel AE (PAE) is composed of a normal reconstructor and an abnormal reconstructor with identical AE structures, which learn the features of normal and abnormal signals, respectively. The PAE can effectively solve the imbalanced data problem by training on normal and abnormal data sequentially. To further improve detection performance, an additional binary classifier can be added to the PAE. In experiments using public acoustic data, the proposed PAE improves the Area Under Curve (AUC) by at least 22 %, at the expense of a training time 1.31 to 1.61 times that of a single AE. Furthermore, the PAE shows a 93 % AUC improvement in detecting abnormal underwater acoustic signals when the pre-trained PAE is transferred to train on open underwater acoustic data.

Development of Demand Forecasting Model for Seoul Shared Bicycle (서울시 공유자전거의 수요 예측 모델 개발)

  • Lim, Heejong;Chung, Kwanghun
    • The Journal of the Korea Contents Association / v.19 no.1 / pp.132-140 / 2019
  • Recently, many cities around the world have introduced and operated shared bicycle systems to reduce traffic and air pollution. Seoul has also provided a shared bicycle service called "Ddareungi" since 2015. As the use of shared bicycles increases, the demand for bicycles at each station is also increasing. In addition to budget restrictions, however, there are managerial issues due to the different demands at each station. Currently, while bicycle rebalancing is used to resolve the large imbalance of demand among stations, forecasting uncertain future demand is a more important problem in practice. In this paper, we develop a forecasting model for Seoul shared bicycle demand using statistical time series analysis and apply the model to real data. In particular, we apply the Holt-Winters method, which has been used to forecast electricity demand, and perform a sensitivity analysis on the parameters that affect real demand forecasting.
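
The Holt-Winters method mentioned in this abstract can be sketched in a few lines. The code below is a minimal additive variant written for illustration only: the function name `holt_winters_additive`, the initialization scheme, the smoothing parameters, and the toy demand series are all assumptions, not the authors' model or the Ddareungi data.

```python
def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Additive Holt-Winters smoothing; returns `horizon` forecasts past the end of `series`."""
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons of data")

    first = series[:season_len]
    second = series[season_len:2 * season_len]
    # Simple initialization (an assumption): level = mean of the first season,
    # trend = average per-step change between the first two seasons.
    level = sum(first) / season_len
    trend = (sum(second) - sum(first)) / season_len ** 2
    seasonal = [x - level for x in first]

    for t in range(season_len, len(series)):
        y = series[t]
        s = seasonal[t % season_len]
        prev_level = level
        # Standard additive update equations for level, trend, and seasonal terms.
        level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    return [level + (h + 1) * trend + seasonal[(n + h) % season_len]
            for h in range(horizon)]


# Hypothetical rental counts with a repeating 4-step pattern and no trend.
demand = [10, 20, 30, 20] * 3
forecast = holt_winters_additive(demand, season_len=4,
                                 alpha=0.5, beta=0.5, gamma=0.5, horizon=4)
print(forecast)
```

The three smoothing parameters (`alpha`, `beta`, `gamma`) are the kind of parameters on which a sensitivity analysis like the one described in the abstract would typically be run.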

A Development of Suicidal Ideation Prediction Model and Decision Rules for the Elderly: Decision Tree Approach (의사결정나무 기법을 이용한 노인들의 자살생각 예측모형 및 의사결정 규칙 개발)

  • Kim, Deok Hyun;Yoo, Dong Hee;Jeong, Dae Yul
    • The Journal of Information Systems / v.28 no.3 / pp.249-276 / 2019
  • Purpose: The purpose of this study is to develop a prediction model and decision rules for the elderly's suicidal ideation based on the Korean Welfare Panel survey data. By utilizing this data, we obtained many decision rules to predict the elderly's suicidal ideation. Design/methodology/approach: This study used classification analysis based on the decision tree technique to derive decision rules for prediction. Weka 3.8 was used as the data mining tool. The decision tree algorithm is J48, also known as C4.5. In addition, the total data were divided into learning data and verification data using a 66.6% split. We considered all possible variables based on previous studies in predicting suicidal ideation of the elderly. Finally, 99 variables including the target variable were used. Classification analysis was performed with backward elimination and with a sampling technique introduced for data balancing. Findings: As a result, there were significant differences between the data sets. The selected data sets yielded different decision trees and several rules. Based on the decision tree method, we derived rules for suicide prevention. The decision tree derives not only rules for the suicidal ideation of the depressed group, but also rules for the suicidal ideation of the non-depressed group. In addition, in developing the predictive model, the problem of over-fitting due to data imbalance was directly identified through the application of data balancing. We conclude that it is necessary to balance the data on the target variable in order to perform a correct classification analysis without over-fitting. Moreover, even when data balancing is applied, the prediction rate is shown to be not inferior to that of a biased prediction model.

Classification Performance Improvement of UNSW-NB15 Dataset Based on Feature Selection (특징선택 기법에 기반한 UNSW-NB15 데이터셋의 분류 성능 개선)

  • Lee, Dae-Bum;Seo, Jae-Hyun
    • Journal of the Korea Convergence Society / v.10 no.5 / pp.35-42 / 2019
  • Recently, as the Internet and various wearable devices have appeared, Internet technology has made it more convenient to obtain information and do business. However, as the Internet is used in more areas, the attack surface exposed to attacks is growing, and attempts to invade networks to gain unfair advantage, such as cyber terrorism, are also increasing. In this paper, we propose a feature selection method to improve classification performance for classes of abnormal behavior in network traffic. The UNSW-NB15 dataset has a rare-class imbalance problem, in which some classes have relatively few instances compared to other classes, and an undersampling method is used to alleviate it. We use the SVM, k-NN, and decision tree algorithms and extract feature subsets with superior detection accuracy and RMSE through training and verification. The subsets achieve recall values of more than 98% in the wrapper-based experiments, and DT_PSO showed the best performance.
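
As a rough illustration of the undersampling idea mentioned in this abstract, the sketch below randomly drops rows from larger classes until every class matches the size of the rarest one. The abstract does not say which undersampling variant the authors used, so plain random undersampling is an assumption here, as are the function name `random_undersample` and the toy labels.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so that every class keeps as many rows as the rarest class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Size of the rarest class sets the per-class quota.
    target = min(len(members) for members in by_class.values())
    rng = random.Random(seed)  # fixed seed keeps the example reproducible

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (hypothetical): 90 "normal" records and 10 "attack" records.
rows = list(range(100))
labels = ["normal"] * 90 + ["attack"] * 10
bal_rows, bal_labels = random_undersample(rows, labels)
print(Counter(bal_labels))
```

Undersampling discards majority-class data, so it is usually applied to the training split only, leaving the evaluation data at its natural class ratio.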

A study on intrusion detection performance improvement through imbalanced data processing (불균형 데이터 처리를 통한 침입탐지 성능향상에 관한 연구)

  • Jung, Il Ok;Ji, Jae-Won;Lee, Gyu-Hwan;Kim, Myo-Jeong
    • Convergence Security Journal / v.21 no.3 / pp.57-66 / 2021
  • As the detection performance of deep learning and machine learning in the intrusion detection field has been verified, their use is increasing day by day. However, it is difficult to collect the data required for learning, and the imbalance of the collected data makes it difficult to carry machine learning performance over to real environments. Therefore, this paper proposes a mixed sampling technique using t-SNE visualization for imbalanced data processing as a solution to this problem. To do this, intrusion detection events, including the payload, are first separated into fields according to their characteristics. TF-IDF-based features are then extracted for the separated fields. After applying the mixed sampling technique to the extracted features, a data set optimized for intrusion detection with imbalanced data is obtained through data visualization using t-SNE. Nine sampling techniques were applied to the open intrusion detection dataset CSIC2012, and the proposed sampling technique was verified to improve detection performance according to the F-score and G-mean evaluation indicators.
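
The two evaluation indicators named in this abstract are often preferred over accuracy on imbalanced data. The sketch below computes both for a binary task, using the usual definitions (F-score as the harmonic mean of precision and recall, G-mean as the square root of recall times specificity). The function name `f_score_and_g_mean` and the toy predictions are assumptions for illustration, not the paper's evaluation code.

```python
import math


def f_score_and_g_mean(y_true, y_pred, positive=1):
    """Return (F1 score, G-mean) for binary labels, treating `positive` as the rare class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0          # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0

    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return f1, g_mean


# Toy example (hypothetical): 4 attacks among 20 events.
y_true = [1, 1, 1, 1] + [0] * 16
y_model = [1, 1, 1, 0] + [1, 1] + [0] * 14   # finds 3 attacks, raises 2 false alarms
y_all_negative = [0] * 20                     # 80% accurate, but detects nothing

print(f_score_and_g_mean(y_true, y_model))
print(f_score_and_g_mean(y_true, y_all_negative))
```

The second call shows why accuracy alone is misleading here: a classifier that never flags an attack is 80% accurate on this toy data, yet both its F-score and its G-mean are zero.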

Development of a Demand Model for Physician Workforce Projection on Regional Inequity Problem in Korea Using System Dynamics (시스템 다이내믹스를 활용한 지역별 국내 의사인력 수요에 대한 추계모델 개발)

  • Lee, Gyeong Min;Yoo, Ki-Bong
    • Health Policy and Management / v.32 no.1 / pp.73-93 / 2022
  • Background: Appropriate physician workforce projection, based on reasonable discussions and decisions with a broad view of workforce supply and demand, is very important for high-quality healthcare services. The study expects to provide preliminary research data on a standard workforce diagnosis model for Korean physician workforce policy decisions through more flexible and objective physician workforce projection that reflects diverse changes in healthcare policy and sociodemographic environments. Methods: A stock-and-flow diagram was developed from the causal map, and an objective workforce demand projection from 2019 to 2040 was conducted. In addition, projections by scenario under various situations were conducted with the stock-and-flow diagram developed in the study. Lastly, the demand projection of the physician workforce by region was conducted for 17 cities and provinces. Results: First, the demand for physicians was 110,665 in 2019, 113,450 in 2020, 129,496 in 2025, 146,837 in 2030, 163,719 in 2035, and 179,288 in 2040. Second, the scenario for the retirement of baby boomers led to a decrease in the growth rate due to time delay. Third, Seoul and Gyeonggi-do account for a high percentage of demand, and a very high upward trend was identified in Gyeonggi-do; as a result, the projection showed that the demand for the physician workforce in Gyeonggi-do would worsen over time. Conclusion: This study is meaningful in that rational and collective physician workforce supply and demand, and the imbalance in workforce distribution, were verified through various projections by scenario and region of Korea with System Dynamics.

A Study on Fraud Detection in the C2C Used Trade Market Using Doc2vec

  • Lim, Do Hyun;Ahn, Hyunchul
    • Journal of the Korea Society of Computer and Information / v.27 no.3 / pp.173-182 / 2022
  • In this paper, we propose a machine learning model that can prevent fraudulent transactions in advance and interpret them using the XAI approach. For the experiment, we collected a real data set of 12,258 mobile phone sales posts from Joonggonara, a major domestic online C2C resale trading platform. Characteristics of the text corresponding to the post body were extracted using Doc2vec, dimensionality was reduced through PCA, and various derived variables were created based on previous research. To mitigate the data imbalance problem in the preprocessing stage, a complex sampling method that combines oversampling and undersampling was applied. Then, various machine learning models were built to detect fraudulent postings. In the analysis, LightGBM showed the best performance compared to the other machine learning models. The SHAP analysis showed that a post was highly likely to be fraudulent if the price was unreasonably low compared to the market price and if no transaction area was indicated. A high price, the absence of a safe transaction, a larger share of courier transactions, and a higher ratio of zeros in the price were also associated with fraud.

Arrhythmia Classification using GAN-based Over-Sampling Method and Combination Model of CNN-BLSTM (GAN 오버샘플링 기법과 CNN-BLSTM 결합 모델을 이용한 부정맥 분류)

  • Cho, Ik-Sung;Kwon, Hyeog-Soong
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.10 / pp.1490-1499 / 2022
  • Arrhythmia is a condition in which the heart has an irregular rhythm or abnormal heart rate. Early diagnosis and management are very important because it can cause stroke, cardiac arrest, or even death. In this paper, we propose arrhythmia classification using a hybrid combination model of CNN-BLSTM. For this purpose, the QRS features are detected from the noise-removed signal through pre-processing, and a single beat segment is extracted. The GAN oversampling technique is then applied to solve the data imbalance problem. The model consists of CNN layers that extract the patterns of the arrhythmia precisely, and their outputs are used as the input of the BLSTM. The weights were learned through deep learning, and the learned model was evaluated on the validation data. To evaluate the performance of the proposed method, classification accuracy, precision, recall, and F1-score were compared using the MIT-BIH arrhythmia database. The achieved accuracy, precision, recall, and F1-score were 99.30%, 98.70%, 97.50%, and 98.06%, respectively.