• Title/Summary/Keyword: Order Imbalance Information


Resolving CTGAN-based data imbalance for commercialization of public technology (공공기술 사업화를 위한 CTGAN 기반 데이터 불균형 해소)

  • Hwang, Chul-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.1 / pp.64-69 / 2022
  • Commercialization of public technology is the transfer of government-led scientific and technological innovation and R&D results to the private sector, and it is recognized as a key achievement driving economic growth. To promote technology transfer, various machine learning methods are therefore being studied to identify success factors or to match public technologies that have high commercialization potential with companies that need them. However, public technology commercialization data is tabular and highly imbalanced, with a large gap between the numbers of success and failure cases, so machine learning performance on it is poor. In this paper, we present a method that uses CTGAN to resolve the imbalance in tabular public technology data. To verify the effectiveness of the proposed method, we ran a comparative experiment against SMOTE, a statistical approach, using actual public technology commercialization data. In many experimental cases, CTGAN was confirmed to predict successful public technology commercialization cases reliably.

A Hybrid Oversampling Technique for Imbalanced Structured Data based on SMOTE and Adapted CycleGAN (불균형 정형 데이터를 위한 SMOTE와 변형 CycleGAN 기반 하이브리드 오버샘플링 기법)

  • Jung-Dam Noh;Byounggu Choi
    • Information Systems Review / v.24 no.4 / pp.97-118 / 2022
  • As generative adversarial network (GAN) based oversampling techniques have achieved impressive results on class imbalance in unstructured datasets such as images, many studies have begun to apply them to the imbalance problem in structured datasets. However, these studies have failed to reflect the characteristics of structured data because they convert it into an unstructured data format. To overcome this limitation, this study adapted CycleGAN to reflect the characteristics of structured data and proposed a hybrid of the synthetic minority oversampling technique (SMOTE) and the adapted CycleGAN. In particular, this study used a one-dimensional convolutional neural network, unlike previous studies that used a two-dimensional convolutional neural network. The proposed method was tested on various datasets, and its performance was compared with existing oversampling methods such as SMOTE and adaptive synthetic sampling (ADASYN). The results indicated that the proposed hybrid oversampling method outperformed the existing methods when the data had more dimensions or a higher degree of imbalance. This study implies that classification performance on oversampled structured data can be improved by using the proposed hybrid oversampling method, which considers the characteristics of structured data.
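The SMOTE baseline named in the two abstracts above creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a minimal pure-Python sketch of that interpolation step only; the function name `smote_like`, the toy data, and the parameter defaults are illustrative assumptions and are not taken from either paper. It does not implement CTGAN or the adapted CycleGAN.

```python
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).
    Hypothetical helper for illustration, not a library API."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class with two numeric features (made-up values).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_like(minority, n_new=6)
```

Because each synthetic point is a convex combination of two real minority samples, it always lies inside the bounding box of the minority class. Plain interpolation like this assumes numeric features, so categorical columns in tabular data would need separate handling.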

Correction of King-Moe Type V Scoliosis with Optimization Method in a FE Model (King-Moe Type V 형태의 척추측만증 유한 요소 모델에서 최적화 기법을 적용한 교정 방법)

  • 김영은;손창규;박경열;정지호;최형연
    • Proceedings of the Korean Society of Precision Engineering Conference / 2003.06a / pp.701-704 / 2003
  • Scoliosis is a complex musculoskeletal disease requiring 3-D treatment with surgical instrumentation. Conventional corrective surgery for scoliosis has been based on empirical knowledge, without information on the optimum position and operative procedure. Frequently, postoperative increases in rib hump and shoulder level imbalance cause serious cosmetic problems. To investigate the effect of corrective surgery, a reconstructed 3-D finite element model of King-Moe type V scoliosis was developed. Vertebrae, clavicle, and other bony elements were represented as rigid bodies. Kinematic joints and nonlinear bar elements were used to represent the intervertebral discs and ligaments according to reported experimental data. An optimization technique was then applied to this model to define the optimal magnitudes of correction. The optimization procedure corrected the scoliotic deformities by reducing the objective function by more than 94%, with an associated reduction of the scoliotic descriptors, mainly on the frontal thoracic curve.


Inter-view Balanced Disparity Estimation for Multiview Video Coding (다시점 영상에서 시점간 균형을 맞추는 변이 추정 알고리듬)

  • Yoon, Jae-Won;Kim, Yong-Tae;Sohn, Kwang-Hoon
    • Proceedings of the IEEK Conference / 2006.06a / pp.435-436 / 2006
  • In multi-view video coding, imbalances between multi-view images cause a serious problem because they degrade the performance of disparity estimation. To overcome this problem, we propose inter-view balanced disparity estimation for multi-view video coding. In general, the imbalance problem can be solved by a preprocessing step that transforms reference images linearly. However, preprocessing has its own problems, such as altering the original images. To obtain a balancing effect among the views, we instead perform block-based disparity estimation that includes several balancing parameters.


Solar-CTP : An Enhanced CTP for Solar-Powered Wireless Sensor Networks Using a Mobile Sink (Solar-CTP : 모바일 싱크 기반 태양 에너지 수집형 무선 센서 네트워크를 위한 향상된 CTP)

  • Cheong, Seok Hyun;Kang, Minjae;Noh, Dong Kun
    • KIPS Transactions on Computer and Communication Systems / v.9 no.4 / pp.77-82 / 2020
  • Wireless sensor networks (WSNs) suffer not only from a short lifetime due to limited energy but also from an energy imbalance between nodes close to the sink and the others. To fundamentally address the short lifetime, recent studies utilize environmental energy such as solar power. Additionally, WSNs using mobile sinks are being studied to address the energy imbalance problem. This paper proposes an improved CTP (Collection Tree Protocol) scheme that uses these two approaches simultaneously. It is based on CTP, a very popular data collection strategy designed for typical battery-based WSNs with a fixed sink, which we tailored for solar-powered WSNs with a mobile sink. Performance verification confirms that our scheme significantly reduces the number of blackout nodes compared to the typical CTP and thus increases the amount of data collected by the sink.

A Study on Establishing the School Grouping System of Middle School -Focusing the Middle School in Gwangju Metropolitan City- (중학교 학교군 및 중학구 설정을 위한 조사 연구 -광주광역시 중학교를 중심으로-)

  • Lee, Hwa-Ryoung;Ha, Bong-Woon;Dong, Jae-Wook
    • Journal of the Korean Institute of Educational Facilities / v.18 no.3 / pp.3-11 / 2011
  • This study proposes reform measures for the middle school grouping system in Gwangju Metropolitan City, which divides 86 middle schools into 10 clusters and 3 school districts. To do so, it analyzes the current educational environment and student walking distance in each school district, including the number of students per teacher, student density, school size, and the gender ratio in classes. It also surveys 5,363 middle school students, 3,966 parents, and 1,007 teachers to evaluate their satisfaction with, and needs regarding, the student allocation system. The survey and data analysis reveal problems in some school districts: gender imbalance in classes, a preference for private middle schools, and inconvenient commutes to school. To solve these problems, the study suggests alternatives to the current system. First, it sets out basic principles, detailed in three action plans, that emphasize adherence to close-range allocation, appropriate school and class sizes, and equalization of the educational environment. Second, it calls for an information system for managing school districts more objectively and transparently. Finally, it gives a concrete proposal that reorganizes the current 10 school groups into 11. The result is expected to ease the gender imbalance and the concentration of private middle schools, and to improve students' walking conditions to school.


A Novel Feature Selection Method in the Categorization of Imbalanced Textual Data

  • Pouramini, Jafar;Minaei-Bidgoli, Behrouze;Esmaeili, Mahdi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.8 / pp.3725-3748 / 2018
  • Text data distribution is often imbalanced. Imbalanced data is one of the challenges in text classification, as it degrades classifier performance. Many studies have been conducted on this issue. The proposed solutions fall into several general categories, including sampling-based and algorithm-based methods. In recent studies, feature selection has also been considered as a solution to the imbalance problem. In this paper, a novel one-sided feature selection method known as probabilistic feature selection (PFS) is presented for imbalanced text classification. PFS is a probabilistic method that is calculated using the feature distribution. Compared to similar methods, PFS has more parameters. To evaluate the performance of the proposed method, the feature selection methods Gini, MI, FAST, and DFS were implemented for comparison, and classifiers such as the C4.5 decision tree and Naive Bayes were used. Test results on Reuters-21578 and WebKB, measured by F-measure, suggested that the proposed feature selection significantly improved the performance of the classifiers.

A Study on EV Charging Scheme Using Load Control

  • Go, Hyo-Sang;Cho, In-Ho;Kim, Gil-Dong;Kim, Chul-Hwan
    • Journal of Electrical Engineering and Technology / v.12 no.5 / pp.1789-1797 / 2017
  • Electric vehicles must be charged before they can be driven, so electric vehicle charging facilities are essential. A household battery charger consumes about as much power as a household with a basic contract power of 3 kW. In addition, many consumers who own an electric vehicle will charge their vehicles at the same time. This simultaneous charging increases the load, which in turn leads to an imbalance of supply and demand in the distribution system. A smart charging scheme for electric vehicles is therefore essential. In this paper, simulated conditions were set up using real data from Korea in order to design a smart charging technique suited to the actual situation. The simulated conditions were used to present a smart charging technique that disperses electric vehicles that would otherwise be charged simultaneously. The EVs and the smart charging technique are modeled using the Electromagnetic Transients Program (EMTP).

Improved Focused Sampling for Class Imbalance Problem (클래스 불균형 문제를 해결하기 위한 개선된 집중 샘플링)

  • Kim, Man-Sun;Yang, Hyung-Jeong;Kim, Soo-Hyung;Cheah, Wooi Ping
    • The KIPS Transactions:PartB / v.14B no.4 / pp.287-294 / 2007
  • Many classification algorithms for real-world data suffer from a class imbalance problem. To solve this problem, various methods have been proposed, such as altering the training balance and designing better sampling strategies. Previous methods are unsatisfactory with respect to the distribution of the input data and the constraints. In this paper, we propose a focused sampling method that is superior to previous methods. To solve the problem, we must select a useful subset from the full training set. To obtain this subset, the proposed method divides the region according to scores that are computed from the distribution of SOM over the input data. The scores are sorted in ascending order. They represent the distribution of the input data, which may in turn represent the characteristics of the whole data. A new training dataset is obtained by eliminating data that are not useful, namely those located in the region between an upper bound and a lower bound. The proposed method gives classification accuracy that is better than, or at least similar to, that of previous approaches. It also offers several benefits: a reduced class imbalance ratio, smaller training sets, and prevention of over-fitting. The proposed method has been tested with a kNN classifier. An experimental result on the ecoli dataset shows that this method achieves precision up to 2.27 times that of the other methods.
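The elimination step described in the abstract above (sort the scores in ascending order, then drop the samples whose scores fall between a lower and an upper bound) can be sketched as follows. This is only an illustration under stated assumptions: the scores here are made-up placeholders rather than SOM-derived values, the bounds are treated as inclusive, and the name `focused_sample` is hypothetical. The paper's actual scoring and bound selection are not reproduced.

```python
def focused_sample(samples, scores, lower, upper):
    """Return the samples whose score lies outside [lower, upper],
    ordered by ascending score. Hypothetical helper for illustration;
    inclusive bounds are an assumption."""
    ranked = sorted(zip(scores, samples), key=lambda pair: pair[0])
    return [sample for score, sample in ranked
            if not (lower <= score <= upper)]


# Placeholder scores standing in for the SOM-based scores (assumed values).
samples = ["a", "b", "c", "d", "e"]
scores = [0.9, 0.1, 0.5, 0.4, 0.7]
kept = focused_sample(samples, scores, lower=0.3, upper=0.6)
```

With these placeholder values, samples "c" and "d" fall inside the bounds and are removed, which is how the approach shrinks the training set.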

Context-Dependent Video Data Augmentation for Human Instance Segmentation (인물 개체 분할을 위한 맥락-의존적 비디오 데이터 보강)

  • HyunJin Chun;JongHun Lee;InCheol Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.5 / pp.217-228 / 2023
  • Video instance segmentation is a highly complex intelligent visual task because it requires not only object instance segmentation for each image frame of a video but also accurate tracking of instances throughout the frame sequence. In particular, human instance segmentation in drama videos has a unique characteristic: it requires accurate tracking of several main characters interacting in various places and at various times. It is also characterized by a kind of class imbalance problem, because main characters appear far more frequently than supporting or auxiliary characters in drama videos. In this paper, we introduce a new human instance dataset called MHIS, built upon the drama video Miseang, and then propose a novel video data augmentation method, CDVA, to overcome the data imbalance between character classes. Unlike previous video data augmentation methods, the proposed CDVA generates more realistic augmented videos by deciding the optimal location within the background clip at which to insert a target human instance, taking into account the rich spatio-temporal context embedded in videos. The proposed augmentation method can therefore improve the performance of a deep neural network model for video instance segmentation. Through both quantitative and qualitative experiments on the MHIS dataset, we demonstrate the usefulness and effectiveness of the proposed video data augmentation method.