• Title/Abstract/Keywords: Imbalance Data

The Detection of Online Manipulated Reviews Using Machine Learning and GPT-3 (기계학습과 GPT3를 시용한 조작된 리뷰의 탐지)

  • Chernyaeva, Olga;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.347-364 / 2022
  • Fraudulent companies or sellers strategically manipulate reviews to influence customers' purchase decisions; therefore, the reliability of reviews has become crucial for customer decision-making. Because customers increasingly rely on online reviews to find detailed information about products or services before purchasing, many researchers focus on detecting manipulated reviews. However, the main obstacle is the difficulty of obtaining enough manipulated reviews to train machine learning models. In addition, manipulated reviews are far fewer than non-manipulated reviews, so a class imbalance problem arises. The class with fewer examples is under-represented, which can hamper a model's accuracy, so solving the class imbalance problem is important for building an accurate model for detecting manipulated reviews. We therefore propose an OpenAI-based review generation model to address the imbalance in manipulated reviews and thereby improve detection accuracy. In this research, we applied the autoregressive language model GPT-3 to generate reviews based on manipulated reviews. We found that using GPT-3 to oversample manipulated reviews recovers a satisfactory portion of the performance loss and yields better classification performance (logit, decision tree, neural networks) than traditional oversampling methods such as random oversampling and SMOTE.
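
The abstract above names random oversampling as one of the traditional baselines. As a rough illustration of that baseline only (not the GPT-3 approach the authors propose), the sketch below duplicates randomly chosen minority-class samples until the classes are balanced. The function name `random_oversample` and the toy review strings are hypothetical and do not come from the paper.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels

# Toy data (made up): label 1 marks a manipulated review.
reviews = ["great", "ok", "fine", "nice", "fake!"]
labels = [0, 0, 0, 0, 1]
X, y = random_oversample(reviews, labels)
```

Because the minority samples are only copied, this baseline adds no new information, which is the limitation that synthetic generation methods try to address.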

Effective Gait Imbalance Judgment Method based on Thigh Location (대퇴부 위치 기반 효과적인 보행 불균형 측정 방법)

  • Kim, Seojun;Kim, Yoohyun;Shim, Hyeonmin;Lee, Sangmin
    • The Transactions of The Korean Institute of Electrical Engineers / v.63 no.4 / pp.541-545 / 2014
  • In this paper, the thigh angles that appear during walking are used to estimate the balance between the left and right legs during normal walking. To overcome the limitations of gait analysis based on image processing or foot pressure, which has been widely used in previous work, the thigh angle was used to estimate asymmetric gait. The test subjects were five healthy adult males, and gait cycle data were obtained from 10 repetitions. For this research, a thigh-angle measurement device was developed and attached in a position of $20^{\circ}$ for flexion and $15^{\circ}$ for extension to measure the thigh angle. To verify the reliability of asymmetric gait estimation using the thigh angle, the results were compared with asymmetric gait estimation using foot pressure. The results show that using the thigh angle determines gait imbalance with an accuracy that is, on average, 16.84% higher than using foot pressure.

The associations between dietary behavior and subjective measurements of serious dental diseases in nursing home staff (일부 병원종사자의 식행동과 주관적 중대 구강병과의 연관성)

  • Shim, Youn-Soo;An, So-Youn;Park, So-Young
    • Journal of Korean society of Dental Hygiene / v.13 no.3 / pp.377-385 / 2013
  • Objectives: The objective of this study is to determine the associations between dietary behavior and subjective measurements of dental caries and periodontal disease in a cohort of nursing home staff. Methods: A self-reported survey was carried out among 280 nursing home staff in Jeollabukdo Province, Korea. The collected data were analyzed using SPSS Version 19.0. Multiple regression analysis was conducted to examine the effects of dietary behavior and food intake on subjective measurements of the two serious dental diseases. Results: Irregular meals tended to increase dietary imbalance and periodontal disease among the nursing staff. For example, they influenced the imbalance of sugar, vegetable, and seafood intake. Conclusions: It is important to eat regular meals because irregular eating behavior tended to increase dietary imbalance and periodontal disease among the nursing staff.

The Effect of Knee Muscle Imbalance on Motion of Back Squat (무릎 근력의 불균형이 백 스쿼트 동작에 미치는 영향)

  • Sohn, Jee-Hoon
    • Journal of Digital Convergence / v.17 no.3 / pp.463-471 / 2019
  • The purpose of this study was to investigate the effect of muscle imbalance on the motion of the back squat. The isokinetic muscle strength of 8 subjects was recorded for knee flexion/extension with a Cybex 770 dynamometer. Each subject performed 3 back squats with a long barbell at intensities of 25% body weight (BW), 50% BW, 100% BW, and 125% BW. From the kinematic data recorded during the back squat, the subjects' maximum knee flexion and extension angles, center of mass (COM) displacement, and V-COP were calculated to evaluate the stability of the movement. An independent t-test was used for the statistical analysis. Knee flexion angle and COM displacement were dominated by the reciprocal muscle ratio. The V-COP factor was dominated by the bilateral extension deficit. The results indicate that as the squat intensity increased, control became difficult because the muscle imbalance influenced the movement.

Multiple linear regression model-based voltage imbalance estimation for high-power series battery pack (다중선형회귀모델 기반 고출력 직렬 배터리 팩의 전압 불균형 추정)

  • Kim, Seung-Woo;Lee, Pyeong-Yeon;Han, Dong-Ho;Kim, Jong-hoon
    • Journal of IKEEE / v.23 no.1 / pp.1-8 / 2019
  • In this paper, the electrical characteristics at various C-rates are tested with a high-power series battery pack composed of 18650 cylindrical nickel cobalt aluminum (NCA) lithium-ion batteries. The electrical characteristics from a discharge capacity test with a 14S1P battery pack and an electric vehicle (EV) cycle test with a 4S1P battery pack are compared and analyzed across the C-rates. Multiple linear regression is used to estimate the voltage imbalance of the 14S1P and 4S1P battery packs at various C-rates based on experimental data. The estimation accuracy is evaluated by root mean square error (RMSE) to validate the multiple linear regression. The results show that multiple linear regression estimates the voltage imbalance better for the discharge capacity test with the 14S1P battery pack than for the EV cycle test with the 4S1P battery pack.
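
The abstract above pairs multiple linear regression with RMSE as the accuracy measure. The sketch below shows that combination in plain Python by solving the normal equations. The two features (C-rate and a second, unnamed predictor), the numbers, and the helper names `fit_linear`, `predict`, and `rmse` are illustrative assumptions; the paper's actual predictors and measurements are not given in the abstract.

```python
import math

def fit_linear(X, y):
    """Least-squares fit of y ~ b0 + b1*x1 + ... via the normal equations."""
    rows = [[1.0] + list(r) for r in X]          # prepend intercept column
    p = len(rows[0])
    # Augmented system [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(M[k][c]))
        M[c], M[piv] = M[piv], M[c]
        for k in range(p):
            if k != c:
                f = M[k][c] / M[c][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict(coef, x):
    """Evaluate the fitted model at one feature vector."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

# Made-up data: (C-rate, second predictor) -> voltage imbalance.
# Generated exactly from y = 5 + 10*x1 + 20*x2, so the fit should recover it.
X = [(0.5, 0.9), (1.0, 0.7), (2.0, 0.5), (3.0, 0.2), (1.5, 0.8)]
y = [28.0, 29.0, 35.0, 39.0, 36.0]
coef = fit_linear(X, y)
error = rmse(y, [predict(coef, x) for x in X])
```

On real measurements the RMSE would be computed on held-out data rather than on the training points used here.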

Influences and Compensation of Phase Noise and IQ Imbalance in Multiband DFT-S OFDM System for the Spectrum Aggregation (스펙트럼 집성을 위한 멀티 밴드 DFT-S OFDM 시스템에서 직교 불균형과 위상 잡음의 영향 분석 및 보상)

  • Ryu, Sang-Burm;Ryu, Heung-Gyoon;Choi, Jin-Kyu;Kim, Jin-Up
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.21 no.11 / pp.1275-1284 / 2010
  • A 100 MHz bandwidth and a 1 Gbit/s data rate are required in LTE-Advanced, the next-generation mobile communication system. Spectrum aggregation has therefore been studied recently to extend the usable frequency bands. Bandwidth utilization also increases because vacant frequencies are used for communication. However, the transceiver structure requires digital RF and SDR. The frequency synthesizer and PA must therefore operate over a wide bandwidth, and RF impairments in the transceiver also increase. The LTE-Advanced uplink uses DFT-S OFDM with multiple power amplifiers. ICI increases in the frequency domain at the receiver because of phase noise and IQ imbalance. In this paper, we analyze the influence of ICI in the frequency domain at the receiver, considering phase noise and IQ imbalance in a multiband system. We also separate the effects of phase noise and IQ imbalance from the channel response in the frequency domain of the uplink system, and we propose a method to estimate the channel accurately and to compensate for IQ imbalance and phase noise. Simulation results show that the proposed method achieves a 2 dB performance gain at BER = $10^{-4}$.

A Load-Balancing Approach Using an Improved Simulated Annealing Algorithm

  • Hanine, Mohamed;Benlahmar, El-Habib
    • Journal of Information Processing Systems / v.16 no.1 / pp.132-144 / 2020
  • Cloud computing is an emerging technology based on the concept of enabling data access from anywhere, at any time, from any platform. The exponential growth of cloud users has given rise to multiple issues, such as the workload imbalance between the virtual machines (VMs) of data centers in a cloud environment, which greatly impacts overall performance. Our research focuses on load balancing across a data center's VMs. It aims to reduce the degree of load imbalance between those VMs so that resources are better utilized, thus ensuring a greater quality of service. Our approach balances the workload between the VMs in two steps. The first step determines the threshold of each VM before it can be considered overloaded. The second step allocates tasks to the VMs by relying on an improved and faster version of the meta-heuristic "simulated annealing (SA)". We mainly focused on the acceptance probability of the SA because, by modifying the content of the acceptance probability, we could ensure that the SA offers a smart task distribution between the VMs in fewer loops than a classical usage of the SA.
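
For context on the acceptance probability mentioned above, here is a minimal sketch of classical simulated annealing applied to task-to-VM assignment. It uses the standard Metropolis acceptance rule, not the modified rule the authors propose (the abstract does not describe it). The cost function (maximum load minus minimum load), the cooling schedule, and all names and parameter values are assumptions chosen for illustration.

```python
import math
import random

def acceptance_probability(delta, temperature):
    """Classical Metropolis rule: always accept improvements (delta <= 0);
    accept a worse move with probability exp(-delta / temperature)."""
    if delta <= 0:
        return 1.0
    return math.exp(-delta / temperature)

def imbalance(assignment, task_costs, n_vms):
    """Cost to minimize: gap between the most and least loaded VM."""
    loads = [0.0] * n_vms
    for task, vm in enumerate(assignment):
        loads[vm] += task_costs[task]
    return max(loads) - min(loads)

def anneal(task_costs, n_vms, t0=10.0, cooling=0.95, steps=500, seed=0):
    rng = random.Random(seed)
    current = [0] * len(task_costs)          # start with every task on VM 0
    cur_cost = imbalance(current, task_costs, n_vms)
    best, best_cost = current[:], cur_cost
    t = t0
    for _ in range(steps):
        # Neighbor move: reassign one random task to a random VM
        candidate = current[:]
        candidate[rng.randrange(len(candidate))] = rng.randrange(n_vms)
        cand_cost = imbalance(candidate, task_costs, n_vms)
        if rng.random() < acceptance_probability(cand_cost - cur_cost, t):
            current, cur_cost = candidate, cand_cost
            if cur_cost < best_cost:
                best, best_cost = current[:], cur_cost
        t = max(t * cooling, 1e-6)           # geometric cooling with a floor
    return best, best_cost

costs = [4, 3, 3, 2, 2, 2]                   # made-up task costs
best_assignment, best_imbalance = anneal(costs, n_vms=2)
```

Changing `acceptance_probability` is the natural place to experiment with variants, since it alone decides how often worse assignments are tolerated at a given temperature.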

Automated Facial Wrinkle Segmentation Scheme Using UNet++

  • Hyeonwoo Kim;Junsuk Lee;Jehyeok Rew;Eenjun Hwang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.8 / pp.2333-2345 / 2024
  • Facial wrinkles are widely used to evaluate skin condition or aging in various fields such as skin diagnosis, plastic surgery consultations, and cosmetic recommendations. To process facial wrinkles effectively in facial image analysis, accurate wrinkle segmentation is required to identify wrinkled regions. Existing deep learning-based methods have difficulty segmenting fine wrinkles because of insufficient wrinkle data and the imbalance between wrinkle and non-wrinkle data. Therefore, in this paper, we propose a new facial wrinkle segmentation method based on a UNet++ model. Specifically, we construct a new facial wrinkle dataset by manually annotating fine wrinkles across the entire face. We then extract only the skin region from the facial image using a facial landmark point extractor. Lastly, we train the UNet++ model using both dice loss and focal loss to alleviate the class imbalance problem. To validate the effectiveness of the proposed method, we conduct comprehensive experiments using our facial wrinkle dataset. The experimental results showed that the proposed method was superior to the latest wrinkle segmentation method by 9.77%p and 10.04%p in IoU and F1 score, respectively.
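
The abstract above trains with both dice loss and focal loss to counter class imbalance. The sketch below shows one common way to define and combine the two on a flattened list of per-pixel probabilities. How the authors weight or combine the losses is not stated in the abstract, so the equal weighting and the `gamma` and `alpha` defaults here are assumptions, as are the function names.

```python
import math

def dice_loss(probs, targets, eps=1e-6):
    """1 - Dice coefficient; measures overlap, so it is insensitive to
    the large number of easy background pixels."""
    inter = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: cross-entropy scaled by (1 - p_t)^gamma so that
    well-classified pixels contribute little."""
    loss = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
        pt = p if t == 1 else 1.0 - p        # probability of the true class
        a = alpha if t == 1 else 1.0 - alpha
        loss += -a * (1.0 - pt) ** gamma * math.log(pt)
    return loss / len(probs)

def combined_loss(probs, targets, weight=0.5):
    """Weighted sum of the two losses (weight is an assumed hyperparameter)."""
    return weight * dice_loss(probs, targets) + (1 - weight) * focal_loss(probs, targets)

# Toy example: one wrinkle pixel among eight.
targets = [0, 0, 0, 0, 0, 0, 0, 1]
good = [0.1] * 7 + [0.9]                     # wrinkle pixel detected
bad = [0.1] * 7 + [0.1]                      # wrinkle pixel missed
loss_good = combined_loss(good, targets)
loss_bad = combined_loss(bad, targets)
```

In this toy case a plain accuracy measure would rate both predictions similarly, while the combined loss penalizes the missed minority pixel much more heavily.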

Improved Focused Sampling for Class Imbalance Problem (클래스 불균형 문제를 해결하기 위한 개선된 집중 샘플링)

  • Kim, Man-Sun;Yang, Hyung-Jeong;Kim, Soo-Hyung;Cheah, Wooi Ping
    • The KIPS Transactions:PartB / v.14B no.4 / pp.287-294 / 2007
  • Many classification algorithms for real-world data suffer from the class imbalance problem. To solve this problem, various methods have been proposed, such as altering the training balance and designing better sampling strategies. However, previous methods do not adequately account for the distribution of the input data and its constraints. In this paper, we propose a focused sampling method that is superior to previous methods. To solve the problem, a useful subset must be selected from the full training set. To obtain this subset, the proposed method divides the region according to scores that are computed based on the distribution of the SOM over the input data. The scores are sorted in ascending order. They represent the distribution of the input data, which may in turn represent the characteristics of the whole data. A new training dataset is obtained by eliminating the less useful data located in the region between an upper bound and a lower bound. The proposed method gives better, or at least similar, classification accuracy compared with previous approaches. It also offers several benefits: a reduced class imbalance ratio, a smaller training set, and prevention of over-fitting. The proposed method has been tested with a kNN classifier. An experimental result on the ecoli data set shows that this method achieves precision up to 2.27 times that of the other methods.

A Hybrid Oversampling Technique for Imbalanced Structured Data based on SMOTE and Adapted CycleGAN (불균형 정형 데이터를 위한 SMOTE와 변형 CycleGAN 기반 하이브리드 오버샘플링 기법)

  • Jung-Dam Noh;Byounggu Choi
    • Information Systems Review / v.24 no.4 / pp.97-118 / 2022
  • As generative adversarial network (GAN) based oversampling techniques have achieved impressive results on class imbalance in unstructured datasets such as images, many studies have begun to apply them to the imbalance problem in structured datasets. However, these studies have failed to reflect the characteristics of structured data because they convert the data structure into an unstructured data format. To overcome this limitation, this study adapted CycleGAN to reflect the characteristics of structured data and proposed a hybridization of the synthetic minority oversampling technique (SMOTE) and the adapted CycleGAN. In particular, this study tried to overcome the limitations of existing studies by using a one-dimensional convolutional neural network, unlike previous studies that used two-dimensional convolutional neural networks. Oversampling based on the proposed method was tested on various datasets, and its performance was compared with existing oversampling methods such as SMOTE and adaptive synthetic sampling (ADASYN). The results indicated that the proposed hybrid oversampling method outperformed the existing methods when the data have more dimensions or a higher degree of imbalance. This study implies that the classification performance on oversampled structured data can be improved by using the proposed hybrid oversampling method, which considers the characteristics of structured data.
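
SMOTE, one half of the hybrid described above, creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. The sketch below illustrates only that interpolation step in plain Python; it is not the authors' hybrid method and omits the CycleGAN component entirely. The function name `smote`, the choice `k=2`, and the toy points are assumptions.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points, each on the line segment between a
    random minority point and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of `base` (excluding base itself),
        # ranked by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()                   # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, nb)))
    return synthetic

# Toy minority class: the four corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=5)
```

Because every synthetic point lies between two existing minority points, the method assumes that straight-line interpolation in feature space is meaningful, which is one reason it can struggle with categorical or high-dimensional structured data.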