• Title/Summary/Keyword: data augmentation method

Semantic Occlusion Augmentation for Effective Human Pose Estimation (가려진 사람의 자세추정을 위한 의미론적 폐색현상 증강기법)

  • Hyun-Jae, Bae;Jin-Pyung, Kim;Jee-Hyong, Lee
    • KIPS Transactions on Software and Data Engineering / v.11 no.12 / pp.517-524 / 2022
  • Human pose estimation estimates a person's posture by extracting human joint keypoints. When occlusion occurs, keypoint extraction performance drops because the joints are covered. Occlusion falls broadly into three types: self-occlusion, occlusion by other objects, and occlusion by the background. In this paper, we propose an effective pose estimation method that uses an occlusion augmentation technique. Although pose estimation has been studied continuously, research on occlusion in pose estimation remains relatively scarce. To address this, we propose a data augmentation technique that intentionally occludes human joints. The experimental results show that intentionally applying occlusion augmentation makes the model robust to occlusion and improves performance.
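
The idea of intentionally occluding joints can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `occlude_joints`, the square patch shape, the `prob` and `fill` parameters, and the choice to leave keypoint labels unchanged are all assumptions made for illustration.

```python
import random

def occlude_joints(image, keypoints, patch=3, prob=0.5, fill=0, rng=None):
    """Return a copy of `image` with square patches drawn over some joints.

    image     : 2D list of pixel values (rows of columns); hypothetical format
    keypoints : list of (x, y) integer joint coordinates
    patch     : side length of the square mask (assumed shape)
    prob      : probability of masking each joint independently
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]          # do not modify the input image
    half = patch // 2
    masked = []
    for (x, y) in keypoints:
        if rng.random() >= prob:
            continue                         # leave this joint visible
        for yy in range(max(0, y - half), min(height, y + half + 1)):
            for xx in range(max(0, x - half), min(width, x + half + 1)):
                out[yy][xx] = fill
        masked.append((x, y))
    # Keypoint labels are returned unchanged so the model must still predict
    # the occluded joints (an assumption about the training setup).
    return out, masked
```

Calling `occlude_joints(img, joints, prob=1.0)` masks every joint, while `prob=0.0` returns an unmodified copy, which makes it easy to mix clean and occluded samples in one batch.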

A Study of Fine Tuning Pre-Trained Korean BERT for Question Answering Performance Development (사전 학습된 한국어 BERT의 전이학습을 통한 한국어 기계독해 성능개선에 관한 연구)

  • Lee, Chi Hoon;Lee, Yeon Ji;Lee, Dong Hee
    • Journal of Information Technology Services / v.19 no.5 / pp.83-91 / 2020
  • Language models such as BERT have become an important component of deep learning-based natural language processing. Pre-training transformer-based language models is computationally expensive because they consist of deep and wide architectures built on attention mechanisms and require huge amounts of training data. Hence, it has become necessary to fine-tune large pre-trained language models released by Google or other companies that can afford the resources and cost. There are various techniques for fine-tuning language models, and this paper examines three: data augmentation, hyperparameter tuning, and partial reconstruction of the neural network. For data augmentation, we use no-answer augmentation and back-translation. We also identify useful hyperparameter combinations through a number of experiments. Finally, we add GRU and LSTM networks on top of the pre-trained BERT model to boost performance. Using these methods, we fine-tune a pre-trained Korean language model and raise the F1 score from the baseline to 89.66. Moreover, several failed attempts provide important lessons and point to directions for further work.
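
The abstract does not define no-answer augmentation. One common reading, assumed here, is that a question is paired with a context taken from a different example so that the correct output is "no answer". The sketch below follows that reading; the dictionary keys, the `ratio` parameter, and the function name are hypothetical, and it further assumes that a question is not answerable from another example's context.

```python
import random

def add_no_answer_examples(examples, ratio=0.5, rng=None):
    """Append unanswerable (question, context) pairs to a QA dataset.

    examples : list of dicts with "question", "context", "answer" keys
               (hypothetical schema); needs at least two entries
    ratio    : number of new pairs as a fraction of the original size
    """
    rng = rng or random.Random()
    augmented = list(examples)
    for _ in range(int(len(examples) * ratio)):
        # Draw two distinct examples: the question comes from one,
        # the context from the other.
        q_src, c_src = rng.sample(examples, 2)
        augmented.append({
            "question": q_src["question"],
            "context": c_src["context"],
            "answer": None,          # marks the pair as unanswerable
        })
    return augmented
```

In practice the mismatched pairs would need filtering when several examples share the same passage, since the question could then still be answerable.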

TAGS: Text Augmentation with Generation and Selection (생성-선정을 통한 텍스트 증강 프레임워크)

  • Kim Kyung Min;Dong Hwan Kim;Seongung Jo;Heung-Seon Oh;Myeong-Ha Hwang
    • KIPS Transactions on Software and Data Engineering / v.12 no.10 / pp.455-460 / 2023
  • Text augmentation is a methodology that creates new augmented texts by transforming or generating original texts in order to improve the performance of NLP models. However, existing text augmentation techniques have limitations such as a lack of expressive diversity, semantic distortion, and a limited number of augmented texts. Recently, text augmentation using large language models and few-shot learning has been able to overcome these limitations, but it carries a risk of introducing noise through incorrect generation. In this paper, we propose a text augmentation method called TAGS that generates multiple candidate texts and selects the appropriate ones as augmented texts. TAGS generates varied expressions using few-shot learning and effectively selects suitable data, even from a small amount of original text, by using contrastive learning and similarity comparison. We applied this method to task-oriented chatbot data and achieved a more than sixty-fold quantitative improvement. We also analyzed the generated texts and confirmed that they are semantically and expressively diverse compared to the original texts. Moreover, we trained and evaluated a classification model using the augmented texts and showed a performance improvement of more than 0.1915, confirming that the method helps improve actual model performance.
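
The selection step of a generate-then-select pipeline can be sketched as follows. This is an illustration only: a bag-of-words cosine similarity stands in for the contrastively learned representations mentioned in the abstract, and the `low` and `high` thresholds and both function names are assumptions.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two texts using word-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def select_candidates(original, candidates, low=0.3, high=0.95):
    """Keep candidates that stay close to the original without copying it.

    low  : below this similarity a candidate is treated as off-topic noise
    high : above this similarity a candidate is treated as a near duplicate
    """
    scored = [(cosine(original, c), c) for c in candidates]
    return [c for s, c in sorted(scored, reverse=True) if low <= s <= high]
```

The upper threshold reflects the goal of expressive diversity, and the lower one guards against semantic distortion; with learned sentence embeddings only `cosine` would need to be replaced.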

Automatic proficiency assessment of Korean speech read aloud by non-natives using bidirectional LSTM-based speech recognition

  • Oh, Yoo Rhee;Park, Kiyoung;Jeon, Hyung-Bae;Park, Jeon Gue
    • ETRI Journal / v.42 no.5 / pp.761-772 / 2020
  • This paper presents an automatic proficiency assessment method for a non-native Korean read utterance using bidirectional long short-term memory (BLSTM)-based acoustic models (AMs) and speech data augmentation techniques. Specifically, the proposed method considers two scenarios, with and without prompted text. The proposed method with the prompted text performs (a) a speech feature extraction step, (b) a forced-alignment step using a native AM and non-native AM, and (c) a linear regression-based proficiency scoring step for the five proficiency scores. Meanwhile, the proposed method without the prompted text additionally performs Korean speech recognition and a subword un-segmentation for the missing text. The experimental results indicate that the proposed method with prompted text improves the performance for all scores when compared to a method employing conventional AMs. In addition, the proposed method without the prompted text has a fluency score performance comparable to that of the method with prompted text.

Data Augmentation Method for Deep Learning based Medical Image Segmentation Model (딥러닝 기반의 대퇴골 영역 분할을 위한 훈련 데이터 증강 연구)

  • Choi, Gyujin;Shin, Jooyeon;Kyung, Joohyun;Kyung, Minho;Lee, Yunjin
    • Journal of the Korea Computer Graphics Society / v.25 no.3 / pp.123-131 / 2019
  • In this study, we propose a method that augments the training data of a convolutional neural network for femur region segmentation by modifying CT images of the femoral head in an anatomically meaningful way. First, a femur mesh model is obtained from the CT image. Next, the mesh model is divided into meaningful parts using cluster analysis on the geometric characteristics of the mesh surface. Finally, the segments are transformed using an appropriate mesh deformation algorithm, and new CT images are created by warping the original CT images accordingly. Deep learning models trained with the proposed data augmentation method show better segmentation performance than those trained with commonly used data augmentation methods such as geometric or color transformations.

Vector-Based Data Augmentation and Network Learning for Efficient Crack Data Collection (효율적인 균열 데이터 수집을 위한 벡터 기반 데이터 증강과 네트워크 학습)

  • Kim, Jong-Hyun
    • Journal of the Korea Computer Graphics Society / v.28 no.2 / pp.1-9 / 2022
  • In this paper, we propose a vector-based augmentation technique that generates the data required for crack detection and a ConvNet (Convolutional Neural Network) technique that learns from it. Detecting cracks quickly and accurately is important for preventing building collapses and fall accidents. Solving this problem with artificial intelligence requires a large amount of data, but crack data are difficult to collect because capturing actual crack images is usually dangerous. This database construction problem can be alleviated with elastic distortion, which increases the amount of data by artificially deforming specific parts of an image. In this paper, the improved crack pattern results are modeled using a ConvNet. Compared with elastic distortion, our method produces results closer to actual crack patterns. Because the crack data augmentation is designed on a vector basis rather than on the pixel basis used in general data augmentation, it yields excellent results in terms of the amount of crack variation. As a result, even with a small number of crack samples as input, a crack database can be constructed efficiently by generating various crack directions and patterns.
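
To illustrate what augmenting in vector space rather than pixel space can look like, the sketch below represents a crack as a polyline, perturbs its vertices, and only then rasterizes it. The abstract does not specify the representation or the transforms, so the polyline model, the rotation-plus-jitter transform, and both function names are assumptions.

```python
import math
import random

def augment_polyline(points, angle_deg=0.0, jitter=0.0, rng=None):
    """Rotate a polyline about its centroid and jitter each vertex."""
    rng = rng or random.Random()
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        out.append((cx + dx * cos_a - dy * sin_a + rng.uniform(-jitter, jitter),
                    cy + dx * sin_a + dy * cos_a + rng.uniform(-jitter, jitter)))
    return out

def rasterize(points, width, height):
    """Draw the polyline into a binary grid by sampling along each segment."""
    grid = [[0] * width for _ in range(height)]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        steps = int(max(abs(x1 - x0), abs(y1 - y0))) + 1
        for i in range(steps + 1):
            t = i / steps
            x, y = round(x0 + t * (x1 - x0)), round(y0 + t * (y1 - y0))
            if 0 <= x < width and 0 <= y < height:   # clip to the image
                grid[y][x] = 1
    return grid
```

Because the transform acts on vertices, a single traced crack can yield many directions and shapes before any pixels are produced, which is the property the abstract attributes to the vector-based design.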

Projection Loss for Point Cloud Augmentation (점운증강을 위한 프로젝션 손실)

  • Wu, Chenmou;Lee, Hyo-Jone
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.482-484 / 2019
  • Learning and analyzing 3D point clouds with deep networks is challenging due to the limited amount and irregularity of the data. In this paper, we present a data-driven point cloud augmentation technique. The key idea is to learn multilevel features per point and to reconstruct a similar point set. Our network is trained with a projection loss function that encourages the predicted points to remain on the geometric shape of a particular target. We conduct various experiments on ShapeNet part data to evaluate our method and demonstrate its feasibility. Results show that our generated points have a similar shape and are located closer to the object.

SMD Detection and Classification Using YOLO Network Based on Robust Data Preprocessing and Augmentation Techniques

  • NDAYISHIMIYE, Fabrice;Lee, Joon Jae
    • Journal of Multimedia Information System / v.8 no.4 / pp.211-220 / 2021
  • Inspecting SMDs on PCBs improves product quality and performance and reduces frequent issues in this field. However, undesirable events such as assembly failure and device breakdown can sometimes occur during the assembly process, resulting in costly and time-consuming losses. Detecting these components with a deep learning model may help reduce errors during inspection in the manufacturing process. In this paper, YOLO models are used because of their high speed and good accuracy in classification and target detection. We propose an SMD detection and classification method using YOLO networks, based on robust data preprocessing and augmentation techniques that handle variations such as illumination and geometric changes. For data on nine different components provided by a PCB manufacturer, the experimental results show that YOLOv4 performs better than YOLOv3, with fast detection and classification.
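
Illumination and geometric augmentation for detection data differ in one important way: a geometric change must also update the bounding boxes. The sketch below shows one example of each. The specific transforms, the `(x_min, y_min, x_max, y_max)` box format with an exclusive `x_max`, and the function names are assumptions, since the abstract does not list the augmentations used.

```python
def adjust_brightness(image, factor):
    """Scale pixel intensities and clip to the 8-bit range [0, 255]."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]

def hflip_with_boxes(image, boxes):
    """Flip an image left-to-right and mirror its bounding boxes.

    boxes : list of (x_min, y_min, x_max, y_max) in pixel coordinates,
            with x_max exclusive (assumed convention)
    """
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    # Mirroring swaps the roles of the left and right box edges.
    new_boxes = [(width - x_max, y_min, width - x_min, y_max)
                 for (x_min, y_min, x_max, y_max) in boxes]
    return flipped, new_boxes
```

Brightness changes leave the boxes untouched, so `adjust_brightness` can be applied to the image alone, whereas every geometric transform needs a paired box update like the one in `hflip_with_boxes`.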

A Study on the Brassiere Wearing Evaluation for Augmentation of Mammaplasty Patients (시판 유방 확대 수술 환자용 브래지어의 착의 평가)

  • Yi, Kyong-Hwa;Nam, Young-Ran
    • Journal of the Korean Society of Clothing and Textiles / v.42 no.5 / pp.737-752 / 2018
  • The frequency of breast augmentation surgery continues to increase annually; however, the method of follow-up care varies from hospital to hospital. In particular, many different types of post-operative bras are available on the market. This study evaluated the wearing comfort of various commercial bras that were worn immediately after breast enlargement surgery prior to the manufacture of the bra. Based on interviews with medical professionals and market research, five types of brassiere were selected and evaluated for wearing satisfaction, functional performance, and external appearance by six subjects who had undergone breast augmentation surgery. The evaluation questionnaires used a 5-point Likert scale, and the data were analyzed using SPSS 20.0. The results revealed that the bra with the highest degree of satisfaction was the CNB (without bra cup) type. However, the CNB type drew dissatisfaction on functional evaluation questions regarding breast shaking and on material and tactile sensation. In the future, it is necessary to develop a new post-operative brassiere based on the CNB type, which received the best evaluation. However, it is also necessary to identify the merits of the other four experimental bras and reflect these advantages.

Improving transformer-based speech recognition performance using data augmentation by local frame rate changes (로컬 프레임 속도 변경에 의한 데이터 증강을 이용한 트랜스포머 기반 음성 인식 성능 향상)

  • Lim, Seong Su;Kang, Byung Ok;Kwon, Oh-Wook
    • The Journal of the Acoustical Society of Korea / v.41 no.2 / pp.122-129 / 2022
  • In this paper, we propose a method to improve the performance of Transformer-based speech recognizers using data augmentation that locally adjusts the frame rate. First, the start time and length of the part to be augmented in the original voice data are randomly selected. Then, the frame rate of the selected part is changed to a new frame rate by using linear interpolation. Experimental results on the Wall Street Journal and LibriSpeech speech databases showed that convergence took longer than with the baseline, but recognition accuracy improved in most cases. To further improve performance, various parameters such as the length and speed of the selected parts were optimized. The proposed method achieved relative performance improvements of 11.8% and 14.9% over the baseline on the Wall Street Journal and LibriSpeech speech databases, respectively.
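
The two steps described above (pick a random segment, then resample it by linear interpolation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: features are a list of per-frame vectors, a `rate` above 1 means the segment is sped up (fewer frames), and all function and parameter names are hypothetical.

```python
import random

def resample_segment(frames, new_len):
    """Linearly interpolate a list of feature vectors to `new_len` frames."""
    old_len = len(frames)
    out = []
    for i in range(new_len):
        pos = i * (old_len - 1) / (new_len - 1)   # position on the old time axis
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out

def local_frame_rate_change(frames, start, length, rate):
    """Resample frames[start:start+length] by `rate`; keep the rest as is."""
    segment = frames[start:start + length]
    if len(segment) < 2:
        return list(frames)                       # nothing to interpolate
    new_len = max(2, round(len(segment) / rate))
    return (frames[:start] + resample_segment(segment, new_len)
            + frames[start + length:])

def random_local_frame_rate_change(frames, max_length, rates, rng=None):
    """Pick the segment start, length, and rate at random, then apply."""
    rng = rng or random.Random()
    length = rng.randint(2, max_length)           # needs 2 <= max_length <= len(frames)
    start = rng.randint(0, len(frames) - length)
    return local_frame_rate_change(frames, start, length, rng.choice(rates))
```

Only the selected segment changes duration, so the utterance length shifts by the difference between the old and new segment lengths while the transcript stays the same.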