• Title/Summary/Keyword: Transformer model compression

Structured Pruning for Efficient Transformer Model Compression

  • Eunji Yoo;Youngjoo Lee
    • Transactions on Semiconductor Engineering / v.1 no.1 / pp.23-30 / 2023
  • With the recent development of generative AI technology by IT giants, the size of transformer models is increasing exponentially, into the trillions of parameters. To keep such AI services sustainable, it is essential to make the models lightweight. In this paper, we identify a hardware-friendly structured pruning pattern and propose a compression method for transformer models. Because the compression exploits the characteristics of the model's algorithm, the model size can be reduced while preserving performance as much as possible. Experiments on pruning the GPT-2 and BERT language models show that the proposed structured pruning performs nearly as well as fine-grained pruning, even at high sparsity. The approach reduces model parameters by 80% and enables hardware acceleration in structured form, with a 0.003% accuracy loss compared to fine-grained pruning.
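
The contrast the abstract draws between fine-grained and structured pruning can be sketched as follows. This is a minimal illustration only: the abstract does not describe the authors' actual pruning pattern, so whole-row removal is an assumed stand-in for "structured" pruning, and the function names and the matrix `W` are hypothetical.

```python
def fine_grained_prune(weights, sparsity):
    """Zero the individually smallest-magnitude weights (unstructured)."""
    ranked = sorted(
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    )
    k = int(len(ranked) * sparsity)  # number of single weights to drop
    drop = {(r, c) for _, r, c in ranked[:k]}
    return [
        [0.0 if (r, c) in drop else w for c, w in enumerate(row)]
        for r, row in enumerate(weights)
    ]


def structured_row_prune(weights, sparsity):
    """Zero whole rows with the smallest squared L2 norm.

    Assumption: row-wise removal stands in for the paper's
    hardware-friendly pattern, which the abstract does not specify.
    """
    ranked = sorted((sum(w * w for w in row), r) for r, row in enumerate(weights))
    k = int(len(weights) * sparsity)  # number of rows to drop
    drop = {r for _, r in ranked[:k]}
    return [
        [0.0] * len(row) if r in drop else list(row)
        for r, row in enumerate(weights)
    ]


# Hypothetical 4x4 weight matrix used only for illustration.
W = [
    [0.90, -0.10, 0.80, 0.05],
    [0.02, 0.03, -0.01, 0.04],
    [-0.70, 0.60, 0.20, -0.50],
    [0.10, -0.20, 0.15, 0.05],
]

unstructured = fine_grained_prune(W, 0.5)
structured = structured_row_prune(W, 0.5)
```

Both calls remove half of the weights, but the structured variant leaves entire rows empty, so the surviving rows can be stored and multiplied as a smaller dense matrix. Fine-grained pruning scatters zeros irregularly, which is harder for hardware to exploit.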

Knowledge Distillation based-on Internal/External Correlation Learning

  • Hun-Beom Bak;Seung-Hwan Bae
    • Journal of the Korea Society of Computer and Information / v.28 no.4 / pp.31-39 / 2023
  • In this paper, we propose an Internal/External Knowledge Distillation (IEKD), which utilizes both external correlations between feature maps of heterogeneous models and internal correlations between feature maps of the same model for transferring knowledge from a teacher model to a student model. To achieve this, we transform feature maps into a sequence format and extract new feature maps suitable for knowledge distillation by considering internal and external correlations through a transformer. We can learn both internal and external correlations by distilling the extracted feature maps and improve the accuracy of the student model by utilizing the extracted feature maps with feature matching. To demonstrate the effectiveness of our proposed knowledge distillation method, we achieved 76.23% Top-1 image classification accuracy on the CIFAR-100 dataset with the "ResNet-32×4/VGG-8" teacher and student combination and outperformed the state-of-the-art KD methods.

Analysis on the Deformation Characteristics of a Pillar between Large Caverns by Barton-Bandis Rock Joint Model

  • 강추원;임한욱;김치환
    • Tunnel and Underground Space / v.11 no.2 / pp.109-119 / 2001
  • Until now, a single large cavern has been excavated for each underground hydraulic powerhouse in Korea. The Yangyang underground hydraulic powerhouse, however, consists of two large caverns: a powerhouse cavern and a main transformer cavern. In this case, the structural stability of the caverns, and especially of the rock pillar formed between them, must be guaranteed so that the caverns remain permanently sustainable. In this research, the Distinct Element Method (DEM) was used to analyze the structural stability of the two caverns and the rock pillar, with the Barton-Bandis joint model as the constitutive model. The most significant parameters, such as in-situ stress, the JRC of in-situ natural joints, and the spatial distribution characteristics of discontinuities, were acquired through field investigation. In addition, two cases, 1) with no support system and 2) with a support system, were analyzed to optimize the support system and to investigate its reinforcing effects. The analysis showed that horizontal displacement and joint shear displacement were reduced with the support system. The relaxed zone in the rock pillar was also reduced with the support system. With the support system in place, non-zero minimum principal stresses were still acting in the rock pillar, so the pillar was under a triaxial rather than a uniaxial compressive condition. The structural stability of an approximately 36 m wide rock pillar between two large caverns was assured with the appropriate support system.
