• Title/Summary/Keyword: Model pruning

Model Structuring Technique by A Knowledge Representation Scheme: A FMS Fractal Architecture Example (지식 표현 기법을 이용한 모델 구조의 표현과 구성 : 단편구조 유연생산 시스템 예)

  • 조대호
    • Journal of the Korea Society for Simulation / v.4 no.1 / pp.1-11 / 1995
  • The model of an FMS (Flexible Manufacturing System) admits a natural hierarchical decomposition into highly decoupled units with similar structure and control. The FMS fractal architecture model represents a hierarchical structure built from elements of a single basic design. An SES (System Entity Structure) is a structural knowledge representation scheme that contains the knowledge of decomposition, taxonomy, and coupling relationships of a system needed to direct model synthesis. A substructure of an SES is extracted for use as the skeleton of a model. This substructure is called a pruned SES, and the operation that extracts a pruned SES from an SES is called pruning (or the pruning operation). This paper presents a pruning operation called recursive pruning. It is applied to an SES to generate a model structure whose substructure contains copies of itself, as in the FMS fractal architecture. Another pruning operation, called delay pruning, is also presented. Combined with recursive pruning, delay pruning is a useful tool for representing and constructing complex systems.

Effect of Potential Model Pruning on Official-Sized Board in Monte-Carlo GO

  • Oshima-So, Makoto
    • International Journal of Computer Science & Network Security / v.21 no.6 / pp.54-60 / 2021
  • Monte-Carlo GO is a computer GO program that is sufficiently competent without using knowledge expressions of IGO. Although it is computationally intensive, the computational complexity can be reduced by properly pruning the IGO game tree. Here, I achieve this by using a potential model based on the knowledge expressions of IGO. The potential model treats GO stones as potentials. A unique arrangement of stones on the GO board produces a specific potential distribution on the board. Pruning with the potential model classifies legal moves as effective or ineffective according to a potential threshold. Several pruning strategies based on potentials and potential gradients are evaluated experimentally. The robustness of each strategy is assessed on boards of different sizes, including an official-sized board. I demonstrate that pruning with a potential model reduces the computational complexity of GO and that this effect is robust across board sizes.

Filter Contribution Recycle: Boosting Model Pruning with Small Norm Filters

  • Chen, Zehong;Xie, Zhonghua;Wang, Zhen;Xu, Tao;Zhang, Zhengrui
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.11 / pp.3507-3522 / 2022
  • Model pruning methods have recently attracted great attention owing to the increasing demand for deploying models on low-resource devices. Most existing methods use the weight norm of a filter to represent its importance and directly discard the filters with small norms to reach the pruning target, which ignores the contribution of those small-norm filters. This not only wastes filter contribution but also yields performance merely comparable to training from randomly initialized weights [1]. In this paper, we point out that discarding the small-norm filters directly can greatly harm the performance of the pruned model. We therefore propose a novel filter contribution recycle (FCR) method for structured model pruning to resolve the aforementioned problem. FCR collects and reassembles the contribution of the small-norm filters into a mixed contribution collector, and then assigns the reassembled contribution to other filters that have a higher probability of being preserved. To reach the target FLOPs, FCR also applies a weight decay strategy to the small-norm filters. To evaluate the effectiveness of our approach, extensive experiments are conducted on the ImageNet2012 and CIFAR-10 datasets, and superior results are reported in comparison with other methods under the same or even greater FLOPs reduction. In addition, our method can be flexibly combined with other pruning criteria.
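
The norm-based baseline that this abstract criticizes can be sketched as follows. This is a minimal illustration of plain small-norm filter removal, not of FCR itself, whose details the abstract does not give. The function names, the use of the L1 norm, and the toy filters are assumptions made for the example.

```python
def filter_l1_norms(filters):
    """Return the L1 norm of each filter (each filter is a flat list of weights)."""
    return [sum(abs(w) for w in f) for f in filters]


def prune_small_norm_filters(filters, keep_ratio):
    """Keep the `keep_ratio` fraction of filters with the largest L1 norm.

    Returns the indices of the kept filters in their original order.
    The discarded filters are simply dropped, which is the behavior
    the abstract says wastes their contribution.
    """
    norms = filter_l1_norms(filters)
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy layer with four filters of three weights each (hypothetical values).
filters = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, 0.4, -0.6],
    [0.05, -0.03, 0.02],
]
kept = prune_small_norm_filters(filters, keep_ratio=0.5)
print(kept)  # indices of the two filters with the largest L1 norm
```

Structured pruning of this kind removes whole filters, so the resulting layer is smaller and dense rather than sparse.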

Structured Pruning for Efficient Transformer Model compression (효율적인 Transformer 모델 경량화를 위한 구조화된 프루닝)

  • Eunji Yoo;Youngjoo Lee
    • Transactions on Semiconductor Engineering / v.1 no.1 / pp.23-30 / 2023
  • With the recent development of generative AI technology by IT giants, the size of transformer models is growing exponentially, past the trillion scale. To keep these AI services sustainable, it is essential to make the models lightweight. In this paper, we find a hardware-friendly structured pruning pattern and propose a method for lightweighting transformer models. Because compression exploits the characteristics of the model algorithm, the model size can be reduced while performance is maintained as much as possible. Experiments on the GPT-2 and BERT language models show that the proposed structured pruning achieves nearly the same performance as fine-grained pruning, even at high sparsity. This approach reduces model parameters by 80% and allows hardware acceleration in structured form with 0.003% accuracy loss compared to fine-tuned pruning.

A Study on Maritime Object Image Classification Using a Pruning-Based Lightweight Deep-Learning Model (가지치기 기반 경량 딥러닝 모델을 활용한 해상객체 이미지 분류에 관한 연구)

  • Younghoon Han;Chunju Lee;Jaegoo Kang
    • Journal of the Korea Institute of Military Science and Technology / v.27 no.3 / pp.346-354 / 2024
  • Deep learning models require high computing power because of their substantial amount of computation, which makes them difficult to use on devices with limited computing environments, such as coastal surveillance equipment. In this study, a lightweight model is constructed by analyzing the weight changes of the convolutional layers of MobileNet during training and then pruning the layers that affect the model least. The performance comparison shows that the lightweight model maintains performance while reducing computational load, parameter count, model size, and data processing time. This study thus presents an effective pruning method for constructing lightweight deep learning models and shows that such models can make efficient use of equipment resources in limited computing environments such as coastal surveillance equipment.

A Speaker Pruning Method for Real-Time Speaker Identification System

  • Kim, Min-Joung;Suk, Soo-Young;Jeong, Jong-Hyeog
    • IEMEK Journal of Embedded Systems and Applications / v.10 no.2 / pp.65-71 / 2015
  • GMM (Gaussian Mixture Model) based speaker identification systems using ML (Maximum Likelihood) and WMR (Weighting Model Rank) are known to achieve very high performance. However, such systems are not very effective in practical environments that require real-time processing, because of their high computational cost. In this paper, we propose a new speaker-pruning algorithm that effectively reduces this cost. The algorithm selects the 20% of speaker models with the highest likelihood on a part of the input speech and applies MWMR (Modified Weighted Model Rank) to the selected models to identify the speaker. To verify the effectiveness of the proposed algorithm, we performed speaker identification experiments on the TIMIT database. The proposed method reduces processing time by more than 60% compared with the conventional GMM-based system without pruning, while maintaining recognition accuracy.
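
The two-stage idea in this abstract (score every speaker on part of the input, keep the top 20%, then decide among the survivors) can be sketched as below. This is a simplified illustration under stated assumptions: each speaker model is a single one-dimensional Gaussian rather than a GMM, the final decision uses plain maximum likelihood rather than MWMR (which the abstract does not define), and all names and values are hypothetical.

```python
import math


def log_likelihood(frames, model):
    """Log-likelihood of scalar frames under a 1-D Gaussian (mean, variance)."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in frames
    )


def prune_speakers(frames, models, prefix_len, keep_fraction=0.2):
    """Score all models on the first `prefix_len` frames and keep the best ones."""
    prefix = frames[:prefix_len]
    scores = {name: log_likelihood(prefix, m) for name, m in models.items()}
    n_keep = max(1, math.ceil(len(models) * keep_fraction))
    return sorted(scores, key=scores.get, reverse=True)[:n_keep]


def identify(frames, models, prefix_len, keep_fraction=0.2):
    """Run the full-utterance comparison only on the surviving models."""
    survivors = prune_speakers(frames, models, prefix_len, keep_fraction)
    return max(survivors, key=lambda name: log_likelihood(frames, models[name]))


# Ten toy speakers whose models differ only in their mean.
models = {f"spk{i}": (float(i), 1.0) for i in range(10)}
frames = [3.2, 3.1, 2.9, 3.0, 2.8, 3.1]

survivors = prune_speakers(frames, models, prefix_len=2)
winner = identify(frames, models, prefix_len=2)
print(survivors, winner)
```

The saving comes from the second stage: the full utterance is scored against only a fifth of the models, at the price of a cheap first pass over a short prefix.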

Application and Performance Analysis of Double Pruning Method for Deep Neural Networks (심층신경망의 더블 프루닝 기법의 적용 및 성능 분석에 관한 연구)

  • Lee, Seon-Woo;Yang, Ho-Jun;Oh, Seung-Yeon;Lee, Mun-Hyung;Kwon, Jang-Woo
    • Journal of Convergence for Information Technology / v.10 no.8 / pp.23-34 / 2020
  • Deep learning has recently been hard to commercialize because of the high computing power it requires and the cost of computing resources. In this paper, we apply a double pruning technique and evaluate its performance on several deep neural networks and datasets. Double pruning combines basic network slimming and parameter pruning. The proposed technique has the advantage of removing parameters that are unimportant to the trained model and improving speed without compromising accuracy. After training on various datasets, we increased the pruning ratio to reduce the size of the model. A NetScore performance analysis confirmed that MobileNet-V3 showed the highest performance. On the CIFAR-10 dataset, performance after pruning was highest for MobileNet-V3, which consists of depthwise separable convolutions, and the traditional convolutional neural networks VGGNet and ResNet also improved significantly.
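
The combination named in this abstract can be sketched as two stages: channel selection by batch-normalization scale factors (the criterion used in network slimming), followed by magnitude-based parameter pruning of what remains. The abstract does not specify the order, thresholds, or layer handling, so those choices, along with every function name and value below, are assumptions for illustration.

```python
def slim_channels(gammas, prune_ratio):
    """Stage 1: drop the channels with the smallest |gamma| (BN scale factor).

    Returns the indices of the kept channels in their original order.
    """
    n_prune = int(len(gammas) * prune_ratio)
    order = sorted(range(len(gammas)), key=lambda i: abs(gammas[i]))
    return sorted(order[n_prune:])


def prune_weights(weights, prune_ratio):
    """Stage 2: zero out the smallest-magnitude weights across all rows."""
    flat = sorted(abs(w) for row in weights for w in row)
    n_prune = int(len(flat) * prune_ratio)
    if n_prune == 0:
        return [row[:] for row in weights]
    threshold = flat[n_prune - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def double_prune(weights, gammas, channel_ratio, weight_ratio):
    """Apply channel slimming first, then parameter pruning on the survivors."""
    kept = slim_channels(gammas, channel_ratio)
    slimmed = [weights[i] for i in kept]
    return kept, prune_weights(slimmed, weight_ratio)


# One row of weights per output channel (hypothetical values).
gammas = [0.9, 0.01, 0.5, 0.02]
weights = [[0.4, -0.05], [0.3, 0.2], [-0.6, 0.1], [0.7, 0.8]]

kept, pruned = double_prune(weights, gammas, channel_ratio=0.5, weight_ratio=0.5)
print(kept, pruned)
```

Note that channel 3 is removed despite its large weights, because the first stage looks only at the scale factors.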

Wanda Pruning for Lightweighting Korean Language Model (Wanda Pruning에 기반한 한국어 언어 모델 경량화)

  • Jun-Ho Yoon;Daeryong Seo;Donghyeon Jeon;Inho Kang;Seung-Hoon Na
    • Annual Conference on Human and Language Technology / 2023.10a / pp.437-442 / 2023
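  • Recently introduced large language models deliver remarkable performance on a wide range of language processing tasks. However, the size and complexity of these models have made model lightweighting necessary. Pruning is one such lightweighting strategy: it removes some of a model's weights or connections to reduce its size while also optimizing performance. In this paper, we apply the Wanda [1] method to prune Polyglot-Ko, a Korean language model. We then analyze the perplexity, zero-shot performance, and post-fine-tuning performance of the pruned model. In the experiments, performance ranked from highest to lowest in the order Wanda-50%, the 4:8 sparsity pattern, and the 2:4 sparsity pattern, and under some conditions the pruned model even outperformed the original dense model. These results reaffirm the effectiveness and importance of pruning in today's research centered on large language models.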
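
For readers unfamiliar with the sparsity patterns mentioned above, the following is a minimal sketch of Wanda-style scoring with an N:M constraint on a toy weight matrix. It assumes the usual Wanda importance score, the weight magnitude multiplied by the L2 norm of the corresponding input feature over calibration samples. It does not involve Polyglot-Ko, and the function names and numbers are hypothetical.

```python
import math


def wanda_scores(weight, activations):
    """Score each weight as |W[i][j]| times the L2 norm of input feature j.

    weight: rows are output neurons, columns are input features.
    activations: list of calibration input vectors.
    """
    n_in = len(weight[0])
    norms = [math.sqrt(sum(x[j] ** 2 for x in activations)) for j in range(n_in)]
    return [[abs(w) * norms[j] for j, w in enumerate(row)] for row in weight]


def prune_n_of_m(weight, scores, n=2, m=4):
    """Within every group of m consecutive weights in a row, keep the n best."""
    pruned = [row[:] for row in weight]
    for r, row in enumerate(scores):
        for start in range(0, len(row), m):
            group = range(start, min(start + m, len(row)))
            keep = set(sorted(group, key=lambda j: row[j], reverse=True)[:n])
            for j in group:
                if j not in keep:
                    pruned[r][j] = 0.0
    return pruned


# One output neuron; input feature 1 has much larger activations than the rest.
weight = [[0.5, -0.1, 0.3, 0.2]]
activations = [[1.0, 10.0, 1.0, 1.0], [1.0, 10.0, 1.0, 1.0]]

scores = wanda_scores(weight, activations)
pruned = prune_n_of_m(weight, scores, n=2, m=4)
print(pruned)
```

In this toy case the small weight -0.1 survives because its input feature is strongly activated, whereas pure magnitude pruning would have kept 0.3 instead. The 2:4 and 4:8 patterns both remove half of the weights, but 4:8 gives the selection more freedom within each group.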

Modeling strength of high-performance concrete using genetic operation trees with pruning techniques

  • Peng, Chien-Hua;Yeh, I-Cheng;Lien, Li-Chuan
    • Computers and Concrete / v.6 no.3 / pp.203-223 / 2009
  • Regression analysis (RA) can establish an explicit formula to predict the strength of High-Performance Concrete (HPC); however, the accuracy of the formula is poor. Back-Propagation Networks (BPNs) can establish a highly accurate model to predict the strength of HPC, but cannot generate an explicit formula. Genetic Operation Trees (GOTs) can establish an explicit formula to predict the strength of HPC with a level of accuracy between those of the two aforementioned approaches. Although a GOT can produce an explicit formula, the formula is often too complicated for its substantive meaning to be explained. This study developed a Backward Pruning Technique (BPT) that simplifies a GOT formula by replacing each variable at a tip node of the operation tree with the median of that variable in the training data belonging to the node, and then pruning the node that gives the most accurate result on the test dataset. Such pruning reduces formula complexity while maintaining accuracy. A total of 404 experimental datasets were used to compare the accuracy and complexity of three model-building techniques: RA, BPN, and GOT. The results show that the pruned GOT can generate a simple and accurate formula for predicting the strength of HPC.

Neural Network Model Compression Algorithms for Image Classification in Embedded Systems (임베디드 시스템에서의 객체 분류를 위한 인공 신경망 경량화 연구)

  • Shin, Heejung;Oh, Hyondong
    • The Journal of Korea Robotics Society / v.17 no.2 / pp.133-141 / 2022
  • This paper introduces model compression algorithms that make a deep neural network smaller and faster for embedded systems. Model compression algorithms can be broadly categorized into pruning, quantization, and knowledge distillation. In this study, we integrate gradual pruning, quantization-aware training, and a form of knowledge distillation that learns the activation boundary in the hidden layers of the teacher network. Because these algorithms compress and accelerate a large deep neural network, embedded computing boards can run it much faster and with less memory while preserving reasonable accuracy. To evaluate the compressed networks, we measure the size, latency, and accuracy of DenseNet201 for image classification on the CIFAR-10 dataset, running on the NVIDIA Jetson Xavier.
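
Gradual pruning raises the target sparsity step by step during training instead of removing weights all at once. The abstract does not state which schedule the authors used. The sketch below assumes the widely used cubic schedule, in which sparsity rises quickly at first and then levels off. The function name, parameter names, and default values are hypothetical.

```python
def sparsity_at(step, start, end, s_init=0.0, s_final=0.8):
    """Target sparsity at a training step under a cubic gradual-pruning schedule.

    Before `start` the sparsity is `s_init`, after `end` it is `s_final`,
    and in between it follows s_final + (s_init - s_final) * (1 - p) ** 3,
    where p is the fraction of the pruning window that has elapsed.
    """
    if step <= start:
        return s_init
    if step >= end:
        return s_final
    progress = (step - start) / (end - start)
    return s_final + (s_init - s_final) * (1.0 - progress) ** 3


# Target sparsity at a few points of a 100-step pruning window.
schedule = [sparsity_at(step, start=0, end=100) for step in (0, 25, 50, 75, 100)]
print(schedule)
```

At each step the training loop would prune the smallest-magnitude weights until the layer reaches the scheduled sparsity, then continue training so the remaining weights can compensate.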