http://dx.doi.org/10.7471/ikeee.2022.26.1.19

Regularization Strength Control for Continuous Learning based on Attention Transfer  

Kang, Seok-Hoon (Dept. of Embedded Systems Engineering, Incheon National University)
Park, Seong-Hyeon (Dept. of Embedded Systems Engineering, Incheon National University)
Publication Information
Journal of IKEEE / v.26, no.1, 2022, pp. 19-26
Abstract
In this paper, we propose an algorithm that applies a separate, variable lambda to each loss term in order to address the performance degradation caused by domain differences in LwF, and we show that it improves the retention of past knowledge. With the proposed variable-lambda method, the lambda values can be adjusted so that the current task is learned well. In our experiments, data accuracy improved by an average of 5% regardless of the scenario. In particular, retention of past knowledge, the goal of this paper, improved by up to 70%, and accuracy on past training data increased by an average of 22% compared to the existing LwF.
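The abstract describes weighting each loss term of an LwF-style objective with its own variable lambda, with attention transfer used to preserve past knowledge. The following is a minimal PyTorch sketch of such a loss under stated assumptions: the paper's exact lambda-adjustment rule is not given in the abstract, so lam_kd and lam_at are treated as externally supplied values, and attention_map follows the common Zagoruyko-Komodakis spatial-attention formulation rather than the paper's specific choice.

```python
# Minimal sketch of an LwF-style loss with per-term variable lambdas.
# The lambda-adjustment rule and attention layers used by the paper are not
# specified in this abstract; lam_kd, lam_at, and attention_map() below are
# illustrative assumptions, not the authors' exact method.
import torch
import torch.nn.functional as F

def attention_map(feat):
    # Spatial attention map (Zagoruyko & Komodakis style): channel-wise
    # mean of squared activations, flattened and L2-normalized per sample.
    a = feat.pow(2).mean(dim=1).flatten(start_dim=1)
    return F.normalize(a, dim=1)

def lwf_variable_lambda_loss(new_logits, old_logits, new_feat, old_feat,
                             targets, lam_kd, lam_at, T=2.0):
    # Cross-entropy on the current task.
    ce = F.cross_entropy(new_logits, targets)
    # Knowledge-distillation term against the frozen old model's outputs.
    kd = F.kl_div(F.log_softmax(new_logits / T, dim=1),
                  F.softmax(old_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    # Attention-transfer term between old and new feature maps.
    at = (attention_map(new_feat) - attention_map(old_feat)).pow(2).mean()
    # Each knowledge-preservation term is scaled by its own variable lambda.
    return ce + lam_kd * kd + lam_at * at
```

In this sketch, changing lam_kd and lam_at per task plays the role of the variable lambda: smaller values favor learning the current task, larger values favor retaining past knowledge.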
Keywords
LwF; Continuous Learning; Knowledge Transfer; Variable Lambda; Catastrophic Forgetting;