http://dx.doi.org/10.6109/jkiice.2022.26.3.347

Advanced LwF Model based on Knowledge Transfer in Continual Learning  

Kang, Seok-Hoon (Department of Embedded Systems Engineering, Incheon National University)
Park, Seong-Hyeon (Department of Embedded Systems Engineering, Incheon National University)
Abstract
To reduce forgetting in continual learning, in this paper we propose an improved LwF model based on the knowledge transfer method and show its effectiveness by experiment. In LwF, when tasks differ in data domain or in data complexity, the results of earlier learning become inaccurate due to forgetting. The phenomenon is particularly severe when learning proceeds from complex data to simple data. To ensure that the results of previous learning are sufficiently transferred to the LwF model, we apply the knowledge transfer method to LwF and propose an algorithm for its efficient use. As a result, forgetting was reduced by an average of 8% compared to the existing LwF results, and the method remained effective as the sequence of learning tasks grew longer. In particular, when complex data was learned first, efficiency improved by more than 30% compared to LwF.
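For illustration, the following is a minimal sketch of the kind of LwF-style update with a knowledge-transfer (distillation) term that the abstract describes, written in PyTorch. It is an assumption-laden sketch, not the authors' implementation: the shared output head split into old and new classes, the loss weight lambda_old, and the temperature are illustrative choices. A frozen copy of the previously trained network supplies soft targets for the old tasks while the new network is trained on the current task.

# Minimal sketch (assumption: not the paper's exact algorithm) of an LwF-style
# training step augmented with a knowledge-transfer (distillation) loss.
import torch
import torch.nn.functional as F

def distillation_loss(new_logits, old_logits, temperature=2.0):
    """Hinton-style distillation between the frozen model's outputs and the new model's old-class outputs."""
    log_p_new = F.log_softmax(new_logits / temperature, dim=1)
    p_old = F.softmax(old_logits / temperature, dim=1)
    # KL divergence against the frozen model's soft targets, scaled by T^2
    # as in the original knowledge-distillation formulation.
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * temperature ** 2

def lwf_step(new_model, old_model, x, y_new, optimizer,
             lambda_old=1.0, temperature=2.0):
    """One step: cross-entropy on the current task + distillation on the old task outputs."""
    new_model.train()
    old_model.eval()

    with torch.no_grad():                 # frozen copy of the previous model
        old_logits = old_model(x)         # soft targets for the old classes

    out = new_model(x)                    # assumed layout: [old classes | new classes]
    n_old = old_logits.size(1)
    loss_new = F.cross_entropy(out[:, n_old:], y_new)                      # current task
    loss_old = distillation_loss(out[:, :n_old], old_logits, temperature)  # knowledge transfer
    loss = loss_new + lambda_old * loss_old

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In this sketch, lambda_old controls how strongly the previous knowledge is preserved relative to learning the new task; the paper's contribution concerns how the transferred knowledge is supplied to LwF, which this generic loss only approximates.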
Keywords
LwF; Continual Learning; Knowledge Transfer; Neural Network; Catastrophic Forgetting