Hierarchical multi-task learning with self-supervised auxiliary task

  • Seunghan Lee (Department of Statistics and Data Science, Yonsei University) ;
  • Taeyoung Park (Department of Statistics and Data Science, Yonsei University)
  • Received : 2024.07.31
  • Accepted : 2024.08.26
  • Published : 2024.10.31

Abstract

Multi-task learning is a popular machine learning approach that learns multiple related tasks simultaneously by sharing information across them. In this paper, we consider multiple related tasks organized hierarchically, with sub-tasks grouped under a common main task, where the representations used to solve the sub-tasks share information through globally shared layers, locally shared layers, and task-specific layers. We propose hierarchical multi-task learning with a self-supervised auxiliary task (HiSS), a novel approach to hierarchical multi-task learning that incorporates self-supervised learning as an auxiliary task. The goal of the auxiliary task is to extract additional latent information from unlabeled data by predicting a cluster label derived directly from the data. The proposed approach is evaluated on the Hyodoll dataset, which consists of user information and activity logs of elderly individuals collected by AI companion robots, with the goal of predicting emergency calls based on the time of day and month. The proposed algorithm is more efficient than other well-known machine learning algorithms, as it requires only a single model regardless of the number of tasks, and it demonstrates superior performance on classification tasks across various metrics. The source code is available at: https://github.com/seunghan96/HiSS.
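The layer structure described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact architecture: the layer sizes, the task tree, the binary sub-task heads, and the use of naive k-means for the auxiliary pseudo-labels are all assumptions made for the example. One globally shared layer feeds locally shared layers (one per main task), each locally shared layer feeds task-specific heads (one per sub-task), and an auxiliary head classifies a cluster pseudo-label computed from the unlabeled inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def cluster_pseudo_labels(x, n_clusters, n_iter=10):
    """Naive k-means producing cluster labels, used as pseudo-labels
    for the auxiliary self-supervised classification task."""
    centers = x[rng.choice(len(x), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest center, then recompute centers
        labels = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean(axis=0)
    return labels

class HierarchicalMTL:
    """Sketch of the hierarchical sharing scheme: globally shared layer ->
    locally shared layer (per main task) -> task-specific head (per sub-task),
    plus an auxiliary head on the shared representation."""

    def __init__(self, d_in, d_shared, d_local, task_tree, n_clusters):
        # task_tree maps each main task to its list of sub-tasks
        self.W_global = rng.normal(0.0, 0.1, (d_in, d_shared))
        self.W_local = {m: rng.normal(0.0, 0.1, (d_shared, d_local))
                        for m in task_tree}
        self.W_task = {(m, s): rng.normal(0.0, 0.1, (d_local, 2))
                       for m, subs in task_tree.items() for s in subs}
        self.W_aux = rng.normal(0.0, 0.1, (d_shared, n_clusters))

    def forward(self, x):
        h_global = relu(x @ self.W_global)              # globally shared layer
        out = {}
        for (m, s), W in self.W_task.items():
            h_local = relu(h_global @ self.W_local[m])  # locally shared layer
            out[(m, s)] = h_local @ W                   # sub-task logits
        out["aux"] = h_global @ self.W_aux              # auxiliary cluster logits
        return out
```

A single forward pass yields logits for every sub-task plus the auxiliary cluster head, illustrating how one model can serve all tasks at once regardless of their number.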

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2020R1A2C1A01005949, RS-2023-00217705), the MSIT (Ministry of Science and ICT), Korea, under the ICAN (ICT Challenge and Advanced Network of HRD) support program (RS-2023-00259934) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation), and the Son Jiho Research Grant of Yonsei University (2023-22-0006).
