References
- Liang, Chen, et al. "Less is more: Task-aware layer-wise distillation for language model compression." Proceedings of the 40thInternational Conference on Machine Learning, Honolulu Hawaii USA, 2023, 20852-20867.
- Liang, Chen, et al. "Module-wise Adaptive Distillation for Multimodality Foundation Models." Advances in Neural Information Processing Systems, New Orleans USA, 2023, 69719-69735.
- Pasad, Ankita, et al. "Layer-wise Analysis of a Self-supervised Speech Representation Model." 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Cartagena, Colombia, 2021, 914-921.