Compressing an intent classification model for multi-agent operation on low-resource devices

  • Received: 2022.06.22
  • Accepted: 2022.07.24
  • Published: 2022.09.30

Abstract

Recently, as large-scale pretrained language models (LPLMs) have advanced in natural language processing, the performance of intent classification models fine-tuned from them has improved as well. However, fine-tuning a large-scale model incurs a high operating cost in a dialog system that must respond in real time. To address this, this study proposes a method for compressing an intent classification model so that multiple agents can be operated even on low-resource hardware. The proposed method consists of a task-agnostic stage, in which a compressed sentence encoder is trained, and a task-specific stage, in which an intent classification model is trained by attaching an adapter to the compressed sentence encoder. Experiments on intent classification datasets from a variety of domains demonstrate the effectiveness of the proposed method.
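The task-agnostic stage can be pictured as distilling the large teacher encoder into a much smaller student at the sentence-embedding level. The minimal PyTorch sketch below assumes klue/roberta-large as the teacher and klue/roberta-small as the student, with mean pooling and an MSE objective; these checkpoint names and the embedding-level loss are illustrative assumptions, since the paper's actual recipe may differ (e.g., MiniLMv2-style self-attention relation distillation).

```python
# Stage 1 (task-agnostic): distill a large teacher encoder into a small student
# sentence encoder. Checkpoint names, mean pooling, and the MSE objective are
# illustrative assumptions, not the paper's exact distillation recipe.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

teacher_name = "klue/roberta-large"   # assumed large teacher checkpoint
student_name = "klue/roberta-small"   # assumed small student checkpoint

tokenizer = AutoTokenizer.from_pretrained(teacher_name)
teacher = AutoModel.from_pretrained(teacher_name).eval()
student = AutoModel.from_pretrained(student_name).train()

def mean_pool(last_hidden_state, attention_mask):
    # Average token vectors while ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).float()
    return (last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

# Project student embeddings into the teacher's hidden size if the two differ.
proj = torch.nn.Linear(student.config.hidden_size, teacher.config.hidden_size)
optimizer = torch.optim.AdamW(
    list(student.parameters()) + list(proj.parameters()), lr=3e-5
)

def distillation_step(sentences):
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # the teacher is never updated
        t_emb = mean_pool(teacher(**batch).last_hidden_state, batch["attention_mask"])
    s_emb = proj(mean_pool(student(**batch).last_hidden_state, batch["attention_mask"]))
    loss = F.mse_loss(s_emb, t_emb)  # pull student sentence vectors toward the teacher's
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```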

Recently, large-scale pretrained language models (LPLMs) have shown state-of-the-art performance on various natural language processing tasks, including intent classification. However, fine-tuning an LPLM incurs a high computational cost for both training and inference, which is not appropriate for a dialog system. In this paper, we propose a compressed intent classification model for multi-agent operation in low-resource environments such as a CPU. Our method consists of two stages. First, we train a sentence encoder from an LPLM and compress it through knowledge distillation. Second, we train an agent-specific adapter for intent classification on top of the compressed encoder. Results on three intent classification datasets show that our method retains 98% of the LPLM's accuracy with only 21% of its size.
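The task-specific stage then freezes the compressed encoder and trains only a small agent-specific adapter plus an intent classification head. The sketch below is one way to realize this under stated assumptions: a Houlsby-style bottleneck applied to the pooled sentence vector, a bottleneck width of 64, and a linear head; the paper's actual adapter placement and sizes may differ.

```python
# Stage 2 (task-specific): freeze the compressed sentence encoder and train only
# a small bottleneck adapter and an intent head per agent. The adapter placement
# (on the pooled sentence vector) and all sizes below are illustrative assumptions.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project, non-linearity, up-project, with a residual connection."""
    def __init__(self, hidden_size, bottleneck_size=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

class AgentIntentClassifier(nn.Module):
    """One adapter and classification head per agent; the shared encoder stays frozen."""
    def __init__(self, encoder, hidden_size, num_intents):
        super().__init__()
        self.encoder = encoder  # e.g., the distilled student from stage 1
        for p in self.encoder.parameters():
            p.requires_grad = False  # only the adapter and head receive gradients
        self.adapter = BottleneckAdapter(hidden_size)
        self.head = nn.Linear(hidden_size, num_intents)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
        return self.head(self.adapter(pooled))  # intent logits for this agent
```

Because the frozen encoder is shared and each agent contributes only its own adapter and head, many agents can be served from a single copy of the encoder, which is what makes multi-agent operation feasible on low-resource, CPU-only hardware as described in the abstract.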
