[1] S. Lee and B. Song, "Graph-based Knowledge Distillation by Multi-head Attention Network," arXiv:1907.02226, Jul. 2019. DOI: 10.48550/arXiv.1907.02226
[2] S. Kim and S. Kim, "Recursive Oversampling Method for Improving Classification Performance of Class Unbalanced Data in Patent Document Automatic Classification," Journal of The Institute of Electronics and Information Engineers, Vol. 58, No. 4, Apr. 2021. DOI: 10.5573/ieie.2021.58.4.43
[3] S. Arora, M. M. Khapra, and H. G. Ramaswamy, "On Knowledge Distillation from Complex Networks for Response Prediction," Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 3813-3822, Jun. 2019. DOI: 10.18653/v1/N19-1382
[4] N. Kim, D. Lee, H. Choi, and W. X. S. Wong, "Investigations on Techniques and Applications of Text Analytics," Journal of Korean Institute of Communications and Information Sciences, Vol. 42, No. 2, pp. 471-492, Feb. 2017. DOI: 10.7840/kics.2017.42.2.471
[5] H. Son, S. Choe, C. Moon, and J. Min, "Rule-based filtering and deep learning LSTM e-mail spam classification," Proceedings of the Korean Information Science Society Conference, pp. 105-107, 2021.
[6] W. X. S. Wong, Y. Hyun, and N. Kim, "Improving the Accuracy of Document Classification by Learning Heterogeneity," Journal of Intelligence and Information Systems, Vol. 24, No. 3, Sep. 2018. DOI: 10.13088/jiis.2018.24.3.021
[7] S. U. Park, "Analysis of the Status of Natural Language Processing Technology Based on Deep Learning," Korean Journal of BigData, Vol. 6, Aug. 2021. DOI: 10.36498/kbigdt.2021.6.1.63
[8] S. Shakeri, A. Sethy, and C. Cheng, "Knowledge Distillation in Document Retrieval," arXiv:1911.11065, Nov. 2019. DOI: 10.48550/arXiv.1911.11065
[9] S. Zhang, L. Jiang, and J. Tan, "Cross-domain knowledge distillation for text classification," Neurocomputing, Vol. 509, pp. 11-20, Oct. 2022. DOI: 10.1016/j.neucom.2022.08.061
[10] B. Dipto and J. Gil, "Research Paper Classification Scheme based on Word Embedding," Proceedings of the Korea Information Processing Society Conference, Vol. 28, No. 2, Nov. 2021. DOI: 10.3745/PKIPS.y2021m11a.494
[11] A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio, "FitNets: Hints for Thin Deep Nets," arXiv:1412.6550, Mar. 2015. DOI: 10.48550/arXiv.1412.6550
[12] J. Yim, D. Joo, J. Bae, and J. Kim, "A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4133-4141, 2017. DOI: 10.1109/cvpr.2017.754
[13] H. L. Erickson, L. A. Lanning, and R. French, "Concept-Based Curriculum and Instruction for the Thinking Classroom," 2nd Edition, Corwin, 2017. DOI: 10.4135/9781506355382
[14] G. Hinton, O. Vinyals, and J. Dean, "Distilling the Knowledge in a Neural Network," arXiv:1503.02531, Mar. 2015. DOI: 10.48550/arXiv.1503.02531
[15] J. Ba and R. Caruana, "Do deep nets really need to be deep?," Advances in Neural Information Processing Systems 27, 2014. DOI: 10.48550/arXiv.1312.6184
[16] S. I. Mirzadeh, M. Farajtabar, A. Li, N. Levine, A. Matsukawa, and H. Ghasemzadeh, "Improved Knowledge Distillation via Teacher Assistant," Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, No. 04, pp. 5191-5198, Apr. 2020. DOI: 10.1609/aaai.v34i04.5963
[17] B. Heo, M. Lee, S. Yun, and J. Y. Choi, "Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons," Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, No. 01, pp. 3779-3787, Jul. 2019. DOI: 10.1609/aaai.v33i01.33013779
[18] K. Clark, M. T. Luong, Q. V. Le, and C. D. Manning, "ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators," arXiv:2003.10555, Mar. 2020. DOI: 10.48550/arXiv.2003.10555
[19] J. Kim, S. Park, and N. Kwak, "Paraphrasing Complex Network: Network Compression via Factor Transfer," Advances in Neural Information Processing Systems 31, 2018. DOI: 10.48550/arXiv.1802.04977
[20] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient Estimation of Word Representations in Vector Space," arXiv:1301.3781, Sep. 2013. DOI: 10.48550/arXiv.1301.3781
[21] V. Sanh, L. Debut, J. Chaumond, and T. Wolf, "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter," arXiv:1910.01108, Mar. 2020. DOI: 10.48550/arXiv.1910.01108
[22] X. Jiao, Y. Yin, L. Shang, X. Jiang, X. Chen, L. Li, F. Wang, and Q. Liu, "TinyBERT: Distilling BERT for Natural Language Understanding," arXiv:1909.10351, Oct. 2020. DOI: 10.48550/arXiv.1909.10351
[23] K. Lang, "NewsWeeder: Learning to Filter Netnews," Proceedings of the Twelfth International Conference on Machine Learning, pp. 331-339, 1995. DOI: 10.1016/B978-1-55860-377-6.50048-7
[24] S. Ji, J. Moon, H. Kim, and E. Hwang, "A Twitter News-Classification Scheme Using Semantic Enrichment of Word Features," Journal of KIISE, Vol. 45, No. 10, pp. 1045-1055, Oct. 2018. DOI: 10.5626/JOK.2018.45.10.1045
[25] S. Hahn and H. Choi, "Self-Knowledge Distillation in Natural Language Processing," arXiv:1908.01851, Aug. 2019. DOI: 10.48550/arXiv.1908.01851
[26] Y. C. Chen, Z. Gan, Y. Cheng, J. Liu, and J. Liu, "Distilling Knowledge Learned in BERT for Text Generation," arXiv:1911.03829, Jul. 2020. DOI: 10.48550/arXiv.1911.03829
[27] Z. Huang and N. Wang, "Like What You Like: Knowledge Distill via Neuron Selectivity Transfer," arXiv:1707.01219, Dec. 2017. DOI: 10.48550/arXiv.1707.01219
[28] Y. Liu, J. Cao, B. Li, C. Yuan, W. Hu, Y. Li, and Y. Duan, "Knowledge Distillation via Instance Relationship Graph," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7096-7104, 2019. DOI: 10.1109/cvpr.2019.00726
[29] J. Gou, B. Yu, S. J. Maybank, and D. Tao, "Knowledge Distillation: A Survey," International Journal of Computer Vision, Vol. 129, No. 6, pp. 1789-1819, Mar. 2021. DOI: 10.1007/s11263-021-01453-z
[30] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," arXiv:1810.04805, May 2019. DOI: 10.48550/arXiv.1810.04805