1 |
T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient Estimation of Word Representations in Vector Space," arXiv:1301.3781, Jan, 2013.
2 |
J. Pennington, R. Socher, and C. D. Manning, "GloVe: Global Vectors for Word Representation," Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp. 1532-1543, 2014.
3 |
P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, "Enriching Word Vectors with Subword Information," arXiv:1607.04606, Jul, 2016.
4 |
T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean, "Distributed Representations of Words and Phrases and their Compositionality," Advances in Neural Information Processing Systems, Vol. 26, pp. 3111-3119, Dec, 2013.
5 |
M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, "Deep Contextualized Word Representations," arXiv:1802.05365, Feb, 2018.
6 |
J. Devlin, M. Chang, K. Lee, and K. Toutanova, "BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding," arXiv:1810.04805, Oct, 2018.
7 |
Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le, "XLNet: Generalized Autoregressive Pretraining for Language Understanding," Advances in Neural Information Processing Systems, Vol. 32, pp. 1-11, Dec, 2019.
8 |
Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, "RoBERTa: A Robustly Optimized BERT Pretraining Approach," arXiv:1907.11692, Jul, 2019.
9 |
Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma, and R. Soricut, "ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations," arXiv:1909.11942, Sep, 2019.
10 |
V. Sanh, L. Debut, J. Chaumond, and T. Wolf, "DistilBERT, A Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter," arXiv:1910.01108, Oct, 2019.
11 |
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, "Attention is All You Need," Proceedings of the 31st Conference on Neural Information Processing Systems, pp. 1-11, 2017.
12 |
K. Clark, U. Khandelwal, O. Levy, and C. D. Manning, "What Does BERT Look At? An Analysis of BERT's Attention," arXiv:1906.04341, Jun, 2019.
13 |
Z. Dai, Z. Yang, Y. Yang, J. Carbonell, Q. V. Le, and R. Salakhutdinov, "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context," arXiv:1901.02860, Jan, 2019.
14 |
M. S. Ahmed, L. Khan, and N. Oza, "Pseudo-Label Generation for Multi-Label Text Classification," Proceedings of the 2011 Conference on Intelligent Data Understanding, pp. 60-74, 2011.
15 |
C. Sun, X. Qiu, Y. Xu, and X. Huang, "How to Fine-Tune BERT for Text Classification?," Proceedings of the 18th China National Conference on Chinese Computational Linguistics, pp. 194-206, 2019.
16 |
A. Adhikari, A. Ram, R. Tang, and J. Lin, "DocBERT: BERT for Document Classification," arXiv:1904.08398, Apr, 2019.
17 |
N. Reimers and I. Gurevych, "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks," arXiv:1908.10084, Aug, 2019.
18 |
R. Zhang, Z. Wei, Y. Shi, and Y. Chen, "BERT-AL: BERT for Arbitrarily Long Document Understanding," Proceedings of the International Conference on Learning Representations 2020, pp. 1-10, 2020.
19 |
D. Lee, "Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks," Proceedings of the International Conference on Machine Learning 2013 Workshop, pp. 1-6, 2013.
20 |
J. Xu, B. Xu, P. Wang, S. Zheng, G. Tian, J. Zhao, and B. Xu, "Self-Taught Convolutional Neural Networks for Short Text Clustering," Neural Networks, Vol. 88, pp. 22-31, Apr, 2017.
21 |
R. Pappagari, P. Zelasko, J. Villalba, Y. Carmiel, and N. Dehak, "Hierarchical Transformers for Long Document Classification," arXiv:1910.10781, Oct, 2019.
22 |
Z. Yang, Z. Hu, R. Salakhutdinov, and T. Berg-Kirkpatrick, "Improved Variational Autoencoders for Text Modeling using Dilated Convolutions," Proceedings of the 34th International Conference on Machine Learning, pp. 3881-3890, 2017.
23 |
D. Yeo, G. Lee, and J. Lee, "Pipe Leak Detection System using Wireless Acoustic Sensor Module and Deep Auto-Encoder," Journal of The Korea Society of Computer and Information, Vol. 25, No. 2, pp. 59-66, Feb, 2020.
24 |
A. V. M. Barone, "Towards Cross-lingual Distributed Representations without Parallel Text Trained with Adversarial Autoencoders," arXiv:1608.02996, Aug, 2016.
25 |
J. Li, M.-T. Luong, and D. Jurafsky, "A Hierarchical Neural Autoencoder for Paragraphs and Documents," arXiv:1506.01057, Jun, 2015.
26 |
T. Baumel, R. Cohen, and M. Elhadad, "Sentence Embedding Evaluation using Pyramid Annotation," Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP, pp. 145-149, 2016.
27 |
Y. Chen and M. J. Zaki, "KATE: K-Competitive Autoencoder for Text," Proceedings of the 23rd International Conference on Knowledge Discovery and Data Mining, pp. 85-94, 2017.
28 |
A. Bakarov, "A Survey of Word Embeddings Evaluation Methods," arXiv:1801.09536, Jan, 2018.
29 |
Y. Tsvetkov, M. Faruqui, and C. Dyer, "Correlation-based Intrinsic Evaluation of Word Vector Representations," arXiv:1606.06710, Jun, 2016.
30 |
J. Zhang and T. Baldwin, "Evaluating the Utility of Document Embedding Vector Difference for Relation Learning," arXiv:1907.08184, Jul, 2019.
31 |
J. H. Lau and T. Baldwin, "An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation," arXiv:1607.05368, Jul, 2016.
32 |
F. F. Liza and M. Grzes, "An Improved Crowdsourcing based Evaluation Technique for Word Embedding Methods," Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP, pp. 55-61, 2016.
33 |
M. Batchkarov, T. Kober, J. Reffin, J. Weeds, and D. Weir, "A Critique of Word Similarity as a Method for Evaluating Distributional Semantic Models," Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP, pp. 7-12, 2016.
34 |
G. Wang, S. Shin, and W. Lee, "A Text Sentiment Classification Method Based on LSTM-CNN," Journal of The Korea Society of Computer and Information, Vol. 24, No. 12, pp. 1-7, Dec, 2019.
35 |
M. Faruqui, Y. Tsvetkov, P. Rastogi, and C. Dyer, "Problems with Evaluation of Word Embeddings using Word Similarity Task," arXiv:1605.02276, May, 2016.