1 |
N. S. Sohoni, C. R. Aberger, M. Leszczynski, J. Zhang and C. Ré, "Low-Memory Neural Network Training: A Technical Report," https://arxiv.org/abs/1904.10631
|
2 |
M. Courbariaux, Y. Bengio and J.-P. David, "BinaryConnect: Training Deep Neural Networks with binary weights during propagations," 2015. https://arxiv.org/abs/1511.00363
|
3 |
C. Zhu, S. Han, H. Mao and W. J. Dally, "Trained Ternary Quantization," International Conference on Learning Representations, 2017.
|
4 |
S. Han, H. Mao and W. J. Dally, "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding," NIPS Deep Learning Symposium, 2015.
|
5 |
N. Sakharnykh, "Maximizing Unified Memory Performance in CUDA," https://devblogs.nvidia.com/maximizing-unified-memory-performance-cuda
|
6 |
J. Choi, P. I-J. Chuang, Z. Wang, S. Venkataramani, V. Srinivasan and K. Gopalakrishnan, "Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)," https://arxiv.org/abs/1807.06964
|
7 |
S. Uhlich, L. Mauch, K. Yoshiyama, F. Cardinaux, J. A. Garcia, S. Tiedemann, T. Kemp and A. Nakamura, "Differentiable Quantization of Deep Neural Networks," https://arxiv.org/abs/1905.11452
|
8 |
M. Rastegari, V. Ordonez, J. Redmon and A. Farhadi, "XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks," European Conference on Computer Vision, pp. 525-542, 2016.
|
9 |
J. Choi, S. Venkataramani, V. Srinivasan, K. Gopalakrishnan, Z. Wang and P. Chuang, "Accurate and Efficient 2-bit Quantized Neural Networks," https://sysml.cc/doc/2019/168.pdf
|
10 |
F. Li, B. Zhang and B. Liu, "Ternary Weight Networks," https://arxiv.org/abs/1605.04711
|