Gated Recurrent Unit based Prefetching for Graph Processing

GRU-Based Prefetching for Graph Processing

  • Shivani Jadhav (Department of Computer Science and Artificial Intelligence, Jeonbuk National University)
  • Farman Ullah (Department of Computer Science and Artificial Intelligence, Jeonbuk National University)
  • Jeong Eun Nah (University College, Yonsei University)
  • Su-Kyung Yoon (Department of Computer Science and Artificial Intelligence, Jeonbuk National University)
  • Received: 2023.03.21
  • Accepted: 2023.04.18
  • Published: 2023.06.30

Abstract

Data that is likely to be accessed soon can be predicted and stored in the cache ahead of time to prevent cache misses, reducing the time the processor spends issuing requests and waiting for memory. As a result, the processor can keep working without stalling, which hides memory latency. A prefetcher, introduced to improve computer performance, exploits the temporal and spatial locality of memory accesses to predict the memory address that will be accessed next. We propose a prefetcher that applies the Gated Recurrent Unit (GRU) model, which is well suited to time-series data. The currently accessed address is represented in binary and used as training data, and the GRU model is trained on the difference (delta) between consecutive memory accesses. Finally, using the GRU model with the learned memory access patterns, the proposed data prefetcher predicts the memory address to be accessed next. We compared the model with a multi-layer perceptron (MLP), and our prefetcher showed better results than the MLP.
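
The preprocessing described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the bit width, the history length, or exactly which quantity is binary-encoded, so `ADDRESS_BITS`, `history`, the two's-complement encoding of deltas, and all function names here are assumptions chosen for illustration. The sketch only builds (input window, target) pairs that a sequence model such as a GRU could be trained on; the model itself is omitted.

```python
from typing import List, Tuple

ADDRESS_BITS = 16  # hypothetical width; the abstract does not state one


def to_bits(value: int, width: int = ADDRESS_BITS) -> List[int]:
    """Encode an integer as fixed-width two's-complement bits, MSB first."""
    masked = value & ((1 << width) - 1)
    return [(masked >> i) & 1 for i in range(width - 1, -1, -1)]


def from_bits(bits: List[int]) -> int:
    """Decode non-empty two's-complement bits (MSB first) to a signed int."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    if bits[0] == 1:  # sign bit set, so the value is negative
        value -= 1 << len(bits)
    return value


def deltas(addresses: List[int]) -> List[int]:
    """Differences between consecutive memory accesses."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


def make_samples(
    addresses: List[int], history: int = 3, width: int = ADDRESS_BITS
) -> List[Tuple[List[List[int]], List[int]]]:
    """Build (window of past deltas, next delta) pairs, all binary-encoded."""
    d = deltas(addresses)
    samples = []
    for i in range(len(d) - history):
        window = [to_bits(x, width) for x in d[i:i + history]]
        target = to_bits(d[i + history], width)
        samples.append((window, target))
    return samples


# Hypothetical trace with a constant stride of 0x40 bytes.
trace = [0x1000, 0x1040, 0x1080, 0x10C0, 0x1100]
samples = make_samples(trace, history=3)

# A trained model would output the target bits; decoding them and adding the
# result to the last address seen in the window gives the prefetch address.
window, target = samples[0]
predicted_address = trace[3] + from_bits(target)
```

Encoding deltas rather than raw addresses keeps the value range small for regular access patterns, which is one plausible reading of why the abstract trains on deltas.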

Acknowledgement

This research was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean Government (Ministry of Science and ICT) (NRF-2021R1I1A3059832).
