References
- A. Vijay Bhaskar and T. G. Venkatesh, "Performance analysis of network-on-chip in many-core processors," Journal of Parallel and Distributed Computing, vol. 147, pp. 196-208, Jan. 2021. https://doi.org/10.1016/j.jpdc.2020.09.013
- S. H. Gade and S. Deb, "A Novel Hybrid Cache Coherence with Global Snooping for Many-core Architectures," ACM Transactions on Design Automation of Electronic Systems, vol. 27, no. 1, pp. 1-31, 2021.
- S. Kim, M. Fayazi, A. Daftardar, K.-Y. Chen, J. Tan, S. Pal, T. Ajayi, Y. Xiong, T. Mudge, C. Chakrabarti, D. Blaauw, R. Dreslinski, and H.-S. Kim, "Versa: A 36-Core Systolic Multiprocessor With Dynamically Reconfigurable Interconnect and Memory," IEEE Journal of Solid-State Circuits, vol. 57, no. 4, pp. 986-998, Apr. 2022. DOI: 10.1109/JSSC.2022.3140241
- K. Fernandes, "GPU Development and Computing Experiences," University of Cambridge, Research Computing Services, 2015.
- J. L. Träff, A. Ripke, C. Siebert, P. Balaji, R. Thakur, and W. Gropp, "A Simple, Pipelined Algorithm for Large, Irregular All-gather Problems," in Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface, vol. 5205, pp. 84-93, 2008.
- J. Park, H. Yun, and S. Moon, "Enhancing Performance Using Atomic Pipelined Message Broadcast in a Distributed Memory MPSoC," IEICE Electronics Express, vol. 11, pp. 1-7, Nov. 2014.
- J. Park, "Efficient Pipelined Broadcast with Monitoring Processing Node Status on a Multi-Core Processor," Mathematics, vol. 7, no. 12, Dec. 2019. DOI: 10.3390/math7121159