Dynamic routing balancing on InfiniBand network

Authors

  • Diego Lugones Computer Architecture & Operating Systems Department (CAOS), University Autònoma of Barcelona, Spain.
  • Daniel Franco Computer Architecture & Operating Systems Department (CAOS), University Autònoma of Barcelona, Spain.
  • Emilio Luque Fadón Computer Architecture & Operating Systems Department (CAOS), University Autònoma of Barcelona, Spain.

Keywords:

adaptive routing algorithms, congestion control, InfiniBand networks, high-speed network modeling

Abstract

InfiniBand (IBA) technology was developed to address the performance issues associated with moving messages among end nodes and computer I/O devices. However, InfiniBand is also widely deployed in high performance computing (HPC) clusters because of the high bandwidth and low message latency it offers to inter-processor communication systems. Efficient interconnection-network design is essential because of its great impact on parallel computer performance. Therefore, a high-speed routing scheme that minimizes congestion and avoids hot-spot areas should be included in the network components. We have developed Dynamic Routing Balancing (DRB), an adaptive routing mechanism that balances communication traffic over the interconnection network. It is based on limited, load-controlled multipath expansion in order to maintain low and bounded network latency. In this work, we propose using DRB as the congestion control mechanism for InfiniBand networks. Experimentation shows that our method achieves a significant performance improvement over the original InfiniBand technique, which is based on message throttling. Improvements of up to 66% in latency and 35% in throughput are achieved for the networks under analysis. Finally, the proposed mechanism uses the management model defined in the InfiniBand specification, so full compatibility is provided.
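
The idea of limited, load-controlled multipath expansion can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `PathSet`, the latency thresholds, the bound of four paths, and the uniform path choice are all assumptions made for illustration only. The sketch shows a source widening its set of alternative paths when measured latency is high and narrowing it again when latency falls.

```python
from dataclasses import dataclass
import random


@dataclass
class PathSet:
    """Hypothetical per-(source, destination) state; not taken from the paper."""
    max_paths: int = 4          # assumed upper bound on the expansion ("limited")
    high_latency: float = 2.0   # assumed threshold (arbitrary units) to expand
    low_latency: float = 1.0    # assumed threshold (arbitrary units) to contract
    active_paths: int = 1       # start with the single original path

    def update(self, measured_latency: float) -> int:
        """Expand or contract the number of active paths from observed latency."""
        if measured_latency > self.high_latency and self.active_paths < self.max_paths:
            self.active_paths += 1   # load is high: open one more alternative path
        elif measured_latency < self.low_latency and self.active_paths > 1:
            self.active_paths -= 1   # load is low: fall back toward the original path
        return self.active_paths

    def choose_path(self, rng: random.Random) -> int:
        """Pick one of the active paths uniformly to spread the traffic."""
        return rng.randrange(self.active_paths)


if __name__ == "__main__":
    paths = PathSet()
    for latency in [0.5, 2.5, 3.0, 3.5, 4.0, 1.5, 0.8, 0.7]:
        print(latency, paths.update(latency))
```

The two separate thresholds are a design choice of this sketch: the gap between them keeps the number of paths from oscillating when latency hovers around a single cut-off value.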

Published

2008-07-01

How to Cite

Lugones, D., Franco, D., & Luque Fadón, E. (2008). Dynamic routing balancing on InfiniBand network. Journal of Computer Science and Technology, 8(02), p. 104–110. Retrieved from https://journal.info.unlp.edu.ar/JCST/article/view/749

Section

Original Articles
