I recently read an article about optimizing MPI collective calls. I learnt from a reliable source that MPI uses the services provided by TCP/IP or InfiniBand. My question is: suppose the bottleneck is in the TCP/IP layer itself. How will optimizing the MPI calls really help performance?
If you reduce the number of MPI calls, fewer messages have to go through TCP, so the TCP bottleneck will matter less.
But in principle, you're right. If TCP/IP happens to be the bottleneck in your particular application or setup, then the right thing to optimize is the TCP/IP setup. In many cluster computers that use MPI it is not, because the nodes are interconnected by networks that sometimes outperform the persistent storage. Profiling is the key, no matter the size of the project.
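To make the point about fewer calls concrete, here is a rough sketch using the common latency-plus-bandwidth cost model for a message (time = per-message latency + bytes / bandwidth). The function name `transfer_time` and the numbers for latency and bandwidth are assumptions picked for illustration, not measurements of any real TCP or MPI setup, and the model ignores things like congestion and protocol details.

```python
def transfer_time(num_messages, total_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimate the time to move total_bytes split across num_messages.

    Each message pays a fixed latency; the payload pays for bandwidth once.
    """
    return num_messages * latency_s + total_bytes / bandwidth_bytes_per_s


# Assumed, illustrative values (hypothetical, not measured):
LATENCY_S = 50e-6    # fixed overhead per message, in seconds
BANDWIDTH = 125e6    # bytes per second, roughly a 1 Gbit/s link

TOTAL_BYTES = 8_000_000  # same payload in both scenarios

# Scenario A: 10,000 small messages. Scenario B: one aggregated message.
many_small = transfer_time(10_000, TOTAL_BYTES, LATENCY_S, BANDWIDTH)
one_large = transfer_time(1, TOTAL_BYTES, LATENCY_S, BANDWIDTH)

print(f"10,000 messages: {many_small:.5f} s")
print(f"1 message:       {one_large:.5f} s")
```

Under these assumed numbers, the per-message latency dominates the first scenario, which is why cutting down the number of calls can help even when the transport underneath stays the same. Whether that holds in your case is exactly what profiling should tell you.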