Distributed Memory Breadth-First Search Revisited: Enabling Bottom-Up Search

TitleDistributed Memory Breadth-First Search Revisited: Enabling Bottom-Up Search
Publication TypeConference Paper
Year of Publication2013
AuthorsBeamer, S., Buluç A., Asanović K., & Patterson D.
Other Numbers3461
Abstract

Breadth-first search (BFS) is a fundamental graphprimitive frequently used as a building block for many complexgraph algorithms. In the worst case, the complexity of BFS islinear in the number of edges and vertices, and the conventionaltop-down approach always takes as much time as the worst case.A recently discovered bottom-up approach manages to cut downthe complexity all the way to the number of vertices in the bestcase, which is typically at least an order of magnitude less thanthe number of edges. The bottom-up approach is not alwaysadvantageous, so it is combined with the top-down approachto make the direction-optimizing algorithm which adaptivelyswitches from top-down to bottom-up as the frontier expands.We present a scalable distributed-memory parallelization of thischallenging algorithm and show up to an order of magnitudespeedups compared to an earlier purely top-down code. Ourapproach also uses a 2D decomposition of the graph that haspreviously been shown to be superior to a 1D decomposition.Using the default parameters of the Graph500 benchmark, ournew algorithm achieves a performance rate of over 240 billionedges per second on 115 thousand cores of a Cray XE6, whichmakes it over 7× faster than a conventional top-down algorithmusing the same set of optimizations and data distribution.

Acknowledgment

This research used resources of the National Energy ResearchScientific Computing Center, which is supported bythe Office of Science of the U.S. Department of Energy underContract No. DE-AC02-05CH11231. This research also usedresources of the Oak Ridge Leadership Computing Facilitylocated in the Oak Ridge National Laboratory, which is supportedby the Office of Science of the Department of Energyunder Contract DE-AC05-00OR22725. The second author wassupported in part by the DARPA UHPC program under contract HR0011-10-9-0008, and in part by the Director, Office of Science, U.S. Department of Energy under Contract No. DEAC02-05CH11231. Research was also supported by Microsoft (Award #024263) and Intel (Award #024894) funding and by matching funding by U.C. Discovery (Award #DIG07-10227). Additional support comes from Par Lab affiliates National Instruments, Nokia, NVIDIA, Oracle, and Samsung.

URLhttps://www.icsi.berkeley.edu/pubs/arch/ICSI_distributedmemorybreadth13.pdf
Bibliographic Notes

Proceedings of the Workshop on Multithreaded Architectures and Applications (MTAAP-2013), at the International Parallel & Distributed Processing Symposium (IPDPS-2013), Boston, Massachusetts

Abbreviated Authors

S. Beamer, A. Buluç, K. Asanovi?, and D. Patterson

ICSI Research Group

Architecture

ICSI Publication Type

Article in conference proceedings