Brief Announcement: Techniques for Programmatically Troubleshooting Distributed Systems

Title: Brief Announcement: Techniques for Programmatically Troubleshooting Distributed Systems
Publication Type: Conference Paper
Year of Publication: 2013
Authors: Whitlock, S., Scott, C., & Shenker, S. J.
Page(s): 1-3
Other Numbers: 3428
Abstract

The distributed systems research community has developed many provably correct algorithms and abstractions that are in wide use. However, practical implementations of distributed systems often contain many bugs, and practitioners spend much of their time troubleshooting these bugs. In this paper we present an algorithm, retrospective causal inference, to ease the burden of troubleshooting. We end by enumerating several open research problems related to the troubleshooting process.

URL: http://www.icsi.berkeley.edu/pubs/networking/briefannouncetechniques13.pdf
Bibliographic Notes

Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC '13), pp. 1-3, Montréal, Québec, Canada

Abbreviated Authors

S. Whitlock, C. Scott, and S. Shenker

ICSI Research Group

Networking and Security

ICSI Publication Type

Article in conference proceedings