Rollback-Recovery for Middleboxes

Title: Rollback-Recovery for Middleboxes
Publication Type: Conference Paper
Year of Publication: 2015
Authors: Sherry, J., Gao, P. X., Basu, S., Panda, A., Krishnamurthy, A., Maciocco, C., Manesh, M., Martins, J., Ratnasamy, S., Rizzo, L., & Shenker, S. J.
Other Numbers: 3806
Abstract

Network middleboxes must offer high availability, with automatic failover when a device fails. Achieving high availability is challenging because failover must correctly restore lost state (e.g., activity logs, port mappings) but must do so quickly (e.g., in less than typical transport timeout values to minimize disruption to applications) and with little overhead to failure-free operation (e.g., additional per-packet latencies of 10-100s of µs). No existing middlebox design provides failover that is correct, fast to recover, and imposes little increased latency on failure-free operations.

We present a new design for fault-tolerance in middleboxes that achieves these three goals. Our system, FTMB (for Fault-Tolerant MiddleBox), adopts the classical approach of “rollback recovery” in which a system uses information logged during normal operation to correctly reconstruct state after a failure. However, traditional rollback recovery cannot maintain high throughput given the frequent output rate of middleboxes. Hence, we design a novel solution to record middlebox state which relies on two mechanisms: (1) ‘ordered logging’, which provides lightweight logging of the information needed after recovery, and (2) a ‘parallel release’ algorithm which, when coupled with ordered logging, ensures that recovery is always correct. We implement ordered logging and parallel release in Click and show that for our test applications our design adds only 30µs of latency to median per packet latencies. Our system
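The classical rollback-recovery idea mentioned in the abstract (log information during normal operation, then use it to reconstruct state after a failure) can be illustrated with a minimal sketch. This is not FTMB and does not attempt ordered logging or parallel release, whose details are not given in this record. The names `ToyNat`, `RollbackRecovery`, and `demo` are hypothetical, and the sketch assumes a single-threaded, deterministic middlebox whose only state is a table of port mappings.

```python
import copy


class ToyNat:
    """Hypothetical middlebox: assigns an external port to each new flow."""

    def __init__(self, first_port=5000):
        self.next_port = first_port
        self.mappings = {}

    def process(self, flow):
        # Allocate a port the first time a flow is seen, then reuse it.
        if flow not in self.mappings:
            self.mappings[flow] = self.next_port
            self.next_port += 1
        return self.mappings[flow]


class RollbackRecovery:
    """Generic checkpoint-plus-log recovery (an assumption, not FTMB's design)."""

    def __init__(self, middlebox):
        self.middlebox = middlebox
        self.snapshot = copy.deepcopy(middlebox)  # last checkpoint
        self.log = []                             # inputs since the checkpoint

    def handle(self, flow):
        # Log the input before processing so it can be replayed later.
        self.log.append(flow)
        return self.middlebox.process(flow)

    def checkpoint(self):
        # Save the current state and drop log entries it already covers.
        self.snapshot = copy.deepcopy(self.middlebox)
        self.log.clear()

    def recover(self):
        # Restore the checkpoint, then replay logged inputs in order.
        replica = copy.deepcopy(self.snapshot)
        for flow in self.log:
            replica.process(flow)
        self.middlebox = replica
        return replica


def demo():
    rr = RollbackRecovery(ToyNat())
    rr.handle("10.0.0.1:1234")
    rr.checkpoint()
    rr.handle("10.0.0.2:4321")
    rr.handle("10.0.0.1:1234")
    before = dict(rr.middlebox.mappings)
    rr.middlebox = None            # simulate losing the live instance
    after = rr.recover().mappings  # rebuild state from snapshot and log
    return before, after
```

Calling `demo()` returns the port mappings before the simulated failure and after recovery; in this deterministic toy they are equal. Replay alone is enough here only because processing is single-threaded and deterministic, which is an assumption of the sketch.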

URL: http://www.icsi.berkeley.edu/pubs/networking/rollbackrecovery15.pdf
Bibliographic Notes

Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication (SIGCOMM 2015), London, United Kingdom

Abbreviated Authors

J. Sherry, P. Gao, S. Basu, A. Panda, A. Krishnamurthy, C. Maciocco, M. Manesh, J. Martins, S. Ratnasamy, L. Rizzo, and S. Shenker

ICSI Research Group

Networking and Security

ICSI Publication Type

Article in conference proceedings