ResQ: Enabling SLOs in Network Function Virtualization

Title: ResQ: Enabling SLOs in Network Function Virtualization
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Tootoonchian, A., Panda, A., Lan, C., Walls, M., Argyraki, K., Ratnasamy, S., & Shenker, S. J.
Published in: Proceedings of NSDI 2018
Abstract

Network Function Virtualization is allowing carriers to replace dedicated middleboxes with Network Functions (NFs) consolidated on shared servers, but the question of how (and even whether) one can achieve performance SLOs with software packet processing remains open. A key challenge is the high variability and unpredictability in throughput and latency introduced when NFs are consolidated. We show that, using processor cache isolation and with careful sizing of I/O buffers, we can directly enforce a high degree of performance isolation among consolidated NFs – for a wide range of NFs, our technique caps the maximum throughput degradation to 2.9% (compared to 44.3%), and the 95th percentile latency degradation to 2.5% (compared to 24.5%). Building on this, we present ResQ, a resource manager for NFV that enforces performance SLOs for multi-tenant NFV clusters in a resource efficient manner. ResQ achieves 60%-236% better resource efficiency for enforcing SLOs that contain contention-sensitive NFs compared to previous work.

Acknowledgment

We would like to thank Andrew Herdrich, Edwin Verplanke, Priya Autee, Christian Maciocco, Charlie Tai, Rich Uhlig, Michael Alan Chang, Yashar Ganjali, David Lie, Hans-Arno Jacobsen, our shepherd Tim Wood, and the NSDI reviewers for their comments and suggestions. This work was funded in part by NSF-1553747, NSF-1704941, and Intel Corporation.

URL: https://www.usenix.org/system/files/conference/nsdi18/nsdi18-tootoonchian.pdf
ICSI Research Group: Networking and Security