Perfect Network Resilience in Polynomial Time
Matthias Bentert, Stefan Schmid
TL;DR
This work resolves the question of when perfect resilience is achievable by local rerouting that depends only on the in-port, the target, and the incident link failures. It gives a complete characterization via four rooted obstructions, $K_5 \setminus e$, $K_{3,3} \setminus e$, $K_{3,4} \setminus 2e$, and a subdivided $K_{2,4}$: a rooted graph is perfectly resilient exactly when none of them appears as a rooted minor. The authors design a linear-time decision algorithm and an $O(nm)$-time synthesis algorithm that outputs compact skipping forwarding rules, leveraging planar and outerplanar structure and hierarchical embeddings. Skipping forwarding rules are thereby shown to be as powerful as general forwarding patterns for perfect resilience, which partially answers a long-standing open question. The results are relevant to decentralized network design and motivate extensions to more powerful routing models such as source-aware or header-rewriting schemes.
Abstract
Modern communication networks support local fast rerouting mechanisms to quickly react to link failures: nodes store a set of conditional rerouting rules which define how to forward an incoming packet in case of incident link failures. The rerouting decisions at any node $v$ must rely solely on local information available at $v$: the link from which a packet arrived at $v$, the target of the packet, and the incident link failures at $v$. Ideally, such rerouting mechanisms provide perfect resilience: any packet is routed from its source to its target as long as the two are connected in the underlying graph after the link failures. Already in their seminal paper at ACM PODC '12, Feigenbaum, Godfrey, Panda, Schapira, Shenker, and Singla showed that perfect resilience cannot always be achieved. While the design of local rerouting algorithms has received much attention since then, we still lack a detailed understanding of when perfect resilience is achievable. This paper closes this gap and presents a complete characterization of when perfect resilience can be achieved. This characterization also allows us to design an $O(n)$-time algorithm to decide whether a given instance is perfectly resilient and an $O(nm)$-time algorithm to compute perfectly resilient rerouting rules whenever it is. Our algorithm is also attractive for the simple structure of the rerouting rules it uses, known as skipping in the literature: alternative links are chosen according to an ordered priority list (per in-port), where failed links are simply skipped. Intriguingly, our result also implies that in the context of perfect resilience, skipping rerouting rules are as powerful as more general rerouting rules. This partially answers a long-standing open question by Chiesa, Nikolaevskiy, Mitrovic, Gurtov, Madry, Schapira, and Shenker [IEEE/ACM Transactions on Networking, 2017] in the affirmative.
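To make the skipping mechanism concrete, the following minimal Python sketch simulates forwarding with one ordered priority list per (node, in-port) pair for a single fixed target. It is an illustration only, not the paper's synthesis algorithm: the names `link`, `next_hop`, `route`, and `TRIANGLE_TABLE`, as well as the three-node example graph, are assumptions introduced here.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Node = str
Link = FrozenSet[Node]
# Hypothetical representation: for a fixed target, map (node, in_port) to an
# ordered priority list of neighbours. in_port is None when the packet
# originates at the node.
SkippingTable = Dict[Tuple[Node, Optional[Node]], List[Node]]


def link(u: Node, v: Node) -> Link:
    """Undirected link between u and v."""
    return frozenset((u, v))


def next_hop(table: SkippingTable, node: Node, in_port: Optional[Node],
             failed: Set[Link]) -> Optional[Node]:
    """Skipping rule: take the first neighbour in the priority list whose
    link to `node` has not failed; failed links are simply skipped."""
    for nbr in table[(node, in_port)]:
        if link(node, nbr) not in failed:
            return nbr
    return None  # every incident link in the list has failed


def route(table: SkippingTable, source: Node, target: Node,
          failed: Set[Link]) -> Optional[List[Node]]:
    """Forward a packet hop by hop using only local information.
    Returns the path taken, or None if the packet is dropped or loops."""
    path = [source]
    node, in_port = source, None
    seen = set()
    while node != target:
        state = (node, in_port)
        if state in seen:
            # Forwarding is deterministic, so a repeated state is a
            # permanent forwarding loop.
            return None
        seen.add(state)
        nxt = next_hop(table, node, in_port, failed)
        if nxt is None:
            return None
        path.append(nxt)
        node, in_port = nxt, node
    return path


# Toy example (an assumption for illustration): triangle a, b, t, target t.
# Each node first tries the direct link to t, then the other neighbour.
TRIANGLE_TABLE: SkippingTable = {
    ("a", None): ["t", "b"],
    ("a", "b"): ["t", "b"],
    ("b", None): ["t", "a"],
    ("b", "a"): ["t", "a"],
}

if __name__ == "__main__":
    print(route(TRIANGLE_TABLE, "a", "t", set()))
    print(route(TRIANGLE_TABLE, "a", "t", {link("a", "t")}))
    print(route(TRIANGLE_TABLE, "a", "t", {link("a", "t"), link("b", "t")}))
```

In this toy table the packet reaches `t` whenever `t` is still connected to the source, and it is dropped or detected as looping only when the failures disconnect the two. Checking that property for every failure set is exactly what perfect resilience requires; the sketch does not attempt to construct such tables for general graphs.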
