Resilient Networks for AI Data Centres: The Dimension of Operational Risk That Investors Overlook
Viavi is a global provider of test and measurement, service assurance, AIOps, and network optimisation solutions.
Objective:
- What does “resilience” actually mean when a network sits between a fibre plant and an AI factory?
- Which resilience dimensions of AI-DC infrastructure are currently invisible to investor due-diligence?
- How does fibre / optical / Ethernet resilience translate — or fail to translate — into the AI-fabric layer (NVLink, InfiniBand, RoCE)?
- What should investors require operators to measure, monitor, and report, and what is the cost of getting it wrong?
- Where do existing AIOps and T&M capabilities (already proven in classical telecom resilience) already fit, and where does AI-DC resilience demand genuinely new telemetry?
Content of the Roundtable:
- The shifting definition of "network resilience" — from packet-loss tolerance to per-GPU-hour SLO compliance
- The resilience stack in an AI Data Centre — optical / fibre, Ethernet, NVLink / InfiniBand fabric, facility (power & cooling), application / inference
- Failure modes that classical resilience monitoring misses: optic BER drift, NIC retransmit creep, NCCL collective stalls, partial fabric degradation, rack-level thermal events
- What classical AIOps platforms already monitor today (multi-source ingestion · ML-driven RCA · next-best-action · digital twin · SLE-breach detection) — and what doesn't yet exist for the AI-fabric layer
- The 5K-GPU reference: what a "blind" cluster loses in a year, and how that flows through to investor cash flows
- Investor due-diligence in 2026: what belongs on the resilience checklist for AI-DC asset acquisitions
- Open discussion: what would change in your DD process if all of this were visible in real time?
Speakers
- Holger Eber
- Jürgen Voss
