Disaster recovery, orchestrated.

Sayf protects your Kubernetes workloads with automated cross-region drills, validated failover, and continuous health checks — built on Velero, hardened on real production workloads.


Backup is not recovery.

Most Kubernetes teams have backups. Few can answer the harder question: if our primary region went down right now, how long would it take to recover, and how do we know?

Velero captures your data. It does not orchestrate the recovery. It does not validate that the restored cluster actually works. It does not tell you whether tomorrow's recovery will be different from today's.

The gap between “backup taken” and “recovery proven” is where production incidents live.

Recovery you can actually rely on.

Automated drills

Scheduled drills run end-to-end against your standby region — backup, restore, application validation. If something drifts, you find out at 02:00, not during a real outage.
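The three drill phases named above (backup, restore, application validation) can be sketched with stock Velero and kubectl commands. This is a minimal illustration, not Sayf's drill engine: the context names, the namespace, and the `run` and `drill` helpers are hypothetical, and it assumes both clusters point Velero at the same backup storage location so the standby can see backups taken on the primary.

```shell
#!/usr/bin/env bash
# Hypothetical drill sketch. Context names and namespace are placeholders.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands instead of running them
PRIMARY_CTX="${PRIMARY_CTX:-primary}"   # assumed kubeconfig context, primary region
STANDBY_CTX="${STANDBY_CTX:-standby}"   # assumed kubeconfig context, standby region
NAMESPACE="${NAMESPACE:-shop}"          # assumed application namespace

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

drill() {
  local name="$1"
  # 1. Backup: snapshot the namespace from the primary cluster.
  run velero --kubecontext "$PRIMARY_CTX" backup create "$name" \
    --include-namespaces "$NAMESPACE" --wait
  # 2. Restore: replay that backup into the standby cluster.
  run velero --kubecontext "$STANDBY_CTX" restore create "$name-restore" \
    --from-backup "$name" --wait
  # 3. Validate: wait until every deployment reports Available.
  run kubectl --context "$STANDBY_CTX" -n "$NAMESPACE" wait deployment --all \
    --for=condition=Available --timeout=600s
}
```

Calling `drill nightly-drill` with the default `DRY_RUN=1` only prints the three commands, which makes the sequence easy to review before pointing it at real clusters.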

Continuous validation

Read-only health checks run multiple times daily, confirming your standby cluster is in a recoverable state. The system's readiness is observed, not assumed.
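As a rough illustration of what a read-only check can look like, the sketch below asks the standby cluster whether its Velero backup storage locations report the `Available` phase. It only issues a `kubectl get`, so it changes nothing. The function names and the `velero` namespace are assumptions for the example, and a real readiness check would cover far more than this one signal.

```shell
#!/usr/bin/env bash
# Hypothetical read-only readiness probe. Not Sayf's implementation.
set -euo pipefail

# Succeeds only if the space-separated list is non-empty and every
# entry is exactly "Available".
all_available() {
  local phases="$1" phase
  [ -n "$phases" ] || return 1
  for phase in $phases; do
    [ "$phase" = "Available" ] || return 1
  done
  return 0
}

# Reads BackupStorageLocation phases from the given kubeconfig context.
# Assumes Velero is installed in the "velero" namespace.
check_standby() {
  local ctx="$1" phases
  phases="$(kubectl --context "$ctx" -n velero \
    get backupstoragelocations.velero.io \
    -o jsonpath='{.items[*].status.phase}')"
  if all_available "$phases"; then
    echo "standby: backup storage available"
  else
    echo "standby: backup storage NOT available" >&2
    return 1
  fi
}
```

Run from a scheduler, `check_standby standby` exits non-zero as soon as any storage location stops being available, which is the kind of drift worth hearing about before an outage.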

Operator-aware recovery

Sayf understands operator-managed databases, starting with CloudNativePG and expanding to MySQL, MongoDB, Kafka, and Redis. Generic backup tools miss the subtleties; Sayf doesn't.
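One example of such a subtlety: a restored CloudNativePG `Cluster` object can exist long before its PostgreSQL instances are ready. The sketch below compares `status.readyInstances` with `spec.instances` for each cluster and lists the ones that have not caught up. The helper names are hypothetical and this is only an assumed illustration of an operator-aware check, not a description of how Sayf does it.

```shell
#!/usr/bin/env bash
# Hypothetical operator-aware check for CloudNativePG clusters.
set -euo pipefail

# Reads "name ready/desired" lines on stdin and prints the names of
# clusters whose ready count is missing or differs from the desired count.
unhealthy_clusters() {
  awk '{ split($2, n, "/"); if (n[1] != n[2] || n[2] == "") print $1 }'
}

# Emits one "name ready/desired" line per CNPG cluster in a namespace.
list_cnpg_clusters() {
  local ctx="$1" ns="$2"
  kubectl --context "$ctx" -n "$ns" get clusters.postgresql.cnpg.io \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.readyInstances}{"/"}{.spec.instances}{"\n"}{end}'
}
```

Piping the two together, `list_cnpg_clusters standby shop | unhealthy_clusters`, prints nothing when every cluster has all of its instances ready.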

Documented procedures

Every drill produces an auditable record. Every failover and failback has a documented procedure with a RACI matrix, so each step has a named owner. When the moment arrives, the procedure has already been rehearsed.

Why Sayf is different

Backup tools capture data. Sayf orchestrates recovery.

Sayf is built for teams who need to prove their disaster recovery works, not just check a box that says backups exist.

We orchestrate the full lifecycle: scheduled drills with drift detection, recovery validation that catches the subtle failures backup tools miss, failover procedures that don't depend on heroes, and a continuous validation surface between drills.

Built on the Velero open-source foundation. Hardened on real Saudi enterprise workloads. Designed for the teams who run cross-region production today.

Built on real recoveries, not lab demos.

A Saudi retail group needed cross-region disaster recovery for their production Kubernetes platform on Huawei Cloud. The scope: six PostgreSQL clusters, twelve connection poolers, and twenty-eight application deployments, with a recovery time objective (RTO) under 60 minutes.

47 min: RTO, incremental

57 min: RTO, full destructive

100%: validation success

failure modes fixed

Across an eight-week engagement, we built the orchestration, identified twenty specific failure modes in stock backup tooling, and validated recovery end-to-end. Drills run automatically today, with full success across all validated runs. Sayf is the productized version of what we built for them.

Open core

Open by default.

The DR drill engine that ships with Sayf is open source under Apache 2.0. The orchestration script used in the original customer engagement, sanitized and generalized, is available on GitHub.

Run it yourself if you want. Sayf's commercial product is the multi-tenant SaaS, the management UI, the workload coverage, and the support — built on top of the same open foundation.

$ git clone \
    https://github.com/sayf-io/sayf-cnpg-dr
$ cd sayf-cnpg-dr
$ ./sayf-cnpg-dr.sh \
    --phase validate
✓ Pre-flight checks OK
✓ All clusters healthy
✓ All applications available

Early access for teams who want to shape the product.

We're working with a small number of design partners through 2026: enterprise teams running production Kubernetes who need DR done right and are willing to help build it.