Disaster Recovery (DR)

Introduction

Downtime is more than an inconvenience: it can cause lost revenue, reduced customer trust, and compliance risk. Disaster Recovery (DR) supports business continuity by enabling workloads to recover quickly and reliably when outages occur.

Traditional DR processes are often manual, slow, and error-prone, leading to extended downtime. Modern applications require automated, testable, and compliant DR solutions that can operate across regions and meet strict recovery objectives.

Why It Matters

Organizations face several challenges without a robust DR system:

  • Backups alone are not enough: they often fail to meet Recovery Point Objectives (RPOs).
  • Without cross-region failover, high availability and compliance readiness are limited.
  • Auditability gaps reduce confidence during compliance reviews.
  • Slow, manual recovery processes increase outage impact.

A strong DR solution addresses these issues by reducing downtime, simplifying operations, and providing confidence during audits.

Solution Overview

The Disaster Recovery solution is designed to provide automation, reliability, and compliance. It supports automated recovery of VMs and their attached storage across regions, with both UI and API-based operations.

What Is DRaaS?

Disaster Recovery as a Service (DRaaS) protects cloud virtual machines from regional outages by continuously replicating data to a standby copy in a different E2E Networks data center.

If your primary region (for example, Delhi) experiences an outage due to hardware failure, network disruption, or any other incident, you can recover to your target region (for example, Chennai) in approximately 5 minutes, with data loss limited by your configured replication interval.

How It Works

SOURCE REGION (for example, Delhi)         TARGET REGION (for example, Chennai)
===================================        ====================================

Your VM (Active / Running)          -----> Standby VM (Powered Off)
+ Attached Block Volumes                   + Replica Volumes

  • Your source VM continues to run normally.
  • The target VM stays powered off in standby and starts only during a DR drill or recovery.
  • At every RPO interval, snapshots are shipped to the target region and stored as recovery points.
  • During disaster recovery, you select a recovery point and restore the target resources to that state.
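
The selection step in the last bullet can be illustrated with a short sketch. It assumes recovery points are simple timestamped records; the `RecoveryPoint` class, its field names, and the `latest_point_before` helper are hypothetical and are not part of the DRaaS API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class RecoveryPoint:
    # Hypothetical record; field names are illustrative, not the product's schema.
    point_id: str
    created_at: datetime


def latest_point_before(points: list[RecoveryPoint], cutoff: datetime) -> Optional[RecoveryPoint]:
    """Return the newest recovery point taken at or before `cutoff`, or None."""
    candidates = [p for p in points if p.created_at <= cutoff]
    return max(candidates, key=lambda p: p.created_at, default=None)


# Three scheduled points, four hours apart.
points = [
    RecoveryPoint("rp-1", datetime(2024, 1, 1, 0, 0)),
    RecoveryPoint("rp-2", datetime(2024, 1, 1, 4, 0)),
    RecoveryPoint("rp-3", datetime(2024, 1, 1, 8, 0)),
]

# Suppose the data was last known to be good at 06:30.
chosen = latest_point_before(points, datetime(2024, 1, 1, 6, 30))
print(chosen.point_id)  # rp-2
```

Picking the newest point before a known-good time is only one possible policy; in practice you might prefer an older point if the most recent one captured corrupted data.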

Recovery Objectives

  • Recovery Point Objective (RPO): the maximum amount of data, measured as a span of time, that might be lost in an outage.
  • Recovery Time Objective (RTO): how quickly systems can be restored after an outage.

The implementation targets an RTO of approximately 5 minutes and supports a configurable RPO from 1 hour to 240 hours.
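
As a worked example of how the two objectives combine, the sketch below computes the data-loss window and an estimated time back online for one outage. The function names and timestamps are hypothetical, and the estimate assumes replication ran on schedule and that recovery meets the approximate 5-minute target.

```python
from datetime import datetime, timedelta

RTO_TARGET = timedelta(minutes=5)  # approximate target stated in the text


def data_loss_window(last_recovery_point: datetime, outage_start: datetime) -> timedelta:
    """Changes made after the last recovery point and before the outage are lost."""
    return outage_start - last_recovery_point


def expected_back_online(recovery_started: datetime) -> datetime:
    """Estimated restore time, assuming the RTO target is met."""
    return recovery_started + RTO_TARGET


rpo = timedelta(hours=4)                     # example configured RPO
last_point = datetime(2024, 1, 1, 8, 0)      # last replicated snapshot
outage = datetime(2024, 1, 1, 11, 15)        # source region goes down
recovery_start = datetime(2024, 1, 1, 11, 30)

lost = data_loss_window(last_point, outage)
print(lost)                                  # 3:15:00
print(lost <= rpo)                           # True
print(expected_back_online(recovery_start))  # 2024-01-01 11:35:00
```

Note that total downtime also includes the time between the outage and the decision to start recovery, which the RTO does not cover.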

Key Concepts

| Term | Meaning |
| --- | --- |
| DR Plan | Configuration linking your source VM (and selected volumes) to a standby replica in another region. One plan per source VM. |
| Source VM | Production VM you want to protect. |
| Target VM | Standby replica VM created in the target region, powered off until a drill or recovery. |
| RPO (Recovery Point Objective) | Replication frequency. Example: an RPO of 4 hours means up to 4 hours of data loss in the worst case. Range: 1 to 240 hours. |
| RTO (Recovery Time Objective) | Time needed to bring systems back online after a disaster. Approximate target is 5 minutes. |
| Recovery Point | Snapshot of the VM disk and attached volumes at a specific time, stored in the target region. |
| Scheduled Recovery Point | Recovery point created automatically by DRaaS on the RPO schedule. |
| Manual Recovery Point | Recovery point created on demand by user action. |
| Retention Period | How long recovery points are retained before automatic deletion. Range: 1 to 365 days. |
| DR Drill | Non-destructive test using recovery points to verify recoverability without impacting production traffic. |
| Recovery | Real, one-way, terminal failover operation from the source region to the target region. |
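
The RPO and retention ranges above interact: together they bound how many scheduled recovery points exist at once. The sketch below validates both settings against the documented ranges and estimates that count. `DRPlanSettings` and its method are hypothetical names, and the estimate assumes every scheduled point is kept until its retention period expires and ignores manual recovery points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DRPlanSettings:
    # Hypothetical container; a real DR plan has more fields than these two.
    rpo_hours: int       # documented range: 1 to 240
    retention_days: int  # documented range: 1 to 365

    def __post_init__(self) -> None:
        if not 1 <= self.rpo_hours <= 240:
            raise ValueError("rpo_hours must be between 1 and 240")
        if not 1 <= self.retention_days <= 365:
            raise ValueError("retention_days must be between 1 and 365")

    def max_scheduled_points(self) -> int:
        """Upper estimate of scheduled recovery points held at any one time."""
        return (self.retention_days * 24) // self.rpo_hours


plan = DRPlanSettings(rpo_hours=4, retention_days=7)
print(plan.max_scheduled_points())  # 42
```

A shorter RPO or a longer retention period increases the number of recovery points held in the target region, which is worth keeping in mind when choosing both values.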

Key Considerations

  • Persistent storage replication keeps standby systems consistent.
  • Automation-first workflows reduce human error under stress.
  • Cross-region readiness improves resilience and compliance posture.
  • Audit logs and reports provide evidence of DR readiness.

Benefits

  • Reduced downtime through faster recovery.
  • Improved compliance with auditable action history.
  • Operational simplicity through automated workflows.
  • Higher customer confidence in service continuity.

Summary

With automated recovery, cross-region support, and built-in audit logging, Disaster Recovery helps organizations maintain operations during unexpected outages while remaining compliant and auditable.

Next Steps