Security, Compliance, and Common Mistakes
1. Security & Access Control
Scope DR Plans to the Correct Project
DRaaS plans are scoped to a project in your account. Use projects to isolate DR resources:
- Create separate projects for production vs. staging environments
- Restrict who can manage DR plans for production projects
- Avoid creating DR plans for staging environments under the same project as production
Protect DR Plan Operations with Access Control
The ability to trigger a recovery is one of the most consequential operations in your infrastructure. Ensure that only authorized personnel can:
- Start a recovery
- Delete a DR plan
- Start or stop a DR drill
Audit logs capture who performed each action and when. Review them periodically to confirm only expected users are making DR changes.
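A periodic review like this can be partly automated. The sketch below flags DR-plan actions taken by anyone outside an approved operator list; the event field names (`actor`, `action`, `timestamp`) and the action names are assumptions, so adapt them to your provider's actual audit-log export format.

```python
# Sketch: flag DR-plan audit events performed by users outside an approved list.
# Field names and action names are assumptions -- match them to the real
# audit-log export schema of your DRaaS provider.
APPROVED_DR_OPERATORS = {"alice@example.com", "bob@example.com"}

DR_ACTIONS = {"recovery_start", "plan_delete", "drill_start", "drill_stop"}

def unexpected_dr_events(events):
    """Return audit events where a DR action was taken by a non-approved user."""
    return [
        e for e in events
        if e["action"] in DR_ACTIONS and e["actor"] not in APPROVED_DR_OPERATORS
    ]

sample_events = [
    {"actor": "alice@example.com", "action": "drill_start",
     "timestamp": "2024-05-01T10:00:00Z"},
    {"actor": "mallory@example.com", "action": "plan_delete",
     "timestamp": "2024-05-02T03:12:00Z"},
]

for e in unexpected_dr_events(sample_events):
    print(f"REVIEW: {e['actor']} performed {e['action']} at {e['timestamp']}")
```

Running this against each monthly export turns "review them periodically" into a concrete, repeatable check.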
Secure Target VM Access Credentials
The target VM uses the same credentials as your source VM (same OS image, same SSH key). Ensure:
- SSH keys are stored securely
- The target VM's expected IP is documented for use during recovery
- Access credentials are stored somewhere independent of the source region (so a single regional outage cannot take down both your source VM and your credential store at once)
Treat the Recovered VM as a New Instance
After a recovery, the recovered VM becomes your primary system. Harden it as you would any production system:
- Rotate SSH keys if they may have been compromised
- Update any secrets or API keys that reference the old VM's IP
- Review security group rules — ensure only expected ports are open
2. Compliance & Governance
Maintain a DR Policy Document
Many compliance frameworks (ISO 27001, SOC 2, RBI guidelines) require a formal Business Continuity and Disaster Recovery (BCDR) policy. Your DRaaS configuration should be documented in this policy, including:
- Which systems are protected and at what RPO/RTO
- The retention period for recovery points and the justification
- The drill schedule and the process for documenting results
- The escalation path and decision criteria for declaring a recovery
Use Audit Logs for Compliance Evidence
The DRaaS audit log captures a complete record of all plan creation, modification, and operational events. Export these logs regularly (monthly at minimum) and retain them according to your compliance requirements.
Audit evidence to collect:
- Proof that DR plans exist for critical systems
- Proof that drills were conducted (drill start/stop events)
- Proof that RPO/retention settings were deliberately configured and maintained
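To make the "export regularly" step routine, the sketch below archives an audit-log export under a dated filename suitable for retention. How you obtain the events (the export endpoint, authentication) is provider-specific and not shown; only the archival step is illustrated.

```python
# Sketch: archive a monthly audit-log export under a retention-friendly name.
# Fetching the events from the provider's export endpoint is out of scope here;
# this only shows the evidence-archival step.
import datetime
import json
import pathlib

def archive_audit_log(events, archive_dir="audit-archive"):
    """Write this month's events to a dated JSON file for compliance evidence."""
    path = pathlib.Path(archive_dir)
    path.mkdir(exist_ok=True)
    stamp = datetime.date.today().strftime("%Y-%m")
    out = path / f"draas-audit-{stamp}.json"
    out.write_text(json.dumps(events, indent=2))
    return out

# events would come from your provider's audit-log export
archived = archive_audit_log([{"action": "drill_start",
                               "actor": "alice@example.com"}])
print(f"archived to {archived}")
```

Dated filenames make it easy to demonstrate continuous monthly collection to an auditor.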
Enforce Minimum Drill Frequency
For compliance frameworks that require periodic testing:
| Framework | Typical Requirement |
|---|---|
| ISO 27001 | At least annual testing of BCP/DRP |
| SOC 2 | Testing aligned with recovery objectives |
| RBI IT Framework | Annual DR drill minimum |
| DPDP Act (India) | Appropriate technical safeguards including recovery capability |
Schedule drills on a calendar ahead of time so they are not overlooked. Assign responsibility to a specific team member.
Review RPO/RTO Against Regulatory Requirements
Some regulations specify maximum acceptable data loss (RPO) or downtime (RTO) for certain data categories. Verify that your configured RPO and RTO align with the regulatory requirements that apply to your workloads.
3. Common Mistakes to Avoid
Mistake 1: Creating a DR Plan Without Testing It
Many teams create a DR plan and assume it works. It may not. A misconfigured recovery point, an OS image that differs slightly at the target, or a missing volume can all prevent a successful recovery.
Best practice: Create the plan, wait for the first few recovery points to complete, then immediately run a DR drill to confirm end-to-end recovery works.
Mistake 2: Not Including All Required Volumes
Creating a DR plan without including a critical data volume means that volume will not be replicated. The plan will appear healthy, but a recovery will leave you without that data.
Best practice: List all volumes your application depends on when creating the DR plan. Review the plan details to confirm all volumes appear in the target mapping.
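The review can be reduced to a set comparison. In this sketch both volume lists are placeholders; in practice you would take the attached volumes from the VM's details and the plan volumes from the DR plan's target mapping.

```python
# Sketch: cross-check volumes attached to the source VM against volumes listed
# in the DR plan's target mapping. Both sets are placeholder values -- fetch
# the real lists from your console or plan details.
attached_volumes = {"root-vol", "data-vol-1", "data-vol-2"}
plan_volumes = {"root-vol", "data-vol-1"}

missing = attached_volumes - plan_volumes
if missing:
    print(f"WARNING: not replicated by the DR plan: {sorted(missing)}")
```

Any volume in the `missing` set would be silently absent after a recovery, which is exactly the failure mode described above.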
Mistake 3: Choosing an RPO Without Knowing Your Data Change Rate
Setting a 1-hour RPO on a VM that barely changes wastes money. Setting a 24-hour RPO on a high-transaction database creates unacceptable risk.
Best practice: Review the size of your first few recovery points. If they are very small, your data change rate is low and you can likely increase the RPO interval. Adjust based on actual data, not assumptions.
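As a rough illustration of "adjust based on actual data," the sketch below turns recent recovery-point sizes into a change rate per hour and flags RPO intervals that look mismatched. The thresholds are illustrative assumptions, not provider recommendations.

```python
# Sketch: estimate data change rate from recent recovery-point sizes and flag
# whether the configured RPO interval looks mismatched. The 0.1 and 10 GB/hour
# thresholds are illustrative assumptions only.
def suggest_rpo(recovery_point_sizes_gb, rpo_hours):
    """Rough heuristic: average GB changed per hour across recent recovery points."""
    avg_gb = sum(recovery_point_sizes_gb) / len(recovery_point_sizes_gb)
    gb_per_hour = avg_gb / rpo_hours
    if gb_per_hour < 0.1:
        return "low change rate -- consider a longer RPO interval"
    if gb_per_hour > 10:
        return "high change rate -- consider a shorter RPO interval"
    return "change rate looks proportionate to the configured RPO"

# A VM whose hourly recovery points are tiny is a candidate for a longer RPO.
print(suggest_rpo([0.05, 0.08, 0.06], rpo_hours=1))
```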
Mistake 4: Not Having the Target VM's IP Documented Before a Disaster
During a real regional failure, the E2E console for the source region may be inaccessible. If you do not know your target VM's IP address in advance, you cannot connect to it after recovery.
Best practice: Retrieve the target VM's IP from the plan details endpoint while the source region is healthy, and store it in your runbook alongside other recovery information.
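A small helper can make this part of your runbook maintenance. In this sketch the response shape (`target_vm`, `ip_address`) is a hypothetical assumption; match it to the actual schema returned by your provider's plan-details endpoint, and replace the sample dict with the real API call.

```python
# Sketch: pull the target VM's IP out of a plan-details response and write it
# to a local runbook file. The "target_vm"/"ip_address" field names are
# hypothetical -- adapt them to the real plan-details response schema.
def record_target_ip(plan_details, runbook_path="runbook-target-ip.txt"):
    """Extract the target IP and persist it for use if the source region is down."""
    target_ip = plan_details["target_vm"]["ip_address"]  # hypothetical fields
    with open(runbook_path, "w") as fh:
        fh.write(f"Target VM IP (recorded while source region was healthy): "
                 f"{target_ip}\n")
    return target_ip

# Example response shape -- replace with the JSON body of the real API call.
sample_details = {"target_vm": {"ip_address": "203.0.113.42"}}
print(record_target_ip(sample_details))
```

Store the resulting file with the rest of your out-of-band recovery documentation, not only on infrastructure in the source region.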
Mistake 5: Failing Over Without a Checklist
Under the stress of a real outage, it is easy to forget steps like updating DNS, restarting application services in the right order, or notifying customers.
Best practice: Maintain a written post-recovery checklist (for example, following the guidance in the Disaster Recovery Drills and Recovery Runbook) and store it somewhere accessible outside the E2E platform.
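One way to keep the checklist actionable under stress is to encode it as an ordered list that an operator walks through step by step. The step names below are illustrative examples, not a complete checklist for any particular application.

```python
# Sketch: a minimal post-recovery checklist runner. Steps are walked in order
# and unconfirmed steps are reported. The step names are illustrative only.
POST_RECOVERY_STEPS = [
    "Verify target VM is reachable over SSH",
    "Check application data integrity",
    "Start application services in dependency order",
    "Update DNS to point at the target VM IP",
    "Notify stakeholders that failover is complete",
]

def run_checklist(steps, confirm):
    """Call confirm(step) for each step; return the steps that were not confirmed."""
    return [s for s in steps if not confirm(s)]

# In a real outage, confirm would prompt a human operator; here we auto-confirm.
remaining = run_checklist(POST_RECOVERY_STEPS, confirm=lambda step: True)
print("all steps complete" if not remaining else f"incomplete: {remaining}")
```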
Mistake 6: Not Protecting New Volumes Added After Plan Creation
Existing plans do not automatically include newly attached volumes. A volume added to the source VM after plan creation will not be replicated, even though the plan still appears healthy.
Best practice: Whenever you attach a new volume to a protected VM, confirm the target VM in the attachment pop-up so the volume is brought into the plan's replication scope.
Mistake 7: Treating Recovery as an Automated Process That Needs No Human Oversight
DRaaS automates the infrastructure recovery — the target VM powers on with your data restored. However, application recovery (starting services, checking data integrity, updating DNS, notifying teams) is not automated and requires human action.
Best practice: Do not assume that a successful recovery means your service is restored. Work through your post-recovery checklist for every recovery, drill or real.