Storm Warning: Check Your DR Plans

As a security and operations professional, I monitor many avenues of communication. Much like the various diagrams a weather forecaster displays on the nightly news, these can be leading indicators of events before they occur or before they become front-page news.

From my overnight feeds, I have noted a set of possible threats to a hosting provider. While my confidence that the attack will actually be carried out is low, it is not outside the realm of possibility; as such, it is worth the exercise to review backups, disaster recovery plans, and operational resiliency.

I will apologise, as this missive is being written in a somewhat hurried and annoyed fashion.

We have seen before the chaos that can occur when a major cloud provider has operational incidents; this can be multiplied many times over when the incident is intentional, planned, and coordinated.

As I carry a number of AWS certifications, I will use AWS as an EXAMPLE.

From AWS:

AWS has the concept of a Region, which is a physical location around the world where we cluster data centers. We call each group of logical data centers an Availability Zone. Each AWS Region consists of multiple, isolated, and physically separate AZ’s within a geographic area.

And let us discuss the AWS Concept of Availability Zones:

An Availability Zone (AZ) is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. AZ’s give customers the ability to operate production applications and databases that are more highly available, fault-tolerant, and scalable than would be possible from a single data center.

All of the above provide a secure and redundant platform if you choose to utilize it.

I will refer to the AWS diagram for shared responsibility as this applies to operational security as well:

As an AWS customer, you are responsible for ensuring your environment utilizes the resilient infrastructure that AWS provides.

AWS provides guidelines and best practices for Disaster Recovery and Systems Redundancy, and I’ll not go on quoting them. One can research these here.

Action Plan:

Here is a portion of a checklist I use to evaluate infrastructure; feel free to use this to evaluate your infrastructure.

AWS Specific:

  • Is the environment Multi-region?
  • Is the environment Multi-AZ?
  • Are Global services used in preference to regional?
  • Are Regional services used in preference to AZ?
  • Are RDS services running Multi-AZ?
  • Are application resources self-healing?
  • Are Backups copied to another region?
  • Is the critical workload designed to be stateless?
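
As one illustration of turning a checklist item into something repeatable, here is a minimal sketch for the "Are RDS services running Multi-AZ?" question. It assumes you have already fetched the RDS DescribeDBInstances result (for example via boto3, shown only in a comment) and simply filters it; the function name and the sample data are mine and purely hypothetical.

```python
from typing import Any


def single_az_instances(response: dict[str, Any]) -> list[str]:
    """Return identifiers of RDS instances that are not running Multi-AZ.

    `response` is expected to have the shape of the RDS DescribeDBInstances
    result: a "DBInstances" list whose entries carry "DBInstanceIdentifier"
    and a boolean "MultiAZ". A missing "MultiAZ" key is treated as False.
    """
    return [
        db["DBInstanceIdentifier"]
        for db in response.get("DBInstances", [])
        if not db.get("MultiAZ", False)
    ]


if __name__ == "__main__":
    # Hypothetical sample data. With boto3 installed and credentials
    # configured, a real response would come from something like:
    #   boto3.client("rds", region_name="us-east-1").describe_db_instances()
    sample = {
        "DBInstances": [
            {"DBInstanceIdentifier": "orders-db", "MultiAZ": True},
            {"DBInstanceIdentifier": "reports-db", "MultiAZ": False},
        ]
    }
    print(single_az_instances(sample))
```

Note that the real API call is paginated and per-region, so a complete audit would need to walk every page in every region you operate in; this sketch only covers the filtering step.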

Non-AWS Specific:

  • Are resources tagged to assist in recovery?
  • Are backups taken regularly?
  • Are backups copied offsite?
  • Is there a disaster recovery plan in place?
  • Is the DR plan/runbook/playbook stored and made available offline?
  • When was the last Disaster Recovery test conducted?
  • Is the environment monitored for operational issues?
  • Is the environment monitored for security issues?
  • Is the environment monitored for performance issues?
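
Two of the questions above, tagging and regular backups, lend themselves to simple automated checks. The following is a rough sketch only: the required tag names and the 24-hour threshold are assumptions of mine, not a standard, and should be replaced with whatever your own recovery plan calls for.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical tag policy; substitute the tags your DR runbook relies on.
REQUIRED_TAGS = frozenset({"owner", "environment", "recovery-tier"})


def missing_tags(tags: dict[str, str],
                 required: frozenset[str] = REQUIRED_TAGS) -> set[str]:
    """Return the required tag keys that are absent from a resource's tags."""
    return set(required) - set(tags)


def backup_is_stale(last_backup: datetime,
                    now: datetime,
                    max_age: timedelta = timedelta(hours=24)) -> bool:
    """Return True if the most recent backup is older than `max_age`.

    Both datetimes should be timezone-aware so the subtraction is valid.
    """
    return now - last_backup > max_age


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical resource: tagged with an owner only, last backed up 3 days ago.
    print(sorted(missing_tags({"owner": "ops"})))
    print(backup_is_stale(now - timedelta(days=3), now))
```

Checks like these are only useful if someone sees the result, so they belong in whatever monitoring or reporting you already run rather than in a script that lives on one laptop.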

While many of the concepts above are AWS-specific, the latter portions will apply to any (cloud or on-premises) infrastructure.

Author’s Comments:

This missive is written due to known threats.

I will not discuss specifics and will NOT amplify the chatter; I leave it to the reader to do the basic research. Suffice it to say, there is a storm warning.