Data protection has evolved greatly since the days of on-prem tape backup. It wasn’t long ago that organizations sent their backup tapes offsite for storage in case of a disaster. And for large organizations, synchronous or asynchronous replication between data centers allowed for failover from the primary site to a secondary site in case of disaster.
With the growth in cloud computing, specifically public cloud providers like AWS and Microsoft Azure, the cloud has become a resource that organizations of any size can leverage. While many think of the cloud for computing either in the form of Infrastructure as a Service (IaaS) or Software as a Service (SaaS), the cloud can also be used for data protection.
From a data protection perspective, the cloud can serve both as a repository for backups and as a resource for disaster recovery. The need for disaster recovery today can be triggered by anything from an extended power outage or a ransomware attack to a failed server, or really any event that requires servers or virtual machines to be recovered in an alternate location.
The Economics of Using the Cloud for Backup and Recovery
Having the cloud resources needed for offsite backup retention isn’t as simple as it may seem. Cloud providers like AWS offer many storage options, such as S3 (Amazon Simple Storage Service), that can be used for backup. The options are tiered based on performance and price. For example, AWS recommends the following for backup and recovery:
- S3 Glacier – for object data
- Amazon EFS – for file data
- Amazon EBS – for block data
In addition to these storage options, AWS Storage Gateway is an offering for sending on-premises backups to AWS. Given the number of possibilities for storing backup data, two considerations should come into play when evaluating cloud options: performance and cost. While the price of AWS offerings like S3 Glacier is attractive, you aren’t going to recover data in seconds or minutes unless it sits on storage that performs as well as or better than the primary storage the system is being recovered from.
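The performance trade-off can be made concrete with a back-of-the-envelope estimate. The sketch below is illustrative only: the function name, the throughput figures, and the retrieval delay are hypothetical assumptions, not measured or published values for any AWS storage class.

```python
def estimate_restore_hours(data_gb, throughput_mb_per_s, retrieval_delay_hours=0.0):
    """Rough restore time: a fixed retrieval delay plus the transfer time.

    All inputs are assumptions supplied by the caller; nothing here is
    looked up from a cloud provider.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    transfer_seconds = (data_gb * 1000) / throughput_mb_per_s  # 1 GB = 1000 MB
    return retrieval_delay_hours + transfer_seconds / 3600


# Hypothetical 2 TB restore from a fast tier vs. an archive tier
# that adds an assumed 4-hour retrieval delay and slower throughput.
fast_tier_hours = estimate_restore_hours(2000, throughput_mb_per_s=500)
archive_tier_hours = estimate_restore_hours(
    2000, throughput_mb_per_s=100, retrieval_delay_hours=4
)
print(f"fast tier:    {fast_tier_hours:.1f} h")
print(f"archive tier: {archive_tier_hours:.1f} h")
```

Plugging in your own data size and the throughput you actually observe gives a quick sanity check against the recovery time your organization needs.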
Cost is a major factor because there are normally multiple copies of backups stored in the cloud to provide multiple recovery points from which to recover. The key variables needed to calculate the cost for storing backups in the cloud and recovering images (systems/VMs) would be the following for AWS:
- S3 Standard Storage (priced per GB)
- S3 Glacier Storage (priced per GB, but data would need to be restored to S3 Standard before it can be recovered from)
- EC2 compute (a combination of vCPUs, memory, and gp2 storage if not already allocated)
- Egress costs for outbound data transfer (priced per GB)
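The variables above can be combined into a simple cost model. The following is a minimal sketch, not a pricing tool: the `Rates` class, the function names, and every rate shown are hypothetical placeholders and are not actual AWS prices. Real figures should come from the provider's pricing pages or calculator.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Placeholder unit prices; replace with your provider's real rates."""
    standard_gb_month: float  # standard object storage, per GB per month
    archive_gb_month: float   # archive storage, per GB per month
    compute_hour: float       # one recovery instance, per hour
    egress_gb: float          # outbound data transfer, per GB


def monthly_storage_cost(rates, standard_gb, archive_gb):
    """Monthly cost of keeping backup copies in both storage tiers."""
    return standard_gb * rates.standard_gb_month + archive_gb * rates.archive_gb_month


def recovery_cost(rates, instance_hours, egress_gb):
    """One-off cost of running recovered systems and moving data out."""
    return instance_hours * rates.compute_hour + egress_gb * rates.egress_gb


# Hypothetical rates and volumes, for illustration only.
rates = Rates(standard_gb_month=0.02, archive_gb_month=0.004,
              compute_hour=0.10, egress_gb=0.09)
storage = monthly_storage_cost(rates, standard_gb=1000, archive_gb=5000)
recovery = recovery_cost(rates, instance_hours=72, egress_gb=500)
print(f"monthly storage: ${storage:.2f}")
print(f"single recovery: ${recovery:.2f}")
```

Even a model this small shows why the total is hard to predict: storage is a recurring cost, while compute and egress only appear when a recovery (live or test) actually happens.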
AWS and other public cloud providers are very transparent with their pricing and offer calculators to determine compute, storage, and networking costs. The issue is how an organization can calculate the cost of backup and recovery when so many variables are involved.
There are personnel costs associated with defining and managing the environment to support the recovery in addition to the actual personnel costs to perform the recovery. All of these can make it challenging for an IT manager who wants a turn-key backup and recovery solution for their organization.
And lastly, there is security. AWS has a shared responsibility model that clearly defines that AWS maintains the security of the cloud, whereas the customer is responsible for security in the cloud. This means that the user is responsible for their data, platforms, identity and access management, operating systems, and more.
The administrative costs associated with security alone, not including Identity and Access Management (IAM) software and firewalls, can be challenging for IT teams who want a cost-effective solution.
Public vs. Private Cloud
Given the many choices of public cloud providers, should organizations look at private cloud-based solutions? It can be argued that, for safety, backups of data that resides in a public cloud should not be stored with the same cloud provider. The argument for private clouds ironically comes down to the same two considerations as the public cloud: performance and cost. However, performance and cost include additional considerations for private clouds.
- What is the availability of the private cloud?
- What types of service levels exist for the time it takes to recover in the cloud?
- What level of performance can be offered for production workloads once they are recovered in the cloud?
- What level of technical support is offered for disaster recoveries in the cloud (live and test)?
- What is the cost of storing backups?
- Are there tiers associated with performance or the amount of storage used?
- What compute costs are associated with recovering workloads in the cloud?
- How long can a recovered system (or systems) run in the cloud?
Included in the category of private cloud is build your own (BYO), which can be daunting not just because of the capital expenditure needed to build out the environment but also because of the liabilities associated with building and maintaining an “always-on” cloud capable of storing backups and recovering entire systems.