Date: 2019-12-20

Cheap storage and cloud computing resources have changed the notion of what is affordable when it comes to keeping data available. While on-premises databases generally had backups and disaster recovery sites limited to a single target, the economies of scale in the cloud have multiplied that number. It is not unusual for cloud-managed databases to keep half a dozen or more copies of the data replicated across data centers and, in some cases, regions, as a standard built-in feature. The same goes for snapshots, which many cloud-managed database services retain for the last 30 to 35 days. So even if a cloud data center goes down, our database should be running again within a few seconds.

But backups are another story. Replication and snapshots will not meet the requirements for long-term data retention. The good news, once again, is that the economics of cloud storage also make backups more affordable. The difference is that, unlike replication and snapshots, backups are generally not included in the core database service and must be purchased separately.

Naturally, there is a huge legacy market of data center backup and recovery solutions that could be extended to cover data stored in the cloud, from vendors such as Commvault, Veritas, Cohesity, Veeam and others. There are also many individual and department-level tools aimed at backing up local data to the cloud.

But enterprise backup and recovery delivered as a SaaS service is still emerging. AWS and Microsoft Azure offer their own automated backup services. There are also a handful of emerging third-party SaaS services that use the cloud not only as a storage target, but also as a control plane.

Druva was one of the first out of the gate, providing a SaaS service that takes advantage of EC2 auto-scaling, with DynamoDB used to store metadata (which enables deduplication) and snapshots stored in S3 for the first 90 days, then automatically tiered to Glacier as they age. For auditing access, Druva leverages AWS CloudWatch to monitor activity and CloudTrail to drill down into the actual access event logs. There is also Rubrik, which offers backup, instant recovery, archiving, search, analytics, compliance and copy data management services.
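The S3-then-Glacier tiering described above maps naturally onto an S3 lifecycle rule. The following is a minimal sketch, not any vendor's actual configuration: the function name, rule ID and `snapshots/` prefix are hypothetical, and the dictionary simply follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument. No AWS call is made here.

```python
from typing import Any, Dict


def build_tiering_config(prefix: str, days_in_s3: int = 90) -> Dict[str, Any]:
    """Build a lifecycle configuration that moves objects to Glacier.

    Objects under `prefix` stay in standard S3 storage for `days_in_s3`
    days and are then transitioned to the GLACIER storage class.
    The rule ID and prefix are illustrative assumptions.
    """
    if days_in_s3 < 0:
        raise ValueError("days_in_s3 must be non-negative")
    return {
        "Rules": [
            {
                "ID": "tier-old-snapshots-to-glacier",  # hypothetical name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_in_s3, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


config = build_tiering_config("snapshots/")
```

With credentials and a real bucket, a dictionary like `config` could be passed as `LifecycleConfiguration` to an S3 client; the bucket name would be another assumption to fill in.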

Now Clumio, a company that emerged from stealth a few months ago, has joined the fray. Like Druva, Clumio is built natively on AWS, where its control plane is implemented, and takes advantage of recent innovations in containers and microservices. Like Druva, it uses DynamoDB for fast metadata lookup and deduplication; it also uses an RDS PostgreSQL database to store backup state and configuration. To handle the data pipeline for ingestion and deduplication, Clumio uses Lambda functions for simple, short-lived processes that finish in minutes, and containers for larger, more complex jobs that span hours. It minimizes overhead by dividing backup tasks into small microservices that run in the background.
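The split between short-lived functions and long-running containers can be pictured as a simple routing decision. The sketch below is purely illustrative and is not Clumio's scheduler: the function name and the safety margin are assumptions, and the only external fact it relies on is that an AWS Lambda invocation can run for at most 15 minutes.

```python
LAMBDA_MAX_SECONDS = 15 * 60  # AWS Lambda's maximum configurable timeout


def choose_runtime(estimated_seconds: float, safety_margin: float = 0.5) -> str:
    """Pick where a backup task should run, based on its estimated duration.

    Tasks expected to fit comfortably inside the Lambda time limit go to
    "lambda"; everything else goes to "container". The safety margin
    (an assumption) leaves headroom for estimates that turn out too low.
    """
    if estimated_seconds < 0:
        raise ValueError("estimated_seconds must be non-negative")
    budget = LAMBDA_MAX_SECONDS * safety_margin
    return "lambda" if estimated_seconds <= budget else "container"


short_job = choose_runtime(120)       # a two-minute task
long_job = choose_runtime(3 * 3600)   # a three-hour task
```

In practice the duration estimate would have to come from somewhere, such as the size of the data to ingest; that part is left out here.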

Unlike Druva, Clumio does not rely on snapshots to index or retrieve data. Instead, it divides the data into small 16-KB chunks that can be packed into cloud storage. The benefit is that, instead of the time-based indexes or text searches that snapshots would provide, Clumio can index at the file level.
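To make the chunking idea concrete, here is a minimal sketch of fixed-size chunking with content-addressed deduplication. It is an illustration under stated assumptions, not Clumio's implementation: the in-memory `ChunkStore` class, the use of SHA-256 as the chunk key, and the interpretation of 16 KB as 16 × 1024 bytes are all choices made for the example.

```python
import hashlib
from typing import Dict, Iterator, List

CHUNK_SIZE = 16 * 1024  # 16 KB, read here as 16 * 1024 bytes (assumption)


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive fixed-size chunks; the last one may be shorter."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


class ChunkStore:
    """Hypothetical in-memory store that keeps each distinct chunk once."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}  # SHA-256 hex digest -> chunk

    def put(self, data: bytes) -> List[str]:
        """Store a file's chunks and return its recipe (ordered digests)."""
        recipe = []
        for chunk in split_into_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # skip chunks already held
            recipe.append(digest)
        return recipe

    def get(self, recipe: List[str]) -> bytes:
        """Reassemble a file from its recipe."""
        return b"".join(self.chunks[digest] for digest in recipe)


store = ChunkStore()
file_bytes = b"A" * (3 * CHUNK_SIZE) + b"B" * 100
recipe = store.put(file_bytes)
```

Because each file is described by its own recipe of chunk digests, an index can be kept per file rather than per snapshot, while identical chunks (here, the three blocks of `A` bytes) are stored only once.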

For now, Clumio's backup service requires that data be virtualized in storage groups, which could reside in services such as AWS Elastic Block Store (EBS) or VMware. Therefore, it does not currently compete with or match the database-specific backup services that are included or offered as options with DBaaS services such as Amazon RDS, Microsoft Azure SQL Database or Google Cloud SQL. Clumio has not ruled out developing a database backup service in the future.

On the horizon, Clumio could also develop services that scan for PII data or for anomalies, such as sudden changes that may indicate malware. It could also extend the footprint of its solution on premises through hybrid cloud offerings such as AWS Outposts. Leveraging native AWS services, from databases to cloud monitoring tools and serverless functions or container orchestration, makes this extensibility possible.

Like its rivals, Clumio's control plane runs in a single cloud. But we expect that Clumio will soon start orchestrating backup and data storage in other clouds. Admittedly, this would incur the overhead of inter-process communication between the control plane in AWS and the data flowing to the other clouds. The service would then become more multi-cloud, but it would not have the efficiency of a local control plane.

For now, the tradeoff for making cloud backup SaaS services cloud-native has been the loss of cloud independence; a truly multi-cloud or cloud-independent solution has yet to emerge.