Ransomware has rapidly become the single largest cyber threat we face today and, if the first half of 2021 was any indication, things are only going to get worse. Colonial Pipeline, Kia Motors, JBS Foods, Kaseya and CNA Financial have been the targets of some of the more notable, high-profile attacks this year. In the case of Colonial Pipeline, the attack impacted over a dozen U.S. states and cost the company $5 million in ransom. Colonial Pipeline was able to recover ~$2.3 million of the ransom, but that is often not the case. CNA Financial was not as fortunate and needed to pay an estimated $40 million to retrieve the encryption keys for their data. And the ransom demanded in the Kaseya attack, which impacted an estimated 800 to 1,500 businesses, is said to be in the range of $70 million, which would make it the largest ransom ever paid (should Kaseya decide to pay).
Mitigating this risk is no easy task, but the good news is that organizations can leverage the security best practices we have been touting for years. I am referring to Zero Trust architectures, defense-in-depth strategies, prioritizing IT and, very importantly, end user training. I can’t stress the importance of end user training enough. End users are the primary attack vector for ransomware and a single infected user can do a lot of damage. But even with a comprehensive security strategy in place, the lure of a big ransom payoff often motivates attackers enough to outwit even the most prepared organizations.
There are a wide variety of potential solutions and approaches to reduce the risk and/or impact of ransomware, and in many cases it remains true that “no one size fits all.” Ransomware often requires a layered approach to enable the protection your business requires. The good news is that there are solutions which can help not only detect ransomware, but also enable a quick recovery from an attack.
The Data Equation
For this blog, I thought I would focus on the data side of the equation given this is often overlooked and/or overshadowed by solutions offering early detection & prevention.
As a starting point, see if you can confidently answer these questions:
- Do your storage and data protection solutions support granular role-based access controls (RBAC), multi-factor authentication (MFA), immutability, encryption, air gapping, etc. and are ALL of these features enabled?
- Could you rapidly identify a ransomware event at the data layer?
- Could you rapidly recover from a ransomware event at a granular level?
- In the event of a ransomware infection, how long would it take to recover?
- How frequently are you testing this?
There are many sources of information on ransomware but one of the best we’ve found is the Cybersecurity & Infrastructure Security Agency or CISA. CISA provides comprehensive information on the evolving threat of ransomware including regular updates on recent attacks, detailed reports, ransomware statistics, and guidance on how organizations can implement a Cybersecurity Framework and ransomware response plan to minimize exposure to ransomware.
Real-world Lessons Learned
At Daymark, we work with many primary storage, archiving and data protection solutions, most of which can aid in protecting your data and enabling one form of recovery or another. Thinking back to the first time I was asked to assist in a ransomware recovery effort for one of our customers, what sticks out most is that the customer had spent 8 days attempting to recover from backups, only to fail on each attempt! Within an hour of speaking to them, we had identified that this customer had storage snapshots dating back 3 months, which enabled us to recover the customer’s data in under 5 minutes. The downside: the customer lost the last few months’ worth of updates because they had been unknowingly infected for over 75 days, further highlighting the importance of early detection.
Having multiple means of recovery is essential when trying to recover from a ransomware attack. It is not uncommon to see organizations leverage multiple technologies in a recovery effort depending on the solutions they have available to them and their recovery requirements.
With this in mind, we would recommend that you make sure that your defense-in-depth strategy extends to the data layer. Storage snapshots have made significant advancements over the years and several providers can deliver this capability without any performance impact. Unfortunately, many organizations opt not to use snapshots and instead, rely solely on their backup solution for recovery. In fact, one of the providers we work with recently shared that less than 20% of their customers leverage storage snapshots. This statistic was both shocking and incredibly worrisome given the value of snapshots in a recovery effort. If you are in the 80% not taking advantage of snapshots, now is the time to revisit your strategy and start leveraging snapshots even if for only a short period of time (e.g. 7 to 14 days) to minimize cost. Generally, we recommend you plan for a 20% capacity hit, but mileage will vary, and you should plan on testing up front to tune your snapshot retention accordingly.
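As a rough illustration of the capacity planning described above, the sketch below estimates snapshot overhead from a daily change rate and a retention window. The function names, the 1.5% daily change rate, and the simplified model (each day's changed blocks are kept until the snapshot expires, with no deduplication or compression) are all assumptions for illustration, not vendor guidance. Real arrays will behave differently, which is why testing up front matters.

```python
def snapshot_overhead_fraction(daily_change_rate: float, retention_days: int) -> float:
    """Estimate snapshot space as a fraction of the protected data set.

    Simplified, hypothetical model: a fixed fraction of the data changes each
    day, each day's changed blocks are retained until the snapshot expires,
    and no data reduction is applied.
    """
    if not 0 <= daily_change_rate <= 1:
        raise ValueError("daily_change_rate must be between 0 and 1")
    if retention_days < 0:
        raise ValueError("retention_days must not be negative")
    return daily_change_rate * retention_days


def snapshot_capacity_tb(protected_tb: float, daily_change_rate: float,
                         retention_days: int) -> float:
    """Extra capacity (in TB) to reserve for snapshots under the same model."""
    return protected_tb * snapshot_overhead_fraction(daily_change_rate, retention_days)


# Assumed inputs: 100 TB protected, 1.5% of it changing per day.
for days in (7, 14):
    overhead = snapshot_overhead_fraction(0.015, days)
    extra_tb = snapshot_capacity_tb(100, 0.015, days)
    print(f"{days}-day retention: ~{overhead:.1%} overhead, ~{extra_tb:.1f} TB extra")
```

Under these assumed numbers, 14 days of retention lands at about 21% overhead, in line with the 20% planning figure, while 7 days needs roughly half that. Measuring your own change rate is the only way to replace the assumption with a real number.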
Additionally, we are still hearing from some customers that they are not utilizing storage snapshots due to the negative impact on performance. Unfortunately, this concern is common with legacy storage solutions. If your storage solution is unable to provide a snapshot with zero performance impact, we would strongly suggest moving to a modern storage solution sooner rather than later. I cannot stress this enough -- if your storage providers are not prioritizing data recovery and security, they are putting your organization at extreme risk and should be replaced ASAP.
Some providers, including Pure Storage, Infinidat, Dell’s PowerMax, and NetApp, offer immutable storage snapshots, ensuring that your data can’t be modified or deleted before a set retention period expires.
To illustrate the value of storage snapshots further, Pure Storage supports portability of SafeMode snapshots, enabling customers, for example, to create short-term SafeMode snapshots on their flagship FlashArray//X (e.g. 3 to 5 day retention). These immutable snapshots can then be offloaded to the FlashArray//C for extended retention (e.g. 30 days) and/or FlashBlade for rapid recovery / extended retention (30 days to 100+ days) and/or Cloud Block Store (Pure’s OS running natively in Azure or Amazon). Furthermore, these storage snapshots support protection groups to enable flexible replication policies as well as application consistency for several business-critical applications (e.g. MSSQL & Oracle). Why is this important, you might ask? Simple: this is one of the fastest ways to not only protect your critical workloads, but also to recover them (GB/s, not MB/s) while enabling immutability so they can’t be modified or destroyed.
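To put "GB/s, not MB/s" in perspective, here is a small back-of-the-envelope calculation. The 50 TB data set and the two throughput figures are hypothetical values chosen for illustration, not measurements from any particular product, and the helper name is my own.

```python
def restore_hours(data_tb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to move data_tb terabytes at a sustained throughput.

    Uses decimal units (1 TB = 1,000,000 MB) and ignores real-world overhead
    such as metadata operations, network contention, and verification.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput_mb_per_s must be positive")
    data_mb = data_tb * 1_000_000
    return data_mb / throughput_mb_per_s / 3600


# Hypothetical comparison: the same 50 TB restored at 100 MB/s and at 5 GB/s.
for label, mb_per_s in (("100 MB/s", 100), ("5 GB/s", 5000)):
    print(f"50 TB at {label}: {restore_hours(50, mb_per_s):.1f} hours")
```

With these assumed numbers, the same restore takes roughly 139 hours (almost six days) at 100 MB/s and under three hours at 5 GB/s, and that is before accounting for any repeat restores.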
Immutable snapshots are an invaluable resource for recovering from ransomware, but some up-front planning is needed to avoid unexpected snapshot growth, which can cause downstream issues. There are also several options available to offload these snapshots to a secondary target to increase snapshot granularity and retention while reducing cost and minimizing capacity impact to primary storage.
The Last Line of Defense
Next, we will discuss your last line of defense: data protection solutions. Is your backup solution immutable and have you verified and tested this to ensure you understand what level of protection you can expect? Some vendors claim immutability but do not prevent an administrator from expiring data prior to the schedule, or they will allow for a device reset essentially wiping out all data (this includes primary storage). Immutability implies data can’t be modified, but this can be achieved with subpar security controls which leave data exposed to deletion, yet technically check the immutability box. It’s important to do your homework on storage and backup solutions to ensure immutability can’t be bypassed by simply updating a system clock or something more destructive like reinitialization of the solution. In other words, it is crucial to trust but verify that the solutions you’ve selected can live up to their promises. If you don’t, the attackers certainly will.
Malicious actors target data protection solutions as a primary attack vector because if you can’t recover, their odds of a payout increase exponentially. As a result, data protection solutions need to offer extensive security controls to ensure your data is available for recovery when needed. Rubrik, for example, supports a broad range of security controls including, but not limited to, RBAC, MFA, Immutable / Append only File System, retention locking, etc. Furthermore, Rubrik’s CLI prevents malicious attempts to reset an appliance or update the system clock should all of the other security measures fail. In other words, data is stored securely and can’t be tampered with ensuring you can recover if/when needed.
The Importance of Early Detection
Ransomware is very noisy, but it can be difficult to find if you are not looking for it, so it is critical to have systems in place for early detection. Many of the tools available today are still in early stages, but there are a few solutions which stand out due to the breadth of visibility they provide and their ability to quickly identify exactly what has been compromised, offering rapid recovery once malicious data has been detected. Rubrik RADAR is a great example of this type of solution.
Leveraging the insights that can be derived from the data within your data protection solution is invaluable. Data protection provides an ideal vantage point due to its wide view of all your important data. Now consider the potential benefit of being able to leverage that data to detect anomalous behavior (such as mass file deletion and encryption of your data) and then be able to restore all impacted files to their last known good state with a few clicks. This is precisely what Rubrik RADAR provides, significantly reducing the damage of ransomware and enabling a rapid and granular recovery to a known good state. Furthermore, Rubrik is working on adding the ability to identify sensitive data across your organization to this workflow, to help determine the impact of such an event. That capability is available today in a second Rubrik tool called SONAR.
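To make the idea of spotting "mass file deletion and encryption" between backups more concrete, here is a minimal sketch of one possible approach. It is not how Rubrik RADAR or any other product works internally; the function names, the thresholds, and the in-memory `path -> bytes` representation of a backup are all assumptions made for illustration. It flags a backup when a large share of files were deleted or modified, and counts how many modified files now look like random data, which is typical of encrypted content.

```python
import math
import os
from collections import Counter
from dataclasses import dataclass
from typing import Mapping


def shannon_entropy(data: bytes) -> float:
    """Bits per byte: close to 8.0 for encrypted or compressed data, lower for text."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


@dataclass
class ChangeReport:
    deleted: int
    modified: int
    high_entropy: int
    suspicious: bool


def compare_backups(previous: Mapping[str, bytes], current: Mapping[str, bytes],
                    churn_threshold: float = 0.3,
                    entropy_threshold: float = 7.5) -> ChangeReport:
    """Compare two backups (path -> file content) and flag unusual churn.

    Both thresholds are hypothetical starting points, not tuned values.
    """
    deleted = [p for p in previous if p not in current]
    modified = [p for p in previous if p in current and previous[p] != current[p]]
    high_entropy = [p for p in modified
                    if shannon_entropy(current[p]) >= entropy_threshold]
    churn = (len(deleted) + len(modified)) / max(len(previous), 1)
    return ChangeReport(len(deleted), len(modified), len(high_entropy),
                        suspicious=churn >= churn_threshold)


# Simulated example: ten text files, then a backup taken after an "attack"
# in which eight files were overwritten with random bytes and one was deleted.
before = {f"doc{i}.txt": b"quarterly report " * 200 for i in range(10)}
after = {name: os.urandom(4096) for name in list(before)[:8]}
after["doc9.txt"] = before["doc9.txt"]  # one file left untouched; doc8.txt is gone

report = compare_backups(before, after)
print(report)
```

A production tool would work from backup metadata rather than file contents held in memory and would baseline normal change rates per workload, but the underlying signal is the same: sudden churn combined with a jump in entropy.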
Another solution which can aid in detection and remediation is Varonis. Varonis leverages advanced machine learning and user behavior analytics to monitor, track and analyze how end users access data. This enables Varonis to identify an infection early which is critical to minimizing the “blast radius.” Furthermore, Varonis can take automated actions such as account lockdown to minimize any spread of malware and mitigate risk of widespread infection. The Varonis solution also provides detailed audit logging, enabling organizations to recover faster by pinpointing infected files.
Rapid & Granular DR
The ability to recover workloads rapidly and in a granular manner is critical to the ransomware recovery workflow. In a traditional disaster, you may be recovering from the loss of a storage array, a temporary power loss or in some cases something more damaging such as a full data center loss. Recovering from this type of event is difficult enough. A ransomware attack may not impact your infrastructure in the same way a data center loss can, but it does present new challenges which complicate and slow down the recovery process. For example, in a ransomware event, the recovery workflow can be extended not only by a challenging restore process, but also by the need to perform an in-depth security assessment on all restored data to ensure it is no longer compromised. In other words, you may need to restore the same data several times in order to find a clean copy. If your restore process takes several hours or days for critical systems, you could be faced with a ransomware recovery that takes several weeks! That’s more than a non-starter: for some businesses, being out for a week or more could be devastating and put the business itself at risk.
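Because every restore-and-scan cycle is expensive, the order in which you test recovery points matters. The sketch below shows one way to reduce the number of cycles: a binary search for the newest clean recovery point. It assumes that once a recovery point is infected, every later point is infected too, and `is_clean` is a hypothetical stand-in for "restore this point and run a security assessment on it." The point names and the infection day are made up for the example.

```python
from typing import Callable, List, Optional, Sequence


def last_clean_point(points: Sequence[str],
                     is_clean: Callable[[str], bool]) -> Optional[str]:
    """Return the newest clean recovery point, or None if all are infected.

    points must be ordered oldest to newest. Assumes infection is monotonic:
    every point taken after the first infected one is also infected.
    """
    lo, hi = 0, len(points) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_clean(points[mid]):
            best = points[mid]   # clean, so look for something newer
            lo = mid + 1
        else:
            hi = mid - 1         # infected, so look further back
    return best


# Hypothetical scenario: 30 daily recovery points, infection began on day 18.
points = [f"day-{d:02d}" for d in range(30)]
checked: List[str] = []


def is_clean(point: str) -> bool:
    """Stand-in for a restore plus security scan; records each attempt."""
    checked.append(point)
    return int(point.split("-")[1]) < 18


result = last_clean_point(points, is_clean)
print(f"Newest clean point: {result}, restore-and-scan cycles: {len(checked)}")
```

In this made-up scenario the search finds day 17 in 5 restore-and-scan cycles, where walking backwards from the newest point one day at a time would take 13. If the infection is not monotonic (for example, only some files are affected at some points), a more careful per-file approach is needed.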
Given the rapid acceleration of ransomware attacks, we feel it is important to reconsider your disaster recovery workflow. The faster you can recover while minimizing data loss by having very granular copies, the lower the impact to the business. We’ve already discussed the value of a solution such as Rubrik RADAR for unstructured data. Another tool we consider to be invaluable in a ransomware event is Zerto. Daymark has been working with Zerto for the better part of a decade to improve disaster recovery capabilities for our customers and the good news is, it keeps getting better.
First, Zerto is a hypervisor-based tool that supports virtually every on-prem hypervisor in addition to hundreds of cloud environments. You can start using Zerto on-prem to protect your production VMs in a DR data center and you can shift that strategy to leverage the cloud as a DR target at any time. Additionally, you can transition between cloud providers just as easily to ensure your data is never held hostage in a veritable “Hotel California.” That is serious investment protection. Furthermore, Zerto provides a comprehensive playbook allowing you to create virtual protection groups, enabling recovery of full application stacks in minutes while maintaining application consistency. Additionally, Zerto leverages journaling, which enables thousands of recovery points over a 30-day period (it’s like a DVR for virtual machine recovery). These are just some of the capabilities that Zerto has to offer which can dramatically streamline recovery from not only a traditional disaster, but also from a ransomware event.
No organization is immune to the threat of ransomware but by being vigilant with security best practices and proactively keeping up with modern solutions designed to mitigate its effect, you can minimize your exposure, reduce the likelihood of a payout, and focus on what matters most -- your business and your customers.
If it’s time to revisit your data protection strategy in light of ransomware threats, let’s talk. Contact us and our team will be happy to help you align your defenses to avoid being in the unenviable position of becoming a ransomware victim.