How Deduplication and Compression Make All-Flash Storage Affordable

Written by Sean Gilbride | Thu, Mar 05, 2015

Jealous of the pure power of all-flash but concerned about cost? The latest generation of all-flash arrays has a couple of tricks up its sleeve that make the difference in usable capacity and cost between flash and spinning disk much less significant: inline deduplication and compression.

Both deduplication and compression have been used in traditional storage arrays; however, they’re usually implemented as back-end processes, meaning these space-saving techniques run on a schedule against data already stored in a virtual disk pool in the array. This is done to minimize the performance impact, as traditional storage array architectures were never designed with inline deduplication or compression in mind. In short, for traditional storage arrays, these features have been “bolted on” via software upgrades to existing code bases.

The best all-flash storage arrays, on the other hand, are being built from the ground up to allow for inline deduplication and compression without compromising performance and, in some ways, actually enhancing performance.

“All-flash storage arrays are built from the ground up to allow for inline deduplication and compression without compromising performance.”

What is Inline Deduplication?

Inline deduplication is a process through which a flash array’s controller analyzes each write request (using a block size that is usually 4 kilobytes), checks whether any of its existing data matches the incoming block and, if so, merely creates a pointer to the existing copy instead of writing the same data again. The benefits are clear: inline deduplication saves storage space by not writing multiple copies of the same data, extends the life of the underlying flash by avoiding unnecessary writes (writes = wear) and can actually increase performance by allowing for quicker write acknowledgements to the connected systems.
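To make the mechanism concrete, here is a minimal sketch in Python of the write path described above. It is a toy in-memory model, not any vendor's implementation: the `DedupStore` class, its field names and the choice of SHA-256 as the block fingerprint are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # 4 KB blocks, the typical size mentioned above


class DedupStore:
    """Toy in-memory model of inline deduplication (illustrative only)."""

    def __init__(self):
        self.blocks = {}          # fingerprint -> block data ("physical" storage)
        self.volume_map = {}      # (volume, block address) -> fingerprint (the pointer)
        self.physical_writes = 0  # how many blocks were actually written

    def write(self, volume, address, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        # Assumption: a SHA-256 digest stands in for the array's fingerprint.
        fingerprint = hashlib.sha256(data).digest()
        if fingerprint not in self.blocks:
            # First time this content is seen: store it once.
            self.blocks[fingerprint] = data
            self.physical_writes += 1
        # New or duplicate, the logical address just points at the content.
        self.volume_map[(volume, address)] = fingerprint

    def read(self, volume, address):
        return self.blocks[self.volume_map[(volume, address)]]


store = DedupStore()
block = b"A" * BLOCK_SIZE
store.write("lun0", 0, block)
store.write("lun1", 7, block)   # same content, different LUN and address
print(store.physical_writes)    # 1
```

The sketch leaves out things a real array has to handle, such as reference counting and reclaiming blocks that are no longer pointed to.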

So, for example, if you need to run 1,000 virtual machines and they’re all running the same operating system, each instance of that operating system will be made up of largely identical blocks. With inline deduplication, each of those shared blocks need only be written once, cutting 1,000 write operations per block down to only 1.
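The virtual machine example can be checked with a few lines of Python. This is a sketch under stated assumptions: the operating system image is a made-up stand-in of only 8 distinct 4 KB blocks, and the function name `count_writes` is hypothetical.

```python
import hashlib


def count_writes(vm_images):
    """Return (logical_writes, physical_writes) for a list of VM images.

    Each image is a list of byte-string blocks. A block is only counted
    as a physical write the first time its content is seen.
    """
    seen = set()
    logical = 0
    for image in vm_images:
        for block in image:
            logical += 1
            seen.add(hashlib.sha256(block).digest())
    return logical, len(seen)


# Hypothetical OS image: 8 distinct 4 KB blocks (real images are far larger).
os_image = [bytes([i]) * 4096 for i in range(8)]
vms = [os_image for _ in range(1000)]  # 1,000 VMs running the same OS

logical, physical = count_writes(vms)
print(logical, physical)  # 8000 8
```

Each of the 8 blocks is requested 1,000 times but stored once, which is the 1,000-to-1 reduction described above.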

Also, the latest generation of all-flash storage arrays performs always-on inline deduplication globally, across the entire storage array and all LUNs. “Always-on inline deduplication” is an important distinction: while many vendors now claim to support inline deduplication, not all of them can guarantee that it will never need to pause for long periods to allow other storage-level services to function. We will be reviewing this further in the next blog article in this series, titled “Features to look for in an All-Flash Solution.”

Leveling the Playing Field

When employed in an all-flash storage array, inline deduplication reduces the overall need for capacity. Depending on the application, deduplication ratios of up to 10:1 are achievable. So, with deduplication enabled, a 7 terabyte array can hold as much data as a 50-70 terabyte array. You can see how this levels the capacity playing field between all-flash and traditional storage arrays.
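The arithmetic behind that comparison is simple enough to write down. In the sketch below, the function name `effective_capacity_tb` is hypothetical, and the ratios of 7 and 10 are inferred from the 50-70 terabyte range quoted above rather than taken from a measurement.

```python
def effective_capacity_tb(raw_tb, reduction_ratio):
    """Logical data (in TB) that fits on raw_tb of flash at a given
    data reduction ratio (for example, 10 means 10:1)."""
    if raw_tb < 0 or reduction_ratio <= 0:
        raise ValueError("raw_tb must be >= 0 and reduction_ratio > 0")
    return raw_tb * reduction_ratio


for ratio in (7, 10):
    print(f"{ratio}:1 -> {effective_capacity_tb(7, ratio)} TB")
# 7:1 -> 49 TB
# 10:1 -> 70 TB
```

The actual ratio depends heavily on the workload, so these figures are best treated as an upper range rather than a guarantee.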

“A 7TB array can hold as much data as a 50-70TB array.”

What is Compression?

Of course, not all workloads benefit equally from inline deduplication, and in some cases (e.g., databases) the benefit is minimal. Some unique data is unavoidable in any workload. In the latest generation of all-flash arrays, blocks that can’t be deduplicated are instead compressed inline using a special algorithm.

The benefits of compression are similar to the benefits of deduplication, except that it can be applied to unique blocks of data. Compression can also improve overall performance, as it allows flash controllers to further optimize the size of the data chunks that are written out to the flash disks.
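The two techniques can be combined in one write path: deduplicate first, then compress whatever turns out to be unique. The Python sketch below illustrates the idea. The article does not name the algorithm the arrays use, so `zlib` is only a stand-in, and the `store_block` and `load_block` helpers are hypothetical.

```python
import hashlib
import zlib


def store_block(data, index):
    """Deduplicate, then compress. Returns the number of bytes newly stored.

    index maps fingerprint -> compressed block and plays the role of the
    physical storage in this toy model.
    """
    fingerprint = hashlib.sha256(data).digest()
    if fingerprint in index:
        return 0  # duplicate: nothing new is written
    # Unique block: compress it before it is stored (zlib is an assumption).
    compressed = zlib.compress(data)
    index[fingerprint] = compressed
    return len(compressed)


def load_block(data_fingerprint, index):
    """Decompress and return the original block."""
    return zlib.decompress(index[data_fingerprint])


index = {}
# Two unique but internally repetitive 4 KB blocks, loosely database-like.
block_a = (b"customer=0001;status=active;" * 200)[:4096]
block_b = (b"customer=0002;status=closed;" * 200)[:4096]

written = store_block(block_a, index) + store_block(block_b, index)
print(f"logical bytes: {len(block_a) + len(block_b)}, stored bytes: {written}")
print(store_block(block_a, index))  # 0, because this block is already stored
```

Neither block deduplicates against the other, yet both shrink considerably because each is internally repetitive. Data that is already compressed or encrypted would see little benefit.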

The Takeaway: Always-On Inline Deduplication and Compression Make All-Flash Storage Feasible

All of this is to say that, by leveraging global inline deduplication and compression, all-flash storage arrays can have the same (or even lower) total cost of ownership (TCO) as traditional storage arrays. This is, of course, somewhat dependent on the workload and the environment, but the number of scenarios in which all-flash is a viable, economical solution is growing rapidly.
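One rough way to reason about the capacity part of that comparison is cost per effective terabyte. The sketch below is deliberately simplified: it looks only at purchase price per raw terabyte, assumes the traditional array gets no data reduction, and uses arbitrary placeholder numbers rather than real prices. The function names are hypothetical, and a full TCO comparison would include other costs as well.

```python
def cost_per_effective_tb(price_per_raw_tb, reduction_ratio):
    """Price of one TB of stored data after data reduction."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be > 0")
    return price_per_raw_tb / reduction_ratio


def break_even_ratio(flash_price_per_raw_tb, disk_price_per_raw_tb):
    """Data reduction ratio at which flash matches disk on cost per
    stored TB, assuming the disk array applies no data reduction."""
    if disk_price_per_raw_tb <= 0:
        raise ValueError("disk_price_per_raw_tb must be > 0")
    return flash_price_per_raw_tb / disk_price_per_raw_tb


# Placeholder units, not real prices: flash costs 10, disk costs 2 per raw TB.
print(break_even_ratio(10, 2))        # 5.0
print(cost_per_effective_tb(10, 10))  # 1.0
```

With these placeholder inputs, flash reaches parity at a 5:1 reduction ratio and is cheaper per stored terabyte at 10:1. The real break-even point depends entirely on actual prices and the workload's achievable ratio.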

“All-flash storage arrays can have the same (or even lower) TCO as traditional arrays.”

Is all-flash the right solution for your business’s workload and IT environment? Learn more in our new whitepaper: "Making All-Flash Storage A Reality".

About the Author

Sean Gilbride - Director of Professional Services

Sean Gilbride is Daymark’s Director of Professional Services. With more than 18 years of experience architecting and deploying IT infrastructures, Sean manages the day-to-day operations of Daymark’s professional services team, which is focused on delivering the best virtualization, storage and cloud computing solutions for its clients. Prior to joining Daymark in 2008, Sean held a variety of IT management and senior engineering positions at Wellington Management, Access International and Vicor.

Sean holds several technical certifications from NetApp, EMC and Symantec and is an avid outdoorsman who enjoys fishing, skiing and golf.
