Storage technology explained: Flash vs HDD

The past 12 months saw flash storage nudge into areas from which it had hitherto been absent. In particular, this was because denser – and therefore cheaper per-gigabyte (GB) – quad-level cell (QLC) flash storage became available in array markets and use cases that were once considered nearline.

Alongside this, we saw the price-per-GB of flash drop towards the level of spinning disk hard disk drives (HDDs). Meanwhile, the keenest of flash storage advocates predicted the demise of the hard drive and the imminent victory of the all-flash datacentre.

In this article, we define enterprise flash storage, look into its QLC and triple-level cell (TLC) variants and the benefits of non-volatile memory express (NVMe) flash, and examine the pros and cons of flash versus HDD in terms of cost and performance, as well as flash in the cloud and the likelihood (or otherwise) of the all-flash datacentre.

What is enterprise flash storage?

Enterprise flash storage refers to systems that comprise multiple flash drives housed in datacentre rack-mounted array form factor products.

In enterprise flash storage arrays, the capacity of many drives is aggregated, with access to storage media governed by controller hardware.

The controller is the compute that powers the intelligence needed to handle input/output (I/O) from hosts to the storage and to decide how data is allocated to media. In flash arrays, it also carries out maintenance tasks such as wear levelling and garbage collection.

Enterprise flash storage array capacities run from tens of terabytes (TB) to many petabytes (PB). As with HDD-based arrays, access to storage can be block (for performance-hungry database use cases, for example), file (for general use and unstructured data) or object (for unstructured data also).

What is QLC flash storage?

QLC is the latest generation of flash storage media. QLC stands for quad-level cell. That means that every cell in the flash chip can store four bits of data using 16 states.

That means it can store more data in the same space than TLC flash, which stores three bits per cell and is also widely available. Previously widely available were single-level cell (SLC) flash and multi-level cell (MLC, meaning two bits per cell), but these have been largely superseded now.
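The link between bits per cell and the number of states a cell must distinguish can be shown with a short calculation. The following is a minimal sketch in Python; the names `BITS_PER_CELL` and `states_per_cell` are illustrative and do not come from any supplier tooling.

```python
# Bits stored per cell for each flash generation named in this article.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def states_per_cell(bits: int) -> int:
    """A cell that holds n bits must distinguish 2**n charge states."""
    return 2 ** bits


for name, bits in BITS_PER_CELL.items():
    print(f"{name}: {bits} bit(s) per cell, {states_per_cell(bits)} states")
```

Each extra bit doubles the number of states the cell has to tell apart, which is why QLC needs 16 states to store four bits.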

At the start of 2024, most enterprise storage arrays are built with TLC drives for general-purpose and mission-critical use cases. But QLC has edged into the mainstream and gained traction for unstructured data workloads, in particular with key enterprise storage array makers adding QLC-based products in the past year or so.

As manufacturers increase the number of possible states per cell, storage density increases and the cost of storage per GB decreases. But, as storage density increases in terms of cell capacity, issues can arise that can limit the endurance of flash media.

What is NVMe flash?

Non-volatile memory express (NVMe) is a protocol developed especially for use with flash storage. Prior to NVMe, flash drives used transport protocols that originated during the HDD era, namely Serial Advanced Technology Attachment (SATA) and Serial-Attached SCSI (SAS). In fact, these are still in use and arrays that use drives with such connectivity (2.5in and 3.5in form factor) are sold by the big storage suppliers.

But NVMe is at the forefront now for flash drive performance. NVMe’s key innovation was to optimise queues and buffers for use with flash, which improved performance many times over.

As a follow-on, suppliers then developed ways of extending NVMe connectivity over longer distances across the datacentre. Such NVMe-over-fabrics technologies include the ability to carry NVMe via Ethernet, InfiniBand, TCP, RDMA (ie, memory-to-memory connectivity) and more.

What is HDD?

Hard disk drives (HDDs) that rely on magnetic read/write heads and mechanically spinning disks have been around for decades, with flash a competitor that has emerged in the past 10 years or so.

As with flash, HDDs can be aggregated into datacentre rack-mounted array products and the capacity of multiple drives pooled for enterprise users. In fact, HDD-based arrays long preceded enterprise flash arrays and are still widely used.  

What’s the difference in performance between flash and HDD?

When we look at flash versus disk, the key thing that stands out is that flash is fast – many times faster than spinning disk HDD.

Flash drives offer lower latency, with access times down to low milliseconds, or even microseconds, compared with the multiple milliseconds of spinning disk, particularly for reads. That means enterprise flash can also offer vastly more input/output operations per second (IOPS) when aggregated into a storage array.
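To see why lower latency translates into more IOPS, consider a drive that services one request at a time: its IOPS ceiling is simply the reciprocal of its access latency. The sketch below uses assumed, round-number latencies chosen purely for illustration; they are not measurements and do not come from any supplier's specification.

```python
def iops_at_queue_depth_one(latency_seconds: float) -> float:
    """With one outstanding request at a time, IOPS is the reciprocal of latency."""
    return 1.0 / latency_seconds


# Illustrative latencies only (assumptions, not figures from this article).
ASSUMED_HDD_LATENCY = 5e-3      # 5 milliseconds
ASSUMED_FLASH_LATENCY = 100e-6  # 100 microseconds

hdd_iops = iops_at_queue_depth_one(ASSUMED_HDD_LATENCY)
flash_iops = iops_at_queue_depth_one(ASSUMED_FLASH_LATENCY)

print(f"HDD: about {hdd_iops:.0f} IOPS per drive")
print(f"Flash: about {flash_iops:.0f} IOPS per drive")
```

Real drives handle many requests in parallel, so actual figures will differ, but the reciprocal relationship shows why a move from milliseconds to microseconds has such a large effect.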

In throughput terms, flash offers gigabit-per-second (Gbps) rates four or five times quicker than HDD.

Such rapidity has been the key draw for enterprise flash storage and is a result of the lack of moving parts. With spinning platters, HDD is limited by physics in ways that solid-state storage is not.

In terms of capacity, HDDs are available in units of up to around 22TB. Some flash drives that run to 60-plus terabytes have been marketed, but flash drives generally come in smaller sizes, in part because of cost.

What’s the cost difference between flash and HDD?

In terms of per-GB cost at drive level, flash costs more than spinning disk. But the gap is narrowing.

While HDD prices per GB of capacity have remained largely static, flash drive costs have come down.

At the end of 2023, that differential, on average, was a few cents per GB. On the October 2023 averages, that makes flash roughly 50% more costly per GB than SAS spinning disk and more than twice the price of SATA.

In October 2023, flash cost averaged $0.075/GB while HDD cost averaged $0.05/GB for SAS and $0.035/GB for SATA drives.
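The size of the flash premium follows directly from these averages. The sketch below takes the October 2023 prices quoted above at face value; the function name `premium_percent` is illustrative.

```python
# Average per-GB drive prices quoted above for October 2023 (US dollars).
FLASH_PER_GB = 0.075
HDD_PER_GB = {"SAS": 0.05, "SATA": 0.035}


def premium_percent(flash: float, hdd: float) -> float:
    """How much more flash costs than HDD, as a percentage of the HDD price."""
    return (flash - hdd) / hdd * 100


for interface, price in HDD_PER_GB.items():
    gap = FLASH_PER_GB - price
    premium = premium_percent(FLASH_PER_GB, price)
    print(f"{interface}: gap ${gap:.3f}/GB, flash premium {premium:.0f}%")
```

On these figures, the gap is 2.5 cents per GB against SAS drives and 4 cents against SATA, which works out to a premium of 50% and roughly 114% respectively.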

However, some (flash-oriented) suppliers argue we cannot judge storage costs at drive level alone. 

Will flash kill HDD? How much longer for HDD?

Pure Storage in particular has declared that HDDs will be dead by 2028, with its flash products the chief agent in the cull, owing to its ability to aggregate much more flash capacity on its proprietary modules than is found on commodity flash drives.

With flash module sizes of up to 300TB by 2026 promised by Pure, it contends that spinning disk will be commercially unviable.

Meanwhile, companies such as Panasas, which specialises in storage for unstructured data, point to hyperscaler datacentres’ overwhelming use of spinning disk in ratios up to 90/10 against flash. Panasas argues that there’s still a five-times differential between the lowest-cost flash and HDD, and that for most, something like the hyperscaler solution is optimal.  

When can you use flash and HDD in the cloud?

Enterprise users can also specify flash storage and spinning disk in the cloud. It is more likely in most cases that cloud storage will be specified by performance and cost criteria, in which case the customer may never know what media underlies it.

But it is possible also to specify flash storage in the cloud and the three largest hyperscalers – Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP) – have solid-state storage options that mix cost, capacity and performance. 

The hyperscalers all offer flash storage to support compute with service levels based on capacity and IOPS per volume that range from general-purpose to premium levels aimed at specific workloads (eg, SQL, Oracle, SAP Hana) and environments (eg, Windows, Lustre, macOS).

There are also options aimed at flash for file storage and flash storage from named suppliers, such as Azure NetApp Files.

What is the all-flash datacentre?

For about a decade, the idea of the all-flash datacentre has been discussed. The all-flash datacentre replaces HDD and other media such as tape with flash storage.

Driving it is the continued decrease in the cost of flash storage – as with QLC flash – but also the advantages of flash in terms of rapid access. The latter becomes more relevant as customers want to run analytics on bigger subsets of their data.

So, for example, where backups may previously have been held on nearline media such as slower HDDs, advocates of flash for such use cases point to the ability to run artificial intelligence (AI) on large customer datasets and to gain value therefrom.

Also, with backups as an example, the idea of being able to recover quickly from flash media in case of a ransomware attack is another use case touted by all-flash datacentre boosters. 

When will the all-flash datacentre arrive?

While enthusiastic suppliers of flash storage such as Pure talk down the obstacles to the all-flash datacentre, analysts point to the spread of (especially QLC) flash into secondary workloads but not necessarily all use cases, with spinning disk likely to retain its usefulness for some time for some datasets.

Meanwhile, HDD suppliers such as Toshiba say around 85% of all data is still on spinning disk. That, Toshiba says, is not likely to change rapidly, not least because the flash capacity to replace it doesn’t exist.

