12 years of HDD analysis brings insight to the bathtub curve’s reliability

Backblaze is a backup and cloud storage company that has been tracking the annual failure rate (AFR) of the hard drives in its data center since 2013. That tracking has produced a large body of data, and the company now says the data shows that hard drives “last longer” and fail less often.

This conclusion comes from a blog post published this week by Stephanie Doyle, writer and blog specialist at Backblaze, and Pat Patterson, the company's chief technical evangelist. The authors compared the AFR of approximately 317,230 drives in Backblaze's data center with the AFRs the company recorded when it examined 21,195 drives in 2013 and 206,928 drives in 2021. Doyle and Patterson said they found a significant deviation, compared with the company's two previous analyses, in both the age at which drive failures peaked and the highest AFR reached.

This year, the failure rate of the drives examined peaked at 4.25 percent at 10 years and three months of age. That compares with a peak of 13.73 percent at three years and three months in 2013 and a peak of 14.24 percent at seven years and nine months in 2021, Doyle and Patterson wrote.

“Not only is this a significant increase in drive life, but it is also the first time we have seen drive failure rates peak at the very end of the drive curve. And that is about a third of each of the other failure peaks,” Doyle and Patterson wrote.
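The “about a third” comparison can be checked against the peak figures quoted above. The following is a minimal sketch using only the three percentages reported in this article; the dictionary keys and the function name are illustrative labels, not anything taken from Backblaze's own tooling.

```python
# Peak AFR values (in percent) as quoted in this article.
# The "latest" key is an illustrative label for this year's analysis.
peak_afr = {"2013": 13.73, "2021": 14.24, "latest": 4.25}


def peak_ratio(current, earlier):
    """Return the current peak AFR as a fraction of an earlier peak AFR."""
    return current / earlier


# Compare the latest peak with each of the two earlier peaks.
ratios = {
    year: peak_ratio(peak_afr["latest"], value)
    for year, value in peak_afr.items()
    if year != "latest"
}

if __name__ == "__main__":
    for year, ratio in ratios.items():
        print(f"Latest peak is {ratio:.0%} of the {year} peak")
```

Both ratios come out a little under one third, which is consistent with the authors' description.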

You can check out Patterson and Doyle's blog post for more information on the drives they analyzed this year. The drives were manufactured by HGST, Seagate, Toshiba, and WDC, and the average age of each model ranged from 3.7 to 103.9 months (about 8.7 years). Capacities ranged from 4 TB to 24 TB. In 2021, Backblaze's sample included drives from the same manufacturers, and the average age of each model ranged from 3.57 to 80.85 months (about 6.7 years). Capacities ranged from 4 TB to 16 TB.

As Backblaze has done in the past, Doyle and Patterson compared the behavior of the hard drives in Backblaze's data centers to a bathtub curve, an engineering principle that states that component failure rates tend to follow a U-shape over time: more failures occur early in life, then the rate drops and stabilizes, and finally it rises again as the component ages.
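To make the U-shape concrete, here is a minimal sketch of one common textbook way to model a bathtub curve: adding a falling Weibull hazard (early failures), a constant hazard (random failures), and a rising Weibull hazard (wear-out). The shape and scale values and the constant baseline are made-up illustrative numbers, not Backblaze data, and this is not the method the blog post's authors used.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Illustrative bathtub-shaped failure rate at age t (in years).

    All parameters below are hypothetical and chosen only to produce a U-shape.
    """
    early_failures = weibull_hazard(t, shape=0.5, scale=20.0)  # shape < 1: rate falls with age
    random_failures = 0.01                                     # flat baseline rate
    wear_out = weibull_hazard(t, shape=5.0, scale=12.0)        # shape > 1: rate rises with age
    return early_failures + random_failures + wear_out


if __name__ == "__main__":
    for years in (0.25, 1, 3, 5, 8, 10):
        print(f"{years:>5} years: hazard {bathtub_hazard(years):.3f}")
```

With these parameters the rate starts high, bottoms out in mid-life, and climbs again toward the end. In Backblaze's latest data, by contrast, the peak arrives only at the very end of the curve.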
