Commercial hard drive failures in a data center application and the role of SMART attribute information

Michael Pecht (Center for Advanced Life Cycle Engineering (CALCE), Department of Mechanical Engineering, University of Maryland, College Park, Maryland, USA)
Edmond Elburn (Center for Advanced Life Cycle Engineering (CALCE), Department of Mechanical Engineering, University of Maryland, College Park, Maryland, USA)

Circuit World

ISSN: 0305-6120

Article publication date: 24 September 2020

Issue publication date: 25 October 2021

Abstract

Purpose

The reliability of hard disk drives (HDDs) depends on the drive construction and on the operational and environmental conditions in which the drive is used. Self-monitoring, analysis and reporting technology (SMART) continuously provides attribute information on HDD usage and degradation characteristics.

Design/methodology/approach

This paper analyzes the failures reported in the Backblaze data set for ST3000DM001 HDDs, which are intended for desktop applications but were used in a data center. SMART attributes used for predicting failure are discussed and analyzed over the life of many hard drives. A case study on the actual use of SMART is also presented, covering the limitations of the SMART attribute information, the data center’s information and the use of desktop drives in a commercial application.

Findings

The analysis showed that when Backblaze started to record the data, the hard disk drives had already been in operation for some time, with a power-on hours mean and standard deviation of 6,683 h and 365 h, respectively. Therefore, some SMART attributes may have reached critical values that Backblaze did not record. Additionally, 8% of all ST3000DM001 drives that Backblaze labeled as failed did not have raw values above zero for any of the five attributes that were considered critical. Backblaze recorded 25 SMART attributes in total across all hard disk drive brands; the ST3000DM001, with 83.3% of these attributes recorded, ranked as the drive with the most attributes recorded. Having more recorded attributes with critical values leads to more ST3000DM001 drives being labeled as failed, whereas hard drives from other brands or part numbers may have experienced more critical SMART attribute values but were not labeled as failed because of the lack of records.
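
The two headline statistics above (power-on hours at the first recorded day, and the share of failed drives with no critical attribute above zero) could be computed along the following lines. This is only a minimal sketch, not the authors' procedure. It assumes daily records shaped like the public Backblaze CSV files (columns `date`, `serial_number`, `failure`, `smart_N_raw`), that SMART 9 holds power-on hours, and that the five critical attributes are SMART 5, 187, 188, 197 and 198; the abstract does not name them, so that list is an assumption. The function names are hypothetical.

```python
from statistics import mean, stdev

# Assumption: the abstract does not list the five critical attributes.
# These five are used here purely for illustration.
CRITICAL_ATTRS = (
    "smart_5_raw", "smart_187_raw", "smart_188_raw",
    "smart_197_raw", "smart_198_raw",
)


def first_power_on_hours(records):
    """Map each serial number to the power-on hours on its first recorded day."""
    first = {}
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    for row in sorted(records, key=lambda r: r["date"]):
        first.setdefault(row["serial_number"], row["smart_9_raw"])
    return first


def power_on_hours_summary(records):
    """Return (mean, sample standard deviation) of first-day power-on hours."""
    hours = list(first_power_on_hours(records).values())
    return mean(hours), stdev(hours)  # stdev needs at least two drives


def share_failed_without_critical(records):
    """Fraction of failed drives whose critical raw values never exceeded zero."""
    failed, flagged = set(), set()
    for row in records:
        serial = row["serial_number"]
        if row["failure"] == 1:
            failed.add(serial)
        # Missing or empty attribute values are treated as zero.
        if any((row.get(attr) or 0) > 0 for attr in CRITICAL_ATTRS):
            flagged.add(serial)
    if not failed:
        return 0.0
    return len(failed - flagged) / len(failed)
```

Each record is expected to be a dictionary with numeric SMART values, for example rows read with `csv.DictReader` and converted to integers beforehand.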

Originality/value

It is an original work carried out at the Center for Advanced Life Cycle Engineering, University of Maryland.

Citation

Pecht, M. and Elburn, E. (2021), "Commercial hard drive failures in a data center application and the role of SMART attribute information", Circuit World, Vol. 47 No. 4, pp. 408-426. https://doi.org/10.1108/CW-07-2020-0127

Publisher: Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited