Data Sanitization Terminology and Definitions

Differentiating between Information Lifecycle Management and Data Lifecycle Management

Data lifecycle management (DLM) is often used interchangeably with information lifecycle management (ILM). However, products that support DLM manage files by their general attributes (e.g. type, size and age), whereas ILM goes beyond these attributes to search within stored files for specific pieces of data (e.g. a customer number).

The distinction between ILM and DLM is important. The EU General Data Protection Regulation (GDPR), in effect since May 2018, includes the Right to be Forgotten, which gives customers the right to request that their information be erased and to receive proof of erasure. Meeting such a request means locating a specific customer's data wherever it is stored, which calls for ILM-level search rather than file attributes alone.
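
The difference between the two kinds of selection can be sketched in a few lines of Python. This is an illustrative sketch only, not how any particular DLM or ILM product works: the function names, the file names and the customer number "C-1001" are all made up for the example.

```python
import os
import tempfile
import time
from pathlib import Path


def dlm_select(root: Path, suffix: str, min_age_days: float) -> list[Path]:
    """DLM-style selection: look only at general file attributes (type, age)."""
    cutoff = time.time() - min_age_days * 86400
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix == suffix and p.stat().st_mtime <= cutoff
    )


def ilm_select(root: Path, needle: str) -> list[Path]:
    """ILM-style selection: look inside files for a specific piece of data."""
    hits = []
    for p in sorted(root.rglob("*")):
        if p.is_file() and needle in p.read_text(errors="ignore"):
            hits.append(p)
    return hits


def demo() -> tuple[list[str], list[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "old_invoice.txt").write_text("customer=C-1001 total=40")
        (root / "new_invoice.txt").write_text("customer=C-1001 total=75")
        (root / "notes.txt").write_text("customer=C-2002 called")
        # Backdate one file by 400 days so the age filter has something to find.
        old = time.time() - 400 * 86400
        os.utime(root / "old_invoice.txt", (old, old))
        by_attr = [p.name for p in dlm_select(root, ".txt", min_age_days=365)]
        by_content = [p.name for p in ilm_select(root, "C-1001")]
        return by_attr, by_content
```

In this sketch the attribute-based filter finds only the backdated file, while the content search finds both files that mention the customer, including the recent one that an age rule would miss.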

Information Lifecycle Management

Information lifecycle management (ILM) is a comprehensive approach to managing the flow of an information system’s data and associated metadata from creation and initial storage to the time when it becomes obsolete and is destroyed.

Information Lifecycle Management graphic

Lifecycle Stages:

  • Create – New digital data is generated, or existing data is altered or updated. This covers any data or content element, not just a document or database.
  • Store – The digital data is committed to a storage repository (such as laptops, mobile devices, servers or the cloud). This typically occurs nearly simultaneously with creation.
  • Use – Data is viewed, processed or otherwise used in some sort of activity.
  • Share – Information is made accessible to others, such as between users, both internally and externally, including customers and partners.
  • Archive – The data leaves the ‘active’ state and migrates to long-term storage systems for retention.
  • Destroy – Data sanitization is performed to make the data permanently unrecoverable through physical or digital means (e.g. physical destruction, cryptographic erasure or data erasure).

Data Security Lifecycle (DSL)

The data security lifecycle (DSL) and information lifecycle management (ILM) differ based on the needs of the audience (security vs. operations).  The lifecycle includes six phases from creation to destruction. Although it is shown as a linear progression, once created, data can bounce between phases without restriction, and may not pass through all stages.  This is a summary of the lifecycle, and a complete version is available here.

Data Security Lifecycle graphic

Stages for Data Security Lifecycle

  • Create – Classify the information and determine appropriate rights. This is usually performed by technology, or a default classification and default rights are applied based on point of origin.
  • Store – Map the classification and rights to security controls, including access controls, encryption and rights management. Include certain database controls like labeling in rights management – not just DRM. Controls at this stage also apply to managing content in storage repositories, such as using content discovery to ensure that data is in approved/appropriate repositories.
  • Use – Include both detective controls like activity monitoring and preventative controls like rights management. Logical controls are typically applied in databases and applications.
  • Share – Include a mix of detective and preventative controls, such as DLP/CMF/CMP and encryption for secure exchange of data, as well as logical controls and application security.
  • Archive – Use a combination of encryption and asset management to protect the data and ensure its availability.
  • Destroy – Use an effective data sanitization method to deliberately, permanently and irreversibly remove or destroy the data. This process involves going back through the archive, storage and sharing locations of that data (where the data ‘has’ been located) to permanently make it unrecoverable.
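
Because the Destroy stage depends on knowing everywhere the data has been, it can help to picture a simple register of locations. The following is a minimal sketch under that assumption; the `DataAsset` class and the location names are hypothetical and not part of any lifecycle standard.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """Tracks every location a piece of data has occupied (hypothetical model)."""
    name: str
    locations: set[str] = field(default_factory=set)
    sanitized: set[str] = field(default_factory=set)

    def record(self, location: str) -> None:
        # Called whenever the data is stored, shared or archived somewhere.
        self.locations.add(location)

    def mark_sanitized(self, location: str) -> None:
        if location not in self.locations:
            raise ValueError(f"unknown location: {location}")
        self.sanitized.add(location)

    def outstanding(self) -> set[str]:
        # Locations that may still hold a recoverable copy.
        return self.locations - self.sanitized

    def is_destroyed(self) -> bool:
        return bool(self.locations) and not self.outstanding()


asset = DataAsset("customer-C-1001")
for loc in ("file-server", "partner-sftp", "tape-archive"):
    asset.record(loc)
asset.mark_sanitized("file-server")
asset.mark_sanitized("tape-archive")
print(sorted(asset.outstanding()))  # ['partner-sftp']
print(asset.is_destroyed())         # False
```

The data only counts as destroyed once every recorded location has been sanitized, which is why the shared copy in this example keeps the result at False.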

Data Hygiene

Data hygiene is the process of ensuring all incorrect, duplicate or unused data is properly classified and migrated into the appropriate lifecycle stage for storage, archival or destruction on an ongoing basis through automated policy enforcement. By following data hygiene best practices, organizations are able to effectively manage ‘where’ their data is throughout the lifecycle and reduce the amount of data they store by successfully destroying the data to mitigate risks.
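
Automated policy enforcement of this kind can be pictured as a small rule set that assigns each record a lifecycle action. The sketch below is only an illustration: the `Record` type, the action names and the thresholds (365 and 2555 days) are assumptions chosen for the example, not recommended retention periods.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    content: bytes
    days_since_access: int


def classify(records, archive_after=365, destroy_after=2555):
    """Assign each record a lifecycle action using simple, hypothetical rules."""
    seen = set()
    actions = {}
    for rec in records:
        digest = hashlib.sha256(rec.content).hexdigest()
        if digest in seen:
            actions[rec.name] = "destroy"   # exact duplicate of an earlier record
        elif rec.days_since_access >= destroy_after:
            actions[rec.name] = "destroy"   # unused past the assumed retention period
        elif rec.days_since_access >= archive_after:
            actions[rec.name] = "archive"   # inactive, still within retention
        else:
            actions[rec.name] = "keep"
        seen.add(digest)
    return actions
```

A record marked "destroy" here would still need to go through one of the data sanitization methods described below; the policy only decides which stage the data belongs in.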

Data Sanitization

Data sanitization is the process of deliberately, permanently and irreversibly removing or destroying the data stored on a memory device to make it unrecoverable. A device that has been sanitized has no usable residual data, and even with the assistance of advanced forensic tools, the data cannot be recovered. There are three methods to achieve data sanitization: physical destruction, cryptographic erasure and data erasure.

Physical Destruction

Physical destruction is the process of shredding hard drives, smartphones, printer drives, laptops and other storage media into tiny pieces using large mechanical shredders. Other methods of physical destruction include incineration, melting, pulverization and disintegration.

Degaussing

Degaussing is a form of physical destruction whereby the storage medium is exposed to the powerful magnetic field of a degausser, which neutralizes the stored data and renders it unrecoverable. Degaussing works only on hard disk drives (HDDs) and most tapes, and the drives or tapes cannot be reused afterward. Degaussing is not an effective method of data sanitization for solid state drives (SSDs).

Pros & Cons of Physical Destruction

Physical destruction, when done according to updated requirements, is an effective method of destroying data to render it unrecoverable and achieve data sanitization. While useful in cases where drives are irreparable or unable to be erased, physical destruction can be harmful to the environment and financially costly. This is because physically destroying data storage hardware means the assets cannot be reused or resold, cutting short the lifespan of functional devices.

Cryptographic Erasure (Crypto Erase)

Cryptographic erasure, also called Crypto Erase, is the process of using encryption software (either built-in or deployed) on the entire data storage device and then erasing the key used to decrypt the data. The encryption must use keys of at least 128 bits (go here for industry-tested and accepted algorithms). Although the data remains on the storage device, erasing the original key makes the data effectively impossible to decrypt. The data is thereby rendered unrecoverable, which makes cryptographic erasure an appropriate method to achieve data sanitization.

3 Steps to Achieve Cryptographic Erasure:

  1. Encryption on the storage device must be turned on by default, and the device must expose an API call for removing the key. This is what allows cryptographic erasure to be supported.
  2. The process must verify that the old encryption key has been removed and replaced with a new key, leaving the data encrypted and the previous key unrecoverable.
  3. The cryptographic erasure software must produce a tamper-proof certificate containing information that the key has been successfully removed, along with data about the device and standard used.
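
The three steps can be sketched against a simulated device. This is a toy model under stated assumptions: `SelfEncryptingDevice` is hypothetical, the keystream built from SHA-256 is only a stand-in for a vetted cipher such as AES and must not be used for real encryption, and a real certificate would also be signed and would carry device details.

```python
import hashlib
import secrets


def _keystream(key: bytes, length: int) -> bytes:
    # Toy keystream: SHA-256 over key + block counter. NOT a vetted cipher.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


class SelfEncryptingDevice:
    """Hypothetical device that encrypts everything with an internal media key."""

    def __init__(self) -> None:
        self.encryption_enabled = True
        self._key = secrets.token_bytes(32)   # 256-bit media key
        self._ciphertext = b""

    def write(self, plaintext: bytes) -> None:
        self._ciphertext = _xor(plaintext, self._key)

    def read(self) -> bytes:
        return _xor(self._ciphertext, self._key)

    def key_fingerprint(self) -> str:
        return hashlib.sha256(self._key).hexdigest()

    def rotate_key(self) -> None:
        # Replace the media key; this model keeps no copy of the old key.
        self._key = secrets.token_bytes(32)


def crypto_erase(device: SelfEncryptingDevice) -> dict:
    # Step 1: encryption must already be on and the key must be replaceable.
    if not device.encryption_enabled:
        raise RuntimeError("device does not support cryptographic erasure")
    before = device.key_fingerprint()
    device.rotate_key()
    after = device.key_fingerprint()
    # Step 2: verify the old key is gone and a new key is in place.
    if before == after:
        raise RuntimeError("key was not replaced")
    # Step 3: report what was done (a real certificate would also be signed).
    return {"method": "cryptographic erasure", "old_key": before, "new_key": after}
```

After `crypto_erase` runs, the ciphertext is still on the simulated device, but reading it back with the new key yields unrelated bytes rather than the original plaintext.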

Pros and Cons of Cryptographic Erasure

Cryptographic erasure is a quick and effective method to achieve data sanitization. It is best used when storage devices are in transit or for storage devices that contain information that is not sensitive. Cryptographic erasure relies heavily on the manufacturer's implementation, where issues could occur. Users could also undermine its success through broken keys and human error. Most importantly, cryptographic erasure leaves the data on the storage device and often does not meet regulatory compliance requirements.

Data Erasure

Data erasure is the software-based method of securely overwriting the data on any data storage device by writing zeros and ones onto all sectors of the device. By overwriting the data on the storage device, the data is rendered unrecoverable and data sanitization is achieved.

To Achieve Data Erasure, the Software Must:

  1. Allow for selection of a specific standard, based on your industry and organization’s unique needs.
  2. Verify that the overwrite was successful and removed data across the entire device, or across the targeted data if specific data was selected.
  3. Produce a tamper-proof certificate containing information that the erasure has been successful and written to all sectors of the device, along with data about the device and standard used.
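
The overwrite, verify and report requirements can be sketched as follows. This is a minimal illustration with several assumptions: an ordinary file stands in for the storage device, the single-byte pattern and the report fields are made up for the example, and the HMAC signature makes the report tamper-evident for anyone holding the signing key, which is weaker than a full certificate scheme. Overwriting a file through the filesystem does not by itself sanitize a real drive, since SSDs and some filesystems may keep older copies of the data elsewhere.

```python
import hashlib
import hmac
import json
import os


def erase_and_verify(path: str, pattern: int = 0x00, chunk: int = 4096) -> bool:
    """Overwrite every byte of `path` with `pattern`, then read back to verify."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(bytes([pattern]) * n)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())
    checked = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            if block.count(pattern) != len(block):
                return False          # a byte survived the overwrite
            checked += len(block)
    return checked == size


def signed_report(path: str, standard: str, verified: bool, signing_key: bytes) -> dict:
    body = {
        "target": path,
        "standard": standard,         # label only; no named standard is implemented
        "bytes": os.path.getsize(path),
        "verified": verified,
    }
    payload = json.dumps(body, sort_keys=True).encode()
    body["signature"] = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return body


def report_is_intact(report: dict, signing_key: bytes) -> bool:
    body = {k: v for k, v in report.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, report["signature"])
```

Changing any field of the report after it is issued, for example flipping `verified`, causes `report_is_intact` to return False.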

Block erase is a feature offered by some software, but the term is often used interchangeably with data erasure. Block erase is the ability for vendor software to target the logical block addresses, including those that are not currently mapped to active addresses, on the storage device to erase all data on the device. However, if the block erase software does not meet the 3 requirements noted in the data erasure definition, it does not achieve data sanitization.

Pros and Cons of Data Erasure

Data erasure is the highest form of securing data within data sanitization, due to the validation process for ensuring the data was successfully overwritten and the auditable reporting readily available. Data erasure also supports environmental initiatives, while allowing organizations to retain the resale value of the storage devices. Data erasure, however, is a more time-consuming process than other forms of data sanitization. It also requires organizations to develop policies and processes for all data storage devices.

Incomplete Data Sanitization Methods

Data sanitization methods have been proven to render the data on the appropriate storage devices unrecoverable. However, many other terms are often used interchangeably with them, even though the methods they describe result in incomplete data sanitization.

  • Data deletion is the act of hiding data on a storage device, whereby the data is available for overwrite. Until the data has been overwritten, the data is still easily recoverable. Example: If a file is deleted and the recycle bin is emptied, the pointers to the data are removed, but the data itself is recoverable.
  • Reformatting is performed on a working disk drive to eliminate its contents. However, formatting leaves most, and sometimes all, existing data on the storage device. Following a reformat, some or most of the data can be recovered with forensic tools available online (free and paid).
  • Factory Reset is most often found on mobile devices, tablets and Internet of Things (IoT) devices, though it can be found on other IT assets such as routers and computers. A factory reset removes all user data and restores a device back to factory settings, provided the device is not rooted. Some manufacturers enable cryptographic erasure as part of the factory reset. It is important to understand what methodology the device uses during the factory reset before relying on it as the only method.
  • Data wiping is often used interchangeably with data erasure; however, there are core differences. Data wiping is the process of overwriting data without verifying that all sectors of the storage device were successfully overwritten, and it does not produce a certified report. Unlike data erasure, data wiping does not follow any erasure standards and does not offer any proof that the data has been successfully erased. Therefore, this method is not considered an approved method for data sanitization.
  • File shredding is often used interchangeably with data erasure; however, there are some core differences. File shredding destroys data on individual files and folders by overwriting the space with a random pattern of 1s and 0s. However, like data wiping, it does not verify that the overwriting has been successful and does not produce a report to prove that the data has been successfully erased. Therefore, this method is not considered an approved method for data sanitization.
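
The gap between deletion and a verified overwrite is easy to see on a simulated device. The `ToyDisk` class below is a hypothetical in-memory model, not a real filesystem: it keeps a file table of pointers next to the raw bytes, which is enough to show why removing a pointer leaves the data recoverable.

```python
class ToyDisk:
    """In-memory stand-in for a storage device with a simple file table."""

    def __init__(self, size: int) -> None:
        self.blocks = bytearray(size)
        self.table: dict[str, tuple[int, int]] = {}   # name -> (offset, length)
        self._next = 0

    def write_file(self, name: str, data: bytes) -> None:
        start = self._next
        if start + len(data) > len(self.blocks):
            raise ValueError("disk full")
        self.blocks[start:start + len(data)] = data
        self.table[name] = (start, len(data))
        self._next += len(data)

    def delete(self, name: str) -> None:
        # Deletion only drops the pointer; the bytes stay where they were.
        del self.table[name]

    def erase(self, pattern: int = 0x00) -> bool:
        # Overwrite every byte, then verify by reading the whole device back.
        for i in range(len(self.blocks)):
            self.blocks[i] = pattern
        self.table.clear()
        self._next = 0
        return all(b == pattern for b in self.blocks)

    def raw_scan(self, needle: bytes) -> bool:
        # What a recovery tool does: ignore the file table and read raw blocks.
        return needle in self.blocks


disk = ToyDisk(64)
disk.write_file("invoice.txt", b"customer=C-1001")
disk.delete("invoice.txt")
print("invoice.txt" in disk.table)       # False: the file looks gone
print(disk.raw_scan(b"C-1001"))          # True: the data is still recoverable
print(disk.erase())                      # True: overwrite verified
print(disk.raw_scan(b"C-1001"))          # False: nothing left to recover
```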