Dell EMC Unity: Data Reduction: Technical White Paper
Abstract
This white paper is an introduction to the Dell EMC™ Unity Data Reduction
feature. It provides an overview of the feature, methods for managing data
reduction, and interoperability with other Dell EMC Unity features. Data
Reduction exists in Dell EMC Unity OE version 4.3 and later.
June 2021
H16870
Revisions
Date          Description
March 2018    Initial Release – Unity OE 4.3
Acknowledgments
Author: Ryan Poulin
The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the information in this
publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.
Use, copying, and distribution of any software described in this publication requires an applicable software license.
This document may contain certain words that are not consistent with Dell's current language guidelines. Dell plans to update the document over
subsequent future releases to revise these words accordingly.
This document may contain language from third party content that is not under Dell's control and is not consistent with Dell's current guidelines for Dell's
own content. When such third party content is updated by the relevant third parties, this document will be revised accordingly.
Copyright © 2016-2021 Dell Inc. or its subsidiaries. All Rights Reserved. Dell Technologies, Dell, EMC, Dell EMC and other trademarks are trademarks
of Dell Inc. or its subsidiaries. Other trademarks may be trademarks of their respective owners. [6/21/21] [Technical White Paper] [H16870.3]
Table of Contents
1 Dell EMC Unity Data Reduction Licensing
2 Overview
2.1 Supported Configurations
3 Theory of Operation
3.1 Writes
3.2 Reads
3.3 Overwrites
3.4 Creating Data Reduction Enabled Storage Resources
3.5 Enabling Data Reduction on an Existing Storage Resource
3.6 Local LUN Move
3.7 Disabling Data Reduction on a Resource
4 Management
4.1 Creating a Data Reduction Enabled Storage Resource
4.2 Enabling and Disabling Data Reduction on an Existing Storage Resource
4.3 How to Determine Which Storage Resources Have Data Reduction Enabled
4.4 Local LUN Move
4.5 Savings Reporting
5 Interoperability
5.1 Data at Rest Encryption
5.2 Replication
5.3 Snapshots
5.4 Thin Clones
5.5 Dell EMC Unity Native File and Block Import
5.6 Pool Expansion
6 Conclusion
A Technical support and resources
A.1 Related resources
Executive summary
Data reduction technologies play a critical role in environments in which storage administrators are attempting
to do more with less. Dell EMC Unity Data Reduction aids in this effort by attempting to reduce the amount of
physical storage needed to save a dataset, which helps reduce the Total Cost of Ownership of a Dell EMC
Unity storage system. Dell EMC Unity Data Reduction provides space savings through the use of data
deduplication and compression. Data reduction is easy to manage, and once enabled, is intelligently
controlled by the storage system. Configuring data reduction and reporting savings is simple, and can be
done through Unisphere, Unisphere CLI, or REST API.
In Dell EMC Unity OE version 4.3 and later, the Dell EMC Unity Data Reduction feature replaces Dell EMC
Unity Compression. Data reduction includes deduplication, compression, and zero-block detection, which
can increase the amount of space savings that can be achieved. Once the Dell EMC Unity OE has
been upgraded, data reduction enabled storage resources can be created, or data reduction can be enabled
on existing storage resources which support data reduction. All resources with compression enabled
previously will automatically begin using the data reduction algorithm. All compression references, including
ways to manage compression and view space savings information, have been updated to utilize the data
reduction terminology. For more information about Dell EMC Unity Compression and features prior to the OE
version 4.3 release, refer to the Dell EMC Unity: Compression white paper found on Dell EMC Online
Support.
This white paper discusses the Dell EMC Unity Data Reduction feature, including technical information about
the underlying technology of the feature, how to manage data reduction on supported storage resources, how
to view data reduction savings, and the interoperability of data reduction with other features of the storage
system. Best Practices information for using Dell EMC Unity Data Reduction, along with information about
when to enable data reduction, can be found in the Dell EMC Unity: Best Practices Guide white paper found
on Dell EMC Online Support.
Audience
This white paper is intended for customers, partners, and employees who are planning to utilize Dell EMC
Unity Data Reduction. It assumes familiarity with Dell EMC Unity and Dell EMC Unity’s management
software.
Terminology
Advanced Deduplication: A dynamic deduplication algorithm which reduces storage consumption by
eliminating duplicate 8KB blocks of data within a storage resource.
All Flash Pool: A Pool which contains only Flash Drives. An All Flash Pool can be a Traditional Pool or a
Dynamic Pool.
Asynchronous Replication: A replication method which allows you to replicate data over long distances and
maintain a replica at a destination site. Updates to the destination image can be issued manually, or
automatically based on a customizable Recovery Point Objective (RPO).
Block Storage Resources: LUNs, LUNs within a Consistency Group, and VMware VMFS Datastores.
Compression: A data reduction method which reduces the physical amount of storage required to save a
dataset.
Consistency Group: A storage instance which contains one or more LUNs within a storage system.
Consistency Groups help organize the storage allocated for a particular host or hosts. Data protection
configurations, such as replication and snapshot settings, on a Consistency Group affect all the LUNs
contained in the group, providing ease of management and crash consistency if the LUNs are dependent on
each other.
Data at Rest Encryption (D@RE): The process of encrypting data and protecting it against unauthorized
access unless valid keys are provided. This prevents data from being accessed and provides a mechanism to
quickly crypto-erase data.
Deduplication: A data reduction method which reduces the physical amount of storage required to save a
dataset.
File Storage Resources: File Systems (NFS, SMB) and VMware NFS Datastores.
Flash drive (SSD): A Flash based storage device used to store data.
Hard Disk Drive (HDD): A storage device based on spinning platters used to store data.
Hybrid Pool: A Pool which does not contain only Flash Drives. A Hybrid Pool typically contains more than
one type of drive technology, such as Flash, SAS, and NL-SAS.
LUN: A block based storage resource which a user provisions. It represents a SCSI logical unit.
Pool: A set of drives that provide specific storage characteristics for the resources that use them, such as
LUNs, VMware Datastores, and File Systems.
REST API: An application programming interface that utilizes familiar HTTP operations like GET, POST, and
DELETE. REST architecture includes certain constraints that ensure that different implementations of REST
conform to the same guiding principles, thereby allowing developers the ease of application development
when working with different REST API deployments.
Snapshot: A snapshot, also called a Dell EMC Unity Snapshot, is a point-in-time view of a storage resource.
When a Snapshot is taken, the snapshot is an exact copy of the source storage resource and shares all
blocks of data with it. As data changes on the source, new blocks are allocated and written to. Dell EMC Unity
Snapshot technology can be used to take a snapshot of a Block or File storage resource.
Storage Resource: An addressable and configurable storage instance associated with a specific quantity of
storage. LUNs, File Systems, and VMware Datastores constitute storage resources.
System Cache (DRAM Cache): Dell EMC Unity software component which leverages DRAM memory to
improve host read and write performance.
Thin Clone: A read/write copy of a Thin Block storage resource (LUN, Consistency Group, or VMware VMFS
Datastore) that shares blocks with the parent resource.
Unisphere: A web-based management environment used to create storage resources, configure and
schedule protection for stored data, and manage and monitor other storage operations.
Unisphere CLI (UEMCLI): The command line interface for managing Dell EMC Unity storage systems.
To verify which version of Dell EMC Unity OE your system is running, simply select the View System Status
icon found on the top blue menu bar of Unisphere. Alternatively, you can view the license status for Dell EMC
Unity Data Reduction by clicking the Update System Status icon, denoted by a gear icon on the top blue
menu bar, and finding Data Reduction in the License Management list. A Data Reduction entry with a
green checkmark beside it confirms that the feature is licensed on the system.
2 Overview
The Dell EMC Unity family of storage systems is feature-rich and easy to use, and delivers full Block and File
unified environments starting in a single 2U enclosure. To help reduce the Total Cost of Ownership and
increase the efficiency of a Dell EMC Unity storage system, Dell EMC Unity Compression was added in Dell
EMC Unity OE version 4.1 for Thin Block storage resources. Thin File storage resource support was added in
Dell EMC Unity OE version 4.2. In Dell EMC Unity OE version 4.3, the Dell EMC Unity Data Reduction feature
replaces compression, and provides more space savings logic to the system with the addition of zero block
detection and deduplication. In Dell EMC Unity OE version 4.5, Data Reduction includes an optional feature
called Advanced Deduplication, which expands the deduplication capabilities of the Data Reduction algorithm.
With data reduction, the amount of space required to store a dataset for data reduction enabled storage
resources is reduced when savings are achieved. This space savings reduces the amount of physical storage
required to store a dataset, which can lead to cost savings. Data reduction savings are not only achieved on
the storage resource it is enabled on, but space savings are also realized on Snapshots and Thin Clones of
those resources as well. Snapshots and Thin Clones inherit the data reduction setting of the source storage
resource, which helps to increase the space savings that they can provide.
Data Reduction + Advanced Deduplication | All Flash Pool** | 450F | 550F | 650F
Dell EMC Unity Data Reduction and the Advanced Deduplication option, for configurations supporting the
Advanced Deduplication feature, can be enabled on supported storage resources at the time of the resource’s
creation, or enabled or disabled at a later time. Advanced Deduplication requires Data Reduction to be
enabled on the resource but can otherwise be enabled or disabled independently of the Data Reduction
setting. Local LUN Move can be leveraged to move a resource’s data into a data reduction enabled, and
optionally Advanced Deduplication enabled, resource within an All Flash Pool. For File storage resources, there is no direct method
available to convert from a Thick File resource or pre-Dell EMC Unity OE 4.2 Thin File resource to a 4.2 or
later Thin File resource. Methods to move File data include host-based migration to a Dell EMC Unity OE 4.2
or later Thin File System, VMware vMotion for VMs created on NFS Datastores, or Dell EMC Unity
Asynchronous Replication. More information about migration options can be found in the Dell EMC Unity:
Migration Technologies white paper found on Dell EMC Online Support.
Dell EMC Unity Data Reduction can also be enabled on Block and File storage resources participating in
replication sessions. The source and destination storage resources in a replication session are completely
independent, and data reduction with or without the Advanced Deduplication option can be enabled or
disabled separately on the source and destination resources. Whether data reduction, Advanced
Deduplication, or (on systems running an OE version prior to 4.3) compression can be enabled on a
source or destination resource depends on the Dell EMC Unity OE version, the system type, and the
Pool configuration.
Pools containing data reduction enabled storage resources cannot be expanded with SAS or NL-SAS. For
more information about expanding Pools and how to convert to a Hybrid Flash Pool, please review the Pool
Expansion section found under Interoperability.
3 Theory of Operation
3.1 Writes
Dell EMC Unity Data Reduction works the same for both Block and File storage resources. Data reduction
uses a software algorithm to analyze and achieve space savings within a storage resource. Figure 1 below is
a high-level diagram of a storage resource with data reduction enabled residing within an All Flash Pool. As
shown in Figure 1, data reduction occurs inline between System Cache and the storage resource on an All
Flash Pool.
When data is written to the system, the data is saved in System Cache, and the write is acknowledged to
the host. The data reduction algorithm is not invoked for write I/Os at this point in time in order to provide the
fastest response to the host. Figure 2 below outlines an example of a write to a storage resource with data
reduction enabled. No data has been written to the drives within the Pool at this time.
In Dell EMC Unity, before a write is saved in System Cache, the system ensures space is available and
allocated for the I/O within the target storage resource. As all back-end allocations and lookups within the
target resource are deferred until after writes are accepted into System Cache and the host is acknowledged,
a portion of the private space within the storage resource’s overhead is tracked and utilized as a possible
location to store the I/O when accepting data into cache. A storage resource’s private space is fixed in size,
and allocated at time of the storage resource’s creation. After the I/O is acknowledged, the normal cache
cleaning process occurs. Space within the storage resource is utilized or allocated, if needed, and the data is
saved to disk. This caching behavior not only applies to data reduction enabled resources, but it is also
applicable to Block and File storage resources (excluding vVols) created on All Flash Pools.
For data reduction enabled storage resources, the data reduction process occurs during the System Cache’s
proactive cleaning operations or when System Cache is flushing cache pages to the drives within the Pool.
The data in this scenario may be new to the storage resource, or the data may be an update to existing
blocks of data currently residing on disk. In either case, the data reduction algorithm occurs before the data is
written to the drives within the Pool. During the data reduction process, multiple blocks are aggregated
together and sent through the algorithm. After determining if savings can be achieved or data needs to be
written to disk, space within the Pool is allocated if needed and the data is written to the drives. A high-level
diagram of this operation is displayed in Figure 3 below.
Dell EMC Unity’s Data Reduction feature includes multiple space efficiency algorithms to help reduce the total
space occupied by a dataset. The Data Reduction feature includes deduplication, compression, and
optionally Advanced Deduplication algorithms. Figure 4 below is an overview of the data reduction feature
with Advanced Deduplication enabled. Before data is sent to the Data Reduction algorithm, it is first
segmented into 8KB blocks. As an 8KB block of data passes through the algorithm, it may or may not touch
all efficiency algorithms within data reduction. If a block can be deduplicated, the remainder of the data
reduction algorithms are skipped, saving time and processing overhead. Each of the algorithms within the
data reduction feature is discussed in detail later in this section.
With Advanced Deduplication disabled, a block of data entering the data reduction feature is only passed
through the deduplication and compression algorithms. The compression algorithm is only reached when
zeros or common patterns are not detected on the block of data. An example of data reduction with Advanced
Deduplication disabled is shown in Figure 5.
The Advanced Deduplication algorithm utilizes fingerprints created for each block of data to quickly
identify duplicate data within the dataset. Figure 6 below shows the Advanced Deduplication algorithm in
detail.
The fingerprint cache is a component of the Advanced Deduplication algorithm. The fingerprint cache is a
region in system memory reserved for storing fingerprints for each storage resource with Advanced
Deduplication enabled. There is one fingerprint cache per storage processor, and it contains the fingerprints
for storage resources residing on that SP. Through machine learning and statistics, the fingerprint cache
determines which fingerprints to keep, and which ones to replace with new fingerprints. The fingerprint cache
algorithm learns which resources have high deduplication rates and allows those resources to consume more
fingerprint locations.
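The fingerprint cache described above can be pictured as a bounded lookup table from fingerprint to block location. The following is a minimal Python sketch, not the system's implementation. The class name FingerprintCache, the fingerprint() helper, and the use of SHA-256 are all assumptions made for illustration. Plain least-recently-used eviction is used here only as a simple stand-in for the machine learning and statistics based replacement policy, which this paper does not detail.

```python
from collections import OrderedDict
import hashlib


def fingerprint(block):
    """Return a fingerprint for a block of data.

    SHA-256 is an assumption; the paper does not name the fingerprint function.
    """
    return hashlib.sha256(block).digest()


class FingerprintCache:
    """Bounded map from fingerprint to on-disk block location (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # fingerprint -> block location

    def lookup(self, fp):
        """Return the stored location for a fingerprint, or None on a miss."""
        location = self._entries.get(fp)
        if location is not None:
            self._entries.move_to_end(fp)  # mark the entry as recently used
        return location

    def insert(self, fp, location):
        """Record a fingerprint, evicting the least recently used entry when full.

        LRU is a stand-in policy, not the policy the storage system uses.
        """
        self._entries[fp] = location
        self._entries.move_to_end(fp)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)
```

In this sketch a single cache instance would correspond to one storage processor. A real policy would also weigh per-resource deduplication rates, which a simple LRU does not capture.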
If an 8KB block is not deduplicated by the zero and common pattern deduplication algorithm, the data is
passed into the fingerprint calculation portion of the Advanced Deduplication algorithm. Each 8KB block
receives a fingerprint, which is compared to the fingerprints for the storage resource. If a matching fingerprint
is found, deduplication occurs and the private space within the resource is updated to include a reference to
the block of data residing on disk. No data is written to disk at this time. Storage resource savings are
compounded as deduplication can reference compressed blocks on disk. If a match is not found, the data is
passed to the compression algorithm.
3.1.3 Compression
As blocks enter the compression algorithm, they are passed through the compression software. If savings can
be achieved, space is allocated within the Pool which matches the compressed size of the data, the data is
compressed, and the data is written to the Pool. When Advanced Deduplication is enabled, the fingerprint for
the block of data is also stored with the compressed data on disk. The fingerprint cache is then updated to
include the fingerprint for the new data. Compression will not compress data if no savings can be achieved. In
this instance, the original block of data will be written to the Pool. Waiting to allocate space within the
resource until after the compression algorithm is complete helps to not over-allocate space within the storage
resource.
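The order in which a block passes through the data reduction stages can be summarized in a short Python sketch. This is an illustration under stated assumptions rather than the actual algorithm: reduce_block, detect_pattern, the list used as the Pool, and the dictionary used for fingerprints are hypothetical names, and zlib and SHA-256 stand in for the compression and fingerprint functions, which this paper does not name. Pattern detection is simplified to a single repeated byte, and recording a fingerprint for blocks written in their original form is also an assumption.

```python
import hashlib
import zlib

BLOCK_SIZE = 8 * 1024  # data is segmented into 8KB blocks before reduction


def detect_pattern(block):
    """Return the repeated byte if the block is one byte repeated (zeros included), else None."""
    first = block[:1]
    return first if block == first * len(block) else None


def reduce_block(block, pool, fingerprints, advanced_dedup=True):
    """Pass one block through the stages and return (action, value).

    pool is a list standing in for space on the drives; fingerprints maps a
    fingerprint to the pool index of a block already written.
    """
    # Stage 1: zero and common pattern detection. Only metadata is updated.
    pattern = detect_pattern(block)
    if pattern is not None:
        return ("pattern", pattern)

    # Stage 2 (optional): Advanced Deduplication by fingerprint match.
    fp = None
    if advanced_dedup:
        fp = hashlib.sha256(block).digest()  # assumed fingerprint function
        if fp in fingerprints:
            return ("dedup", fingerprints[fp])  # reference existing data, no write

    # Stage 3: compression. The original block is kept if nothing is saved.
    compressed = zlib.compress(block)  # assumed compression function
    if len(compressed) < len(block):
        action, stored = "compressed", compressed
    else:
        action, stored = "raw", block

    # Space is allocated only now, once the final stored size is known.
    pool.append(stored)
    location = len(pool) - 1
    if fp is not None:
        fingerprints[fp] = location
    return (action, location)
```

Because a deduplicated block returns before the compression stage, the later stages are skipped for it, which mirrors the behavior described for Figure 4.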
3.2 Reads
When a read operation is sent to a storage resource which has data reduction enabled, the system first needs
to determine where the data is currently located. A lookup is performed to determine if the block is currently
stored within System Cache or on the Pool in its original size or was previously deduplicated or compressed.
If the data resides in System Cache in its original form, the data is sent to the host requesting the data. If the
data does not reside in System Cache and is stored in its original form on the Pool, a normal read operation
occurs as if data reduction is disabled on the storage resource. The data is copied into System Cache and
then sent to the host requesting the data.
If data reduction achieved space savings within the block, the data must be recreated within System Cache in
its original form so it can be sent to the host. If the block was previously deduplicated, the block is either
recreated if the block contained a common pattern or copied into System Cache if the block was deduplicated
by Advanced Deduplication using the information within the private space of the resource, and the host is sent
the data. If the data is compressed, it must first be uncompressed before the data is sent to the host. If the
compressed data already resides in System Cache, the data is uncompressed to a temporary location, the
data is sent to the host, and the temporary location is released. If the compressed data being requested
resides on a drive, the data is first read into System Cache, uncompressed to a temporary location, and the
host is sent the data. Data is never uncompressed on disk due to a read operation, as this would reduce the
amount of savings on the storage resource.
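The read cases described in this section can be sketched as a single lookup function. This is a simplified Python illustration, not the system's read path. The names read_block, cache, mapping, and pool are hypothetical, zlib is an assumed stand-in for the compression software, and the mapping dictionary plays the role of the information held in the resource's private space. The sketch does not model compressed data held in System Cache.

```python
import zlib

BLOCK_SIZE = 8 * 1024


def read_block(address, cache, mapping, pool):
    """Return the original bytes for a block address.

    cache:   dict of address -> original bytes (stand-in for System Cache)
    mapping: dict of address -> (kind, value), where kind is "pattern",
             "raw", or "compressed"; deduplicated addresses simply share
             the same pool location
    pool:    list of stored blocks (stand-in for the drives)
    """
    if address in cache:
        return cache[address]  # already in original form

    kind, value = mapping[address]
    if kind == "pattern":
        # Recreated from metadata; nothing is read from the drives.
        return value * BLOCK_SIZE

    stored = pool[value]
    if kind == "compressed":
        # Uncompressed to a temporary buffer only; the stored copy is untouched.
        return zlib.decompress(stored)

    cache[address] = stored  # normal read: copy into cache, then return
    return stored
```

Note that the compressed branch never writes the uncompressed data back to the pool, which reflects the statement that data is never uncompressed on disk due to a read.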
3.3 Overwrites
When an update is received for a previously written block of data, the system determines if the overwrite is for
a block which has space savings or not. The data is also passed through the data reduction logic to determine
if any space savings can be achieved. If the new block of data deduplicates to a known pattern, the private
space within the resource is updated with the new pattern information. If the now outdated block of data had
compression savings or was written to the Pool in its original form, these blocks are freed within the resource
for reuse. If deduplication savings cannot be achieved, the data is sent through the compression logic.
If compression can reduce the size of the data, the system needs to determine where to store the block of
data within the Pool. If the amount of compression savings is now less than the last time the data was written,
then new space must be allocated within the storage resource to store the new data size. If the dataset size
hasn’t changed or is smaller than it was previously, then a write to an already allocated block may occur. This
logic helps avoid fragmentation in the resource, which benefits performance and space savings. If a
new block is allocated, the previously used block is freed for reuse. If no deduplication or compression
savings can be achieved, the data is written in the Pool at its original size, and may overwrite the original data
for the resource.
In the background, the old locations that are no longer needed are freed by a cleanup process and can be
reused. This process also frees blocks no longer in use by the storage resource and its Snapshots or Thin
Clones. If enough space is freed within a 256 MB slice, the slice can be freed back to the Pool.
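The placement decision for an overwrite, and the later return of slices to the Pool, can be illustrated with a small Python sketch. The function names place_overwrite and reclaimable_slices are hypothetical. The paper states only that an in-place write may occur when the stored size has not grown, and that a slice can be freed when enough space is freed within it. Treating "enough" as "no blocks in use" is an assumption made here for simplicity.

```python
SLICE_SIZE = 256 * 1024 * 1024  # slices are 256 MB


def place_overwrite(old_stored_size, new_stored_size):
    """Decide where overwritten data may be stored.

    Returns "in_place" when the new data fits the existing allocation,
    otherwise "new_allocation" (the old location is then freed for reuse).
    """
    if new_stored_size <= old_stored_size:
        return "in_place"
    return "new_allocation"


def reclaimable_slices(used_blocks_per_slice):
    """Return the IDs of slices that can be handed back to the Pool.

    used_blocks_per_slice maps a slice ID to the number of blocks still
    referenced by the resource, its Snapshots, or its Thin Clones. A slice
    is treated as reclaimable only when that count is zero (an assumption).
    """
    return [sid for sid, used in sorted(used_blocks_per_slice.items()) if used == 0]
```

For example, an overwrite that compresses to 2 KB where 4 KB was stored previously can reuse the existing location, while one that compresses to 6 KB needs a new allocation.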
Unisphere CLI or REST API, extra options for enabling data reduction and Advanced Deduplication are
available. Data reduction enabled LUNs are also supported within Consistency Groups. A mix of LUNs with
and without data reduction enabled can reside within the same Consistency Group. Data reduction enabled
File Systems can also reside within a NAS Server with File Systems with data reduction disabled.
Move can also be used on a storage resource with data reduction enabled to migrate the data to a resource
with data reduction enabled. This is also true for Advanced Deduplication enabled resources. During the
migration all data passes through the data reduction algorithm, and additional savings may be achieved. This
process is most often utilized when data has been written to a resource before data reduction was enabled,
and the user is looking to ensure all data is subject to the data reduction algorithms. If Dell EMC Unity
Compression was previously used, Move can be used to pass all data through the new space savings logic
instead of waiting for the data to be overwritten. For more information about Local LUN Move and any
restrictions of its usage, refer to the white paper titled Dell EMC Unity: Migration Technologies on Dell EMC
Online Support.
As with other features, Local LUN Move can be managed in Unisphere, Unisphere CLI, and REST API.
state within its Pool, whether it is deduplicated, compressed, or not. Data written after disabling data reduction
will be stored in its original form. As data that has been deduplicated or compressed is overwritten, data
reduction savings are reduced on the storage resource. To fully remove data reduction savings from a Block
storage resource, Local LUN Move can be utilized by specifying a non-data reduction enabled destination.
4 Management
Creating and managing Dell EMC Unity Data Reduction from Unisphere, Unisphere CLI, and REST API is
easy and intuitive. The following sections outline how to create a data reduction enabled storage resource,
how to enable and disable data reduction, how to determine the current state of data reduction on a resource,
and where data reduction savings are reported. Unisphere examples for each of these areas will be shown.
For more information about using the Unisphere CLI, refer to the Unisphere Command Line Interface User
Guide on Dell EMC Online Support. For information about managing Dell EMC Unity Data Reduction from
REST API, consult the REST API documentation which can be accessed directly from any Dell EMC Unity
system:
4.1.1 LUNs
In Unisphere, standalone Thin and Thick LUNs are created using the Create LUNs wizard, which can be
found on the Block page. The Block page can be accessed by selecting Block under Storage in the left
Unisphere pane. Figure 7 below shows the Create LUNs wizard. To create a data reduction enabled LUN or
multiple data reduction enabled LUNs, ensure the Thin and Data Reduction checkboxes are checked in the
Configure step. Once Data Reduction is enabled the Advanced Deduplication checkbox is shown and can
be enabled for resources which support it. After customizing the other settings of the LUN, click Next.
After configuring the other settings of the LUNs, the Summary step is displayed. An example of the
Summary step when creating multiple LUNs with data reduction and Advanced Deduplication enabled is
shown in Figure 8. The Summary screen includes the Name(s) of the LUNs being created, and if Data
Reduction and Advanced Deduplication will be enabled on the new LUNs.
Figure 9. Configure LUNs step within the Create a Consistency Group Wizard
After configuring LUNs within the Consistency Group, the Storage screen within the Create a Consistency
Group wizard is populated. An example of this window is shown in Figure 10. From this screen you can see if
the LUNs being created have data reduction and Advanced Deduplication enabled or not by reviewing the
Data Reduction and Advanced Deduplication columns. The Data Reduction and Advanced
Deduplication columns are not enabled by default, but can be added to the view by clicking the Gear Icon
and clicking the checkbox next to Data Reduction and Advanced Deduplication under the Columns option.
After configuring the other File System settings, the Summary step is displayed. This is the last step before
the creation of the File System. An example of the Summary step can be seen in Figure 12. The Summary
screen includes entries for Data Reduction and Advanced Deduplication, which signifies if the File System
will be configured with those features enabled (Yes) or disabled (No).
4.2.1 LUNs
To enable and disable data reduction and Advanced Deduplication on an existing LUN, review the properties
of the LUN from the Block page. The Block page can be accessed by selecting Block under Storage in the
left Unisphere pane. After double clicking the Name of the LUN, or after selecting a LUN and clicking the
Pencil (View/Edit) icon, the LUN Properties window is displayed. On the General tab, a checkbox for Data
Reduction exists. Depending on whether data reduction is currently disabled or enabled on the storage resource, the
box will be either unchecked or checked. To change the state of data reduction, simply check or uncheck the
Data Reduction box and click Apply. Advanced Deduplication is only available once Data Reduction is
enabled and the configuration supports it. Advanced Deduplication can be enabled or disabled independently
of the Data Reduction setting. To change the state of Advanced Deduplication, simply check or uncheck the
Advanced Deduplication box and click Apply.
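The dependencies between the Thin, Data Reduction, and Advanced Deduplication settings can be expressed as a short validation routine. The following Python sketch is illustrative only and does not call Unisphere, Unisphere CLI, or REST API. The LunSettings class, the validate function, and the advanced_dedup_supported flag are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LunSettings:
    """Requested settings for a LUN (hypothetical structure for illustration)."""
    thin: bool
    data_reduction: bool
    advanced_dedup: bool = False


def validate(settings, advanced_dedup_supported=True):
    """Return a list of problems with the requested settings; empty means valid.

    advanced_dedup_supported represents whether the system configuration
    supports Advanced Deduplication.
    """
    errors = []
    if settings.data_reduction and not settings.thin:
        errors.append("Data Reduction requires a Thin resource")
    if settings.advanced_dedup and not settings.data_reduction:
        errors.append("Advanced Deduplication requires Data Reduction to be enabled")
    if settings.advanced_dedup and not advanced_dedup_supported:
        errors.append("Advanced Deduplication is not supported on this configuration")
    return errors
```

A script that prepares settings changes in bulk could run a check of this kind before submitting them, so that unsupported combinations are caught early.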
Figure 14 below shows the Properties window of a LUN which currently has Data Reduction disabled. The
Data Reduction box has been selected to enable data reduction on the resource but Apply has not yet been
selected. When enabling data reduction on a resource, an informational message is displayed after checking
the Data Reduction box. A similar message appears when Data Reduction is enabled, and the Advanced
Deduplication box has been checked but Apply has not been selected. The message when enabling Data
Reduction states:
Only the newly written data will have data reduction applied. Existing data will remain unchanged.
Figure 15 below shows the Properties window of a LUN which currently has Data Reduction and Advanced
Deduplication enabled. The Advanced Deduplication box has been deselected to disable Advanced
Deduplication on the resource but Apply has not yet been selected. When disabling Advanced Deduplication
on a resource, an informational message is displayed after unchecking the Advanced Deduplication box. A
similar message appears when Data Reduction is disabled but Apply has not been selected. The message
when disabling Advanced Deduplication states:
Newly written data will not have advanced deduplication applied. Existing data will remain unchanged.
This message implies that the Local LUN Move option must be used to remove all Advanced Deduplication
savings from the existing data within the storage resource if desired. If the Local LUN Move option is not
utilized, then only overwrites to the LUN will cause Advanced Deduplication savings to decrease.
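The effect of these messages can be illustrated with a small model. The sketch below is not Dell EMC Unity code. The Lun class and its fields are hypothetical and exist only to mirror the rule stated above: a setting change applies to new writes and overwrites, and existing data is left as it is.

```python
from dataclasses import dataclass, field


@dataclass
class Lun:
    """Toy model of a LUN. All names are illustrative, not Unity internals."""
    data_reduction: bool = False
    # Maps block address -> True if the block was written while the
    # setting was enabled (that is, it was considered for reduction).
    blocks: dict = field(default_factory=dict)

    def write(self, address: int) -> None:
        # A write or overwrite is handled using the setting in effect now.
        self.blocks[address] = self.data_reduction

    def reduced_blocks(self) -> int:
        return sum(self.blocks.values())


lun = Lun(data_reduction=False)
for address in range(4):
    lun.write(address)                 # written while the setting is disabled

lun.data_reduction = True              # check the box and click Apply
after_enable = lun.reduced_blocks()    # existing data is unchanged

lun.write(0)                           # overwrite of existing data
lun.write(4)                           # new write
after_writes = lun.reduced_blocks()

lun.data_reduction = False             # uncheck the box and click Apply
after_disable = lun.reduced_blocks()   # savings remain in place
lun.write(0)                           # only an overwrite removes them
after_overwrite = lun.reduced_blocks()
```

In this model the count of reduced blocks changes only when a write occurs, never at the moment the setting is changed.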
After opening the Consistency Group Properties window, navigate to the LUNs tab. From this screen you can
view the current LUNs within the Consistency Group, and the current state of data reduction and Advanced
Deduplication if the Data Reduction and Advanced Deduplication columns are displayed. A sample of the
LUNs tab is shown in Figure 17. To edit the settings of one of the LUNs, double click the LUN or select the
LUN and click the Pencil icon.
The LUN Properties window is now shown. This is the same Properties window that is displayed for a LUN not currently
in a Consistency Group. An example of this window is shown in Figure 18. As mentioned previously, to
enable or disable data reduction or Advanced Deduplication on a LUN, check or uncheck the
Data Reduction or Advanced Deduplication box and click Apply. The new states of data reduction and
Advanced Deduplication are reflected in the LUNs tab within the Consistency Group Properties window.
Only the newly written data will have data reduction applied. Existing data will remain unchanged.
This message outlines that only new writes to the File System or overwrites to existing data within the File
System are considered for data reduction.
When reviewing the Properties window for a File System which does not support data reduction, no option to
enable data reduction will be shown. This is true for Thick File Systems, and any File Systems created on a
code version earlier than Dell EMC Unity OE version 4.2. Figure 20 below shows an example of the File System
Properties window for a Thick File System. No option for data reduction is shown.
Figure 20. File System Properties Window – Data Reduction Not Supported
Figure 21 below shows the Properties window of a File System which currently has data reduction enabled.
The Advanced Deduplication box has been deselected to disable Advanced Deduplication on the resource
but Apply has not yet been selected. When disabling Advanced Deduplication on a resource, an informational
message is displayed after unchecking the Advanced Deduplication box. The message states:
Newly written data will not have advanced deduplication applied. Existing data will remain unchanged.
This message implies that only overwrites to the File System will cause data reduction savings to decrease.
For VMware NFS Datastores, data reduction is only supported if the resource is Thin, resides on an All Flash
Pool, and is created while the system is running Dell EMC Unity OE version 4.2 or later. If the resource
supports data reduction, data reduction can be enabled or disabled at any time. If the resource is Thick, or
was created on an earlier code version, Unisphere will not display an option to enable data reduction. When
attempting to enable or disable data reduction via Unisphere CLI or REST API on a resource which does not
support data reduction, an error will be returned. Advanced Deduplication is only supported on
resources which support Data Reduction and currently reside on a configuration which supports Advanced
Deduplication.
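The three conditions above can be expressed as a single check. This is a minimal sketch for illustration only. The function names and parameters are hypothetical and do not correspond to any Unisphere CLI or REST API call.

```python
def version_tuple(version: str) -> tuple:
    """Turn a version string such as '4.2.1' into (4, 2, 1) for comparison."""
    return tuple(int(part) for part in version.split("."))


def nfs_datastore_supports_data_reduction(
    is_thin: bool, on_all_flash_pool: bool, created_on_oe: str
) -> bool:
    """Apply the three stated conditions: Thin, All Flash Pool, OE 4.2 or later."""
    return (
        is_thin
        and on_all_flash_pool
        and version_tuple(created_on_oe) >= (4, 2)
    )


thin_new = nfs_datastore_supports_data_reduction(True, True, "4.3")
thick_new = nfs_datastore_supports_data_reduction(False, True, "4.3")
thin_old = nfs_datastore_supports_data_reduction(True, True, "4.1.2")
thin_hybrid = nfs_datastore_supports_data_reduction(True, False, "4.2")
```

Comparing integer tuples rather than strings keeps a version such as 4.10 from sorting before 4.2.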
4.3.1 LUNs
To review the status of data reduction and Advanced Deduplication on each of the LUNs created on the
system, navigate to the Block page, which can be accessed by selecting Block under Storage in the left
Unisphere pane. This page contains three columns specific to data reduction. The columns are Data
Reduction, which shows if Data Reduction is enabled or not on the resource, Advanced Deduplication,
which shows if Advanced Deduplication is enabled, and Data Reduction Savings (GB), which shows the
amount of savings in GBs for the resource. To add these and other columns to the view, simply click the Gear
Icon in the top right portion of the LUNs tab and select the columns to add under the Columns option. An
example of this screen is shown in Figure 23.
Data reduction information has also been added to the quick properties view of the LUN tab on the Block
page. After selecting a LUN, the right portion of the screen is populated with more information about the
storage resource. In Figure 23, a data reduction enabled storage resource is selected. In the information
provided in the right portion of the screen, you can determine if Data Reduction and Advanced Deduplication
are enabled and the current Data Reduction Savings on the selected resource.
When reviewing the Data Reduction, Advanced Deduplication, and Data Reduction Savings (GB)
columns, the information provided depends on whether the storage resource supports data reduction. For
Thick File Systems or File Systems created on a code version prior to Dell EMC Unity OE version 4.2, -- is displayed
in the columns to denote that the storage resource does not support data reduction. For Thin File Systems
created on Dell EMC Unity OE version 4.2 or later, the Data Reduction column will display Yes or No
depending on whether data reduction is enabled. No is also displayed for File Systems that were created on Dell EMC
Unity OE version 4.2 or later but currently reside within a non-All Flash Pool. The Advanced Deduplication
column is available for configurations which support Advanced Deduplication and displays either Yes or No
depending on the current state of Advanced Deduplication. The Data Reduction Savings (GB) column
displays the amount of savings currently achieved within the File System. As data reduction savings are not
removed when disabling data reduction on a storage resource, data reduction may be disabled, but savings
still exist.
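The rules for the Data Reduction column can be summarized as a small decision function. The sketch below only restates the paragraph above. The function name and parameters are hypothetical and are not taken from Unisphere.

```python
def data_reduction_column(
    is_thin: bool,
    created_on_oe: tuple,
    on_all_flash_pool: bool,
    enabled: bool,
) -> str:
    """Return the text shown in the Data Reduction column for a File System.

    created_on_oe is the OE version at creation time as a tuple, e.g. (4, 2).
    """
    # Thick resources and resources created before OE 4.2 do not support
    # data reduction at all, which is shown as "--".
    if not is_thin or created_on_oe < (4, 2):
        return "--"
    # A supported resource on a non-All Flash Pool always shows "No".
    if not on_all_flash_pool:
        return "No"
    return "Yes" if enabled else "No"


thick = data_reduction_column(False, (4, 3), True, False)
old_code = data_reduction_column(True, (4, 1), True, False)
hybrid_pool = data_reduction_column(True, (4, 2), False, False)
disabled = data_reduction_column(True, (4, 3), True, False)
enabled = data_reduction_column(True, (4, 3), True, True)
```

Note that the Data Reduction Savings (GB) column is independent of this result, because savings can remain after data reduction is disabled.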
Data reduction information has also been added to the quick properties view of the File Systems tab on the
File page. After selecting a File System, the right portion of the screen is populated with more information
about the storage resource. In Figure 25, a data reduction enabled storage resource is selected. In the
information provided in the right portion of the screen, you can determine if Data Reduction and Advanced
Deduplication are enabled and the current Data Reduction Savings on the selected resource. This is an
easy way to review the current state of data reduction on a specific resource if the Data Reduction,
Advanced Deduplication, and Data Reduction Savings (GB) columns are not shown. The -- designation,
which is explained above, is also used when the storage resource selected does not support data reduction.
Reduction, which shows if data reduction is enabled or not on the resource, Advanced Deduplication,
which shows if Advanced Deduplication is enabled, and Data Reduction Savings (GB), which shows the
amount of savings in GBs for the resource. To add these and other columns to the view, simply click the Gear
Icon in the top right portion of the Datastores tab and select the columns to add under the Columns option. An
example of this screen is shown in Figure 26.
When reviewing the Data Reduction, Advanced Deduplication, and Data Reduction Savings (GB)
columns, the information provided depends on whether the storage resource supports data reduction. For
Thick NFS Datastores or NFS Datastores created on a code version prior to Dell EMC Unity OE version 4.2, -- is
displayed in the columns to denote that the storage resource does not support data reduction. For Datastores
which support data reduction, the Data Reduction column will display Yes or No depending on whether data
reduction is enabled. No is also displayed for Datastores which support data reduction but currently
reside within a non-All Flash Pool. The Advanced Deduplication column is available for configurations which
support Advanced Deduplication and displays either Yes or No depending on the current state of Advanced
Deduplication. The Data Reduction Savings (GB) column displays the amount of savings currently achieved
within the Datastore. As space savings are not removed when disabling data reduction on a storage
resource, data reduction may be disabled, but savings still exist.
Data reduction information is also added to the quick properties view of the Datastores tab on the VMware
page. After selecting a VMware Datastore, the right portion of the screen is populated with more information
about the storage resource. In Figure 26, a data reduction enabled storage resource has been selected. In
the information provided in the right portion of the screen, you can determine if Data Reduction and
Advanced Deduplication are enabled, and the current Data Reduction Savings on the selected resource.
This is an easy way to review the current state of data reduction on a specific resource if the Data Reduction,
Advanced Deduplication, and Data Reduction Savings (GB) columns are not shown. The -- designation,
which is explained above, is also used when the storage resource selected does not support data reduction.
Local LUN Move can also be leveraged to migrate a Block resource’s data to or from a resource with data
reduction and/or Advanced Deduplication enabled. When Move is utilized to migrate data to a resource with
data reduction enabled, as the data is migrated via the TDX data copy engine, all data will pass through the
data reduction logic. If Advanced Deduplication is supported and enabled, the data will also pass through the
Advanced Deduplication algorithm. This allows space savings to be achieved during the migration. When
migrating to a resource with data reduction disabled, all space savings achieved on the source will be
removed during the migration.
The Move option can be found on the Block page for LUNs. After selecting a storage resource, select the
More Actions drop-down list, then Move. This launches the Move dialog box. An example of the More
Actions drop-down list and the Move option can be found in Figure 27.
The Move LUN dialog box is shown in Figure 28 below. In this window the user will select the destination Pool
for the new storage resource. The user will also select if the resource will be Thin, and if it will have data
reduction and Advanced Deduplication enabled. The resource must be Thin for the data reduction option to
be available. The Move operation is completely transparent to the host. To start the Move operation, simply
select OK in the Move LUN dialog box.
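The dependencies between the options in the Move LUN dialog box can be sketched as a validation step. This is an illustration under stated assumptions, not the logic Unisphere runs. The function name, parameters, and message strings are all hypothetical. The rules themselves come from this paper: data reduction requires a Thin resource, and Advanced Deduplication requires data reduction and a supporting configuration.

```python
def validate_move_options(
    thin: bool,
    data_reduction: bool,
    advanced_dedup: bool,
    advanced_dedup_supported: bool,
) -> list:
    """Return a list of problems with the chosen options (empty if valid)."""
    problems = []
    if data_reduction and not thin:
        problems.append("Data Reduction requires a Thin resource")
    if advanced_dedup and not data_reduction:
        problems.append("Advanced Deduplication requires Data Reduction")
    if advanced_dedup and not advanced_dedup_supported:
        problems.append("Configuration does not support Advanced Deduplication")
    return problems


valid = validate_move_options(True, True, True, True)
thick_with_reduction = validate_move_options(False, True, False, True)
dedup_without_reduction = validate_move_options(True, False, True, True)
```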
Move can also be used for LUNs contained within a Consistency Group. The Move option for Consistency
Groups is found on the LUNs tab within the Properties window of a Consistency Group. An example of this is
shown in Figure 29. After selecting a LUN within the Consistency Group, select More Actions, then Move. A
dialog box similar to the one shown in Figure 28 is displayed. After selecting OK, the data is moved.
Move can also be used with VMware VMFS Datastores. For VMware VMFS Datastores, the Move option is
found on the Datastores tab of the VMware page. As shown in Figure 30, after selecting a VMware VMFS
Datastore, the Move option can be found under More Actions. After selecting Move, a dialog box similar to
Figure 28 is displayed.
File Systems also provide the same savings information as LUNs. Figure 32 below shows an example of the
Properties window of a File System. As with LUNs, the data reduction savings are reported in GBs, %
savings, and ratio.
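The relationship between the three reported values can be shown with a short calculation. The formulas Unisphere uses are not stated in this section, so the definitions below are assumptions based on the common convention: the percentage is the savings divided by the space the data would consume without data reduction, and the ratio is that same unreduced size divided by the space actually used. The function name and sample numbers are hypothetical.

```python
def savings_report(used_gb: float, saved_gb: float) -> dict:
    """Derive % savings and ratio from space used and GBs saved.

    Assumed definitions (not confirmed by this paper):
      percent = saved / (used + saved) * 100
      ratio   = (used + saved) / used
    """
    if used_gb <= 0:
        raise ValueError("used_gb must be greater than zero")
    unreduced_gb = used_gb + saved_gb
    return {
        "saved_gb": saved_gb,
        "percent": 100 * saved_gb / unreduced_gb,
        "ratio": unreduced_gb / used_gb,
    }


report = savings_report(used_gb=25.0, saved_gb=75.0)
summary = f"{report['saved_gb']:.1f} GB, {report['percent']:.0f}%, {report['ratio']:.1f}:1"
```

With these sample inputs, 75 GB saved against 25 GB used corresponds to 75% savings and a 4.0:1 ratio under the assumed definitions.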
VMware VMFS Datastores display data reduction savings in the same manner as LUNs and File Systems. In
the VMware VMFS Properties window on the General tab, the GBs saved, % savings, and ratio can be
viewed. An example of this screen can be found in Figure 33.
Figure 33. VMware VMFS Datastore Properties Window – Data Reduction Savings
VMware NFS Datastores display data reduction savings in the same manner as VMware VMFS Datastores.
In the VMware NFS Properties window on the General tab, the GBs saved, % savings, and ratio can be
viewed. An example of this screen can be found in Figure 34.
Figure 34. VMware NFS Datastore Properties Window – Data Reduction Savings
Figure 35. Pool Properties Window – Usage Tab – Data Reduction Savings
5 Interoperability
Dell EMC Unity Data Reduction is supported on standalone LUNs, LUNs contained within a Consistency
Group, File Systems, or VMware VMFS and NFS Datastores. All software features on a storage system are
supported with data reduction. Data reduction also supports Local LUN Move, which leverages
Transparent Data Transfer (TDX) technology, as well as offload data transfer operations sent to the system. The
following sections discuss certain features of the Dell EMC Unity storage system and how they
relate to data reduction.
More information about Data at Rest Encryption can be found in the Dell EMC Unity: Data at Rest Encryption
white paper found on Dell EMC Online Support.
5.2 Replication
Storage Resources utilizing data reduction can be replicated using any supported replication software, such
as Native Synchronous or Asynchronous Replication, to any supported destination system. All replicated data,
regardless of whether it is replicated locally or to a remote system, is first restored to its original size and then replicated to
the destination. This method of replicating data reduction enabled storage resources ensures that all
replication topologies are supported as if data reduction were not enabled on the resource. Replicating to
systems which do not support data reduction and/or Advanced Deduplication is also supported, such as
replicating to Dell EMC UnityVSA or a physical Dell EMC Unity system with a configuration which does not
support data reduction and/or Advanced Deduplication.
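The replication behavior described above can be sketched with a toy data path. This is an illustration only. Python's zlib module is used purely as a stand-in for the array's reduction logic, and the function names and block representation are hypothetical. The point is that the amount of data sent does not depend on the source's savings, and the destination applies its own setting.

```python
import zlib


def replicate(source_blocks, dest_data_reduction):
    """Copy blocks to a destination; returns (destination_blocks, bytes_sent).

    Each block is a (payload, is_reduced) tuple.
    """
    destination, bytes_sent = [], 0
    for payload, is_reduced in source_blocks:
        # The source restores the data to its original size first.
        original = zlib.decompress(payload) if is_reduced else payload
        bytes_sent += len(original)
        # The destination applies its own setting, independent of the source.
        if dest_data_reduction:
            destination.append((zlib.compress(original), True))
        else:
            destination.append((original, False))
    return destination, bytes_sent


def stored_bytes(blocks):
    return sum(len(payload) for payload, _ in blocks)


# Source has data reduction enabled; the sample data is highly compressible.
data = [b"unity" * 1000 for _ in range(4)]
source = [(zlib.compress(d), True) for d in data]

reduced_copy, sent_to_reduced = replicate(source, dest_data_reduction=True)
plain_copy, sent_to_plain = replicate(source, dest_data_reduction=False)
```

In both cases the full original size is sent, while only the destination with data reduction enabled stores less than that.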
Dell EMC Unity Data Reduction and Advanced Deduplication can also be enabled on only the source, only
the destination, or both the source and destination storage resources, depending on whether the system and Pool
configuration support Dell EMC Unity Data Reduction and/or Advanced Deduplication. This allows the user to
fully control where to implement data reduction. One example of a supported replication configuration is when
utilizing Asynchronous Local Replication. The source storage resource may reside on an All Flash Pool and
have data reduction enabled, but the destination may be on a large capacity Hybrid Pool which does not
support data reduction. Another example of a supported configuration is when replicating a storage resource
from a Dell EMC UnityVSA system or a production system not utilizing data reduction, to a storage resource
with data reduction and Advanced Deduplication enabled on a remote system. Replication can also occur
between a resource created before the system was running Dell EMC Unity OE version 4.4, and one after.
More information about Replication can be found in the Dell EMC Unity: Replication Technologies white paper
found on Dell EMC Online Support.
5.3 Snapshots
The Dell EMC Unity Snapshots feature is fully supported with data reduction and Advanced Deduplication.
Snapshots also benefit from the space savings achieved on the source storage resource. When taking a
Snapshot of a data reduction enabled storage resource, the data on the source may be compressed or
deduplicated. The data is left in its current state, and the Snapshot inherits the savings achieved on the
source storage resource.
When a snapshot is mounted and the source storage resource has data reduction enabled, data reduction is
also utilized on any snapshot I/O. If a read is received for a compressed or deduplicated block of data, the
data is restored to its original size and sent to the requestor. Savings can also be achieved on writes to a
snapshot. As write operations are received, if the source storage resource has data reduction enabled,
snapshot writes are also passed through the data reduction algorithms. These savings are tracked and reported
as part of the GBs saved for the source storage resource.
More information about Snapshots can be found in the Dell EMC Unity: Snapshots and Thin Clones white
paper found on Dell EMC Online Support.
Data reduction and Advanced Deduplication can be enabled or disabled on the source storage resource at
any time. Changing the data reduction setting also controls whether data reduction is enabled or disabled on all
Thin Clones residing on the storage resource. This is also true for the Advanced Deduplication setting. When
data reduction or Advanced Deduplication is enabled on the source, no existing data is changed unless
overwritten. When data reduction or Advanced Deduplication is disabled, all data is left in its current state.
While a Thin Clone exists for a storage resource, Local LUN Move is not available on the source. Also, Local
LUN Move is not available for use on a Thin Clone.
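The two Thin Clone rules above can be captured in a small model. The Resource class and its methods are hypothetical and are shown only to make the relationships explicit: the source's setting governs its Thin Clones, and Local LUN Move is unavailable on both a Thin Clone and a source that has one.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Toy storage resource. Names are illustrative, not Unity internals."""
    name: str
    data_reduction: bool = False
    is_thin_clone: bool = False
    thin_clones: list = field(default_factory=list)

    def set_data_reduction(self, enabled: bool) -> None:
        # The source setting also controls every Thin Clone of the resource.
        self.data_reduction = enabled
        for clone in self.thin_clones:
            clone.data_reduction = enabled

    def local_lun_move_available(self) -> bool:
        # Not available on a Thin Clone, or on a source with Thin Clones.
        return not self.is_thin_clone and not self.thin_clones


base = Resource("base_lun")
clone = Resource("clone_1", is_thin_clone=True)
base.thin_clones.append(clone)

base.set_data_reduction(True)
clone_follows_source = clone.data_reduction
move_on_base = base.local_lun_move_available()
move_on_clone = clone.local_lun_move_available()
```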
More information about Thin Clones can be found in the Dell EMC Unity: Snapshots and Thin Clones white
paper found on Dell EMC Online Support.
More information about Dell EMC Unity Native File and Block Import can be found in the Dell EMC Unity:
Migration Technologies white paper found on Dell EMC Online Support.
To expand and convert an All Flash Pool to a Hybrid Pool, all storage resources which have data reduction
enabled or have used data reduction must be removed from the Pool. For Block resources, Local LUN Move
can be used to move the resource’s data to another resource on the same Pool with data reduction disabled,
or to another Pool. When utilizing Local LUN Move to relocate a Block device within the same Pool, you must
ensure data reduction is disabled on the destination device. For File resources, the data must be migrated to
a new resource, either by leveraging Dell EMC Unity Asynchronous Replication or a host-based migration
tool. Once all resources which have utilized data reduction have been removed from the Pool, the expansion
will be allowed.
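The precondition for converting an All Flash Pool to a Hybrid Pool can be sketched as a check over the resources in the Pool. The class, field names, and sample resources below are hypothetical. In practice this information would come from the system itself rather than from hand-built records.

```python
from dataclasses import dataclass


@dataclass
class PoolResource:
    """Illustrative record for a storage resource in a Pool."""
    name: str
    data_reduction_enabled: bool
    has_used_data_reduction: bool  # True if it was ever enabled on the resource


def conversion_blockers(resources) -> list:
    """Return names of resources that must leave the Pool before expansion."""
    return [
        r.name
        for r in resources
        if r.data_reduction_enabled or r.has_used_data_reduction
    ]


pool = [
    PoolResource("lun_a", data_reduction_enabled=True, has_used_data_reduction=True),
    PoolResource("lun_b", data_reduction_enabled=False, has_used_data_reduction=True),
    PoolResource("fs_c", data_reduction_enabled=False, has_used_data_reduction=False),
]
blockers = conversion_blockers(pool)
expansion_allowed = not blockers
```

Note that lun_b blocks the conversion even though data reduction is currently disabled on it, because disabling the feature does not remove existing savings.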
More information about migration options can be found in the Dell EMC Unity: Migration Technologies white
paper found on Dell EMC Online Support.
6 Conclusion
Dell EMC Unity storage systems offer a powerful capacity efficiency feature which can improve the effective
capacity utilization of a Dell EMC Unity system. Data reduction is included with all physical Dell EMC Unity
systems at no additional cost. Advanced Deduplication is also supported at no additional cost on
configurations which support it. When data reduction is utilized, not only is space saved due to the storage
resources being Thin, but savings are achieved by utilizing intelligent zero detect, deduplication, and
compression algorithms as well. Dell EMC Unity Snapshots and Thin Clones also save space within the
system, which can greatly reduce the amount of storage needed for a dataset. By reducing the amount of
storage needed to store a dataset, Dell EMC Unity Data Reduction helps to further reduce the Total Cost of
Ownership of a Dell EMC Unity system.
Storage technical documents and videos provide expertise that helps to ensure customer success on Dell
EMC storage platforms.