Syrian Virtual University
BAIT-IOS202: Windows Platform-I
MODULE 10: IMPLEMENTING STORAGE
SPACES AND DATA DEDUPLICATION
Module Overview
Lesson 1: Implementing Storage Spaces
Lesson 2: Managing Storage Spaces
Lab A: Implementing Storage Spaces
Lesson 3: Implementing Data Deduplication
Lab B: Implementing Data Deduplication
Module Review and Takeaways
Module Overview
Windows Server 2016 includes a variety of advanced storage technologies that administrators can use to
equip servers with massive amounts of storage space, both inside and outside of the computer. These
technologies provide various fault tolerance mechanisms, which can maintain data availability when
equipment failures and other disasters occur.
This module describes how to use the following two features within your Windows Server storage
architecture:
Storage Spaces
Data Deduplication
Objectives
After completing this module, you will be able to:
Describe and implement Storage Spaces according to enterprise storage needs.
Manage and maintain Storage Spaces.
Describe and implement Data Deduplication.
Lesson 1: Implementing Storage Spaces
Managing direct-attached storage (DAS) on a server can be a tedious task for administrators. To
overcome this problem, many organizations use storage area networks (SANs) that group disks together.
However, SANs are expensive because they require special configuration, and sometimes special
hardware. To help overcome these storage issues, Windows Server 2016 includes a disk virtualization
technology called Storage Spaces, which enables a server to combine the storage space from individual
physical disks and allocate that space to create virtual disks of any size supported by the hardware.
Lesson Objectives
After completing this lesson, you will be able to:
Implement Storage Spaces as an enterprise storage solution.
Describe the Storage Spaces feature and its components.
Describe the features of Storage Spaces, including storage layout, drive allocation, and
provisioning schemes such as thin provisioning.
Enterprise storage needs
When planning your storage solution, you should consider the following factors:
Cost-effectiveness: in every organization, the overall cost of storage continues to grow with the
growth of enterprise data.
Capacity.
Performance.
Scalability: is the storage solution scalable, so that you can continue using it for a longer period?
Resiliency.
Supported capabilities:
o Mirror/parity support
o Data striping
o Enclosure awareness
o Storage tiering
o Storage replication
o Data deduplication
o Data encryption
o Performance analysis
A storage solution must balance cost, capacity, performance, and resiliency.
Question: Which factors should you consider when planning your enterprise storage strategy?
Question: What storage technologies does your organization use?
What are Storage Spaces?
Storage Spaces is a storage virtualization feature built into Windows Server 2016 and Windows 10. It
uses unallocated disk space on server drives to create storage pools.
Components
The Storage Spaces feature consists of two components:
Storage pools: a collection of physical disks aggregated into a single logical disk, so that you
can manage multiple physical disks as a single disk.
Storage spaces: the virtual disks created from free space in a storage pool.
To create a virtual disk, you need the following:
Physical disks: the number of physical disks required varies according to the type of disk you are
creating.
Object                                         Min physical disks
Storage pool                                   1
Resilient, mirrored virtual disk               2
Virtual disk with resiliency through parity    3
Three-way mirroring                            5
Storage pool: you can attach a physical disk to only one storage pool.
Virtual disk or storage space: allocates a portion of the storage pool (fixed or thin) as a disk.
Disk drive: the volume that you can access from the operating system.
Components and features of Storage Spaces
It is important to consider the following features when planning for virtual disks and configuring a
storage space.
Storage Spaces feature    Description
Storage layout            Simple
                          Two-way and three-way mirrors
                          Parity
Disk sector size          A storage pool's sector size is set at the moment the pool is created.
                          Default values:
                          o Only 512 and 512e drives in the pool: 512e
                          o At least one 4-kilobyte (KB) drive in the pool: 4 KB
Cluster disk requirement  For a pool to support failover clustering, all drives in the pool must
                          support Serial Attached SCSI (SAS).
Drive allocation          Data-store
                          Manual
                          Hot spare
Provisioning schemes      Thin provisioning space
                          Fixed provisioning space
Stripe parameters         You can increase the performance of a virtual disk by striping data across
                          multiple physical disks.
                          Stripe: one pass of data written to a storage space.
                          Columns: the number of disks across which a stripe is written.
                          Interleave: the amount of data written to a single column per stripe.
                          Stripe width = Columns * Interleave
                          You can set the number of columns and the interleave when you create a
                          virtual disk with the New-VirtualDisk cmdlet.
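The stripe width formula can be checked with a few lines of PowerShell. The sketch below assumes a hypothetical simple space with 4 columns and a 64 KB interleave. The pool name StoragePool1 and the disk name StripedDisk1 are placeholders and do not come from this module.

```powershell
# Illustrative values (assumptions, not recommendations).
$columns    = 4       # number of physical disks each stripe is written across
$interleave = 64KB    # bytes written to one column per stripe (65536)

# Stripe width = Columns * Interleave
$stripeWidth = $columns * $interleave
"Stripe width: {0} KB" -f ($stripeWidth / 1KB)

# Parameters that could be passed to New-VirtualDisk on a server that has
# a pool named StoragePool1 (hypothetical name).
$vdParams = @{
    StoragePoolFriendlyName = 'StoragePool1'
    FriendlyName            = 'StripedDisk1'
    ResiliencySettingName   = 'Simple'
    NumberOfColumns         = $columns
    Interleave              = $interleave
    UseMaximumSize          = $true
}
```

With these values the stripe width is 256 KB. On a server that actually has such a pool, running New-VirtualDisk @vdParams would create the disk with these stripe settings.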
Considerations
When planning for the storage space, consider the following factors:
Fault tolerance.
Performance.
Reliability.
Extensibility.
Demonstration: Configuring Storage Spaces
Demonstration Steps
Create a storage pool
Create a virtual disk and a volume
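The two demonstration steps could also be carried out in Windows PowerShell. This is a minimal sketch, assuming the server has at least two unused disks that can be pooled. The names StoragePool1 and MirroredDisk1 and the 10 GB size are illustrative choices, not values from the demonstration.

```powershell
# Step 1: create a storage pool from every disk that is eligible for pooling.
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "StoragePool1" `
    -StorageSubSystemFriendlyName "Windows Storage*" `
    -PhysicalDisks $disks

# Step 2a: create a thin-provisioned, mirrored virtual disk in the pool.
New-VirtualDisk -StoragePoolFriendlyName "StoragePool1" `
    -FriendlyName "MirroredDisk1" `
    -ResiliencySettingName Mirror `
    -ProvisioningType Thin `
    -Size 10GB

# Step 2b: initialize the new disk, create a partition, and format a volume.
Get-VirtualDisk -FriendlyName "MirroredDisk1" |
    Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -AssignDriveLetter -UseMaximumSize |
    Format-Volume -FileSystem NTFS
```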
Lesson 2: Managing Storage Spaces
This lesson explores how to use Storage Spaces to mitigate disk failure, to expand your storage pool, and
to use logs and performance counters to ensure the optimal behavior of your storage.
Lesson Objectives
After completing this lesson, you will be able to:
Describe how to create and manage Storage Spaces.
Explain how to use Storage Spaces to mitigate storage failure.
Explain how to expand your storage pool.
Describe how to use event logs and performance counters to monitor Storage Spaces.
Managing Storage Spaces
Windows Server 2016 provides many tools for managing storage spaces including:
Server Manager
Windows PowerShell
Failover Cluster Manager
System Center Virtual Machine Manager
Windows Management Instrumentation (WMI)
In this lesson, you will learn how to use Server Manager and PowerShell to manage storage spaces.
Manage Storage Spaces using Server Manager
Create storage pools
Remove physical disks from pools
Create, manage, and delete virtual disks
Manage Storage Spaces using Windows PowerShell
You can use the following cmdlets to manage existing storage pools.
Windows PowerShell cmdlet                                  Description
Get-StoragePool                                            Lists storage pools.
New-StoragePool                                            Creates a new storage pool.
Get-VirtualDisk                                            Lists virtual disks.
Repair-VirtualDisk                                         Repairs a virtual disk.
Get-PhysicalDisk | Where {$_.HealthStatus -ne "Healthy"}   Lists unhealthy physical disks.
Reset-PhysicalDisk                                         Resets the status of a physical disk. (To remove a
                                                           physical disk from a storage pool, use
                                                           Remove-PhysicalDisk.)
Get-VirtualDisk | Get-PhysicalDisk                         Lists physical disks that are used for a virtual disk.
Optimize-Volume                                            Optimizes a volume, performing such tasks on supported
                                                           volumes and system SKUs as defragmentation, trim, slab
                                                           consolidation, and storage tier processing.
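Several of the cmdlets in the table can be combined into a quick health check. The following is a sketch only; it assumes at least one storage pool already exists on the server, and it repairs every virtual disk that does not report as healthy.

```powershell
# List non-primordial storage pools with their health.
Get-StoragePool -IsPrimordial $false |
    Select-Object FriendlyName, HealthStatus, OperationalStatus

# List virtual disks with their resiliency type and health.
Get-VirtualDisk |
    Select-Object FriendlyName, ResiliencySettingName, HealthStatus, OperationalStatus

# List physical disks that are not healthy.
Get-PhysicalDisk | Where-Object { $_.HealthStatus -ne "Healthy" }

# Start a repair on any virtual disk that is not healthy.
Get-VirtualDisk |
    Where-Object { $_.HealthStatus -ne "Healthy" } |
    Repair-VirtualDisk
```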
Recommendations
Don't allocate all available SSD capacity for your storage spaces immediately.
Don't pin files to storage tiers until you see how well Storage Tiers Optimization can optimize
storage performance.
Do consider pinning the parent VHDX file to the SSD tier if you're providing pooled desktops
through VDI.
Managing disk failure with Storage Spaces
Before deploying storage spaces, you should plan how to handle disk failures with minimal impact on
services and minimal risk of data loss. Consider the following:
Design a complete, fault-tolerant storage solution depending on the fault level you want to
tolerate.
Deploy a highly available storage pool using clustered storage pools.
You can import and mount a storage pool on another server if the system fails.
Choose compatible and similar disks to build the storage space and keep firmware updated.
Unless you have enabled hot spares, retire missing disks automatically by changing the
RetireMissingPhysicalDisks policy to Always.
Replace the physical disk before you remove the failed drive from the storage pool. Removing the
drive from the pool first can initiate the repair process, which can result in data loss.
Keep unallocated disk space in the pool for virtual disk repairs instead of using hot spares.
Be prepared for multiple disk failures.
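The retirement policy and the replace-then-remove order described above might look as follows in Windows PowerShell. This is a sketch under stated assumptions: the pool is named StoragePool1 (a placeholder), exactly the failed disks report an operational status other than OK, and a replacement disk has already been attached to the server.

```powershell
# Retire missing disks automatically (for pools without hot spares).
Set-StoragePool -FriendlyName "StoragePool1" -RetireMissingPhysicalDisks Always

# Identify the failed disk(s) and mark them as retired.
$failed = Get-PhysicalDisk | Where-Object { $_.OperationalStatus -ne "OK" }
$failed | Set-PhysicalDisk -Usage Retired

# Add the replacement disk to the pool BEFORE removing the failed one.
$replacement = Get-PhysicalDisk -CanPool $true
Add-PhysicalDisk -StoragePoolFriendlyName "StoragePool1" -PhysicalDisks $replacement

# Repair the virtual disks and watch the repair jobs until they complete.
Get-VirtualDisk | Repair-VirtualDisk
Get-StorageJob

# Only after the repair has finished, remove the failed disk from the pool.
Remove-PhysicalDisk -StoragePoolFriendlyName "StoragePool1" -PhysicalDisks $failed
```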
Expand storage pools
One of the advantages of Storage Spaces is that you can expand a storage pool at any time by adding
physical disks.
Demonstration: Managing Storage Spaces by using Windows PowerShell
Demonstration Steps
View the properties of a storage pool
Add physical disks to a storage pool
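A possible PowerShell version of these two steps is sketched below. The pool name StoragePool1 is a placeholder, and the sketch assumes that one or more new, unpooled disks are attached to the server.

```powershell
# View the properties of the storage pool.
Get-StoragePool -FriendlyName "StoragePool1" |
    Format-List FriendlyName, Size, AllocatedSize, HealthStatus, OperationalStatus

# Add every disk that is currently eligible for pooling.
$newDisks = Get-PhysicalDisk -CanPool $true
Add-PhysicalDisk -StoragePoolFriendlyName "StoragePool1" -PhysicalDisks $newDisks

# Confirm that the pool now contains the new disks.
Get-StoragePool -FriendlyName "StoragePool1" | Get-PhysicalDisk
```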
Monitoring storage behavior
It is important that you monitor storage behavior and function.
Using the Event Log
You can use the event log to identify errors related to Storage Spaces, including the following common
event IDs:
Event ID        Possible causes
100, 300, 301   Failure to read configuration:
                - Physical drive I/O failure.
                - Corrupted configuration data.
                - Insufficient memory resources on the physical drive.
102, 302        Failure to update configuration data:
                - Physical drive I/O failure.
                - Insufficient number of physical drives online.
                - Insufficient memory resources on the physical drive.
103             Capacity consumption has exceeded the threshold limit.
200             Windows was unable to read the drive header for a physical drive.
201, 202        The metadata on a physical drive has become corrupt.
203             An I/O failure occurred on a physical drive.
303             A drive in the storage pool has failed or been removed.
304             One or more drives hosting data for a storage space have failed or are missing.
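These events can also be queried from Windows PowerShell. The sketch below assumes that the events are written to the Microsoft-Windows-StorageSpaces-Driver/Operational log; verify the log name on your server (for example with Get-WinEvent -ListLog *StorageSpaces*) before relying on it.

```powershell
# Assumption: Storage Spaces events are recorded in this operational log.
$logName = 'Microsoft-Windows-StorageSpaces-Driver/Operational'

# Event IDs taken from the table above.
$eventIds = 100, 102, 103, 200, 201, 202, 203, 300, 301, 302, 303, 304

# Show the 50 most recent matching events.
Get-WinEvent -FilterHashtable @{ LogName = $logName; Id = $eventIds } -MaxEvents 50 |
    Select-Object TimeCreated, Id, LevelDisplayName, Message
```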
Performance monitoring
Storage architecture has an impact on storage performance. In Windows Server 2016, you can use either
Performance Monitor or Windows PowerShell to monitor storage performance.
You can use Windows PowerShell to generate and collect performance data about your storage space.
For example, the following command:
Measure-StorageSpacesPhysicalDiskPerformance -StorageSpaceFriendlyName StorageSpace1 -MaxNumberOfSamples 60 -SecondsBetweenSamples 2 -ReplaceExistingResultsFile -ResultsFilePath StorageSpace1.blg -SpacetoPDMappingPath PDMap.csv
does the following:
Monitors the performance of all physical disks in the storage space named StorageSpace1.
Captures up to 60 samples at 2-second intervals (about 2 minutes of data).
Stores the performance log in the file named StorageSpace1.blg, and the physical disk mapping
information in a file named PDMap.csv.
Lab A: Implementing Storage Spaces
Objectives
After completing this lab, you will be able to:
Create a storage space.
Enable and configure storage tiering.
Exercise 1: Creating a Storage Space
Task 1: Create a storage pool from six disks that are attached to the server
Task 2: Create a three-way mirrored virtual disk (requires at least five physical disks)
Task 3: Copy a file to the volume, and verify it is visible in File Explorer
Task 4: Remove a physical drive to simulate drive failure
Task 5: Verify that the file is still available
Task 6: Add a new disk to the storage pool and remove the broken disk
Exercise 2: Enabling and configuring storage tiering
Task 1: Use the Get-PhysicalDisk cmdlet to view all available disks on the system
Task 2: Create a new storage pool
Task 3: View the media types
Task 4: Specify the media type for the sample disks and verify that the media type is changed
Task 5: Create pool-level storage tiers by using Windows PowerShell
Task 6: Create a new virtual disk with storage tiering by using the New Virtual Disk Wizard
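The tasks in Exercise 2 can be sketched in Windows PowerShell as follows. All names (TieredPool, PhysicalDisk1, PhysicalDisk2, SSD_Tier, HDD_Tier, TieredDisk1) and the tier sizes are hypothetical. Task 6 in the lab uses the New Virtual Disk Wizard; the last command is an assumed PowerShell equivalent, not the lab's own step.

```powershell
# Task 1: view all available disks.
Get-PhysicalDisk | Select-Object FriendlyName, MediaType, CanPool, Size

# Task 2: create a new storage pool from the poolable disks.
New-StoragePool -FriendlyName "TieredPool" `
    -StorageSubSystemFriendlyName "Windows Storage*" `
    -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

# Task 3: view the media types of the disks in the pool.
Get-StoragePool -FriendlyName "TieredPool" | Get-PhysicalDisk |
    Select-Object FriendlyName, MediaType

# Task 4: set the media type on sample disks, then verify the change.
Set-PhysicalDisk -FriendlyName "PhysicalDisk1" -MediaType SSD
Set-PhysicalDisk -FriendlyName "PhysicalDisk2" -MediaType HDD
Get-PhysicalDisk | Select-Object FriendlyName, MediaType

# Task 5: create pool-level storage tiers.
$ssdTier = New-StorageTier -StoragePoolFriendlyName "TieredPool" -FriendlyName "SSD_Tier" -MediaType SSD
$hddTier = New-StorageTier -StoragePoolFriendlyName "TieredPool" -FriendlyName "HDD_Tier" -MediaType HDD

# Task 6 (PowerShell alternative to the wizard): create a tiered virtual disk.
New-VirtualDisk -StoragePoolFriendlyName "TieredPool" -FriendlyName "TieredDisk1" `
    -StorageTiers $ssdTier, $hddTier -StorageTierSizes 2GB, 8GB `
    -ResiliencySettingName Simple
```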
Lesson 3: Implementing Data Deduplication
Data Deduplication is a role service of Windows Server 2016 that conserves storage space on an NTFS
volume by locating redundant data and storing only one copy of that data instead of multiple copies.
This lesson explains how to implement this service to achieve the ultimate goals of storing more data
and using less physical disk space.
Lesson Objectives
After completing this lesson, you will be able to:
Describe Data Deduplication in Windows Server 2016.
Identify Data Deduplication components in Windows Server 2016.
Explain how to deploy Data Deduplication.
Explain how to monitor and maintain data deduplication.
What is Data Deduplication?
The goal of Data Deduplication is to store more data in less space. To achieve this goal, it provides
the following:
Capacity optimization: Data Deduplication is more efficient than NTFS compression.
Scale and performance: in Windows Server 2016, Data Deduplication can run multiple threads in
parallel by using multiple I/O queues on multiple volumes simultaneously without affecting other
workloads on the server.
Reliability and data integrity: Data Deduplication ensures data integrity by using checksum,
consistency, and identity validation. It also maintains redundancy so that data can be repaired
if corruption occurs.
Bandwidth efficiency with BranchCache when transferring data over the WAN to a branch office.
Optimization management with familiar tools such as Server Manager and Windows PowerShell.
Data Deduplication operates at the volume level rather than the file level. After deduplication, files
are no longer stored as independent streams of data; they are replaced with stubs that point to data
blocks stored in a common chunk store. Shared blocks are stored only once, which reduces the space
needed to store all files.
Enhancements to the Data Deduplication role service
Compared to Windows Server 2012 and 2012 R2, Windows Server 2016 has several improvements,
including the following:
Support for volume sizes up to 64 TB (compared with 10 TB or less in Windows Server 2012).
Support for file sizes up to 1 TB.
Simplified deduplication configuration for virtualized backup applications.
Volume requirements for Data Deduplication
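You can enable Data Deduplication on a volume that:
Is not a system or boot volume.
Uses the MBR or GUID partition table (GPT) partition style.
Is formatted with the NTFS file system (ReFS volumes are not supported for Data Deduplication in
Windows Server 2016).
Is attached to the Windows Server (not a removable or remote disk).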
Also note the following:
Files with extended attributes, encrypted files, files smaller than 32 KB, and reparse point files
will not be processed for Data Deduplication.
Data Deduplication is not available for Windows client operating systems.
Data Deduplication components
Data Deduplication has many components:
Filter driver: monitors local and remote I/O.
Deduplication service: controls the following job types:
o Optimization: deduplicates and compresses data.
o Garbage Collection: cleans up data chunks that became unreferenced after file deletion.
o Scrubbing: analyzes chunk store corruption logs and makes repairs.
o Unoptimization: reverses deduplication, for example when decommissioning a server with
volumes enabled for Data Deduplication.
Data Deduplication process
The Data Deduplication process is carried out by optimization jobs that run against the files on the
volume. An optimization job performs the following actions:
Segments all file data on the volume into small chunks (32 KB to 128 KB).
Identifies chunks that are duplicated.
Inserts chunks into a common chunk store.
Replaces all duplicate chunks with a reference.
Replaces the original files with a reparse point.
Compresses chunks and organizes them.
Removes the primary data stream of the files.
You can run the process through scheduled tasks or interactively by using Windows PowerShell. After
the process, the volume can contain the following elements:
- Unoptimized files.
- Optimized files.
- Chunk store.
- Additional free space.
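The idea of a common chunk store can be illustrated with a toy model. The sketch below is not how the Windows feature is implemented: it assumes fixed 4-character chunks and two made-up files, and it uses a dictionary in place of the chunk store and a hash table in place of the reparse points. The function Split-IntoChunks and all variable names are hypothetical.

```powershell
# Toy model only: this is NOT how the Windows feature is implemented.
# Assumption: fixed 4-character chunks keep the example readable; the real
# service works on chunks of roughly 32 KB to 128 KB.
function Split-IntoChunks {
    param([string]$Text, [int]$ChunkSize)
    for ($i = 0; $i -lt $Text.Length; $i += $ChunkSize) {
        $Text.Substring($i, [Math]::Min($ChunkSize, $Text.Length - $i))
    }
}

# Two hypothetical files that share their first eight characters.
$files = [ordered]@{
    'a.txt' = 'AAAABBBBCCCC'
    'b.txt' = 'AAAABBBBDDDD'
}

# Case-sensitive dictionary standing in for the common chunk store.
# Key = chunk data, value = number of references to that chunk.
$chunkStore = [System.Collections.Generic.Dictionary[string, int]]::new()
$stubs      = @{}   # file name -> ordered list of chunk references

$logicalSize = 0
foreach ($name in $files.Keys) {
    $content      = $files[$name]
    $logicalSize += $content.Length
    $refs = @()
    foreach ($chunk in (Split-IntoChunks -Text $content -ChunkSize 4)) {
        # Store each distinct chunk once; duplicates only add a reference.
        if (-not $chunkStore.ContainsKey($chunk)) { $chunkStore[$chunk] = 0 }
        $chunkStore[$chunk] += 1
        $refs += $chunk
    }
    $stubs[$name] = $refs
}

# Space used by the chunk store: each unique chunk counted once.
$storedSize = 0
foreach ($chunk in $chunkStore.Keys) { $storedSize += $chunk.Length }

"Logical size: $logicalSize, stored size: $storedSize, unique chunks: $($chunkStore.Count)"
```

In this model the two files hold 24 characters in total, but the chunk store holds only 16, because the chunks AAAA and BBBB are stored once and referenced by both files. Each file can still be rebuilt by joining the chunks listed in its stub.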
Deploying Data Deduplication
Planning a Data Deduplication deployment
Plan your Data Deduplication deployment by using the following steps:
Identify target deployments. Data Deduplication is best suited to the following data types:
o General file shares.
o Software deployment shares (software setup binaries and update files).
o VHD libraries.
Determine which volumes are candidates for deduplication, considering the following:
o Is duplicated data present?
o Does the data access pattern allow time for deduplication?
o Does the server have the time and resources to run deduplication?
Evaluate savings with the Deduplication Evaluation Tool (DDPEval.exe).
Plan the rollout, scalability, and deduplication policies, considering the following:
o Do you want to process incoming data sooner? Then make the MinimumFileAgeDays
setting smaller.
o Add directories that you do not want to deduplicate to the exclusion list.
o Add file types that you do not want to deduplicate to the exclusion list.
o Update the schedules for Garbage Collection and Scrubbing according to the off-peak
hours of your servers.
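The policy decisions in the last planning step map to volume and schedule settings. This is a sketch only, assuming that Data Deduplication is already enabled on a volume E: (a placeholder). The excluded folder, the file extensions, and the schedule name WeeklyGarbageCollection are assumptions; run Get-DedupSchedule first to see the schedule names that exist on your server.

```powershell
# Process new files sooner: optimize files that are at least 1 day old.
Set-DedupVolume -Volume "E:" -MinimumFileAgeDays 1

# Exclude a folder and some file types from deduplication (example values).
Set-DedupVolume -Volume "E:" -ExcludeFolder "E:\Databases" -ExcludeFileType "mdf", "ldf"

# List the existing schedules, then move garbage collection to off-peak hours.
Get-DedupSchedule
Set-DedupSchedule -Name "WeeklyGarbageCollection" -Start 02:00 -Days Saturday
```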
Installing and configuring Data Deduplication
Install Data Deduplication components on the server by using:
o Server Manager: Add Roles and Features Wizard
o Windows PowerShell:
Import-Module ServerManager
Add-WindowsFeature -Name FS-Data-Deduplication
Import-Module Deduplication
Enable Data Deduplication
Configure Data Deduplication jobs:
o Optimization
Start-DedupJob -Volume VolumeLetter -Type Optimization
o Data Scrubbing
Start-DedupJob -Volume VolumeLetter -Type Scrubbing
o Garbage Collection
Start-DedupJob -Volume VolumeLetter -Type GarbageCollection
o Unoptimization
Start-DedupJob -Volume VolumeLetter -Type Unoptimization
Configure Data Deduplication schedules
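The enable and schedule steps in this list have no command shown, so a possible form is sketched here. The volume E:, the usage type, and the schedule name NightlyOptimization are illustrative assumptions, and the role service must already be installed.

```powershell
# Enable Data Deduplication on the volume for general-purpose file server use.
Enable-DedupVolume -Volume "E:" -UsageType Default

# Start an optimization job now and check the job queue.
Start-DedupJob -Volume "E:" -Type Optimization
Get-DedupJob

# Create a custom schedule: optimize on weeknights from 23:00 for up to 6 hours.
New-DedupSchedule -Name "NightlyOptimization" -Type Optimization `
    -Start 23:00 -DurationHours 6 `
    -Days Monday, Tuesday, Wednesday, Thursday, Friday
```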
Demonstration: Implementing Data Deduplication
Demonstration Steps
Install the Data Deduplication role service
Enable Data Deduplication
Check the status of Data Deduplication
Monitoring and maintaining Data Deduplication
It is important that you monitor and maintain the systems that are enabled for Data Deduplication to
ensure optimal performance.
Monitoring and reporting of Data Deduplication
Windows Server 2016 provides the following tools to monitor Data Deduplication:
Windows PowerShell cmdlets
o Get-DedupStatus: returns the deduplication status for a volume, including the rate,
the number and size of optimized files, the last run time, and the amount of space saved on
the volume.
o Get-DedupVolume: returns the deduplication metadata, including the rate, the number and size
of optimized files, and settings such as minimum file age, minimum file size, excluded
files and folders, and chunk redundancy threshold.
o Get-DedupMetadata: returns the status information of the data store.
o Get-DedupJob: returns status and information about running or queued deduplication
jobs.
Event Viewer logs: in Applications and Services Logs\Microsoft\Windows\Deduplication.
Performance Monitor data: the Disk Read Bytes/sec, Disk Write Bytes/sec, and Avg. Disk
sec/Transfer counters can be used to monitor the throughput rates of currently running jobs.
File Explorer: compare Size and Size on disk to check deduplication on individual files.
Server Manager.
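The PowerShell, event log, and performance counter checks above could be run as follows. This is a sketch, assuming a deduplicated volume E: (a placeholder) and assuming that the operational log is named Microsoft-Windows-Deduplication/Operational.

```powershell
# Savings and file counts for the volume.
Get-DedupStatus -Volume "E:" | Format-List

# Volume-level settings and savings rate.
Get-DedupVolume -Volume "E:" | Format-List

# Status of the deduplicated data store.
Get-DedupMetadata -Volume "E:"

# Running or queued deduplication jobs.
Get-DedupJob

# Most recent deduplication events (log name is an assumption, see above).
Get-WinEvent -LogName "Microsoft-Windows-Deduplication/Operational" -MaxEvents 20

# Disk throughput while a job is running: 5 samples, 2 seconds apart.
Get-Counter -Counter '\PhysicalDisk(_Total)\Disk Read Bytes/sec',
                     '\PhysicalDisk(_Total)\Disk Write Bytes/sec' `
            -SampleInterval 2 -MaxSamples 5
```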
Maintaining Data Deduplication
To keep deduplication efficient, you can use the following Windows PowerShell cmdlets:
Update-DedupStatus: updates the cached metadata.
Start-DedupJob: starts an on-demand deduplication job.
Measure-DedupFileMetadata: estimates the disk space that you can reclaim if you delete some
folders.
Expand-DedupFile: expands an optimized file back to its original location, for example when an
application is not compatible with deduplicated files.
Troubleshooting adverse effects of Data Deduplication
When Data Deduplication affects the performance of your system, consider the following options:
Use a different deduplication frequency.
Use job options such as:
o StopWhenSystemBusy
o Preempt
o ThrottleLimit
o Priority
o Memory
Expand specific files if needed (Expand-DedupFile).
Undo deduplication on the volume (Start-DedupJob -Volume VolumeLetter -Type Unoptimization).
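The job options listed above are parameters of Start-DedupJob. The following is a minimal sketch with example values; the volume E:, the 25 percent memory limit, and the file path are assumptions chosen for illustration.

```powershell
# Run optimization at low priority, limit it to 25 percent of memory,
# and let it stop when the system is busy.
Start-DedupJob -Volume "E:" -Type Optimization -Priority Low -Memory 25 -StopWhenSystemBusy

# Expand a single file that an application cannot read in deduplicated form.
Expand-DedupFile -Path "E:\Shares\App\data.bin"

# As a last resort, reverse deduplication for the whole volume.
Start-DedupJob -Volume "E:" -Type Unoptimization
```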
Lab B: Implementing Data Deduplication
Objectives
After completing this lab, you will be able to:
Install the Data Deduplication role service.
Enable Data Deduplication.
Check the status of Data Deduplication.
Exercise 1: Installing Data Deduplication
Task 1: Install the Data Deduplication role service
Task 2: Check the status of Data Deduplication
Task 3: Verify the virtual machine performance
Exercise 2: Configuring Data Deduplication
Task 1: Configure Data Deduplication
Task 2: Configure optimization to run now and view the status
Task 3: Verify if the file has been optimized
Task 4: Verify VM performance again
Module Review and Takeaways
Review Questions
Question: You attach five 2-TB disks to your Windows Server 2012 computer. You want to simplify
the process of managing the disks. In addition, you want to ensure that if one disk fails, the failed disk’s
data is not lost. What feature can you implement to accomplish these goals?
Question: Your manager has asked you to consider the use of Data Deduplication within your storage
architecture. In what scenarios is the Data Deduplication role service particularly useful?