
POWERSCALE

ADMINISTRATION

COURSE GUIDE

(V3)

PARTICIPANT GUIDE
Internal Use - Confidential
PowerScale Administration-SSP1

Internal Use - Confidential


© Copyright 2020 Dell Inc. Page i
Table of Contents

NAS, PowerScale, and OneFS
    Network Attached Storage
    PowerScale
    PowerScale Management Interfaces
    Common Cluster Operations
    OneFS Directory Structure

Configuring the Foundation for Access
    Authentication Providers
    Access Zones
    Groupnets
    Subnet - SmartConnect Zones
    IP Address Pools

Configuring Identity Management and Authorization
    Role-Based Access Control
    User Identity Mapping
    Authorization

Configuring Client Access to Data
    OneFS Caching
    SMB Shares
    NFS Exports
    S3 Buckets
    HDFS and Swift

Foundations of Data Protection and Data Layout
    File Striping
    Data Protection
    Protection Management
    Data Layout

Configuring Storage Pools
    Storage Pools
    File Pools
    SmartPools
    CloudPools

Configuring Data Services
    File Filtering
    SmartQuotas
    SmartDedupe
    SnapshotIQ
    SyncIQ
    SmartLock

Monitoring Tools
    PowerScale HealthCheck
    InsightIQ
    DataIQ v1
    isi statistics

Appendix

Glossary


NAS, PowerScale, and OneFS


Network Attached Storage

Scenario

IT Manager: You are responsible for the administration and
management of the PowerScale cluster. We have a newly installed cluster
that is powered on. The cluster has initial IP addresses and DNS
configured. Now, before you jump in and start exploring its capabilities, I
want you to explain a few things.

Your Challenge: The IT manager wants you to compare
PowerScale to traditional NAS platforms, and describe scale-up and
scale-out architecture.

Storage Technologies

DAS

In the early days of system data, corporations¹ stored data on hard drives in a
server. To minimize risk, corporations mirrored the data using RAID. This technique
is called Direct Attached Storage (DAS).

¹ The intellectual property of the company depended entirely upon that hard drive's
continued functionality.


Graphic showing the DAS layer: RAID.

SAN

As applications proliferated, there were soon many servers, each with its own DAS.
This worked, but with some drawbacks². Because of these limitations of DAS, the
storage area network (SAN) was introduced, which made effective use of a volume
manager and RAID.

Graphic showing the SAN layers: volume manager and RAID.

NAS

SAN was set up for servers, not personal computers³ (PCs).

² If one server’s DAS was full while another server’s DAS was half empty, the
empty DAS could not share its space with the full DAS.

³ PCs worked differently from the storage file server: network communication for
PCs takes place only from one file system to another file system.


The breakthrough came when corporations put employee computers on the
network, and added a file system to the storage to communicate with users.

From this, Network Attached Storage (NAS) was born.

NAS works well, but there is room for improvement⁴.

Graphic showing the NAS layers: file system, volume manager, and RAID.

CAS

Content Addressed Storage (CAS) is object-based storage that separates metadata
from its objects. CAS has three components: the Clip Descriptor File (CDF), the
object, and the metadata. The CDF contains addresses that point to the object data
and the metadata.
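
The relationship between a CDF, its object, and its metadata can be illustrated with a minimal Python sketch. It assumes that a content address is the SHA-256 digest of the stored bytes and that the descriptor is a small JSON record. The ContentStore class, its method names, and the record layout are hypothetical and do not describe the on-disk format of any actual CAS product.

```python
import hashlib
import json


class ContentStore:
    """Toy content-addressed store: each blob is keyed by the hash of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself (assumption: SHA-256).
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]

    def write_object(self, data: bytes, metadata: dict) -> str:
        # Object data and metadata are stored separately. A descriptor,
        # standing in for the CDF, holds the addresses of both.
        object_address = self.put(data)
        metadata_address = self.put(json.dumps(metadata, sort_keys=True).encode())
        cdf = {"object": object_address, "metadata": metadata_address}
        return self.put(json.dumps(cdf, sort_keys=True).encode())

    def read_object(self, cdf_address: str):
        # Follow the addresses in the descriptor to the data and the metadata.
        cdf = json.loads(self.get(cdf_address))
        data = self.get(cdf["object"])
        metadata = json.loads(self.get(cdf["metadata"]))
        return data, metadata
```

In this sketch, writing the same data and metadata twice returns the same descriptor address, because every address depends only on content.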

⁴ The server is spending as much time servicing employee requests as it is doing
the application work it was meant for. The file system does not know where data is
supposed to go, because that is the volume manager’s job. The volume manager
does not know how the data is protected; that is RAID’s job. If high-value data
needs more protection than other data, you need to migrate the data to a different
volume that has the protection level that data needs. So there is opportunity to
improve NAS.


Cloud

Cloud storage stores data with a cloud provider over the Internet. The cloud provider
manages and protects the data. Typically, cloud storage is delivered on demand
with just-in-time capacity and costs.

NAS Overview

NAS provides the advantages of server consolidation by eliminating the need for
multiple file servers.

• Consolidates the storage that is used by the clients onto a single system,
making it easier to manage the storage.
• Uses network and file-sharing protocols to provide access to the file data⁵.
• Uses its own operating system⁶ and integrated hardware and software
components to meet specific file-service needs.

Scale-Up versus Scale-Out Architecture

PowerScale clusters are a NAS solution. There are two types of NAS architectures:
scale-up and scale-out.

⁵ NAS enables both UNIX and Microsoft Windows users to share the same data
seamlessly.

⁶ Its operating system is optimized for file I/O and, therefore, performs file I/O better
than a general-purpose server. As a result, a NAS device can serve more clients
than general-purpose servers and provide the benefit of server consolidation.


Scale-Up

• With a scale-up platform, if more storage is needed, another independent NAS
system is added to the network.
• A scale-up solution has controllers that connect to trays of disks and provide the
computational throughput.
• Traditional NAS is great for specific types of workflows, especially those
applications that require block-level access.

Graphic highlighting adding controllers for a scale-up solution. Labels: controller
with disk shelves; independent systems on the network with separate points of
management; clients; structured or unstructured storage; additional storage, usually
restricted to tens or hundreds of TBs.

Scale-Out

• With a clustered NAS solution, or scale-out architecture, all the NAS boxes, or
PowerScale nodes, belong to a unified cluster with a single point of
management.
• In a scale-out solution, the computational throughput, disks, disk protection, and
management are combined and exist for a single cluster.


Graphic highlighting adding nodes for a scale-out solution. Labels: unstructured
storage; PowerScale cluster; 1000+ PBs; up to 252 nodes; clients; adding storage
adds compute and bandwidth.

Scale-Out NAS

Scale-out NAS⁷ is now a mainstay in most data center environments. The next
wave of scale-out NAS innovation has enterprises embracing the value⁸ of NAS
and adopting it as the core of their infrastructure.

⁷ The PowerScale scale-out NAS storage platform combines modular hardware
with unified software to harness unstructured data. Powered by the OneFS
operating system, a PowerScale cluster delivers a scalable pool of storage with a
global namespace.

⁸ Enterprises want to raise the standard on enterprise-grade resilience, with a
no-tolerance attitude toward data loss and data unavailability, and want support for
features that simplify management. Organizations see massive scale and
performance with the smaller data center rack footprints that performance-centric
workloads drive.


1: The unified software of the platform provides centralized web-based and
command-line administration to manage the following features:

• A cluster that runs a distributed file system.
• Scale-out nodes that add capacity and performance.
• Storage options that manage files and tiering.
• Flexible data protection and high availability.
• Software modules that control costs and optimize resources.

OneFS Operating System


With traditional NAS systems, the file system⁹, the volume manager¹⁰, and the
implementation of RAID¹¹ are all separate entities.

OneFS is the operating system and the underlying file system that drives and
stores data.
OneFS is a single file system that performs the duties of the volume manager and
applies protection.
OneFS is built on FreeBSD.
• Creates a single file system for the cluster.¹²
• Volume manager and protection.¹³
• Data shared across cluster.¹⁴
• Scale resources.¹⁵

⁹ The file system is responsible for the higher-level functions of authentication and
authorization.

¹⁰ The volume manager controls the layout of the data.

¹¹ RAID controls the protection of the data.

¹² As nodes are added, the file system grows dynamically and content is
redistributed.

¹³ OneFS performs the duties of the volume manager and applies protection to the
cluster as a whole. There is no partitioning, and no need for volume creation. All
data is striped across all nodes.

¹⁴ Because all information is shared among nodes, the entire file system is
accessible by clients connecting to any node in the cluster.

¹⁵ Each PowerScale storage node contains globally coherent RAM, meaning that,
as a cluster becomes larger, it also becomes faster. When adding a node, the
performance scales linearly.


Challenge

IT Manager:
Open participation question:
Question: What is the difference between scale-up and scale-out
architecture?


PowerScale

Scenario

IT Manager: Now that you have an understanding of scale-up and
scale-out storage and the OneFS operating system, we will first focus on
the PowerScale nodes.

Your Challenge: The IT manager wants to ensure you have a good
understanding of the PowerScale hardware. Discuss the benefits of the
PowerScale nodes and identify the type of workflow suited for the
different PowerScale platforms.

PowerScale Hardware Overview

Generation 6 (or Gen 6) chassis and Generation 6.5 nodes

PowerScale includes all-flash, hybrid, and archive storage systems.

Gen 6 highlights.
Gen 6.5 highlights.


Gen 6 Hardware Components

Gen 6 requires a minimum of four nodes to form a cluster. You must add nodes to
the cluster in pairs.

The chassis holds four compute nodes and 20 drive sled slots.

Both compute modules in a node pair power on immediately when one of the
nodes is connected to a power source.

Graphic showing the Gen 6 chassis, with callouts 1 through 10.

1: The compute module bays of two nodes make up one node pair. Scaling out a
cluster with Gen 6 nodes is done by adding more node pairs.

2: Each Gen 6 node provides two ports for front-end connectivity. The connectivity
options for clients and applications are 10 GbE, 25 GbE, and 40 GbE.

3: Each node can have 1 or 2 SSDs that are used as L3 cache, global namespace
acceleration (GNA), or other SSD strategies.

4: Each Gen 6 node provides two ports for back-end connectivity. A Gen 6 node
supports 10 GbE, 40 GbE, and InfiniBand.

5: Power supply unit - Peer node redundancy: When a compute module power
supply failure takes place, the power supply from the peer node temporarily
provides power to both nodes.

6: Each node has five drive sleds. Depending on the length of the chassis and type
of the drive, each node can handle up to 30 drives or as few as 15.


7: Disks in a sled are all the same type.

8: The sled can be either a short sled or a long sled. The types are:

• Long Sled - four drives of size 3.5"
• Short Sled - three drives of size 3.5"
• Short Sled - three or six drives of size 2.5"

9: The chassis comes in two different depths: the normal chassis is about 37 inches
deep and the deep chassis is about 40 inches deep.

10: Large journals offer flexibility in determining when data should be moved to the
disk. Each node has a dedicated M.2 vault drive for the journal. A node mirrors
its journal to its peer node. The node writes the journal contents to the vault when
a power loss occurs. A backup battery helps maintain power while data is stored in
the vault.
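
The drive counts in callouts 6 and 8 follow from simple arithmetic, which the short Python sketch below works through. It uses only figures stated in this section (five sleds per node, four nodes per chassis, and three, four, or six drives per sled). The function and constant names are illustrative.

```python
SLEDS_PER_NODE = 5      # callout 6: each node has five drive sleds
NODES_PER_CHASSIS = 4   # the chassis holds four compute nodes


def drives_per_node(drives_per_sled: int) -> int:
    """Number of drives behind one node for a given sled type."""
    return SLEDS_PER_NODE * drives_per_sled


def drives_per_chassis(drives_per_sled: int) -> int:
    """Number of drives in a fully populated chassis for a given sled type."""
    return NODES_PER_CHASSIS * drives_per_node(drives_per_sled)


if __name__ == "__main__":
    # Sled types from callout 8: three, four, or six drives per sled.
    for per_sled in (3, 4, 6):
        print(per_sled, drives_per_node(per_sled), drives_per_chassis(per_sled))
```

Three drives per sled gives 15 drives per node and six gives 30, which matches the range in callout 6. The resulting chassis totals of 60, 80, and 120 drives match the "Drives per chassis" values in the node specifications table, and five sleds for each of four nodes accounts for the 20 drive sled slots.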

Gen 6.5 Hardware Components

Gen 6.5 requires a minimum of three nodes to form a cluster. You can add single
nodes to the cluster. The F600 and F200 have a 1U form factor and are based on the
R640 architecture.
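
The cluster sizing rules for the two generations can be expressed as a small check. This is a sketch under stated assumptions: it considers a cluster built from a single generation only, it treats "add nodes in pairs" for Gen 6 as meaning an even node count starting from four, and it applies the 252-node maximum that this guide gives for OneFS 8.2 and later. The function name and the generation labels are hypothetical.

```python
MAX_CLUSTER_NODES = 252  # cluster maximum stated in this guide for OneFS 8.2 and later


def is_valid_node_count(generation: str, nodes: int) -> bool:
    """Check a node count against the sizing rules quoted in this guide.

    Assumption: the cluster contains nodes of a single generation only.
    """
    if nodes > MAX_CLUSTER_NODES:
        return False
    if generation == "gen6":
        # Minimum of four nodes, grown one node pair at a time.
        return nodes >= 4 and nodes % 2 == 0
    if generation == "gen6.5":
        # Minimum of three nodes, grown one node at a time.
        return nodes >= 3
    raise ValueError(f"unknown generation: {generation!r}")
```

For example, a five-node count is rejected for Gen 6 but accepted for Gen 6.5, which reflects the single-node growth increment of the F200 and F600.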

Graphic showing an F200 or F600 node pool, with callouts 1 through 8.

1: Scaling out an F200 or an F600 node pool only requires adding one node.

2: For front-end connectivity, the F600 uses the PCIe slot 3.


3: Each F200 and F600 node provides two ports for backend connectivity. The
PCIe slot 1 is used.

4: Redundant power supply units - When a power supply fails, the secondary
power supply in the node provides power. Power is supplied to the system equally
from both PSUs when the Hot Spare feature is disabled. Hot Spare is configured
using the iDRAC settings.

5: Disks in a node are all the same type. Each F200 node has four SAS SSDs.

6: The nodes come in two different 1U models, the F200 and F600. You need
nodes of the same type to form a cluster.

7: The F200 front-end connectivity uses the rack network daughter card (rNDC).

8: Each F600 node has 8 NVMe SSDs.

Important: The F600 nodes have a 4-port 1 GbE NIC in the rNDC
slot. OneFS does not support this NIC on the F600.

PowerScale Node Specifications

PowerScale offers nodes for workloads with different performance and capacity
needs. The table below shows some of the node specifications. For the latest and
complete list of specifications, and to compare the node offerings, browse the
product page.

| Node Type | Storage Type | Drive Capacities | Drives per chassis | Infrastructure Networking per node | Front-End Networking per node | Compute per node |
|---|---|---|---|---|---|---|
| F800 | 2.5" SSD | 1.6 TB, 3.2 TB, 3.84 TB, 7.68 TB, 15.4 TB | 60 | 2 InfiniBand connections supporting QDR links or 2 x 40 GbE (QSFP+) | 2 x 10 GbE (SFP+) or 2 x 40 GbE (QSFP+) | Ultra compute bundle: 16 Core, 2.6 GHz, Intel® Xeon® Processor E5-2697A v4; RAM 256 GB DDR4 |
| F810 | 2.5" SSD | 3.84 TB, 7.68 TB, 15.36 TB | 60 | 2 x 40 GbE (QSFP+) | 2 x 10 GbE (SFP+) or 2 x 40 GbE (QSFP+) | Ultra compute bundle: 16 Core, 2.6 GHz, Intel® Xeon® Processor E5-2697A v4; RAM 256 GB DDR4 |
| F600 | 2.5" SSD | 15.36 TB, 30.72 TB, 61.4 TB | 8 per node | 2 x 100 GbE or 2 x 25 GbE | 2 x 10/25 GbE or 2 x 100 GbE | R640 base platform: 2nd Generation Intel® Xeon® Scalable Processors; 128 GB, 192 GB, or 384 GB DDR4 |
| F200 | 2.5" SSD | 3.8 TB, 7.68 TB, 15.36 TB | 4 per node | 2 x 10/25 GbE | 2 x 10/25 GbE | R640 base platform: 2nd Generation Intel® Xeon® Scalable Processors; 48 GB or 96 GB DDR4 |
| H400* | 3.5" SATA | 2 TB, 4 TB, 8 TB, 12 TB | 60 | 2 InfiniBand connections supporting QDR links or 2 x 10 GbE (SFP+) | 2 x 10 GbE (SFP+) or 2 x 25 GbE (SFP28) | Medium compute bundle: 4 Core, 2.2 GHz, Intel® Xeon® Processor D-1527; RAM 64 GB DDR4 |
| H500* | 3.5" SATA | 2 TB, 4 TB, 8 TB, or 12 TB | 60 | 2 x InfiniBand or 2 x 40 GbE per node | 2 x 10 GbE (SFP+) or 2 x 40 GbE (QSFP+) per node or 2 x 25 GbE (SFP28) | High compute bundle: 10 Core, 2.2 GHz, Intel® Xeon® Processor E5-2630 v4; RAM 128 GB DDR4 |
| H5600* | 3.5" SATA | 10 TB or 12 TB | 80 | 2 InfiniBand connections supporting QDR links or 2 x 40 GbE (QSFP+) | 2 x 10 GbE (SFP+) or 2 x 40 GbE (QSFP+) or 2 x 25 GbE (SFP28) | Turbo compute bundle: 14 Core, 2.2 GHz, Intel® Xeon® Processor E5-2680 v4; RAM 256 GB DDR4 |
| H600* | 2.5" SAS | 600 GB or 1.2 TB | 120 | 2 x InfiniBand or 2 x 40 GbE per node | 2 x 10 GbE (SFP+) or 2 x 40 GbE per node or 2 x 25 GbE (SFP28) | Turbo compute bundle: 14 Core, 2.2 GHz, Intel® Xeon® Processor E5-2680 v4; RAM 256 GB DDR4 |
| A200* | 3.5" SATA | 2 TB, 4 TB, 8 TB, or 12 TB | 60 | 2 x InfiniBand or 2 x 10 GbE per node | 2 x 10 GbE (SFP+) per node or 2 x 25 GbE (SFP28) | Low compute bundle: 2 Core, 2.2 GHz, Intel® Pentium® Processor D1508; RAM 16 GB DDR4 |
| A2000* | 3.5" SATA | 10 TB or 12 TB | 80 | 2 x InfiniBand or 2 x 10 GbE per node | 2 x 10 GbE (SFP+) per node or 2 x 25 GbE (SFP28) | Low compute bundle: 2 Core, 2.2 GHz, Intel® Pentium® Processor D1508; RAM 16 GB DDR4 |

* Has 1 or 2 SSDs per node for caching.

PowerScale Features

The design goals for the PowerScale nodes are to keep the simple ideology of NAS,
provide the agility of the cloud, and deliver the cost of commodity hardware. The
following sections describe the features provided by PowerScale. See the student
guide for more information.

Performance and Scale


PowerScale clusters optimize performance at petabyte (PB) scale by:

• Optimizing components to maximize performance
• Leveraging standard technology to increase focus on scale

Some of the benefits include:

• Increased performance per usable terabyte (TB)
• Ability for lower-latency applications to leverage scale-out NAS
• Predictable performance at scale

Use Case: Media and Entertainment¹⁶

¹⁶ A Media and Entertainment production house needs high single stream
performance at PB scale that is cost optimized. The organization requires cloud
archive in a single namespace, archive optimized density with a low Total Cost of
Ownership (TCO) solution. This environment typically has large capacities and
employs new performance technologies at will.


Data Protection

Data Protection and Availability

PowerScale provides enterprise-grade resilience and data protection by:

• Eliminating single points of failure
• Creating small fault domains
• Providing predictable failure handling at PB densities

Use Case: Financial sectors¹⁷

¹⁷ Financial sectors rely heavily on data protection and availability to operate. Data
loss, such as the loss of customer transactions, or system downtime can negatively
affect the business.

Sizing

Support evolving needs with lower TCO


The Gen 6x platforms address the challenges of agility and lower TCO with:
• Dedicated cache drives
• Modular architecture
• Non-disruptive upgrades

Some of the benefits include:
• Gen 6 cluster in a box and simple growth path
• Customizable solution
• Same building blocks irrespective of cluster profile

Use Case: Start-up company¹⁸

Compute and storage permutations are wrapped into bundles in order to
significantly improve both performance and the ability to project performance
deterministically. The predefined compute bundles optimize memory, CPU, and
cache to simplify configuration selection based on a customer's performance,
capacity, and cost profile.

In order to focus on scale, PowerScale leverages standard technologies to
eventually target a capacity of more than 400 nodes. With OneFS 8.2 and later,
the maximum cluster size is 252 nodes. Changes to the back-end
infrastructure, such as adopting Ethernet for back-end communication between
nodes, allow PowerScale to push through the limitations set by older technologies.

PowerScale has no dependency on a flash boot drive. Gen 6 nodes boot from
boot partitions on the data drives. These drives are protected using erasure coding,
which removes the dependency on dedicated boot drives. PowerScale also uses
SSDs for the journal to remove the NVRAM dependency present on Gen 5 nodes.
There are now multiple distributed copies of the journal.

¹⁸ A simplicity and agility use case is a small start-up company growing at a rapid
pace that needs to start with limited capacity and then grow on demand for scale
and new workloads.


Creating smaller failure domains, with significantly fewer drives in each node pool
and neighborhood, increases the reliability of the system by reducing the
spindle-to-CPU ratio. The increased reliability enables the cluster to use larger
capacity drives without the risk of overburdening the system in the event of a drive
failure. PowerScale enables predictable failure handling at petabyte (PB) densities.

Gen 6 platforms have dedicated cache drives. The caching options offered are 1 or
2 SSD configurations in various capacities to maximize front-end performance.
Gen 6 hardware is focused on support and serviceability, based on a modular
architecture with full redundancy. It is possible to increase performance with data
in place, increase cache without disruption, and upgrade speeds and feeds
non-disruptively.

PowerScale Family

The Gen 6x family has different offerings that are based on the need for
performance and capacity. You can scale out compute and capacity separately.
OneFS runs on all nodes. The following sections describe the different offerings.

F-Series

The F-series nodes are the all-flash platforms and sit at the top of both
performance and capacity. The all-flash platforms can deliver 250,000 to 300,000
protocol operations per chassis and 15 GB/s of aggregate read throughput per
chassis. Even as the cluster scales, latency remains predictable.

• F800
• F810


• F600
• F200

H-Series

After F-series nodes, next in terms of computing power are the H-series nodes.
These are hybrid storage platforms that are highly flexible and strike a balance
between large capacity and high-performance storage to provide support for a
broad range of enterprise file workloads.

• H400
• H500
• H5600
• H600

A-Series

The A-series nodes have less compute power than the other nodes
and are designed for data archival purposes. The archive platforms can be
combined with new or existing all-flash and hybrid storage systems into a single
cluster that provides an efficient tiered storage solution.

• A200
• A2000


Node Interconnectivity

1: Backend ports int-a and int-b. The int-b port is the upper port. Gen 6 backend
ports are identical for InfiniBand and Ethernet, and cannot be identified by looking
at the node. If Gen 6 nodes are integrated in a Gen 5 or earlier cluster, the backend
will use InfiniBand. Note that there is a procedure to convert an InfiniBand backend
to Ethernet if the cluster no longer has pre-Gen 6 nodes.

2: PowerScale nodes with different backend speeds can connect to the same
backend switch and not see any performance issues. For example, an environment
has a mixed cluster where A200 nodes have 10 GbE backend ports and H600
nodes have 40 GbE backend ports. Both node types can connect to a 40 GbE
switch without affecting the performance of other nodes on the switch. The 40 GbE
switch provides 40 GbE to the H600 nodes and 10 GbE to the A200 nodes.

3: Gen 6.5 backend ports use the PCIe slot.

4: There are two speeds for the backend Ethernet switches, 10 GbE and 40 GbE.
Some nodes, such as archival nodes, might not need to use all of a 10 GbE port
bandwidth while other workflows might need the full utilization of the 40 GbE port
bandwidth. The Ethernet performance is comparable to InfiniBand so there should
be no performance bottlenecks with mixed performance nodes in a single cluster.
Administrators should not see any performance differences if moving from
InfiniBand to Ethernet.
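
The mixed-speed example in callout 2 can be modeled in a few lines of Python. The sketch assumes that each link runs at the lower of the node port speed and the switch port speed, which is a simplification and not a description of how the switch negotiates links. The function name and the port-speed dictionary are illustrative.

```python
def link_speed_gbe(node_port_gbe: int, switch_port_gbe: int) -> int:
    """Speed of one backend link, assuming it runs at the slower endpoint."""
    return min(node_port_gbe, switch_port_gbe)


# Backend port speeds from the example in callout 2.
NODE_BACKEND_PORT_GBE = {"A200": 10, "H600": 40}
SWITCH_PORT_GBE = 40

if __name__ == "__main__":
    for node_type, port_speed in NODE_BACKEND_PORT_GBE.items():
        print(node_type, link_speed_gbe(port_speed, SWITCH_PORT_GBE))
```

Under this assumption the A200 links run at 10 GbE and the H600 links at 40 GbE on the same 40 GbE switch, as the callout describes.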

Gen 6 nodes can use either an InfiniBand or an Ethernet switch on the backend.
InfiniBand was designed as a high-speed interconnect for high-performance
computing, and Ethernet provides the flexibility and high speeds that sufficiently
support the PowerScale internal communications.

Gen 6.5 only supports Ethernet. All new PowerScale clusters support Ethernet
only.

Warning: With Gen 6, do not plug a backend Ethernet topology into
a backend InfiniBand NIC. If you plug Ethernet into the InfiniBand
NIC, it switches the backend NIC from one mode to the other and
will not come back to the same state.

PowerScale Networking Architecture

OneFS supports the standard network communication protocols IPv4 and IPv6.
PowerScale nodes include several external Ethernet connection options, providing
flexibility for a wide variety of network configurations¹⁹.

Network: There are two types of networks that are associated with a cluster:
internal and external.

¹⁹ In general, keeping the network configuration simple provides the best results
with the lowest amount of administrative overhead. OneFS offers network
provisioning rules to automate the configuration of additional nodes as clusters
grow.


Front-end, External Network

[Figure: F200 cluster showing supported frontend protocols. The Client/Application
Layer connects over the Ethernet Layer to the PowerScale Storage Layer using NFS,
SMB, S3, HTTP, FTP, HDFS, and SWIFT. Backend communication is internal to
PowerScale.]

Clients connect to the cluster using Ethernet connections20 that are available on all
nodes.

The complete cluster combines hardware, software, and networks, as shown in the
following view.

Back-end, Internal Network

20Because each node provides its own Ethernet ports, the amount of network
bandwidth available to the cluster scales linearly.


OneFS supports a single cluster21 on the internal network. This back-end network,
which is configured with redundant switches for high availability, acts as the
backplane for the cluster.22

Leaf-Spine Backend Network

The Gen 6x back-end topology in OneFS 8.2 and later supports scaling a
PowerScale cluster to 252 nodes. See the participant guide for more details.

[Figure: Leaf-Spine topology for a PowerScale cluster with up to 88 nodes, using
Dell Z9100 switches. Labels: 22 downlinks per leaf (40 Gb ports); 10 uplinks per
leaf (100 Gb ports); 27 uplinks per spine switch; 4 leaf switches = maximum of
88 nodes; maximum scale-out to 132 nodes with 2 spine switches.]

Leaf-Spine is a two-level hierarchy where nodes connect to leaf switches, and leaf
switches connect to spine switches. Leaf switches do not connect to one another,
and spine switches do not connect to one another. Each leaf switch connects with
each spine switch, and all leaf switches have the same number of uplinks to the
spine switches.

21 All intra-node communication in a cluster is performed across a dedicated
backend network, comprising either 10 or 40 GbE Ethernet, or low-latency QDR
InfiniBand (IB).

22 This enables each node to act as a contributor in the cluster and isolates
node-to-node communication to a private, high-speed, low-latency network. This
back-end network utilizes Internet Protocol (IP) for node-to-node communication.

The new topology uses the maximum internal bandwidth and 32-port count of Dell
Z9100 switches. When planning for growth, F800 and H600 nodes should connect
over 40 GbE ports whereas A200 nodes may connect using 4x1 breakout cables.
Scale planning enables nondisruptive upgrades, meaning that as nodes are added,
no recabling of the backend network is required. Ideally, plan for three years of
growth. The table shows the switch requirements as the cluster scales. In the table,
Max Nodes indicates that each node is connected to a leaf switch using a 40 GbE
port.
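The port arithmetic behind the Leaf-Spine figure can be checked with a short sketch. The constants are the values quoted in the figure labels and the text; the function names are hypothetical, and the sketch assumes one 40 GbE downlink per node, as stated for Max Nodes. Reading 132 nodes as six leaf switches is an inference from the arithmetic (132 / 22), not a statement from the guide.

```python
import math

# Values quoted from the Leaf-Spine figure labels and the surrounding text.
DOWNLINKS_PER_LEAF = 22    # 40 Gb ports that face the nodes
UPLINKS_PER_LEAF = 10      # 100 Gb ports that face the spine switches
PORTS_PER_SWITCH = 32      # Dell Z9100 port count
DOWNLINK_GB = 40
UPLINK_GB = 100


def max_nodes(leaf_switches: int) -> int:
    """Nodes supported when every node uses one 40 GbE downlink (assumption)."""
    return leaf_switches * DOWNLINKS_PER_LEAF


def leaf_switches_needed(nodes: int) -> int:
    """Smallest number of leaf switches that offers enough downlinks."""
    return math.ceil(nodes / DOWNLINKS_PER_LEAF)


if __name__ == "__main__":
    for leaves in (4, 6):
        print(leaves, "leaf switches ->", max_nodes(leaves), "nodes")
    print("uplink Gb per leaf:", UPLINKS_PER_LEAF * UPLINK_GB)
    print("downlink Gb per leaf:", DOWNLINKS_PER_LEAF * DOWNLINK_GB)
```

Note that 22 downlinks plus 10 uplinks uses all 32 ports of a leaf switch, and the uplink capacity per leaf (10 x 100 Gb) is at least the downlink capacity (22 x 40 Gb).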

Challenge

IT Manager:
Open participation question:
Question: What are the differences between Gen 6 nodes and
Gen 6.5 nodes?


Resources

Link to Gen 6 Info Hub.


PowerScale Management Interfaces

Scenario

IT Manager: Good work. I think you understand the PowerScale building
blocks. Before managing the cluster, it is important to know about the
different management interfaces used to administer the cluster.

Your Challenge: The manager wants you to explain the different
administration interfaces and discuss the isi command structure.

Management Interfaces Overview

The OneFS management interface is used to perform various administrative and
management tasks on the PowerScale cluster and nodes. Management capabilities
vary based on which interface is used. The different types of management
interfaces in OneFS are:

• Serial Console
• Web Administration Interface (WebUI)
• Command Line Interface (CLI)
• Platform Application Programming Interface (PAPI)
• Front Panel Display


Serial Console Video

This video provides an overview on the serial console. See the student guide for a
transcript of the video.

Link:
https://edutube.emc.com/Player.aspx?vno=jHnaLyBuvlzyrARCLAU/jw==&autoplay=true

Four options are available for managing the cluster: the web administration
interface (WebUI), the command-line interface (CLI), the serial console, and the
platform application programming interface (PAPI), also called the OneFS API. The
first management interface that you may use is a serial console to node 1. A serial
connection using a terminal emulator, such as PuTTY, is used to initially configure
the cluster. The serial console gives you serial access when you cannot or do not
want to use the network. Other reasons for using a serial connection include
troubleshooting, site rules, a network outage, and so on. Shown are the
terminal emulator settings.

The configuration Wizard automatically starts when a node is first powered on or
reformatted. If the Wizard starts, the menu and prompt are displayed as shown.
Choosing option 1 steps you through the process of creating a cluster. Option 2 will
exit the Wizard after the node finishes joining the cluster. After completing the
configuration Wizard, running the isi config command enables you to change
the configuration settings.

[Figure: the isi config console. Callouts: edit Wizard settings; common commands
are shutdown, status, and name; the prompt changes to >>>; other "isi" commands
are not available in the configuration console.]

The isi config command, pronounced "izzy config," opens the configuration
console. The console contains configured settings from the time the Wizard started
running.

Use the console to change initial configuration settings. When in the isi config
console, other configuration commands are unavailable. The exit command is
used to go back to the default CLI.

Web Administration Interface (WebUI)

[Figure: the WebUI. Callouts: OneFS version; user must have logon privileges;
connect to any node in the cluster over HTTPS on port 8080; multiple browser
support.]

The WebUI is a graphical interface that is used to manage the cluster.


The WebUI requires at least one IP address configured23 on one of the external
Ethernet ports present in one of the nodes.

Example browser URLs:


• https://192.168.3.11:8080
• https://engineering.dees.lab:8080

To access the web administration interface from another computer, use an Internet
browser to connect to port 8080. The user must log in using the root
account, the admin account, or an account with logon privileges. After opening the
web administration interface, there is a four-hour login timeout. In OneFS 8.2.0 and
later, the WebUI uses the HTML5 doctype, meaning it is HTML5 compliant in the
strictest sense, but does not use any HTML5-specific features. Previous versions of
OneFS require Flash.

Command Line Interface (CLI)

The CLI can be accessed in two ways:

• Out-of-band24
• In-band25

Both methods are done using any SSH client such as OpenSSH or PuTTY. Access
to the interface changes based on the assigned privileges.

OneFS commands are code that is built on top of the UNIX environment and are
specific to OneFS management. You can use commands together in compound
command structures combining UNIX commands with customer-facing and internal
commands.

23 The Ethernet port IP address is either configured manually or by using the
Configuration Wizard.

24 Accessed using a serial cable connected to the serial port on the back of each
node. As many laptops no longer have a serial port, a USB-serial port adapter may
be needed.

25 Accessed using an external IP address that is configured for the cluster.


1: The default shell is zsh.

2: OneFS is built upon FreeBSD, enabling use of UNIX-based commands, such as
cat, ls, and chmod. Every node runs OneFS, including the many FreeBSD kernel
and system utilities.

3: Connections make use of Ethernet addresses.

4: OneFS supports management isi commands. Not all administrative
functionalities are available using the CLI.

5: CLI commands can be customized with options, also known as switches and
flags. A single command with multiple options results in many different
permutations, and each combination results in different actions performed.

6: The CLI is a scriptable interface. The UNIX shell enables scripting and execution
of many UNIX and OneFS commands.

Caution: Follow guidelines and procedures to appropriately
implement the scripts to not interfere with regular cluster
operations. Improper use of a command or using the wrong
command can be potentially dangerous to the cluster, the node, or
to customer data.


CLI Usage

[Figure: CLI usage. Callouts: can use common UNIX tools; "help" shows needed
privileges; shows syntax and usage; option explanation.]

The man isi and isi --help commands are important for a new
administrator. These commands provide an explanation of the available isi
commands and command options. You can also view a basic description of any
command and its available options by typing the -h option after the command.

Platform Application Programming Interface (PAPI)

The Platform Application Programming Interface, or PAPI, is a secure and
scriptable26 interface for managing the cluster.

HTTPS is used in PAPI to encrypt communications.

OneFS applies authentication and RBAC controls to PAPI commands to ensure
that only authorized commands are run.

The example shows a description for https://:8080/platform/quota/quotas1

26A chief benefit of PAPI is its scripting simplicity, enabling customers to automate
their storage administration.


1: Structured like URLs that execute on a browser that supports authentication.

2: PAPI conforms to the REST architecture. An understanding of HTTP/1.1 (RFC
2616) is required to use the API.

3: Some commands are not PAPI aware, meaning that RBAC roles do not apply.
These commands are internal, low-level commands that are available to
administrators through the CLI. Commands not PAPI aware: isi config, isi
get, isi set, and isi services

4: The number indicates the PAPI version. If an upgrade introduces a new version
of PAPI, some backward compatibility ensures that there is a grace period for old
scripts to be rewritten.
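The URL structure described in the callouts can be sketched as follows. This is a minimal illustration, not the guide's own example: the host name, credentials, version number, and helper names are hypothetical, and the path layout (/platform/<version>/<resource>) plus the use of HTTP Basic authentication are assumptions about how a request might be formed. The request is only prepared, never sent.

```python
import base64
import urllib.request


def papi_url(host: str, version: int, resource: str, port: int = 8080) -> str:
    """Build a URL of the assumed form https://host:port/platform/version/resource."""
    return f"https://{host}:{port}/platform/{version}/{resource.strip('/')}"


def papi_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request with an HTTP Basic auth header."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


# Hypothetical host, version, and credentials, for illustration only.
url = papi_url("cluster.example.com", 1, "quota/quotas")
request = papi_request(url, "admin", "secret")
print(request.get_method(), request.full_url)
```

Keeping the version number as an explicit parameter mirrors callout 4: a script written this way can be moved to a newer PAPI version by changing one argument.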

Front Panel Display

Front Panel Display of a Gen 6 chassis.


The Gen 6 front panel display is an LCD screen with five buttons used for basic
administration tasks27.

The Gen 6.5 front panel has limited functionality28 compared to the Gen 6.

Challenge

Lab Assignment: Launch the lab image and connect to the cluster
using the WebUI and the CLI.

27Some of them include: adding the node to a cluster, checking node or drive
status, events, cluster details, capacity, IP and MAC addresses.

28 You can join a node to a cluster, and the panel displays the node name after the
node has joined the cluster.


Common Cluster Operations

Scenario

IT Manager: I want you to familiarize yourself with the common cluster
tasks before you begin to manage the cluster. Now, we received two
new nodes for the cluster and I was told the nodes have been put in the
rack, cabled up, and are powered on. I want you to examine the
licensing before adding the nodes.

Your Challenge: The new IT manager has given you a task to describe
the OneFS licensing and add the new nodes to the PowerScale cluster.

Licensing

[Figure: Licensing. Callouts: evaluation licensing enabled from the cluster; no
individual per-feature keys; upgrades translate keys into a file; the old licensing
system is not used on new OneFS versions.]

WebUI Cluster management > Licensing > Open Activation File Wizard or use the "isi license"
command.

In OneFS 8.1 and later, a single license file contains all the licensed feature
information.

Administrators can enable evaluation licenses directly from their cluster.


Device ID and Logical Node Number

[Figure: Changing the LNN 3 to LNN 5 to maintain the sequential numbering of the
nodes. Callouts: some features use the LNN and others use the device ID; the LNN
can be changed using "isi config" and then the "lnnset" command; the device ID
cannot be changed; the device ID is unique for each new node and is not reused.]

You should have an understanding of the two different numbers that identify a
node. The numbers are the device ID and logical node number or LNN.

The status advanced command from the isi config sub menu shows the
LNNs and device ID.

The lnnset command is used to change an LNN.

When a node joins a cluster, it is assigned a unique device ID. If you remove a
node from the cluster and rejoin it, the node is assigned a new device ID.
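The difference between the two identifiers can be illustrated with a toy model. This is a sketch only: the class and method names are hypothetical, and the rule used to pick an LNN when a node joins (highest LNN plus one) is an assumption made so that the example runs, not documented OneFS behavior. What the sketch does take from the text is that device IDs are never reused and that an LNN can be changed.

```python
class ToyCluster:
    """Toy model of node numbering: device IDs are never reused, LNNs can change."""

    def __init__(self):
        self._next_device_id = 1
        self.nodes = {}  # maps LNN -> device ID

    def join(self):
        """Add a node with a fresh device ID; the LNN choice is an assumption."""
        device_id = self._next_device_id
        self._next_device_id += 1
        lnn = max(self.nodes, default=0) + 1
        self.nodes[lnn] = device_id
        return lnn, device_id

    def remove(self, lnn):
        """Remove the node with the given LNN and return its device ID."""
        return self.nodes.pop(lnn)

    def lnnset(self, old_lnn, new_lnn):
        """Renumber a node, mirroring the idea of lnnset <OldNode#> <NewNode#>."""
        if old_lnn not in self.nodes:
            raise KeyError(f"no node with LNN {old_lnn}")
        if new_lnn in self.nodes:
            raise ValueError(f"LNN {new_lnn} is already in use")
        self.nodes[new_lnn] = self.nodes.pop(old_lnn)


cluster = ToyCluster()
for _ in range(3):
    cluster.join()                 # LNNs 1-3 with device IDs 1-3
cluster.remove(2)                  # device ID 2 is retired
new_lnn, new_device = cluster.join()
cluster.lnnset(new_lnn, 2)         # restore sequential LNN numbering
print(sorted(cluster.nodes.items()))
```

In the model, the rejoined node ends up with LNN 2 again but keeps its new device ID, which is why features that key on the device ID see it as a different node.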

You can change an LNN in the configuration console. To change the LNN to
maintain the sequential numbering of the nodes use lnnset <OldNode#>
<NewNode#>.


Adding Nodes to Cluster

[Figure: Joining a node to cluster using Configuration Wizard. Callouts: a new node
boots to the config Wizard; new nodes add CPU, memory, and capacity; the node
number and IP address are automatically assigned from the initial config range;
the node is imaged to match the cluster OneFS version.]

When adding new nodes to a cluster, the cluster gains more CPU, memory, and
disk space. The methods for adding a node are:
• Front panel
• Configuration Wizard
• WebUI
• CLI

Join the nodes in the order that the nodes should be numbered.

Adding a node not connected to the external network (NANON) increases the
storage and compute capacity of the cluster.

Nodes are automatically assigned node numbers and IP addresses on the internal
and external networks. A node joining the cluster with a newer or older OneFS
version is automatically reimaged to match the OneFS version of the cluster. A
reimage may take up to 5 minutes.

Compatibility

Hardware compatibility is a concern when combining dissimilar Gen 6.5 nodes. For
example, consider adding a single F200 node with 48 GB RAM to an F200 node pool
that has nodes with 96 GB of RAM. Without compatibility, a minimum of three F200
nodes with 48 GB RAM is required, which creates a separate node pool.

Node series compatibility depends upon the amount of RAM, the SSD size, number
of HDDs, and the OneFS version.

Resource File: The PowerScale Supportability and Compatibility
Guide covers software, protocols, and hardware.

Cluster Shutdown

[Figure: Using the Configuration Wizard to shutdown node 4. Callouts: the CLI uses
the "isi config" sub menu; can shut down the entire cluster; can shut down a node
using the LNN.]


Administrators can restart or shut down the cluster using the WebUI29 or the CLI30.

Caution: Do not shut down nodes using the UNIX shutdown -p
command, halt command, or reboot command. Using the UNIX
command may result in RAM not flushing properly.

Challenge

Lab Assignment: Launch the lab and add a node using the
Configuration Wizard and add a node using the WebUI.

29The WebUI Hardware page has a tab for Nodes to shut down a specific node, or
the Cluster tab to shut down the cluster.

30 Native UNIX commands do not elegantly interact with OneFS, because the
OneFS file system is built as a separate layer on top of UNIX.


OneFS Directory Structure

Scenario

IT Manager: Good, looks like you know what the different PowerScale
management tools are. Now I want you to focus on the directory
structure that OneFS uses. This is important as it sets up the directory
structure we will use moving forward.

Your Challenge: The IT manager wants to ensure you can configure
the directory structure that conforms to the organization's governance.

Directory Structure Overview

[Figure: the OneFS root directory.]

The directory structure is a 2-dimensional construct that organizes files into a
hierarchy of folders.

• The structure should be fixed and scalable.
• One top-level organizational construct can only be subdivided in a limited way.

At the core of OneFS is the single file system across the cluster (/ifs). In
practice, the single file system is a common directory structure.


OneFS Integrated Directories

The graphic shows the OneFS built-in directories.

Using or intervening with the built-in directory paths is not recommended unless
explicitly instructed to do so.

• Using a single file system starting with a newly created directory under /ifs is
recommended.
• For example, in the simplest form, you can create /ifs/engineering where
the engineering department data is the top-level directory for the engineering
organization.

Directory Structure Tiering

The graphic shows the recommended directory structure.

[Figure labels: OneFS root; cluster root; authentication and segregation root;
workflow-type root; data grouped for logical grouping and integration purposes;
location to situate data and create exports and shares as per requirement.]


Warning: Having no directory structure, or a poor structure,
designed upfront can create a disruptive activity when the end user
is required to fix the structure.

Directory Structure Example 1

The graphic shows an example of a designed directory structure.

Use case:
• A company that is named X-Attire plans to implement a single cluster for their
engineering team.
• After conversations with the customer, you identify that the customer does not
plan to have another cluster for remote disaster recovery.
• The company name or authentication domain name is used as the access zone
name (x-attire).

Access zones are covered in another topic


Directory Structure Example 2

Use case:
• X-Attire plans to implement a disaster recovery solution.
• X-Attire wants to replicate the Boston/homedirs directory to the Seattle data
center.
• From Seattle, they plan to replicate the /groupdirs directory to Boston.
• Having the directory structure design up front makes the implementation easier.

SyncIQ is covered in another topic.
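The tiered layout from the two examples can be sketched as path construction. This is an illustration under stated assumptions: the ordering /ifs/<cluster>/<access zone>/<directory> is an assumed reading of the tiers (the graphic itself is not reproduced here), the helper name is hypothetical, and the cluster and zone names are taken from the X-Attire use case.

```python
from pathlib import PurePosixPath

OneFS_ROOT = PurePosixPath("/ifs")  # single file system root named in the text


def base_directory(cluster: str, zone: str, *parts: str) -> PurePosixPath:
    """Compose a base directory below /ifs from cluster, zone, and lower tiers."""
    for name in (cluster, zone, *parts):
        if not name or "/" in name:
            raise ValueError(f"invalid directory name: {name!r}")
    return OneFS_ROOT.joinpath(cluster, zone, *parts)


# Hypothetical names based on the X-Attire examples.
boston_home = base_directory("boston", "x-attire", "homedirs")
seattle_groups = base_directory("seattle", "x-attire", "groupdirs")
print(boston_home)
print(seattle_groups)
```

Because each site has its own cluster-level directory, a path replicated from one site keeps a distinct location on the other, which is the property the disaster recovery example relies on.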

Directory Structure Permissions

On the /ifs directory, do not set inherited ACLs and do not propagate ACL
values.

Permissions on levels 1 through 5 are customer-specific and you should define the
appropriate permissions and inherited permissions starting at the appropriate level.

ACLs and POSIX mode bits are covered in other topics.

The table shows the recommended permissions at each directory tier.


Challenge

Lab Assignment: Go to the lab and build the base directories. The
base directories are used throughout your implementation of the
PowerScale cluster.


Configuring the Foundation for Access


Authentication Providers

Scenario

IT Manager: Now, the next thing to do is get the cluster pointed to the
Active Directory and LDAP servers. Before our clients can access files
that are stored on the cluster, they must be authenticated. Make sure
that you have a good understanding of the authentication providers that
the cluster supports.

Your Challenge: You are tasked to add authentication providers to the
PowerScale cluster. Before adding authentication providers, you need to
know a few things. The manager wants you to explain the supported
authentication providers and configure the NTP service.

Authentication Provider Overview

Authentication settings for the clusters are managed using an authentication
provider. The supported authentication providers are described below.

1: Active Directory is a Microsoft implementation of Lightweight Directory Access
Protocol (LDAP), Kerberos, and DNS technologies that can store information about
network resources. Active Directory can serve many functions, but the primary
reason for joining the cluster to an Active Directory domain is to perform user and
group authentication.

2: The Lightweight Directory Access Protocol (LDAP) is a networking protocol that
enables you to define, query, and modify directory services and resources. OneFS
can authenticate users and groups against an LDAP repository to grant them
access to the cluster.

3: The Network Information Service (NIS) provides authentication and identity
uniformity across local area networks. OneFS includes a NIS authentication
provider that enables you to integrate the cluster with the NIS infrastructure. NIS
can authenticate users and groups when they access the cluster.

4: A file provider enables you to supply an authoritative third-party source of user
and group information to a PowerScale cluster. A third-party source is useful in
UNIX and Linux environments that synchronize the /etc/passwd, /etc/group, and
/etc/netgroup files across multiple servers.

5: The local provider provides authentication and lookup facilities for user accounts
added by an administrator.

6: Kerberos is a network authentication provider that negotiates encryption tickets
for securing a connection. OneFS supports Microsoft Kerberos and MIT Kerberos
authentication providers on a cluster. If you configure an Active Directory provider,
support for Microsoft Kerberos authentication is provided automatically. MIT
Kerberos works independently of Active Directory.

Note: The MIT Kerberos authentication provider is used with NFS,
HTTP, and HDFS.


Authentication Provider Structure

[Figure: Access control architectural components that show two configured access
zones. Labels: clients; client access protocols; local security daemon;
authentication provider; authentication source / directory.]

The lsassd, pronounced “L-sass-D,” is the OneFS authentication daemon.

lsassd sits between the access protocols and the lower-level service providers.

The lsassd daemon mediates between the authentication protocols that clients
use and the authentication providers in the third row.

The authentication providers check their data repositories, which are shown on the
bottom row. The process determines user identity and subsequent access to files.

Active Directory Overview

Function

Active Directory can serve many functions, but the primary reason for joining the cluster to an AD domain is to enable domain
users to access cluster data.

To join the cluster to AD, specify the fully qualified domain name, which can be
resolved to an IPv4 or an IPv6 address, and a username with join permission.
Areas to consider:
• Creates a single AD machine account
• Establishes trust relationship
• Supports NTLM and Microsoft Kerberos
• Each Active Directory provider must be associated with a groupnet
• Adding to an access zone
• Multiple AD instances
When the cluster joins an AD domain, OneFS creates a single AD machine
account. The machine account establishes a trust relationship with the domain and
enables the cluster to authenticate and authorize users in the Active Directory
forest. OneFS supports NTLM and Microsoft Kerberos for authentication of Active
Directory domain users. You can add an Active Directory provider to an access
zone as an authentication method for clients connecting through the access zone.
The access zone and the Active Directory provider must reference the same
groupnet. OneFS supports multiple instances of Active Directory on a PowerScale
cluster; however, only one Active Directory provider can be assigned per access
zone.

Active Directory Configuration Video

The video provides a demonstration of the configuration tasks for an Active
Directory authentication provider. See the student guide for a transcript of the
video.


Link:
https://edutube.emc.com/Player.aspx?vno=Xu/3IyDNSxbuNMOcLHrqBg==&autoplay=true

In this demonstration, we’ll go through the steps needed to configure the
PowerScale cluster for Active Directory. Let’s navigate to Access and then to the
Authentication providers page. The Active Directory tab is the default selection.
Note that for a multi-mode implementation, connecting to the LDAP server first
establishes the proper relationships between UNIX and AD identities. If AD is
added before joining an LDAP domain, some authentication challenges and
permissions issues may occur, and additional work is needed to remediate these
changes.

Select the Join a domain button. This demonstration shows the barest configuration
to join a domain. Start by entering the provider name. NetBIOS requires that
computer names be 15 characters or fewer. Two to four characters are appended to
the cluster name you specify to generate a unique name for each node. If the
cluster name is more than 11 characters, you can specify a shorter name in the
Machine Name field. Enter the user name of the account that has the right to add
computer accounts to the domain, and then enter the account password. The
Enable Secure NFS checkbox enables users to log in using LDAP credentials, but
to do this, Services for NFS must be configured in the AD environment.
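The name-length rule in the transcript follows from simple arithmetic: 15 characters allowed, minus up to 4 appended characters, leaves 11 for the cluster name. The sketch below checks that rule; the function name and example names are hypothetical.

```python
NETBIOS_MAX_LENGTH = 15   # limit stated in the transcript
MAX_APPENDED_CHARS = 4    # two to four characters are appended per node


def needs_shorter_machine_name(cluster_name: str) -> bool:
    """Return True when the cluster name leaves no room for the appended suffix."""
    return len(cluster_name) > NETBIOS_MAX_LENGTH - MAX_APPENDED_CHARS


for name in ("engineering", "engineering-lab"):
    print(name, len(name), needs_shorter_machine_name(name))
```

A name of exactly 11 characters still fits; anything longer is a candidate for the Machine Name field.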

OneFS is RFC 2307-compliant. Use Microsoft Active Directory with Windows
Services for UNIX and RFC 2307 attributes to manage Linux, UNIX, and Windows
systems. Integrating UNIX and Linux systems with Active Directory centralizes
identity management and eases interoperability, reducing the need for
user-mapping rules.

Shown is the CLI equivalent command used to join Active Directory. To display a
list of command options, run the isi auth ads create -h command at the
CLI. Now, before connecting to an LDAP server you should decide which optional
customizable parameters you want to use. Refer to the Isilon Web Administration
Guide for details on each of the settings.

Click the Join button. While joining the domain, the browser window displays the
status of the process and confirms when the cluster has successfully joined the AD
domain. The join creates a single computer account for the entire cluster.

And that is the most basic configuration. Note that AD and LDAP both use TCP
port 389. Even though both services can be installed on one Microsoft server, the
cluster can only communicate with one of the services if they are both installed on
the same server. This concludes the demonstration.

Network Time Protocol (NTP) Overview

[Figure: NTP configured on the cluster. Callouts: Active Directory and Kerberos
depend on accurate time; internally, cluster nodes use NTP to coordinate time
settings; cluster time sets the cluster’s date and time settings; if multiple NTP
servers are configured, the first on the list is used first; risk: if the cluster
drifts from SMB time, authentication fails.]

Time synchronization is one of the most frequent problems administrators have
with authentication. Both Active Directory and Kerberos depend upon accurate
timing. If the time on the cluster drifts from the authentication server's time, AD
authentication fails.


• Synchronize to NTP source31
• Cluster time properties32
• Synchronize issues33
• SMB time34
• Node time35

31 The easiest method is to synchronize the cluster and the authentication servers
all to the same NTP source.

32The cluster time property sets the date and time settings, either manually or by
synchronizing with an NTP server. After an NTP server is established, setting the
date or time manually is not allowed.

33After a cluster is joined to an AD domain, adding an NTP server can cause time
synchronization issues. The NTP server takes precedence over the SMB time
synchronization with AD and overrides the domain time settings on the cluster.

34 SMB time is enabled by default and is used to maintain time synchronization
between the AD domain time source and the cluster.

35 Nodes use NTP between themselves to maintain cluster time. When the cluster
is joined to an AD domain, the cluster must stay synchronized with the time on the
domain controller. If the time differential is more than five minutes, authentication
may fail.
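The five-minute differential mentioned in the note above can be expressed as a simple comparison. This is a sketch only: the function name and the sample timestamps are hypothetical, and how the two clock readings are obtained is left out.

```python
from datetime import datetime, timedelta, timezone

MAX_TIME_DIFFERENTIAL = timedelta(minutes=5)  # threshold stated in the text


def authentication_at_risk(cluster_time: datetime, dc_time: datetime) -> bool:
    """Return True when the two clocks differ by more than five minutes."""
    return abs(cluster_time - dc_time) > MAX_TIME_DIFFERENTIAL


# Hypothetical clock readings, for illustration only.
dc_now = datetime(2020, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
print(authentication_at_risk(dc_now + timedelta(minutes=4), dc_now))
print(authentication_at_risk(dc_now - timedelta(minutes=6), dc_now))
```

Using the absolute difference matters because the cluster clock can run either ahead of or behind the domain controller.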


NTP Configuration

[Figure: WebUI > General settings > NTP page to configure NTP and chimer settings.
Callouts: optional key for the NTP server, with the key file in /ifs; more than one
server can be configured; chimer nodes can contact the external NTP servers;
non-chimer nodes use chimers as NTP servers.]

You can configure specific chimer nodes by excluding other nodes using the
isi_ntp_config {add | exclude} <node#> command. The list excludes
nodes using their node numbers that are separated by a space.

LDAP Overview

Function

OneFS can authenticate users and groups against an LDAP repository in order to grant them access to the cluster. OneFS
supports Kerberos authentication for an LDAP provider.


Each LDAP entry36 has a set of attributes37.

The LDAP service provider supports the following features:

• Uses a simple directory service that authenticates users and groups accessing
the cluster.
• Supports netgroups and supports the ldapsam schema, which enables NTLM to
authenticate over SMB.
• Enables users to access resources between disparate directory services or as a
single sign-on resource.
• Each LDAP provider must be associated with a groupnet.38

LDAP Configuration Video

The video provides a demonstration of the configuration tasks for an LDAP
authentication provider. See the student guide for a transcript of the video.

36 Each entry consists of a distinguished name, or DN, which also contains a


relative distinguished name (RDN). The base DN is also known as a search DN
because a given base DN is used as the starting point for any directory search.

37Each attribute has a name and one or more values that are associated with it
that is similar to the directory structure in AD.

38 LDAP provider can be added to an access zone as an authentication method for


clients connecting through the access zone. An access zone may include at most
one LDAP provider. The access zone and the LDAP provider must reference the
same groupnet.


Link: https://edutube.emc.com/Player.aspx?vno=JKBFLVJaUoqGz8DJmH4zqg==&autoplay=true

In this demonstration, we’ll go through the steps needed to configure LDAP for the
PowerScale cluster. Let us navigate to Access and then to Authentication providers
page. Next, select the LDAP tab. Now click the Add an LDAP provider button.

For this demonstration, I am only showing the barest configuration. Let us give our
LDAP a provider name. Next, I will enter the URI to the LDAP server. You must
configure a base distinguished name. Often issues involve either misconfigured
base DNs or connecting to the LDAP server. The top-level names almost always
mimic DNS names; for example, the top-level Isilon domain would be dc=isilon,
dc=com for Isilon.com. Our environment is DEES and lab.
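
The way a base DN mirrors a DNS name, as described above, can be sketched in a few lines. The helper name is hypothetical, and the dees.lab value is assumed from the lab environment named in the transcript.

```python
def domain_to_base_dn(domain: str) -> str:
    """Convert a DNS domain such as 'isilon.com' into 'dc=isilon,dc=com'."""
    labels = [label for label in domain.strip().strip(".").split(".") if label]
    if not labels:
        raise ValueError("domain must contain at least one label")
    return ",".join(f"dc={label.lower()}" for label in labels)

print(domain_to_base_dn("Isilon.com"))  # dc=isilon,dc=com
print(domain_to_base_dn("dees.lab"))    # dc=dees,dc=lab
```

A base DN derived this way is only a starting guess. The directory administrator may use a different naming context, which is why the transcript recommends verifying the base DN with a directory query.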

Shown is the CLI equivalent command used to configure LDAP. To display a list of
the available options, run the isi auth ldap create -h command at the CLI.
And that is the most basic configuration.


Now, before connecting to an LDAP server you should decide which optional
customizable parameters you want to use. If there are any issues while configuring
or running the LDAP service, there are a few commands that can be used to help
troubleshoot. The ldapsearch command runs queries against an LDAP server to
verify whether the configured base DN is correct. The tcpdump command verifies
that the cluster is communicating with the assigned LDAP server.

You have the option to enter a netgroup. A netgroup is a set of systems that reside
in different locations and are grouped together for permission checking. For
example, a UNIX computer on the 5th floor, six UNIX computers on the 9th floor,
and 12 UNIX computers in the building next door can all be combined into
one netgroup.

Select the Add LDAP Provider button. After the LDAP provider is successfully
added, the LDAP providers page displays a green status. This means that the
cluster can communicate with the LDAP server. Note that AD and LDAP both use
TCP port 389. Even though both services can be installed on one Microsoft server,
the cluster can only communicate with one of the services if they are both
installed on the same server. This concludes the demonstration.

Challenge

Lab Assignment:
• Join the cluster to Active Directory
• Configure the cluster for LDAP


Access Zones

Scenario

IT Manager: Now that you have configured the cluster for Active
Directory and LDAP, it is time to take the next step in implementation.
You are configuring access zones for two organizations, finance and
engineering. Finance is a Microsoft Windows environment and
engineering is a Linux environment. Before you configure the cluster, I
want to ensure you understand access zones and what they do.

Your Challenge: The IT manager has tasked you to explain what an
access zone is, what it does, and how to configure access zones.

Access Zone Overview Video

This video provides an overview for access zones. See the student guide for a
transcript of the video.


Link: https://edutube.emc.com/Player.aspx?vno=w/pzpXjL6ZCFlcdx0riu5A

Although the default view of a cluster is that of one physical machine, you can
partition a cluster into multiple virtual containers called access zones. Access
zones enable you to isolate data and control who can access data in each zone.
Access zones support configuration settings for authentication and identity
management services on a cluster. Configure authentication providers and
provision protocol directories, such as SMB shares and NFS exports, on a
zone-by-zone basis. Creating an access zone automatically creates a local provider,
which enables you to configure each access zone with a list of local users and groups.
You can also authenticate through a different authentication provider in each
access zone.

Access Control Architectural Components

The OneFS identity management maps users and groups from separate directory
services to provide a single combined identity. It also provides uniform access
control to files and directories, regardless of the incoming protocol.

The table defines the components of access zones.


External Protocols


Clients use the external access protocols to connect to the PowerScale cluster.
The supported protocols are SMB, NFS, S3, HTTP, FTP, HDFS, and SWIFT.

lsassd Daemon


The lsassd (L-sass-d) daemon mediates between the external protocols and the
authentication providers, with the daemon contacting the external providers for user
lookups.


External Providers

Besides external protocols, there are also external authentication providers.
External directories hold lists of users that the internal providers contact to verify
user credentials. Once a user identity has been verified, OneFS generates an
access token. The access token is used to allow or deny a user access to the files
and folders on the cluster.

Internal Providers

Internal providers sit within the cluster operating system and are the Local and
File providers.
• File provider - authoritative third-party source of user and group information.
• Local provider - provides authentication and lookup facilities for user accounts
added by an administrator.
• Local provider automatically created in access zone.


Access Zone Planning - Base Directory


1: Separate authentication from /ifs/eng access zone.

2: Access zone base directories for dvt and eng.

3: The /ifs/eng/hardware directory can be a base directory for another access
zone. This is not a good practice.

4: The /ifs/eng base directory partitions data from the /ifs/dvt directory.

5: The base directory of the default System access zone is /ifs and cannot be
modified. Avoid using the OneFS built-in directories as base directories.

A base or root directory defines the tree structure of the access zone.

The access zone cannot grant access to any files outside of the base directory,
essentially creating a unique namespace.

Using access zones is the recommended method of separating data. However, a
few workflows can benefit from having one access zone being able to see the
dataset of another access zone.

Overlapping example: Creating /ifs/eng/hardware as an access zone base directory,
which is inside the eng access zone base directory. Overlapping access zones
enable the eng workers to put data on a cluster, while enabling the dvt workers to
take that data and use it. When you set it up this way, you maintain the different
authentication contexts while enabling the second group access.
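
The base directory rules above (a zone cannot reach outside its base, and one base may sit inside another) can be sketched with path comparisons. This is an illustrative model only, not OneFS code. The zone name "hardware" and the file names are assumptions added for the example.

```python
from pathlib import PurePosixPath

def is_within(path: str, base: str) -> bool:
    """True when path is the base directory itself or lies beneath it."""
    p, b = PurePosixPath(path), PurePosixPath(base)
    return p == b or b in p.parents

def overlapping_bases(zones: dict[str, str]) -> list[tuple[str, str]]:
    """Return (outer, inner) zone-name pairs whose base directories overlap."""
    pairs = []
    for outer, outer_base in zones.items():
        for inner, inner_base in zones.items():
            if outer != inner and is_within(inner_base, outer_base):
                pairs.append((outer, inner))
    return pairs

# Hypothetical zone-to-base-directory mapping based on the example above.
zones = {"eng": "/ifs/eng", "dvt": "/ifs/dvt", "hardware": "/ifs/eng/hardware"}
print(is_within("/ifs/eng/hardware/specs.txt", "/ifs/eng"))  # True
print(is_within("/ifs/dvt/report.txt", "/ifs/eng"))          # False
print(overlapping_bases(zones))  # [('eng', 'hardware')]
```

Comparing path components rather than raw string prefixes avoids treating /ifs/engineering as if it were inside /ifs/eng.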


Access Zone Configuration - Demonstration

This demonstration provides a look at access zone configuration. See the student
guide for a transcript of the video.


Link: https://edutube.emc.com/Player.aspx?vno=08ieHpVlyvyD+A8mTzHopA

In this demonstration, we will go through the steps to create access zones using
the WebUI and the CLI. First, let’s use the WebUI.

Navigate to Access and then to the Access zones page. Note that the System
access zone is shown in the table. The System zone is created by OneFS. Select
the Create an access zone button. In the window, enter the zone name for the new
access zone. Next enter the zone base directory. This should be unique, and you
should avoid using the OneFS built-in directories such as /ifs/data. Our base
directory is /ifs/sales.

Since we have not created this directory before creating the access zone, select
the checkbox to create the base directory automatically. Notice that we already
configured the authentication providers. This access zone is dedicated for the
Active Directory users. Add the AD provider and then select Create zone.


Next, we will create another access zone using the CLI. We are logged in via SSH
to node 1 and using the isi zone command. The name of this access zone is
engineering. The unique base directory is /ifs/engineering. Since the
/ifs/engineering directory does not exist, use the option to create it. And
finally, we will add the LDAP authentication provider to the zone.

Next verify that the zones are created. Use the list option. Moving back to the
WebUI, check the access zone page to verify the zones display. Instead of waiting
for the refresh, click on another page and then back.

This demonstration showed configuring access zones using the WebUI and the
CLI. This concludes the demonstration.

Access Zone Considerations

Listed are areas to consider when configuring and discussing access zones.
• The number of access zones should not exceed 50.
• As a good practice, configure an access zone for a specific protocol if multi-
protocol access is not needed. For example, an implementation with both NFS
and SMB access should have an access zone for the NFS access and another
access zone for the SMB access.
• Access zones and authentication providers must be in only one groupnet.
• Authentication sources are joined to the cluster and "seen" by access zones -
multiple instances of the same provider in different access zones are not
recommended.
• Authentication providers are not restricted to one specific zone.
• Join additional AD providers only when they are not in the same forest
(untrusted forest).
• Shared UIDs in the same zone can potentially cause UID/GID conflicts.
• You can overlap data between access zones for cases where workflows require
shared data - however, overlapping adds complexity that may lead to issues
with client access.

Access Zone Best Practices

You can avoid configuration problems on the cluster when creating access zones
by following best practices guidelines.


Best Practice - Detail

• Create unique base directory - Achieves data isolation. Base directories
can overlap only if workflows share data.
• System zone is for global admin access only - Employ ZRBAC for zone
administration.
• Create zones to isolate data for different clients - Do not isolate if
workflow requires shared data.
• Avoid overlapping UID/GID ranges for providers in same zone - Potential for
UID/GID conflicts if overlap in same zone.

Challenge

Lab Assignment: You have the authentication providers configured,
now create the access zones for the environment and add the
authentication providers to the access zones.


Groupnets

Scenario

IT Manager: You should configure the access zones before you
configure networking. Now, you will examine the networking
components of OneFS. Ensure you understand groupnets and how
groupnets strengthen multitenancy.

Your Challenge: The IT manager has tasked you to explain groupnets.

Network Configuration Planning

Configure tenant and DNS server

Configure SmartConnect IP address, VLAN, and MTU on the subnet

Configure dynamic or static IP address pool for node external network ports

Groupnets reside at the top tier of the networking hierarchy and are the
configuration level for managing multiple tenants on your external network.

Groupnets contain one or more subnets.

By default, OneFS builds Groupnet0, Subnet0, and Pool0.


A subnet can also be called the SmartConnect zone and contains one or more pools.
Pools enable more granular network configuration.
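
The groupnet, subnet, and pool tiers can be pictured as nested records. The sketch below is a teaching model only and does not reflect OneFS internals. The class names, field names, and all addresses are assumptions. Only the default object names (groupnet0, subnet0, pool0) come from the text.

```python
from dataclasses import dataclass, field

@dataclass
class Pool:
    name: str
    ip_range: tuple[str, str]       # first and last address in the pool
    access_zone: str = "System"

@dataclass
class Subnet:
    name: str
    cidr: str
    pools: list[Pool] = field(default_factory=list)

@dataclass
class Groupnet:
    name: str
    dns_servers: list[str] = field(default_factory=list)
    subnets: list[Subnet] = field(default_factory=list)

    def pool_paths(self) -> list[str]:
        """List every pool as groupnet.subnet.pool."""
        return [f"{self.name}.{s.name}.{p.name}"
                for s in self.subnets for p in s.pools]

# Illustrative addresses; only the object names match the OneFS defaults.
default = Groupnet(
    "groupnet0",
    dns_servers=["192.168.0.2"],
    subnets=[Subnet("subnet0", "192.168.0.0/24",
                    pools=[Pool("pool0", ("192.168.0.10", "192.168.0.20"))])],
)
print(default.pool_paths())  # ['groupnet0.subnet0.pool0']
```

DNS settings live on the groupnet in this model, which matches the point that a new groupnet is needed only when separate DNS settings are required.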

Multi-Tenancy Overview

Example of two tenants with a separate groupnet:
• SmartConnect: isilon.xattire.com on 192.168.0.0/24
• SmartConnect: isilon.gearitup.com on 192.168.2.0/24

Groupnets are the configuration level for managing multiple tenants39 on the
external network of the cluster.

Multi-tenancy is the ability to host multiple organizations in a single cloud,
application, or storage device. Each organization in the environment is called a
tenant.

In the X-Attire scenario, the solution must treat each business unit as a separate
and unique tenant with access to the same cluster. The graphic shows how each
organization has its own groupnet and access zone.

39 Even with no plans to use multi-tenancy, a good practice is to organize data
based on access zones. Organizing is for both security purposes and to enable
compartmentalization of failover by, for instance, AD domain.


Multi-tenancy Considerations

Groupnets are an option for those clusters that will host multiple companies,
departments, or clients that require their own DNS settings. Some areas to
consider are:
• DNS settings are per groupnet
• Create another groupnet only if separate DNS settings are required.
• Follow proper build order:
1. Create groupnet
2. Configure authentication provider
3. Create access zone, and add authentication provider
4. Configure subnet with SmartConnect
5. Create pool, and add access zone
• In a multiple tenant solution, a share can span access zones. Combining
namespaces and overlapping shares is an administrative decision.

Important: Leave the System zone in Groupnet0.

Groupnets and Access Zones Video

This video provides an overview of the groupnet and access zone relationship. See
the student guide for a transcript of the video.


Link: https://edutube.emc.com/Player.aspx?vno=b4A2l5FzF2na/Txqk2AUTA==&autoplay=true

Because groupnets are the top networking configuration object, they have a close
relationship with access zones and the authentication providers. Having multiple
groupnets on the cluster means that you are configuring access to separate and
different networks, which are shown as org1 and org2. Different groupnets enable
portions of the cluster to have different networking properties for name resolution.
Configure another groupnet if separate DNS settings are required. If necessary, but
not required, you can have a different groupnet for every access zone. The
limitation of 50 access zones enables the creation of up to 50 groupnets.

When the cluster joins an Active Directory server, the cluster must know which
network to use for external communication to the external AD domain. Because of
this, if you have a groupnet, both the access zone and authentication provider must
exist within the same groupnet. Access zones and authentication providers must exist
within only one groupnet. Active Directory provider org2 must exist within the
same groupnet as access zone org2.
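
The rule that an access zone and its authentication providers must share one groupnet can be expressed as a simple consistency check. This is a sketch against a hypothetical inventory. The provider names, the zone named "bad", and the groupnet assignments are invented for illustration.

```python
# Hypothetical inventory: provider name -> groupnet, zone name -> settings.
providers = {"ad-org1": "groupnet0", "ad-org2": "groupnet1"}
zones = {
    "org1": {"groupnet": "groupnet0", "providers": ["ad-org1"]},
    "org2": {"groupnet": "groupnet1", "providers": ["ad-org2"]},
    "bad":  {"groupnet": "groupnet0", "providers": ["ad-org2"]},
}

def groupnet_mismatches(zones, providers):
    """Return (zone, provider) pairs that do not share a groupnet."""
    problems = []
    for zone_name, zone in zones.items():
        for provider in zone["providers"]:
            if providers[provider] != zone["groupnet"]:
                problems.append((zone_name, provider))
    return problems

print(groupnet_mismatches(zones, providers))  # [('bad', 'ad-org2')]
```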


WebUI for Configuration

The maximum number of DNS server IP addresses is limited to 3

Rotate is the only option

The maximum number of DNS search domains is limited to 6

DNS caching for the groupnet

Enable appending node DNS search lists to client DNS inquiries directed at
SmartConnect service IPs

The graphic shows the Cluster management > Network configuration > external network > Add
a groupnet window.

When creating a groupnet with access zones and providers in the same zone, you
need to create them in the proper order:
1. Create the groupnet.
2. Create the access zone and assign to the groupnet.
3. Create the subnet and pool.
4. Add the authentication providers and associate them with the groupnet.
5. Associate the authentication providers with the access zone.

CLI for Configuration

When creating a groupnet with access zones and providers in the same zone, you
should create them in the proper order.

Order - Function - Command Syntax

1. Create groupnet - isi network groupnets create <id> --dns-servers=<ip>
   Example: isi network groupnets create groupnet1 --dns-servers 192.168.4.10 --dns-search org1.com
2. Create authentication providers - isi auth ads create <name> <user> --groupnet=<groupnet name>
3. Create access zone - isi zone zones create <name> <path> --auth-providers=<list of auth providers> --groupnet=<groupnet name>
4. Create subnet - isi network subnets create <id> <addr-family> {ipv4 | ipv6} <prefix-len>
5. Create pool - isi network pools create <id> --access-zone=<zone name>

Tip: You cannot recreate an already defined subnet. A defined
subnet is only used once.


Challenge

IT Manager:
Because you configure the network components together, you will not
go to the lab until the other topics are discussed. Open participation
question:
Question: When would you create a groupnet?


Subnet - SmartConnect Zones

Scenario

IT Manager: I do not understand what the function of SmartConnect is.
I would like you to do some research and set it up to see what it does.

Your Challenge: The IT manager wants you to explain the
SmartConnect benefits and configure SmartConnect.

SmartConnect Overview Video

This video provides an overview of SmartConnect. See the student guide for a
transcript of the video.


Link: https://edutube.emc.com/Player.aspx?vno=L7mXSvTcNQl8+LLKzNEzkw

SmartConnect enables client connections to the storage cluster using a single
hostname or however many host names a company needs. It provides load
balancing and dynamic NFS failover and failback of client connections across
storage nodes to provide optimal utilization of the cluster resources. SmartConnect
eliminates the need to install client-side drivers, enabling administrators to manage
large numbers of clients if a system fails.

SmartConnect provides name resolution for the cluster. The cluster appears as a
single network element to a client system. Both cluster and client performance can
be enhanced when connections are more evenly distributed.

SmartConnect simplifies client connection management. Based on user-configurable
policies, SmartConnect Advanced applies intelligent algorithms (such as CPU
utilization, aggregate throughput, connection count, or round-robin).
SmartConnect distributes clients across the cluster to optimize client performance.
SmartConnect can be configured into multiple zones that can be used to ensure
different levels of service for different groups of clients. SmartConnect can remove
nodes that have gone offline from the request queue, and prevent new clients from
attempting to connect to an unavailable node. Also, SmartConnect can be
configured so new nodes are automatically added to the connection balancing pool.
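
Two of the balancing policies named above, round-robin and connection count, can be contrasted with a small simulation. This is a conceptual sketch and not the SmartConnect implementation. The class names, the address list, and the starting connection counts are assumptions.

```python
from itertools import cycle

class RoundRobinPolicy:
    """Hand out node addresses in a fixed rotation."""
    def __init__(self, addresses):
        self._rotation = cycle(addresses)

    def next_address(self):
        return next(self._rotation)

class ConnectionCountPolicy:
    """Hand out the address that currently has the fewest connections."""
    def __init__(self, addresses):
        self.connections = {addr: 0 for addr in addresses}

    def next_address(self):
        addr = min(self.connections, key=self.connections.get)
        self.connections[addr] += 1
        return addr

pool = ["192.168.0.11", "192.168.0.12", "192.168.0.13"]  # illustrative node IPs

rr = RoundRobinPolicy(pool)
print([rr.next_address() for _ in range(4)])
# ['192.168.0.11', '192.168.0.12', '192.168.0.13', '192.168.0.11']

cc = ConnectionCountPolicy(pool)
cc.connections["192.168.0.11"] = 5   # pretend the first node is already busy
print([cc.next_address() for _ in range(3)])
# ['192.168.0.12', '192.168.0.13', '192.168.0.12']
```

Round-robin ignores load, so the busy node still receives every third request. The connection-count policy steers new clients away from it until the other nodes catch up.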

In Isilon OneFS 8.2, SmartConnect supports connection service for 252 nodes.

SmartConnect Architecture

The example shows two unique groups using the same cluster, X-Attire and GearItUp:
• SmartConnect: isilon.xattire.com, subnet 192.168.0.0/24, SIP: 192.168.0.100 - 192.168.0.104
• SmartConnect: isilon.gearitup.com, subnet 192.168.2.0/24, SIP: 192.168.2.100 - 192.168.2.104

You can configure SmartConnect into multiple zones to provide different levels of
service for different groups of clients.


For example, SmartConnect directs X-Attire users to F800 nodes for their needed
performance. GearItUp users access the H500 nodes for general-purpose file
sharing. The zones are transparent to the users.

The SmartConnect Service IPs40 (SSIP or SIP) are addresses that are part of the
subnet.

Important: To configure SmartConnect, you must also create
records on the customer DNS servers. If the clients use DNS for
name resolution, configure the DNS server to forward cluster name
resolution requests to the SmartConnect service.

SmartConnect Licensing

The table shows the differences between SmartConnect Basic and
SmartConnect Advanced.

SmartConnect Basic (unlicensed)
• Static IP allocation
• Multiple subnets
• Single pool per subnet
• Single SC DNS zone per subnet
• One balancing option: Round-robin

SmartConnect Advanced (licensed)
• Dynamic allocation
• NFSv3 failover
• Plus static IP allocation
• Multiple subnets
• Multiple pools per subnet
• Multiple SC zone names per subnet
• Four balancing options: Round-robin, Connection count, Throughput, and CPU usage
• Can set rebalance policy
• Up to 6 SSIPs per subnet

40 Do not put the SIPs in an address pool. The SIPs are virtual IP addresses within
the PowerScale configuration; they are not bound to any of the external interfaces.

SmartConnect Configuration Components

The SIPs, SmartConnect zone, and the DNS entries are the configuration
components for SmartConnect.

• SmartConnect service IPs
− IP addresses pulled out of subnet
− Never used in pool
− Interfaces with DNS server
− Minimum of two, maximum of six per subnet
• SmartConnect Zone name
− One name per pool
− Friendly name for users (seen as servers on the network)
− sales.isilon.xattire.com - \\sales
− mktg.isilon.xattire.com - \\mktg
• DNS:
− Add NS delegation record for SmartConnect Zone.
− Add A or AAAA record for the SmartConnect Service IPs.
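
The two DNS entries listed above can be sketched as zone-file style lines. The helper below is illustrative only. The host name ssip.xattire.com is a hypothetical name for the service IPs, and the exact record layout depends on the DNS server in use.

```python
import ipaddress

def delegation_records(zone: str, ssip_host: str, ssips: list[str]) -> list[str]:
    """Build zone-file style lines: one NS delegation plus an address record per SSIP."""
    records = [f"{zone}. IN NS {ssip_host}."]
    for ip in ssips:
        # IPv4 service IPs need an A record, IPv6 service IPs an AAAA record.
        rtype = "A" if ipaddress.ip_address(ip).version == 4 else "AAAA"
        records.append(f"{ssip_host}. IN {rtype} {ip}")
    return records

for line in delegation_records("isilon.xattire.com", "ssip.xattire.com",
                               ["192.168.0.100", "192.168.0.101"]):
    print(line)
```

The NS record delegates the SmartConnect zone name to the service IP host, and the address records tell the DNS server where that host can be reached.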

SmartConnect Configuration - Create SmartConnect Zone Demonstration

This demonstration shows the initial network configuration for the cluster. See the
student guide for a transcript of the video.


Link: https://edutube.emc.com/Player.aspx?vno=4hL0i4iBe2BLqJzlT4dN/Q

In this demonstration, we’ll go through the steps for an initial configuration of the
cluster external network. The demonstration shows configuring SmartConnect and
a dedicated pool for an access zone.

First, log in to the WebUI and navigate to the Cluster management, Network
configuration page. The External network tab is selected by default. Note that
groupnet0 and subnet0 are automatically created by OneFS. On the subnet0 line,
select View / Edit. There are no values for SmartConnect. Select Edit. Go to the
SmartConnect service IPs and enter the range of SmartConnect IP addresses.
OneFS versions prior to OneFS 8.2 do not allow you to enter a range of IP
addresses. For this demonstration we will be using a SmartConnect service name.

Select Save changes. The CLI equivalent to add the SmartConnect service
address is the isi network subnets modify command. Now that
SmartConnect is configured, we will configure the IP address pool for the access
zone. On the subnet0 line, click on the More dropdown and select Add pool.

Enter the pool name and then select the access zone. For this implementation the
authentication providers and the access zones are already created.

Next enter the range of IP addresses for this pool. Select the external node interfaces
that will carry the client traffic. The SmartConnect basic fully qualified zone name is
sales.dees.lab. We have the SmartConnect advanced license activated. Here is
where we can configure the advanced functions. For the demonstration, we will
keep the default settings. Select Add pool. The CLI equivalent to create a pool is
the isi network pools create command.

This demonstration showed the initial configuration of the network. This concludes
the demonstration.

SmartConnect Considerations

Listed are some areas to consider when discussing SmartConnect.
• DNS Integration:
− DNS primer
− DNS host record
− DNS Delegation best practices
− Cluster name resolution process example
• Never put SIP address in an IP address pool.
• Start with round-robin balancing then modify for workflow.
• DNS servers (not SmartConnect) handle the client DNS requests.
• Ensure that appropriate firewall ports are open.
• SyncIQ requires static allocation.
• Static pools are best used for stateful clients, and dynamic pools are best for
stateless clients.
• Time-to-live value41.

Challenge

IT Manager:
Because you configure the network components together, you will not
go to the lab until the other topics are discussed. Open participation
question:
Question: What are the SmartConnect Advanced benefits?

41 The SmartConnect DNS delegation server answers DNS queries with a time-to-live of
0 so that the answer is not cached. Not caching the answer distributes the IP
addresses successfully. Certain DNS servers, such as Windows Server 2003,
2008, and 2012, fix the value to one second. When many clients request an address
within the same second, all of them receive the same address. In some
situations, barriers to deploying SmartConnect happen, in which case other means
should be specified in the solution design.


IP Address Pools

Scenario

IT Manager: So that covers networking at the groupnet and subnet
levels. Now, examine IP address pools and then configure networking
on our cluster.

Your Challenge: The IT manager has tasked you to discuss the IP
address pool settings and configure IP address pools.

IP Address Pools

OneFS configures groupnet0, subnet0, pool0

Control connectivity to access zones

WebUI Cluster management > Network configuration page.

IP address pools are allocated to external network interfaces.

Additional subnets are configured as either IPv4 or IPv6 subnets. Additional IP
address pools are created within subnets and associated with a node, a group of
nodes, NIC ports, or aggregated ports.

The pools of IP address ranges in a subnet enable you to customize42 how users
connect to your cluster.

42 Customization is vital for clusters that have different node types.


Use case: Say that X-Attire adds 4 F800 nodes for a video media group. X-Attire
wants the video media team to connect directly to the F800 nodes to use various
high I/O applications. The administrators can separate the X-Attire connections.
Access to the home directories connects through the front end of the H500 nodes,
while the video media group accesses the F800 nodes. This segmentation keeps the
home directory users from using bandwidth on the F800 nodes.

Link Aggregation

Aggregation combining the two physical interfaces (physical NICs) into a single,
logical interface (logical NIC).

Configure link aggregation, or NIC aggregation, on the pool.

Configure for each node - cannot configure across nodes.

Link aggregation provides improved network throughput and redundancy.

The network interfaces are added to an IP address pool one at a time or as an
aggregate.

Aggregation modes43 apply to all aggregated network interfaces in the IP address
pool.

43 The link aggregation mode determines how traffic is balanced and routed among
aggregated network interfaces.


Link Aggregation Modes

OneFS supports dynamic and static aggregation modes.


LACP

Link Aggregation Control Protocol, or LACP, is a dynamic aggregation mode that
supports the IEEE 802.3ad standard.

Configure LACP at the switch level and on the node. LACP enables the node to
negotiate interface aggregation with the switch.

LACP mode is the default aggregation mode.

The graphic shows the PowerScale node negotiating interface aggregation with the
switch and balancing outgoing traffic across the physical NICs of the logical NIC.

Round Robin

Round robin is a static aggregation mode that rotates connections through the
nodes in a first-in, first-out sequence, handling all processes without priority.

Round robin balances outbound traffic across all active ports in the aggregated link
and accepts inbound traffic on any port.

Client requests are served one after the other based on their arrival.


The graphic shows connections rotating across the physical NICs in a first-in,
first-out sequence: client request 2, client request 3, and so on follow client
request 1.

Note: Round Robin is not recommended if the cluster is using TCP/IP workloads.

Failover

Active/Passive failover is a static aggregation mode that switches to the next active
interface when the primary interface becomes unavailable. The primary interface
handles traffic until there is an interruption in communication. At that point, one of
the secondary interfaces takes over the work of the primary.

In the graphic, the active interface serves the incoming client requests. If that
interface becomes unavailable or is interrupted due to an issue, the next active
interface takes over and serves the upcoming client request.
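
The active/passive behavior described above reduces to "use the first interface that still has link." The sketch below models that idea under stated assumptions: the function name, the interface names, and the link-state dictionary are illustrative and do not represent how OneFS tracks link status.

```python
def pick_active(interfaces: list[str], link_up: dict[str, bool]) -> str:
    """Return the first interface, in priority order, whose link is up."""
    for name in interfaces:
        if link_up.get(name, False):
            return name
    raise RuntimeError("no interface in the aggregate has link")

order = ["ext-1", "ext-2"]          # ext-1 is treated as the primary
print(pick_active(order, {"ext-1": True, "ext-2": True}))   # ext-1
print(pick_active(order, {"ext-1": False, "ext-2": True}))  # ext-2
```

Only one interface carries traffic at a time in this mode, so it adds redundancy but not throughput.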

FEC

Fast Ethernet Channel, or FEC, is a static aggregation method.

Typically used with older Cisco switches - LACP preferred in new generation
PowerScale nodes.


FEC accepts all incoming traffic and balances outgoing traffic over aggregated
interfaces based on hashed protocol header information that includes source
and destination addresses.

The graphic shows a PowerScale node whose logical NIC accepts and serves all incoming client requests on Physical NIC 1 and Physical NIC 2, and balances outgoing traffic over the aggregated interfaces.
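The hash-based balancing of outgoing traffic can be pictured with the sketch below. The actual hash that FEC uses is not described here, so CRC32 over the source and destination addresses is purely an assumption for illustration, as are the interface names and the `pick_interface` helper.

```python
import zlib

def pick_interface(src_addr, dst_addr, interfaces):
    """Choose an outgoing interface from a hash of the header addresses."""
    key = f"{src_addr}|{dst_addr}".encode()
    return interfaces[zlib.crc32(key) % len(interfaces)]

nics = ["nic-1", "nic-2"]
first = pick_interface("10.0.0.5", "192.168.0.110", nics)
again = pick_interface("10.0.0.5", "192.168.0.110", nics)
print(first, again)
```

Because the choice depends only on the header fields, packets for the same source and destination pair always leave through the same interface.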

Link Aggregation Mapping

Logical network interface, or LNI, numbering corresponds to the physical positioning of the NIC ports as found on the back of the node.

Network interfaces added to an IP address pool as an aggregated interface are included when viewing a list of network interfaces on a node. Aggregated LNIs are listed in the order in which they are created. NIC names correspond to the network interface names as shown by command-line tools such as ifconfig and netstat.

Generation   Logical Network Interface (LNI)     Aggregated LNI

Gen 5        ext-1, ext-2, ext-3, ext-4          • ext-agg = ext-1 + ext-2
                                                 • ext-agg2 = ext-3 + ext-4
                                                 • ext-agg3 = ext-1 + ext-2 + ext-3 + ext-4

             ext-1, ext-2, 10gige-1, 10gige-2    • ext-agg = ext-1 + ext-2
                                                 • 10gige-agg-1 = 10gige-1 + 10gige-2

Gen 6        10gige-1, 10gige-2                  10gige-agg-1 = 10gige-1 + 10gige-2

             40gige-1, 40gige-2                  40gige-agg-1 = 40gige-1 + 40gige-2

             mgmt-1                              1GbE interface for system management

Note: The number of logical network interfaces varies based on node model.

Allocation Method

An administrator can choose an allocation method of either static pools or dynamic pools when configuring IP address pools on the cluster.


Static

Output showing the network settings.

A static pool allocates only one IP address at a time from its range of IP addresses. OneFS allocates a single IP address from the pool to the chosen NIC.

If there are more IP addresses than nodes, new nodes that are added to the pool get the additional IP addresses.

Once an IP address is allocated, the node keeps the address indefinitely unless the member interface is deleted from the pool or the node is removed from the cluster.


Dynamic

Dynamic IP allocation is only available with SmartConnect Advanced.

Dynamic pools are best used for stateless protocols such as NFSv3. Also configure dynamic pools for NFSv4 with continuous availability (CA).

Dynamic IP allocation ensures that all available IP addresses in the IP address pool are assigned to member interfaces when the pool is created.

Dynamic IP allocation has the following advantages:
• Enables NFS failover, which provides continuous NFS service on a cluster even if a node becomes unavailable.
• Provides high availability because the IP address is always available to clients.


Static and Dynamic Pools

The graphic shows two SmartConnect zones, each with a different IP allocation method. The static pool assigns one IP address to each NIC in the pool and does not reallocate addresses. The dynamic pool assigns multiple IP addresses to each NIC in the pool and reallocates the IP addresses after a failure.

Static pools are best used for SMB clients because of the stateful nature of the
SMB protocol.

Dynamic pools are best used for stateless protocols such as NFSv3. You can identify a dynamic range by the way the IP addresses are presented in the interface: as a range such as .110-.112 or .113-.115 instead of a single IP address such as .10.
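The difference between the two allocation methods can be sketched in Python as follows. The pool addresses, the NIC labels, and all three helper functions are hypothetical, and the redistribution on failure is a simplification: the real reallocation is governed by SmartConnect settings that this sketch does not model.

```python
def allocate_static(ips, nics):
    """Static: one IP per NIC; leftover IPs wait for NICs added later."""
    return {nic: [ip] for nic, ip in zip(nics, ips)}

def allocate_dynamic(ips, nics):
    """Dynamic: every IP in the pool is assigned across the member NICs."""
    alloc = {nic: [] for nic in nics}
    for index, ip in enumerate(ips):
        alloc[nics[index % len(nics)]].append(ip)
    return alloc

def fail_over(alloc, failed_nic):
    """Move the failed NIC's IPs to the surviving NICs (dynamic pools only)."""
    survivors = [nic for nic in alloc if nic != failed_nic]
    result = {nic: list(alloc[nic]) for nic in survivors}
    for index, ip in enumerate(alloc[failed_nic]):
        result[survivors[index % len(survivors)]].append(ip)
    return result

pool = [f"192.168.0.{host}" for host in range(110, 116)]  # six example IPs
nics = ["node1:ext-1", "node2:ext-1", "node3:ext-1"]

static = allocate_static(pool, nics)
dynamic = allocate_dynamic(pool, nics)
after_failure = fail_over(dynamic, "node2:ext-1")
print(static)
print(after_failure)
```

In the static case three addresses remain unassigned, and nothing moves if a node fails. In the dynamic case all six addresses stay reachable after the failure, which is why stateless clients can keep working.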

Challenge

Lab Assignment: You have the authentication providers and access zones configured. Now set up the SmartConnect zones and IP address pools.


Configuring Identity Management and Authorization


Role-Based Access Control

Scenario

IT Manager: The next topic that I want you to understand is administrative access. The organization has several administrators who need management access, but I do not want to give everybody root access.

Your Challenge: The IT manager has tasked you to add management access to the PowerScale cluster. Before you configure management access, ensure that you understand role-based access control, or RBAC, and zone-based RBAC, or ZRBAC. The manager expects you to describe RBAC and ZRBAC, explain built-in roles and privileges, and configure RBAC.


Overview

The graphic compares the WebUI for a configured user with restricted privileges, where restricted options are not displayed, with the WebUI for the root user.

RBAC and ZRBAC administration ties the ability to perform specific administrative functions to specific privileges.

A user who is assigned to more than one role has the combined privileges of those
roles.

The root and admin users can assign others to built-in or custom roles that have
login and administrative privileges to perform specific administrative tasks.

The example shows that user Jane is assigned the Backup Administrator role.
Many of the privileges that user Root has are not visible to user Jane.

Role-based access enables you to separate out some administrative privileges and
assign only the privileges that a user needs. Granting privileges makes access to
the configuration of the cluster less restrictive.
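The rule that a user in more than one role holds the combined privileges of those roles amounts to a set union. The sketch below assumes two hypothetical custom roles and a hypothetical membership table. The privilege names are the ones that appear in this guide, but the role contents are invented for illustration.

```python
# Hypothetical custom roles; the privilege sets are illustrative only.
roles = {
    "WinShares": {"ISI_PRIV_LOGIN_PAPI", "ISI_PRIV_SMB"},
    "ZoneAuth": {"ISI_PRIV_LOGIN_PAPI", "ISI_PRIV_AUTH"},
}
membership = {"jane": ["WinShares", "ZoneAuth"], "sera": ["ZoneAuth"]}

def effective_privileges(user):
    """Union of the privileges of every role the user is a member of."""
    privileges = set()
    for role in membership.get(user, []):
        privileges |= roles[role]
    return privileges

print(sorted(effective_privileges("jane")))
```

Privileges attach to roles, never directly to users, so removing a user from a role is the only way to take a privilege away.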

Roles

OneFS includes built-in administrator roles with predefined sets of privileges that you cannot modify. You can also create custom roles and assign privileges.


Built-in Roles

Built-in roles are included in OneFS and have been configured with the most likely privileges necessary to perform common administrative functions. You cannot modify the list of privileges that are assigned to each built-in role. However, you can assign users and groups to built-in roles.

• SecurityAdmin: enables security configuration on the cluster, including authentication providers, local users and groups, and role membership.
• SystemAdmin: enables administration of all cluster configuration that is not handled by the SecurityAdmin role.
• AuditAdmin: enables you to view all system configuration settings.
• BackupAdmin: enables backup and restore of files from /ifs.
• VMwareAdmin: enables remote administration of storage that is needed by VMware vCenter.


Custom roles

Custom roles supplement built-in roles.

You can create custom roles and assign privileges mapped to administrative areas in your PowerScale cluster environment.

Zone Built-in Roles

OneFS 8.2.0 introduces zone-aware RBAC, or ZRBAC. The ZRBAC feature enhancement provides flexibility for organization administrators to manage resources according to their specific organization.
• ZoneAdmin: enables administration of configuration aspects that are in the current access zone.
• ZoneSecurityAdmin: enables administration of security configuration aspects that are in the current access zone.

The following list describes what you can and cannot do through roles:
• You can assign privileges to a role but not directly to users or groups.
• You can create custom roles and assign privileges to those roles.
• You can copy an existing role.
• You can add any user or group of users to one or more roles as long as the users can authenticate to the cluster.


Role Creation Video

The video provides an overview of role creation. See the student guide for a transcript of the video.

Link: https://edutube.emc.com/Player.aspx?vno=tQkWrNubtdORFBHxoRlMAg

This demonstration shows the steps to configure role-based access control, or RBAC, and zone-aware RBAC, or ZRBAC. To frame the demonstration, I will use the scenario of two new members on the IT team. I will assign the users the minimum privileges needed to manage the cluster for their job role.

Log in as admin, a user that can assign privileges. Navigate to Access, Membership and roles. On the Membership and roles page, note that the access zone selected is System. Go to the Roles tab. Before moving on to the configuration, note that OneFS has a number of built-in roles that cover most access needs. There may be a need to define a custom role. In these instances, you can select the Create a Role button. I will demonstrate this in a moment. A great place to learn more about the different privileges is the Isilon OneFS Web Administration Guide.


Hayden is the administrator I am adding to the AuditAdmin role. Select View/Edit and then Edit role. Next, select Add a member to this role. In the Providers choices, select Active Directory DEES.lab. Then select the domain. Remember, you must join the cluster to the Active Directory domain to view the users. Hayden is a member of the dees.lab domain. Select Hayden. Notice that you cannot modify built-in roles by adding or removing privileges. Save the changes.

The next example is to add a Windows administrator, Sai, to the sales access zone. Adding Sai to a role specific to the access zone prevents him from accidentally configuring Windows shares in other zones. In fact, Sai will have no visibility into other zones. On the Roles tab, select the sales access zone. Note that the two built-in roles do not provide the level of access that Sai needs. Create a role. The role name is WinAdmin, and add a short description. Shown is the CLI command to create a zone role. Remember that OneFS version 8.2 introduces zone-aware roles. CLI commands in previous versions do not have the --zone option.

boston-2# isi auth roles create --zone sales WinAdmin

Just as in the previous example, add a member to this role. Select the provider and then the domain. Next, search for and select Sai. Now add privileges to the role. First, add the ability to log in to the WebUI. Next, add the privilege to configure SMB. Give Read/write access to this privilege. Now save the role.

boston-2# isi auth roles modify WinAdmin --zone sales --add-priv ISI_PRIV_LOGIN_PAPI --add-priv ISI_PRIV_SMB --add-user dees\\sai

Now verify the privileges of the users.

Log out and then log in as Hayden, the AuditAdmin. The first indication is the Access menu. Notice the options are missing. Navigating to Protocols, Windows sharing, notice Hayden cannot create a share, only view. Also, because Hayden was added to a System zone role, Hayden can audit information in other zones. System zone administrators are global.

Log out of the WebUI and log in as Sai. You must log in at an IP address or NetBIOS name associated with the sales access zone. Viewing the Access options, Sai does not have the privileges. Navigating to Protocols, Windows sharing, notice Sai cannot switch to another access zone, but can configure SMB shares. This demonstration stepped through configuring RBAC and ZRBAC. This concludes the demonstration.


Role Management

You can view, add, or remove members of any role. Except for built-in roles, whose
privileges you cannot modify, you can add or remove OneFS privileges on a role-
by-role basis.

View Roles

The table shows the commands that view role information.

Command                          Description

isi auth roles list              A basic list of all roles on the cluster

isi auth roles list --verbose    Detailed information about each role on the cluster, including member and privilege lists

isi auth roles view <role>       Detailed information about a single role, where <role> is the name of the role

View Privileges

Viewing user privileges is performed through the CLI. The table shows the commands that list your privileges or the privileges of another user.

Command                          Description

isi auth privileges --verbose    A list of privileges

isi auth id                      A list of your privileges

isi auth mapping token <user>    A list of privileges for another user, where <user> is the name of that user


Create, modify, and delete a custom role

You can create an empty custom role and then add users and privileges to the role.
Deleting a role does not affect the privileges or users that are assigned to it. Built-in
roles cannot be deleted.

The table shows the commands used to create, modify, and delete a custom role.

Command                                                      Description

isi auth roles create <name> [--description <string>]        To create a role, where <name> is the name that you want to assign to the role and <string> specifies an optional description

isi auth roles modify <role> [--add-user <string>]           To add a user to the role, where <role> is the name of the role and <string> is the name of the user

isi auth roles modify <role> [--add-priv <string>]           To add a privilege with read/write access to the role, where <role> is the name of the role and <string> is the name of the privilege

isi auth roles modify <role> [--add-priv-ro <string>]        To add a privilege with read-only access to the role, where <role> is the name of the role and <string> is the name of the privilege

isi auth roles delete <name>                                 To delete a custom role, where <name> is the name of the role that you want to delete
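When role changes are scripted, it can help to build the command line programmatically. The Python sketch below only assembles an argument list from the options shown in this guide (--zone, --add-priv, --add-priv-ro, --add-user). It does not run anything, and the `role_modify_cmd` helper is a hypothetical convenience, not part of OneFS.

```python
import shlex

def role_modify_cmd(role, zone=None, add_privs=(), add_privs_ro=(), add_users=()):
    """Build an 'isi auth roles modify' argument list (nothing is executed)."""
    cmd = ["isi", "auth", "roles", "modify", role]
    if zone:
        cmd += ["--zone", zone]
    for privilege in add_privs:        # read/write privileges
        cmd += ["--add-priv", privilege]
    for privilege in add_privs_ro:     # read-only privileges
        cmd += ["--add-priv-ro", privilege]
    for user in add_users:
        cmd += ["--add-user", user]
    return cmd

cmd = role_modify_cmd(
    "WinAdmin",
    zone="sales",
    add_privs=["ISI_PRIV_LOGIN_PAPI", "ISI_PRIV_SMB"],
    add_users=["dees\\sai"],
)
print(shlex.join(cmd))  # shell-quoted form, suitable for review before running
```

Keeping the arguments as a list avoids quoting mistakes with names that contain a backslash, such as a domain-qualified user.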

Privileges

Privileges grant access to features with read or read/write access. Administrators cannot modify built-in roles. ZRBAC provides flexibility for organization administrators to manage resources according to their specific organization.


List privileges

The graphic shows built-in roles that have a predefined set of privileges. Red outlines mark the only privileges available for ZRBAC. LOGIN_CONSOLE is needed to use SSH, and LOGIN_PAPI is needed to use the WebUI.

Note: The WebUI privilege names differ from the names that are seen in the CLI.

ZRBAC - ISI_PRIV_AUTH Privilege

The zone-based ISI_PRIV_AUTH privilege enables non-System zone administrators to create and modify their zone authentication providers.

The graphic shows a local zone administrator, jane, logged in.


1: If a zone is created by System zone admins, only System zone admins can modify and delete it. A local zone admin can only view and add access zones.

If a zone is created by a nonsystem zone admin, both the System zone admin and the nonsystem zone admin can view, modify, and delete it.

2: ISI_PRIV_AUTH enables the Access options.

3: A zone administrator is logged in.

4: The IP address is in the IP address pool that is associated with the access zone.

Challenge

Lab Assignment: Go to the lab and create user accounts for RBAC and
ZRBAC.


User Identity Mapping

Scenario

IT Manager: Before you begin to configure the Windows shares, NFS exports, and S3 buckets, you must be familiar with how OneFS manages identity.

Your Challenge: The IT manager has tasked you to determine the on-disk identity to configure on the cluster. Before configuring, you should have an understanding of how identity management works. The manager expects you to describe identity management, user tokens, and on-disk identity.

Layers of Access

1. Protocol: how the cluster is reached - SMB, NFS, S3, FTP, HTTP
2. Authentication: identifies users using NIS, local files, LDAP, or AD
3. Identity Assignment: based on authentication or mediated in cluster
4. Authorization: determines if user is authorized to access files

Cluster connectivity has four layers of interaction. The third layer is identity
assignment. The layer is straightforward and based on the results of the
authentication layer.

There are some cases that need identity mediation within the cluster, or where
roles are assigned within the cluster that are based on user identity.


The focus of this topic is identity assignment.

Network and AIMA Hierarchy

Authentication, identity management, and authorization, or AIMA, ties into the network hierarchy at different levels. The graphic shows how the AIMA hierarchy ties into the network hierarchy.

1. The user connects to a SmartConnect zone name, which is tied to a subnet and SSIP.
2. The SmartConnect zone name is mapped to an access zone. The access zone contains the authentication providers, directory services, user mapping, ID mapping, and generates user tokens.
3. The access zone has a base directory where file permissions and user identities on disk are applied.
4. Windows shares, NFS exports, and S3 buckets are created per access zone.

Identity Management

OneFS identity management maps the users and groups from separate services. The mapping provides a single unified identity on a cluster and uniform access control to files and directories, regardless of the incoming protocol.


Authentication providers and protocols are covered in other topics.

1: When the cluster receives an authentication request, lsassd searches the configured authentication sources for matches to an incoming identity. If the identity is verified, OneFS generates an access token. This is an internal token that reflects the OneFS identity management system. When a user attempts to access cluster resources, OneFS allows or denies access based on matching the identity, user, and group memberships to this same information on the file or folder.

2: OneFS uses the authentication providers to first verify a user identity, after which users are authorized to access cluster resources. The top layers are access protocols: NFS for UNIX clients, SMB for Windows clients, and FTP and HTTP for all.

3: Between the protocols and the lower-level service providers and their associated data repositories is the OneFS lsassd daemon. lsassd mediates between the authentication protocols that clients use and the authentication providers, which check their data repositories for user identity and file access.

Access Token Overview Video

The video describes the access token generation. See the student guide for a transcript of the video.

URL: https://edutube.emc.com/Player.aspx?vno=MmSHIH1OvcP5nHsi0hd51g==&autoplay=true

When the cluster receives an authentication request, lsassd searches the configured authentication sources for matches to the incoming identity. If the identity is verified, OneFS generates an access token. Access tokens form the basis of who you are when performing actions on the cluster. Shown is the output of the user's mapping token. The token supplies the primary owner and group identities to use during file creation. For most protocols, the access token is generated from the user name or from the authorization data that is received during authentication. Access tokens are also compared against permissions on an object during authorization checks. The access token includes all identity information for the session. OneFS exclusively uses the information in the token when determining if a user has access to a particular resource.

Access Token Generation

Access tokens form the basis of who you are when performing actions on the
cluster. The tokens supply the primary owner and group identities to use during file
creation. When the cluster builds an access token, it must begin by looking up
users in external directory services. By default, the cluster matches users with the
same name in different authentication providers and treats them as the same user.
The ID-mapping service populates the access token with the appropriate identifiers.
Finally, the on-disk identity is determined.

1. Look up user in external directory services
   - Over SMB: AD preferred, LDAP can be appended
   - Over NFS: LDAP or NIS only
2. Perform ID mapping
   - Only matches accounts with same name to map IDs from different directory services
3. Perform user mapping
   - Matches accounts to combine access tokens
4. Determine on-disk identity

Primary Identities

OneFS supports three primary identity types, UIDs, GIDs, and SIDs.

UIDs and GIDs from Local, NIS, LDAP providers range from 1 to 65k.

OneFS automatically allocates UIDs and GIDs from the range 1,000,000-
2,000,000.


1: The user identifier, or UID, is a 32-bit number that uniquely identifies users on the
cluster. UNIX-based systems use UIDs for identity management.

2: The security identifier, or SID, is a unique identifier that begins with the domain
identifier and ends with a 32-bit Relative Identifier (RID). Most SIDs take the form
S-1-5-21-<A>-<B>-<C>-<RID>, where <A>, <B>, and <C> are specific to a domain
or system, and <RID> denotes the object inside the domain. SID is the primary
identifier for users and groups in Active Directory.

3: The group identifier, or GID, for UNIX serves the same purpose for groups that
UID does for users.
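To make the SID layout concrete, the Python sketch below splits a SID into its domain identifier and trailing RID. The example SID value is made up, and the `split_sid` helper is hypothetical.

```python
def split_sid(sid):
    """Split a SID string into (domain identifier, RID)."""
    parts = sid.split("-")
    if len(parts) < 4 or parts[0] != "S":
        raise ValueError(f"not a SID: {sid!r}")
    rid = int(parts[-1])
    if not 0 <= rid < 2**32:  # the RID is a 32-bit value
        raise ValueError(f"RID out of range: {rid}")
    return "-".join(parts[:-1]), rid

# Made-up SID in the S-1-5-21-<A>-<B>-<C>-<RID> form.
domain_id, rid = split_sid("S-1-5-21-1004336348-1177238915-682003330-1105")
print(domain_id, rid)
```

Two accounts in the same domain share the domain identifier and differ only in the RID.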

Secondary Identities

Secondary identifiers are names, such as usernames. Different systems such as LDAP and Active Directory may not use the same naming convention to create object names. There are many variations to entering or displaying a name.


1: Windows provides a single namespace for all objects that is not case-sensitive,
but specifies a prefix that targets the dees Active Directory domain. UNIX assumes
unique case-sensitive namespaces for users and groups. For example, Sera and
sera can represent different objects.

2: Kerberos and NFSv4 define principals that require all names to have a format similar to an email address. For example, given username sera and the domain dees.lab, dees\sera and sera@dees.lab are valid names for a single object in Active Directory. With OneFS, whenever a name is provided as an identifier, the correct primary identifier of UID, GID, or SID is requested.
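The two name formats mentioned above can be separated into a domain part and a user part with a few lines of Python. The `parse_name` helper is hypothetical and is only a sketch; it does no lookup against a directory service, so it cannot tell that dees and dees.lab refer to the same domain.

```python
def parse_name(name):
    """Split a secondary identifier into (domain, user)."""
    if "\\" in name:                      # Windows style: DOMAIN\user
        domain, user = name.split("\\", 1)
    elif "@" in name:                     # principal style: user@domain
        user, domain = name.split("@", 1)
    else:                                 # bare UNIX-style name
        domain, user = None, name
    return domain, user

def same_windows_user(first, second):
    """Windows names are not case-sensitive, so compare case-insensitively."""
    return parse_name(first)[1].lower() == parse_name(second)[1].lower()

print(parse_name("dees\\sera"))
print(parse_name("sera@dees.lab"))
```

A UNIX comparison would skip the lowercasing, because Sera and sera can be different objects there.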

Multiple Identities

A concern for administrators when working in a multiprotocol environment is making sure that users are treated the same regardless of protocol access.

The graphic shows a user that has both a Windows and Linux account. Multiple
identity, or multiprotocol access, could include configuring mapping to ensure user
IDs correctly map to one another.

OneFS is RFC 2307 compliant. Enable RFC 2307 to simplify user mapping.


See the participant guide for information about mapping challenges and
considerations.

Mapping is done either through an external authentication provider or through user-mapping rules on the cluster. Another factor to consider is merging UIDs together on the cluster from different environments. Do not put UIDs from different environments and their authentication providers in the same access zone. When there are two identifiers for the same user, build the user token with all appropriate IDs. The final challenge in a multiprotocol environment is to appropriately apply the permissions. Verification may require some testing and experimenting on the administrator's part to fully understand what different permission settings mean when applied to a user.

ID Mapper Database

User ID mapping provides a way to control permissions by specifying security identifiers, user identifiers, and group identifiers.

1: The user mapper feature can apply rules to modify the user identity OneFS
uses, add supplemental user identities, and modify the group membership of a
user. The user mapping service combines user identities from different directory
services into a single access token. The mapping service then modifies it according
to the rules that you create.

2: OneFS uses the identifiers to check file or group ownership.


3: Mappings are stored in a cluster-distributed database that is called the ID mapper. The ID provider builds the ID mapper using the incoming source and target identity types (UID, GID, or SID). Only authoritative sources are used to build the ID mapper.

4: Each mapping is stored as a one-way relationship from source to destination. If a mapping is created, or exists, it has to map both ways. The two-way mappings are presented as two complementary one-way mappings in the database. When receiving an identity request, if a mapping exists between the specified source and the requested type, OneFS returns the mapping.
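The idea of storing a two-way mapping as two complementary one-way entries can be sketched with a dictionary keyed by the source identity and the requested target type. The `IdMapper` class below is a hypothetical in-memory stand-in, not the OneFS database, and the identities are made up.

```python
class IdMapper:
    """Toy ID mapper: each entry is one-way, keyed by (source, target type)."""

    def __init__(self):
        self._entries = {}

    def add(self, first, second):
        """Store a two-way mapping as two complementary one-way entries."""
        self._entries[(first, second[0])] = second
        self._entries[(second, first[0])] = first

    def lookup(self, source, target_type):
        """Return the mapped identity of the requested type, or None."""
        return self._entries.get((source, target_type))

mapper = IdMapper()
sid = ("SID", "S-1-5-21-1004336348-1177238915-682003330-1105")  # made-up SID
uid = ("UID", 2001)                                             # made-up UID
mapper.add(sid, uid)
print(mapper.lookup(sid, "UID"), mapper.lookup(uid, "SID"))
```

A lookup only succeeds when an entry exists for exactly that source and requested type, which mirrors the request handling described above.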

On-Disk Identity

The graphic shows the token of Windows user Sera with a UID as the on-disk identity. The on-disk identity identifies the preferred identity to store on disk and determines the identity that is stored in ACLs: SID or UID/GIDs.

OneFS uses an on-disk identity store for a single identity for users and groups.

The on-disk identity setting lets administrators choose to store the UNIX or the Windows identity, or lets the system determine the correct identity to store automatically.

Though OneFS creates a user token from information on other management systems, OneFS stores an authoritative version of the identity as the preferred on-disk identity.


On-Disk Identity Use Cases

The available on-disk identity types are Native, UNIX, and SID. The on-disk identity
is a global setting. Because most protocols require some level of mapping to
operate correctly, choose the preferred identity to store on-disk.

• Native is the default. It applies to most environments and assumes a mixed environment.
• UNIX is typical for a UNIX-only environment and stores UIDs and GIDs. For an incoming SID, OneFS stores the SID if no UID and GID are found.
• SID is typical for Windows-only environments and stores SIDs.

The use case for the default Native setting is an environment that has NFS and SMB client and application access. With the Native on-disk identity set, lsassd attempts to locate the correct identity to store on disk by running through each ID-mapping method. The preferred object to store is a real UNIX identifier. OneFS uses a real UNIX identifier when found. If a user or group does not have a real UNIX identifier (UID or GID), OneFS stores the real SID.
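The Native selection rule can be sketched as a small decision function. The sketch assumes that a UID in the automatically allocated 1,000,000-2,000,000 range does not count as a real UNIX identifier; that interpretation, along with the `native_on_disk_identity` helper and the sample values, is an assumption made for illustration.

```python
# Range that OneFS allocates automatically, per the Primary Identities topic.
AUTO_ALLOCATED = range(1_000_000, 2_000_001)

def native_on_disk_identity(sid, uid=None):
    """Prefer a real UNIX identifier; otherwise store the SID."""
    if uid is not None and uid not in AUTO_ALLOCATED:
        return ("UID", uid)
    return ("SID", sid)

example_sid = "S-1-5-21-1004336348-1177238915-682003330-1105"  # made-up SID
print(native_on_disk_identity(example_sid, uid=2001))
print(native_on_disk_identity(example_sid, uid=1_000_005))
print(native_on_disk_identity(example_sid))
```

The same preference order would apply to groups, with a GID in place of the UID.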

Troubleshooting Resources

For troubleshooting issues, first see: http://www.emc.com/collateral/TechnicalDocument/docu63138.pdf

For a list of all customer troubleshooting guides: OneFS Customer Troubleshooting Guides Info Hub.

Challenge

Your Challenge: It looks like you understand identity management. Now go to the lab and configure the on-disk identity type for the cluster.


Authorization

Scenario

IT Manager: The final topic to understand before creating the Windows shares, NFS exports, and S3 buckets is how OneFS handles permissions to the files and directories.

Your Challenge: The IT manager wants to ensure you can describe POSIX mode bits, Windows ACLs, and how OneFS handles both types of permissions.

Permissions Overview

Like identities, OneFS also stores permissions on disk. However, storing permissions is more complex than storing identities because each data access protocol uses its own permissions model. The individual files and folders that clients access over NFS or SMB can have UNIX permissions and Windows ACLs assigned.

Multi-protocol access is covered in greater detail in the PowerScale Advanced Administration course.


1: OneFS supports NFS and SMB protocols. Different clients access the same directories and files.

2: OneFS generates synthetic ACLs.

3: Authoritative permissions are stored on disk.

4: Clients have the same file access regardless of access protocol.

5: OneFS approximately maps ACLs and mode bits - no perfect one-to-one mapping exists.

6: OneFS supports two types of authorization data on a file: access control lists, or ACLs, and UNIX permissions, or POSIX mode bits.

Cross Protocol Access

ACL authoritative
• Extensive DACLs - granular access
• Approximated POSIX - only for representation for LS on export

POSIX authoritative
• Read-Write-Execute for User-Group-Others
• Synthetic ACLs - limited to 3 DACLs

To handle cross-protocol file access, OneFS stores an internal representation of the permissions of a file system object, such as a directory or a file. The internal representation, which can contain information from either the POSIX mode bits or the ACLs, is based on RFC 3530.

• State: A file can only be in one of the states at a time. That state is authoritative. The actual permissions on the file are the same, regardless of the state.
• Synthetic ACLs: OneFS uses the internal representation to generate a synthetic ACL, which approximates the mode bits of a UNIX file for an SMB client. Because OneFS derives the synthetic ACL from mode bits, it can express only as much permission information as mode bits can and not more.
• Authority: OneFS must store an authoritative version of the original file permissions for the file sharing protocol and map the authoritative permissions for the other protocol. OneFS must do so while maintaining the security settings for the file and meeting user expectations for access. The result of the transformation preserves the intended security settings on the files. The result also ensures that users and applications can continue to access the files with the same behavior.

POSIX Overview

In a UNIX environment, you modify permissions for users/owners, groups, and others to allow or deny file and directory access as needed. Set the permissions flags to grant permissions to each of these classes. Assuming the user is not root, the class determines access to the requested file.


1: User or owner permissions

2: Group permissions

3: Others or everyone permissions

4: Configure permission flags to grant read (r), write (w), and execute (x)
permissions to users, groups, and others in the form of permission triplets. The
classes are not cumulative. OneFS uses the first class that matches. Typically,
grant permissions in decreasing order, giving the highest permissions to the file
owner and the lowest to users who are not the owner or the owning group.

5: These permissions are saved in 16 bits, which are called mode bits.

6: The information in the upper 7 bits can also encode what the file can do,
although it has no bearing on file ownership. An example of such a setting would
be the “sticky bit.”
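The rule that classes are not cumulative and that the first matching class wins can be illustrated with a small model. This is a minimal sketch, not OneFS code: the UID, GID, and mode values are hypothetical, the root user is deliberately left out, and the helper names `matching_class` and `triplet_for` are invented for the illustration.

```python
import stat

def matching_class(file_uid, file_gid, uid, gids):
    """Return the first class that matches the requester (root is not modeled)."""
    if uid == file_uid:
        return "user"
    if file_gid in gids:
        return "group"
    return "other"

def triplet_for(mode, cls):
    """Extract the rwx triplet for one class from the mode bits."""
    shift = {"user": 6, "group": 3, "other": 0}[cls]
    bits = (mode >> shift) & 0o7
    return "".join(ch if bits & mask else "-"
                   for ch, mask in (("r", 4), ("w", 2), ("x", 1)))

# Hypothetical file: sticky bit set, owner ---, group r--, others rwx.
mode = stat.S_ISVTX | 0o047

# The requester owns the file and is also a member of the owning group.
cls = matching_class(file_uid=1001, file_gid=2000, uid=1001, gids={2000})
print(cls, triplet_for(mode, cls))
print("sticky bit set:", bool(mode & stat.S_ISVTX))
```

In this model the owner gets the empty triplet even though the group and others classes grant more, because only the first matching class is consulted.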

Important: OneFS does not support POSIX ACLs, which are different from
Windows ACLs.


POSIX in the WebUI

Only configurable as root

Triplets

9 mode bits

Triplet classes

Modify UNIX permissions in the WebUI on the File system > File system explorer page.

The graphic shows the root user logged in and the /ifs/boston/hr directory. Only
the root user can view and edit the owner and group of the object.

To assign read, write, or execute permissions to the specified account owner
(user), group members (group), and anyone (other), select or clear the mode bit
boxes. To apply setting changes, click Save changes.


chmod Command

read bit adds 4 to total, write bit adds 2, and execute bit adds 1

Changing the permissions on a directory so that group members and all others can only read the
directory.

OneFS supports the standard UNIX tools for changing permissions: chmod and
chown. The change mode command, chmod, can change permissions of files and
directories. The man page for chmod documents all options.

Changes that are made using chmod can affect Windows ACLs.
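The numeric argument to chmod follows from the weights noted above (read adds 4, write adds 2, execute adds 1). The sketch below only works through that arithmetic; the function names are invented for the illustration, and the owner triplet of rwx is an assumption, since the caption specifies only that group members and others can read.

```python
import stat

WEIGHTS = {"r": 4, "w": 2, "x": 1}

def triplet_digit(triplet):
    """Sum the weights of the permissions present in a triplet such as 'r-x'."""
    return sum(WEIGHTS[ch] for ch in triplet if ch != "-")

def octal_mode(user, group, other):
    """Combine three triplets into the numeric mode that chmod accepts."""
    return (triplet_digit(user) << 6) | (triplet_digit(group) << 3) | triplet_digit(other)

# Assumed owner triplet rwx; group and others are read-only.
mode = octal_mode("rwx", "r--", "r--")
print(oct(mode))
print(stat.filemode(stat.S_IFDIR | mode))
```

The resulting value is the one that would be passed as `chmod 744 <directory>` on the command line.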

chown Command

Only root user can change the owner

The output shows that penni is an LDAP user who is responsible for the content of the
/ifs/boston/hr directory.

The chown command is used to change ownership of a file. Changing the owner of
a file requires root user access. The basic syntax for chown is chown [-R]
newowner filenames. Using the -R option changes the ownership recursively,
including the subdirectories and their contents.

The chgrp command changes the group. View the man pages for command
definitions.

Windows ACLs Overview

Sales group ACL

Access control elements

No permissions = no access

List of basic permissions

List of advanced permissions

On Windows host: Properties > Security tab > Advanced > Edit window

In a Windows environment, ACLs define file and directory access rights.

While you can apply permissions for individual users, Windows administrators
usually use groups to organize users, and then assign permissions to groups
instead of individual users.

Group memberships can cause a user to have several permissions to a folder or file.

Windows includes many rights that you can assign individually, or you can assign
rights that are bundled together as permissions. For example, the Read & execute
permission includes the rights to read and execute a file, while the Full Control
permission assigns all user rights. Full Control includes the right to change
ownership and change the assigned permissions of a file or folder.

When working with Windows, note the important rules that dictate the behavior of
Windows permissions. First, if a user has no permission assigned in an ACL, then
the user has no access to that file or folder. Second, permissions can be explicitly
assigned to a file or folder, or they can be inherited from the parent folder. By
default, a new file or folder inherits the permissions of the parent folder. A file or
folder that is moved retains its original permissions. On a Windows client, if the
check boxes in the Permissions dialog are not available, the permissions are
inherited. You can explicitly assign permissions. Explicit permissions override
inherited permissions. The last rule to remember is that Deny permissions take
precedence over Allow permissions. However, an explicit Allow permission
overrides an inherited Deny permission.
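The precedence rules in the paragraph above can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the Windows or OneFS access-check algorithm: the `Ace` type, the trustee names, and the single-string rights are hypothetical and exist only to show the ordering (explicit before inherited, Deny before Allow, no entry means no access).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ace:
    trustee: str      # user or group name
    right: str        # for example "read" or "write"
    allow: bool       # True for Allow, False for Deny
    inherited: bool   # True if the entry comes from the parent folder

def is_allowed(acl, identities, right):
    """Evaluate one right for a user holding the given identities (user plus groups)."""
    matches = [a for a in acl if a.trustee in identities and a.right == right]
    # Explicit entries are evaluated before inherited ones; within each tier, Deny wins.
    for inherited in (False, True):
        tier = [a for a in matches if a.inherited == inherited]
        if any(not a.allow for a in tier):
            return False
        if any(a.allow for a in tier):
            return True
    return False  # no matching entry means no access

# Hypothetical ACL on a folder.
acl = [
    Ace("sales", "write", allow=False, inherited=True),   # inherited Deny
    Ace("penni", "write", allow=True, inherited=False),   # explicit Allow
    Ace("sales", "read", allow=True, inherited=True),     # inherited Allow
]

print(is_allowed(acl, {"penni", "sales"}, "write"))  # explicit Allow overrides inherited Deny
print(is_allowed(acl, {"sam", "sales"}, "write"))    # only the inherited Deny applies
print(is_allowed(acl, {"guest"}, "read"))            # no entry, no access
```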

ACL Permission Policy Settings

OneFS has configurable ACL policies that manage permissions. You can change
the default ACL settings globally or individually, to best support the environment.
The global permissions policies change the behavior of permissions on the system.
For example, selecting UNIX only changes the individual ACL policies to
correspond with the global setting. The permissions settings of the cluster are
handled uniformly across the entire cluster, rather than by each access zone.

The graphic shows the WebUI > Access > ACL policy settings page and how the
policy settings translate in the CLI command output. You can also use the "isi auth
settings acls modify" command to configure the ACL settings.

1: Use case: Permissions operate in a mixed UNIX and Windows environment.

2: Use case: Permissions operate with UNIX semantics - prevents ACL creation.

3: Use case: Permissions operate with Windows semantics - errors for UNIX
chmod.


4: Use case: Configure individual permission-policy settings. If a General ACL
Setting or Advanced ACL Setting needs changing, select the Custom
environment global setting.

Managing ACL Permissions

The output shows OneFS enhancements to the ls command.


1: The ls -le command shows the actual permissions stored on disk and the ACL
from the security descriptor.

2: The ls -len command shows the numerical (n) owner and group SID or UID/GID.

3: The ls -lean command shows hidden (a) directories.

4: The long format includes file mode, number of links, owner, group, MAC label,
number of bytes, abbreviated month, day file last modified, hour file last modified,
minute file last modified, and the path name.

OneFS takes advantage of standard UNIX commands and has enhanced some
commands for specific use with OneFS.

The list directory contents, ls, command provides file and directory permissions
information, when using an SSH session to the cluster. PowerScale has added
specific options to enable reporting on ACLs and POSIX mode bits.


Tip: The ls command options are all designed for long notation
format, which is displayed when the -l option is used. The -l
option also displays the actual permissions that are stored on disk.

Synthetic vs Advanced ACLs

Not stored anywhere - dynamically generated as needed and then discarded

Translated POSIX to an ACL - POSIX authoritative

Advanced ACLs on the file - ACLs authoritative

Running the ls -le command shows the synthetic ACLs for files and directories (the -d flag lists
directory entries).

A Windows client processes only ACLs; it does not process UNIX permissions.
When a Windows client views the permissions of a file, OneFS must translate
the UNIX permissions into an ACL.

Synthetic ACL is the name of the OneFS translation.

If a file has Windows-based ACLs (and not only UNIX permissions), OneFS
considers it to have advanced, or real ACLs56.

56 Advanced ACLs display a plus (+) sign when listed using ls -l or, as shown,
the ls -led command. POSIX mode bits are present when a file has a real ACL;
however, these bits are for protocol compatibility and are not used for access
checks.
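To make the idea of a synthetic ACL concrete, the sketch below derives three allow entries (owner, group, everyone) from a set of mode bits. It is an illustrative model only: the real OneFS synthetic ACL carries more detail than three lists of rights, and the function name, the tuple layout, and the "penni" and "hr" names used in the call are assumptions.

```python
RIGHTS = ((4, "read"), (2, "write"), (1, "execute"))

def synthetic_acl(mode, owner, group):
    """Approximate the three mode-bit triplets as three allow entries."""
    entries = []
    for trustee, shift in ((owner, 6), (group, 3), ("everyone", 0)):
        bits = (mode >> shift) & 0o7
        rights = [name for mask, name in RIGHTS if bits & mask]
        entries.append((trustee, "allow", rights))
    return entries

# Generated on demand from the mode bits; nothing is stored.
for entry in synthetic_acl(0o755, owner="penni", group="hr"):
    print(entry)
```

Because the entries are computed from the mode bits each time, they can never express more than the mode bits do, which matches the limitation described for synthetic ACLs.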


Permission Authority Video

The video discusses authentication and authorization. See the student guide for a
transcript of the video.

Link:
https://edutube.emc.com/html5/videoPlayer.htm?vno=EN8uMS3WuRwjY4Q0mIUaZw

Let us begin with a look at authentication and authorization. Whereas
authentication verifies a user identity, authorization grants a user or group
permission to access files and directories. Authentication is logging into a system
using credentials. When logged in, authorization is what gives the user different
levels of access. As an analogy, an employee badge with a security access code is
proof as to who the individual is. The badge grants access to the door to the
corporate building, thus the user has permission to enter. Share level permissions
work similarly in that users get access to the share before they can gain access to
any of the share directories. A user that has access to a directory (office) can then
access the files within the directory, provided that permission to the file is given.

Access to a folder on an Isilon cluster is determined through two sets of permission
entries: POSIX mode bits and Windows ACLs. The graphic shows the /dvt folder
and two shares that are created underneath it. SMB access depends on both of
these permissions, and when the share permissions combine with file or directory
permissions, OneFS enforces the most restrictive set of permissions. For example,
if a user has no write permission to the /dvt share, then the user cannot write to
the /linux and /win directories or files within the directories.
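One way to picture "the most restrictive set of permissions" is as the intersection of what the share grants and what the file or directory grants. The sketch below models rights as plain strings; the function name and the example values are hypothetical and do not represent how OneFS stores permissions.

```python
def effective_rights(share_rights, file_rights):
    """A right is effective only if both the share and the file grant it."""
    return set(share_rights) & set(file_rights)

share = {"read"}                           # hypothetical share permission: read-only
directory = {"read", "write", "execute"}   # hypothetical directory permission

effective = effective_rights(share, directory)
print(sorted(effective))
```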

Two options are available when creating a share, Do not change existing
permissions and Apply Windows default ACLs. Understand the Apply
Windows default ACLs settings. This setting can destroy or at a minimum alter
explicitly defined directory permissions that are created on the share. For example,
carefully migrated permissions can change, creating more work and the potential of
causing data unavailability. Files and directories can be either POSIX authoritative
or ACLs authoritative.

A synthetic ACL does not exist on the file system and is not stored anywhere.
Instead, OneFS generates a synthetic ACL as needed, and then discards it. OneFS
creates the synthetic ACL in memory when a client that only understands ACLs,
such as Windows clients, queries the permissions on a file that only has POSIX
permissions.

With synthetic ACLs, POSIX mode bits are authoritative. POSIX mode bits handle
permissions in UNIX environments and govern the synthetic ACLs. Permissions
are applied to users, groups, and everyone, and allow or deny file and directory
access as needed. The read, write, and execute bits form the permissions triplets
for users, groups, and everyone. The mode bits can be modified using the WebUI
or the CLI standard UNIX tools such as chmod and chown. Since POSIX governs
the synthetic ACLs, changes made using chmod change the synthetic ACLs. For
example, running chmod 775 on the /ifs/dvt directory changes the mode bits to
read-write-execute for group, changing the synthetic ACL for the group. The same
behavior happens when making the access more restrictive. For example, running
chmod 755 changes the synthetic ACL to its corresponding permission. The
chmod behavior is different when ACLs are authoritative.


In the example, the directory /ifs/dvt/win has a real ACL. The POSIX mode bits are
775. Running chmod 755 does not change the POSIX mode bits since merging
775 with 755 gives the combined value of 775. Shown is an excerpt from the Isilon
cluster WebUI page that shows the different behaviors.
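The transcript does not say how the two values are merged. Assuming the merge keeps every bit that is set in either value (a bitwise OR), the stated result can be reproduced as follows; the function name is invented, and the OR rule is an assumption made for this illustration rather than documented OneFS behavior.

```python
def merge_modes(existing, requested):
    """Assumed merge rule: keep every bit that is set in either value."""
    return existing | requested

merged = merge_modes(0o775, 0o755)
print(oct(merged))
```

Under that assumption the group write bit already present in 775 survives, so the displayed mode bits stay at 775.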

The first example shows that the share permission is everyone read-only although
the POSIX indicates read-write-execute. Windows users can write to the share
based on the synthetic ACLs. The second example shows POSIX at 755. Although
the ACL is set to a user with full control, the user cannot write to the share—POSIX
is authoritative.

The “+” indicates a real or native ACL that comes directly from Windows and is
applied to the file. Access control entries make up Windows ACLs. An administrator
can remove the real ACL permission using the chmod -b command. ACLs are
more complex than mode bits and can express a richer set of access rules.
However, not all POSIX mode bits can represent Windows ACLs any more than
Windows ACLs can represent POSIX mode bits.

Once a file is given an ACL, its previous POSIX mode bits are no longer
enforced; the ACL is authoritative. The first example shows a real ACL used,
POSIX set for 777, and the share permissions for the user set to read-only.
Although the POSIX mode bits show read-write-execute for everyone, the user
cannot write because of the ACL. In contrast, the second example shows the case
where the user can write.

Troubleshooting Resources

For troubleshooting issues, first see:
http://www.emc.com/collateral/TechnicalDocument/docu63137.pdf

For a list of the latest customer troubleshooting guides, see the OneFS Customer
Troubleshooting Guides Info Hub.


Challenge

Lab Assignment:
Log in to the cluster and verify the ACL policy setting.
• Permissions and ownership using the WebUI
• Permissions and ownership using the CLI
• ACL authoritative
• ACL policy setting


Configuring Client Access to Data


OneFS Caching

Scenario

IT Manager: The next thing that I would like to know more about is how
the PowerScale caches data.

Your Challenge: The IT manager wants you to describe caching in
OneFS and illustrate the caching process.

OneFS Caching Overview

Exploded view of a Gen 6 chassis.


1: Cache levels address the immediacy of information.

2: Accelerate access. The immediacy determines how the cache is refreshed, how
long the data is available, and how the data is emptied or flushed from cache.

3: Different cache levels to account for differing data immediacy. The cache levels
provide guidance to the immediacy of information from a client-side transaction
perspective.


4: Cache is temporary. Because cache is a copy of the metadata and user data,
any data that is contained in cache is temporary and can be discarded when no
longer needed.

Caching maintains a copy of the metadata57 and/or the user data blocks in a
location other than primary storage.

Cache in OneFS is divided into levels. Each level serves a specific purpose in read
and write transactions.

Cache Levels

OneFS caching consists of the client-side level 1, or L1, cache and write coalescer,
and the storage-side or node-side level 2, or L2, cache.

Both L1 cache and L2 cache are managed and maintained in RAM. However,
OneFS is also capable of using SSDs as level 3, or L3 cache.

L3 - node-side cache - interacts with L2

L2 - node-side cache

L1 - client-side cache and write coalescer

L3 cache interacts with the L2 cache and L3 is contained on SSDs.

57 The copy is used to accelerate access to the data by placing the copy on a
medium with faster access than the drives.


Each cache has its own specialized purpose, and the caches work together to
provide performance improvements across the entire cluster.

L1 Cache

Client-side cache.

1: L1 cache allows all blocks for immediate read requests. Read cache is flushed
after a successful read transaction and write cache is flushed after a successful
write transaction. L1 cache collects the requested data from the L2 cache of the
nodes that contain the data.

L1 cache is the client-side cache. It is the buffer on the node that the client
connects to, and is involved in any immediate client data transaction.

The write coalescer collects the write blocks and performs the additional process of
optimizing the write to disk.

Following a successful read transaction, the data in L1 cache is flushed or emptied
to provide space for other transactions.

L2 Cache

L2 cache.


1: L2 cache is also contained in the node RAM. It is fast and available to serve L1
cache read requests and take data handoffs from the write coalescer. L2 cache
interacts with the data that is contained on the specific node. The interactions
between the drive subsystem, the HDDs, and the SSDs on the node go through the
L2 cache for all read and write transactions.

2: Interacts with node drives and L3 cache.

L2 cache is the storage side or node-side buffer. L2 cache stores blocks from
previous read and write transactions.

L2 buffers write transactions, writes them to disk, and prefetches anticipated
blocks for read requests.

L2 cache works with the journaling process.

When full, L2 cache flushes according to the age of the data.

L3 Cache

L3 cache.


1: Extension of L2 cache.

2: SSD access is slower than access to RAM and is relatively slower than L2 cache
but faster than access to data on HDDs. L3 cache is an extension of the L2 read
cache functionality. Because SSDs are larger than RAM, SSDs can store more
cached metadata and user data blocks than RAM. When L3 cache becomes full
and new metadata or user data blocks are loaded into L3 cache, the oldest existing
blocks are flushed from L3 cache. Flushing is based on first in first out, or FIFO. L3
cache should be filled with blocks being rotated as node use requires.

L3 cache provides an additional level of node-side storage cache using the SSDs
as read cache.

L3 cache is well suited to random, read-heavy workflows that access the same data sets.

L3 cache has no prefetch.
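The first in, first out flushing described for L3 cache can be sketched with a small in-memory structure. This is a generic FIFO cache for illustration, not the OneFS implementation; the class name, the capacity of two blocks, and the block identifiers are all hypothetical.

```python
from collections import OrderedDict

class FifoCache:
    """Fixed-capacity cache that flushes the oldest block first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # insertion order tracks block age

    def load(self, block_id, data):
        """Add a block; return the id of the flushed block, if any."""
        if block_id in self.blocks:
            return None
        evicted = None
        if len(self.blocks) >= self.capacity:
            evicted, _ = self.blocks.popitem(last=False)  # oldest entry
        self.blocks[block_id] = data
        return evicted

    def read(self, block_id):
        """A read hit does not change the flush order under FIFO."""
        return self.blocks.get(block_id)

cache = FifoCache(capacity=2)
cache.load("a", b"A")
cache.load("b", b"B")
cache.read("a")                     # a hit on "a" does not protect it
flushed = cache.load("c", b"C")     # cache is full, so the oldest block goes
print(flushed, list(cache.blocks))
```

The design point is that age, not recency of use, decides which block leaves, which is what distinguishes FIFO from a least-recently-used policy.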

Important: H-Series and A-Series have two SSD slots in each node. In H-Series
nodes you have the option to enable or disable L3 cache. In A-Series nodes, you
cannot disable L3 cache. As all disks are SSDs in F-Series nodes, the L3 cache
option does not apply.


Caching Big Picture

The graphic shows an eight node cluster that is divided into two node pools with a
detailed view of one of the nodes.


1: Clients connect to L1 cache and the write coalescer. The L1 cache is connected
to the L2 cache on the other nodes and within the same node. The connection to
other nodes occurs over the internal network when data that is contained on those
nodes is required for read or write.

2: The L2 cache on the node connects to the disk storage on the same node. The
L3 cache is connected to the L2 cache and serves as a read-only buffer.

3: L3 extension from L2.

4: L1 talks to L2 on all cluster nodes.

5: Backend network.

Anatomy of a Read

When a client requests a file, the client-connected node uses the isi get
command to determine where the blocks that comprise the file are located.


1: The first file inode is loaded, and the file blocks are read from disk on all other
nodes. If the data is not present in the L2 cache, data blocks are copied in the L2.
The blocks are sent from other nodes through the backend network.

2: If the data is already present in L2 cache, it is not loaded from the hard disks.
OneFS waits for the data blocks from the other nodes to arrive. Otherwise, the
node loads the data from the local hard disks. The file is then reconstructed
in L1 cache and sent to the client.

3: Data blocks are then reconstructed in L1.
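The read steps above can be condensed into a single-node sketch: look in L2 first, fall back to disk on a miss, and assemble the file in L1. The dictionaries standing in for the L2 cache and the disks, the block numbering, and the function name are hypothetical, and the backend network transfer between nodes is not modeled.

```python
def read_file(block_ids, l2_cache, disk):
    """Assemble a file from its blocks, preferring L2 cache over disk."""
    l1_buffer = []
    disk_reads = 0
    for block_id in block_ids:
        if block_id not in l2_cache:
            l2_cache[block_id] = disk[block_id]  # copy the block from disk into L2
            disk_reads += 1
        l1_buffer.append(l2_cache[block_id])     # reconstruct the file in L1
    return b"".join(l1_buffer), disk_reads

disk = {0: b"Power", 1: b"Scale"}
l2 = {0: b"Power"}                  # block 0 is already cached in L2

data, reads = read_file([0, 1], l2, disk)
print(data, reads)                  # only block 1 had to come from disk

data_again, reads_again = read_file([0, 1], l2, disk)
print(data_again, reads_again)      # a repeat read is served entirely from L2
```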

Asynchronous Write Anatomy

When a client requests a file write to the cluster, the client-connected node
receives and processes the file.


1: Cache writes until write coalescer is full, time limit is reached, or protocol
requests confirmation of delivery.

2: The client-connected node creates a write plan for the file including calculating
Forward Error Correction, or FEC. Data blocks assigned to the node are written to
the journal of that node. Data blocks assigned to other nodes travel through the
internal network to their L2 cache, and then to their journal.

At the same time, data blocks that are assigned to other nodes go to L2.

3: Once all nodes have journaled all the data and FEC blocks, a commit is
returned to the client. Data blocks assigned to the client-connected node stay
cached in L2 for future reads, and then data is written onto the HDDs.

4: The Block Allocation Manager, or BAM, on the node that initiated a write
operation makes the layout decisions. The BAM decides on where best to write the
data blocks to ensure that the file is properly protected. Data is copied to journal.
To decide, the BAM Safe Write, or BSW, generates a write plan, which comprises
all the steps that are required to safely write the new data blocks across the
protection group.

5: Once the nodes have journaled the data and FEC blocks, the nodes send
confirmation to the client-connected node and a commit is sent to the client.

6: Once complete, the BSW runs this write plan and guarantees its successful
completion. OneFS does not write files at less than the desired protection level.
Data is written to disks.
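Item 1 above names three conditions that end caching in the write coalescer: the coalescer is full, a time limit is reached, or the protocol requests confirmation of delivery. The sketch below models only those three triggers. The class, the byte capacity, and the time limit are hypothetical values chosen for the example; write planning, FEC calculation, and journaling are not modeled.

```python
class WriteCoalescer:
    """Buffers writes until one of three flush conditions holds."""

    def __init__(self, capacity_bytes, time_limit_s):
        self.capacity_bytes = capacity_bytes
        self.time_limit_s = time_limit_s
        self.pending = []
        self.first_write_at = None

    def write(self, data, now, confirm_requested=False):
        """Buffer a write; return the flushed batch if any flush condition holds."""
        if self.first_write_at is None:
            self.first_write_at = now
        self.pending.append(data)
        full = sum(len(d) for d in self.pending) >= self.capacity_bytes
        timed_out = now - self.first_write_at >= self.time_limit_s
        if full or timed_out or confirm_requested:
            batch, self.pending, self.first_write_at = self.pending, [], None
            return batch
        return None

wc = WriteCoalescer(capacity_bytes=8, time_limit_s=5.0)
print(wc.write(b"abc", now=0.0))                          # still buffering
print(wc.write(b"defgh", now=1.0))                        # flushed: coalescer is full
print(wc.write(b"x", now=2.0))                            # still buffering
print(wc.write(b"y", now=7.5))                            # flushed: time limit reached
print(wc.write(b"z", now=8.0, confirm_requested=True))    # flushed: confirmation requested
```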


L3 Cache Settings

L3 cache is enabled by default for all new node pools that are added to a cluster.

L3 cache is either on or off and no other visible configuration settings are available.

File system > Storage pools > SmartPools settings. Enabling and disabling L3 at
the global level and at the node pool level.

1: Enabled on new node pool by default

2: L3 cache cannot be enabled if the node pool has no unprovisioned SSDs, and it
cannot coexist with other SSD strategies.

CLI Commands

The following commands are used to disable L3 cache globally and to enable it at
the node pool level.

• Global setting: isi storagepool settings modify --ssd-l3-cache-default-enabled no
• Node Pool setting: isi storagepool nodepools modify <pool name> --l3 yes

L3 Cache Considerations

The following points are the L3 cache considerations:


• L3 cache cannot co-exist with other SSD strategies58 on the same node pool.
• SSDs in an L3 cache enabled node pool cannot participate as space used for
GNA.
• L3 acts as an extension of L2 cache regarding reads and writes59 on a node.
• You cannot enable L3 cache in all-flash nodes60.
• You cannot disable L3 cache in archive-type nodes (A200, A2000, NL410,
HD400).
• If changing the L3 cache behavior, migrating data and metadata from the SSDs
to HDDs can take hours.

CLI Cache Keys

The example shows the command to query historical statistics for cache. The first
command lists the keys that are related to cache.

A use case is running the command to determine the L3 hit and miss statistics,
which indicate whether the node pool needs more SSDs.

Also, you can use the isi_cache_stats and the isi_cache_stats -v
commands to view caching statistics.
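Hit and miss counts are usually easier to compare as a ratio. The sketch below shows only the arithmetic; the counts are made-up numbers, not output from any isi command, and no threshold for adding SSDs is implied.

```python
def hit_ratio(hits, misses):
    """Fraction of lookups served from cache; 0.0 when there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0

# Hypothetical L3 counters collected over some interval.
ratio = hit_ratio(hits=750, misses=250)
print(f"L3 hit ratio: {ratio:.0%}")
```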

58 Such as metadata read acceleration, metadata read/write acceleration, and data
on SSD.

59 The process of reading or writing, except for larger available cache, is
substantially unchanged.

60 On Gen 6x nodes all data drives are SSDs in the F800, F810, F600, and F200.


1: The command lists the keys that are related to cache. Many keys are available,
at a fine level of granularity. The keys give administrators insight into the caching
efficiency and can help isolate caching-related issues.

2: The command shows the key to list the L1 metadata read hits for node 2, the
node that is connected over SSH.

Challenge

IT Manager:
Open participation question:
Question: What do L1, L2, and L3 cache provide?


SMB Shares

Scenario

IT Manager: The first thing that I would like you to configure is an SMB
share for the Windows users. I want you to create a single share for
now, and ensure that the Windows users have access.

Your Challenge: The IT manager has tasked you to create a share that
the Windows users can access. Before creating the shares, you must
know a few things. The manager wants to ensure that you can describe
SMB Continuous Availability, enable SMB sharing, and create shares
and home directories.

Protocol Overview

Configure and create SMB shares for Windows users - created at the zone level

Configure and create NFS exports for UNIX-type environments - created at the zone level

Create a virtual rack for HDFS data-intensive distributed applications - created at the zone level

Enable and configure FTP services - applies to the system

Enable and configure HTTP services - applies to the system

Configure and create Amazon S3 Buckets - created at the zone level

OneFS WebUI Protocols menu.

In addition to supporting common data-access protocols, such as SMB and NFS,
OneFS supports HDFS, FTP, HTTP, and S3.


Important: Previous versions of OneFS show the Object storage as Swift.

SMB Failover Overview

Network or node failure

SMB client connects to a single node

/ifs/finance/Data

Old behavior: If this node goes down or a network interruption occurs, the client
needs to reconnect to the cluster manually.

SMB shares provide Windows clients network access to file system resources on
the cluster.

Too many disconnections prompt the clients to open help desk tickets with their
local IT department to determine the nature of the data unavailability.

Clients using SMB 1.0 and SMB 2.x use a time-out service.


SMB Continuous Availability Overview

SMB 3.0 clients with SWP receive a state change for fast recovery.

/ifs/finance/data

OneFS 8.0 and later: CA enabled - reconnect automatically

OneFS 8.0 and later support Continuous Availability, or CA.

CA61 enables a continuous workflow from the client side with no appearance of
disruption.

SMB 3.0 clients use Service Witness Protocol.

SMB Server-Side Copy

Server-side copy offloads copy operations to the server when the involvement of
the client is unnecessary.

File data no longer traverses the network for copy operations that the server can
perform.

The server-side copy feature is enabled by default. To disable the feature, use the
CLI.

61Advanced algorithms are used to determine the metadata and user data blocks
that are cached in L3. L3 cached data is durable and survives a node reboot
without requiring repopulating.


Server-side copy is disabled - copied data traverses the network

Server-side copy is enabled - enabled by default

/ifs/finance/data

Note: In OneFS, server-side copy is incompatible with the SMB CA. If CA is
enabled for a share and the client opens a persistent file handle, server-side copy
is automatically disabled for that file.

Enabling and Disabling SMB Service

To enable SMB, in the WebUI, go to the Protocols > Windows sharing (SMB) > SMB server
settings tab.


The SMB server settings page contains the global settings that determine how the
SMB file sharing service operates.

These settings include enabling or disabling support for the SMB service.

The SMB service is enabled by default.

A case62 for disabling the SMB service is when testing disaster readiness.

Share Creation Video

This video demonstrates the process of creating an SMB share, mapping the
share, and verifying access. See the student guide for a transcript of the video.

62 The organization fails over the production cluster or directory to a remote site.
When the remote data is available and users write to the remote cluster, all SMB
traffic should be halted on the production site. Preventing writes on the production
site prevents data loss when the remote site is restored back to the production site.


Link:
https://edutube.emc.com/html5/videoPlayer.htm?vno=aMwue+nqUbFdOFoqKa98Fg

This demonstration shows the steps to configure SMB shares. Log in to the WebUI
as admin. The dashboard shows all the cluster nodes are healthy. The cluster is
running OneFS 8.2. Navigate to Protocols, Windows sharing. The SMB share will
be in the marketing access zone. Select Create an SMB share. The share I am
creating is called “general purpose”. I will add a description. The path
/ifs/marketing/GeneralPurpose does not exist so I will ensure it is created. This is a
Windows only share that did not previously exist so I will select Apply Windows
default ACLs. In the Members table I will give Everyone full control and then Create
share. The next step is to access the share from a Windows client. From the
Windows client, I will open Windows Explorer and map the share. Good. Now as a
simple test I am creating a text document. I will write some content and save. And
then I will open the document. This demonstration stepped through configuring,
mapping, and accessing an SMB share.


Share Creation

Settings Section

Choose the access zone before creating the share

Share name - This is the name the users will map to

Path of the share, Base directory is /ifs/finance

In this example, the "regulations" directory was not created before creating the share

Automatically create directory - default is unchecked

The CLI equivalents are the isi smb shares create and isi smb shares modify commands.

Type the full path of the share in the path field, beginning with /ifs.

You can also browse to the share. If the directory does not exist, the Create SMB
share directory if it does not exist option creates the required directory.


Directory ACLs

Creating a new share for Windows users.

Select if adding a new share to an existing directory structure.

Use caution when applying the default ACL settings as it may overwrite existing
permissions in cases where the data has been migrated onto the cluster.

When a cluster is set up, the default permissions on /ifs may or may not be
appropriate for the permissions on your directories.


Summary63

Home Directory Provisioning

Use of variable expansion

Each access zone has a path to "home" automatically.

Home directory automatically created - /ifs/finance/home/<username>

OneFS supports the automatic creation of SMB home directory paths for users.

63 1) If adding a share to an existing directory structure, you likely do not want
to change the ACL, so select the Do not change existing permissions option. 2) If
creating a share for a new directory, you will likely change the ACL to grant
Windows users rights to perform operations. Select the Apply Windows default
ACLs option and then, once the share is created, go into the Windows Security tab
and assign permissions to users as needed.


Using variable expansion, user home directories are automatically provisioned.

Variables:
• %L expands to the hostname of the cluster, in lowercase.
• %D expands to the NetBIOS domain name.
• %U expands to the user name.
• %Z expands to the access zone name. If multiple zones are activated, this
variable is useful for differentiating users in separate zones.
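
As a rough illustration of how these variables resolve, the following Python sketch substitutes each variable in a share path template. The function name expand_home_path and its parameters are hypothetical and are not part of OneFS; the cluster performs the real expansion when a user connects to the share.

```python
import re


def expand_home_path(template, user, domain, zone, hostname):
    """Expand %U, %D, %Z, and %L in a share path template.

    Hypothetical helper for illustration only; it mirrors the variable
    meanings listed in this guide, not the OneFS implementation.
    """
    values = {
        "U": user,              # user name
        "D": domain,            # NetBIOS domain name
        "Z": zone,              # access zone name
        "L": hostname.lower(),  # cluster hostname, in lowercase
    }
    # Single pass, so text inside a substituted value is never re-expanded.
    return re.sub(r"%([UDZL])", lambda m: values[m.group(1)], template)


print(expand_home_path("/ifs/%Z/home/%U", "jsmith", "DEES", "finance", "Cluster1"))
# prints /ifs/finance/home/jsmith
```

The example values (jsmith, DEES, Cluster1) are made up. Including %Z in the template is one way to keep home directories for users in different access zones apart.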


Member, File Filter, and Advanced Settings

SMB 3.0 clients automatically fail over to another node when a network or
node fails.

File filtering for the share can be enabled to allow or deny file writes.

The graphic shows the permissions that are changed to Full control.

If needed, administrators can apply the Members68 permissions.

Adjustments made to Advanced settings override the default settings for this
share only.

68 The default permissions configuration is read-only access for the Everyone
account. Edit or Add member to enable users and groups to write to the share.


You can make access zone global changes to the default values in the Default
share settings tab. Changing the default share settings is not recommended.

In the CLI, you can create shares using the isi smb shares create
command. You can also use the isi smb shares modify command to edit a share and
the isi smb shares list command to view the current Windows shares on a cluster.

The share name can contain up to 80 characters, and can only contain
alphanumeric characters, hyphens, and spaces. The description field contains
basic information about the share. There is a 255-character limit. Description is
optional but is helpful when managing multiple shares.
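
The naming limits above can be checked before a share is created. The following Python sketch encodes only the rules stated in this paragraph (80-character name limit, allowed characters, 255-character description limit); the function names are hypothetical, and OneFS may enforce additional rules not captured here.

```python
import re

MAX_SHARE_NAME_LENGTH = 80     # limit stated in this guide
MAX_DESCRIPTION_LENGTH = 255   # limit stated in this guide

# Alphanumeric characters, hyphens, and spaces only.
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9\- ]+")


def is_valid_share_name(name):
    """Return True if the name fits the documented length and character rules."""
    return (
        len(name) <= MAX_SHARE_NAME_LENGTH
        and _ALLOWED_NAME.fullmatch(name) is not None
    )


def is_valid_description(description):
    """Return True if the optional description fits the documented limit."""
    return len(description) <= MAX_DESCRIPTION_LENGTH


print(is_valid_share_name("general purpose"))  # prints True
print(is_valid_share_name("eng/share"))        # prints False
```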

Example for directory ACLs: Say that /ifs/eng is a new directory that was created
using the CLI, and that Windows users need to create and delete files in the
directory. When creating the share, if the Do not change existing permissions
option is set and users then attempt to save files to the share, an access denied
error occurs because Everyone has read access. Even as an administrator you
cannot modify the Security tab of the directory to add Windows users because the
mode bits limit access to only root.

As another example, /ifs/eng is an NFS export and you explicitly want the /ifs/eng
mode bit rights set based on UNIX client application requirements. Selecting the
Apply Windows default ACLs option, as shown in the graphic, overwrites the
original ACLs, which can break the application. Thus, there is risk associated with
using Apply Windows default ACLs with an existing directory.

Example for home directories: To create a share that automatically redirects users
to their home directories, select the Allow variable expansion box. To automatically
create a directory for the user, check the Auto-create directories box. You may also
set the appropriate flags by using the isi smb command in the command-line
interface. In the graphic, 1) set up user access to their home directory by mapping
to /ifs/finance/home. Users are automatically redirected to their home directory
/ifs/finance/home/. 2) Expansion variables are used to automatically create a path
where the users store the home directory files. After the creation, users connecting
to this share are automatically redirected to their home directory according to the
used path variables. The access zone is implied, because all access for Active
Directory is done per access zone and each access zone has its own home
directory path.


Challenge

Lab Assignment: Now log in to the cluster and create home directories
and a general purpose share.


NFS Exports

Scenario

IT Manager: Now that you have the Windows users able to access the
cluster, you need to configure access for the Linux users. I want you to create an
export that the Linux users can access. Have a good understanding of
NFS exports before implementing them in the lab.

Your Challenge: The IT manager has tasked you to create an NFS
export and verify that clients can access the export. Make sure you can
discuss NFS, create exports, and mount the exports.

NFS Overview


1: NFS relies upon remote procedure call (RPC) for client authentication and port
mapping.

2: NFS is native to UNIX clients. You can configure NFS to enable UNIX clients to
access content stored on PowerScale clusters.

3: OneFS supports NFSv3 and NFSv4.

OneFS supports NFS protocol versions 3 and 4, and Kerberos authentication.


Exporting a directory enables accessing the data that is hosted on the cluster.

NFS is enabled by default in the cluster.


NFSv4 Continuous Availability

Graphic callouts: OneFS 8.0 and later. On a node or network issue, the NFS
client automatically fails over to another node. Use with dynamic pools.

CA is enabled by default.

Clients transparently fail over to another node when a network or node fails.

No manual intervention on the client side.

Enabling and Disabling NFS

To enable and disable NFS using the WebUI, click Protocols > UNIX sharing (NFS)
> Global settings tab.


1: Export settings are on the access zone level.

If changing a value in the Export settings, that value changes for all NFS exports in
the access zone. Modifying the access zone default values is not recommended.
You can change the settings for individual NFS exports as you create them, or edit
the settings for individual exports as needed.

2: Enabling NFSv4 requires entering the domain in the Zone settings page.

3: NFSv3 enabled by default

4: NFSv4 disabled by default

5: The NFS service is enabled by default.

If NFSv4 is enabled, specify the name for the NFSv4 domain in the NFSv4 domain
field on the Zone setting page.

You can customize the user/group mappings, and the security types (UNIX and/or
Kerberos), and other advanced NFS settings.

The NFS global settings determine how the NFS file sharing service operates. The
settings include enabling or disabling support for different versions of NFS.
Enabling NFSv4 is nondisruptive, and it runs concurrently with NFSv3. Enabling
NFSv4 does not impact any existing NFSv3 clients.


The UNIX sharing (NFS) page includes an option to reload the cached NFS exports
configuration so that any DNS or NIS changes take effect immediately.

NFS Export Creation - Looking Closer at Settings

Create and manage NFS exports using either the WebUI or the CLI. For the CLI,
use the isi nfs exports command.

Export per access zone.

Protocols > UNIX sharing (NFS) > NFS exports page, Create an export option.
Highlighted are the paths to export.

1: Add multiple directory paths. A network hostname, an IP address, a subnet, or a
netgroup name can be used for reference.

2: Description - 255-character limit.

3: Specifying no clients allows all clients on the network access to the export.

4: Rule order of precedence: Root clients, Always read-write clients, Always
read-only clients, and then Clients.


You can enter a client by host name, IPv4 or IPv6 address, subnet, or netgroup.
Client fields:
• Clients - allowed access to the export
• Always read-write clients - allowed read/write access regardless of export's
access restriction setting
• Always read-only clients - allowed read-only access regardless of export's
access restriction setting
• Root clients - map as root

OneFS can have multiple exports with different rules that apply to the same
directory. A network hostname, an IP address, a subnet, or a netgroup name can be
used for reference. The same export settings and rules that are created here apply
to all the listed directory paths. If no clients are listed in any entries, no client
restrictions apply to attempted mounts.

When multiple exports are created for the same path, the more specific rule takes
precedence. For example, if the 192.168.3 subnet has read-only access and the
192.168.3.3 client has read/write access, the 192.168.3.3 client gets read/write
access.
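
The "more specific rule wins" behavior can be modeled as a longest-prefix match. The Python sketch below is a simplified model under that assumption, not the OneFS implementation: the function name effective_access, the rule dictionary, and the /24 mask chosen for the 192.168.3 subnet are all illustrative.

```python
import ipaddress


def effective_access(client_ip, rules):
    """Return the access level of the most specific rule matching client_ip.

    rules maps a network in CIDR notation to an access level string.
    Returns None when no rule matches the client.
    """
    address = ipaddress.ip_address(client_ip)
    best_network, best_access = None, None
    for cidr, access in rules.items():
        network = ipaddress.ip_network(cidr)
        if address in network and (
            best_network is None or network.prefixlen > best_network.prefixlen
        ):
            best_network, best_access = network, access
    return best_access


# Assumed rules: the subnet is read-only, one host in it is read-write.
RULES = {
    "192.168.3.0/24": "read-only",
    "192.168.3.3/32": "read-write",
}

print(effective_access("192.168.3.3", RULES))  # prints read-write
print(effective_access("192.168.3.7", RULES))  # prints read-only
```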


NFS Export Creation - Looking Closer at Permissions

Restrict and allow permissions

Allow subdirectories below the path to be mounted

You can configure customized mapping


Permissions settings can restrict access to read-only and enable mount access to
subdirectories. Other export settings are user mappings.69

The Advanced settings require advanced knowledge.70

NFS Considerations

Following are considerations for NFS:

• NFSv3 and NFSv4 CA clients should use dynamic IP address pools.
• With OneFS 8.0 and later, PowerScale supports up to 40 K exports.

Challenge

Lab Assignment: Now that you have learned how to create an export,
you are ready to create the NFS directory, export the directory, and
mount it to the CentOS client.

69 The "root user mapping" default is to map root users to nobody, and group is
none. The default Security type is "UNIX (system)". Scrolling down in the "Create
an export" window shows the "Advanced settings".

70 Uninformed changes to these advanced settings could result in operational
failures. Ensure that you understand the consequences of your changes before
saving. Any adjustments made to these settings override the default settings for
this export only. While it is not recommended, any changes made to the default
values are done through the "Export settings" tab. "Advanced settings" are
performance settings, client compatibility settings, and export behavior settings.


S3 Buckets

Scenario

IT Manager: We are considering using the PowerScale cluster to store
and share our S3 content. I want you to see how we can configure S3
on the cluster should we decide to implement the functionality.

Your Challenge: The IT manager has tasked you to create an S3
bucket. The manager wants you to describe the S3 integration with
PowerScale.

S3 Overview

OneFS namespace
Objects stored in buckets

Multi-protocol access to objects

Amazon Simple Storage Service (S3) is an AWS service that provides object
storage through a web interface. OneFS 9.0.x and later support S3 as a tier 1
protocol. OneFS S3 value:


• Multi-protocol access71
• Multi-tenancy - access zone aware
• Latency and IOPS equivalent to other OneFS protocols
• Evolve the PowerScale data lake story:
• Single namespace and multi-protocol access
• Concurrent access72 to objects and files
• Interoperability with OneFS data services such as snapshots, WORM, quotas,
SyncIQ, and others

Implementation - Creating an S3 Bucket

Enable S3 Service

Enable the service. By default the service is cleared and disabled.

CLI command to change the port settings:


isi s3 settings global modify

71 Support interoperability between all OneFS supported protocols. File system
mapping: Object to file, object to directory, and bucket to base directory.

72 Supports locking and access control semantics.


Default ports

WebUI Protocols > Object storage (S3) page, Global settings tab.

Zone Settings

Configure the root path.

CLI command to set the root path:


isi s3 settings zone modify

Use if virtual host style is needed. For instance, the base domain for
bucket3.engineering.dees.lab is engineering.dees.lab
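
To make the base domain relationship concrete, the Python sketch below derives the bucket name from a virtual-host-style hostname. The helper bucket_from_host is hypothetical and only illustrates the naming relationship described above; it is not how OneFS or an S3 client resolves requests.

```python
def bucket_from_host(host, base_domain):
    """Return the bucket label from a virtual-host-style hostname.

    Returns None when the host is not a subdomain of base_domain, for
    example when the request uses the base domain itself (path style).
    """
    host = host.lower().rstrip(".")
    suffix = "." + base_domain.lower().rstrip(".")
    if host.endswith(suffix) and len(host) > len(suffix):
        return host[: -len(suffix)]
    return None


print(bucket_from_host("bucket3.engineering.dees.lab", "engineering.dees.lab"))
# prints bucket3
```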


Object storage (S3) Page

You can create buckets using the Object storage (S3) page or using the isi s3
buckets create command.


Can create S3 buckets per access zone

Two buckets created with root as the owner

WebUI Protocols > Object storage (S3) page.

Create Bucket

The example shows creating a bucket.

CLI command to create the bucket and add ACL:


isi s3 buckets create bucket3 /ifs/engineering/bucket4 --create-path --owner root --acls name=dees\\john,type=user,perm=READ --zone=System


Name - only characters a-z, 0-9, and '-'. Name not editable after creation.

Owner - you can choose a user from the configured authentication providers.
Owner not editable after creation.

You can select a user.

S3 ACLs enable you to manage access to buckets and objects.

Complete Bucket Create

The graphic shows the Create a Bucket fields completed and the command to view
a created bucket.

AD user with read permissions

Can add more users and ACLs


S3 Bucket Table

The Buckets tab shows the created buckets in a list view.


CLI command to list the buckets:


isi s3 buckets list


Key Management

A key must be created to authenticate the access. Key management from the WebUI
facilitates generation of the secret key and access ID. The example shows key
creation using the CLI.

Create a key for the bucket owner

Access ID and Secret Key are needed to authenticate

Can view the created keys


Accessing the S3 Bucket

The example shows using an Amazon S3 browser to connect to the configured
buckets on the PowerScale cluster.

Considerations

Listed are areas to consider:


• OneFS S3 is not a full AWS implementation - not a 1:1 match with AWS.
• A user can have two secret keys for a transient period only, not permanently.
• No sorting or searching of buckets in the WebUI
• File system ACLs are checked even if the bucket ACL allows access
• 16 TB object size limitation
• Use SyncIQ to replicate S3 buckets
• Use SmartPools to tier S3 buckets
• Use SnapshotIQ to version S3 buckets

Services

• /var/log/s3.log for general errors

• /var/log/lwsmd.log for problems with service management such as service
startup issues
• CELOG - logs service start failure, user identity query failure, SBT bucket ID
Invalid, and SBT full
• SRS data includes buckets, log-level, global settings, zone settings, and
components of the service registry
• You can use isi statistics and isi performance for S3 metrics

Challenge

Lab Assignment: Now log in to the cluster and create an S3 bucket.
Add objects to the bucket and access the data over SMB.


HDFS and Swift

Hadoop Introduction

Software library framework

Allows for distributed processing of large data sets

Analyzes data across groups of computers using simple programming models

Tool of choice for big data analytics

Requires license

The Hadoop Distributed File System (HDFS) protocol enables a cluster to work
with Apache Hadoop, a framework for data-intensive distributed applications.

In a typical enterprise environment, Hadoop analyzes existing data to improve
processes and performance depending on the business model.

Swift Overview

OneFS supports Swift, an object storage interface compatible with the OpenStack
Swift 1.0 API. Swift is a hybrid between the two storage types, storing Swift
metadata as an alternative data stream. Through Swift, users can access file-
based data that is stored on the cluster as objects. The Swift API is implemented
as Representational State Transfer, or REST, web services over HTTP or HTTPS.
Since the Swift API is considered a protocol, content and metadata can be ingested
as objects and concurrently accessed through protocols that are configured on the
cluster. The cluster must be licensed to support Swift.

Swift enables storage consolidation for applications regardless of protocol, which
can help eliminate storage silos. In environments with petabytes of unstructured
data, Swift can automate collecting, storing, and managing the data, such as in a
data lake, for later analysis. Swift can be used to automate data-processing
applications to store objects on an Isilon cluster and analyze the data with Hadoop
through the OneFS HDFS. Swift benefits include secure multitenancy for
applications through access zones while protecting the data with capabilities such
as authentication, access control, and identity management. Manage data through
enterprise storage features such as deduplication, replication, tiering, performance
monitoring, snapshots, and NDMP backups. Swift balances the workload across
the cluster nodes through SmartConnect and stores object data more efficiently
with FEC instead of data replication.

Automate data-processing applications

Swift client access

Clients - SMB, NFS, HDFS access

Dissimilar protocol storage consolidation


Foundations of Data Protection and Data Layout


File Striping

Scenario

IT Manager: I am not sure how the cluster does striping. I want you to
do some research and let me know how the operating system stripes a
file.

Your Challenge: The IT manager wants you to describe how files are
broken up for file stripes and diagram the high-level file striping steps.

Introduction to File Striping

Four node Gen 6 example (+2d:1n).

OneFS protects files as the data is being written. Striping protects the cluster data
and improves performance. To understand OneFS data protection, the first step is
grasping the concept of data and forward error correction or FEC stripes.


• File stripes - files are logically segmented into 128 KB stripe units to calculate
protection.
• FEC stripe unit - the calculated piece of data protection.
• Data stripe units + FEC stripe units = stripe width. In the graphic, the stripe
width is 12 (8 data [1 MB file data] + 4 FEC).
• 16 data stripe units + 4 FEC = maximum stripe width of 20.
• 16 data stripe units = 2 MB. Files larger than 2 MB have multiple file stripes.
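
The arithmetic in the list above can be worked through with a short Python sketch. It uses only the figures stated here (128 KB stripe units, at most 16 data stripe units per file stripe); the function name stripe_layout is hypothetical, and the model ignores small-file mirroring and any limit that the number of nodes places on the stripe width.

```python
import math

STRIPE_UNIT_KB = 128   # size of one stripe unit, per this guide
MAX_DATA_UNITS = 16    # maximum data stripe units per file stripe


def stripe_layout(file_size_kb, fec_units):
    """Return (data stripe units, file stripes, widest stripe width).

    Simplified model: counts 128 KB units, groups them into stripes of
    up to 16 data units, and adds the FEC units to get the stripe width.
    """
    data_units = math.ceil(file_size_kb / STRIPE_UNIT_KB)
    file_stripes = math.ceil(data_units / MAX_DATA_UNITS)
    widest = min(data_units, MAX_DATA_UNITS) + fec_units
    return data_units, file_stripes, widest


print(stripe_layout(1024, 4))  # 1 MB file: prints (8, 1, 12)
print(stripe_layout(2048, 4))  # 2 MB file: prints (16, 1, 20)
print(stripe_layout(4096, 4))  # 4 MB file: prints (32, 2, 20)
```

The first call reproduces the stripe width of 12 from the graphic, and the second reaches the maximum stripe width of 20.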

Data and FEC Stripe Units

The data stripe units and protection stripe units are calculated for each file stripe by
the Block Allocation Manager (BAM) process73.

F200 example with +1n protection.

73 The BAM process calculates 128-KB FEC stripe units to meet the protection
level for each file stripe. The higher the protection level, the more FEC stripe units
are calculated.

1: A file is divided into 128 KB data stripe units.

2: Each data stripe unit consists of sixteen 8 KB blocks (16 x 8 KB = 128 KB).

3: The protection is calculated based on the requested protection level for each file
stripe using the data stripe units that are assigned to that file stripe.

4: The combined 128 KB stripe units are called the stripe width. A single file stripe
width can contain up to sixteen 128 KB data stripe units, for a maximum size of
2 MB as the file's data portion. A large file has thousands of file stripes that are
distributed across the node pool.

File Striping Steps

The steps show a simple example of the write process. The client saves a file to
the node it is connected to. The file is divided into data stripe units. The data stripe
units are assembled into the maximum stripe widths for the file. FEC stripe units
are calculated to meet the Requested Protection level. Then the data and FEC
stripe units are striped across nodes.
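
The write process above can be imitated in a few lines of Python. This is a teaching sketch only: it uses a single XOR parity unit as a simplified stand-in for the FEC calculation (OneFS uses Reed-Solomon), a tiny 4-byte stripe unit instead of 128 KB so the values are easy to inspect, and hypothetical function names throughout.

```python
def split_into_units(data, unit_size):
    """Divide the file data into fixed-size data stripe units."""
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]


def xor_parity(units, unit_size):
    """Compute one parity unit by XOR-ing the zero-padded units together."""
    accumulator = 0
    for unit in units:
        accumulator ^= int.from_bytes(unit.ljust(unit_size, b"\0"), "big")
    return accumulator.to_bytes(unit_size, "big")


def place_on_nodes(stripe_units, node_count):
    """Assign one stripe unit per node, in order."""
    if len(stripe_units) > node_count:
        raise ValueError("stripe is wider than the number of nodes")
    return {node: unit for node, unit in enumerate(stripe_units, start=1)}


UNIT_SIZE = 4                                  # demo size, not 128 KB
data = bytes(range(10))                        # the "file" the client saves
units = split_into_units(data, UNIT_SIZE)      # three data stripe units
parity = xor_parity(units, UNIT_SIZE)          # one protection unit
layout = place_on_nodes(units + [parity], 4)   # stripe across four nodes

# Losing any one unit is survivable: XOR the survivors to rebuild it.
rebuilt = xor_parity([units[0], units[2], parity], UNIT_SIZE)
print(rebuilt == units[1])  # prints True
```

With one parity unit the sketch tolerates the loss of a single stripe unit, which loosely corresponds to a +1 protection level; higher levels need more than simple XOR.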

Step 1

OneFS stripes the data stripe units and FEC stripe units across the node pools.
Some protection schemes74 use more than one drive per node.

The client saves a file to the node it is connected to.

Graphic shows Gen 6 cluster with a simple example of the write process.

74 OneFS uses advanced data layout algorithms to determine data layout for
maximum efficiency and performance. Data is evenly distributed across nodes in
the node pool as it is written. The system can continuously reallocate where the
data is stored and make storage space more usable and efficient. Depending on
the file size and the stripe width, as the cluster size increases, the system stores
large files more efficiently. Every disk within each node is assigned both a unique
GUID (global unique identifier) and logical drive number. The disks are subdivided
into 32-MB cylinder groups that are composed of 8-KB blocks. Each cylinder group
is responsible for tracking, using a bitmap, whether its blocks are used for data,
inodes or other metadata constructs. The combination of node number, logical
drive number, and block offset make the block or inode address, which the Block
Allocation Manager controls.

Step 2

If the file is greater than 128 KB, then the file is divided into data stripe units.


The node divides the file into data stripe units

Data Stripe Units


File

Step 3

The node that the client connects to is the node that performs the FEC calculation.

The node calculates the FEC stripe units

FEC Stripe Unit

Data Stripe Units


Step 4

The data stripe units are assembled to maximum stripe width for the file. Also, here
the protection level that is configured is N+1n75.

Stripe width is assembled

Step 5

Depending on the write pattern, the data and FEC stripes might be written to one
drive per node or two drives per node. The important takeaway is that files are
segmented into stripes of data, FEC is calculated, and the data and FEC are
distributed across the cluster.

75 one disk per node/one FEC


Data and FEC stripe units are striped across nodes

Considerations: File Striping

Listed are areas to consider when discussing file striping.


• The maximum of 16 data stripe units per file stripe means the maximum file
portion in a file stripe is 2 MB (16 x 128 KB).
• If a file does not fill the 128 KB stripe unit, the stripe unit is not padded (the
extra capacity is usable by the cluster).
• Files less than 128 KB are mirrored - not erasure coded. For example, a 100 KB
file with +2d:1n protection has a 3x mirror.
• The file size and protection level determine the capacity efficiency.
• At 80% capacity consumption, the organization should begin the process of
adding more nodes to prevent the cluster from going beyond 90%. Do not
exceed 90% capacity consumption.


Challenge

IT Manager:
Open participation questions:
Question: What does OneFS consider a small file and how are
small files put on disks for protection?


Data Protection

Scenario

IT Manager: The manager gestures to the technical document and then
looks at you, "All this information is giving me a headache. I am going to
need your help. I do not have the time to read and understand all the
intricacies of data protection. Configure the proper data protection and
then let me know how it works and how it is configured."

Your Challenge: The IT manager wants you to describe data protection
levels in OneFS, define stripes and stripe units, and discuss the different
protection levels.

OneFS Data Protection

Data protection is one of the variables that are used to determine how data is laid
out. OneFS is designed to withstand multiple simultaneous component failures
while still affording access to the entire file system and dataset.
• OneFS uses the Reed-Solomon algorithm
• The data can be protected up to an N+4n scheme
• In OneFS, protection is calculated per individual file

Important: Files smaller than 128 KB are treated as small files.
Due to how OneFS applies protection, small files are mirrored.


Data Protection Improvements

In Gen 6 nodes, data protection and efficiency focus on:


• Mirrored Journal
• Smaller neighborhood76

In Gen 6.5 nodes, the journal is stored on an NVDIMM that is battery protected.

Data Protection Terms

N+Mn

N+Mn is the primary protection level in OneFS.

N + Mn: Data stripes + Number of nodes or Number of drives

• N: The “N” is the number of data stripes.
• M: The M value represents the number of simultaneous tolerable drive failures
on separate nodes without data loss. It also represents the number of FEC stripe
units per protection stripe.

76 Smaller neighborhoods improve efficiency by the fact that the fewer devices you
have within a neighborhood, the less chance that multiple devices will
simultaneously fail.

• Mn: The “Mn” is the number of simultaneous drive or node failures that can be
tolerated without data loss.
• N+Mn: The available N+Mn Requested Protection levels are plus one, two, three,
or four “n” (+1n, +2n, +3n, and +4n). With N+Mn protection, only one stripe unit is
written to a single drive on the node.
• N=M: If N equals M, the protection overhead is 50 percent. For example, with
N+2n, a file size of 256 KB has a 50% protection overhead (256 KB = 2 stripe
units).
• N>M: N must be greater than M to gain efficiency from the data protection. If N is
less than M, the protection results in a level of FEC calculated mirroring.

The number of sustainable drive failures is per disk pool. Multiple drive failures on
a single node are equivalent to a single node failure. The drive loss protection level
is applied per disk pool.

Protection  Drive     Node      Minimum Node               Maximum
Level       Failures  Failures  Pool Size                  Stripe Width

N+1n        1         1         3 nodes (2 data + 1 FEC)   17 (16 data + 1 FEC)
N+2n        2         2         5 nodes (3 data + 2 FEC)   18 (16 data + 2 FEC)
N+3n        3         3         7 nodes (4 data + 3 FEC)   19 (16 data + 3 FEC)
N+4n        4         4         9 nodes (5 data + 4 FEC)   20 (16 data + 4 FEC)
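
The pattern in the table can be expressed as simple formulas. The Python sketch below derives the minimum node pool size and maximum stripe width for N+Mn from the table rows, and computes protection overhead as the FEC share of the stripe. The function names are hypothetical, and the Gen 6 four-node minimum is not modeled.

```python
MAX_DATA_UNITS = 16  # maximum data stripe units per stripe, per this guide


def min_node_pool_size(m):
    """Minimum nodes for N+Mn, following the table: (M + 1) data + M FEC."""
    return 2 * m + 1


def max_stripe_width(m):
    """Maximum stripe width for N+Mn: 16 data units plus M FEC units."""
    return MAX_DATA_UNITS + m


def protection_overhead(n, m):
    """Fraction of a stripe consumed by FEC: M / (N + M)."""
    return m / (n + m)


for m in range(1, 5):
    print(
        f"N+{m}n: minimum {min_node_pool_size(m)} nodes, "
        f"maximum stripe width {max_stripe_width(m)}, "
        f"overhead at maximum width {protection_overhead(MAX_DATA_UNITS, m):.1%}"
    )

# When N equals M, half of the stripe is protection.
print(protection_overhead(2, 2))  # prints 0.5
```

The overhead shrinks as more data stripe units share the same number of FEC units, which is why wider stripes on larger node pools are more capacity efficient.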

N+Md:Bn Protection

N + Md : Bn

Data stripes + Number of drives and number of nodes

The “d” is the number of drives and “n” is the number of nodes. So N+3d:1n reads
as N+3 drives or 1 node.

Unlike N+Mn, N+Md:Bn has different values for the number of drive losses and
node losses that are tolerated before data loss may occur. When a node loss
occurs, multiple stripe units are unavailable from each protection stripe, and the
tolerable drive loss limit is reached.
• M: In this protection level, M is the number of drives per node onto which a
stripe unit is written.
• d: The number of drives.
• Colon (:): The colon represents an “or” conjunction.
• B: The B value represents the number of tolerated node losses without data loss.
• n: The number of nodes.

With Gen 6x, for better reliability, better efficiency, and simplified protection, using
+2d:1n, +3d:1n1d, or +4d:2n is recommended.

Minimum number of nodes in a node pool.88

Actual Protection Nomenclature

The actual protection is represented differently than the requested protection. The
table shows the representation for the requested protection and the actual
protection.

N is replaced in the actual protection with the number of data stripe units for each
protection stripe. If there is no / in the output, it implies a single drive per node.
Mirrored file protection is represented as 2x to 8x in the output.

88 Remember that Gen 6 requires a minimum of 4 nodes of the same type, so
where the minimum number of nodes of three is indicated, for Gen 6 this is four.
Gen 6.5 requires a minimum of 3 nodes of the same type.


Graphic callouts for the actual protection N+2/2:
• N - data stripe units per stripe
• 2 - FEC stripe units per stripe
• /2 - drives per node
• No /<#> implies one drive per node
• The command outputs all files in a directory or single file information

The graphic shows the actual protection on a file in the output of the isi get
command. The output displays the number of data stripe units plus the number of
FEC stripe units, followed by a slash and the number of disks per node that the
stripe is written to.
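
To practice reading this notation, the Python sketch below splits an actual protection string into its parts. The string forms it accepts (such as 6+2/2, 16+1, and 3x) are inferred from the description in this section, and the function name parse_actual_protection is hypothetical; real isi get output contains additional fields that are not handled here.

```python
import re

_FEC_FORM = re.compile(r"(\d+)\+(\d+)(?:/(\d+))?")  # data+FEC[/drives per node]
_MIRROR_FORM = re.compile(r"(\d+)x")                # mirrored, for example 3x


def parse_actual_protection(text):
    """Parse an actual protection string into a dictionary of its parts."""
    match = _FEC_FORM.fullmatch(text)
    if match:
        data_units, fec_units, drives = match.groups()
        return {
            "type": "fec",
            "data_units": int(data_units),
            "fec_units": int(fec_units),
            # No /<#> implies one drive per node.
            "drives_per_node": int(drives) if drives else 1,
        }
    match = _MIRROR_FORM.fullmatch(text)
    if match:
        return {"type": "mirror", "copies": int(match.group(1))}
    raise ValueError(f"unrecognized protection string: {text!r}")


print(parse_actual_protection("6+2/2"))
print(parse_actual_protection("16+1"))
print(parse_actual_protection("3x"))
```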

Overhead Protection levels

The protection overhead for each protection level depends on the file size and the
number of nodes in the cluster. The percentage of protection overhead declines as
the cluster gets larger. In general, N+1n protection has a protection overhead equal
to the capacity of one node, N+2n to the capacity of two nodes, N+3n to the
capacity of three nodes, and so on.

Data mirroring requires significant storage overhead and may not always be the
best data-protection method. Example89

89 If you enable 3x mirroring, the specified content is explicitly duplicated three
times on the cluster; depending on the amount of content being mirrored, this can
require a significant amount of capacity.


[Table legend: Blue - 50% efficient; Yellow - mirrored protection; Bold - maximum efficiency reached.]

The table shows the relative protection overhead associated with each FEC requested protection
level. Indicators include when the FEC protection would result in mirroring.

MTTDL

MTTDL deals with how long you can go without losing data. MTTDL is used to
calculate the OneFS suggested protection.
• Accommodate failures90
• Disk pools91
• MTBF92

90 Because there are so many disk drives in a large PowerScale installation, it is
common for a drive to be down at one time or another. Where other systems try to
harden against failures, PowerScale accommodates them. OneFS expects that any
device could fail at any point in time.

91Disk pools improve MTTDL because they create more failure domains, improving
the statistical likelihood of tolerating failures over the lifetime of the equipment.

92 Mean Time Before Failure (MTBF) refers to individual component failure.
PowerScale subscribes to the "all devices do fail" philosophy (MTTDL), whereas
MTBF is a single-component view of reliability. MTTDL is a better measure of what
customers care about.

Quorum

For the cluster to properly function and accept data writes, a quorum of nodes
must be active and responding.

• Greater than 50% available93
• No quorum - no writes94
• Protection level to minimum number of nodes95

N+2n vs. N+2d:1n Data Protection

There are six data stripe units to write a 768-KB file. The desired protection
includes the ability to sustain the loss of two hard drives.

93For a quorum, more than half the nodes must be available over the internal,
backend network to allow writes. An eight-node Gen 6 cluster, for example,
requires a five-node quorum.

94 If there is no node quorum, reads may occur, depending upon where the data
lies on the cluster, but for the safety of new data, no new information is written
to the cluster. So, if a cluster loses its quorum, the OneFS file system becomes
read-only: clients can access data but cannot write to the cluster.

95 Each protection level requires a minimum number of nodes. For example,
N+2d:1n needs a minimum of four Gen 6 nodes. Why? You can lose one node and
still have three nodes up and running, greater than 50%. You must keep quorum to
keep the cluster writeable.
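
The quorum rule in the notes above (more than half of the nodes must be available) can be sketched in a few lines of Python. This is an illustration only; the function names quorum_size and has_quorum are hypothetical and do not correspond to any OneFS command or API.

```python
def quorum_size(node_count: int) -> int:
    """Smallest number of available nodes that is more than half of the cluster."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return node_count // 2 + 1


def has_quorum(node_count: int, available_nodes: int) -> bool:
    """True when the cluster can still accept writes under the >50% rule."""
    return available_nodes >= quorum_size(node_count)


if __name__ == "__main__":
    # Eight-node example from the notes: a five-node quorum is required.
    print(quorum_size(8))
    # Four-node pool that loses one node: three of four are still up.
    print(has_quorum(4, 3))
```

With four nodes, losing one node leaves three available, which still satisfies the rule; losing two does not.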


1: Using N+2n protection, the 768-KB file is placed into three separate data
stripes, each with two protection stripe units. Six protection stripe units are required
to deliver the requested protection level for the six data stripe units. The protection
overhead is 50 percent.

2: Using N+2d:1n protection, the same 768-KB file requires one data stripe, two
drives wide per node, and only two protection stripe units. The eight stripe units are
written to two different drives per node. The protection overhead is 25 percent, the
same as for N+2n on an eight-node cluster.

3: On an eight-node cluster, two FEC stripe units are calculated on the
six data stripe units using an N+2n protection level. The protection overhead in this
case is 25 percent.
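
The three overhead figures above follow from simple arithmetic: FEC stripe units divided by all stripe units written. The Python sketch below reproduces them. The function name protection_overhead and the tuple representation of a stripe are assumptions made for this example, not OneFS terminology.

```python
from typing import Iterable, Tuple


def protection_overhead(stripes: Iterable[Tuple[int, int]]) -> float:
    """Return FEC stripe units as a fraction of all stripe units written.

    Each stripe is given as (data_stripe_units, fec_stripe_units).
    """
    stripes = list(stripes)
    data = sum(d for d, _ in stripes)
    fec = sum(f for _, f in stripes)
    return fec / (data + fec)


if __name__ == "__main__":
    # 1: N+2n, six data units spread over three stripes of 2 data + 2 FEC.
    print(protection_overhead([(2, 2)] * 3))
    # 2: N+2d:1n, one stripe of 6 data + 2 FEC, two drives wide per node.
    print(protection_overhead([(6, 2)]))
    # 3: N+2n on an eight-node cluster, one stripe of 6 data + 2 FEC.
    print(protection_overhead([(6, 2)]))
```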


Mirrored Data Protection

Mirroring is used to protect the file metadata and some system files that exist under
/ifs in hidden directories. Mirroring can be explicitly96 set as the requested
protection level in all available locations.

Use Case97

2x to 8x mirroring: the original file plus 1 to 7 copies.

- The protection blocks are copies of the original set of data blocks.
- 2x to 8x mirror settings. The number indicates the total number of data instances.
- The protection is explicitly set and the required mirroring is selected.
- Actual protection is applied for other requested protection levels.

96 Under certain conditions, mirroring is set as the actual protection on a file even
though another requested protection level is specified. If the files are small, the FEC
protection for the file results in mirroring. The loss protection requirements of the
requested protection determine the number of mirrored copies. Mirroring is also
used if the node pool is not large enough to support the requested protection level.
For example, with five nodes in a node pool and N+3n requested protection, the
file is saved at a 4x mirror level, which is the actual protection.

97 One particular use case is where the system is used to only store small files. A
file of 128 KB or less is considered a small file. Some workflows store millions of 1
KB to 4-KB files. Explicitly setting the requested protection to mirroring can save
fractions of a second per file and reduce the write ingest time for the files.


FEC Protection - Single Drive Per Node

Single Drive per Node

[Figure: a protection stripe with a single drive per node. Labels: data stripe unit, stripe, FEC stripe unit.]

Some protection schemes use a single drive per node per protection stripe. The
graphic shows that only a single data stripe unit or a single FEC stripe unit is written to
each node. These protection levels are N+M or N+Mn.


Example: N+Mn Protection Stripe

The table shows each N+Mn requested protection level over the
minimum number of required nodes for each level. The data stripe units and
protection stripe units98 can be placed on any node in the node pool and in any order.

N+Mn Level    +1n     +2n     +3n     +4n
Node 1        Data    Data    Data    Data
Node 2        Data    Data    Data    Data
Node 3        FEC     Data    Data    Data
Node 4        -       FEC     Data    Data
Node 5        -       FEC     FEC     Data
Node 6        -       -       FEC     FEC
Node 7        -       -       FEC     FEC
Node 8        -       -       -       FEC
Node 9        -       -       -       FEC
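
The columns of the table follow a regular pattern: at the minimum node count, an N+Mn stripe holds M FEC stripe units and M + 1 data stripe units, which gives 3, 5, 7, and 9 nodes. The Python sketch below reproduces the columns. The 2M + 1 rule is read off the table as an assumption for this example, and the function name minimum_nmn_layout is hypothetical.

```python
from typing import List


def minimum_nmn_layout(m: int) -> List[str]:
    """Return one stripe unit label per node for N+Mn at its minimum node count.

    Assumes the pattern shown in the table: M + 1 data units and M FEC units.
    """
    if not 1 <= m <= 4:
        raise ValueError("the table covers +1n through +4n only")
    return ["Data"] * (m + 1) + ["FEC"] * m


if __name__ == "__main__":
    for m in range(1, 5):
        layout = minimum_nmn_layout(m)
        print(f"+{m}n needs {len(layout)} nodes: {' '.join(layout)}")
```

As noted in the text, the units can land on any node in any order; the sketch only lists them in the same order as the table.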

98 The number of data stripe units depends on the size of the file and the size of the
node pool, up to the maximum stripe width. As illustrated, N+1n has one FEC stripe
unit per protection stripe, N+2n has two, N+3n has three, and N+4n has four. N+2n
and N+3n are the two most widely used requested protection levels for larger
node pools, that is, node pools with around 15 nodes or more. The ability to sustain
both drive and node loss drives their use when possible.

FEC Protection - Multiple Drives Per Node

Multiple Drives per Node

N+M:B or N+Md:Bn protection schemes use multiple drives per node.
The multiple drives contain parts of the same protection stripe. Multiple data stripe
units and FEC stripe units are placed on separate drives on each node.

[Figure: one stripe laid out with N+2d:1n protection.]

The graphic shows an example of a 1 MB file with a requested protection of +2d:1n. Four stripe
units, either data or protection stripe units, are placed on separate drives in each node. Two drives
on different nodes per disk pool, or a single node, can be lost simultaneously without the risk of
data loss.

N+Md:Bn Protection Levels

One stripe with multiple stripe units per node.


Protection Level    Drive Failures    Node Failures    Maximum Stripe Width
N+2d:1n             2                 1                18 (16 data + 2 FEC)
N+3d:1n             3                 1                19 (16 data + 3 FEC)
N+4d:1n             4                 1                20 (16 data + 4 FEC)

FEC Protection - Advanced

Advanced Protection

In addition to N+Md:Bn, there are two advanced99 forms of Requested Protection.
The benefit of the advanced N+Md:Bn protection levels is that they provide a higher
level of node loss protection. Besides the drive loss protection, the node loss
protection is increased.

99 The available Requested Protection levels are N+3d:1n1d and N+4d:2n. N+3d:1n1d
includes three FEC stripe units per protection stripe, and provides protection for
three simultaneous drive losses, or one node and one drive loss. The higher
protection provides the extra safety during data rebuilds that are associated with
the larger drive sizes of 4 TB and 6 TB. The maximum number of data stripe units
is 15 and not 16 when using N+3d:1n1d Requested Protection. N+4d:2n includes
four FEC stripe units per stripe, and provides protection for four simultaneous drive
losses, or two simultaneous node failures.

Protection Level    Drive Failures    Other Failures      Maximum Stripe Width
N+3d:1n1d           3                 1 node + 1 drive    18 (15 data + 3 FEC)
N+4d:2n             4                 2 nodes             20 (16 data + 4 FEC)
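
To make the N+Md:Bn nomenclature concrete, the following Python sketch reads a level string and reports the tolerated losses and the maximum stripe width listed in the tables. It is a minimal illustration: the names Tolerance, parse_level, and max_stripe_width are hypothetical, and the data-unit limits (15 for N+3d:1n1d, 16 otherwise) are copied from the tables rather than derived.

```python
import re
from typing import NamedTuple


class Tolerance(NamedTuple):
    """Losses tolerated by an N+Md:Bn level (hypothetical helper type)."""
    drive_losses: int        # M: simultaneous drive losses, also the FEC units per stripe
    node_losses: int         # B: node losses, the alternative after the colon
    extra_drive_losses: int  # drives that may be lost in addition to the B nodes


_LEVEL = re.compile(r"N?\+(\d+)d:(\d+)n(?:(\d+)d)?")


def parse_level(level: str) -> Tolerance:
    """Parse levels such as '+2d:1n', 'N+3d:1n1d', or '+4d:2n'."""
    match = _LEVEL.fullmatch(level.strip())
    if match is None:
        raise ValueError(f"not an N+Md:Bn level: {level!r}")
    drives, nodes, extra = match.groups()
    return Tolerance(int(drives), int(nodes), int(extra or 0))


def max_stripe_width(level: str) -> int:
    """Maximum stripe width (data + FEC units) as listed in the tables."""
    tolerance = parse_level(level)
    # Per the tables, N+3d:1n1d allows 15 data stripe units; the others allow 16.
    max_data = 15 if tolerance.extra_drive_losses else 16
    return max_data + tolerance.drive_losses


if __name__ == "__main__":
    for level in ("N+2d:1n", "N+3d:1n", "N+4d:1n", "N+3d:1n1d", "N+4d:2n"):
        print(level, parse_level(level), max_stripe_width(level))
```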

Example: Advanced N+Md:Bn Protection Stripe

The table shows examples of the advanced N+Md:Bn protection schemes100. Each uses
two drives per node per protection stripe. The number of FEC stripe units does not
equal the number of drives that are used for the protection stripe. Even if one node
is lost, there is still a greater level of protection available.

N+Md:Bn Level                                       Drive    Node 1    Node 2    Node 3    Node 4    Node 5    Node 6
+3d:1n1d (3 FEC stripe units, 2 drives per node)    1        Data      Data      FEC       Data      Data      Data
                                                    2        Data      FEC       Data      FEC       Data      Data
+4d:2n (4 FEC stripe units, 2 drives per node)      1        Data      Data      FEC       Data      FEC       Data
                                                    2        Data      FEC       Data      FEC       Data      Data

100 Like other protection levels, the data stripe units and FEC stripe units are placed
on any node in the node pool and on any drive. N+3d:1n1d is the minimum
protection for node pools containing 6-TB drives. The use of N+4d:2n is expected
to increase, especially for smaller to middle sized node pools, as larger drives are
introduced.

Protection Overhead

The protection overhead for each protection level depends on the file size and the
number of nodes in the cluster. The percentage of protection overhead declines as
the cluster gets larger.

• N+1n101
• N+2n102
• N+3n103
• Data Mirroring104

For better reliability, better efficiency, and simplified protection, use N+2d:1n,
N+3d:1n1d, or N+4d:2n, as indicated with a red box.

101 N+1n protection has a protection overhead equal to the capacity of one node.

102 N+2n protection has a protection overhead equal to the capacity of two nodes.

103 N+3n protection has a protection overhead equal to the capacity of three nodes,
and so on. OneFS also supports optional data mirroring from 2x-8x, enabling from
two to eight mirrors of the specified content.

104 Data mirroring requires significant storage overhead and may not always be the
best data-protection method. For example, if you enable 3x mirroring, the specified
content is explicitly duplicated three times on the cluster. Depending on the amount
of content being mirrored, the mirrors can require a significant amount of capacity.


[Table legend: 50% efficient; mirrored protection.]

The table shows the relative protection overhead that is associated with each FEC requested
protection level available in OneFS. Indicators include when the FEC protection would result in
mirroring.

Considerations

As the cluster scales, the default protection may need adjusting. You may not want
to apply a higher protection to the entire cluster. Although you get better protection,
it is less efficient. Listed are areas to consider.

• The suggested protection feature is enabled on new clusters.105
• Higher protection levels impact utilization for small files.
• As protection increases, performance decreases.106
• Large107 archive clusters (20+ nodes) often require N+3.
• Set requested protection to mirroring or use SFSE for workflows with small108
files.
• For Gen 6, N+2d:1n or N+3d:1n1d protection is recommended.
• Protect critical datasets109 with different policies.

105 On cluster upgrades, the feature is disabled by default.

106 Because the system is doing more work to calculate and stripe the protection
data, the impact is approximately linear.

Challenge

Lab Assignment: Review the data protection levels:

• Node pool protection levels
• Directory level protection
• File level protection

107 Other clusters work well with N+2 or N+2d:1n.

108 Some workflows store millions of 1 KB to 4 KB files.

109The customer may want to protect some repositories at a higher level than the
cluster default.


Protection Management

Scenario

IT Manager: Okay, I think I get protection levels, but it seems there is
more to data protection than I thought.

Your Challenge: The IT Manager wants to differentiate suggested,
requested, and actual protection. Explain the editing of file pool and
node pool protection, and discuss the editing of file and directory level
protection.

Data Protection Types

1: Requested protection is what is configured. It determines the amount of
redundant data on the cluster.

2: Mirrored protection copies data to multiple locations. It can have 2 to 8 mirrors.

3: Suggested protection is the protection that OneFS recommends. It cannot be modified.

4: Actual protection is the level of protection that OneFS applies to data. It can be
more than the requested protection but never less.


Requested Protection

Requested Protection configuration is available at multiple levels. Each level is
used to control protection for specific reasons. A requested protection level is
assigned to every node pool. In OneFS, you can also set the requested protection at
the directory or individual file level. Management of the requested protection levels is
available using the WebUI, CLI, or PAPI.

[Figure: levels at which requested protection is set. Labels: cluster wide - default protection; node pool - default protection (H600 and A200 node pools); directory path; file.]

Requested Protection Settings

Cluster-wide settings

The cluster-wide default data protection setting is made using the default file
pool110 policy.

110The View default policy details window displays the current default file pool
policy settings. The current protection is displayed under requested protection. The
default setting is to use the requested protection setting at the node pool level as
highlighted in the Edit default policy details window.


[Figure: Edit default policy details window. Callouts: sets the default file pool policy, which applies to all files without a higher policy; using the node pool protection setting is recommended; available settings are listed in the Requested protection drop-down menu.]

To view or edit the default setting, go to File system > Storage pools > File pool policies, and
click View / Edit on the Default policy. The command isi filepool policies modify finance
--set-requested-protection +3:1 sets the requested protection for the file pool policy at +3d:1n.

Node pool settings

The default file pool policy protection setting uses the node pool or tier setting.
When a node pool is created, the default requested protection111 that is applied to
the node pool is +2d:1n.

The current requested protection for each node pool is displayed in the Tiers and
node pools section.

111 The minimum requested protection for an archive-series node pool is +3d:1n1d.
To meet the minimum, modify the archive-series node pool requested protection.


[Figure: node pool settings in the WebUI. Callouts: sets the requested protection per node pool; the minimum protection should meet the suggested protection; use the drop-down to expand the requested protection options; click Save changes after selecting the new requested protection level.]

To view and edit the requested protection setting for the node pools in the WebUI, go to the File
system > Storage pools > SmartPools page. The command isi storagepool nodepools modify
v200_25gb_2gb --protection-policy +2n sets the requested protection of a node pool to
+2n.

Directory and file settings

OneFS stores the properties for each file. To view the files and the next level
subdirectories, click the specific directory.

Manual settings112

Manual settings use case113

[Figure: File system explorer. Callouts: use the Search button to search for a file, or Browse to directly open a directory or file; uncheck to configure manually; to modify the protection level.]

To view directories and files on the cluster, go to File System > File system explorer.

112 Manual settings can be used to modify the protection on specific directories or
files. The settings can be changed at the directory, subdirectory, and file level. Best
practices recommend against using manual settings, because manual settings can
return unexpected results and create management issues as the data and cluster
age. Once manually set, reset the settings to default to use automated file pool
policy settings, or continue as manually managed settings. Manual settings
override file pool policy automated changes. Manually configuring is only
recommended for unique use cases. Manual changes are made using the WebUI
File system explorer or the CLI isi set command.

113 A use case for setting a directory requested protection with isi set -p 4x -A on
/ifs/finance/data is a /ifs/finance/data directory that requires a 4x mirror,
whereas all other node pool directories use the +2d:1n node pool setting.


Use Case - Node Pool or Directory Requested Protection

[Figure: the requested protection level on the H600 node pool is set at +2d:1n; the requested protection level on the A200 node pool is set at +3d:1n1d.]

The graphic shows a workflow that moves data to an archive tier of storage.

SmartPools file pool policies automate data management, including applying
requested protection settings to directories and files, the storage pool location, and
the I/O optimization settings.

• Archive tier on an A200 node pool
• File pool policy moves data from production H600 node pool to archive pool
• Protection on the archive node pool (+3d:1n1d) is higher than protection on the
production node pool (+2d:1n)
• You can set Requested protection settings at the node pool level or at the
directory level

Suggested Protection

Suggested protection refers to the visual status and CELOG event notification
when node pools are set below the calculated suggested protection level.

Suggested protection is important when monitoring the risk of data loss.


[Figure: node pool settings. Callouts: view of the node pool requested protection drop-down list; OneFS calculates and stores MTTDL for each node pool; data is at risk when below the MTTDL; as the cluster scales, OneFS changes the suggested protection.]

Caution: It is recommended that you do not specify a setting below
suggested protection. OneFS periodically checks the protection
level on the cluster, and alerts you if data falls below the
recommended protection.

Not using the suggested protection does not mean that data loss occurs, but it
does indicate that the data is at risk. Avoid anything that puts data at risk. What
commonly occurs is a node pool starts small and then grows beyond the configured
requested protection level. The once adequate +2d:1n requested protection level
becomes no longer appropriate, but is never modified to meet the increased
protection requirements.


Suggested Protection Status

The Suggested protection feature provides a method to monitor and notify users
when the requested protection setting is different than the suggested protection for
a node pool.

[Figure: SmartPools module health status. Callouts: suggested protection is part of the reporting in the tab; to modify the settings, click View/Edit; the v200_24gb_2gb node pool is indicated as having a requested protection level that is different than the suggested protection.]

The notification shows the suggested setting. Node pools that are within suggested protection
levels are not displayed.

Actual Protection

The actual protection114 applied to a file depends on the requested protection level,
the size of the file, and the number of node pool nodes.

The rules are:

• Actual protection must meet or exceed the requested protection level.
• Actual protection may change in the interests of efficiency (case 1).115
• Actual protection depends upon file size (case 2).116
• Both cases117

114 The actual protection level is the protection level OneFS sets. Actual protection
is not necessarily the same as the requested protection level.

[Chart legend: Orange - mirroring, low minimum size for requested protection; Blue - minimum for requested protection; Bold - actual requested protection; Gray - actual greater than max nodes at requested protection; Red - actual protection changes from requested protection.]

The chart indicates the actual protection that is applied to a file according to the number of nodes in the node pool. If
actual protection does not match the requested protection level, it may have been changed to be more efficient given the
file or number of nodes in the node pool.

115 With a requested protection of +2d:1n, a 2-MB file, and a node pool of at
least 18 nodes, the file is laid out as +2n.

116 A 128-KB file is protected using 3x mirroring, because at that file size the FEC
calculation results in mirroring.

117 In both cases, the actual protection applied to the file exceeds the minimum
drive loss protection of two drives and node loss protection of one node. The
exception to meeting the minimum requested protection is if the node pool is too
small and unable to support the requested protection minimums. For example,
consider a node pool with four nodes that is set to +4n requested protection. The
maximum supported protection is 4x mirroring in this scenario.


Actual Protection Representation

The actual protection is represented differently than requested protection. The
graphic shows the actual protection on a file in the output of the isi get
command.

[Figure: isi get output. Callouts: the command outputs information for all files in a directory or for a single file; the output represents the requested protection and the actual protection; in N+2/2, N is the number of data stripe units per stripe, 2 is the number of FEC stripe units per stripe, and /2 is the number of drives per node; no / in the output implies a single drive per node.]

Tip: COAL in the output shows if write-coalescing is enabled.
Enabled118 is recommended for optimal write performance.

118 With asynchronous writes, OneFS buffers writes in memory. However, if you
want to disable this buffering, you should configure the applications to use
synchronous writes. If that is not possible, disable write-coalescing, also known as
SmartCache.


isi get

The isi get command provides detailed file or directory information. The primary
options are -d <path> for directory settings and -DD <path>/<filename> for
individual file settings.

The graphic shows the isi get -DD output. The output has three primary
locations containing file protection. The locations are a summary in the header, line
item detail settings in the body, and a detailed per-stripe layout per drive at the
bottom.

Challenge

IT Manager:
Open participation questions:
Question: What is a use case for setting requested protection at
the cluster level? At the node pool level? At the directory level?


Data Layout

Scenario

IT Manager: You are doing a great job. Now, examine how OneFS lays
out the data on disks.

Your Challenge: The IT manager wants to understand data layout.
Describe the different data access patterns, and illustrate an access pattern
using concurrency and streaming.

Data Layout Overview


1: The number of nodes in a node pool affects the data layout because data
spreads across all nodes in the pool. The number of nodes in a node pool
determines how wide the stripe can be.

2: The nomenclature for the protection level is N+Mn, where N is the number of
data stripe units and Mn is the protection level. The protection level also affects
data layout. You can change the protection level down to the file level, and the
protection level of that file changes how it stripes across the cluster.

3: The file size also affects data layout because the system employs different
layout options for larger files than for smaller files to maximize efficiency and
performance. Files smaller than 128 KB are treated as small files. Due to the way
that OneFS applies protection, small files are triple mirrored.

4: The access pattern modifies both prefetching and data layout settings that are
associated with the node pool. Disk access pattern can be set at a file or directory
level so you are not restricted to using only one pattern for the whole cluster.

There are four variables that combine to determine how OneFS lays out data.

The variables make the possible outcomes almost unlimited when trying to
understand how the cluster behaves with varying workflows.

You can manually define some aspects of how OneFS determines what is best, but the
process is automated.

Data Access Patterns

An administrator can optimize layout decisions that OneFS makes to better suit the
workflow. The data access pattern influences how a file is written to the drives
during the write process.

1: Concurrency is the default data access pattern. It is used to optimize workflows
with many concurrent users accessing the same files. The preference is that each
protection stripe for a file is placed on the same drive or drives, depending on the
requested protection level. For example, for a large file with 20 protection stripes,
each stripe unit from each protection stripe would prefer placement on the same drive
in each node. Concurrency influences the prefetch caching algorithm to prefetch and
cache a reasonable amount of anticipated data during a read access.

2: Use Streaming for large streaming workflow data such as movie or audio files.
Streaming prefers to use as many drives as possible, within the given pool, when
writing multiple protection stripes for a file. Each file is written to the same sub pool
within the node pool. Streaming maximizes the number of active drives per node as
the streaming data is retrieved. Streaming also influences the prefetch caching
algorithm to be highly aggressive and gather as much associated data as possible.
The maximum number of drives for streaming is five drives per node across the
node pool for each file.


3: A random access pattern prefers using a single drive per node for all protection
stripes for a file, like a concurrency access pattern. With random however, the
prefetch caching request is minimal. Most random data does not benefit from
prefetching data into cache.

Access Pattern Example: Streaming with 1 MB File

A 1 MB file is divided into eight data stripe units and three FEC units. The data is
laid out in three stripes. With a streaming access pattern, more spindles are
preferred.

[Figure: streaming layout with N+1n protection. A 1024 KB file is split into 8 x 128 KB chunks and written as 3 stripes, 3 drives wide. Streaming prefers more disks.]

The graphic is a representation of a Gen 6 chassis with four nodes. Each node has five drive sleds.
Each drive sled has three disks. The orange disks represent a neighborhood. The disks that are used
are in the same neighborhood (orange); the layout does not traverse to disks in the other
neighborhoods (gray).
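
The stripe counts in this example can be checked with a short calculation. The Python sketch below assumes a 128 KB stripe unit (as in the graphic), a four-node pool, and one stripe unit per node per stripe with N+1n protection. The function name plan_stripes is hypothetical, and the sketch models only the stripe arithmetic, not how OneFS chooses drives for a given access pattern.

```python
from typing import List, Tuple

STRIPE_UNIT_KB = 128  # stripe unit size shown in the graphic


def plan_stripes(file_kb: int, nodes: int, fec_per_stripe: int) -> List[Tuple[int, int]]:
    """Split a file into stripes of (data_units, fec_units).

    Assumes each stripe places at most one stripe unit on each node.
    """
    data_per_stripe = nodes - fec_per_stripe
    if data_per_stripe < 1:
        raise ValueError("not enough nodes for the requested FEC units")
    remaining = -(-file_kb // STRIPE_UNIT_KB)  # ceiling division: data stripe units
    stripes = []
    while remaining > 0:
        data_units = min(data_per_stripe, remaining)
        stripes.append((data_units, fec_per_stripe))
        remaining -= data_units
    return stripes


if __name__ == "__main__":
    stripes = plan_stripes(file_kb=1024, nodes=4, fec_per_stripe=1)
    print("stripes:", stripes)
    print("data units:", sum(d for d, _ in stripes))
    print("FEC units:", sum(f for _, f in stripes))
```

Under these assumptions, the 1 MB file yields eight data stripe units and three FEC units in three stripes, matching the example. The access pattern then determines whether those stripes prefer one drive per node (concurrency) or several drives per node (streaming).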

Access Pattern Example: Concurrency with 1-MB File

A 1-MB file is divided into eight data stripe units and three FEC units. The data is
laid out in three stripes, one drive wide.


[Figure: concurrency layout with N+1 protection. A 1024 KB file is split into 8 x 128 KB chunks and written as 3 stripes, 1 drive wide. Concurrency prefers one drive per node for the file.]

The graphic is a representation of a Gen 6 chassis with four nodes. Each node has five drive sleds.
Each drive sled has three disks. The orange disk represents a neighborhood.

Tip: For more examples of data layout using concurrency click here.

Data Layout Management

Configuring the data access pattern is done on the file pool policy, or manually at
the directory and file level. Set data access patterns using the WebUI, or use isi
set at the directory and file level or isi filepool policies at the file pool policy
level.


Set on the file pool policy, or manually at the directory or file level.

Modify either the default policy or an existing file pool policy.

Concurrency is the default data access pattern.

For WebUI Administration, go to File systems > Storage pools > File pool policies.

Challenge

IT Manager:
Open participation questions:
Question: What is the preferred file layout with a streaming
access pattern?


Configuring Storage Pools


Storage Pools

Scenario

IT Manager: Before you configure file policies and tiering data, I want
you to explain the components of storage pools.

Your Challenge: The IT manager has tasked you to describe storage pool components.

Storage Pools Overview

Storage pools are an abstraction layer that encompasses disk pools, neighborhoods,
node pools, and tiers.

Storage pools monitor the health and status at the node pool level. Using storage
pools, multiple tiers of nodes (node pools) can all co-exist within a single file
system, with a single point of management.

Figure: A storage pool containing three node pools: F800 (high performance node pool),
H400 (low performance node pool), and A200 (archive node pool).


Storage Pool Anatomy

Storage pools differ between Gen 6 nodes and F200/600 nodes.

Gen 6 drive sleds have three, four, or six drives whereas the F200 has 4 drive bays
and the F600 has 8 drive bays.

Drives are segmented into disk pools, each creating a failure domain.

Figure callouts:
• Disk pool - failure domain
• Node pool - identical hardware attributes - created automatically
• Neighborhood - spans 4 to 19 nodes
• Gen 6.5 Neighborhood - spans 3 to 39 nodes

The graphic shows a Gen 6 node pool that has two chassis, eight nodes, and each node having five
drive sleds with three disks.

Storage Pool Components

Exploring the building blocks and features of storage pools helps you understand the
underlying structure when moving data between tiers. The storage pool
components SmartPools, File Pools, and CloudPools are covered in detail in other
topics.


Disk Pool

Disk pools are the smallest unit and are a subset of neighborhoods.

Disk pools provide separate failure domains. Each drive within the sled is in a
different disk pool, lessening the chance for data unavailability.

Data protection stripes or mirrors do not span disk pools. Because of this, a disk
pool is the granularity at which files are striped to the cluster. Disk pool
configuration is automatic and cannot be configured manually. Removing a sled does
not cause data unavailability as only one disk per disk pool is temporarily lost.

Neighborhood

Neighborhoods are a group of disk pools and can span from 4 up to 19 nodes for
Gen 6 nodes. A node pool with up to 19 nodes has a single neighborhood.
Neighborhoods are automatically assigned and not configurable.

The graphic shows a 20-node cluster with two neighborhoods.

Subpool/Neighborhood | F200/600 | Gen 6
Ideal number of nodes | 20 nodes | 10 nodes
Max number of nodes | 39 nodes | 19 nodes
Node pool splits at node number | 40 | 20

Gen 6 Neighborhood

A Gen 6 node pool splits into two neighborhoods when adding the 20th node 120.
One node from each node pair moves into a separate neighborhood.

Though a chassis-wide failure is highly unlikely, OneFS takes precautions against
chassis failure once a cluster is large enough. Nodes sharing a chassis are split
across fault domains, or neighborhoods, to reduce the number of node failures
occurring within one fault domain. The split is done automatically.

Figure callouts:
• Single neighborhood, 3 disk pools in a 3 disk per drive sled example
• Each neighborhood has 3 disk pools
• At 40 nodes, protection against chassis failure

Gen 6 neighborhoods - each color represents a disk pool.

120 After the 20th node is added, and up to the 39th node, no two disks in a given
drive sled slot of a node pair share a neighborhood. The neighborhoods split again
when the node pool reaches 40 nodes.


Gen 6 Chassis Failure

The graphic shows a 40 node cluster used to illustrate a chassis failure. Once the
40th node is added, the cluster splits into four neighborhoods, labeled NH 1
through NH 4.

Figure callouts:
• At 40 nodes, the node pool splits to 4 neighborhoods.
• At 40 nodes, no disks in a node are in the same disk pool as the other nodes in
  the chassis.
• The splits place each node in a chassis into a failure domain different from the
  other three nodes in the chassis. This protects against a very unlikely chassis
  failure.
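
The split points described above can be summarized in a small lookup. This is a sketch based only on the thresholds stated in this guide for Gen 6 node pools (one neighborhood up to 19 nodes, two from 20 to 39 nodes, four at 40 nodes). The function name is hypothetical, and behavior beyond 40 nodes is not covered here, so the sketch refuses to guess.

```python
def gen6_neighborhoods(node_count: int) -> int:
    """Return the neighborhood count for a Gen 6 node pool, per this guide."""
    if node_count < 4:
        raise ValueError("a Gen 6 node pool needs at least four nodes")
    if node_count <= 19:
        return 1  # single neighborhood
    if node_count <= 39:
        return 2  # split when the 20th node is added
    if node_count == 40:
        return 4  # second split, protects against chassis failure
    raise ValueError("behavior beyond 40 nodes is not described in this guide")


for n in (8, 19, 20, 39, 40):
    print(n, gen6_neighborhoods(n))
```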


Node Pool

A node pool is a group of similar or identical nodes. A node pool is the lowest
granularity of storage space that users manage.

OneFS can group multiple node pools with similar performance characteristics into
a single tier with the licensed version of SmartPools.

Creating multiple tiers in a cluster can meet the business requirements and
optimize storage usage.

The maximum number of like nodes in a node pool is 252.

The graphic shows an 8-node cluster with two node pools.

SmartPools

SmartPools is a licensable software module that provides basic features in an
unlicensed state and advanced features when licensed.

SmartPools Basic

The basic version of SmartPools supports virtual hot spares, enabling space
reservation in a node pool for reprotection of data. OneFS implements SmartPools
basic by default. You can create multiple node pools, but only a single tier and only
a single file pool. A single tier has only one file pool policy that applies the same
protection level and I/O optimization settings to all files and folders in the cluster.

SmartPools Advanced

More advanced features are available in SmartPools with a license. With the
advanced features you can create multiple tiers and file pool policies that direct
specific files and directories to a specific node pool or a specific tier. Advanced
features include the ability to create multiple storage tiers, multiple file pool policy
targets, and multiple file pool policies.

File Pools

File pools are the SmartPools logical layer, at which file pool policies are applied.

File pool policies provide a single point of management to meet performance,
requested protection level, space, cost, and other requirements.

User-created and user-defined policies are set on the file pools.

CloudPools

CloudPools is an extension of the SmartPools tiering capabilities in the OneFS
operating system. The policy engine optimizes data placement in a way that is
transparent to users and applications.

Moving the cold archival data to the cloud lowers storage cost and optimizes
storage resources.

CloudPools offers the flexibility of another tier of storage that is off-premise and
off-cluster.

CloudPools eliminates management complexity and enables a flexible choice of
cloud providers.


Node Loss: The loss of a node does not automatically start reprotecting data. Many
times a node loss is temporary, such as a reboot. If N+1 data protection is
configured on a cluster, and one node fails, the data is accessible from every other
node in the cluster. If the node comes back online, the node rejoins the cluster
automatically without requiring a rebuild. If the node is physically removed, it must
also be smartfailed. Smartfail a node only when it must be removed from the cluster
permanently.

Storage Pool CLI

The graphic shows the isi storagepool settings view command with user
configured settings highlighted.

Serviceability

Listed are the CLI options that can help get information about storage pools.

• To view the storage pool status and details
  • isi storagepool list
• To view the health of storage pools
  • isi status -p


Challenge

Lab Assignment: Go to the lab and verify the storage pool settings.


File Pools

Scenario

IT Manager: Our media team needs their storage on disks that do not
compete with the other disks.

Your Challenge: The IT manager has tasked you to segregate data into
different node pools.

File Pool Policies Overview

Figure:
• Tier 1 (F600): file pool policy optimized for random access, +2d:1n
• Tier 2 (H400): file pool policy optimized for concurrent access, +2d:1n
• Tier 3 (F200): file pool policy optimized for streaming access, +3d:1n1d

Cluster with 3 node pools each with a file pool policy.


File pool policies automate file movement, enabling users to identify and move
logical groups of files.
• User-defined filters123
• File-based, not hardware-based124
• User-defined or default protection and policy settings125

The example shows that each policy has a different optimization and protection
level. A file that meets the policy criteria for tier 3 is stored in the tier 3 node pool
with +3d:1n1d protection. Also, the file is optimized for streaming access.

Default File Pool Policy

The default file pool policy is defined under the default policy.

Select each information "i" button for setting details.

123 Files and directories are selected using filters, and actions are applied to files
matching the filter settings. The policies are used to change the storage pool
location, requested protection settings, and I/O optimization settings.

124 Each file is managed independently of the hardware, and is controlled through the
OneFS operating system.

125 Settings are based on the user-defined and default storage pool policies. File
pool policies add the capability to modify the settings at any time, for any file or
directory.


1: The individual settings in the default file pool policy apply to files without settings
that are defined in another file pool policy that you create. You cannot reorder or
remove the default file pool policy.

2: To modify the default file pool policy, click File system, click Storage pools,
and then click the File pool policies tab. On the File pool policies page, next to
the Default policy, click View/Edit.

3: You can choose to have the data that applies to the Default policy target a
specific node pool or tier or go anywhere. Without a license, you cannot change
the anywhere target. If existing file pool policies direct data to a specific storage
pool, do not configure other file pool policies with anywhere.

4: You can define the SSD strategy for the Default policy.

5: You can specify a node pool or tier for snapshots. The snapshots can follow the
data, or go to a different storage location.

6: Assign the default requested protection of the storage pool to the policy, or set a
specified requested protection.


7: Under I/O optimization settings, SmartCache is enabled by default. SmartCache
writes data to a write-back cache instead of immediately writing the data to disk.
OneFS can write the data to disk at a time that is more convenient.

8: In the Data access pattern section, you can choose between Random,
Concurrency, or Streaming.

• Streaming access enables aggressive prefetch (also called read-ahead) on
reads, increases the size of file coalescers in the OneFS write cache, and
changes the layout of files on disk (uses more disks in the FEC stripes).
Streaming is most useful in workloads that do heavy sequential reads and
writes.
• Random essentially disables prefetch for both data and metadata. Random is
most useful when the workload I/O is highly random. Using Random greatly
reduces the cache "pollution" that could result from all the random reads, for
example prefetching blocks into cache that are never read.
• Concurrency, the default access setting, is a compromise between Streaming
and Random. Concurrency enables some prefetch, which helps sequential
workloads, but not so much that the cache gets "polluted" when the workload
becomes more random. Concurrency is for general purpose use cases, good
for most workload types or for mixed workload environments.


File Pool Policies Use Case

Figure callouts:
• Name the policy.
• Configure the matching criteria. This example shows using a filename match.
• The tier is created on the Storage pools > SmartPools page, or using the
  "isi storagepool" command.
• The targeted files get a 3x mirror protection level.
• The targeted files use a streaming access pattern.


This example is a use case where a media-orientated business unit wants greater
protection and an access pattern that is optimized for streaming.

A tier that is called media_tier with a node pool has been created.

The business unit targets their mp4 marketing segments to the media_tier where
the hosting application can access them.


File Pool Policy Filters

Modify time (mtime)

Access time (atime)

Metadata change time (ctime)

Create time (birthtime)

Drop-down list of filter choices to build the policy criteria.

Create the filters in the File matching criteria section when creating or editing a
file pool policy.

Filter elements:
• Filter type126
• Operators127
• Multiple criteria128

126 File pool policies with path-based policy filters and storage pool location actions
are run during the write of a file matching the path criteria. Path-based policies are
first started when the SmartPools job runs; after that, they are started during the
matching file write. File pool policies with storage pool location actions, and filters
that are based on other attributes besides path, write to the node pool with the
highest available capacity. The initial write ensures that write performance is not
sacrificed for initial data placement.

SSD Options

With the exception of F-Series nodes, if a node pool has SSDs, by default the L3
cache is enabled on the node pool. To use the SSDs for other strategies, first
disable L3 cache on the node pool. Manually enabling SSD strategies on specific
files and directories is not recommended.

Select each tab for more information.

Metadata Read Acceleration

SSDs for Metadata Read Acceleration is the recommended setting. The setting
places one metadata mirror on SSDs; the other mirrors and the data stay on HDDs.

Pros:
• Do not need numerous SSDs to be effective.
• Benefits random reads by allowing quicker access to metadata.
• Helps Job Engine - all random lookups and treewalks are faster as one copy of
  metadata is always on SSD.

Cons:
• Does not help random writes - metadata update hits HDD.
• Usually shows small SSD utilization: clients may ask "Where is the value" or
  complain it was over configured.

127 Operators can vary according to the selected filter. You can configure the
comparison value, which also varies according to the selected filter and
operator. The Ignore case box should be selected for files that are saved to the
cluster by a Windows client.

128 The policy requires at least one criterion, and allows multiple criteria. You can
add AND or OR statements to a list of criteria. Using AND adds a criterion to the
selected criteria block. Files must satisfy each criterion to match the filter. You can
configure up to three criteria blocks per file pool policy.

Metadata Read/Write Acceleration

Metadata read/write acceleration requires more SSD space. The setting writes all
metadata mirrors to SSDs and can consume up to six times more SSD space.

Pros:
• Metadata is on SSDs - speeds random lookups and treewalks.
• Metadata updates hit SSDs - speeds up creates, writes, and deletes, including
  snapshot deletes.

Cons:
• Need many SSDs to be effective, typically need four times the metadata read
  amount.
• Hard to size - who knows how many files they will have.
• Overfilling SSDs can have significant impact - manage with care.
• Does not show the full utilization until the file system capacity is high.

Data and Metadata

Using SSDs for data and metadata requires the most space. The setting writes all
data and metadata for a file to SSDs.

Pros:
• Only way to guarantee data pins to SSDs - good for small, intense workloads.
• Can co-habit with metadata acceleration - cannot mix with L3 on same node pool.
• Use file pool policies designating specific path for the data on SSDs.

Cons:
• Expensive.
• Must manage path capacity to avoid overfilling SSDs - directory quota can help.
• Must manage total SSD capacity utilization - can push metadata from SSD, which
  has a wide impact.
• Heavy workloads may cause queueing to SSD, slowing metadata operations for
  other workloads.

Avoid SSDs

Using the avoid SSDs option affects performance. This option writes all file data
and all metadata mirrors to HDDs. Typically, use this setting when implementing L3
cache and GNA in the same cluster. You create a path-based file pool policy that
targets an L3 cache enabled node pool. The data SSD strategy and snapshot SSD
strategy for this L3 cache enabled node pool should be set to 'Avoid SSD'.


File Pool Policies Jobs

The FilePolicy job on the WebUI Cluster management > Job operations > Job types page.

File pool policies are applied to the cluster by a job.

• SetProtectPlus job129 - SmartPools unlicensed
• SmartPools job130 - SmartPools licensed
• FilePolicy job131 - find files needing policy changes (OneFS 8.2.0)
• SmartPoolsTree job132 - Selectively apply SmartPools file pool policies

129 The SetProtectPlus job applies the default file pool policy.

130 When SmartPools is licensed, the SmartPools job processes and applies all file
pool policies. By default, the job runs at 22:00 hours every day at a low priority.

Policy Template

Policy templates on the WebUI File system > Storage pools > File pool policies page.

Template settings are preset to the name of the template along with a brief
description. You can change the settings.

Template has a configured filter to achieve the specified function.

Template considerations:
• Opens a partially populated, new file pool policy.
• You must rename the policy.
• You can modify and add criteria and actions.
• Use in web administration interface only.

131 Uses a file system index database on the file system instead of the file system
itself to find files needing policy changes. By default, the job runs at 22:00 hours
every day at a low priority. The FilePolicy job was introduced in OneFS 8.2.0.

132 The SmartPoolsTree job is used to selectively apply SmartPools file pool policies.
The job runs the "isi filepool apply" command. The Job Engine manages the
resources that are assigned to the job. The job enables testing of file pool policies
before applying them to the entire cluster.


File Pool Policies Order

The order of the policy matters.


• The first matching policy is applied.
• Create external policy list with filter criteria such as path or file name.
• Prioritize match filter criteria order.
• Reorder policies to match prioritization.
• Default policy completes unassigned actions.
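
The ordering rule above can be illustrated with a minimal sketch. It is not how OneFS evaluates policies internally; the `Policy` class, the filter functions, the paths, and the `archive_tier` name are hypothetical, and only `media_tier` comes from the earlier use case. The sketch shows only that the first matching policy in the list wins.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Policy:
    name: str
    matches: Callable[[str], bool]  # hypothetical filter on the file path
    target: str                     # hypothetical tier or node pool name


def first_match(path: str, policies: Sequence[Policy]) -> Optional[Policy]:
    """Return the first policy whose filter matches, or None.

    None stands for "no custom policy matched", where the default policy applies.
    """
    for policy in policies:
        if policy.matches(path):
            return policy
    return None


policies = [
    Policy("media", lambda p: p.endswith(".mp4"), "media_tier"),
    Policy("marketing", lambda p: p.startswith("/ifs/marketing/"), "archive_tier"),
]

# This file matches both filters; only the first policy in the list is applied.
hit = first_match("/ifs/marketing/launch.mp4", policies)
print(hit.name)  # media
```

Swapping the two list entries would send the same file to the other target, which is why reordering policies to match the intended priority matters.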

File Pool Policy Considerations

Plan to add more node capacity when the cluster reaches 80% so that it does not
reach 90%. The cluster needs the extra capacity for moving around data, and for
the VHS space to rewrite data when a drive fails. Listed are more considerations.
• Avoid overlapping file policies where files may match more than one rule. If data
matches multiple rules, only the first rule is applied.
• File pools should target a tier and not a node pool within a tier.
• You can use the default policy templates as examples.
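
The 80% and 90% guidance above can be expressed as a simple check. This is an illustrative sketch only; the function name and the status labels are assumptions, and the capacity figures would come from whatever monitoring is already in place.

```python
def capacity_status(used_tb: float, total_tb: float) -> str:
    """Classify cluster fullness against the 80% / 90% guidance."""
    if total_tb <= 0:
        raise ValueError("total capacity must be positive")
    pct = 100.0 * used_tb / total_tb
    if pct >= 90:
        return "critical"        # the level the guidance says to stay below
    if pct >= 80:
        return "plan-expansion"  # time to plan for more node capacity
    return "ok"


print(capacity_status(410, 500))  # 82% used -> plan-expansion
```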


Serviceability

Example output of the 'isi filepool apply <path/file> -n -v -s' command with truncated output.

Listed here are the CLI options that can help get information about file pools.
• If file pool policy rules are not being applied properly, check the policy order.
• Test file pool policy before applying.
  • isi filepool apply
  • Syntax: isi filepool apply <path/file> -n -v -s
  • Options:
    • -n is to test but not apply.
    • -v is for verbose output.
    • -s prints statistics on processed files.
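
As a sketch of how the test command above might be assembled in a script, the helper below builds the argument list using only the syntax and the three options listed in this section. The helper name and the example path are hypothetical, and the sketch only prints the command rather than running it.

```python
import shlex


def filepool_apply_cmd(path, dry_run=True, verbose=True, stats=True):
    """Build the 'isi filepool apply' argument list from the options above."""
    cmd = ["isi", "filepool", "apply", path]
    if dry_run:
        cmd.append("-n")  # test but do not apply
    if verbose:
        cmd.append("-v")  # verbose output
    if stats:
        cmd.append("-s")  # statistics on processed files
    return cmd


# Hypothetical path used for illustration only
cmd = filepool_apply_cmd("/ifs/data/media/clip.mp4")
print(shlex.join(cmd))
# isi filepool apply /ifs/data/media/clip.mp4 -n -v -s
```

Keeping `dry_run=True` as the default reflects the advice to test a file pool policy before applying it.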

Challenge

Lab Assignment: Go to the lab and configure a file pool policy.


SmartPools

Scenario

IT Manager: Before you configure file tiering, I want you to explain to me
the OneFS SmartPools settings.

Your Challenge: The IT manager has tasked you to describe the
SmartPools settings and then configure SmartPools.

SmartPools Overview

SmartPools enables the grouping of nodes into storage units that include node
pools, CloudPools, and tiers.

With SmartPools, you can segregate data based on its business value, putting data
on the appropriate tier of storage with appropriate levels of performance and
protection.

Different generations133 of PowerScale storage can co-exist within a single storage
pool.

Use SmartPools to manage global storage pool settings.

133Node pool membership changes through the addition or removal of nodes to the
cluster. Typically, tiers are formed when adding different node pools on the cluster.


File system > Storage pools page.

SmartPools Licensing

SmartPools is a licensable software module that provides basic features in an
unlicensed state and advanced features when licensed.

Because a licensed cluster can have multiple data target locations, some additional
target options are enabled in some global settings.

Function | Unlicensed | Licensed
Automatic node pool provisioning | Yes | Yes
Number of tiers | Multiple | Multiple
Number of file pool policies; file pool policy targets | One (default file pool policy) | Multiple
File pool policy filters | No | Multiple
Policy-based protection level; metadata acceleration setting; I/O optimization; snapshot target | One | Multiple
Specify spillover target | No | Yes
VHS and GNA | Yes | Yes

SmartPool Settings

Cache Statistics

The isi_cache_stats command assesses the performance of the various levels of
cache at a point in time. Statistics for L1, L2, and L3 cache are
displayed for both data and metadata.

Output showing the L3 statistics.

GNA

SmartPools can automatically transfer data among tiers with different performance
and capacity characteristics.

Global namespace acceleration, or GNA, enables the use of SSDs for metadata
acceleration across the entire cluster.


Figure callouts:
• Minimum 1.5% of all disk capacity must be SSD and 20% of nodes must contain SSDs
• Use SSDs to store metadata mirror in different node pools
• Accelerates all namespace reads across cluster
• L3 cache enabled node pools not considered in GNA calculations

CLI command to enable GNA: isi storagepool modify --global-namespace-acceleration-enabled yes

GNA Aspects

The table highlights the pros and cons of enabling GNA.

Pros:
• Allows metadata read acceleration for non-SSD nodes - need some nodes with SSDs
• Helps Job Engine and random reads

Cons:
• Difficult to manage and size the disk.
• Hard rules and limits
• Links expansion of one tier to another tier to adhere to the limits

L3Cache

L3 cache is enabled by default for all new node pools that are added to a cluster.

L3 cache is either on or off and no other visible configuration settings are available.

Any node pool with L3 cache enabled is excluded from GNA space calculations
and does not participate in GNA enablement.


Figure callouts:
• You cannot enable L3 cache if the node pool has no unprovisioned SSDs.
• L3 cache is enabled on a new node pool by default.
• L3 cache cannot coexist with other SSD strategies.

The left graphic shows the global setting. The right graphic shows L3 cache enabled
or disabled on each node pool separately.

VHS

Virtual hot spare, or VHS, allocation enables space to rebuild data when a drive
fails.

When selecting the option to reduce the amount of available space, free-space
calculations exclude the VHS reserved space.

OneFS uses the reserved VHS free space for write operations unless you select
the option to deny new data writes.

Figure callouts:
• Default - all available free space on a cluster used to rebuild data
• Allocate by percentage of disk space, or the number of virtual drives, or a
  combination of both
• 1 to 4 virtual drives in each node pool
• 0 to 20 percent of total disk space in each node pool

Command example that reserves 10% capacity for VHS: isi storagepool
settings modify --virtual-hot-spare-limit-percent 10


Spillover

Spillover is node capacity overflow management.

With the licensed SmartPools module, you can direct data to spillover to a specific
node pool or tier group.

If spillover is disabled, a file is not moved to another node pool.

Figure callouts:
• Options configure how OneFS handles a write operation when a node pool is full.
• Direct data to spillover to a specific node pool or tier group

CLI command to disable spillover: isi storagepool settings modify --spillover-enabled no

Actions

The SmartPools action settings provide a way to enable or disable managing
requested protection settings and I/O optimization settings.

If you clear the box (disable), SmartPools does not modify or manage settings on
the files.


Figure callouts:
• Override manually managed requested protection
• Override manually managed I/O optimization
• Overrides any manually managed requested protection setting or I/O optimization
• Useful if manually managed settings were made using file system explorer or the
  isi set command.

CLI command for setting Automatically Manage Protection to none: isi storagepool
settings modify --automatically-manage-protection none

CLI command to set the Automatically Manage I/O Optimization: isi storagepool
settings modify --automatically-manage-io-optimization {all | files_at_default | none}

Protection example: If a +2d:1n protection is set and the disk pool suffers three
drive failures, the data that is not lost can still be accessed. Enabling the option
ensures that intact data is still accessible. If the option is disabled, the intact file
data is not accessible.

GNA can be enabled if 20% or more of the nodes in the cluster contain SSDs and
1.5% or more of the total cluster storage is SSD-based. The recommendation is
that at least 2.0% of the total cluster storage is SSD-based before enabling GNA.
Going below the 1.5% SSD total cluster space capacity requirement automatically
disables GNA metadata. If you SmartFail a node that has SSDs, the SSD total size
percentage or node percentage containing SSDs could drop below the minimum
requirement, disabling GNA. Any node pool with L3 cache enabled is excluded
from GNA space calculations and does not participate in GNA enablement.

GNA also uses SSDs in one part of the cluster to store metadata for nodes that
have no SSDs. The result is that critical SSD resources are maximized to improve
performance across a wide range of workflows.
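
The two GNA thresholds above (at least 20% of nodes with SSDs and at least 1.5% of total capacity on SSD) can be checked with a short calculation. The sketch below is an illustration, not the OneFS implementation: the `NodePool` fields are hypothetical, and it assumes that L3 cache enabled node pools are dropped from both the node count and the capacity totals.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class NodePool:
    nodes: int
    nodes_with_ssd: int
    total_tb: float
    ssd_tb: float
    l3_enabled: bool = False


def gna_eligible(pools: Sequence[NodePool]) -> bool:
    """Check the 20% node and 1.5% capacity thresholds for enabling GNA."""
    counted = [p for p in pools if not p.l3_enabled]  # assumption: L3 pools excluded
    nodes = sum(p.nodes for p in counted)
    total = sum(p.total_tb for p in counted)
    if nodes == 0 or total == 0:
        return False
    node_share = sum(p.nodes_with_ssd for p in counted) / nodes
    ssd_share = sum(p.ssd_tb for p in counted) / total
    return node_share >= 0.20 and ssd_share >= 0.015


pools = [
    NodePool(nodes=4, nodes_with_ssd=4, total_tb=400, ssd_tb=20),
    NodePool(nodes=8, nodes_with_ssd=0, total_tb=800, ssd_tb=0),
]
print(gna_eligible(pools))  # 4/12 nodes have SSDs, 20/1200 TB is SSD
```

In this example a third of the nodes contain SSDs and about 1.7% of capacity is SSD, so both minimums are met, although the SSD share is still below the recommended 2.0%.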


VHS example: If specifying two virtual drives or 3%, each node pool reserves
virtual drive space that is equivalent to two drives or 3% of their total capacity for
VHS, whichever is larger. You can reserve space in node pools across the cluster
for this purpose, equivalent to a maximum of four full drives. If using a combination
of virtual drives and total disk space, the larger number of the two settings
determines the space allocation, not the sum of the numbers.
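
The "larger of the two, not the sum" rule in the VHS example can be shown with a small calculation. This is a sketch with hypothetical names and sizes; it treats a value of zero as "setting not used" and applies the ranges given earlier (up to four virtual drives, up to 20 percent).

```python
def vhs_reserved_tb(pool_total_tb: float, drive_tb: float,
                    virtual_drives: int = 0, percent: float = 0.0) -> float:
    """Return the VHS reservation for one node pool, in TB."""
    if not 0 <= virtual_drives <= 4:
        raise ValueError("virtual drives must be between 0 and 4")
    if not 0 <= percent <= 20:
        raise ValueError("percent must be between 0 and 20")
    by_drives = virtual_drives * drive_tb
    by_percent = pool_total_tb * percent / 100.0
    return max(by_drives, by_percent)  # the larger setting wins, not the sum


# Hypothetical node pool: 480 TB total, 8 TB drives, two virtual drives or 3%
print(vhs_reserved_tb(480, 8, virtual_drives=2, percent=3))  # two drives (16 TB) exceed 3% (14.4 TB)
```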

SSD Usage Comparison

Comparison of L3 cache with the other SSD usage strategies

Assists With | L3 | Metadata Read | Metadata Read/Write | GNA | Data on SSD
Metadata Read | Yes | Yes | Yes | Yes | No
Metadata Write | No | 1 Mirror | All Mirrors | 1 Additional Mirror | No
Data Read | Yes | No | No | No | Yes
Data Write | No | No | No | No | Yes
Job Engine Performance | Yes | Yes | Yes | Yes | No
Granularity | Node Pool | Manual | Manual | Global | Manual
Ease of Use | High | Medium | Medium | Medium | Lowest

SmartPools Considerations

Listed are areas to consider when discussing SmartPools.

• SmartPools automatic provisioning divides equivalent node hardware into disk
pools. Subdividing the node disks into separately protected disk pools increases
resiliency against multiple disk failures.
• Disk pools are not user configurable, and a disk drive is a member of only one
disk pool or neighborhood.
• Node pools must have at least four nodes for Gen 6 and at least three nodes for
the F200/600. The default is one node pool per node type and configuration.
• The file pool policy default is that all files are written anywhere on the cluster.
To target more node pools and tiers, activate the SmartPools license.

Challenge

Lab Assignment: Configure SmartPools.


CloudPools

Scenario

IT Manager: Next, take the file pool policies to the CloudPools level. For
some of the long-term archive data, the group is looking at cloud
options.

Your Challenge: The IT manager wants you to explain CloudPools and
how file pool policies are used with CloudPools.

CloudPools Overview and Example Video

CloudPools offers the flexibility of another tier of storage that is off-premise and
off-cluster. Essentially, CloudPools provides a lower TCO134 for archival-type
data. Customers who want to run their own internal clouds can use a PowerScale
installation as the core of their cloud.

The video provides a CloudPools overview and use case. See the student guide for
a transcript of the video.

134 CloudPools optimize primary storage with intelligent data placement.
CloudPools eliminates management complexity and enables a flexible choice of
cloud providers.



Link:
https://edutube.emc.com/html5/videoPlayer.htm?vno=wx4VTLcN32kSlHGFwGLE1Q

Shown is an Isilon cluster with twelve nodes. A key benefit of CloudPools is the
ability to interact with multiple cloud vendors. Shown in the graphic are the
platforms and vendors that are supported as of OneFS 8.1.1.

CloudPools is an extension of the SmartPools tiering capabilities in the OneFS
operating system. The policy engine seamlessly optimizes data placement that is
transparent to users and applications. Moving the cold archival data to the cloud
lowers storage cost and optimizes storage resources.

Let us look at an example. Each chassis in the cluster represents a tier of storage.
The topmost chassis is targeted for the production high-performance workflow and
may have nodes such as F800s. When data is no longer in high demand,
SmartPools moves the data to the second tier of storage. The example shows the
policy moves data that is not accessed and that is over thirty days old. Data on the
middle tier may be accessed periodically. When files are no longer accessed for
more than 90 days, SmartPools archives the files to the lowest chassis or tier such
as A200 nodes.

The next policy moves the archive data off the cluster and into the cloud when data
is not accessed for more than 180 days. Stub files, also called SmartLinks, are
created. Stub files consume approximately 8 KB of space on the Isilon cluster.
Files that are accessed or retrieved from the cloud, or files that are not fully moved
to the cloud, have parts that are cached on the cluster and are part of the stub file.
The storing of CloudPools data and user access to data that is stored in the cloud
is transparent to users.
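
The age thresholds in the example above can be sketched as a simple decision function. This is only an illustration of the 30, 90, and 180 day rules from the video example, not how the SmartPools policy engine is implemented; the function name and tier labels are hypothetical.

```python
def target_tier(days_since_access: int) -> str:
    """Return the tier a file would land on in the video example.

    The labels are illustrative; a real cluster uses file pool policies.
    """
    if days_since_access > 180:
        return "cloud"        # archived off-cluster, SmartLink left behind
    if days_since_access > 90:
        return "archive"      # lowest chassis, such as A200 nodes
    if days_since_access > 30:
        return "middle"       # second tier, accessed periodically
    return "performance"      # topmost chassis, such as F800 nodes


for age in (10, 45, 120, 200):
    print(age, target_tier(age))
```

The checks run from the oldest threshold down so that each file matches exactly one tier.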

CloudPools files undergo a compression algorithm and are then broken into 2 MB
cloud data objects, or CDOs, for storage. The CDOs conserve space on the
cloud storage resources. Internal performance testing does note a performance
penalty for applying compression and for decompressing files on read. Encryption
is applied to file data that is transmitted to the cloud service. Each 128 KB file
block is encrypted using AES-256 encryption and then transmitted as an object to
the cloud. Internal performance testing notes a small performance penalty for
encrypting the data stream.
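
To make the sizes concrete, the following sketch counts how many 2 MB CDOs and 128 KB blocks a file of a given size would map to. It assumes binary units (2 MB = 2 x 1024 x 1024 bytes) and ignores the effect of compression on the stored size; the function name is hypothetical.

```python
CDO_SIZE = 2 * 1024 * 1024   # 2 MB cloud data object (binary units assumed)
BLOCK_SIZE = 128 * 1024      # 128 KB file block that is encrypted


def cloud_layout(file_size: int) -> tuple:
    """Return (number of CDOs, number of 128 KB blocks) for file_size bytes."""
    cdos = -(-file_size // CDO_SIZE)      # ceiling division
    blocks = -(-file_size // BLOCK_SIZE)
    return cdos, blocks


cdos, blocks = cloud_layout(5 * 1024 * 1024)  # a 5 MB file
print(cdos, blocks)
```

Under these assumptions, each full CDO holds sixteen 128 KB blocks.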

CloudPools Considerations

• CloudPools is a licensed feature
• Private and supported providers: PowerScale, ECS, Amazon S3, Microsoft Azure,
Virtustream, Google cloud, and Alibaba cloud
• Access performance based on cloud connection
• Writes to the cloud file are cached locally - cache is flushed at designated
intervals

CloudPools uses the SmartPools framework to move data and state information to
off-cluster storage while retaining the ability to read, modify, and write to data.


Consider the following:


• Compression135
• 2 MB CDO size
• Compliance mode136

Resources: See the CloudPools Administration Guide in the
PowerScale Info Hub for information not covered in this topic, such
as best practices and troubleshooting.

CloudPools Administration

Configure and manage CloudPools from the WebUI File system, Storage pools
page, CloudPools tab. Managing CloudPools using the CLI is done with the isi
cloud command.

135 In OneFS 8.2, CloudPools compress data before sending it over the wire.

136 CloudPools in OneFS 8.2 prevents enabling compliance mode on stubs.
Archiving a file before it is committed and moving a stub into a compliance directory
are both denied.


CloudPools Tab

Configure the connection details for a cloud service

Configure the CloudPool

File system > Storage pools page > CloudPools page.

Once the SmartPools and CloudPools licenses are applied, the WebUI shows the
cloud storage account options.

After a cloud storage account is defined and confirmed, the administrator can
define the cloud pool itself.

The file pool policies enable the definition of a policy to move data out to the cloud.


Cloud Storage Account

Must be unique

Type of the cloud account

The URI must use HTTPS and match the URI used to set up the cloud account.

The User Name is the name that is provided to the cloud provider.

The Key is the account password that is provided to (or received from) the cloud
provider.

The graphic shows the window for creating a cloud storage account.
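
The field rules called out in the graphic can be expressed as a small pre-check. This is a sketch of the two rules stated above (unique name, HTTPS URI) and is not part of OneFS; the function name and arguments are hypothetical.

```python
from urllib.parse import urlparse


def check_account_fields(name: str, uri: str, existing_names: set) -> list:
    """Return a list of problems with a proposed cloud storage account."""
    problems = []
    if name in existing_names:
        problems.append("account name must be unique")
    parsed = urlparse(uri)
    if parsed.scheme != "https" or not parsed.netloc:
        problems.append("URI must use HTTPS")
    return problems


print(check_account_fields("archive01", "http://cloud.example.com", {"archive01"}))
```

Running a check like this before opening the WebUI form can catch a mistyped URI early. It cannot confirm that the URI matches the one used to set up the cloud account.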

Cloud Storage Target

After creating a storage account, create a CloudPool and associate or point it to the
account.


The Name must be unique to the cluster.

Type of cloud account - the drop-down list has the supported options.

The Vendor name and Description are optional fields.

Activated after the Type is selected and the configured storage accounts are listed
on the drop-down list.

The graphic shows the window to Create a CloudPool.

CloudPools SmartLink

Run the isi get -D command to see files archived to the cloud using
CloudPools.

The example checks to see if the local version on the cluster is a SmartLink file.

If the SmartLinked field returns True, the file is archived.

If the output is False, the file is not archived.
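
When checking many files, the SmartLinked field can be picked out of saved command output with a short script. The sketch below assumes that the isi get -D output contains a line of the form "SmartLinked: True" or "SmartLinked: False"; the exact layout of that line is an assumption, and the function name is hypothetical.

```python
import re
from typing import Optional


def is_smartlinked(isi_get_output: str) -> Optional[bool]:
    """Return True/False from the SmartLinked field, or None if it is absent.

    Assumes a line such as '*  SmartLinked:   True' in the captured output.
    """
    match = re.search(r"SmartLinked:\s*(True|False)", isi_get_output)
    if match is None:
        return None
    return match.group(1) == "True"


sample = "*  Stubbed file flags: 0\n*  SmartLinked:        True\n"  # invented sample text
print(is_smartlinked(sample))
```

Returning None when the field is missing keeps an unexpected output format from being mistaken for "not archived".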


File Pool Policies - CloudPools

Excerpt from the WebUI > Storage pools page.

SmartPools file pool policies are used to move data from the cluster to the selected
CloudPools storage target.

When configuring a file pool policy, you can apply CloudPools actions to the
selected files.

CloudPools Settings

You may want to modify the settings for the file pool policy based on your
requirements. Modifications are not necessary for most workflows. You can elect to
encrypt and compress data.

1: The default CloudPools setting allows you to archive files with snapshot
versions, but you can change the default setting.

2: You can encrypt data prior to archiving it to the cloud. Cloud data is decrypted
when accessed or recalled.

3: You can compress data prior to archiving to the cloud. Cloud data is
decompressed when accessed or recalled.


4: Set how long to retain cloud objects after a recalled file replaces the SmartLink
file. After the retention period, the cloud objects garbage collector job cleans up the
local resources allocated for the SmartLink files, and removes the associated cloud
objects.

5: If a SmartLink file has been backed up and the original SmartLink file is
subsequently deleted, associated cloud objects are deleted only after the retention
time of the backed-up SmartLink file has expired.

6: If a SmartLink file has been backed up and the original SmartLink file is
subsequently deleted, associated cloud objects are deleted only after the original
retention time, or a longer incremental or full backup retention period, has expired.

7: Specifies how often SmartLink files modified on the cluster are written to their
associated cloud data objects.

8: Determines whether cloud data is cached when a file is accessed on the local
cluster.

9: Specifies whether cloud data is fully or partially recalled when you access a
SmartLink file on the cluster.

10: Specifies how long the system retains recalled cloud data that is in the cache of
associated SmartLink files.

The graphic shows various default advanced CloudPool options that are configured.

CLI for CloudPools

The output of the isi cloud command shows the actions that you can take.

1: Use to grant access to CloudPool accounts and file pool policies. You can add
and remove cloud resources, list cluster identifiers, and view cluster details.

2: Used to manage CloudPool accounts. You can create, delete, modify, and
view a CloudPool account, and list the CloudPool accounts.

3: Use to archive or recall files from the cloud. Specify files individually, or use a file
matching pattern. Files that are targeted for archive must match the specified file
pool policy, or any file pool policy with a cloud target.

4: Use to manage CloudPools TLS client certificates. You can delete, import,
modify, view, and list certificates.

5: Use to manage CloudPool jobs. Use to cancel, create, pause, resume,
list, and view jobs. A CloudPools system job such as cache-writeback cannot
be canceled.

6: Use to configure and manage a CloudPool pool. You can create, delete,
modify, list, and view pools. When a pool is deleted, OneFS no longer accesses
the associated cloud storage account. If a file pool policy references the
CloudPool, OneFS does not allow the delete.

7: Use to manage network proxies. You can create, delete, modify, list, and
view proxies. CloudPools prevents deletion of a proxy that is attached to a cloud
storage account.

8: Files that are stored in the cloud can be fully recalled using the isi cloud
recall command. Recall can only be done using the CLI. When recalled, the full
file is restored to its original directory, and the recalled file overwrites the stub
file. The file may still be subject to the file pool policy that originally archived it,
in which case it is rearchived to the cloud on the next SmartPools job run. If
re-archiving is unintended, move the recalled file to a different, unaffected
directory. You can start the command for an individual file or recursively for all
files in a directory path.


9: Use to manage CloudPool top-level settings. You can list and modify
CloudPool settings, and regenerate the CloudPool master encryption key.

10: Use to restore the cloud object index (COI) for a cloud storage account on the
cluster. The isi cloud access add command also restores the COI for a cloud
storage account.

C2S Cloud Support

Commercial Cloud Service, or C2S, is the federal government private cloud.
Federal customers are mandated to use the C2S cloud.

• Support137
• Integration138
• No Internet connection139

137C2S support delivers full CloudPools functionality for a target endpoint, and
supports the use with C2S Access Portal (CAP), and X.509 client certificate

CloudPools Limitations

Listed are limitations to CloudPools.


• File recall requires administrative action
• File spillover is not supported

In a standard node pool, file pool policies can move data from high-performance
tiers to storage tiers and back as defined by their access policies. However, data
that moves to the cloud remains stored in the cloud unless an administrator
explicitly requests data recall to local storage. If a file pool policy change is made
that rearranges data on a normal node pool, data is not pulled from the cloud.
Public cloud storage often places the largest fees on data removal, thus file pool
policies avoid removal fees by placing this decision in the hands of the
administrator.

The connection between a cluster and a cloud pool has limited statistical features.
The cluster does not track the data storage that is used in the cloud, therefore file
spillover is not supported. Spillover to the cloud would present the potential for file
recall fees. As spillover is designed as a temporary safety net, once the target pool
capacity issues are resolved, data would be recalled back to the target node pool
and incur an unexpected fee.

authority. C2S also provides support (from AIMA) to securely store certificates,
validate, and refresh if needed.

138 The CloudPools C2S feature offers an integrated solution with AWS
Commercial Cloud Services (C2S), a private instantiation of the AWS commercial
cloud.

139This service is 'air gapped' which means it has no direct connection to the
Internet.


Statistical details, such as the number of stub files on a cluster or how much cache
data is stored in stub files and would be written to the cloud on a flush of that
cache, are not easily available. No historical data is tracked on the network usage
between the cluster and the cloud, either for write traffic or for read requests.
These network usage details should be viewed from the cloud service management
system.

Challenge

Open participation questions:


Question: What is restored when the recall command is used on
a CloudPool?


Configuring Data Services


File Filtering

Scenario

IT Manager: Now that you have a good understanding of storage pools,
let's investigate file filtering. There are some types of files we do not
want stored in our production directories.

Your Challenge: The IT manager wants you to explain file filtering and
configure the shares to filter unnecessary files.

File Filtering Overview

The graphic shows that .avi files are prevented from writing to
the finance access zone.


File filtering enables administrators to deny or allow file access on the cluster
based on the file extension.
• Denies writes for new files.
• Prevents accessing existing files.
• Explicit deny lists.140
• Explicit allow lists.141
• No limit to extension list.
• Per access zone.142
• Configurable for the SMB defaults143.
• No license is required.

140Explicit deny lists are used to block only the extensions in the list. OneFS
permits all other file types to be written. Administrators can create custom
extension lists based on specific needs and requirements.

141 Explicit allow lists permit access to files only with the listed file extensions.
OneFS denies writes for all other file types.

142 The top level of file filtering is set up per access zone. When you enable file
filtering in an access zone, OneFS applies file filtering rules only to files in that
access zone.

143OneFS does not take into consideration which file sharing protocol was used to
connect to the access zone when applying file filtering rules. However, you can
apply additional file filtering at the SMB share level.
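
The difference between an explicit deny list and an explicit allow list can be sketched as follows. This only models the rule described above and is not OneFS code; the function name is hypothetical, and treating extensions as case-insensitive is an assumption.

```python
import os


def write_permitted(filename: str, mode: str, extensions: set) -> bool:
    """Model a file filter: mode is 'deny' or 'allow', extensions like {'.avi'}."""
    ext = os.path.splitext(filename)[1].lower()
    listed = ext in {e.lower() for e in extensions}  # case-insensitive match is assumed
    if mode == "allow":
        return listed        # only listed extensions may be written
    return not listed        # deny list: everything else may be written


print(write_permitted("budget.avi", "deny", {".avi"}))
print(write_permitted("budget.xlsx", "allow", {".xlsx", ".docx"}))
```

With a deny list the default outcome is to permit the write, while with an allow list the default is to deny it. This is why an allow list is the stricter choice for a single-purpose access zone.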


Management of Existing Cluster Files

If enabling file filtering on an access zone with existing shares or exports, the file
extensions determine access to the files.
• User denied access.144
• Administrator access.145

144 Users cannot access any file with a denied extension. The extension can be
denied through the denied extensions list, or because the extension was not
included as part of the allowed extensions list.

145 Administrators can still access existing files. Administrators can read the files or
delete the files. Administrators with direct access to the cluster can manipulate the
files.


• No filter = access to files.146


• Applies only to supported protocols.147

File Filtering Use Cases

• Enforce organization policies148
• Meet compliance requirements149
• Limit large files stored on shares150

146How the file filtering rule is applied to the file determines where the file filtering
occurs. If a user or administrator accesses the cluster through an access zone or
SMB share without applying file filtering, files are fully available.

147 File filters are applied only when accessed over the supported protocols.

148 A use case to enforce file filtering is to adhere to organizational policies.

149 With the compliance considerations today, organizations struggle to meet many
of the requirements. For example, many organizations are required to make all
emails available for litigation purposes. To help ensure that email is not stored
longer than wanted, deny storing .pst files.

150 Another use case is to limit the cost of storage. Organizations may not want
typically large files, such as video files, to be stored on the cluster, so they can
deny the .mov or .mp4 file extensions.


• Avoid potential copyright infringement issues151
• Isolate access zone or share for only a specific file use152

File Filtering Configuration

When you enable file filtering in an access zone, OneFS applies file filtering rules
only to files in that access zone.

151 An organizational legal issue is copyright infringement. Many users store their
.mp3 files on the cluster and open a potential issue for copyright infringement.

152 Another use case is to limit an access zone for a specific application with its
unique set of file extensions. File filtering with an explicit allow list of extensions
limits the access zone or SMB share for its singular intended purpose.


Configure File Filtering

1. Select access zone

2. Enable - unchecked by default

3. Select to allow or deny

4. Add extensions - does not permit the use of wildcards or special characters,
only the (.) period.

Access zone level: Web UI: Access > File filter > File filter settings.
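
Before entering a long list of extensions, it can help to screen it for wildcards and special characters. The sketch below assumes that a valid entry is a period followed by letters or digits only; the exact rule that the WebUI enforces may differ, and the function name is hypothetical.

```python
import re

# Assumed pattern: a leading period followed by letters or digits only.
_EXTENSION = re.compile(r"\.[A-Za-z0-9]+")


def valid_extension(entry: str) -> bool:
    """Return True if entry has no wildcards or special characters."""
    return _EXTENSION.fullmatch(entry) is not None


for entry in (".avi", "*.avi", ".mp*", ".tar.gz"):
    print(entry, valid_extension(entry))
```

In this sketch a multi-part entry such as .tar.gz is rejected because the pattern allows only one period. That choice is an assumption and not documented behavior.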

You can configure file filtering at different levels.

You can configure file filters on the Protocols > Windows sharing (SMB) >
Default share settings page153.

Modify File Filtering

File filtering settings can be modified by changing the filtering method or editing file
extensions.

• Browse to Access > File Filter, and select the access zone that needs to be
modified from the Current Access Zone drop-down list.
• Clear the Enable file filters check box to disable file filtering in the access zone.

153 Configuring file filters on individual SMB shares enables more granular control.


• Select to deny or allow and then enter the extension of the file, and click submit.
• Click the Remove Filter button next to the extension to remove a file name
extension.

CLI: isi smb shares create and isi smb shares modify commands. If
using RBAC, the user must have the ISI_PRIV_FILE_FILTER privilege.

Challenge

Your Challenge: Log in to the cluster and configure file filtering on an
SMB share.


SmartQuotas

Scenario

IT Manager: One of the lessons we learned was that a small percentage
of users would consume a large portion of the storage capacity. To fix
the problem we implemented quotas. I want you to do the same on the
PowerScale system.

Your Challenge: The IT manager wants you to discuss the types of
quotas, explain quota overhead, and configure quotas on the directories.

SmartQuotas Overview Video

This video provides an overview for SmartQuotas. See the student guide for a
transcript of the video.


Link:
https://edutube.emc.com/Player.aspx?vno=tCIE1bGAUz6k3W1ic8tZfw==&autoplay=true

SmartQuotas is a software module that is used to limit, monitor, thin provision, and
report disk storage usage at the user, group, and directory levels. Administrators
commonly use file system quotas for tracking and limiting the storage capacity that
a user, group, or project can consume. SmartQuotas can send automated
notifications when storage limits are exceeded or approached.

Quotas are a useful way to ensure that a user or department uses only their share
of the available space. SmartQuotas are also useful for enforcing an internal
chargeback system. SmartQuotas contain flexible reporting options that can help
administrators analyze data usage statistics for their Isilon cluster. Both
enforcement and accounting quotas are supported, and various notification
methods are available.

Before OneFS 8.2, SmartQuotas reported the quota free space only on directory
quotas with a hard limit. For user and group quotas, SmartQuotas reported the size
of the entire cluster capacity or parent directory quota, not the size of the quota.
OneFS 8.2.0 includes enhancements to report the quota size for users and groups.
The enhancements reflect the true available capacity that is seen by the user.

SmartQuotas Implementation

You can choose to implement accounting quotas or enforcement quotas. The table
below displays the difference between the types.

Accounting Quotas               Enforcement Quotas
Monitors disk usage             Monitors and limits disk usage
Analysis and planning           Enable notification
Threshold subtypes - advisory   Threshold subtypes - hard and soft


Enforcement Quotas

Enforcement quotas support three subtypes and are based on
administrator-defined thresholds:
• Hard quotas
• Soft quotas
• Advisory quotas

Quota Types

There are six types of quotas that you can configure.

1: Directory and default directory quotas: Directory quotas are placed on a
directory, and apply to all directories and files within that directory, regardless of
user or group. Directory quotas are useful for shared folders where many users
store data, and the concern is that the directory grows unchecked.

2: User and default user quotas: User quotas are applied to individual users, and
track all data that is written to a specific directory. User quotas enable the
administrator to control the capacity any individual user consumes in a particular
directory. Default user quotas are applied to all users, unless a user has an
explicitly defined quota for that directory. Default user quotas enable the
administrator to apply a quota to all users, instead of individual user quotas.

3: Group and default group quotas: Group quotas are applied to groups and limit
the amount of data that the collective users within a group can write to a directory.
Group quotas function in the same way as user quotas, except that they apply to a
group of people instead of individual users. Default group quotas are applied to all
groups, unless a group has an explicitly defined quota for that directory. Default
group quotas operate like default user quotas, except on a group basis.

Caution: Configuring any quotas on the root of the file system
(/ifs) could result in significant performance degradation.

Default Directory Quotas

With default directory quotas, you can apply a template configuration to another
quota domain.

Directory default (template) quota created - 10 GB

Do not inherit the 10 GB directory quota

Quota domains inherited and linked directory quota - 10 GB

The graphic shows a 10-GB default directory quota.

The graphic shows an example of creating a 10 GB hard, default directory
quota on the /ifs/sales/promotions directory. The directory default quota is not in
and of itself a quota on the promotions directory. Directories below the promotions
directory, such as the /Q1 and /Q2 directories, inherit and apply the 10 GB quota.
The /Q1 domain and the /Q2 domain are independent of each other. Sub
directories such as /storage and /servers do not inherit the 10 GB directory
quota. Given this example, if the /Q2 folder reaches 10 GB, that linked quota is
independent of the 10 GB default directory quota on the parent directory.
Modifications to the default directory quota on promotions are reflected in the
inherited quotas asynchronously. Inheritance is seen when listing quotas, querying
an inheriting quota record, or when I/O happens in the sub directory tree.


Creating Default Directory Quotas

The default directory quota is created using the CLI154.

You can use the WebUI to view the created quotas and their links. See the student
guide for information about quota links.

Creating and viewing default directory quotas.

The top example shows creating a template on the Features directory. The
directory has a hard limit of 10 GB, an advisory at 6 GB, and a soft limit at 8 GB
with a grace period of 2 days.
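
The three thresholds in this example can be modeled with a short sketch. It assumes the commonly described SmartQuotas behavior: an advisory limit only notifies, a soft limit allows writes until the grace period expires, and a hard limit always denies. The class and function names are hypothetical, and usage is the total that would result from the write.

```python
from dataclasses import dataclass

GB = 1024 ** 3


@dataclass
class QuotaThresholds:
    advisory: int
    soft: int
    hard: int
    grace_days: int


def quota_state(usage: int, t: QuotaThresholds, days_over_soft: int = 0) -> str:
    """Classify the usage that would result from a write (illustrative model)."""
    if usage > t.hard:
        return "denied"
    if usage > t.soft:
        # Writes continue until the grace period runs out.
        return "denied" if days_over_soft > t.grace_days else "grace"
    if usage > t.advisory:
        return "advisory"
    return "ok"


features = QuotaThresholds(advisory=6 * GB, soft=8 * GB, hard=10 * GB, grace_days=2)
print(quota_state(7 * GB, features))
print(quota_state(9 * GB, features, days_over_soft=3))
```

With the values from the Features example, 7 GB of usage only triggers the advisory threshold, while 9 GB held for three days exceeds the two-day grace period.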

The Unlink option makes the quota independent of the parent, meaning
modifications to the default directory quota no longer apply to the sub directory.
This example shows removing the link on the Screen_shots sub directory and then
modifying the default directory quota on the parent, Quota, directory. Remove the
link using the button on the WebUI or
isi quota quotas modify --path=/ifs/training/Features/Quota/Screen_shots --type=directory --linked=false.
Using the --linked=true option re-links or links to the default directory quota.

154 The 'isi quota' command is used to create the default directory quota.


Quota Accounting

Count all snapshot data in usage limits - sum of the current directory and any
snapshots of that directory

1 KB file

Report protection overhead and metadata – 8 KB +

Report physical size without overhead – 8 KB

Report the actual size of file – 1 KB

The reporting behavior on a 1 KB file.

The quota accounting options are Include snapshots in the storage quota155 and
Enforce the limits for this quota based on:

• File system logical size156 (default)

155Tracks both the user data and any associated snapshots. A single path can
have two quotas that are applied to it, one without snapshot usage (default) and
one with snapshot usage. If snapshots are in the quota, more files are in the
calculation.

156Enforces the File system logical size quota limits. The default setting is to only
track user data, not accounting for metadata, snapshots, or protection.


• Physical size157
• Application logical size158 (OneFS 8.2 and later)

Overhead Calculations

Most quota configurations do not need to include overhead calculations.

If configuring overhead settings, do so cautiously, because they can affect the
amount of disk space that is available to users.

40 GB user quota

10 GB file

Include data protection overhead

Example: 10 GB file enforces 20 GB against quota

2x data protection

Snapshot and protection overhead typically not used

The graphic shows an example of quota enforcement. The user is restricted to 40 GB
of capacity on the /ifs/finance directory. The quota is set to use the Physical size
option. If the directory is configured with a 2x data protection level and
the user writes a 10 GB file, the file consumes 20 GB of space. The consumption is

157Tracks the user data, metadata, and any associated FEC or mirroring overhead.
This option can be changed after the quota is defined.

158 Tracks the usage on the application or user view of each file. Application logical
size is typically equal to or less than file system logical size. The view is in terms of
how much capacity is available to store logical data regardless of data reduction,
tiering technology, or sparse blocks. The option enforces quota limits, and reports
the total logical data across different tiers, such as CloudPools.


10 GB for the file and 10 GB for the data-protection overhead. The user has
reached 50% of the 40 GB quota by writing a 10 GB file to the cluster.
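
The arithmetic in this example can be written out as a short calculation. It is a sketch of the 2x protection case only; real protection overhead depends on the protection level and file size, and the function name is hypothetical.

```python
def physical_usage_gb(logical_gb: float, protection_factor: float) -> float:
    """Space charged against a Physical size quota for logical_gb of user data."""
    return logical_gb * protection_factor


quota_gb = 40
charged_gb = physical_usage_gb(10, 2.0)          # 10 GB file at 2x protection
percent_used = charged_gb / quota_gb * 100
print(charged_gb, percent_used)
```

If the same quota used the default file system logical size option, only the 10 GB of user data would count, which is 25% of the 40 GB quota.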

Quotas and Thin Provisioning

Total quota amount can exceed cluster capacity

200 TB quota

75 TB quota

150 TB capacity

Cluster can reach maximum capacity before reaching quotas

SmartQuotas supports thin provisioning, also known as over provisioning, which
enables administrators to assign quotas above the physical cluster capacity.

• Capacity reached, but quota can be under limit.159


• Adding nodes.160

159 With thin provisioning, the cluster can be full even while some users or
directories are well under their quota limit. Configuring quotas that exceed the
cluster capacity enables a smaller initial purchase of capacity/nodes.

160 Thin provisioning lets you add more nodes as needed, promoting a capacity
on-demand model.


• Management reduction.161
• Careful monitoring.162
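
A quick way to see how far a cluster is over provisioned is to compare the sum of the quotas with the cluster capacity. The sketch below uses the figures from the graphic (quotas of 200 TB and 75 TB on a 150 TB cluster); the function name and dictionary keys are hypothetical.

```python
def provisioning_summary(quotas_tb: list, capacity_tb: float) -> dict:
    """Compare the total assigned quota with the physical cluster capacity."""
    total = sum(quotas_tb)
    return {
        "total_quota_tb": total,
        "overcommitted": total > capacity_tb,
        "ratio": total / capacity_tb,
    }


summary = provisioning_summary([200, 75], 150)
print(summary)
```

A ratio above 1.0 means the cluster can fill before any quota is reached, which is why thin provisioning calls for careful capacity monitoring.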

Quota Nesting

Nesting quotas is having multiple quotas within the same directory structure.

Nesting - multiple quotas within same directory structure

Directory = 1 TB: Directory structure cannot exceed 1 TB

User quota = 25 GB: Directory can be any size up to 1 TB - each user can only
store 25 GB

Directory quota = 800 GB: Directory structure cannot exceed 800 GB

No quota: Directory structure cannot exceed 1 TB

The example shows that all quotas are hard enforced.

At the top of the hierarchy, the /ifs/sales folder has a directory quota of 1 TB. Any
user can write data into this directory, or the /ifs/sales/proposals directory, up to a
combined total of 1 TB. The /ifs/sales/promotions directory has a user quota
assigned that restricts the total amount that any single user can write into this
directory to 25 GB. Even though the parent directory (sales) is below its quota
restriction, a user is restricted within the promotions directory. The
/ifs/sales/customers directory has a directory quota of 800 GB that restricts the
capacity of this directory to 800 GB. However, if users place 500 GB of data in the

161 Setting larger quotas initially reduces administrative management as more users
access the cluster.

162 Thin provisioning requires monitoring cluster capacity usage carefully. If a quota
exceeds the cluster capacity, nothing prevents users from consuming all available
space, which results in service outages for all users and cluster services.


/ifs/sales/proposals directory, users can only place 500 GB in the other directories.
The parent directory cannot exceed 1 TB.
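
The nested limits in this example can be checked with a small model. It is a sketch only: it treats 1 TB as 1000 GB to match the 500 GB plus 500 GB arithmetic above, tracks usage in whole GB, and uses hypothetical function and argument names.

```python
SALES_LIMIT_GB = 1000       # directory quota on /ifs/sales (1 TB, taken as 1000 GB)
CUSTOMERS_LIMIT_GB = 800    # directory quota on /ifs/sales/customers
PROMO_USER_LIMIT_GB = 25    # user quota on /ifs/sales/promotions


def nested_write_allowed(directory: str, size_gb: int, dir_usage: dict,
                         user_usage_in_promotions: int = 0) -> bool:
    """Return True if a write passes every quota that applies to it."""
    if sum(dir_usage.values()) + size_gb > SALES_LIMIT_GB:
        return False                     # parent quota always applies
    if directory == "customers" and dir_usage["customers"] + size_gb > CUSTOMERS_LIMIT_GB:
        return False
    if directory == "promotions" and user_usage_in_promotions + size_gb > PROMO_USER_LIMIT_GB:
        return False
    return True


usage = {"proposals": 500, "customers": 0, "promotions": 0}
print(nested_write_allowed("customers", 500, usage))
print(nested_write_allowed("customers", 501, usage))
```

With 500 GB already in proposals, the customers directory is capped by the 1 TB parent quota before its own 800 GB quota comes into play.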

Percent-Based Advisory and Soft Limits

Create example

Modify example

View example

In OneFS 8.2.0 and later, you can view advisory and soft quota limits as a percent
of the hard quota limit.

Only advisory and soft quota limits can be defined as a percentage.

A hard limit must exist to set the advisory and soft percentage.

Administrators cannot set both an absolute and a percent-based limit on a
directory.
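
The relationship between a percent-based limit and the hard limit is simple arithmetic, sketched below. The function name is hypothetical, and rounding down to whole bytes is an assumption rather than documented OneFS behavior.

```python
GB = 1024 ** 3


def percent_limit_bytes(hard_limit: int, percent: int) -> int:
    """Convert a percent-based advisory or soft limit into bytes."""
    if hard_limit <= 0:
        raise ValueError("a hard limit must exist")
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    return hard_limit * percent // 100   # rounds down (assumed)


hard = 10 * GB
print(percent_limit_bytes(hard, 60) // GB, percent_limit_bytes(hard, 80) // GB)
```

For a 10 GB hard limit, 60% and 80% correspond to the 6 GB advisory and 8 GB soft limits used in the earlier template example.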


Quota Notifications

Rules

Email integrates with AD or LDAP email configuration

Configure multiple email recipients

Templates located in /etc/ifs

PowerScale WebUI notification configuration window.

Administrators can configure notifications to send alerts when the provisioned
storage approaches storage maximums, enabling more storage to be purchased as
needed.

Quota events can generate notifications.

Send notifications by email or through a cluster event. See the student guide for
more information.

The email option sends messages using the default cluster settings. You can send
the email to the owner of the event, or to an alternate contact, or both the owner
and an alternate. You can also use a customized email message template. Use a
distribution list to send the email to multiple users.

If using LDAP or Active Directory to authenticate users, the cluster uses the user
email setting that is stored within the directory. If no email information is stored in
the directory, or if a Local or NIS provider authenticates, you must configure a
mapping rule.


Quota Notification Template

The graphic shows one of the available quota templates that are located in the
/etc/ifs directory.

• PAPI support163.
• OneFS 8.2 enhancements164.

163 PAPI supports an email ID list in the action_email_address property:
{"action_email_address": ["user1@isilon.com","user2@isilon.com"]}

164In OneFS 8.2.0, administrators can configure quota notification for multiple
users. The maximum size of the comma-separated email ID list is 1024 characters.
The isi quota command option --action-email-address field accepts multiple
comma-separated values.
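A minimal Python sketch of preparing such a list is shown here. Only the action_email_address property name and the 1024-character limit come from the text; the helper name is hypothetical, and applying the limit to the comma-separated form before building the JSON body is an assumption. No PAPI endpoint is called.

```python
import json

MAX_EMAIL_LIST_CHARS = 1024  # stated maximum for the comma-separated list

def build_notification_payload(addresses):
    """Return a JSON body carrying the action_email_address property, after
    checking the comma-separated form against the 1024-character limit."""
    joined = ",".join(addresses)
    if len(joined) > MAX_EMAIL_LIST_CHARS:
        raise ValueError(
            f"email list is {len(joined)} characters; "
            f"maximum is {MAX_EMAIL_LIST_CHARS}"
        )
    return json.dumps({"action_email_address": list(addresses)})

print(build_notification_payload(["user1@isilon.com", "user2@isilon.com"]))
# {"action_email_address": ["user1@isilon.com", "user2@isilon.com"]}
```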


Template Variables

An email template contains variables. You can use any of the SmartQuotas
variables in your templates.

Considerations

Listed are best practices to consider when discussing SmartQuotas.


• Too many nested quotas can limit performance.
− A single directory with overlapping quotas can also degrade performance.
• Thin provisioning can exceed cluster capacity.
• Most customers do not include overhead and snapshots in quota limits.
• If quota limits include overhead and snapshots, you may need to set larger
quota limits.
− Cloned and deduplicated files are treated as ordinary files by SmartQuotas.
• Test notifications to avoid surprises (for example, an incorrectly configured mail relay).
• OneFS 8.2:
− Increased from 20,000 quota limits per cluster to 500,000 quota limits per
cluster.
− Quota notification daemon optimized to handle about 20 email alerts per
second.
− Support for the rpc.quotad service in the NFS container with some
statistics.

Best Practice:
• Do not enforce quotas on file system root (/ifs).
• Do not configure quotas on SyncIQ target directories.

Challenge

Lab Assignment: The next OneFS feature to implement is
SmartQuotas. Go to the lab and configure user, group, and directory
quotas.


SmartDedupe

Scenario

IT Manager: The cluster is hosting home directories for the users. Much
of the data is shared and has multiple copies. Deduplication should help
address the inefficient use of space.

Your Challenge: The IT manager wants you to describe the benefits of
deduplication, explain how deduplication works, and schedule
deduplication on a directory.

SmartDedupe Overview

Information technology managers are challenged with managing explosive data
growth.

Business data is often filled with significant amounts of redundant information.

SmartDedupe is an asynchronous batch job that identifies identical storage blocks
across the pool. The job is transparent to the user.

The graphic shows multiple instances of identical data reduced to a single instance of
data. OneFS deduplicates at the block level and saves one copy of deduplicated blocks.
File metadata is not deduplicated.


An example of redundant information is email attachments. When multiple employees
store the same attachments, multiple copies of the same files are saved or replicated. This action
leads to multiple copies of the same data, which take up valuable disk capacity.
Data deduplication is a specialized data reduction technique that allows for the
reduction of duplicate copies of data.

SmartDedupe Architecture

The SmartDedupe architecture consists of five principal modules: Deduplication
Control Path, Deduplication Job, Deduplication Engine, Shadow Store, and
Deduplication Infrastructure.

1: The SmartDedupe control path consists of PowerScale OneFS WebUI, CLI and
RESTful PAPI, and is responsible for managing the configuration, scheduling, and
control of the deduplication job.

2: One of the most fundamental components of SmartDedupe, and deduplication in
general, is ‘fingerprinting’. In this part of the deduplication process, unique digital
signatures, or fingerprints, are calculated using the SHA-1 hashing algorithm, one
for each 8KB data block in the sampled set.

When SmartDedupe runs for the first time, it scans the data set and selectively
samples blocks from it, creating the fingerprint index. This index contains a sorted
list of the digital fingerprints, or hashes, and their associated blocks. Then, if two
blocks with matching fingerprints are determined to be identical, the block’s pointer
is updated to the already existing data block and the new, duplicate data block is released.
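The fingerprinting step can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not the SmartDedupe engine: it fingerprints every block rather than a sample, treats 8 KB as 8192 bytes, and uses a byte-by-byte comparison as the confirmation step. All function names are illustrative.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 8 * 1024  # 8 KB blocks (assumed to mean 8192 bytes)

def fingerprint_index(data: bytes):
    """Map each SHA-1 fingerprint to the offsets of the blocks that produce it."""
    index = defaultdict(list)
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        index[hashlib.sha1(block).hexdigest()].append(offset)
    return index

def confirmed_duplicates(data: bytes):
    """Return (kept_offset, duplicate_offset) pairs whose blocks match both
    by fingerprint and by a byte-by-byte comparison."""
    pairs = []
    for offsets in fingerprint_index(data).values():
        first = offsets[0]
        for other in offsets[1:]:
            if data[first:first + BLOCK_SIZE] == data[other:other + BLOCK_SIZE]:
                pairs.append((first, other))
    return pairs

sample = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
print(confirmed_duplicates(sample))  # [(0, 16384)]
```

The first and third blocks share a fingerprint and identical contents, so the third block is the candidate to be released in favor of the first.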

3: Shadow stores are similar to regular files but are hidden from the file system
namespace, so they cannot be accessed via a path name. A shadow store typically
grows to a maximum size of 2GB, and each block can be referenced by up to 32,000 files. If the
reference count limit is reached, a new block is allocated, which may or may not be
in the same shadow store. Shadow stores do not reference other shadow
stores, and snapshots of shadow stores are not permitted because the data that is
stored in shadow stores cannot be overwritten.

4: The primary user-facing component of PowerScale SmartDedupe is the
deduplication job. This job performs a file system tree-walk of the configured
directory hierarchy, or of multiple directory hierarchies. The Job Engine performs the control,
impact management, monitoring, and reporting of the deduplication job in a similar
manner to other storage management and maintenance jobs on the cluster.

5: Architecturally, the deduplication job, and supporting dedupe infrastructure, consist
of the following four phases: Sampling, Duplicate Detection, Block Sharing, and
Index Update.

SmartDedupe Considerations

Following are areas to consider for SmartDedupe:

• SmartDedupe License165

165 SmartDedupe is included as a core component of PowerScale OneFS but
requires a valid product license key in order to activate. This license key can be
purchased through the PowerScale account team.


• Best for static files and directories166
• Post process167 - not immediate - eventual
• F810 and H5600 In-line data deduplication168
• Asynchronous169 - does not block writes.
• Per disk pool170
• File metadata is not deduplicated.171

166 Deduplication is most effective for static or archived files and directories. The
less often files are modified, the smaller the negative effect.

167 To avoid increasing write latency, deduplication is done on data-at-rest. The
data starts out at the full literal size on the drives, and might get deduplicated hours
or days later.

168 In-line data deduplication and in-line data compression are supported on the F810
and H5600 platforms in OneFS 8.2.1.

169 Deduplication does not occur across the length and breadth of the entire cluster,
but only on each disk pool individually.

170 Data that is moved between node pools may change what level of deduplication
is available. An example would be a file pool policy that moves data from a high-
performance node pool to nearline storage. The data would no longer be available
for deduplication for the other data on the high-performance node pool, but would
be newly available for deduplication on nearline storage.

171 Metadata is changed more frequently, sometimes in trivial ways, leading to poor
deduplication.


• Encrypted, compressed, and files less than 32 KB172


• Shadow store173 – 2 GB default size – up to 256,000 blocks storable
• Replication and backup behavior174
• Snapshots175

172 SmartDedupe does not deduplicate encrypted or compressed files. Also, files
that are 32 KB or smaller are not deduplicated, because doing so would consume
more cluster resources than the storage savings are worth.

173The default size of a shadow store is 2 GB, and each shadow store can contain
up to 256,000 blocks. Each block in a shadow store can be referenced up to
32,000 times.
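The shadow store figures in footnote 173 can be checked with simple arithmetic. The sketch below assumes that a block is 8 KB (8192 bytes), as in the fingerprinting description; the function names are illustrative.

```python
BLOCK_SIZE_BYTES = 8 * 1024         # 8 KB deduplication block (assumed 8192 bytes)
BLOCKS_PER_SHADOW_STORE = 256_000   # blocks per shadow store, from the footnote
MAX_REFERENCES_PER_BLOCK = 32_000   # reference limit per block, from the footnote

def shadow_store_bytes():
    """Capacity of one full shadow store in bytes."""
    return BLOCK_SIZE_BYTES * BLOCKS_PER_SHADOW_STORE

def shadow_blocks_needed(reference_count):
    """Shadow store blocks needed to hold one unique block that
    `reference_count` files share, given the per-block reference limit."""
    # Ceiling division: every 32,000 references require another block.
    return -(-reference_count // MAX_REFERENCES_PER_BLOCK)

print(shadow_store_bytes() / 1024 ** 3)  # 1.953125, roughly the 2 GB default size
print(shadow_blocks_needed(32_000))      # 1
print(shadow_blocks_needed(32_001))      # 2
```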

174 When deduplicated files are replicated to another PowerScale cluster or backed
up to a tape device, the deduplicated files no longer share blocks on the target
cluster or backup device. Although you can deduplicate data on a target
PowerScale cluster, you cannot deduplicate data on an NDMP backup device.
Shadow stores are not transferred to target clusters or backup devices. Because of
this, deduplicated files do not consume less space than non-deduplicated files
when they are replicated or backed up. To avoid running out of space, ensure that
target clusters and tape devices have enough free space to store the data at its full size.

175SmartDedupe will not deduplicate the data stored in a snapshot. However, you
can create snapshots of deduplicated data. If deduplication is enabled on a cluster
that already has a significant amount of data stored in snapshots, it will take time
before the snapshot data is affected by deduplication. Newly created snapshots will
contain deduplicated data, but older snapshots will not.


• One deduplication job runs at a time176
− Schedule deduplication to run during the cluster’s low usage hours. By
default, the SmartDedupe job runs automatically.
− After the initial dedupe job, schedule incremental dedupe jobs to run about
every two weeks, depending on the size and rate of change of the dataset.
− Run SmartDedupe with the default "low" impact Job Engine policy.
• Rehydrates files from shadow store177

SmartDedupe Function

A job in the OneFS Job Engine178 runs through blocks that are saved in every disk
pool, and compares the block hash values.179

176 Only one deduplication job can run at a time. The job uses CPU and memory
resources, so run it during non-peak or off-hour times.

177 Once a file is undeduplicated, it cannot be re-deduplicated. Before rehydrating,
ensure that sufficient cluster capacity exists to hold the undeduplicated directory.

178 The job first builds an index of blocks, against which comparisons are done in a
later phase, and ultimately confirmations and copies take place. The deduplication
job can be time consuming, but because it runs as a job that is throttled according to
system load, the impact on users is limited. Administrators find that their cluster space usage
has dropped once the job completes.

179If a match is found, and confirmed as a true copy, the block is moved to the
shadow store, and the file block references are updated in the metadata.


1: Files greater than 32 KB

2: Compare 8 KB blocks.

3: Find matching blocks.

4: Matching blocks moved to shadow store

5: Free blocks

6: Save block references in metadata.
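The six steps above can be traced in a small Python model. It is a sketch under simplifying assumptions, not how OneFS stores data: files are plain byte strings, a dictionary stands in for the shadow store, matching relies on the SHA-1 fingerprint alone, and all names are illustrative.

```python
import hashlib
from collections import Counter

BLOCK = 8 * 1024
MIN_FILE_SIZE = 32 * 1024  # files of 32 KB or smaller are not deduplicated

def split_blocks(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def dedupe(files):
    """files maps a file name to its bytes.
    Returns (references, shadow_store, freed_blocks)."""
    # Step 1: only files larger than 32 KB are considered.
    eligible = {name: split_blocks(data) for name, data in files.items()
                if len(data) > MIN_FILE_SIZE}
    # Steps 2 and 3: fingerprint the 8 KB blocks and count the matches.
    counts = Counter(hashlib.sha1(block).digest()
                     for blocks in eligible.values() for block in blocks)
    shadow_store, references, freed = {}, {}, 0
    for name, blocks in eligible.items():
        refs = []
        for block in blocks:
            key = hashlib.sha1(block).digest()
            if counts[key] > 1:
                if key in shadow_store:
                    freed += 1                 # step 5: the duplicate copy is freed
                else:
                    shadow_store[key] = block  # step 4: one copy moves to the shadow store
                refs.append(("shadow", key))   # step 6: the file references the shared block
            else:
                refs.append(("local", block))
        references[name] = refs
    return references, shadow_store, freed

common = b"x" * (5 * BLOCK)
files = {
    "a.doc": common + b"a" * BLOCK,
    "b.doc": common + b"b" * BLOCK,
    "tiny.txt": b"x" * BLOCK,  # at or under the 32 KB threshold, so it is skipped
}
refs, store, freed = dedupe(files)
print(len(store), freed)  # 1 9
```

Ten identical blocks across the two eligible files collapse to one shadow store block, freeing nine blocks, while the small file is left untouched.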

SmartDedupe Use Cases

Enterprise data typically contains substantial quantities of redundant
information.

SmartDedupe is typically used in the following ways:

Use Cases                                 Considerations

Home directories180                       Compressed versus uncompressed data

Archival files181                         Unique versus replica files

Uncompressed virtual machine images182    Rapid change versus near-static

Note: Rapid changes in the file system tend to undo deduplication,
so that the net savings achieved at any one time is low. If in doubt,
or attempting to establish the viability of deduplication, perform a
dry run.

180A home directory scenario where many users save copies of the same file can
offer excellent opportunities for deduplication.

181 Static, archival data seldom changes, so the storage that is saved may
far outweigh the load dedupe places on a cluster. Deduplication is more justifiable
when the data is relatively static.

182 Workflows that create many copies of uncompressed virtual machine images
can benefit from deduplication. Deduplication does not work well with compressed
data because the compression process tends to rearrange data to the point that identical
files in separate archives are not identified as such. In environments with many
unique files, the files do not duplicate each other, so the chances of finding
identical blocks are low.


SmartDedupe Jobs

Because the sharing phase is the slowest deduplication phase, a dry run, or
DedupeAssessment, returns an estimate of capacity savings.

Editing the Dedupe or DedupeAssessment jobs enables the administrator to
change the Default priority, Default impact policy, and Schedule.

The jobs associated with deduplication are Dedupe and DedupeAssessment. The
administrator can start the dry run and edit the job type.

1: The assessment enables a customer to decide if the savings that are offered by
deduplication are worth the effort, load, and cost.

2: Dedupe works on datasets which are configured at the directory level, targeting
all files and directories under each specified root directory. Multiple directory paths
can be specified as part of the overall deduplication job configuration and
scheduling.


SmartDedupe Administration

The WebUI SmartDedupe management is under the File system menu options.
Enter the paths for deduplication183 from the Settings tab.

From the Deduplication window, you can start a deduplication job and view any generated reports.

Challenge

Lab Assignment: Run deduplication assessment, run deduplication,
and view reports.

183 Selecting specific directories gives the administrator granular control to avoid
attempting to deduplicate data where no duplicate blocks are expected, like large
collections of compressed data. Deduplicating an entire cluster without considering
the nature of the data is likely to be inefficient.


SnapshotIQ

Scenario

IT Manager: I think we need to use snapshots to give our users the
ability to recover files.

Your Challenge: The IT manager wants you to describe snapshot
behavior, identify snapshot types, and configure and manage snapshot
functionality.

SnapshotIQ Overview

SnapshotIQ snapshots are logical pointers to data stored on a cluster at a specific
point in time.

If you modify a file and determine that the changes are unwanted, you can copy or
restore the file from the earlier file version.

You can use snapshots to stage content to export, and ensure that a consistent
point-in-time copy of the data is replicated or backed up.


The graphic represents the blocks for production data and the snapshot of that production data. The
snapshot is preserving the original blocks B and E after they have changed (B' and E').

Important: A SnapshotIQ license184 is not required for all snapshot
functions.

184 Some OneFS operations generate snapshots for internal system use without
requiring a SnapshotIQ license. If an application generates a snapshot, and a
SnapshotIQ license is not configured, the snapshot can be still accessed. However,
all snapshots that OneFS operations generate are automatically deleted when no
longer needed. You can disable or enable SnapshotIQ at any time. Note that you
can create clones on the cluster using the "cp" command, which does not require a
SnapshotIQ license.


Snapshot Operations

The graphic shows a file made up of Block A through Block D and its snapshot file. When
Block D changes to Block D', the original block is copied to the snapshot. The graphic
also compares file system usage with snapshot usage.

Snapshot create: Snapshots are created almost instantaneously regardless of the
size185 of the file or directory.

Snapshot growth: As the data is modified, only the changed data blocks are
contained186 in snapshots.

185A snapshot is not a copy of the original data, but only an extra set of pointers to
the original data. At the time it is created, a snapshot consumes a negligible
amount of storage space on the cluster. The original file references the snapshots.

186 If data is modified on the cluster (Block D’ in the graphic), only one copy of the
changed data is made. With CoW, the original block (Block D) is copied to the
snapshot. The snapshot maintains a pointer to the data that existed at the time that
the snapshot was created.

Snapshot consumption: A snapshot consumes187 only the necessary space to
restore the files contained in the snapshot.

Copy on Write and Redirect on Write

OneFS uses both Copy on Write (CoW) and Redirect on Write (RoW).

CoW is typically used for user-generated snapshots, and RoW is typically used for
system-generated snapshots.

Both methods have pros and cons, and OneFS dynamically picks the snapshot
method to use to maximize performance and keep overhead to a minimum.

The graphic compares CoW and RoW, showing the file system blocks A through D and the
snapshot for each method. It shows changes that are made to block D. With CoW, changes
incur a double write penalty, but there is less fragmentation of the HEAD file, which is
better for cache prefetch and related file reading functions.

187 Snapshots do not consume a set amount of storage space, and there is no
requirement to pre-allocate space for creating a snapshot. If the files that a
snapshot contains have not been modified, the snapshot consumes no additional
storage space on the cluster. The amount of disk space that a snapshot consumes
depends on the amount of data that is stored by the snapshot and the amount of
data the snapshot references from other snapshots.
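The copy-on-write behavior and its effect on snapshot space can be sketched with a toy Python model. The class and method names are illustrative assumptions and the model ignores RoW entirely; it only shows that a snapshot starts empty and grows by one block each time an original block is overwritten.

```python
class CowVolume:
    """Toy copy-on-write model: a snapshot records which blocks existed and
    stores only the original contents of blocks that change afterward."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # live file system: block name -> data
        self.snapshots = []

    def take_snapshot(self):
        # Creating a snapshot copies no data, only the set of block names.
        self.snapshots.append({"names": set(self.blocks), "preserved": {}})
        return len(self.snapshots) - 1

    def write(self, name, data):
        # CoW: before overwriting, copy the original block into the latest
        # snapshot, unless that snapshot has already preserved it.
        if self.snapshots and name in self.blocks:
            self.snapshots[-1]["preserved"].setdefault(name, self.blocks[name])
        self.blocks[name] = data

    def snapshot_blocks(self, snap_id):
        """Number of blocks this snapshot stores itself."""
        return len(self.snapshots[snap_id]["preserved"])

    def read_snapshot(self, snap_id, name):
        if name not in self.snapshots[snap_id]["names"]:
            raise KeyError(name)  # block did not exist at snapshot time
        # Use the original preserved by this snapshot or a later one;
        # otherwise the block is unchanged and is read from the live data.
        for snap in self.snapshots[snap_id:]:
            if name in snap["preserved"]:
                return snap["preserved"][name]
        return self.blocks[name]

vol = CowVolume({"A": "a", "B": "b", "C": "c", "D": "d"})
snap = vol.take_snapshot()
print(vol.snapshot_blocks(snap))                      # 0
vol.write("D", "d'")
print(vol.snapshot_blocks(snap))                      # 1
print(vol.read_snapshot(snap, "D"), vol.blocks["D"])  # d d'
```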


Ordered and Unordered Deletions

An ordered deletion is the deletion of the oldest snapshot of a directory. Ordered
deletion is recommended for datasets with a lower rate of change.

An unordered deletion is the deletion of a snapshot that is not the oldest snapshot
of a directory. For more active data, the configuration and monitoring overhead is
slightly higher, but fewer snapshots are retained.

Ordered - same duration period and delete oldest first

Unordered - multiple schedules, different retentions - deletions not oldest first

The benefits of unordered deletions compared with ordered deletions
depend on how often the data that the snapshots reference is modified. If the
data is modified frequently, unordered deletions save space. However, if data
remains unmodified, unordered deletions are not likely to save space, and it is
recommended that you perform ordered deletions to free cluster resources.

In the graphic, /ifs/org/dir2 has two snapshot schedules. If the retention period
on schedule 1 is longer than the retention period on schedule 2, the snapshots for
the directory are deleted out of order. Unordered deletions can take twice as long
to complete and consume more cluster resources than ordered deletions. However,
unordered deletions can save space by retaining a smaller total number of blocks
in snapshots.
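The effect of two schedules with different retention periods can be illustrated with a short Python sketch. The snapshot names, dates, and retention values are hypothetical; the function only classifies each expiration as ordered (the oldest remaining snapshot) or unordered.

```python
from datetime import datetime, timedelta

def deletion_plan(snapshots):
    """snapshots: list of (name, created, retention) tuples for one directory.
    Returns (name, expires, kind) in expiration order, where kind is
    'ordered' if the snapshot is the oldest one remaining when it expires."""
    remaining = sorted(snapshots, key=lambda s: s[1])         # oldest first
    by_expiry = sorted(snapshots, key=lambda s: s[1] + s[2])  # next to expire first
    plan = []
    for snap in by_expiry:
        kind = "ordered" if snap == remaining[0] else "unordered"
        plan.append((snap[0], snap[1] + snap[2], kind))
        remaining.remove(snap)
    return plan

start = datetime(2020, 1, 1)
snaps = [
    # Schedule 1 keeps its snapshot for 7 days; schedule 2 keeps each for 1 day.
    ("sched1-day1", start, timedelta(days=7)),
    ("sched2-day1", start + timedelta(hours=12), timedelta(days=1)),
    ("sched2-day2", start + timedelta(days=1, hours=12), timedelta(days=1)),
]
for name, expires, kind in deletion_plan(snaps):
    print(name, kind)
# sched2-day1 unordered
# sched2-day2 unordered
# sched1-day1 ordered
```

Because the schedule 1 snapshot is older but retained longer, every schedule 2 snapshot expires while an older snapshot still exists, which makes those deletions unordered.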


Creating Snapshots

You can create snapshots by configuring a snapshot schedule or manually
generating an individual snapshot.

• Creating more than one snapshot per directory is advantageous.
• Use shorter expiration periods188.
• Use the isi snapshot list | wc -l command to check the number of
snapshots.

188Use shorter expiration periods for snapshots that are generated more
frequently, and longer expiration periods for snapshots that are generated less
frequently.


Accessing Snapshot Files

OneFS tracks snapshots in the .snapshot directory. Click each tab for information
about snapshot structure and access.

Snapshot location

Snapshot files are in two places.

• Snapshots are within the path that is snapped189.
• You can also view snapshots in the .snapshot directory at the root of the /ifs190 directory.
• With SmartPools, snapshots can physically reside on a different storage tier
than the original data.

Accessing snapshots

There are two paths through which to access snapshots.

189 For example, if snapping a directory located at /ifs/edu/students/name1, view
the hidden .snapshot directory using the CLI or Windows Explorer. The path would
look like /ifs/edu/students/name1/.snapshot.

190From /ifs all the .snapshots on the system can be accessed, but users can only
open the .snapshot directories for which they already have permissions. Without
access rights users cannot open or view any .snapshot file for any directory.


• Access through the /ifs/.snapshot191 directory.


• Access the .snapshot directory in the path192 where the snapshot was taken.

Preserving Permissions

Snapshots can be taken at any point in the directory tree. Each department or user
can have their own snapshot schedule.

The snapshot preserves193 the file and directory permissions at that point in time of
the snapshot.

191This is a virtual directory where all the snaps listed for the entire cluster are
stored.

192 To view the snapshots on /ifs/eng/media, a user can change directory (cd) to
/ifs/eng/media and access the .snapshot directory there.

193The snapshot owns the changed blocks and the file system owns the new
blocks. If the permissions or owner of the current file is changed, it does not affect
the permissions or owner of the snapshot version.


The snapshot of /ifs/sales/forecast/dave can be accessed from /ifs/.snapshot or
/ifs/sales/forecast/dave/.snapshot. Permissions for ../dave are maintained, and the ability to
traverse the .snapshot directory matches those permissions.
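The two access paths can be built programmatically, as in the Python sketch below. The snapshot name dave_daily is hypothetical, and the per-snapshot subdirectory layout under each .snapshot directory is an assumption made for illustration; verify the layout on the cluster before relying on it.

```python
from pathlib import PurePosixPath

def snapshot_access_paths(snapped_dir, snapshot_name):
    """Return the two locations from which a snapshot of `snapped_dir`
    can be browsed: the cluster-wide /ifs/.snapshot directory and the
    .snapshot directory inside the snapped path itself."""
    snapped = PurePosixPath(snapped_dir)
    relative = snapped.relative_to("/ifs")  # raises ValueError outside /ifs
    via_root = PurePosixPath("/ifs/.snapshot") / snapshot_name / relative
    via_path = snapped / ".snapshot" / snapshot_name
    return str(via_root), str(via_path)

for path in snapshot_access_paths("/ifs/sales/forecast/dave", "dave_daily"):
    print(path)
# /ifs/.snapshot/dave_daily/sales/forecast/dave
# /ifs/sales/forecast/dave/.snapshot/dave_daily
```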

Restoring Snapshots

If data is accidentally erased, lost, corrupted, or compromised, clients can restore
the data from the snapshot.

Restore Theory

The graphic shows the file system blocks, Snapshot Time 1, Snapshot Time 2, the restore
target, and the client. It is a simple example of CoW.


For example, a directory has writes and snapshots at different times:

• Time 1: A,B,C,D are preserved in Snapshot Time 1.
• Time 2: A,B,C,D’ are preserved in Snapshot Time 2.

More data is written into the directory:

• Time 3: A’,B,C,D’
• Time 4: A’,B,C,D’, E

Since no snapshot is taken after Time 2, data corruption to A’ or E is not restorable
from a snapshot.

QUESTION: What happens when the user wants to recover block A data that was
overwritten in Time 3 with A’?

A backup snapshot is automatically created before copying A back to the directory.

Restore from Windows

Mapped share from the PowerScale cluster

Right-click and select Properties

List associated snapshots with modification time

Snapshot options

Clients with Windows Shadow Copy Client can restore the data from the snapshot.


Restore from NFS Host

cd to the .snapshot directory

List snaps at directory level

List point in time copies of the files in the directory

To recover a file, use the "mv" or "cp" command

Clients accessing the export over NFS can navigate using the .snapshot directory.
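For scripted restores from an NFS client, the same copy can be done in Python. This is a minimal sketch: the function name and the snapshot and file names are hypothetical, and the demonstration builds an ordinary temporary directory that imitates the layout, since a real .snapshot directory is provided by the cluster.

```python
import shutil
import tempfile
from pathlib import Path

def restore_from_snapshot(directory, snapshot_name, filename):
    """Copy one file from <directory>/.snapshot/<snapshot_name>/ back into
    <directory>, the same effect as running cp from the NFS client."""
    source = Path(directory) / ".snapshot" / snapshot_name / filename
    target = Path(directory) / filename
    shutil.copy2(source, target)  # copy2 also preserves timestamps and mode
    return target

# Demonstration against a temporary directory that imitates the layout.
with tempfile.TemporaryDirectory() as export:
    snap_dir = Path(export) / ".snapshot" / "hourly_snap"
    snap_dir.mkdir(parents=True)
    (snap_dir / "report.txt").write_text("good version")
    (Path(export) / "report.txt").write_text("corrupted")
    restored = restore_from_snapshot(export, "hourly_snap", "report.txt")
    print(restored.read_text())  # good version
```

Copying rather than moving leaves the snapshot version in place, so the restore can be repeated if needed.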

To recover a deleted file, right-click the folder that previously contained the file,
click Restore Previous Version, and select the required file to recover. To restore a
corrupted or overwritten file, right-click the file itself, instead of the folder that
contains file, and then click Restore Previous Version.

No additional storage is consumed and the restore is instant when restoring the
production file from a snap using RoW. Snapshot Time 2 has preserved A. A
backup snapshot is automatically created before copying A back to the file system.
The backup is a failback or safety mechanism should the restore from the snap be
unacceptable and the user wants to revert to A’.

SnapshotIQ Considerations

Listed are areas to consider when discussing snapshots.

• Always set expiration to prevent snaps filling cluster to capacity.


• Total cluster snap limit: 20,000 - best practice is 1000 limit per directory.
• Run concurrent schedules with different frequencies/expiration.
• SnapshotDelete job must run to completion.
• Manual snapshot deletion is not recommended – set up to expire when created.
• Deleting snapshots out of order may cause newer snapshots that depend on
the data being removed to have to copy the blocks before deletion.
• Backup, SyncIQ, Clones, File System Analytics use snapshots – no license
required.
• Use aliases - an alias name points to the most recent version of a snapshot and eases
readability for application restores.

Challenge

Lab Assignment: Create a snapshot schedule, create snapshots, and
use a snapshot to restore data.


SyncIQ

Scenario

IT Manager: One of the things I am interested in is SyncIQ. I would like
to investigate the feature and see if it can help make our
environment more efficient.

Your Challenge: The IT manager wants you to describe SyncIQ and
configure a SyncIQ policy.

SyncIQ Overview Video

SyncIQ delivers unique, highly parallel replication performance that scales with the
dataset to provide disaster recovery. The video provides an overview of SyncIQ.
See the student guide for a transcript of the video.

The SyncIQ topic covers a foundation for SyncIQ. The PowerScale
Advanced Administration course provides a more in-depth examination
of SyncIQ.


Click to launch video.

Link:
https://edutube.emc.com/Player.aspx?vno=OZC9t92nwmWVLWNjfT/+5w==&autoplay=true

Shown is a cluster with the source directory using SyncIQ to replicate data to a
remote target directory. OneFS SyncIQ uses asynchronous replication, enabling
you to maintain a consistent backup copy of your data on another Isilon cluster.
Asynchronous replication is similar to an asynchronous file write.

The target system passively acknowledges receipt of the data and returns an ACK
once the target receives the entire file or update. Then the data is passively written
to the target. SyncIQ enables you to replicate data from one PowerScale cluster to
another. Activate a SyncIQ license on both the primary and the secondary Isilon
clusters before replicating data between them. You can replicate data at the
directory level while optionally excluding specific files and sub-directories from
being replicated.

SyncIQ creates and references snapshots to replicate a consistent point-in-time
image of a SyncIQ domain. The SyncIQ domain is the root of the replication, such
as /ifs/finance. Metadata, such as ACLs and alternate data streams are replicated
along with data. SyncIQ offers automated failover and failback capabilities. If a
primary cluster becomes unavailable, failover and failback enable continued
operations on another Isilon cluster. In SyncIQ, an administrator creates and then
starts the replication policy. A policy is like an invoice list of what should get
replicated and how. A SyncIQ job does the work of replicating the data. OneFS
8.2.0 and later supports over-the-wire encryption to protect against man-in-the-
middle attacks, making data transfer between OneFS clusters secure.

SyncIQ Deployment Topology

Meeting and exceeding the data replication governance requirements of an
organization is critical for IT administration. SyncIQ exceeds these
requirements by providing an array of configuration options, ensuring
administrators have flexible options to satisfy all workflows with simplicity.

Under each deployment, the configuration could be for the entire cluster or a
specified source directory. Also, the deployment could have a single policy that is
configured between the clusters or several policies, each with different options
aligning to RPO and RTO requirements.

Click the tabs to learn more about each deployment topology.

One-to-one

In the most common deployment scenario of SyncIQ, data replication is configured
between a single source and single target cluster as illustrated in the graphic
below.

One-to-many

SyncIQ supports data replication from a single source cluster to many target
clusters, allowing the same dataset to exist in multiple locations, as illustrated in the
graphic below. A one-to-many deployment could also be referenced as a hub-and-
spoke deployment, with a central source cluster as the hub and each remote
location representing a spoke.


Many-to-one

The many-to-one deployment topology is essentially the flipped version of the one-
to-many explained in the previous section. Several source clusters replicate to a
single target cluster as illustrated in the graphic below. The many-to-one topology
may also be referred to as a hub-and-spoke configuration. However, in this case,
the target cluster is the hub, and the spokes are source clusters.

Local Target

A local target deployment allows a single Isilon cluster to replicate within itself,
providing SyncIQ's powerful configuration options in a local cluster, as illustrated
in the graphic below. If a local target deployment is used for disaster readiness or
archiving options, the cluster protection scheme and storage pools must be
considered.

Cascaded

A cascaded deployment combines the previous deployments. It allows a primary
cluster to replicate to a secondary location, then to a tertiary location, and so on, as
illustrated in the graphic below. Essentially, each cluster replicates to the next in the
chain.


SyncIQ Considerations and Limits

Considerations

Listed are areas to consider when configuring SyncIQ:


• Do not configure the /ifs directory as a SyncIQ domain.
• SyncIQ runs as jobs under its own Job Engine194.
• Can perform semi-automated195 failovers and failbacks.

Capabilities

The various capabilities of SyncIQ are:

194 The SyncIQ Job Engine is separate from the cluster maintenance activity Job
Engine in OneFS. SyncIQ runs based on SyncIQ policies that you can schedule or
run as required manually.

195 Semi-automated failovers from source to target, and semi-automated failback
from target to original source. Failover and failback only include the cluster
preparation activities and do not include DNS changes, client redirection or any
required networking changes.


• Stop a failover in progress and revert196.


• Source and target snapshots197.
• Maximum transmission units.198

196 The semi-automated failover process preserves the synchronization
relationships between the source and target clusters. SyncIQ is RBAC ready,
enabling you to configure administration roles. For organizations automating
processes, PAPI integration is available.

197 The SyncIQ process uses snapshots on both the source and target clusters.
No SnapshotIQ license is required for basic SyncIQ snapshots on either the source
or target clusters. These snapshots are only used for SyncIQ jobs. SyncIQ
snapshots are single-instance snapshots and OneFS only retains the latest or last-
known good version.

198 SyncIQ can support larger maximum transmission units or MTU over the LAN or
WAN. SyncIQ supports auto-negotiation of MTU sizes over WAN connections. The
MTU across the network is negotiated by the network.


• Import snapshots199.
• OneFS 8.2 and above provides over-the-wire encryption200 and bandwidth
reservation201 at a policy level.

199 SyncIQ has the capability to import manually taken snapshots to use as the
point-in-time reference for synchronization consistency. You can add new nodes
while a sync job runs. There is no requirement to stop the sync job before adding
new nodes. SyncIQ can also create a point-in-time report showing the SyncIQ
worker activity.

200 In-flight encryption makes data transfer between OneFS clusters secure. The
function benefits customers who are subject to regular security audits or
government regulations.

201 The SyncIQ bandwidth setting at the global level splits the bandwidth
reservation evenly among all policies. Using the CLI, you can make bandwidth
reservations for individual policies.


Limitations

The graphic shows the SyncIQ policy scheduling options.

The limitations of SyncIQ are:


• SyncIQ does not offer high availability (HA)202.
• A complete failover and failback test is discouraged203.

202 The target cluster contains a copy of the source data synchronized on a
schedule. The implementation is active on the source cluster with a read-only copy
on the secondary cluster. It is used for disaster recovery or to maintain a second
copy of the data only.

203 Performing a complete failover and failback test on a monthly or quarterly basis
is discouraged. Perform failover testing only if writes to the source are quiesced
(to prevent the data from changing) and all SyncIQ policies run successfully a final
time to ensure complete synchronization between source and target. Failing to
perform a final synchronization can lead to data loss.


• Failover not needed for data retrieval204.


• Scheduling options205.

Compatibility

The table shows the versions of OneFS you can synchronize using SyncIQ. A target
cluster running OneFS 7.1.x is no longer supported. For information about the
support and service life-cycle dates for hardware and software products, see the
Isilon Product Availability Guide.

Source cluster   Target cluster running OneFS
                 7.2.x   8.0.x   8.1.x   8.2.x
OneFS 7.1        Yes     Yes     Yes     Yes
OneFS 7.2        Yes     Yes     Yes     Yes
OneFS 8.0.x      Yes     Yes     Yes     Yes
OneFS 8.1.x      Yes     Yes     Yes     Yes
OneFS 8.2.x      Yes     Yes     Yes     Yes

204 Retrieving a copy of the data from the target cluster does not require a failover.
The target is a read-only copy of the data. Perform a copy operation to make a
copy of the read-only data on the target cluster to a location outside of the SyncIQ
domain on the target, or to a location on the source cluster, or to the client.

205 The 'Whenever the source is modified' option is not for continuous replication.
OneFS does not offer a continuous replication option. This option is for specific
workflows that have infrequent updates and require distribution of the information
as soon as possible.


CloudPools

SyncIQ can synchronize CloudPools data from the CloudPools aware source
cluster to a PowerScale target cluster.

SyncIQ provides data protection for CloudPools data and provides failover and
failback capabilities.

SyncIQ uses the CloudPools API tools to enable support.

The processes and capabilities of SyncIQ are based on the OneFS version
relationship between the source cluster and the target cluster. This relationship
determines the capabilities and behaviors available for SyncIQ policy replication.

Important: Shares, exports, cluster configuration, networking info,
metadata, licenses, etc. are not replicated. Tools such as
isi backup, applications such as Superna Eyeglass, or a PS
engagement are often required to implement a complex solution.

SyncIQ Administrative Functions

An overview of each SyncIQ function follows.

Failover

Failover is the process of changing the role of the target replication directories into
the role of the source directories for assuming client read, write, and modify data
activities.


Graphic labels: Source; Target; Change from read-only to read-write.

The example shows a failover where the client accesses data on the target cluster.

Failback

A failback206 is the process of restoring the source-to-target cluster relationship to
the original operations where client activity is again on the source cluster.

Like failover, you must select failback for each policy. You must also make the same
kind of network changes to direct clients back to the source cluster.

206 A failback can happen when the primary cluster is once again available for client
activities. This can follow any number of circumstances, for example when natural
disasters no longer impact operations, or when site communication or power has
been restored to normal. You must fail back each SyncIQ policy.


Graphic labels: Source; Target; Replicate changes back to the source; Changes to
read-write once failback completes.

The example shows a failback where the client accesses source data.

Failback Preparation

Source cluster

To initiate a failback, the Resync-prep option is used. Resync-prep creates a mirror
policy for the replication policy on the primary cluster and secondary cluster.

Resync-prep prepares the source cluster to receive the changes made to the data
on the target cluster.

The mirror policy is placed under Data Protection > SyncIQ > Local Targets on the
primary cluster. On the secondary cluster, the mirror policy is placed under Data
Protection > SyncIQ > Policies.


Failover Revert

A failover revert undoes a failover job in progress207. Use revert before writes
occur208 on the target.

Graphic labels: Source; Target; Changes not preserved; Stops failover job and
enables replication to target.

207 Failover revert stops the failover job and restores the cluster to a sync ready
state. Failover revert enables replication to the target cluster to continue once
again without performing a failback.

208 Use revert if the primary cluster once again becomes available before any writes
happen to the target. A temporary communications outage or a failover test
scenario are typical use cases for a revert.


SyncIQ Replication Policies

Graphic labels: Policy - what data, source and destination, and how often; Policy -
governs replication between source and target; SyncIQ jobs move data.

SyncIQ policies209 govern data replication.

A SyncIQ policy specifies the clusters210 that are replicating.

SyncIQ jobs do the work211.

209 You create and start replication policies on the primary cluster. A policy
specifies what data is replicated, where the data is replicated to, and how often the
data is replicated.

210 The primary cluster holds the source root directory, and the secondary cluster
holds the target directory. There are some management capabilities for the policy
on both the primary and secondary clusters, though most of the options are on the
primary.

211 SyncIQ jobs are the operations that do the work of moving the data from one
PowerScale cluster to another. SyncIQ generates these jobs according to
replication policies.


Creating the SyncIQ Policy

The panels describe the fields for creating the SyncIQ policy. Refer to the student
guide for more information.

Settings

Create a SyncIQ policy on the Data protection > SyncIQ > Policies page or by
using the isi sync policies create command.

The graphic shows the SyncIQ policy Settings fields. Callouts in the graphic:
Unique name; One time copy; File deletion protection; SyncIQ domain checked
every 10 seconds; Incrementally updates copy; Source cluster protection; Adhoc
updates; Use case - content distribution, EDA; Use case - test/dev workflows; User
generated snapshot basis for replication; Granular updates - typical use case; Use
case - one-to-many scenario.

Source Cluster - Directories

The Source root directory is the SyncIQ domain.


Callouts in the graphic: SyncIQ domain root; Do not use /ifs as the root; Replicates
only listed paths and ignores unlisted - use with caution; Replicates all paths
except those listed; Granular control over data replicated; File criteria filters;
* Indicates filters available for Copy policies.

Target Cluster

The target cluster identification is required for each policy.

Callouts in the graphic: SmartConnect IP address or FQDN; Local host for same
cluster replication; Restrict target nodes; SyncIQ domain root path; Default name
for snapshot alias is modifiable; Recommended to set an expiration.

Advanced

The final segment of the policy creation is the advanced fields.


Callouts in the graphic: Prioritize policies; Data details written to
/var/log/isi_migrate.log; Perform checksum on each file data packet; Runs
DomainMark in advance; Automatically deletes report after expiration; Applicable
for synchronization policy only; Determine if deep copy needed; Primary and
secondary must be OneFS 8.0 or higher; Force a deep copy.

Settings: In the Settings section, assign a unique name to the policy. Optionally, you
can add a description of the policy. The Enable this policy box is checked by
default. Clearing the box disables the policy and stops it from running. Next,
designate whether the policy is a Copy policy or a Synchronize policy. The
replication policy can be started using one of four run job options:
Manually, On a Schedule, Whenever the source is modified, or Whenever a
snapshot of the source directory is taken.

Source cluster directories: In the Source Cluster criteria, the Source root directory
is the SyncIQ domain. The path has the data that you want to protect by replicating
it to the target directory on the secondary cluster. Unless otherwise filtered,
everything in the directory structure from the source root directory and below
replicates to the target directory on the secondary cluster.

Includes and excludes: The Included directories field permits adding one or more
directory paths below the root to include in the replication. Once an include path is
listed, only the listed include paths replicate to the target. Without include paths,
all directories below the root are included. The Excluded directories field lists
directories below the root that you want explicitly excluded from the replication
process. You cannot fail back replication policies that specify include or exclude
settings. The DomainMark job does not work for policies with subdirectories
mentioned in Include or Exclude. Using includes or excludes for directory paths
does not affect performance.
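
The include and exclude rules above can be read as a simple path test. The following sketch illustrates that reading only. The function names are hypothetical, and the choice to let an exclude win over an include is an assumption of this example rather than documented SyncIQ behavior.

```python
from pathlib import PurePosixPath


def is_under(path, parent):
    """Return True if path equals parent or lies below it."""
    path, parent = PurePosixPath(path), PurePosixPath(parent)
    return path == parent or parent in path.parents


def is_replicated(path, root, includes=(), excludes=()):
    """Decide whether a path would replicate, per the rules described above.

    Illustrative only: excludes are checked first (an assumption), then
    includes; with no include paths, everything below the root replicates.
    """
    if not is_under(path, root):
        return False                      # outside the SyncIQ domain
    if any(is_under(path, ex) for ex in excludes):
        return False                      # explicitly excluded
    if includes:
        return any(is_under(path, inc) for inc in includes)
    return True                           # no includes listed: all paths


root = "/ifs/data"
print(is_replicated("/ifs/data/media/a.mov", root))
print(is_replicated("/ifs/data/docs/x.txt", root, includes=["/ifs/data/media"]))
```

The second call shows why include paths should be used with caution: once one include path is listed, every unlisted directory stops replicating.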

File matching criteria: The File matching criteria enables the creation of one or
more rules to filter which files do and do not get replicated. Multiple rules are
connected together with Boolean AND or OR statements. When adding a new
filter rule, click either the Add an “And” condition or Add an “Or” condition links. If
a file matches the rules, it is replicated. If the file does not match the rules, it is
not replicated.
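
As a rough model of AND/OR file matching, the sketch below treats the criteria as OR-connected groups of AND-connected rules. That grouping, the rule helpers, and the dictionary fields are all assumptions made for illustration. The actual rule types and evaluation order are defined by OneFS.

```python
import fnmatch


def name_matches(pattern):
    """Hypothetical rule: file name matches a shell-style pattern."""
    return lambda f: fnmatch.fnmatchcase(f["name"], pattern)


def size_at_least(num_bytes):
    """Hypothetical rule: file size is at least num_bytes."""
    return lambda f: f["size"] >= num_bytes


def should_replicate(file_info, or_groups):
    """Replicate if every rule in at least one AND-group matches.

    An empty criteria list means no filtering in this sketch.
    """
    if not or_groups:
        return True
    return any(all(rule(file_info) for rule in group) for group in or_groups)


# (name is *.mov AND size >= 1024) OR (name is *.txt)
criteria = [[name_matches("*.mov"), size_at_least(1024)], [name_matches("*.txt")]]
print(should_replicate({"name": "clip.mov", "size": 2048}, criteria))
```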

Target: Snapshots are used on the target directory to retain one or more consistent
recovery points for the replication data. You can specify if and how these snapshots
generate. To retain the snapshots SyncIQ takes, select Enable capture of
snapshots on the target cluster. SyncIQ always retains one snapshot of the most
recently replicated delta set on the secondary cluster to facilitate failover,
regardless of this setting. Enabling capture snapshots retains snapshots beyond
the time period that is needed for SyncIQ. The snapshots provide more recovery
points on the secondary cluster.

Advanced: The Priority field in the Advanced settings section enables policies to be
prioritized. If more than 50 concurrent SyncIQ policies are running at a time,
policies with a higher priority take precedence over normal policies. If the SyncIQ
replication is intended for failover and failback disaster recovery scenarios,
selecting Prepare policy for accelerated failback performance runs the
DomainMark job in advance. The original source SyncIQ domain requires a
DomainMark, and running a DomainMark during the failback process can take a
long time to complete. You can retain SyncIQ job reports for a specified
time. With an increased number of SyncIQ jobs in OneFS 8.0, the report retention
period could be an important consideration. To track file and directory deletions
that are performed during synchronization on the target, select Record
deletions on synchronization.

Deep copy: The Deep copy for CloudPools setting applies to policies that
have files in a CloudPools target. Deny is the default and enables only stub file
replication. The source and target clusters must be at least OneFS 8.0 to support
Deny. Allow lets the SyncIQ policy determine whether a deep copy should be
performed. Force automatically enforces a deep copy for all CloudPools data that is
contained within the SyncIQ domain. Allow or Force is required for target clusters
that are not CloudPools aware.


Copy vs Synchronize Policies

A SyncIQ policy can copy or synchronize source data to meet organizational goals.
When creating a SyncIQ policy, choose a replication type of either sync 212 or
copy213.

Copy Policy

• Goal - retain deleted data.
• Makes a one time full copy of the source directory to the target directory.
• Runs manually.
• Copy retains deleted source data on target.
• Files that are deleted from source are not deleted from target.
• Not secure file retention - SmartLock.

Synchronize Policy

• Goal - source cluster protection.
• Makes a one time full copy of the source directory to the target directory.
• Continues to make incremental copies of the changes in the source directory to
the target directory.
• Removes deleted source data on target.
• Files that are deleted from source are deleted from target.
• No file deletion protection.

212 If a mirrored copy of the source is the goal, create a sync policy.

213 If the goal is to have all source data copied and to retain copies of deleted
files, create a copy policy.
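
The difference between the two policy types comes down to how deletions on the source are handled. The sketch below models a directory as a dictionary of file names to contents. It is a simplified illustration with a hypothetical function name, not how SyncIQ moves data.

```python
def replicate(source, target, policy_type):
    """Return the target contents after one replication job (toy model).

    source and target map file names to contents; policy_type is
    "copy" or "sync".
    """
    if policy_type not in ("copy", "sync"):
        raise ValueError("policy_type must be 'copy' or 'sync'")

    result = dict(target)
    result.update(source)            # new and changed files are transferred

    if policy_type == "sync":
        # Sync mirrors the source: files deleted from the source are
        # also removed from the target.
        for name in list(result):
            if name not in source:
                del result[name]
    return result


source = {"report.doc": "v2"}                      # old.doc was deleted here
target = {"report.doc": "v1", "old.doc": "v1"}     # state after the last job

print(replicate(source, target, "copy"))   # old.doc is retained on the target
print(replicate(source, target, "sync"))   # old.doc is removed from the target
```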


Tip: You can always license SnapshotIQ on the target cluster and
retain historic SyncIQ associated snapshots to aid in file deletion
and change protection.

SyncIQ Configuration Video

The video details a basic SyncIQ use case, configuring replication between two
clusters. See the student guide for a transcript of the video.

Link:
https://edutube.emc.com/Player.aspx?vno=6cyyA4XvBqkyHJwXs6ltdg==&autoplay=true


Challenge

Lab Assignment: Configure a SyncIQ policy.


SmartLock

Scenario

IT Manager: I need a directory that has WORM protection. It does not
need to follow SEC 17a-4 rules. How can you set this up?

Your Challenge: The IT manager wants you to describe SmartLock, the
types of SmartLock operations, and configure SmartLock.

SmartLock Overview

SmartLock is a licensed software application that enables cost-effective and
efficient protection against accidental, premature, or malicious deletion or
modification of data.
• WORM


• SyncIQ integration214
• OneFS data services integration215

SmartLock Concepts

Before configuring SmartLock on a cluster, you must familiarize yourself with a few
concepts to fully understand the SmartLock requirements and capabilities.

• Retention Period
• Compliance
• WORM

SmartLock Operating Modes

There are two SmartLock operation modes available to the cluster: SmartLock
compliance mode216 and SmartLock enterprise mode217.

Before creating SmartLock directories, you must activate a SmartLock license on
the cluster.

Compliance

• Only use if SEC 17a-4 must be followed.
• Configured during initial cluster install.
• Root is disabled - must use compadmin account.
• Admin tasks using sudo command.
• References non changeable Compliance Mode clock.
• No option for privilege deletes.

Enterprise

• Does not restrict cluster to follow SEC 17a-4 rules.
• Data not modified until retention dates have passed.
• References system clock.
• *Privilege deletes can be enabled.

* If you own a file and have the ISI_PRIV_IFS_WORM_DELETE privilege or are
logged in through the root user account, you can delete the file before the retention
period passes through the privileged delete feature. The privileged delete feature is
not available for compliance directories.

214 SmartLock integrates with SyncIQ to provide failover capabilities and retention
on the SyncIQ source and target.

215 SmartLock seamlessly integrates with OneFS core capabilities and add-on
software for snapshots, replication, provisioning, backup and restore, virtual
environments and other key functions.

216 You can create compliance directories only if the cluster has been upgraded to
SmartLock compliance mode.

217 SmartLock enterprise mode is the default SmartLock operation mode.

SmartLock Directory Types

1: OneFS supports standard non-WORM directories on the same cluster with
SmartLock directories.

2: Enterprise SmartLock directories are data retention directories that do not meet
SEC regulatory compliance requirements. Enterprise directories are the most
commonly used directories in a SmartLock configuration. Enterprise SmartLock
directories enable administrators or RBAC-enabled users to delete files, which is
known as a privileged delete. You can enable or turn on, temporarily
disable or turn off, or permanently disable privileged deletes. The Enterprise
directory may be fully populated with data or empty when creating or modifying.

3: Compliance SmartLock directories are data retention directories that meet SEC
regulatory compliance requirements. Set up the cluster in Compliance mode to
support Compliance SmartLock directories.

When using SmartLock, there are two types of directories: enterprise and
compliance. A third type of directory is a standard or non-WORM218 directory.

You can upgrade219 an empty Enterprise SmartLock directory to a Compliance
SmartLock directory.

If using the compliance clock, you must copy data into the Compliance SmartLock
directory structure before committing the data to a WORM state.

SmartLock Configuration

In this use case the administrator wants to create a WORM directory where files
are locked down for a month. Once moved into the folder, the files are committed to
WORM.

Create a WORM domain from the WebUI File system > SmartLock page by selecting
Create domain, or from the CLI using the isi worm domains create command.

218 OneFS supports standard non-WORM directories on the same cluster with
SmartLock directories.

219 When you upgrade, privileged deletes are disabled permanently and cannot be
changed back.


1: Setting to "On" enables the root user to delete files that are currently committed
to a WORM state.

2: Setting the SmartLock domain.

3: The default retention period is assigned when committing a file to a WORM state
without specifying a day to release the file from the WORM state.

4: The minimum retention period ensures that files are retained in a WORM state
for at least the specified period of time. The maximum retention period ensures that
files are not retained in a WORM state for more than the specified period of time.

5: After a specified period, a file that has not been modified is committed to a
WORM state.

6: Files committed to a WORM state are not released from a WORM state until
after the specified date, regardless of the retention period.
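
The interaction of the default, minimum, and maximum retention periods and the override date can be sketched as a clamping calculation. The function and parameter names below are hypothetical, and the logic is one reading of the descriptions above, not the OneFS implementation.

```python
from datetime import datetime, timedelta


def effective_release(commit_time, requested=None, *, default,
                      minimum=None, maximum=None, override=None):
    """Estimate when a file leaves the WORM state (illustrative only).

    requested, default, minimum, and maximum are timedelta values;
    override is an optional datetime (the retention date override).
    """
    # No explicit retention given: the default retention period applies.
    retention = requested if requested is not None else default

    # Clamp to the domain's minimum and maximum retention periods.
    if minimum is not None and retention < minimum:
        retention = minimum
    if maximum is not None and retention > maximum:
        retention = maximum

    release = commit_time + retention

    # Files are not released before the override date, regardless of
    # the retention period.
    if override is not None and release < override:
        release = override
    return release


committed = datetime(2020, 1, 1)
limits = dict(default=timedelta(days=60),
              minimum=timedelta(days=30),
              maximum=timedelta(days=60))

print(effective_release(committed, **limits))                       # default applies
print(effective_release(committed, timedelta(days=10), **limits))   # raised to minimum
print(effective_release(committed, timedelta(days=90), **limits))   # capped at maximum
```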

SmartLock CLI Example

Use case:
• The administrator requires a WORM directory where files are in a WORM state
for at least 30 days and are removed from the WORM state after 60 days.
• The default retention is 60 days.
• Set minimum and maximum retention dates.

CLI:

# isi worm domains create /ifs/finance/freeze_file -d use_max -m 30D -x 60D --mkdir
o -d use_max uses the maximum retention as the default retention
o --mkdir creates the directory since it does not exist
o Duration syntax is in the format YMWDhms

Use the isi worm domains view command to verify the settings.

Committing Files to WORM

For a file to have a file retention date applied, and set to a read-only state, you
must commit the file to WORM.

Until the files are committed to WORM, files that are in a SmartLock directory act
as standard files that you can move, modify, or delete.

You can commit files manually or by using autocommit.

Manual Commit

• First set the retention date on the file, then commit the file to WORM.
• Commit files to WORM state using Windows controls or UNIX commands.
• Example: # chmod ugo-w /ifs/finance/worm/JulyPayroll.xls

Autocommit Period

• Set per SmartLock domain.
• Sets a time period from when the file was last modified on a directory.
• After the time period expires, the file is automatically committed to WORM.
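
The manual commit example removes write permission for user, group, and other. A scripted equivalent of chmod ugo-w might look like the sketch below. The function name is hypothetical. On a regular file system this only clears the write bits. The commit to a WORM state happens only inside a SmartLock directory on OneFS.

```python
import os
import stat


def remove_write_bits(path):
    """Clear the write permission for user, group, and other.

    Equivalent in effect to `chmod ugo-w <path>`.
    """
    current = stat.S_IMODE(os.stat(path).st_mode)
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    os.chmod(path, current & ~write_bits)


# Example call, using the path from the example above:
# remove_write_bits("/ifs/finance/worm/JulyPayroll.xls")
```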


SmartLock Considerations

Listed are areas to consider when discussing SmartLock.


• Retention settings apply to enterprise and compliance - explicit, default,
minimum, and maximum, retention date override.
• The system clock is the standard cluster time clock that is used for non-WORM
directories and Enterprise SmartLock directories.
• The compliance clock is used for Compliance SmartLock directories only. Set it
one time. The clock slowly drifts towards system clock (can drift up to 14 days
per year).
• Use compliance mode clusters only to meet the needs for regulatory
requirements.
• Root user is disabled on Compliance Mode cluster - use compadmin to
manage cluster.
• No auto delete of files – files past retention period must be identified.
• Limited search capability for expired files – individually test each file.
• You can use the isi worm files view command to verify the retention
status for any file.
• Do not use rm -rf. The -r option deletes all files and directories
recursively, and the -f option avoids prompting before deleting.
• In OneFS versions later than OneFS 8.0.1, SyncIQ failback is supported on
SmartLock directories.

Challenge

Lab Assignment: Configure WORM on a directory.


Monitoring Tools


PowerScale HealthCheck

Scenario

IT Manager: To understand the health of the cluster, I want you to run
periodic checks on different OneFS services. It might be a good idea to
create periodic health reports for our weekly meetings.

Your Challenge: The IT manager has tasked you to create a
HealthCheck evaluation and schedule.


HealthCheck Overview

WebUI, Cluster management > HealthCheck page.

The OneFS HealthCheck tool is a service that helps evaluate the cluster health
status and provides alerts about potential issues.

You can use HealthCheck to verify the cluster configuration and operation,
proactively manage risk, reduce support cycles and resolution times, and improve
uptime.

CLI command: isi healthcheck

CLI example to view the checklist items: isi healthcheck checklists list


Checklists and Checklist Items

For the CLI equivalent output use the "isi healthcheck checklists view cluster_capacity" command.

The graphic shows the checklist items for the cluster_capacity check. The
HealthCheck terms and their definitions are:

• Checklist - a list of one or more items to evaluate


• Checklist item - an evaluated article such as node capacity

Checklist Item Parameters

You can use the CLI to view the parameters of a checklist item. The example
shows viewing the node_capacity item parameters.


The node_capacity item is an item in the cluster_capacity check.

Running a HealthCheck

Running an evaluation on the cluster_capacity checklist.

By default, a HealthCheck evaluation runs once a day at 11:00 AM. You can run a
HealthCheck using the WebUI.

The example shows selecting the Run option for the cluster_capacity checklist. The
HealthCheck table shows the status of the checklist.

CLI example of an evaluation:


isi healthcheck evaluation run cluster_capacity


HealthCheck Schedule

WebUI HealthCheck scheduler.

You can manage the HealthCheck schedules of the checklists. By default, the
basic checklist is scheduled.

CLI example of creating a schedule called "capacity" for the cluster_capacity
checklist:

isi healthcheck schedules create capacity "Every day at 10 PM" cluster_capacity


Viewing an Evaluation

Viewing the evaluation from the WebUI HealthChecks tab. The graphic shows an
evaluation with failures.

You can view the evaluation from the HealthChecks tab or the Evaluations tab. For
a failed evaluation, the file shows the checklist items that failed.

CLI example of viewing a failed evaluation:


isi healthcheck evaluation view basic20200427T0400
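
For periodic reporting, the HealthCheck commands shown in this section could be wrapped in a small script. The sketch below only assembles the argument lists exactly as they appear in the CLI examples. The helper names are hypothetical, and the run helper assumes it is executed on a cluster node where the isi command is available.

```python
import subprocess


def healthcheck_cmd(*args):
    """Build an `isi healthcheck ...` argument list."""
    return ["isi", "healthcheck", *args]


def run_evaluation_cmd(checklist):
    # Mirrors: isi healthcheck evaluation run cluster_capacity
    return healthcheck_cmd("evaluation", "run", checklist)


def create_schedule_cmd(name, when, checklist):
    # Mirrors: isi healthcheck schedules create capacity "Every day at 10 PM" cluster_capacity
    return healthcheck_cmd("schedules", "create", name, when, checklist)


def view_evaluation_cmd(evaluation_id):
    # Mirrors: isi healthcheck evaluation view basic20200427T0400
    return healthcheck_cmd("evaluation", "view", evaluation_id)


def run(cmd):
    """Run a command and return its output (requires the isi CLI)."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


print(" ".join(run_evaluation_cmd("cluster_capacity")))
```

Passing the schedule string as a single list element avoids shell quoting issues with the spaces in "Every day at 10 PM".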


HealthCheck Resources

Link to the Info Hub

Challenge

Lab Assignment: Log in to the cluster, create a HealthCheck
schedule, and run a HealthCheck evaluation.


InsightIQ

Scenario

IT Manager: We have InsightIQ installed in the lab. I want you to
explore the application for monitoring purposes.

Your Challenge: The IT manager wants you to describe the InsightIQ
functions and configure InsightIQ.

InsightIQ Overview

Graphic labels: isi_stat_d; InsightIQ host; http; Client; FSA datastore; Datastore.

InsightIQ focuses on PowerScale data and performance. Listed are key benefits for
using InsightIQ. Refer to the student guide for more information.
• Determine whether a storage cluster is performing optimally.
• Compare changes in performance across multiple metrics, such as CPU usage,
network traffic, protocol operations, and client activity.
• Correlate critical storage cluster events with performance changes.
• Determine the effect of workflows, software, and systems on storage cluster
performance over time.


• View and compare properties of the data on the file system.


• Identify users who are using the most system resources and identify their
activity.

InsightIQ is available for no charge and provides advanced analytics to optimize
applications and correlate workflow and network events. It provides tools to monitor
and analyze cluster performance and file systems. Cluster monitoring includes
performance, capacity, activity, trending, and analysis. InsightIQ runs on separate
hardware from the clusters that it monitors, and provides a graphical output for
trend observation and analysis. It does not take cluster resources beyond the data
collection process. InsightIQ retains a configurable amount of historic information
about the statistics it collects. To prevent collection of a large backlog of data,
InsightIQ retains datasets to provide trending information over a year, but these
settings are configurable.

InsightIQ has a straightforward layout of independent components. Inside the
PowerScale cluster, isi_stat_d generates and collects monitoring and
statistical data. isi_api_d, which also handles PAPI calls, presents the data
over HTTP. The InsightIQ datastore can be local to the host or external using an
NFS mount from the PowerScale cluster, or any NFS-mounted server. The
datastore must have at least 70 GB of free disk space. File System Analytics (FSA)
data is kept in a database on the cluster. InsightIQ accesses the cluster through
PAPI rather than an NFS mount.

InsightIQ is accessed through any modern web browser. If loading InsightIQ on a
Red Hat or CentOS Linux system, Dell Technologies provides it in the form of an
RPM package.

InsightIQ Dashboard

The DASHBOARD provides an aggregated cluster overview and a
cluster-by-cluster overview.


InsightIQ dashboard showing three clusters configured. Callouts: aggregated view,
metrics, monitoring 3 clusters, cluster health, and cluster-by-cluster breakout.

You can modify the view to represent any time period where InsightIQ has
collected data. Also, breakouts and filters can be applied to the data. In the
Aggregated Cluster Overview section, you can view the status of all monitored
clusters as a whole. There is a list of all the clusters and nodes that are monitored.
Total capacity, data usage, and remaining capacity are shown. Overall health of the
clusters is displayed. There are graphical and numerical indicators for connected
clients, active clients, network throughput, file system throughput, and average
CPU usage. Depending on the chart type, preset filters enable you to view specific
data. For example, In/Out displays data by inbound traffic compared with outbound
traffic.

You can also view data by file access protocol, individual node, disk, network
interface, and individual file or directory name. If you display the data by client,
only the most active clients are represented in the displayed data. Displaying data
by event can include an individual file system event, such as read, write, or lookup.
Filtering by operation class displays data by the type of operation being performed.


Capacity Analysis

The capacity analysis pie chart shows an estimate of usable capacity that is based on
the existing ratio of user data to overhead220.

220 There is an assumption that data usage factors remain constant over more use.
If a customer uses the cluster for many small files and then wants to add some
large files, the result is not precisely what the system predicts.
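
The estimate described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not InsightIQ's actual calculation: the function name and the sample figures are hypothetical, and the only rule taken from the text is that the existing ratio of user data to overhead is assumed to stay constant.

```python
def estimate_usable_capacity(user_data_tb, overhead_tb, free_raw_tb):
    """Estimate how much more user data fits in the remaining raw space.

    Assumes the current user-data-to-overhead ratio stays constant.
    All names and units here are illustrative.
    """
    used_raw_tb = user_data_tb + overhead_tb
    if used_raw_tb <= 0:
        raise ValueError("need existing usage to derive a ratio")
    # Fraction of every raw terabyte that ends up holding user data.
    user_fraction = user_data_tb / used_raw_tb
    return free_raw_tb * user_fraction


# Hypothetical cluster: 80 TB user data, 20 TB overhead, 50 TB raw free.
print(estimate_usable_capacity(80, 20, 50))
```

If the workload later shifts, for example from many small files to a few large ones, the overhead ratio changes and the estimate drifts, which is the caveat the footnote describes.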


Default Reports

Performance reports:
- Cluster activity and capacity
- Determine cluster performance
- Investigate issues

File system reports:
- Data about files
- Identify types of data and storage location
- Uses File System Analytics

You can monitor clusters through customizable reports that display detailed cluster
data over specific periods of time.

• Performance reports
• File system reports
• Live reporting

Capacity Reporting and Forecasting


You can drill down to file system reporting to get a capacity reporting interface that
displays more detail about usage, overhead and anticipated capacity.

• Get usage profile


• Forecasting

The graphic shows the Capacity Forecast, displaying the amount of data that can be
added to the cluster before the cluster reaches capacity.

The administrator can select cluster information and use that as a typical usage
profile to estimate when the cluster reaches 90% full. The information is useful for
planning node/cluster expansion ahead of time to avoid delays around procurement
and order fulfillment.

The Plot data shows the granularity of the reporting available. The Forecast data
shows the breakout of information that is shown in the forecast chart. Depending
on the frequency and amount of variation, outliers can have a major impact on the
accuracy of the forecast usage data.
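
The forecasting idea can be sketched as a straight-line fit that is extrapolated to the 90% mark. This is a minimal sketch, not the algorithm InsightIQ uses: the function name, the one-sample-per-day input, and the least-squares fit are assumptions chosen to show why outliers matter.

```python
def days_until_threshold(daily_used_tb, total_tb, threshold=0.90):
    """Fit a least-squares line to usage samples and extrapolate it.

    daily_used_tb: one used-capacity sample per day, oldest first.
    Returns the days from the last sample until usage reaches
    threshold * total_tb, or None if usage is flat or shrinking.
    """
    n = len(daily_used_tb)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_used_tb) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_used_tb))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var  # growth in TB per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_hit = (threshold * total_tb - intercept) / slope
    return max(0.0, day_hit - (n - 1))


steady = [50, 51, 52, 53, 54]        # hypothetical samples on a 100 TB cluster
with_outlier = [50, 51, 52, 53, 74]  # same trend with one large spike
print(days_until_threshold(steady, 100))
print(days_until_threshold(with_outlier, 100))
```

With the steady samples the fit predicts 36 days of headroom. The single spike in the second series pulls the prediction down to roughly 4.8 days, which illustrates how strongly an outlier can skew a forecast.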

Create Performance Report

Create custom live performance reports by clicking Performance Reporting > Create a New
Performance Report.

There are three types of reports on the Create a New Performance Report page:

• Live performance report from a template.
• Live performance report that is based on a saved performance report.
• Live performance report that is based on one of the template reports.

Click for configuration steps221.

221In the Create a New Performance Report area, in the Performance Report
Name field, type a name for the live performance report. Select the Live
Performance Reporting checkbox. In the Select the Data You Want to See area,
specify the performance modules that you want to view in the report. You can add
a performance module or modify an existing one. Repeat this step for each
performance module that you want to include. Save the report.


File System Analytics


FSA provides detailed information about files and directories on a PowerScale
cluster.

InsightIQ collects the FSA data from the cluster for display to the administrator.


• FSA result sets location - /ifs/.ifsvar/modules/fsa.222
• Result sets are routinely deleted to save storage.
• You can set the maximum number of result sets to retain.
• FSAnalyze job runs daily.223

Enable FSA

Monitored Clusters page, Settings > Monitored Clusters.

Before you can view and analyze data usage and properties through InsightIQ, you
must enable the FSA feature.

222 Unlike InsightIQ datasets, which are stored in the InsightIQ datastore, FSA
result sets are stored on the monitored cluster in the /ifs/.ifsvar/modules/fsa
directory.

223The job collects information across the cluster, such as the number of files per
location or path, the file sizes, and the directory activity tracking.


Important: FSAnalyze runs by default in snapshot-based mode
(OneFS 8.0 and later). The snapshots can consume large amounts
of cluster capacity.

To enable FSA, open the Monitored Clusters page by clicking Settings > Monitored
Clusters. In the Actions column for the cluster on which you want to enable or disable
FSA, click Configure. The Configuration page displays. Click the Enable FSA tab.
To enable the FSA job, select Generate FSA reports on the monitored cluster. To
enable InsightIQ to display FSA reports, select View FSA reports in InsightIQ.

If there are long time periods between the FSAnalyze job runs, the snapshot can
grow very large, possibly consuming much of the cluster's space. To avoid large
snapshots, you can disable the use of snapshots for FSAnalyze. Disabling snapshot
use means that the jobs may take longer to run.

Considerations

Listed are areas to consider for InsightIQ:


• InsightIQ 4.x supports all versions of OneFS from 7.0 and later.
• By default, web browsers connect to InsightIQ over HTTPS or HTTP using port
443 for HTTPS and port 80 for HTTP.
• A revert to a snapshot or modifications of the InsightIQ datastore can cause
datastore corruption.
• The maximum number of clusters that you can simultaneously monitor is based
on the system resources available to the Linux computer or virtual machine.
• It is recommended that you monitor no more than 8 storage clusters or 150
nodes with a single instance of InsightIQ.
• In large clusters (16+ nodes) with nodes that have limited CPU such as the
A200, the CPU usage of the FSAnalyze job can get large.


Challenge

Lab Assignment: Now go to the lab and use InsightIQ to get a
performance baseline.


DataIQ v1

Scenario

IT Manager: How are monitoring and analysis of a cluster's performance
and file systems performed?

Your Challenge: The IT manager has asked you to explain DataIQ and
its available monitoring capabilities.

DataIQ Overview

The features DataIQ provides are discussed below.


1: DataIQ eliminates the problem of data silos by providing a holistic view into
heterogeneous storage platforms on-premises and in the cloud. A single pane of
glass view gives users a file-centric insight into data and enables intuitive
navigation.

2: DataIQ's optimized near real-time scanning and high-speed file indexing deliver
immediate project and user information. Powerful search capabilities across
heterogeneous storage can locate data in seconds, no matter where it resides.
High-speed search and indexing scans and organizes files in "look aside" mode.


3: DataIQ can ‘tag’ an attribute and use that tag to query millions of files across any
storage system. Tags enable business users, and IT, to view data in a true
business context. Tags give organizations the ability to see their data in the right
context, and to optimize their storage environment costs.

4: DataIQ enables data mobility with bi-directional movement between file and
object storage. The use of self-service archive capabilities to move files to the most
appropriate storage tier, such as archive or the cloud, empowers business owners.
Self-service enables content owners to move data from high-performance file
storage to an object archive.

5: With DataIQ, IT and storage admins gain understanding of their environment to
efficiently manage storage costs. They can report on the true cost of dormant and
redundant data and generate chargeback/showback views or cost recovery reports.
IT can also report on storage usage by project, and determine what files must be
cleaned up (such as duplicates or dark data).

6: DataIQ quickly scans file and object storage of all types. It can classify data
according to customer specification and provide instant rollup information, for
example, total tree size, average age of subtree data, and 'last modified' date at any
point of the folder structure. DataIQ generates fast and granular reports with
business-specific views and metrics, enabling rapid issue isolation. DataIQ integrates
with IT infrastructures to provide rights for AD and LDAP users and groups, as well as
APIs to enhance and extract business data. DataIQ plug-ins enable users to gain
additional insights. Plug-ins extend the GUI and launch internal scripts such as
Data Mover, Previewer, Audited Delete, Send to QA, and other custom scripts.

7: DataIQ monitors cluster health independent of the cluster status. DataIQ
monitors multiple clusters with massive node counts. It also configures and
receives alerts that are based on limits and issues.

DataIQ Implementation

DataIQ employs a traditional client/server model.

The DataIQ server scans the managed storage, saves the results in an index, and
provides access to the index.

Access is available from one or more GUI clients, CLI clients, and through the API
for application integration.


The graphic shows the DataIQ server with Windows, Linux, and Mac clients.

DataIQ Landing Page

After logging in to the DataIQ WebUI, the landing page is the Data Management
page.

Data Management and Settings are the key functional pages.

Return to landing page

The example shows the landing page - Data Management.

Settings - Pages


Local settings

The examples show the two themes.

The Local settings page allows you to personalize the theme of the DataIQ WebUI.

• Client maps224
• Viewable files and folders225

General management

You can configure email alerts and SRS on the General management page.

If a volume has the minimum free space threshold configured, an email is sent
when the threshold is triggered.

224 Client maps enable you to map the DataIQ path to the path that the client sees.

225 You can view or hide the hidden-type files and folders. You can also set how the
files and folders are viewed in the tables and lists.


Access and permissions


The Access and permissions page is where you can configure groups, add roles to
the groups, set authentication providers, and add users.


Data management configuration


The Data management configuration page has four panels: Volumes, S3
endpoints, Plugins, and Other settings.

The Other settings include file type class and configuration files.

Licensing


From the Licensing page, you can manage and upload licenses generated from the
Software Licensing Central online portal.


Settings - Data Management Actions

Shown is an overview of the actions a user with the role of data owner can
perform. The actions are performed from the Data management settings page.

Volumes Panel

Callouts in the Volumes panel graphic:
• Edits apply globally; settings at the volume level have precedence.
• Configure volume type, scan management, and hard link handling.
• Volumes added to the scan group adopt the scan group settings; scan group
settings have precedence over volume settings.
• Change the scan management, or delete the volume.

From the Data management configuration page, Volumes panel, you can set
volume defaults, add and edit volumes, and create scan groups.


S3 Endpoints Panel


From the Settings, Data management configuration page, S3 endpoints panel226,
you can add an Amazon S3 instance.

226DataIQ enables you to set up the endpoint as a volume for scanning. To delete
an endpoint, go to the view breakout for the endpoint.


Other settings - File Type Classes


From the Data management settings page, Other settings panel, you can configure
file type classes.227

Other settings - Configuration Files

The example shows the configuration files and a breakout of the clientmap file.

The Data management settings page has four configuration files that you can edit. The
files are listed in the Other settings panel:
• Clientmap configuration file allows you to view file paths as they are seen by
the user.
• Data management configuration file allows you to change DataIQ settings.
• Viewfilter configuration file allows you to restrict the view of folders by group.
• Autotagging configuration file allows you to set up and define tags.

227File type classes allow you to scan the volumes by a file type class. For
example, you can make a class called images and then add .jpeg, .png, and .gif
extensions to the class.


Volume Management Options


Volume defaults

• Set a capacity threshold. When triggered, flags the volume.
• Set a dollar value to the volume for reporting purposes (only $/month).
• Prevents scan from descending into folders and indexing files that match the
pattern.
• Provides more accurate reports on volumes with hardlinks.

The volume defaults are applied to new volumes and volumes without configured
settings.

The settings on volumes that are configured take precedence over the default
values.

Add Volume

Callouts in the Add new volume graphic:
• Typically unused. Most use cases use VFS.
• Uses sleep period before issuing another file system command.
• Does not need to correlate to the mountpoint.
• Scan threads used for the volume. High number can impact performance.
• Set a value to the volume for reporting purposes.
• Mounted path on the DataIQ server.
• Set a capacity threshold. When triggered, flags the volume.
• If a member of a scan group, scan configuration done at scan group level.
• Typically employed when hard links used extensively.
• Used to prevent an endless descent into a broken file system.
• Prevents scan from descending into folders and indexing files that match
the pattern.

The Add new volume window consists of three panels, the general settings, scan
configuration, and advanced settings.


Scan Groups

Field definitions are the same as those discussed in the previous window.

You can create scan groups and add volumes with the same scan, TCO, and
minimum free space trigger to the group.

Settings in the scan group take precedence over the settings on the volume.

Editing Volumes

Settings discussed in the Add Volume window.

The Edit window enables you to delete volumes.

If the volume belongs to a scan group and the scan group settings no longer apply,
you can remove the volume from the scan group and edit the volume settings.

Managing Configuration Files

The configuration files are on the Settings, Data management configuration
page. Consult the DataIQ Admin Guide for an in-depth description of the fields and
settings for each configuration file. The following pages give an overview and a use
case for each configuration.

File Type Class

Configuring file type classes enables report statistics on classes of file types
that are based on the file extensions228.

Enabling File type classes consumes additional memory and increases CPU
usage during scans.

228For example, a class that is called Video and a class that is called Image are
configured. The IT manager requests a report on the cost of video-type files and
the cost of image-type files. You can use the DataIQ Analyze feature to view the
storage consumption and cost of each class.
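
A small sketch can make the idea of reporting by file type class concrete. The class definitions, the extra video extensions, the cost rate, and the function names below are all hypothetical; the sketch only mirrors the concept of grouping files by extension and pricing each group, and it does not reproduce the DataIQ configuration format or the Analyze feature.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical classes keyed by name, each with a set of file extensions.
FILE_TYPE_CLASSES = {
    "Image": {".jpeg", ".png", ".gif"},
    "Video": {".mp4", ".mov"},
}


def usage_by_class(files, classes=FILE_TYPE_CLASSES):
    """Sum bytes per class; files is an iterable of (path, size_in_bytes)."""
    ext_to_class = {ext: name for name, exts in classes.items() for ext in exts}
    totals = defaultdict(int)
    for path, size in files:
        ext = PurePosixPath(path).suffix.lower()
        totals[ext_to_class.get(ext, "Unclassified")] += size
    return dict(totals)


def monthly_cost(total_bytes, dollars_per_tb_month):
    """Convert a byte total into a monthly cost at an assumed $/TB/month rate."""
    return total_bytes / 1e12 * dollars_per_tb_month


sample = [("/vol1/a/shot.PNG", 10), ("/vol1/a/clip.mp4", 5), ("/vol1/a/notes.txt", 1)]
print(usage_by_class(sample))
```

Extensions are compared in lowercase so that shot.PNG and shot.png land in the same class; whether DataIQ matches extensions case-insensitively is not stated in this guide.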


Clientmap Configuration

Callouts in the graphic: format, supported path formats, and example mappings.

Use the clientmap file to map virtual DataIQ paths to valid paths on a client.

Paths are converted229 from virtual path to client path and from client path to
virtual path.

By default, no client map is selected.

229Conversion from virtual paths to client paths occurs when copying paths to the
system clipboard. Conversion from client paths to DataIQ virtual paths occurs when
a client path is entered into a field such as a search field.
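
The two conversion directions can be illustrated with a prefix substitution. This is a sketch under assumptions: the dictionary-based map, the example prefixes, and the function names are hypothetical and do not reflect the actual clientmap file syntax, which is documented in the DataIQ Administration Guide.

```python
def _swap_prefix(path, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix only on a whole path component."""
    if path == old_prefix or path.startswith(old_prefix + "/"):
        return new_prefix + path[len(old_prefix):]
    return None


def to_client_path(virtual_path, client_map):
    """Virtual DataIQ path -> path as the client sees it (copy to clipboard)."""
    for virtual_prefix, client_prefix in client_map.items():
        converted = _swap_prefix(virtual_path, virtual_prefix, client_prefix)
        if converted is not None:
            return converted
    return virtual_path  # no mapping applies


def to_virtual_path(client_path, client_map):
    """Client path -> virtual DataIQ path (path typed into a search field)."""
    for virtual_prefix, client_prefix in client_map.items():
        converted = _swap_prefix(client_path, client_prefix, virtual_prefix)
        if converted is not None:
            return converted
    return client_path


# Hypothetical mapping: prefixes are written without a trailing slash.
client_map = {"/cluster1/projects": "/mnt/projects"}
print(to_client_path("/cluster1/projects/a/b.txt", client_map))
print(to_virtual_path("/mnt/projects/a/b.txt", client_map))
```

Matching on a whole path component keeps a prefix such as /cluster1/projects from accidentally rewriting /cluster1/projects2.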


Data Management

Callouts in the graphic: format, option definition, and current option setting.

The Data Management configuration file230 controls many functional aspects of
DataIQ.

Make changes to the file only when directed by Dell Support.

Administrators can also modify the file directly at
/usr/local/dataiq/etc/clarity.cfg.

230Modifying settings can impact DataIQ functionality. The defaults are typically
used. The file has a description of each setting.


Viewfilter Configuration

Callouts in the graphic:
• Use the Stanford analyzer to verify/validate REs.
• Format uses regular expressions.
• Example filters.

The Viewfilter configuration file231 enables you to create rules to restrict groups
from viewing folders.

You cannot filter the DataIQ Administrators group.

The configuration file is read:
• When DataIQ starts.
• When the Group Config or User Config dialogs are opened.
• When the folder tree is updated.

231 Viewfilter uses regular expressions (RE). If a volume or folder matches the RE
for the user's group, then that volume or folder is viewable for the user. If a user
is a member of more than one group, the user is only restricted from folders that
are restricted in all their groups.
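
The matching rule in footnote 231 can be modeled in a few lines. The rule table, group names, and paths below are hypothetical, and the sketch uses Python's re module rather than the Java regular expressions that DataIQ evaluates; for simple patterns like these the two behave the same. Treating a group without a rule as unfiltered is also an assumption.

```python
import re

# Hypothetical rules: group name -> regular expression of viewable paths.
VIEW_RULES = {
    "vfx": r"^/vol1/projects/vfx(/.*)?$",
    "audio": r"^/vol1/projects/audio(/.*)?$",
}


def can_view(path, user_groups, rules=VIEW_RULES):
    """Return True if any of the user's groups allows the path.

    A user in several groups is blocked only when every group blocks the path.
    """
    for group in user_groups:
        pattern = rules.get(group)
        if pattern is None:
            return True  # assumption: a group without a rule is unfiltered
        if re.search(pattern, path):
            return True
    return False


print(can_view("/vol1/projects/audio/mix", ["vfx"]))
print(can_view("/vol1/projects/audio/mix", ["vfx", "audio"]))
```

The first call is blocked because the only group does not match; the second is allowed because one of the two groups matches.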

Autotagging Configuration

Callouts in the graphic:
• Autotagging format uses REs.
• Scroll down for notes and example tags.

Use auto-tagging232 to tag and track items. A use case is applying a tag to project
paths for use when determining a work order for a customer.

Auto-tagging occurs when a scan is done.

Administrators can also modify the file directly at
/usr/local/dataiq/etc/autotag.cfg.

Data Management Pages


232Auto-tagging attaches business context to the data DataIQ collects.
Auto-tagging enables you to generate reports within a business context.


Browse

Callouts in the graphic:
• Configure limits and actions on selected item.
• Flags for multiple selections.
• Manage table columns.

The main functions of the Browse page are searching233, a panel that shows the
volumes in a tree view, a directory breakdown panel, a table that shows the files
within the selected folder, and an article details panel.

Flagging items in the table causes them to be reflected in the other data management
components.

233The search bar uses characters similar to Java regular expression (regex) such
as ^ for the beginning of filenames and $ for the ending of filenames.
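
The two anchors mentioned in the footnote behave as follows. The sketch uses Python's re module as a stand-in, because the guide only says the search bar uses characters similar to Java regular expressions; the file names and the helper function are made up for illustration.

```python
import re

names = ["report_final.mov", "final_report.txt", "shot_010.mov"]


def search(pattern, filenames):
    """Return the file names that the pattern matches anywhere."""
    return [name for name in filenames if re.search(pattern, name)]


# ^ anchors the match to the beginning of the file name.
starts_with_report = search(r"^report", names)
# $ anchors the match to the end of the file name.
ends_with_mov = search(r"\.mov$", names)

print(starts_with_report)
print(ends_with_mov)
```

Without the ^ anchor, the pattern report would also match final_report.txt, because an unanchored pattern can match anywhere in the name.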


Browse Details

Callouts in the graphic:
• Customize the table layout.
• Limit the number of bytes in the folder - applies to the entire directory
structure, blank is unlimited.
• Initiate scan on selected volume or path.

Shown are configurable areas of the Browse page.

DataIQ performs regular scans234 on volumes.

234However, if data changes, updated files may not appear in file searches. Go to
the Actions panel and perform a scan on a volume or path to make sure you are
getting the latest information.


Analyze

Callouts in the graphic:
• Hide or show options.
• Can analyze based on tags.
• Hide or show legend.

The Analyze page235 allows you to analyze volumes from a business context.

Flagged items

The Flagged items page lists the items the user marks as flagged.

235The page enables you to view multi-dimensional project oriented data by cost
and size.


Tag management

Business rules configuration, also called auto-tagging, is used to tag tracked items
during a scan.

The Tag management page shows the results of a scan when auto-tagging is
configured.

Jobs

The Jobs page shows a table of the jobs and their status as well as a details panel
for the selected job.


Logs - Scan

The Logs page has two tabs, the Scan logs and the Error logs. The Scan logs table
shows the generated logs from completed scan jobs.

DataIQ has two types of scans: Full236 and optimized237.

Logs - Error

The Error logs table shows the errors generated.

236A full scan is done the first time a storage file system is indexed. DataIQ walks
the entire file system, indexing every folder. This initial baseline scan ensures that
everything about the file system is known.

237An optimized scan is an incremental scan that only scans the folders where
there have been changes since the last full scan.
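
The difference between the two scan types can be illustrated with a small model. This is a simplified sketch, not DataIQ's scanner: the function name, the folder-to-modification-time dictionary, and the use of modification times to detect change are all assumptions made for illustration.

```python
def folders_to_scan(folder_mtimes, last_full_scan_time=None):
    """Pick the folders a scan would visit.

    folder_mtimes: {folder_path: last-modified time in epoch seconds}
    last_full_scan_time: epoch seconds of the last full scan, or None if
    the file system has never been indexed.
    """
    if last_full_scan_time is None:
        # Full scan: walk everything to build the baseline index.
        return sorted(folder_mtimes)
    # Optimized scan: only folders changed since the last full scan.
    return sorted(
        path for path, mtime in folder_mtimes.items() if mtime > last_full_scan_time
    )


folders = {"/vol1/a": 100, "/vol1/b": 200, "/vol1/c": 300}
print(folders_to_scan(folders))       # first scan visits every folder
print(folders_to_scan(folders, 150))  # later scan visits only changed folders
```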


Auto-Tagging Example

The installer does not create the autotagging configuration file, but you can use the
sample file /usr/local/dataiq/etc/autotag.cfg.sample as a starting
point. Auto-tagging generally occurs when DataIQ scans a file system.

1. Backup File Content


First make a copy of the existing Autotagging configuration file as a backup. The
graphic shows the location of the Autotagging configuration file on the Settings,
Data management configuration page.


2. Reference Path


Enter each path example on its own line, preceded by a comment character (#).

3. Auto-Tagging Rule


Enter the corresponding rule below each reference path. Having the commented
path makes it easier to understand the rule later and provides a reference for other
administrators.

A tag is automatically removed if the rule that created it no longer matches and
the tag has not been altered.

4. Simulate


Once the auto-tagging rules are configured, run Simulate and report, and then view
the results. The results panel lists each rule and the number of times it matched. If
the results look reasonable, Save and run the new rules.

Simulate and report also indicates rules that are invalid.
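
What a simulation run reports can be sketched as counting matches per rule and collecting rules that fail to compile. The rule representation as (regular expression, tag) pairs, the sample paths, and the function name are hypothetical; the real autotag.cfg syntax is described in the DataIQ Administration Guide, and DataIQ evaluates Java regular expressions rather than Python ones.

```python
import re


def simulate(rules, paths):
    """Count matches per rule and collect rules that are not valid regexes.

    rules: list of (regular_expression, tag) pairs (illustrative format).
    Returns (match_counts, invalid_rules).
    """
    counts, invalid = {}, []
    for pattern, _tag in rules:
        try:
            compiled = re.compile(pattern)
        except re.error as exc:
            invalid.append((pattern, str(exc)))
            continue
        counts[pattern] = sum(1 for path in paths if compiled.search(path))
    return counts, invalid


# Reference path kept as a comment next to the rule, as the steps recommend:
# /vol1/projects/alpha
rules = [
    (r"^/vol1/projects/([^/]+)$", "project"),
    (r"([unclosed", "broken"),
]
paths = ["/vol1/projects/alpha", "/vol1/projects/beta", "/vol1/scratch"]
counts, invalid = simulate(rules, paths)
print(counts)
print([pattern for pattern, _ in invalid])
```

A rule that matches zero paths or far more paths than expected is a signal to revisit the expression before saving and running the rules.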


5. Analyze


Go to the Data Management page and watch the auto-tag job details to see when
it completes. View the counts in the details window. Go to the Analyze page to
verify the generated tag sets and view the report.

Tip: Reference the DataIQ Administration Guide for in-depth
coverage of auto-tagging.


Regular Expression Resources

The graphic is an example of using the Stanford Analyzer.

• DataIQ Administration Guide


• Regular expression verification checks and troubleshooting:

• Put the RE from an existing rule or rule fragment in the Stanford Analyzer to
understand it (select Java). Modify the RE in the analyzer until it meets your
needs.
• Test in an RE tester (search for "Java regular expression tester"), and then
put into DataIQ and run in the simulator.

Plug-In Overview

Plug-ins extend DataIQ capabilities.

Plugins provide functions such as data transfer and audited delete to enable
administrators to manage data resources across storage platforms such as
PowerScale and ECS.


The plug-ins DataIQ supports are listed.
• Data Mover
• Audited Deletes
• Duplicate Finder
• Previewer

Tip: See the DataIQ Administration Guide for details on plug-in
installation and settings.

Plug-in Examples

The graphics show WebUI excerpts of the plug-ins that are installed on a DataIQ
instance.

Callouts in the graphics:
• Installed and enabled plug-ins on the Settings, Data management configuration
page.
• Previewer plug-in seen in the Details panel, Metadata tab.
• Data Mover plug-in in the navigation panel on the Data Management page.


Challenge

Lab Assignment: Go to the lab and add the PowerScale cluster to the
DataIQ application.


isi statistics

Scenario

IT Manager: You have a good understanding of HealthChecks, DataIQ,
and InsightIQ, but now I want you to know what CLI commands are
available for monitoring.

Your Challenge: The IT manager wants you to discuss the different
monitoring commands, explain the isi statistics functions, and
describe the difference between isi statistics and InsightIQ.

Statistics and Status Commands

The three main commands that enable you to view the cluster from the command
line are isi status, isi devices, and isi statistics.

isi statistics

The isi statistics command has approximately 1,500 combinations of data
you can display as statistical output of cluster operations. The statistics that are
collected are stored in an sqlite3 database that is under the /ifs folder on the
cluster.

The isi statistics command provides protocol, drive, hardware, and node
statistics238.

238Other services such as InsightIQ, the WebUI, and SNMP gather information
using the "isi statistics" command.


The output shows the operations by protocol. The example shows that NFS clients
are connected to node 6 with 278.5k bytes per second input rate.

Output for the general cluster statistics in a top-style display where data is continuously overwritten
in a single table.

isi devices

The isi devices command displays information about devices in the cluster and
changes their status. There are multiple actions available including adding drives
and nodes to the cluster. Use the isi devices command for drive states,
hardware condition, node management, and drive replacement management.

isi status

The isi status command displays information about the current status of the
cluster, alerts, and jobs. The example of the isi status output gives a general node
status, performance metrics, critical alerts, and Job Engine status.


The --quiet option omits the alerts and Job Engine status output.

Tip: See the CLI Reference guide for a complete list of the
command options and output definitions.

Basic isi statistics Functions

The isi statistics command dumps all collected stats, and you can run the
"query" subcommand on a specific statistic.

Some of the functions are listed below:

• You can build a custom isi statistics query that is not in the provided
subcommands
• Cluster and node statistics from kernel counters
• isi_stats_d
  • Most data collection
  • Works with InsightIQ

InsightIQ vs isi statistics

The table lists differences between isi statistics and InsightIQ.

In situations where InsightIQ is unavailable or malfunctioning, isi statistics is
a powerful and flexible way of gathering cluster data.

The isi statistics command within a cron job239 gathers raw statistics over a
specified time period.

InsightIQ               isi statistics

Not licensed            Not licensed
Graphical output        Produces raw CSV output on demand
Not easily scripted     Easily scripted
Not easily automated    Automate with cron
Use from remote host    Use from any node
Web user interface      Produces data similar to other UNIX utilities,
                        such as top
Fixed interval          Flexible sampling interval

239 A cron job can run on UNIX-based systems to schedule periodic jobs.
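
Gathering raw statistics on a schedule can be sketched as a small collector script that a cron job calls. The sketch is hedged throughout: the function names and the log path are hypothetical, the only command used is the isi statistics query current --keys node.uptime example that appears in this guide, and the script assumes it runs on a cluster node where the isi command and Python are available.

```python
import subprocess
import time

# Command taken from this guide; it is only meaningful on a cluster node.
UPTIME_QUERY = ["isi", "statistics", "query", "current", "--keys", "node.uptime"]


def run_command(argv):
    """Run a command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def format_sample(epoch_seconds, output):
    """Prefix raw command output with a timestamp comment line."""
    return f"# sample at epoch {int(epoch_seconds)}\n{output.rstrip()}\n"


def collect_sample(log_path, argv=UPTIME_QUERY, runner=run_command):
    """Append one timestamped sample of raw command output to log_path."""
    sample = format_sample(time.time(), runner(argv))
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(sample)
    return sample


# Example call (hypothetical path); a cron entry would run this periodically:
# collect_sample("/ifs/data/stats/node_uptime.log")
```

Passing the runner as a parameter keeps the collector testable away from a cluster, because a stand-in function can return canned output in place of the real command.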

Example: Statistics for Drive Activity

The example output shows the isi statistics drive command for the SSD
drives on node 6.

Some column definitions:


• TimeInQ: Time in queue indicates how long an operation is queued on a drive.
This metric is key for spindle-bound clusters. A time in queue value of 10 to 50
milliseconds is in the Yellow zone, and a value of 50 to 100 milliseconds is in
the Red zone.
• Queued: Queue depth indicates how many operations are queued on drives. A
queue depth of 5 to 10 is considered heavy queuing.
• Busy: Disk percent busy can be helpful to determine that the drive is 100%
busy, but it does not indicate how much extra work might be in the queue.
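
The thresholds above translate directly into a small classifier. The function names are hypothetical, and three choices are assumptions rather than statements from this guide: values under 10 milliseconds are labeled green, a value of exactly 50 milliseconds is treated as red, and values above 100 milliseconds or queue depths above 10 are treated like the worst documented band.

```python
def time_in_queue_zone(milliseconds):
    """Classify a TimeInQ value using the 10-50 ms and 50-100 ms bands."""
    if milliseconds >= 50:
        return "red"     # 50-100 ms per the guide; higher values assumed red
    if milliseconds >= 10:
        return "yellow"  # 10-50 ms per the guide
    return "green"       # label for values under 10 ms is an assumption


def is_heavy_queuing(queue_depth):
    """A queue depth of 5 to 10 is considered heavy queuing."""
    return queue_depth >= 5


for sample in (5, 25, 75):
    print(sample, time_in_queue_zone(sample))
```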

Example: Statistics by Most Active Files and Directories

The example shows isi statistics heat, which uses --long to include
more columns.

Piping the output through head -10 limits the display to the first 10 lines, which
list the most active (most accessed) files and directories.

The example node 6 output shows the Timestamp in Epoch timestamp format,
Ops as protocol operations, the Event type and Class (getattr is a namespace
read), and LIN for the file or directory associated with the event.
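
Because the Timestamp column is in Epoch format, it can help to convert values when reading the output. A minimal sketch using only the Python standard library follows; the function name is illustrative and the sample value is not taken from the course output.

```python
from datetime import datetime, timezone


def epoch_to_utc(timestamp: float) -> str:
    """Convert seconds since the Unix epoch to an ISO 8601 UTC string."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    # Illustrative value only.
    print(epoch_to_utc(1600000000))
```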


Practical Skills

Combining large sets of collected data with log analysis can help identify long-term
trends and sources of trouble.

1: Sometimes it is not possible to use InsightIQ to troubleshoot, as customers may
not allow new software and may have time or facilities constraints.

2: isi statistics can fill the gaps. Skillful use of isi statistics can
produce information equivalent to what InsightIQ offers, and the command has many
performance-related options.

3: The isi statistics and isi_stats_d commands can help isolate or
identify issues where InsightIQ may not have visibility. Using isi statistics
keys can show specific metrics. For example, isi statistics query current
--keys node.uptime displays the node uptime.

4: isi_cache_stats is used to examine the state of data that is in cache.
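
For scheduled collection, the key query mentioned in item 3 can be wrapped in a script. This is a sketch under stated assumptions: it only assembles the command line quoted in the text, the helper names are hypothetical, and run_query is meaningful only on a cluster node where the isi command exists.

```python
import shlex
import subprocess


def build_query_command(key: str) -> list:
    """Assemble the command quoted in the text for one statistics key."""
    return ["isi", "statistics", "query", "current", "--keys", key]


def run_query(key: str) -> str:
    """Run the query and return its raw output (cluster nodes only)."""
    result = subprocess.run(
        build_query_command(key), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Show the command line without executing it.
    print(shlex.join(build_query_command("node.uptime")))
```

A cron entry could call such a script at a chosen interval and append the output to a file for later analysis.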


Challenge

Lab Assignment: Now that you know which CLI commands are
available for monitoring, go to the lab and run the isi statistics
command.



Appendix


Course Materials
• Participant Guide
• Instructor laptop
• Projector and Speakers
• Internet access
• Whiteboard and markers


Course Agenda

Day     AM                                                     PM
Day 1   Course Introduction, NAS, PowerScale, and OneFS Labs   Foundations for Access Labs
Day 2   Identity Management and Authorization Labs             Client Access Labs
Day 3   Data Protection and Data Layout Labs                   Storage Pools Labs
Day 4   Data Services                                          Labs
Day 5   Monitoring Labs                                        Labs

Lunch separates the AM and PM sessions.

Depending on course pace and student knowledge, the module and lab exercise
schedule may be altered.


Introductions
• Name
• Company
• Job Role
• Experience
• Expectations


DNS Primer

When discussing Domain Name System, or DNS, on a PowerScale cluster, there
are two facets to differentiate: DNS client and DNS server.

DNS is a hierarchical distributed database. The names in a DNS hierarchy form a
tree, which is called the DNS namespace. A set of protocols specific to DNS allows
for name resolution, more specifically, Fully Qualified Domain Name (FQDN) to
IP address resolution.


1: An FQDN is the DNS name of an object in the DNS hierarchy. A DNS resolver
query must resolve an FQDN to its IP address so that a connection can be made
across the network or the Internet. If a computer cannot resolve a name or FQDN
to an IP address, the computer cannot make a connection, establish a session or
exchange information. An example of an FQDN looks like sales.isilon.xattire.com.

2: A single period (.) represents the root domain, and is the top level of the DNS
architecture.

3: Below the root domain are the top-level domains. Top-level domains represent
companies, educational facilities, nonprofits, and country codes such as *.com,
*.edu, *.org, *.us, *.uk, *.ca, and so on. A name registration authority manages the
top-level domains.

4: The secondary domain represents the unique name of the company or entity,
such as EMC, Isilon, Harvard, MIT.


5: The last record in the tree is the hosts record, which indicates an individual
computer or server.
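
The levels described in items 1 through 5 can be read off an FQDN from right to left. The following minimal sketch splits the example name into its hierarchy; the function name is illustrative.

```python
def dns_hierarchy(fqdn: str) -> list:
    """Return the labels of an FQDN from the root domain down to the host."""
    labels = fqdn.rstrip(".").split(".")
    labels.reverse()
    # "." stands for the root domain at the top of the hierarchy.
    return ["."] + labels


if __name__ == "__main__":
    print(dns_hierarchy("sales.isilon.xattire.com"))
```

For sales.isilon.xattire.com, the result lists the root, the top-level domain com, the secondary domain xattire, and then the lower-level labels isilon and sales.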


DNS Host Record: A or AAAA Record

Graphic callouts:
• NS records as a sub domain or SmartConnect zone
• Host (A) record - Easier to remember "centos" than 192.168.3.3
• Host (A) record - SmartConnect service IP

The SmartConnect service IP on a PowerScale cluster must be created in DNS as
an address (A) record, also called a host entry.

What is an A record?240

For example, a server that is named centos would have an A record that maps
the hostname centos to the IP address assigned to it: centos.dees.lab A
192.168.3.3, where centos is the hostname, dees.lab is the domain name, and
centos.dees.lab is the FQDN.

240 An A record maps the hostname to a specific IP address to which the user
would be sent for each domain or subdomain. It is simple name-to-IP resolution.


Name server (NS) records indicate which name servers are authoritative for the
zone or domain.

More about NS records.241

Tip: In an IPv6 environment, use the AAAA record in DNS, and
consult with the network administrator to ensure that you are
representing the IPv6 addresses correctly.

241 Companies that want to divide their domain into sub domains use NS records.
Sub domains indicate a delegation of a portion of the domain name to a different
group of name servers. You create NS records to point the name of this delegated
sub domain to different name servers.
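
The choice between an A record and an AAAA record follows from the address family of the IP address. The sketch below uses the standard ipaddress module to pick the record type and format a host entry in the style of the centos example; the function name and the IPv6 address are hypothetical.

```python
import ipaddress


def host_record(hostname: str, domain: str, address: str) -> str:
    """Format a host record, using A for IPv4 and AAAA for IPv6."""
    ip = ipaddress.ip_address(address)  # raises ValueError if malformed
    record_type = "A" if ip.version == 4 else "AAAA"
    return f"{hostname}.{domain} {record_type} {ip}"


if __name__ == "__main__":
    print(host_record("centos", "dees.lab", "192.168.3.3"))
    # 2001:db8::/32 is the IPv6 documentation prefix; illustrative only.
    print(host_record("centos", "dees.lab", "2001:db8::3"))
```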


DNS Delegation Best Practices

Graphic callouts:
• Use one name server record for each SmartConnect zone name or alias
• Delegate to address (A) records, not to IP addresses

You must create an address (A) record in DNS for the SmartConnect service IP.
Delegating to an A record means that if you fail over the entire cluster, you can do
so by changing one DNS A record. All other name server delegations can be left
alone. In many enterprises, it is easier to update an A record than a name server
record, because of the perceived complexity of the process.

Delegation recommendation.242

Important: PowerScale does not recommend creating a single
delegation for each cluster and then creating the SmartConnect
zones as sub records of that delegation. More243.

242 The recommendation is to create one delegation for each SmartConnect zone
name or for each SmartConnect zone alias on a cluster. This method permits
failover of only a portion of the workflow (one SmartConnect zone) without
affecting any other zones. This method is useful for scenarios such as testing
disaster recovery failover and moving workflows between data centers.

243 Using this method would enable the PowerScale administrators to change,
create, or modify the SmartConnect zones and zone names as needed without
involving a DNS team, but causes failover operations to involve the entire cluster
and affects the entire workflow, not just the affected SmartConnect zone.


SmartConnect Example - Cluster Name Resolution Process

The graphic shows how SmartConnect uses the X-Attire DNS server to provide a
layer of intelligence within the OneFS software application.


1: An NS record delegates the subdomain isilon.xattire.com to the name server
with a hostname of SIP (sip.xattire.com). The record isilon.xattire.com NS
sip.xattire.com states that clients looking to resolve isilon.xattire.com should
query the name server sip.xattire.com.

2: The A record maps the hostname sip.xattire.com to the IP address
192.168.0.100. Clients looking for isilon.xattire.com are forwarded to sip.xattire.com,
and sip.xattire.com is found at 192.168.0.100.

3: All clients are configured to make requests from the resident DNS server using a
single DNS hostname. Because all clients reference a single hostname,
isilon.xattire.com, it simplifies the management for large numbers of clients.

4: The resident DNS server forwards the delegated zone lookup request to the
delegated zone server of authority, here the SIP address of the cluster.

5: SmartConnect evaluates the environment and determines which node (single IP
address) the client should connect to, based on the configured policies.

6: SmartConnect then returns this information to the DNS server, which, in turn,
returns it to the client.

7: The client then connects to the appropriate cluster node using the desired
protocol.
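
The resolution steps above can be traced with a toy model. This is a sketch only, not how SmartConnect is implemented: the class name, the round-robin policy, and the node IP addresses are assumptions, while the record names and the service IP come from the example.

```python
from itertools import cycle

# Records held by the resident DNS server (values from the example).
NS_DELEGATIONS = {"isilon.xattire.com": "sip.xattire.com"}
A_RECORDS = {"sip.xattire.com": "192.168.0.100"}


class SmartConnectSketch:
    """Toy stand-in for the SmartConnect service; round robin only."""

    def __init__(self, node_ips):
        self._next_ip = cycle(node_ips)

    def answer(self, zone_name: str) -> str:
        # A real cluster applies the configured balancing policy here.
        return next(self._next_ip)


def resolve(name: str, services: dict) -> str:
    """Follow the NS delegation, then the A record, then ask the service."""
    ns_host = NS_DELEGATIONS[name]           # step 1: delegation
    service_ip = A_RECORDS[ns_host]          # step 2: A record for the SIP
    return services[service_ip].answer(name)  # steps 4 to 6


if __name__ == "__main__":
    # Hypothetical node addresses for illustration.
    smartconnect = SmartConnectSketch(["192.168.0.11", "192.168.0.12"])
    services = {"192.168.0.100": smartconnect}
    for _ in range(3):
        print(resolve("isilon.xattire.com", services))
```

Each lookup of the same zone name can return a different node address, which is how clients that share one hostname are spread across the cluster.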


NFS Connectivity


Remote Procedure Call (RPC)

NFS relies upon remote procedure call (RPC) for client authentication and port
mapping. RPC is the NFS method that is used for communication between a client
and server over a network. RPC is on Layer 5 of the OSI model. Because RPC
deals with the authentication functions, it serves as gatekeeper to the cluster.

NFS connectivity

The NFS connectivity procedure is as follows:

• The procedure always starts with a CALL from a client.244
• A server can reject a client CALL for one of two reasons.245
• Portmapper provides the client RPC process with service ports.246
• RPC services cannot run unless they register with portmapper.247

Let us look at the flow of a request by a client. When an RPC service starts up on
the cluster, it registers with portmapper. The service tells portmapper what port
number it is listening on, and what RPC program numbers it is prepared to serve.

244 When the server receives the CALL, it performs the service that is requested
and sends back the REPLY to the client. During a CALL and REPLY, RPC looks for
client credentials, that is, identity and permissions.

245 If the server is not running a compatible version of the RPC protocol, it sends an
RPC_MISMATCH. If the server rejects the identity of the caller, it sends an
AUTH_ERROR.

246 Portmapper acts as a gatekeeper by mapping RPC ports to IP ports on the
cluster so that the right service is offered.

247 Clients calling for an RPC service need two pieces of information: the number of
the RPC program that they want to call and the IP port number.
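
The registration and lookup flow can be modeled in a few lines. This is a simplified sketch, not the portmapper protocol itself: the class and function names are illustrative. Program number 100003 and port 2049 are the well-known NFS values, and 2 is the ONC RPC protocol version; none of these values appear in the course text.

```python
RPC_PROTOCOL_VERSION = 2  # ONC RPC version


class Portmapper:
    """Keeps a table of (program number, version) to port mappings."""

    def __init__(self):
        self._ports = {}

    def register(self, program: int, version: int, port: int) -> None:
        # A service registers at startup so that clients can find it.
        self._ports[(program, version)] = port

    def lookup(self, program: int, version: int):
        # Returns None when the service never registered.
        return self._ports.get((program, version))


def handle_call(rpc_version: int, credentials_ok: bool) -> str:
    """Apply the two rejection reasons from the text, else send a REPLY."""
    if rpc_version != RPC_PROTOCOL_VERSION:
        return "RPC_MISMATCH"
    if not credentials_ok:
        return "AUTH_ERROR"
    return "REPLY"


if __name__ == "__main__":
    portmapper = Portmapper()
    portmapper.register(100003, 3, 2049)  # NFS version 3
    print(portmapper.lookup(100003, 3), handle_call(2, True))
```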


HDFS Topic
• Data Lakes and Analytics
• HDFS Overview Video
• OneFS with Hadoop
• OneFS vs. Hadoop
• HDFS Administration
• Best Practices Resources
• Troubleshooting Resources


Swift Topic
• File and Object Storage Differences
• Accounts, Containers, and Objects
• Configuring Isilon Swift Accounts
• Storage URL
• Isilon Swift Considerations and Limitations


Journal Behavior for Node Pairs

When a node boots, it first checks its own vault resources before querying its
paired node. This way if the node can recover its journal from its own resources,
there is no need to query the paired node. But, if the journal is bad, the node can
identify the journal condition from its node state block data, and recovery should be
possible. There is a consequence to the nodes running in pairs. If a node runs
unpaired, it is under-protected.

Graphic: four nodes, each with a battery, a journal, and a mirror of the paired
node's journal.


Concurrency Examples
The process of striping spreads all write operations from a client248 across the
nodes of a cluster. Each tab illustrates a file that is broken down into chunks, after
which it is striped across disks249 in the cluster along with the FEC.

Concurrency 256 KB File

The graphic illustrates concurrency with a 256 KB file.

Graphic: Concurrency with N+1n protection. The 256 KB file is divided into two
128 KB chunks, with one 128 KB FEC unit.

248 A client is connected to only one node at a time. However, when that client
requests a file from the cluster, the client-connected node does not have the entire
file locally on its drives. The client-connected node retrieves and rebuilds the file
using the back-end network.

249 Even though a client is connected to only one node, when that client saves data
to the cluster, the write operation occurs in multiple nodes. The scheme is true for
read operations also.


Concurrency 128 KB File

All files 128 KB or less are mirrored. For a protection strategy of N+1 the 128 KB
file has 2 instances, the original data and one mirrored copy.

Graphic: Concurrency with N+1n protection. The 128 KB file is stored with one
128 KB FEC unit. Any file ≤ 128 KB is still FEC calculated, but the result is a copy.

Concurrency 192 KB File

The example shows a file that is not evenly distributed in 128 KB chunks. Blocks in
the chunk that are not used are free for use in the next stripe unit. Unused blocks in
a chunk are not wasted.

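Graphic: Concurrency with N+1n protection of a 192 KB file on an F200 three-node
cluster. The first 128 KB chunk is full. The second chunk has 64 KB used and
64 KB unused. The portion that is less than 128 KB uses mirrored protection.
Blocks that are not used are used in the next stripe unit, not wasted. Legend: FEC
blocks, mirror blocks.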
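
The chunk arithmetic in these examples can be checked with a short calculation. The sketch below only divides a file size into 128 KB stripe units; it does not model FEC placement or drive layout, and the function name is illustrative.

```python
STRIPE_UNIT_KB = 128


def stripe_units(file_kb: int) -> list:
    """Return the data carried by each 128 KB stripe unit, in KB."""
    full_units, remainder_kb = divmod(file_kb, STRIPE_UNIT_KB)
    units = [STRIPE_UNIT_KB] * full_units
    if remainder_kb:
        units.append(remainder_kb)  # last unit is only partly used
    return units


if __name__ == "__main__":
    for size_kb in (128, 192, 256, 1024):
        print(size_kb, stripe_units(size_kb))
```

A 192 KB file yields one full unit and one unit with 64 KB used, and a 1 MB file yields eight full units, which matches the examples.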

Concurrency 1 MB with +2d:1n

The example shows +2d:1n protection of a 1 MB file. The file is divided into eight
data stripe units and three FEC units. The data is laid out in two stripes over two
drives per node to achieve the protection.

Graphic: Concurrency with N+2d:1n protection. The 1 MB file is divided into
8 x 128 KB chunks. Stripe depth is doubled. Blocks within the same stripe (stripe 1)
are written to separate drives on each node.


Data Lakes and Analytics

A Data Lake is a central data repository that enables organizations to access and
manipulate the data using various clients and protocols. The flexibility keeps IT
from managing and maintaining a separate storage solution (silo) for each type of
data such as SMB, NFS, Hadoop, SQL, and others.

The inclusion of platform-as-a-service, or PaaS, makes building 3rd platform
applications simple and efficient.


1: A Data Lake-based ingest captures a wider range of data types than was
possible in the past. Data is stored in raw, unprocessed forms to ensure that no
information is lost. Massively parallel processing and in-memory technologies
enable data transformation in real time as data is analyzed. Because the Data Lake
has a single, shared repository, more tools can be made available on demand,
enabling data scientists and analysts to find insights. The Data Lake makes it
simple to surface the insights in a consistent way to executives and managers so
that decisions are made quickly.

2: Utilizing Isilon to hold the Hadoop data gives you all of the protection benefits of
the OneFS operating system. You can select any of the data protection levels that
OneFS offers, giving you both disk and node fault tolerance.


Resource: For more information, go to the PowerScale and Isilon
technical documents and videos page.


HDFS Overview Video


The video provides an overview of a typical Hadoop topology and how the
PowerScale fits into a Hadoop solution. See the student guide for a transcript of the
video.

URL:
https://edutube.emc.com/Player.aspx?vno=wZCty171ec2RjiMSRZZe9g==&autoplay=true

Shown is an Isilon cluster with twelve nodes. A key benefit of CloudPools is the
ability to interact with multiple cloud vendors. Shown in the graphic are the
platforms and vendors that are supported as of OneFS 8.1.1.

CloudPools is an extension of the SmartPools tiering capabilities in the OneFS
operating system. The policy engine seamlessly optimizes data placement in a way
that is transparent to users and applications. Moving the cold archival data to the
cloud lowers storage cost and optimizes storage resources.


Let us look at an example. Each chassis in the cluster represents a tier of storage.
The topmost chassis is targeted for the production high-performance workflow and
may have nodes such as F800s. When data is no longer in high demand,
SmartPools moves the data to the second tier of storage. The example shows that
the policy moves data that is not accessed and that is over thirty days old. Data on
the middle tier may be accessed periodically. When files are no longer accessed for
more than 90 days, SmartPools archives the files to the lowest chassis or tier, such
as A200 nodes.

The next policy moves the archive data off the cluster and into the cloud when data
is not accessed for more than 180 days. Stub files, which are also called SmartLinks,
are created. Stub files consume approximately 8 KB of space on the Isilon cluster.
Files that are accessed or retrieved from the cloud, or files that are not fully moved
to the cloud, have parts that are cached on the cluster and are part of the stub file.
The storing of CloudPools data and user access to data that is stored in the cloud
are transparent to users.
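
The age thresholds in this example (30, 90, and 180 days) amount to a simple decision rule. The following is a minimal sketch of that rule only; it is not the file pool policy engine, and the function name and tier labels are made up for illustration.

```python
def target_tier(days_since_access: int) -> str:
    """Pick a storage location from the age thresholds in the example."""
    if days_since_access > 180:
        return "cloud"             # archived off the cluster; a stub remains
    if days_since_access > 90:
        return "archive tier"      # lowest chassis, such as A200 nodes
    if days_since_access > 30:
        return "middle tier"       # second tier of storage
    return "performance tier"      # topmost chassis, such as F800 nodes


if __name__ == "__main__":
    for age_days in (7, 45, 120, 365):
        print(age_days, target_tier(age_days))
```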

CloudPools files undergo a compression algorithm and then are broken into 2 MB
cloud data objects, or CDOs, for storage. The CDOs conserve space on the
cloud storage resources. Internal performance testing does note a performance
penalty for applying compression and for decompressing files on read. Encryption is
applied to file data that is transmitted to the cloud service. Each 128 KB file block is
encrypted using AES-256 encryption and then transmitted as an object to the cloud.
Internal performance testing notes a small performance penalty for encrypting the
data stream.


OneFS with Hadoop


Graphic: access protocols, the NameNode and DataNodes, and Hadoop compute
nodes (Ambari, Cloudera, GreenPlum, Hortonworks, IBM, Hadoop, Pivotal).
MapReduce computation stays on Hadoop. Data from all sources resides on the
cluster.

To recap the overview, all production data resides on PowerScale. This removes
the task of exporting it from your production applications and importing it, as a
traditional Hadoop environment requires. MapReduce continues to run on dedicated
Hadoop compute nodes. PowerScale requires this Hadoop front end to do the data
analysis. PowerScale holds the data so that Hadoop, applications, or clients can
manipulate it.

Resource: For supported platforms, see the Hadoop Distributions
and Products Supported by OneFS web page.


OneFS vs. Hadoop


The table showcases the benefits of OneFS compared with Hadoop. For details,
see the footnotes on the marked functions.

Function                       Hadoop                                       OneFS
Data protection                3x mirror, no replication                    snapshots, clones, SyncIQ
Data migration250              Needs landing zone.                          Data on cluster
Security251                    Kerberos authentication unsupported          AD, LDAP, and Kerberos
Deduplication                  3x mirror = 33% efficiency                   80% storage efficiency
Compliance and security        No native encryption                         SEDs, ACLs, POSIX, access zones, RBAC, SEC compliant
Multi distribution support252  1 physical HDFS = 1 distribution of Hadoop   Co-mingle physical and virtual versions.
Scaling253                     Compute and storage that is paired           Scales compute or storage as needed.

250 Hadoop requires a landing zone to stage data before using tools to ingest data
to the Hadoop cluster. PowerScale enables cluster data analysis by Hadoop.
Consider the time that it takes to push 100 TB across the WAN and wait for it to
migrate before any analysis can start. PowerScale does in-place analytics, so no
data moves around the network.

251 Hadoop assumes that all members of the domain are trusted. PowerScale
supports integrating with AD or LDAP, and gives you the ability to safely segment
access.

252 Each physical HDFS cluster can only support one distribution of Hadoop.
PowerScale can co-mingle physical and virtual versions of any Apache standards-
based distributions.

253 Hadoop pairs the storage with the compute, so adding more space may require
you to pay for more CPU that may go unused. If you need more compute, you end
up with a lot of overhead space. With PowerScale you scale compute as needed or
storage as needed, aligning your costs with your requirements.
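
The efficiency figures in the table translate directly into raw capacity. The sketch below works through the arithmetic for an illustrative 100 TB of data; the function names are hypothetical, and the 80% figure is the value quoted in the table, not a guarantee for any particular cluster.

```python
def raw_for_mirroring(usable_tb: float, copies: int) -> float:
    """Raw capacity needed when every block is stored `copies` times."""
    return usable_tb * copies


def raw_for_efficiency(usable_tb: float, efficiency: float) -> float:
    """Raw capacity needed at a given storage efficiency (0 to 1)."""
    return usable_tb / efficiency


if __name__ == "__main__":
    # 3x mirroring stores each block three times, so 1/3 of raw is usable.
    print(raw_for_mirroring(100, 3))
    # At 80% efficiency, less raw capacity holds the same data.
    print(round(raw_for_efficiency(100, 0.80), 1))
```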


HDFS Administration

The graphic shows the WebUI Protocols, Hadoop (HDFS), Settings page, and
the corresponding isi hdfs settings command output.


1: The Default block size determines how the HDFS service returns data upon
read requests from Hadoop compute clients. The server-side block size determines
how the OneFS HDFS daemon returns data to read requests. Leave the default
block size at 128 MB. If the customer runs an older version of HDFS, consider a 64
MB block size. If the block size is set too high, many read/write errors and
performance problems occur. Tune on setup.

2: The Default checksum type is used for old HDFS workflows. Because OneFS uses
forward error correction, checksums for every transaction are not used, as they can
cause a performance issue.

3: The HDFS Authentication type is on a per-access zone basis. The
authentication method can be Simple, Kerberos, or both.

4: The Ambari client/server framework is a third-party tool that enables you to
configure, manage, and monitor a Hadoop cluster through a browser-based
interface.


5: Odp version - on updates, the Hortonworks version must match the version that
is seen in Ambari. A version conflict is common when a customer upgrades
Hortonworks, and it can cause jobs not to run. Installation also fails when the Odp
version does not match.

6: Proxy users for secure impersonation can be created on the Proxy users tab.
For example, create an Apache Oozie proxy user to securely impersonate a user
called HadoopAdmin. Enable the Oozie user to request that the HadoopAdmin user
perform Hadoop jobs. Apache Oozie is an application that can automatically
schedule, manage, and run Hadoop jobs.

7: On the Virtual racks tab, nodes can be preferred along with an associated
group of Hadoop compute clients to optimize access to HDFS data.

Resource: An HDFS implementation is more involved than
discussed in this topic. See the HDFS Reference Guide for
complete configuration details.


Best Practices Resources

• Visit the Using Hadoop with Isilon - Isilon Info Hub web page for documentation.
• Use the Isilon Hadoop tools to create users and groups in the local provider.


Troubleshooting Resources
There are several guides that are dedicated to troubleshooting an HDFS solution.

Resource: Using Hadoop with OneFS Info Hub.


File and Object Storage Differences


File storage deals with a specific set of users who require shared access to a
specific set of files. Shared access led to file access permissions and locking
mechanisms, enabling users to share and modify files without affecting each
other’s changes. A file system stores data in a hierarchy of directories,
subdirectories, folders, and files. The file system manages the location of the data
within the hierarchy. If you want to access a specific file, you need to know where
to look for the file. Queries to a file system are limited. You can search for a specific
file type such as *.doc, or file names such as serverfile12*.*, but you cannot parse
through the files to find the content contained within them. Determining the context
of a file is also difficult. For example, should you store the file in an archival tier or
will you access the information regularly? It is difficult to determine the content of
the data from the limited metadata provided. A document might contain the minutes
of a weekly team meeting, or contain confidential personal performance evaluation
data.

Object storage combines the data with richly populated metadata to enable
searching for information by file content. Instead of a file that tells you the create or
modified date, file type, and owner, you can have metadata that tells you the
project name, formula results, personnel assigned, location of test and next run
date. The rich metadata of an object store enables applications to run analytics
against the data.

Object storage has a flat hierarchy and stores its data within containers as
individual objects. An object storage platform can store billions of objects within its
containers, and you can access each object with a URL. The URL associated with
a file enables the file to be located within the container. Hence, the path to the
physical location of the file on the disk is not required. Object storage is well suited
for workflows with static file data or cloud storage.


File Storage                Object Storage
Hierarchical structure      Flat hierarchy
Manages location of data    Data stored in containers
Limited metadata            Not concerned with data location
                            Rich metadata

File storage metadata:
File Name: Formula 5Xa
Created by: M.Smith
Created on: 9/9/14
File type: Word

Object storage rich metadata:
Object ID: 98765
File Type: .doc
Lab facility: Atlanta
Building: 7
Lead Scientist: M. Smith
Description: xxx
Level: xxx
Test date: xxx
Patient trial: xxx
Patent: xxx
Approval ID: xxx
Risk Assessment: xxx
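
The practical difference is what a query can match on. The sketch below models a flat object container as a dictionary keyed by object ID and searches it by metadata; the structure and function name are illustrative and do not represent any object storage API.

```python
# A flat container: object ID mapped to rich metadata (values from the table).
container = {
    "98765": {
        "File Type": ".doc",
        "Lab facility": "Atlanta",
        "Building": "7",
        "Lead Scientist": "M. Smith",
    },
}


def find_objects(store: dict, criteria: dict) -> list:
    """Return the IDs of objects whose metadata matches every criterion."""
    return [
        object_id
        for object_id, metadata in store.items()
        if all(metadata.get(key) == value for key, value in criteria.items())
    ]


if __name__ == "__main__":
    print(find_objects(container, {"Lab facility": "Atlanta", "Building": "7"}))
```

A file system query, by contrast, is limited to attributes such as the name, type, owner, and dates.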


Accounts, Containers, and Objects


Shown is the Swift logical data layout. Accounts are the administrative control point
for containers and objects, containers organize objects, and objects contain user
data. For users to access objects, they must have an account on the system. An
account is the top of the hierarchy.

Graphic: an Account (administrative control point) holds containers (organize
objects), and the containers hold Object1 through Object5 (contain user data).


Configuring Isilon Swift Accounts


Administrators can create, delete, modify, or view Swift accounts. Administrators
can also define users who can access the accounts. The Swift account
management tasks are performed only through the CLI.

Administrators must provision the accounts before users can use the service. The
general steps are: enable the Swift license, decide upon file system user or group
ownership, create accounts using the isi swift command, and then assign
users access to the account. Make any necessary file system permission changes if
you are relocating data into the account.

The example shows creating a Swift account in the sales access zone and using
an Active Directory user and group. The isi swift accounts list shows the
accounts that are created in the access zone. The isi swift accounts view
shows the account details.

Syntax:
isi swift accounts create <account name> <swiftuser> <swiftgroup> --zone <string> --users <string>


Storage URL
Shown is what a Swift Storage URL looks like. URIs identify objects in the form
http://<cluster>/v1/account/container/object. In the example shown,
192.168.0.1 identifies the cluster. HTTP requests are sent to an internal web
service listening on port 28080. This port is not configurable. HTTPS requests are
proxied through the Apache web server listening on port 8083. This port is not
configurable. OpenStack defines the protocol version /v1. The reseller prefix is
/AUTH_bob, where /AUTH is a vestige of the OpenStack implementation's internal
details. The _bob portion of the URL is the account name used. The container /c1
is the container in which an object is stored, and the object /obj1 is the object.

Graphic labels: Cluster, Web service listening port, Protocol version, Reseller
prefix, Account, Container, Object.
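
The parts of a Storage URL can be separated with the standard URL parser. In this sketch the full URL is reconstructed from the pieces named in the text (cluster 192.168.0.1, HTTP port 28080, /v1, /AUTH_bob, /c1, /obj1), so treat the exact string as an assumption; the function name is illustrative.

```python
from urllib.parse import urlsplit


def parse_storage_url(url: str) -> dict:
    """Split a Swift storage URL into the parts labeled in the graphic."""
    parts = urlsplit(url)
    # Limit the split so that object names containing "/" stay intact.
    version, account, container, obj = parts.path.strip("/").split("/", 3)
    return {
        "cluster": parts.hostname,
        "port": parts.port,
        "protocol_version": version,
        "account": account,       # reseller prefix plus account name
        "container": container,
        "object": obj,
    }


if __name__ == "__main__":
    print(parse_storage_url("http://192.168.0.1:28080/v1/AUTH_bob/c1/obj1"))
```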


Swift Considerations and Limitations


Swift supports up to 150 concurrent active connections per cluster node. When
uploading objects or listing containers, the Swift service can become memory-
constrained and cause a service outage. To avoid an outage, maintain the Swift
Service memory load within 384 MB. Account and container listing requests initiate
a full file system walk from the requested entity. Workloads can expect longer
response times during the listing operations as the number of containers or objects
increases. To prevent response time issues, redistribute or reduce the objects and
containers until the response times are within the acceptable limits. You cannot
submit a PUT request to create a zero-length object because PUT is incorrectly
interpreted as a pseudo-hierarchical object. If the container is not empty, you
cannot submit a DELETE request to delete a container. As a best practice, delete
all the objects from the container before deleting the container. When
authenticating with Active Directory and Isilon Swift, the user name in the X-Auth-
User header must include the fully qualified AD domain name in the form test-
name@mydomain.com unless the domain has been configured as the default
through the assume-default-domain configuration parameter in the AD provider
configuration.

Pre OneFS 8.0 Swift accounts are deactivated when upgrading to OneFS 8.0 and
later. After the upgrade, Swift no longer uses home directories for accounts. The
upgrade plan should determine which users are using Swift. Create new accounts
under the new Swift path, and then move the data from the old accounts into the
newly provisioned accounts. Swift is not compatible with the auditing feature.

• 150 concurrent active connections per node


• Cannot submit PUT request for 0 length object
• Container must be empty to DELETE
• User name must include FQDN of AD domain
• Upgrade from OneFS 7.2 requires new account provisioning
• Not compatible with auditing
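
The user name rule for Active Directory can be captured in a small helper. This is a sketch of the rule as stated in the text, with a hypothetical function name; it builds only the header value and does not perform authentication.

```python
def x_auth_user(user: str, ad_domain: str, assume_default_domain: bool = False) -> str:
    """Build the X-Auth-User value for an Active Directory user."""
    if assume_default_domain:
        # The AD provider is configured to treat this domain as the default.
        return user
    return f"{user}@{ad_domain}"


if __name__ == "__main__":
    print(x_auth_user("test-name", "mydomain.com"))
```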



Glossary
Cache - L1
Client-side cache. L1 cache refers to read transaction requests, or when a client
requests data from the cluster. L1 cache is stored in a segmented area of the node
RAM and as a result is fast. Related to L1 cache is the write cache or the write
coalescer that buffers write transactions from the client. The write cache is flushed
after successful write transactions. In OneFS, the two similar caches are
distinguished based on their read or write functionality. Client-side caching includes
both the in and out client transaction buffers.

Cache - L2
Storage side or node-side buffer. Buffers write transactions and L2 writes to disk
and prefetches anticipated blocks for read requests, sometimes called read ahead
caching. For write transactions, L2 cache works with the journaling process to
ensure protected committed writes. As L2 cache becomes full, it flushes according
to the age of the data. L2 flushes the least recently used, or LRU, data.
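
Least recently used eviction can be illustrated with a short generic example. The sketch below is a textbook LRU cache built on OrderedDict, not the OneFS L2 implementation; the class name and capacity are arbitrary.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._blocks = OrderedDict()

    def get(self, key):
        if key not in self._blocks:
            return None
        self._blocks.move_to_end(key)  # mark as most recently used
        return self._blocks[key]

    def put(self, key, value) -> None:
        self._blocks[key] = value
        self._blocks.move_to_end(key)
        if len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)  # drop the oldest entry


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    cache.put("block-a", 1)
    cache.put("block-b", 2)
    cache.get("block-a")      # block-a becomes most recently used
    cache.put("block-c", 3)   # block-b is evicted
    print(cache.get("block-b"), cache.get("block-a"))
```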

Chimer Nodes
By default, if the cluster has more than three nodes, three of the nodes are
selected as chimers. If the cluster has three nodes or fewer, only one node is
selected as a chimer. If no external NTP server is set, nodes use the local clock.
Chimer nodes are selected by the lowest node number that is not excluded from
chimer duty.
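
The selection rule can be sketched in a few lines of Python. This is only an illustration of the rule as described: the function name and arguments are hypothetical, and the sketch assumes that a cluster with more than three nodes gets three chimers and any smaller cluster gets one.

```python
def select_chimers(node_numbers, excluded=()):
    """Pick chimer nodes: lowest node numbers that are not excluded."""
    nodes = set(node_numbers)
    skip = set(excluded)
    # Assumption: three chimers for clusters larger than three nodes, else one.
    wanted = 3 if len(nodes) > 3 else 1
    eligible = sorted(n for n in nodes if n not in skip)
    return eligible[:wanted]
```

For example, `select_chimers([1, 2, 3, 4, 5], excluded=[2])` skips node 2 and returns the next lowest node numbers.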

DataIQ Audited Deletes Plug-in


The Audited Deletes plug-in enables administrators to delete folders and files from
the DataIQ custom context menu, logging the actions. The plug-in asks for
confirmation before deleting anything, and logs all details of the delete operation.
The plug-in does not work with object stores such as S3, GCP, or ECS.

DataIQ Data Mover Plug-in


The Data Mover plug-in helps transfer files and folders more efficiently between file
systems. Data Mover is compatible with object storage such as Amazon S3,
Google Cloud Platform, and Dell ECS.

DataIQ Duplicate Finder Plug-in


The Duplicate Finder plug-in finds duplicate files across volumes and folders. The
plug-in does not work with object stores such as S3, GCP, or ECS.

DataIQ Previewer Plug-in
The Previewer plug-in shows a preview image of the file in the WebUI for common
file types. The supported graphic file extensions are: ".jpg", ".jpeg", ".tiff", ".tif",
".dpx", ".bmp", ".png", ".gif", ".tga", ".targa", ".exr", ".pcx", ".pict", ".ico". The
supported video file extensions are: ".mov", ".mp4", ".mpeg", ".mpg", ".ts", ".avi",
".mkv", ".wmf", ".wmv", ".mxf", ".ogv". The plug-in does not work with object stores
such as S3, GCP, or ECS.

Dynamic Aggregation Mode


A dynamic aggregation mode enables nodes with aggregated interfaces to
communicate with the switch so that the switch can use a comparable mode.

File Pool Policy


File pool policies enable you to filter files and directories and store them on specific
node pools or tiers according to criteria that you specify. You can change the
storage pool tier, change the optimization, and change the protection level if the file
or directory no longer requires greater protection. You can trigger the changes at
any time and on any directory or file.

File Provider
A file provider enables you to supply an authoritative third-party source of user and
group information to a cluster. A third-party source is useful in UNIX and Linux
environments that synchronize /etc/passwd, /etc/group, and /etc/netgroup
files across multiple servers.

Front Panel Display


The Front Panel Display is located on the physical node or chassis. It is used to
perform basic administrative tasks onsite.

Generation 6 Hardware
The Gen 6 platforms reduce the data center rack footprint with support for four
nodes in a single 4U chassis. They enable enterprises to take on new and more
demanding unstructured data applications. The Gen 6 platforms can store, manage,
and protect massively large datasets. With Gen 6, enterprises can gain new levels
of efficiency and achieve faster business outcomes.

Generation 6.5 Hardware

The ideal use cases for Gen 6.5 (F200 and F600) are remote office/back office,
factory floors, IoT, and retail. Gen 6.5 also targets smaller companies in the core
verticals, and partner solutions, including OEM. The key advantages are low entry
price points and the flexibility to add nodes individually, as opposed to the
chassis/two-node minimum for Gen 6.

Global Namespace Acceleration (GNA)


GNA enables the use of SSDs for metadata acceleration across the entire cluster.
GNA also uses SSDs in one part of the cluster to store metadata for nodes that
have no SSDs. The result is that critical SSD resources are maximized to improve
performance across a wide range of workflows.

Groupnet
The groupnet is a top-level networking container that manages hostname resolution
against DNS nameservers and contains subnets and IP address pools. Every
subnet is assigned to a single groupnet. Each cluster has a default groupnet
named groupnet0. Groupnet0 contains an initial subnet, subnet0, an initial IP
address pool, pool0, and an initial provisioning rule, rule0. Groupnets are how the
cluster communicates with the world. DNS client settings, such as name servers
and a DNS search list, are properties of the groupnet. If the cluster communicates
with another authentication domain, it must find that domain. To find another
authentication domain, you need a DNS setting to route to that domain. With
OneFS 8.0 and later releases, groupnets can contain individual DNS settings,
whereas prior OneFS versions had a single global entry.

Hadoop
Hadoop is designed to scale up from a single server to thousands of servers.
Hadoop clusters dynamically scale up and down based on the available resources
and the required service levels. Performance varies widely for processing, and
queries can take from a few minutes to multiple days depending on the number of
nodes and the amount of data requested.

Home Directory
Home directory provisioning creates a single home share that redirects users to
their SMB home directories. If one does not exist, a directory is automatically
created.

InsightIQ File System Reports

File system reports include data about the files that are stored on a cluster. The
reports are useful if, for example, you want to identify the types of data being stored
and where that data is stored. Before applying a file system report, enable InsightIQ
File System Analytics for that cluster.

InsightIQ Live Reporting


InsightIQ supports live versions of reports that are available through the InsightIQ
web application. You can create live versions of both performance and file system
reports. You can modify certain attributes as you view the reports, including the
time period, breakouts, and filters.

InsightIQ Performance Reports


Performance reports have information about cluster activity and capacity. The
reports are useful, for example, to determine whether clusters are performing as
expected or to investigate the cause of a performance issue.

isi get
When run without any options, the isi get command displays the protection settings
for an entire directory path or for a specific file. The POLICY (requested protection
policy), the LEVEL (actual protection), and the PERFORMANCE (data access
pattern) are displayed for each file. Running the command with a directory path
displays the properties for every file and subdirectory under the specified directory
path. Output can show files where protection is set manually. Mirrored file
protection is represented as 2x to 8x in the output.
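
When scripting around isi get output, it can help to recognize the mirrored levels. The following Python sketch only interprets a LEVEL string that has already been extracted; it does not run or parse the command itself. The function name is hypothetical, and the reading of "Nx" as N copies of the data is an assumption.

```python
import re
from typing import Optional

# Mirrored protection appears as 2x through 8x.
_MIRROR_LEVEL = re.compile(r"^([2-8])x$")


def mirror_copies(level: str) -> Optional[int]:
    """Return the mirror count for a level such as '3x', else None."""
    match = _MIRROR_LEVEL.match(level.strip())
    return int(match.group(1)) if match else None
```

Any value outside the 2x to 8x range, including parity-style levels, returns `None` in this sketch.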

Job - Default Impact Policy


The default impact policy is the amount of system resources that the job uses
compared to other system maintenance jobs running simultaneously.

Job - Default Priority


The Default priority gives the job priority as compared to other system maintenance
jobs running simultaneously. You can modify the job priority, but it is not
recommended.

Job - Schedule
With the Schedule options, you can start the job manually or set it to run on a
regularly scheduled basis.

LACP Aggregation Mode

LACP uses hashed protocol header information that includes the source and
destination address, and the VLAN tag, if available. LACP enables a network
device to negotiate and identify any LACP-enabled devices and create a link. LACP
monitors the link status and, if a link fails, fails traffic over to the remaining links.
LACP accepts incoming traffic from any active port. PowerScale is passive in the
LACP conversation and listens to the switch to dictate the conversation parameters.
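
The idea of hashing header fields to choose a link can be sketched as follows. This is a conceptual illustration only: real switches and nodes use their own hashing schemes, and SHA-256, the function name, and the link names here are assumptions chosen to keep the example self-contained.

```python
import hashlib


def pick_link(src: str, dst: str, vlan, active_links):
    """Map a flow (source, destination, optional VLAN tag) to one active link."""
    if not active_links:
        raise ValueError("no active links available")
    vlan_part = "" if vlan is None else str(vlan)
    key = f"{src}|{dst}|{vlan_part}".encode()
    # Stand-in hash; the real hashing algorithm is device specific.
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(active_links)
    return active_links[index]
```

Because the hash depends only on the flow fields, the same flow keeps using the same link while the set of active links is unchanged. If a link fails and is removed from `active_links`, the flow is mapped onto one of the remaining links.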

Layers of Access
• Protocol Layer - The first layer is the protocol layer. Protocols may be Server
Message Block (SMB), Network File System (NFS), File Transfer Protocol (FTP),
or some other protocol.
• Authentication Layer - The authentication layer identifies a user using a system
such as NIS, local files, or Active Directory.
• Identity Assignment Layer - The third layer is identity assignment. This layer is
straightforward and based on the results of the authentication layer, but there
are some cases that need identity mediation within the cluster, or where roles
are assigned within the cluster that are based on user identity.
• Authorization Layer - Finally, based on the established connection and
authenticated user identity, the file and directory permissions are evaluated. The
evaluation determines whether the user is entitled to perform the requested data
activities.

Leaf-Spine
Leaf-Spine is a two-level hierarchy where nodes connect to leaf switches, and leaf
switches connect to spine switches. Leaf switches do not connect to one another,
and spine switches do not connect to one another. Each leaf switch connects with
each spine switch and all leaf switches have the same number of uplinks to the
spine switches.
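
These wiring rules lend themselves to a quick check. The Python sketch below validates a hypothetical list of switch-to-switch links against the rules. The function name and switch names are assumptions, and the model allows at most one link per switch pair, so equal uplink counts follow from every leaf reaching every spine.

```python
def leaf_spine_problems(leaves, spines, links):
    """Return a list of rule violations for a leaf-spine wiring plan.

    links is an iterable of (switch_a, switch_b) pairs.
    """
    leaves, spines = set(leaves), set(spines)
    pairs = {frozenset(pair) for pair in links}
    problems = []
    for pair in sorted(pairs, key=sorted):
        if pair <= leaves:
            problems.append(f"leaf-to-leaf link: {sorted(pair)}")
        elif pair <= spines:
            problems.append(f"spine-to-spine link: {sorted(pair)}")
    # Every leaf must have an uplink to every spine.
    for leaf in sorted(leaves):
        for spine in sorted(spines):
            if frozenset((leaf, spine)) not in pairs:
                problems.append(f"missing uplink: {leaf} -> {spine}")
    return problems
```

An empty result means the plan satisfies the rules that this sketch checks.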

Local Provider
Local authentication is useful when Active Directory, LDAP, or NIS directory
services are not configured or when a specific user or application needs access to
the cluster. Local groups can include built-in groups and Active Directory groups as
members.

MTTDL
MTTDL (mean time to data loss) is a statistical calculation that estimates the
likelihood of a hardware failure resulting in data loss. MTTDL is a system view of
reliability and asks the question “What happens when hardware does fail, and will I
lose any data when it does?”

NAS
NAS is an IP-based, dedicated, high-performance file sharing and storage device.

NFS
Network File System, or NFS, is an open standard that UNIX clients use. The NFS
protocol enables a client computer to access files over a network. NFS clients
mount the OneFS export that is accessible under a client mountpoint. The
mountpoint is the directory that displays files from the server. The NFS service
enables you to create as many NFS exports as needed.

NFSv4 Continuous Availability


NFSv4 enables clients to transparently fail over to another node when a network or
node fails. The CA option enables movement from one node to another with no
manual intervention on the client side. Movement to another node enables a
continuous workflow from the client side with no apparent disruption to their
working time. CA supports home directory workflows.

OneFS CLI
The command-line interface runs "isi" commands to configure, monitor, and
manage the cluster. Access to the command-line interface is through a secure shell
(SSH) connection to any node in the cluster.

OneFS Multi-Tenancy
With OneFS, multi-tenancy enables the PowerScale cluster to simultaneously
handle more than one set of networking configurations. Multi-Tenant Resolver, or
MTDNS, is a subset of multi-tenancy that pertains to hostname resolution against
DNS name servers. Each tenant on the cluster can have its own network settings.
Before OneFS 8.0, you could define only one set of DNS servers on the cluster.

PaaS
PaaS combined with approaches like continuous integration and deployment can
measure application development cycles in days and weeks rather than months
or years. The combination can dramatically reduce the time it takes to go from
having an idea to identifying insight, taking action, and creating value.

PAPI
The PAPI is divided into two functional areas: one area enables cluster
configuration, management, and monitoring functionality, and the other area
enables operations on files and directories on the cluster. A chief benefit of PAPI is
its scripting simplicity, enabling customers to automate their storage administration.
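
As a rough illustration of the two functional areas, the Python sketch below builds request URLs for each. The `/platform` and `/namespace` prefixes, port 8080, the example host name, and the function name are assumptions for illustration; verify them against the OneFS API reference for your release before use.

```python
from urllib.parse import quote

# Assumed URL prefixes for the two PAPI functional areas.
AREA_PREFIXES = {
    "config": "/platform",   # cluster configuration, management, monitoring
    "files": "/namespace",   # operations on files and directories
}


def papi_url(host: str, area: str, path: str, port: int = 8080) -> str:
    """Build an HTTPS URL for one of the two functional areas."""
    if area not in AREA_PREFIXES:
        raise ValueError(f"unknown area: {area!r}")
    # quote() escapes spaces and similar characters but keeps '/' separators.
    return f"https://{host}:{port}{AREA_PREFIXES[area]}/{quote(path.strip('/'))}"
```

A script would then send authenticated HTTPS requests to these URLs with an HTTP client of its choice.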

PowerScale A200
The A200 is an ideal active archive storage solution that combines near-primary
accessibility, value and ease of use.

PowerScale A2000
The A2000 is an ideal solution for high density, deep archive storage that
safeguards data efficiently for long-term retention.

PowerScale F200
Ideal as a low-cost all-flash node pool for existing Gen 6 clusters. Also ideal for
small, remote clusters.

PowerScale F600
Ideal for small, remote clusters with exceptional system performance for small
office and remote office technical workloads.

PowerScale F800
Use the F800 for workflows that require extreme performance and efficiency.

PowerScale F810

Use the F810 for workflows that require extreme performance and efficiency. The
F810 also provides high-speed inline data deduplication and inline data
compression. It delivers up to 3:1 efficiency, depending on your specific dataset
and workload.

PowerScale H400
The H400 provides a balance of performance, capacity and value to support a wide
range of file workloads. It delivers up to 3 GB/s bandwidth per chassis and provides
capacity options ranging from 120 TB to 720 TB per chassis.

PowerScale H500
The H500 is a versatile hybrid platform that delivers up to 5 GB/s bandwidth per
chassis with a capacity ranging from 120 TB to 720 TB per chassis. It is an ideal
choice for organizations looking to consolidate and support a broad range of file
workloads on a single platform.

PowerScale H5600
The H5600 combines massive scalability (960 TB per chassis and up to 8 GB/s
bandwidth) in an efficient, highly dense, deep 4U chassis. The H5600 delivers inline
data compression and deduplication. It is designed to support a wide range of
demanding, large-scale file applications and workloads.

PowerScale H600
The H600 is designed to provide high performance at value and delivers up to
120,000 IOPS and up to 12 GB/s bandwidth per chassis. It is ideal for high
performance computing (HPC) workloads that don’t require the extreme performance
of all-flash.

Quotas - Accounting
Accounting quotas monitor, but do not limit, disk storage. With accounting quotas,
you can review and analyze reports to help identify storage usage patterns.
Accounting quotas help administrators plan for capacity expansions and future
storage requirements. Accounting quotas can track the amount of disk space that
various users or groups use.

Quotas - Advisory
Advisory quotas do not deny writes to the disk, but they can trigger alerts and
notifications after the threshold is reached.

Quotas - Default Directory Quota

Versions before OneFS 8.2.0 have default quotas for users and groups, but
not for directories. Common directory quota workflows, such as home
directories and project management folders, can have a default directory quota that
simplifies quota management.

Quotas - Enforcement
Enforcement quotas include the functionality of accounting quotas and enable the
sending of notifications and the limiting of disk storage.

Quotas - Hard Quota


Hard quotas limit disk usage to a specified amount. Writes are denied after
reaching the hard quota threshold and are only permitted when the used capacity
falls below the threshold.

Quotas - Soft Quota


Soft quotas enable an administrator to configure a grace period that starts after the
threshold is exceeded. After the grace period expires, the boundary becomes a
hard quota, and writes are denied. If the usage drops below the threshold, writes
are again permitted.
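
The difference between advisory, soft, and hard thresholds can be summarized in a short Python sketch. This is a simplified model and not OneFS logic: the class, the field names, and the choice to deny a write only when it would push usage above the threshold are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    kind: str               # "advisory", "soft", or "hard"
    threshold: int          # bytes
    grace_seconds: int = 0  # only meaningful for soft quotas


def write_allowed(quota: Quota, usage_after_write: int,
                  seconds_over_threshold: int = 0) -> bool:
    """Decide whether a write is permitted under the given quota."""
    if usage_after_write <= quota.threshold:
        return True   # at or under the threshold: always permitted
    if quota.kind == "advisory":
        return True   # advisory quotas alert but never deny writes
    if quota.kind == "hard":
        return False  # hard quotas deny writes over the threshold
    if quota.kind == "soft":
        # Soft quotas deny writes only after the grace period expires.
        return seconds_over_threshold <= quota.grace_seconds
    raise ValueError(f"unknown quota kind: {quota.kind!r}")
```

For example, a soft quota with a 60-second grace period still accepts a write 30 seconds after the threshold is exceeded, but denies it after 61 seconds.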

Reed-Solomon
OneFS uses the Reed-Solomon algorithm, which is an industry standard method to
create error-correcting codes, or ECC, at the file level.

RFC 2307 Compliant


Use Microsoft Active Directory with Windows Services for UNIX and RFC 2307
attributes to manage Linux, UNIX, and Windows systems. Integrating UNIX and
Linux systems with Active Directory centralizes identity management and eases
interoperability, reducing the need for user-mapping rules. Make sure your domain
controllers are running Windows Server 2003 or later.

Scale-out Solution

Not all clustered NAS solutions are the same. Some vendors overlay a
management interface across multiple independent NAS boxes. This gives a
unified management interface, but does not unify the file system. While this
approach does ease the management overhead of traditional NAS, it still does not
scale well.
With scale-out, a single component (node) of a system or cluster contains the
performance, compute, and capacity. As the need for capacity or compute power
increases, you add more nodes to the cluster. The node is not equivalent to a
scale-up controller as disk capacity is not added to a node. The cluster scales out
as you add nodes, making it a much more scalable solution than a scale-up
implementation.

Scale-up Solution
The two controllers can run active/active or active/passive. For more capacity, add
another disk array. Each of these components is added individually. As more
systems are added, NAS sprawl becomes an issue.

Scale-up Storage
Scale-up storage is the traditional architecture that is dominant in the enterprise
space. High performance, high availability single systems that have a fixed capacity
ceiling characterize scale-up.

Serial Console
The serial console is used for initial cluster configurations by establishing serial
access to the node designated as node 1.

SmartConnect as a DNS Server


SmartConnect serves DNS information to inbound queries and as such acts as a
DNS server.

SmartDedupe
OneFS deduplication saves a single instance of data when multiple identical
instances of that data exist, reducing storage consumption. Deduplication
can be done at various levels: duplicate files, duplicate blocks in files, or identical
extents of data within files. Stored data on the cluster is inspected, block by block,
and one copy of duplicate blocks is saved, which reduces storage expenses. File
records point to the shared blocks, but file metadata is not deduplicated.
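
Block-level deduplication can be illustrated with a minimal Python sketch. It is a conceptual model rather than the SmartDedupe implementation: the 8 KiB block size, the use of SHA-256 digests as block identifiers, and the function names are assumptions.

```python
import hashlib

BLOCK_SIZE = 8192  # assumed block size for this sketch


def dedupe(files, block_size=BLOCK_SIZE):
    """Store each distinct block once and keep per-file block references.

    files maps a file name to its content as bytes.
    """
    store = {}    # digest -> block data (one copy per distinct block)
    records = {}  # file name -> ordered list of digests
    for name, data in files.items():
        refs = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)  # keep only the first copy
            refs.append(digest)
        records[name] = refs
    return store, records


def rebuild(name, store, records):
    """Reassemble a file from its block references."""
    return b"".join(store[digest] for digest in records[name])
```

Two files that share a block end up referencing the same stored copy, while each file keeps its own record, which mirrors the point that file metadata is not deduplicated.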

SmartLock Compliance

Compliance is a regulatory requirement that carries certain restrictions as to how
retention must be implemented. The simple Securities and Exchange Commission
(SEC) Rule 17a-4(f) definition states that “the requirement in paragraph (f)(2)(ii)(A)
of the rule permits use of an electronic storage system that prevents the
overwriting, erasing, or otherwise altering of a record during its required retention
period through the use of integrated hardware and software control codes.” This
rule is often seen as the regulatory standard that must be met for data retention by
other regulatory agencies. OneFS uses a specific compliance clock for SmartLock
Compliance retention. System integrity is one of the required elements to
guarantee that the retention of the file meets the compliance requirements. The
system must be secure and protect against modifications which could allow data to
be modified or deleted. Retention date integrity is another requirement that refers to
how the retention date is stored and accessed so that retention time requirements
are met.

SmartLock Retention Period


Retention is a time period during which files are set to a read-only state and may
not be moved, modified, or deleted until a future date. After the retention date is
reached, you can once again modify or delete the file. Files from the PowerScale cluster are
never automatically deleted, and OneFS provides no automated means to delete
files with expired retention. The date varies by the internal and regulatory
requirements of the organization. A retention clock manages the date and time that
is associated with the retention date.

SmartLock WORM
SmartLock provides WORM (write-once/read-many) status on files. In a WORM
state, files can be read but not modified. "Committing" a file is changing a file from
a read/write state to a WORM state that has a retention expiration date. Files are
committed to a WORM state when using SmartLock.

SmartPools
SmartPools is a software module that enables administrators to define and control
file management policies within a cluster.

SmartPools Advanced License


The advanced feature, disk pool spillover management, lets you choose
whether write operations are redirected to another node pool when the target node
pool is full. If SmartPools is unlicensed, spillover is automatically enabled.

SmartPools Basic License

A single tier has only one file pool policy that applies the same protection level and
I/O optimization settings to all files and folders in the cluster. The basic version of
SmartPools supports virtual hot spares, enabling space reservation in a node pool
for reprotection of data. OneFS implements SmartPools basic by default.

SMB Continuous Availability (CA)


CA enables SMB clients to transparently and automatically fail over to another node
if a network or node fails. CA is supported with Microsoft Windows 8, Windows 10,
and Windows 2012 R2 clients.

SMB Server-Side Copy


Clients using server-side copy can experience considerable performance
improvements for file copy operations, like CopyFileEx or "copy-paste" when using
Windows Explorer. Server-side copy only affects file copy or partial copy operations
in which the source and destination file handles are open on the same share and
does not work for cross-share operations.

SMB Service Witness Protocol (SWP)


Microsoft introduced an RPC-based mechanism called SWP. SWP provides a
faster recovery mechanism for SMB 3.0 clients to fail over should their server go
down. SWP requires continuously available file shares and is aware of cluster or
scale-out storage. SWP observes the servers in use and, if one is unavailable,
notifies the SMB client to release its file handle. The exchange happens within five
seconds, dramatically decreasing the time from the 30 seconds to 45 seconds
previously required with time-outs. L3 cache uses advanced algorithms to determine
the metadata and user data blocks that are cached in L3. L3 cached data is durable
and survives a node reboot without requiring repopulating.

SMB Time Out Service


The time-out services must wait for a specific period before notifying the client that
a server is down. The time-outs can take 30 seconds to 45 seconds, which creates
a high latency that is disruptive to enterprise applications.

Snapshot - Redirect on Write (RoW)


RoW snapshots are system defined. RoW avoids the double write penalty by
writing changes to a snapshot-protected file directly to another free area of the file
system. However, RoW increases file fragmentation. RoW in OneFS is used
for more substantial changes such as deletes and large sequential writes.

Snapshot Manual Create

Manual snapshots are useful to create a snapshot immediately, or at a time that is
not specified in a snapshot schedule. For example, if you plan to change the file
system but are unsure of the consequences, capture the current file system state
using a snapshot before making changes.

Snapshot Schedule
The most common method is to use schedules to generate the snapshots. A
snapshot schedule generates snapshots of a directory according to a schedule. A
benefit of scheduled snapshots is not having to manually create a snapshot each
time one is needed. An expiration period should be assigned to the snapshots that
are generated, automating the deletion of snapshots after the expiration period.

SnapshotIQ
OneFS snapshots are used to protect data against accidental deletion and
modification. Because snapshots are available locally, users can restore their data
without administrative intervention.

Stateless Connection
With a stateless connection, the session or “state” information is maintained on the
client side. If a node goes down, the IP address that the client is connected to fails
over to another node in the cluster. The client would not know that its original node
had failed.

Static Aggregation Mode


Static modes do not facilitate communication between nodes and the switch.

Storage Pool Global Settings


Global settings include L3 cache enablement status, global namespace
acceleration (GNA) enablement, virtual hot spare (VHS) management, global
spillover settings, and more. You can use the "isi storagepool" command to
manage the SmartPools settings.

Virtual Hot Spare (VHS)


VHS is available with the licensed and unlicensed SmartPools module. By default,
all available free space on a cluster is used to rebuild data. The virtual hot spare
option reserves free space for this purpose. VHS provides a mechanism to assure
there is always space available and to protect data integrity when the cluster space
is overused.

WebUI

The browser-based OneFS web administration interface provides secure access
with OneFS-supported browsers. This interface is used to view robust graphical
monitoring displays and to perform cluster-management tasks.

Windows ACL
A Windows ACL is a list of access control entries, or ACEs. Each entry contains a
user or group and a permission that allows or denies access to a file or folder.
