RHEL Multipathing Basics
a) Algorithms that require the switch to participate in the teaming, also known as switch-
dependent modes. These algorithms usually require all the network adapters of the team to be
connected to the same switch.
b) Algorithms that do not require the switch to participate in the teaming, also referred to as
switch-independent modes. Because the switch does not know that the network adapter is part
of a team, the team's network adapters can be connected to different switches. Switch-
independent modes do not require that the team members connect to different switches; they
merely make it possible.
Switch-dependent modes:
There are two common choices for switch-dependent modes of NIC Teaming:
1. Generic or static teaming - This mode requires configuration on both the switch and the
computer to identify which links form the team. Because this is a statically configured
solution, no additional protocol assists the switch and the computer in identifying incorrectly
plugged cables or other errors that could cause the team to fail.
2. Dynamic teaming (LACP) - This mode uses the Link Aggregation Control Protocol (LACP) to
dynamically identify links between the computer and a specific switch. This enables the
automatic creation of a team and, in theory, the expansion and reduction of a team simply by
the transmission or receipt of LACP packets from the peer network adapter (a RHEL
configuration sketch follows this list).
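If these modes are set up on a RHEL host, a team can be configured with NetworkManager. The
following is a minimal sketch only, assuming an LACP runner and two member interfaces named
eth1 and eth2 (interface names, connection names, and runner options are placeholders):
nmcli connection add type team con-name team0 ifname team0 config '{"runner": {"name": "lacp"}}'
nmcli connection add type team-slave con-name team0-port1 ifname eth1 master team0
nmcli connection add type team-slave con-name team0-port2 ifname eth2 master team0
nmcli connection up team0-port1
nmcli connection up team0-port2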
# By default, DM-Multipath includes support for the most common storage arrays
# that support multipathing. The supported devices can be found in the
# multipath.conf.defaults file. If your storage array supports DM-Multipath and
# is not configured by default in this file, you may need to add it to the
# config file.
# DM-Multipathing components
# ------------------------------------------------------------------------------------------
# - dm-multipath kernel module: Reroutes I/O and supports failover for paths and
#   path groups.
# - multipathd daemon: Monitors paths; as paths fail and come back, it may initiate
#   path group switches. Provides for interactive changes to multipath devices. This
#   daemon must be restarted (or reloaded) for any changes to the /etc/multipath.conf
#   file to take effect.
# - multipath command: Lists and configures multipath devices. Normally started up
#   with /etc/rc.sysinit, it can also be started up by a udev program whenever a
#   block device is added, or it can be run by the initramfs file system.
# - kpartx command: Creates device mapper devices for the partitions on a device. It
#   is necessary to use this command for DOS-based partitions with DM-MP. kpartx is
#   provided in its own package, but the device-mapper-multipath package depends on
#   it.
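# A quick sanity check that the kernel module is loaded and the daemon is running
# (a sketch; RHEL 6 style service command assumed):
lsmod | grep dm_multipath
service multipathd status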
# DM-Multipathing devices
# ------------------------------------------------------------------------------------------
# DM-Multipath setup
# ------------------------------------------------------------------------------------------
# - Install the device-mapper-multipath rpm
# - Edit the /etc/multipath.conf configuration file:
# - comment out the default blacklist or create your own exclude blacklist
# - change any of the defaults (if required)
# - Start the multipath daemons
# - Create the multipath device with the multipath command
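# A minimal sketch of the install step (assuming a yum-based RHEL system); the
# remaining steps are shown below:
yum install -y device-mapper-multipath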
# Basic multipath.conf file
# ------------------------------------------------------------------------------------------
mpathconf --enable
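# mpathconf --enable creates a default /etc/multipath.conf (if one does not already
# exist), typically with user_friendly_names enabled in its defaults section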
# - the defaults section configures multipath to use user-friendly names; there are
#   a number of other options that can be set there
# - the blacklist section excludes specific disks from being multipathed; notice the
#   exclusion of all wwid disks
# - the blacklist_exceptions section lists the devices with a specific wwid that
#   should be included despite the blacklist
# - the multipaths section creates aliases that match a specific disk to an alias
#   using the wwid
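# The defaults section mentioned above is not shown in this snippet; a minimal
# example (an assumption of what it could look like) is:
defaults {
    user_friendly_names yes
}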
blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^(hd|xvd|vd)[a-z]*"
    wwid "*"
}
blacklist_exceptions {
    wwid "20017580006c00034"
    wwid "20017580006c00035"
    wwid "20017580006c00036"
    wwid "20017580006c00037"
}
multipaths {
    multipath {
        wwid "20017580006c00034"
        alias mpath0
    }
    multipath {
        wwid "20017580006c00035"
        alias mpath1
    }
    multipath {
        wwid "20017580006c00036"
        alias mpath2
    }
    multipath {
        wwid "20017580006c00037"
        alias mpath3
    }
}
modprobe dm-multipath
service multipathd start
multipath -d
# This will perform a dry run to make sure everything is OK. Fix anything that
# appears as a problem.
multipath -v2
# Commits the configuration
multipath -ll
chkconfig multipathd on
# Ensure the multipath devices are configured automatically after a reboot
# Now we should see something similar to the output below; each device is
# active and ready.
multipath -ll
mpath2 (360060e80057110000000711000005405) dm-8 HP,OPEN-V
[size=408G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 2:0:1:0 sdc 8:32 [active][ready]
\_ 3:0:2:0 sdn 8:208 [active][ready]
mpath1 (360060e8005711000000071100000810a) dm-7 HP,OPEN-V
[size=408G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 2:0:0:1 sdb 8:16 [active][ready]
\_ 3:0:0:1 sdl 8:176 [active][ready]
mpath0 (360060e80057110000000711000002206) dm-6 HP,OPEN-V
[size=408G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 2:0:0:0 sda 8:0 [active][ready]
\_ 3:0:0:0 sdk 8:160 [active][ready]
mpath9 (360060e80057110000000711000005306) dm-15 HP,OPEN-V
[size=408G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 2:0:7:0 sdj 8:144 [active][ready]
\_ 3:0:4:0 sdp 8:240 [active][ready]
mpath8 (360060e80057110000000711000008305) dm-14 HP,OPEN-V
[size=408G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 2:0:6:1 sdi 8:128 [active][ready]
\_ 3:0:5:1 sdr 65:16 [active][ready]
mpath7 (360060e80057110000000711000002506) dm-13 HP,OPEN-V
[size=408G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 2:0:6:0 sdh 8:112 [active][ready]
\_ 3:0:5:0 sdq 65:0 [active][ready]
mpath6 (360060e80057110000000711000007408) dm-12 HP,OPEN-V
[size=408G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 2:0:5:0 sdg 8:96 [active][ready]
\_ 3:0:6:0 sds 65:32 [active][ready]
mpath5 (360060e80057110000000711000002305) dm-11 HP,OPEN-V
[size=408G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 2:0:4:0 sdf 8:80 [active][ready]
\_ 3:0:7:0 sdt 65:48 [active][ready]
mpath4 (360060e80057110000000711000006207) dm-10 HP,OPEN-V
[size=408G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 2:0:3:0 sde 8:64 [active][ready]
\_ 3:0:3:0 sdo 8:224 [active][ready]
mpath3 (360060e80057110000000711000000409) dm-9 HP,OPEN-V
[size=408G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 2:0:2:0 sdd 8:48 [active][ready]
\_ 3:0:1:0 sdm 8:192 [active][ready]
# If you have made a mistake in the multipath.conf file, use the following steps
# to correct it:
vi /etc/multipath.conf
service multipathd reload
multipath -F
multipath -d
multipath -v2
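# Array-specific settings can be overridden in the devices section of
# /etc/multipath.conf. For example, an entry for the HP OPEN-V arrays shown above
# might look like the following (values are illustrative; check your array vendor's
# recommended settings):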
devices {
    device {
        vendor "HP"
        product "OPEN-.*"
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        hardware_handler "0"
        path_selector "round-robin 0"
        path_grouping_policy multibus
        failback immediate
        rr_weight uniform
        no_path_retry 12
        rr_min_io 1000
        path_checker tur
    }
}
# We can blacklist any device, but we need to tell multipath what to exclude.
# Some examples:
# wwid
# ---------------------------------
# Specific wwid
blacklist {
wwid "20017580006c00034"
}
# All wwid
blacklist {
wwid "*"
}
# device name
# ---------------------------------
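# Blacklist by device node name (a sketch; this would blacklist all sd* devices)
blacklist {
    devnode "^sd[a-z]"
}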
# device type
# ---------------------------------
# Blacklist HP devices
blacklist {
    device {
        vendor "HP"
        product "*"
    }
}
# Blacklist exceptions can be used to include devices even if they match the blacklist.
# Some examples:
# wwid
# ---------------------------------
# Include a specific wwid (exception to the blacklist)
blacklist_exceptions {
    wwid "20017580006c00034"
}
# device name
# ---------------------------------
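# Include a specific device node as an exception (a sketch; sdc is a placeholder)
blacklist_exceptions {
    devnode "^sdc$"
}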
# device type
# ---------------------------------
# Include HP devices (exception to the blacklist)
blacklist_exceptions {
    device {
        vendor "HP"
        product "*"
    }
}
1. Overview
The connection from the server through the HBA to the storage controller is referred to as
a path. When multiple paths exist to a storage device (LUN) on a storage subsystem, it
is referred to as multipath connectivity. It is an enterprise-level storage capability. The main
purpose of multipath connectivity is to provide redundant access to the storage devices,
i.e., to have access to the storage device when one or more of the components in a path
fail. Another advantage of multipathing is the increased throughput by way of load
balancing.
Note: Multipathing protects against the failure of path(s) and not the failure of
a specific storage device.
A common example of multipath is a SAN-connected storage device. Usually one or more
Fibre Channel HBAs from the host will be connected to the fabric switch, and the storage
controllers will be connected to the same switch.
A simple example of multipath could be: 2 HBAs connected to a switch to which the
storage controllers are also connected. In this case the storage controller can be accessed
from either of the HBAs, and hence we have multipath connectivity.
Consider a configuration in which each host has 2 HBAs and each storage array has 2
controllers. With that setup, each host will have 4 paths to each of the LUNs in the
storage.
In Linux, a SCSI device is configured for a LUN seen on each path, i.e., if a LUN has 4
paths, then one will see four SCSI devices configured for the same device. Doing I/O to
a LUN in such an environment is unmanageable without a multipathing layer.
Device mapper is a block subsystem that provides a layering mechanism for block
devices. One can write a device mapper target to provide a specific functionality on top
of a block device, for example:
concatenation
mirror
striping
encryption
flaky
delay
multipath
Multiple device mapper modules can be stacked to get the combined functionality.
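The stacking of device mapper devices on a system can be inspected with dmsetup, for
example:
# dmsetup ls --tree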
Paths are grouped into priority groups, and one of the priority groups is used for I/O;
it is called the active group. A path selector selects a path in the priority group to be
used for an I/O based on some load-balancing algorithm (for example, round-robin).
When an I/O fails on a path, that path gets disabled and the I/O is retried on a different
path in the same priority group. If all paths in a priority group fail, a different priority
group which is enabled will be selected to send I/O.
# multipath -ll
mydev1 (3600a0b800011a1ee0000040646828cc5) dm-1 IBM,1815 FAStT
[size=512M][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=6][active]
\_ 29:0:0:1 sdf 8:80 [active][ready]
\_ 28:0:1:1 sdl 8:176 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 28:0:0:1 sdb 8:16 [active][ghost]
\_ 29:0:1:1 sdq 65:0 [active][ghost]
Path Group 1:
\_ round-robin 0 [prio=6][active]
   \_              -> path group level
   round-robin 0   -> path selector and repeat count
   [prio=6]        -> path group priority
   [active]        -> path group state
Path Group 2:
\_ round-robin 0 [prio=0][enabled]
\_ 28:0:0:1 sdb 8:16 [active][ghost]
\_ 29:0:1:1 sdq 65:0 [active][ghost]
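For reference, the fields on each path line are:
\_ 28:0:0:1 sdb 8:16 [active][ghost]
   28:0:0:1 -> SCSI host:channel:id:lun
   sdb      -> device name
   8:16     -> major:minor number
   [active] -> DM path state
   [ghost]  -> physical path state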
2.2. Terminology
Path
Connection from the server through an HBA to a specific LUN. Without DM-MP, each
path would appear as a separate device.
Path Group
Paths are grouped into path groups. At any point in time only one path group will be
active. The path selector decides which path in the path group gets to send the next
I/O. I/O will be sent only via the active path group.
Path Priority
Each path has a specific priority. A priority callout program provides the priority
for a given path. The user space commands use this priority value to choose an
active path. In the group_by_prio path grouping policy, path priority is used to
group the paths together and change their relative weight with the round
robin path selector.
Path Group Priority
Sum of priorities of all non-faulty paths in a path group. By default, the multipathd
daemon tries to keep the path group with the highest priority active.
Path Grouping Policy
Determines how the path group(s) are formed using the available paths. There are five
different policies (see the configuration sketch after this list):
1. multibus: One path group is formed with all paths to a LUN. Suitable for
devices that are in Active/Active mode.
2. failover: Each path group will have only one path.
3. group_by_serial: One path group per storage controller (serial number). All paths
that connect to the LUN through a controller are assigned to a path group.
Suitable for devices that are in Active/Passive mode.
4. group_by_prio: Paths with the same priority will be assigned to a path group.
5. group_by_node_name: Paths with the same target node name will be assigned to a
path group.
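The path grouping policy can be set in /etc/multipath.conf, for example in the defaults
section (a minimal sketch; it can also be set per device or per multipath):
defaults {
    path_grouping_policy multibus
}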
Path States
This refers to the physical state of a path. A path can be in one of the following
states:
1. ready: Path is up and can handle I/O requests.
2. faulty: Path is down and cannot handle I/O requests.
3. ghost: Path is a passive path. This state is shown for the passive path
in Active/Passive mode.
4. shaky: Path is up, but temporarily not available for I/O requests.
DM Path States
This refers to the DM (kernel) module's view of the path's state. It can be in one of the
two states:
1. active: Last I/O sent to this path successfully completed. Analogous
to ready path state.
2. failed: Last I/O to this path failed. Analogous to faulty path state.
Path Group State
Path Groups can be in one of the following three states:
1. active: I/O sent to the multipath device will be routed to this path group. Only one
path group will be in this state.
2. enabled: If none of the paths in the active path group is in the ready state, I/O will
be sent to these path groups. There can be one or more path groups in this state.
3. disabled: If none of the paths in the active path group or the enabled path groups is
in the ready state, I/O will be sent to these path groups. There can be one or more path
groups in this state. This state is available only for certain storage devices.
When all the paths in the active path group are in the faulty state, one of the enabled path
groups (the one with the highest priority) that has paths in the ready state will be
made active. If there are no paths in the ready state in any of the enabled path groups,
then one of the disabled path groups (the one with the highest priority) will be made
active. Making a new path group active is also referred to as switching of the path
group. The original active path group's state will be changed to enabled.
Alias
A user-friendly and/or user-defined name for a DM device. By default, the WWID is
used for the DM device. This is the name that is listed in the /dev/disk/by-
name directory. When the user_friendly_names configuration option is set, the
alias of a DM device will have the form mpath<n>. The user also has the option of
setting a unique alias for each multipath device.
DM-Multipath allows many of the features to be user-configurable using the configuration
file /etc/multipath.conf. The multipath command and the multipathd daemon use the
configuration information from this file. This file is consulted only during the configuration
of multipath devices. In other words, if the user makes any changes to this file, then
the multipath command needs to be rerun to configure the multipath devices (i.e., the user
has to do multipath -F followed by multipath).
Support for many of the devices (as listed below) is built into the user-space component
of DM-Multipath. Only if the support for a specific storage device is not built in, or the
user wants to override some of the values, does the user need to modify this file.
1. System-level defaults ("defaults"): Where the user can specify system-level
default overrides.
2. Blacklisted devices ("blacklist"): The user can specify the list of devices they do not
want to be under the control of DM-Multipath. These devices will be excluded.
3. Blacklist exceptions ("blacklist_exceptions"): Specific devices to be treated as
multipath candidates even if they exist in the blacklist.
4. Storage controller specific settings ("devices"): User specified configuration
settings will be applied to devices with specified "Vendor" and "Product"
information.
5. Device specific settings ("multipaths"): User can fine tune configuration settings
for individual LUNs.
The user can specify the values for the attributes in this file using regular-expression
syntax. For a detailed explanation of the different attributes and the allowed values for
the attributes, please refer to the multipath.conf.annotated file.
Attribute values are set at multiple levels (internally in the multipath tools and through
the multipath.conf file). The general order of precedence, from lowest to highest, is:
built-in defaults, the defaults section, the devices section, and the multipaths section.
The man pages of multipath and multipathd provide good details on the usage of the tools.
multipathd has an interactive mode option which can be used for querying and managing
the paths and also to check the configuration details that will be used.
While multipathd is running, one can invoke it with the command line multipathd -k;
multipathd will enter an interactive command-line mode where the user can invoke
different commands. Check the man page for the different commands.
3. Supported Storage Devices
This is the list of devices that have configuration information built in to the multipath
tools. Not being in this list does not mean that the specific device is not supported; it just
means that there is no built-in configuration in the multipath tools.
Some of the devices do need a hardware handler, which needs to be compiled into the kernel.
The device being in this list does not mean that the hardware handler is present in the
kernel. Likewise, it is possible that the hardware handler is present in the kernel but the
device is not in the list of supported built-in devices.
1. Using an alias: By default, the multipathed devices are named with the UID of the device,
which one accesses through /dev/mapper/${uid_name}. When one uses
user_friendly_names, devices will be named mpath0, mpath1, etc., which may meet
one's needs. The user also has the option to define an alias in multipath.conf for each
device.
Syntax is:
multipaths {
    multipath {
        wwid 3600a0b800011a2be00001dfa46cf0620
        alias mydev1
    }
}
2. Persistent device names: The names (uid_names or mpath names or alias names) that
appear in /dev/mapper are persistent across boots, whereas the names dm-0, dm-1, etc. can
change between reboots. So it is advisable to use the device names that appear
under /dev/mapper and avoid using the dm-? names.
3. Restart of tools after changing the multipath.conf file: Once the multipath.conf file is
changed, the multipath tools need to be rerun for those configuration values to become
effective. One has to kill multipathd, run multipath -F, and then restart multipathd
and multipath (a sketch of this sequence follows).
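A sketch of that sequence (RHEL 6 style service commands are assumed in place of killing
the daemon by hand):
# service multipathd stop
# multipath -F
# service multipathd start
# multipath -v2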
4. Using the bindings file in a clustered environment: The bindings file holds the bindings
between the device mapper names and the UIDs of the underlying devices. By
default the file is /var/lib/multipath/bindings; this can be changed with the multipath
command-line option -b. In a clustered environment, this file can be created on
one node and transferred to the others to get the same names (see the sketch below).
Note that the same effect can also be achieved by using aliases and having the
same multipath.conf file on all the nodes of the cluster.
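A sketch of copying the bindings file to another cluster node (node2 is a placeholder
hostname):
# scp /var/lib/multipath/bindings node2:/var/lib/multipath/bindings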
5. Getting the multipath device name corresponding to a SCSI device: If one knows
the name of a SCSI device and wants to get the device mapper name associated
with it, they could use multipath -l /dev/sda, where sda is the SCSI device. On
the other hand, if one knows the device mapper name and wants to know the
underlying device names, they could use the same command with the device
mapper name, i.e., multipath -l mpath0, where mpath0 is the device mapper
name.
6. When using LVM on dm-multipath devices, it is better to turn LVM scanning off on
the underlying SCSI devices. This can be done by changing the filter parameter
in /etc/lvm/lvm.conf to filter = [ "a/dev/mapper/.*/", "r/dev/sd.*/" ] (see the sketch below).
If your root device is also a multipathed LVM device, then make the above change
before you create a new initrd image.
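A sketch of the corresponding /etc/lvm/lvm.conf fragment (the filter line lives in the
devices section of that file):
devices {
    filter = [ "a/dev/mapper/.*/", "r/dev/sd.*/" ]
}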
7. To find out if your device (vendor/product) is supported by the tool by default, do
the following.
o In RHEL:
Make sure that multipathd is running. Then run
# multipathd -k
multipathd> show config
This command will list all the devices that are built in to the tools.
o In SLES:
# multipath -t
This will list all the devices that are built in to the tools.
8. If you have more than 1024 paths, you need to set the configuration
parameter max_fds to a number equal to or greater than the number of paths +
32 (see the sketch below). Otherwise, the multipathd daemon might die with an error (in
/var/log/messages) saying that there are too many open files.
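A sketch of the corresponding setting in the defaults section of /etc/multipath.conf
(2048 is only an example value; size it to your path count + 32):
defaults {
    max_fds 2048
}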
9. When multipath/multipathd starts, you might see message(s) like