Chapter 2. Storage system, connectivity, and file system architecture – IBM Power Systems Infrastructure I/O for SAP Applications

Storage system, connectivity, and file system architecture
This chapter describes SAP applications and their requirements for a file system that is flexible and reliable, and for databases that deliver sufficient performance.
SAP applications on Linux typically run on the Scalable File System (XFS). Possible alternatives are IBM Spectrum® Scale (formerly known as IBM GPFS) and Network File System (NFS). These filer options provide high availability (HA) for the SAP and SAP HANA shared file systems. Some clients also use these filers for the SAP HANA data and log file systems.
This chapter describes the following topics:
Filer infrastructures
PCIe Non-Volatile Memory Express (NVMe) enriched POWER servers
SAN infrastructures
Fibre Channel infrastructure and VIOS options
iSCSI boot disk attachment with VIOS 3.1
Linux I/O
2.1 Filer infrastructures
You can access a filer through Ethernet or InfiniBand.
InfiniBand provides high throughput and low latency by using remote direct memory access (RDMA). Because high-speed Ethernet network cards are now common in data centers, Ethernet attachments are also used in SAP deployments.
Although InfiniBand cannot be virtualized, Ethernet can be virtualized (when it is not using RDMA over Converged Ethernet (RoCE)). For more information about Ethernet virtualization, see Chapter 1, “Ethernet architectures for SAP workloads” on page 1.
Filers provide three file systems that are relevant to SAP landscapes:
NFS (for example, NetApp)
IBM Spectrum Scale (for example, IBM Elastic Storage® Server (IBM ESS))
Hadoop file system (for example, IBM ESS)
SAP NetWeaver can use filer-based shared file systems for /sapmnt and /usr/sap/trans. Especially in HA scenarios, filer-based shared file systems have higher availability with less operational cost compared to the traditional NFS cross-mount or software options.
SAP HANA can be deployed on IBM ESS or any other SAP HANA Tailored Data Center Integration (TDI) certified filer as a persistent back end. For more information about setting up and configuring this option, see the SAP HANA TDI documentation.
Regarding filers, the boot devices need special treatment. One option is to use internet Small Computer Systems Interface (iSCSI), which is outlined in 2.5, “iSCSI boot disk attachment with VIOS 3.1” on page 30.
2.2 PCIe Non-Volatile Memory Express (NVMe) enriched POWER servers
Since IBM POWER8® processors were released, PCIe-attached NVMe cards have been used in the field to accelerate predominantly read I/O. This section focuses on highlights only and does not cover all the options where NVMe drives play a role.
For more information about the available options for your Power Systems server model, see the e-config tool, or contact your IBM representative or IBM Business Partner. Pay attention to the I/O characteristics of the NVMe cards because they vary widely in endurance (which determines whether a card is suitable for storing the data persistence of a database), speed, and capacity. A SAN Volume Controller outperforms low-end cards and provides operational benefits that an internal solution cannot. Using internal disks in most cases eliminates Live Partition Mobility (LPM) capability because the data is bound to a single Power Systems server, unless you use the disks as a read cache in a Shared Storage Pool (SSP) Virtual I/O Server (VIOS) deployment.
SAP-related documentation about how to configure NVMe along with other SAP HANA documentation can be found at SAP HANA on IBM Power Systems and IBM System Storage - Guides.
2.2.1 NVMe use cases and technology
Although NVMe devices with sufficient endurance can hold persistent data, the dominant use case is to use them as a fast cache.
Here are generic use cases of NVMe1:
VIOS boot image on NVMe: Customers can use an NVMe device to install and boot a VIOS image. Transferring VIOS boot images to an NVMe device can be done by using a Logical Volume Manager (LVM) mirror. Customers can add an NVMe mirror copy to rootvg and remove the old copy after a sync is done.
Logical volume (LV)-backed virtual SCSI (vSCSI) device: Customers can install a NovaLink boot image on the device (an LV-backed device can be used to boot a NovaLink partition in a greenfield deployment). A client logical partition (LPAR) can use the LV-backed device, which is on an NVMe volume group (VG), to host the read cache.
Read cache device on VIOS: An NVMe device is perfect for the local read cache on VIOS. It can be used for SSP disk caching where data that is present in the SSP is cached on the NVMe disk. LPM with SSP where NVMe devices are used for caching is possible.
No limitation on SSP operations: When you enable SSP caching that uses an NVMe disk, you can perform any kind of SSP operations, such as adding, removing, or replacing a disk to or from SSP; creating, modifying, or deleting a tier in SSP; and creating and deleting a mirror in SSP.
No dependency on type of disk for client: You can create a VG that spreads across NVMe and other types of devices, and you can create an LV that spreads across NVMe and other devices. For the client, the LV appears as a normal vSCSI disk even though the LV is spread between the NVMe disk and the other disks.
Backup and restore of a VIOS configuration: You can create backups of VIOS instances with NVMe disks, install a new VIOS build, and restore the configuration on the new VIOS build.
Upgrade support from previous VIOS levels: You can upgrade VIOS instances from an older level to a new level and start using the NVMe device at the new level directly.
Db2
Depending on the workload, accelerating the temporary file system of IBM Db2® with NVMe can speed up processing by up to 30%.
SAP HANA
NVMe cards for SAP HANA can be used as an internal disk option and as an accelerator for all read operations. Read acceleration has value when you restart the database, when you activate a standby node in an SAP HANA Auto Host Failover scenario, and when you use data tiering options.
 
Notes:
Because the NVMe adapters are attached locally to the Power Systems servers, LPM is no longer possible unless NVMe is used as cache acceleration in SSP configurations.
NVMe performance tuning is different from SAN-based tuning. For example, PCIe NVMe does not perform well when using RAID 5 or RAID 6 configurations compared to RAID 1 or RAID 10. When using NVMe mostly as cache, no RAID protection is required, which boosts bandwidth and performance.
Performance scales with the number of physical cards that are used, not with the number of modules on a single card. For up to four cards, the performance scales linearly. No testing was performed with more cards, but performance increases further if the file system is set up in a striped manner and a sufficient workload is run against it.
2.3 SAN infrastructures
This section describes best practices for the deployment of SAP on SAN.
2.3.1 SAN use cases and architectures best practices
The dominant type of attachment in SAP landscapes is Fibre Channel (FC) to connect to storage devices, as described in 2.4, “Fibre Channel infrastructure and VIOS options” on page 18.
Today, most SAN designs use a variant of what is called a core-to-edge SAN design, as shown in Figure 2-1. In this design, the SAN elements (typically SAN switches) are designated as either core or edge switches. The edge switches connect servers, and the core switches connect the storage devices and the edge switches.
Figure 2-1 Highly redundant core-to-edge SAN architecture
For HA, the setup is divided into two fabrics that act as a failover configuration. All connections from the servers to the edge switches are handled by both fabrics. The storage systems are also connected to both fabrics.
This setup ensures that there are multiple independent paths from the servers to the storage systems. The maximum number of paths depends on the storage system, but must not be more than 16 for one server.
The number of allowed connections differs between storage systems. Here are two examples, both of which are sufficient to run SAP workloads. The configurations are listed so that you can plan correctly when ordering switch and server hardware:
IBM XIV: Up to 12 FC connections to the core switches
IBM SAN Volume Controller: Up to 16 FC connections to the core switches
SAN zoning is used to keep the servers isolated from each other and manage which server can reach which storage volumes on the storage systems.
There are multiple ways to implement SAN zoning, but the best practice is to zone by initiator port, that is, to create zones for individual initiator ports (typically a single server port) with one or more target (storage system) ports2.
To configure the zone, use only the worldwide port name (WWPN), not the worldwide node name (WWNN). When a WWNN is used in place of a WWPN for a device, switches interpret the WWNN as designating all associated ports for the device. Using the WWNN in a zone can cause multipathing issues where there are too many paths between server and storage.
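As an illustrative sketch, single-initiator zoning on a Brocade-based switch might look as follows. The zone name, configuration name, and the initiator WWPN (the virtual WWPN of an LPAR) are hypothetical, the target WWPN is a storage port, and the exact syntax depends on your switch vendor and firmware level:
zonecreate "z_lpar1_fcs0", "c0:50:76:00:aa:bb:00:02; 50:05:07:68:0c:32:6d:b2"
cfgadd "fabric_a_cfg", "z_lpar1_fcs0"
cfgsave
cfgenable "fabric_a_cfg"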
2.4 Fibre Channel infrastructure and VIOS options
There are different connection types that are available for an FC connection that uses VIOS:
vSCSI
N_Port ID Virtualization (NPIV)
The recommended connection type is NPIV because it supports LPM without manual intervention, lowers latency, and reduces core consumption at the VIOS level. Some SAP solutions, such as SAP HANA Auto Host Failover, and some third-party products require NPIV.
2.4.1 vSCSI
vSCSI is based on a client/server relationship. The VIOS owns the physical resources and acts as the server, or in SCSI terms the target device. The client LPARs access the vSCSI backing storage devices that are provided by the VIOS as clients.
Interaction between client and server
The virtual I/O adapters are configured by using a Hardware Management Console (HMC) or the Integrated Virtualization Manager on smaller systems. The interaction between a VIOS and a Linux client partition is enabled when both the vSCSI server adapter that is configured in the VIOS' partition profile and the vSCSI client adapter that is configured in the client partition's profile have mapped slot numbers, and both the VIOS and client operating system (OS) recognize their virtual adapter.
Dynamically added vSCSI adapters are recognized on the VIOS after running the cfgdev command. Linux OSs automatically recognize dynamically added vSCSI adapters.
After the interaction between the vSCSI server and the vSCSI client adapters is enabled, mapping storage resources from the VIOS to the client partition is needed. The client partition configures and uses the storage resources when it starts or when it is reconfigured at run time.
The process runs as follows:
1. The HMC maps the interaction between the vSCSI server and client adapters.
2. The storage resources are mapped in the VIOS (see the command sketch after this list).
3. The client partition recognizes the newly mapped storage dynamically.
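The following minimal sketch shows the mapping step on the VIOS command line. The backing device hdisk5, the vSCSI server adapter vhost0, and the virtual target device name are illustrative only:
$ cfgdev
$ mkvdev -vdev hdisk5 -vadapter vhost0 -dev vtscsi_lpar1
$ lsmap -vadapter vhost0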
Redundancy of vSCSI by using VIOS
Figure 2-2 shows one possible configuration in a PowerVM environment that shows the redundancy of vSCSI by using multipath I/O (MPIO) at client partitions. The diagram shows a dual-VIOS environment where the client partition has two vSCSI client adapters and each of them is mapped to two different vSCSI server adapters on different VIOSs. Each VIOS maps the same physical volume (PV) to the vSCSI server adapter on them.
The client partition sees the same PV (hdisk in Figure 2-2), which is mapped from two VIOSs by using vSCSI. To achieve this mapping, the same storage must be zoned to the VIOSs from the storage subsystem. This configuration also has redundancy at the VIOS physical
FC adapter.
Figure 2-2 Redundancy of vSCSI using dual virtual I/O servers
2.4.2 N_Port ID Virtualization
NPIV is an industry-standard technology that allows an NPIV-capable FC adapter to be configured with multiple virtual WWPNs, as shown in Figure 2-3. This technology is also called virtual FC. Similar to the vSCSI function, virtual FC is another way of securely sharing a physical FC adapter among multiple VIOS client partitions.
From an architectural perspective, the key difference with virtual FC compared to vSCSI is that the VIOS does not act as a SCSI emulator to its client partitions, but acts as a direct FC pass-through for the FC Protocol I/O traffic through the IBM POWER Hypervisor. Instead of generic SCSI devices presented to the client partitions with vSCSI, with virtual FC the client partitions are presented with native access to the physical SCSI target devices of SAN disk or tape storage systems.
The benefit of virtual FC is that the physical target device characteristics like vendor or model information remain fully visible to the VIOS client partition so that device drivers like multipathing software, middleware such as copy services, or storage management applications that rely on the physical device characteristics do not need to be changed.
Figure 2-3 NPIV architecture
Redundancy of virtual Fibre Channel
A host bus adapter and VIOS redundancy configuration provides a more advanced level of redundancy for the virtual I/O client partition, as shown in Figure 2-4.
Figure 2-4 Redundancy of virtual Fibre Channel
2.4.3 Example setup of an NPIV virtual Fibre Channel configuration
This section describes how to configure SAN storage devices by using virtual FC for a Linux client of the VIOS. An IBM 2498-F48 SAN switch, an IBM Power System E980 server, and an IBM Spectrum Virtualize storage system were used in the lab environment to describe the setup of the virtual FC environment.
Complete the following steps:
1. Use a dedicated virtual FC server adapter (slot P1-C2-C1) in the VIOS partition ish400v1, as shown in Figure 2-5. Each client partition accesses physical storage through its virtual FC adapter, which must be configured in the profiles of the VIOS and the client.
Figure 2-5 LPAR properties on HMC
2. Create the virtual FC server adapter in the VIOS partition:
a. On the HMC, select the managed server to be configured by clicking All Systems, and then select <servername> (ish400v1).
b. Select the VIOS partition on which the virtual FC server adapter will be configured. Then, select Actions → Profiles → Manage Profiles, as shown in Figure 2-6 on page 23.
Figure 2-6 VIOS profile on the HMC
c. To create a virtual FC server adapter, select the profile to use and open it by using the Edit action. Then, click the Virtual Adapters tab and select Actions → Create Virtual Adapter → Fibre Channel Adapter, as shown in Figure 2-7.
Figure 2-7 Creating a virtual Fibre Channel adapter on the HMC
d. Enter the virtual FC adapter number for the virtual FC server adapter. Then, select the client partition to which the adapter can be assigned and enter the client adapter ID, as shown in Figure 2-8. Click OK.
Figure 2-8 Adding the virtual Fibre Channel number on HMC
e. Click OK in the Create Virtual FC Adapters dialog to save the changes.
f. Update the partition profile of the VIOS partition by selecting Profiles → Save Current Configuration, as shown in Figure 2-9 to save the changes.
Figure 2-9 Save Current Configuration menu on the HMC
3. Create a virtual FC client adapter in the virtual I/O client partition:
a. Select the virtual I/O client partition on which the virtual FC client adapter will be configured. Change the partition profile by selecting Profiles → Manage Profiles, as shown in Figure 2-10.
Figure 2-10 Selecting the virtual I/O client partition for the virtual Fibre Channel
b. Click the profile name to edit it and select the Virtual Adapters tab in the Logical Partition Profile Properties dialog. Then, to create a virtual FC client adapter, select Actions → Create Virtual Adapter → Fibre Channel Adapter, as shown in Figure 2-11.
Figure 2-11 Assigning the virtual port on the HMC
c. Enter the virtual slot number for the virtual FC client adapter. Then, select the VIOS partition to which the adapter can be assigned and enter the server adapter ID, as shown in Figure 2-12. Click OK.
Figure 2-12 Adding a server adapter ID
d. Click OK and then Close in the Managed Profiles dialog to save the changes.
4. Log in to the VIOS partition as user padmin.
5. Run the cfgdev command to configure the virtual FC server adapters.
6. Run the lsdev -dev vfchost* command to list all the available virtual FC server adapters in the VIOS partition before mapping to a physical adapter, as shown in Example 2-1.
Example 2-1 The lsdev -dev vfchost* command on the Virtual I/O Server
$ lsdev -dev vfchost*
name status description
vfchost0 Available Virtual FC Server Adapter
vfchost1 Available Virtual FC Server Adapter
7. The lsdev -dev fcs* command lists all available physical FC server adapters in the VIOS partition as shown in Example 2-2.
Example 2-2 The lsdev -dev fcs* command on the Virtual I/O Server
$ lsdev -dev fcs*
name status description
fcs0 Available PCIe3 4-Port 16Gb FC Adapter (df1000e314101406)
fcs1 Available PCIe3 4-Port 16Gb FC Adapter (df1000e314101406)
fcs2 Available PCIe3 4-Port 16Gb FC Adapter (df1000e314101406)
fcs3 Available PCIe3 4-Port 16Gb FC Adapter (df1000e314101406)
8. Run the lsnports command to check the virtual FC adapter readiness of the adapter and the SAN switch. Example 2-3 shows that the fabric attribute for the physical FC adapter in slot C1 is set to 1, which means that the adapter and the SAN switch are NPIV ready. If the value equals 0, then the adapter or SAN switch is not NPIV ready, and you must check the SAN switch configuration.
Example 2-3 The lsnports command on the Virtual I/O Server
$ lsnports
name physloc fabric tports aports swwpns awwpns
fcs0 U78D5.ND1.CSS2010-P1-C2-C1-T1 1 64 63 3072 3069
fcs1 U78D5.ND1.CSS2010-P1-C2-C1-T2 1 64 63 3072 3069
9. Before mapping the virtual FC adapter to a physical adapter, obtain the vfchost name of the virtual adapter that you created from the output of Example 2-1 and the fcs name of the physical FC adapter from the output of Example 2-2.
10. To map the virtual FC server adapter vfchost0 to the physical FC adapter fcs0, use the vfcmap command, as shown in Example 2-4.
Example 2-4 The vfcmap command with vfchost0 and fcs0
$ vfcmap -vadapter vfchost0 -fcp fcs0
vfchost0 changed
11. To list the mappings, use the lsmap -all -npiv command, as shown in Example 2-5.
Example 2-5 The lsmap -all -npiv command
$ lsmap -all -npiv
Name Physloc ClntID ClntName ClntOS
---------- ------------------------ --- ------ ------------- -------
vfchost0 U9080.M9S.21DCD17-V1-C206 6 lsh40006 Linux
 
Status:LOGGED_IN
FC name:fcs0 FC loc code:U78D5.ND1.CSS2010-P1-C2-C1-T1
Ports logged in:3
Flags:a<LOGGED_IN,STRIP_MERGE>
VFC client name:host5 VFC client DRC:U9080.M9S.21DCD17-V6-C5
 
Name Physloc ClntID ClntName ClntOS
---------- ----------------------------- ------ ------------ -------
vfchost1 U9080.M9S.21DCD17-V1-C106 6 lsh40006 Linux
 
Status:LOGGED_IN
FC name:fcs1 FC loc code:U78D5.ND1.CSS2010-P1-C2-C1-T2
Ports logged in:3
Flags:a<LOGGED_IN,STRIP_MERGE>
VFC client name:host4 VFC client DRC:U9080.M9S.21DCD17-V6-C4
12. After you create the virtual FC server adapters in the VIOS partition and in the virtual I/O client partition, set the correct zoning in the SAN switch:
a. Obtain the information about the WWPN of the virtual FC client adapter that was created in the virtual I/O client partition.
b. Select the appropriate virtual I/O client partition, and then from the task menu, click Properties. Expand the Virtual Adapters tab, select the virtual FC client adapter, and then select Actions → Properties to list the properties of the virtual FC client adapter, as shown in Figure 2-13.
Figure 2-13 Zone the LPAR and virtual adapter
c. Figure 2-14 on page 29 shows the properties of the virtual FC client adapter. Here you can get the virtual WWPN that is required for the zoning.
Figure 2-14 Getting the virtual WWPN that is required for zoning
d. Log on to your SAN switches and create a zone for the virtual WWPN and the corresponding physical storage ports, or customize an existing one.
Only the first listed WWPN is used for the running LPAR and is considered for the SAN zoning and storage configuration. The second listed WWPN is inactive and used for LPM when the partition is moved to another system. After the move to another system, the second WWPN is the active one and the first WWPN is inactive.
e. After completing the SAN switch zoning, create the storage configuration on your SAN storage system by mapping the LUNs to a host connection that is created with the virtual WWPN of the virtual FC client adapter.
f. After completing the SAN storage configuration, the volumes that are configured in the virtual FC client adapter are now ready for use by the VIOS client partition.
From the Linux client perspective, a virtual FC adapter looks like a native physical FC device. No special requirement or configuration is needed to set up virtual FC on Linux.
After the ibmvfc driver is loaded and a virtual FC adapter is mapped to a physical FC adapter on the VIOS, the FC port automatically shows up in the Linux partition. You can check whether the ibmvfc driver is loaded on the system by running the lsmod command, as shown in Example 2-6.
Example 2-6 Checking for the ibmvfc driver
[root@lsh40006: ~]# lsmod | grep ibmvfc
ibmvfc 79236 288
scsi_transport_fc 68048 1 ibmvfc
scsi_mod 293836 12 scsi_dh_emc,st,scsi_transport_srp,sd_mod,scsi_dh_alua,scsi_dh_rdac,ibmvfc,sr_mod,dm_multipath,sg,ibmvscsi,scsi_transport_fc
You can also check the devices by looking at the kernel log in the /var/log/messages file or by running the dmesg command, as shown in Example 2-7.
Example 2-7 Checking /var/log/messages for the loaded driver
[root@lsh40006: ~]# dmesg | grep vfc
[ 1.932844] ibmvfc: externally supported module, setting X kernel taint flag.
[ 1.932858] ibmvfc: IBM Virtual Fibre Channel Driver version: 1.0.11 (April 12, 2013)
[ 2.036692] ibmvfc 30000002: Partner initialization complete
[ 2.040987] ibmvfc 30000002: Host partition: ish400v2, device: vfchost1 U78D5.ND2.CSS2235-P1-C2-C1-T2 U9080.M9S.21DCD17-V2-C106 max sectors 8192
[ 2.125175] ibmvfc 30000003: Partner initialization complete
[ 2.129065] ibmvfc 30000003: Host partition: ish400v2, device: vfchost0 U78D5.ND2.CSS2235-P1-C2-C1-T1 U9080.M9S.21DCD17-V2-C206 max sectors 8192
[ 2.195108] ibmvfc 30000004: Partner initialization complete
[ 2.198965] ibmvfc 30000004: Host partition: ish400v1, device: vfchost1 U78D5.ND1.CSS2010-P1-C2-C1-T2 U9080.M9S.21DCD17-V1-C106 max sectors 8192
[ 2.255039] ibmvfc 30000005: Partner initialization complete
[ 2.258851] ibmvfc 30000005: Host partition: ish400v1, device: vfchost0 U78D5.ND1.CSS2010-P1-C2-C1-T1 U9080.M9S.21DCD17-V1-C206 max sectors 8192
To list the virtual FC device, run the lsscsi command, as shown in Example 2-8.
Example 2-8 Listing the Fibre Channel devices
[root@lsh40006: ~]# lsscsi -H -v | grep fc
[2] ibmvfc
[3] ibmvfc
[4] ibmvfc
[5] ibmvfc
You can perform virtual FC tracing on Linux through the file system attributes in the /sys/class directories. The files that contain the device attributes are useful for checking detailed information about a virtual device, and they can also be used for troubleshooting (see the example after this list). These attribute files can be accessed in the following directories:
/sys/class/fc_host/
/sys/class/fc_remote_ports/
/sys/class/scsi_host/
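For example, you can display the WWPN and the link state of a virtual FC host adapter as follows (the host number varies by system; host5 corresponds to the VFC client name that is shown in Example 2-5):
cat /sys/class/fc_host/host5/port_name
cat /sys/class/fc_host/host5/port_state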
2.5 iSCSI boot disk attachment with VIOS 3.1
For many years, FC-attached storage has been the data transmission technology of choice. It provides high reliability, high throughput, and low-latency storage access at moderate costs.
iSCSI provides block-level access to storage devices by carrying SCSI commands over an Internet Protocol (IP) network. iSCSI facilitates data transfers over the internet by using TCP, a reliable transport mechanism that runs over either IPv4 or IPv6. TCP also makes it possible to manage storage over long distances.
Compared to FC-attached storage, iSCSI storage systems can be a cheaper option. Regarding infrastructure costs, iSCSI is less expensive because all the existing Ethernet infrastructure (network switches, host adapters, and network interface cards (NICs)) can be used with host-server-based iSCSI initiator software without needing extra FC adapters and SAN directors or switches. However, this situation is not a fair comparison because of performance and other considerations.
Some organizations are turning to iSCSI storage systems because they consider it a less expensive data transmission option because no extra FC components must be procured and operated. In the case of moderate performance and throughput requirements, for example, for system disk access or file services, iSCSI-attached storage can be a viable option.
With VIOS 3.1, iSCSI support was added to the VIOS, which you can use to export the iSCSI disks to client LPARs as virtual disks (vSCSI disks). This support is available in VIOS 3.1 and requires FW 860.20 or later. VIOS 3.1 also enables MPIO support for the iSCSI initiator so that you can configure and create multiple paths to an iSCSI disk (Figure 2-15).
Figure 2-15 iSCSI boot architecture
Currently, the iSCSI disk support for VIOS has the following limitations:
There is no support for booting the VIOS from an iSCSI disk. Instead, internal disks can be used for the VIOS because a VIOS is always hard-bound to a server and needs no LPM capability.
Flat file-based discovery policy is not supported. The recommendation is to use the ODM discovery policy.
iSCSI disk-based LV backed devices are not supported. SSPs that use iSCSI disks as either repositories or shared pool disks are not supported. The iSCSI disks or iSCSI-based LVs or VGs cannot be used as paging devices for Active Memory Sharing (AMS) or remote restart.
If the backing device is an iSCSI disk, the client_reserve and mirrored attributes are not supported for virtual target devices.
Several steps must be done to configure iSCSI and the iSCSI storage system on the VIOS, as shown in Figure 2-16.
Figure 2-16 Overview of the iSCSI configuration flow
First, define all the configuration parameters, that is, the iSCSI Qualified Names (IQNs) and the IP addresses:
1. The default IQN for the iSCSI software initiator on the VIOS does not match all the IQN standards. A unique IQN must be defined for the iSCSI software initiator.
2. On each VIOS, configure two IP addresses (in different IP subnets) on different network adapters, forming a private storage network on high-bandwidth network adapters.
3. The IP addresses of the storage system must also be defined.
Then, the configuration actions on the storage system and the VIOS can start.
On the VIOS, complete the following steps:
1. Create an iSCSI protocol device on each VIOS.
The iSCSI initiator name and the discovery policy are defined in the iSCSI protocol device. The discovery methodology must be set to odm. The information about the iSCSI targets is then stored in the Object Data Manager (ODM) objects.
2. Add all iSCSI targets to the ODM.
On the storage system, complete the following steps:
1. Define all the IQNs of the VIOSs.
2. Create all the LUNs and map them to the VIOSs.
After the LUNs are created, the VIOS team should complete the following steps:
1. Wait for LUN creation.
2. Run the configuration manager to discover the new disk devices.
3. Map the new disk devices to the client LPARs by using the vSCSI mappings.
Actions on the storage system depend on the vendor, and those actions are not covered here. The required steps on the VIOS are described in more detail in the next section.
2.5.1 Configuring iSCSI on the VIOS
This section describes the iSCSI configuration on the VIOS.
Defining a unique iSCSI qualified name for each VIOS
For every iSCSI node, a node name that uses the IQN format must be set. The IQN-type designator is a logical name and has the following format:
iqn.<yyyy-mm>.<naming-authority>:<unique name>
<yyyy-mm> The year and month when the naming authority was established.
<naming-authority> The reverse of the internet domain name of the naming authority.
<unique name> A unique identifier for the iSCSI VIOS. The naming authority must ensure that any names that are assigned following the colon are unique.
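For example, a VIOS in the hypothetical domain example.com, with a naming authority that was established in March 2021, might use the following IQN:
iqn.2021-03.com.example:vios1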
Identifying IP addresses for storage connectivity for each VIOS
To get optimal performance from the iSCSI disks, establish the following items:
A separate private network to access the iSCSI storage.
High-speed network adapters and switches (at least 10 Gb Ethernet technology).
A redundant network topology on the storage system and on the VIOS. Two IP addresses on different network adapters for each VIOS, connecting to two IP addresses on different storage cluster nodes.
Configuring the iSCSI protocol device on each VIOS
Log in to the VIOS as an admin user, and set the initiator name and the discovery policy to odm for the iSCSI protocol device by running the following command:
chdev -l 'iscsi0' -a initiator_name='<IQN of the VIOS>' -a disc_policy='odm'
Adding all the iSCSI target devices to the ODM on each VIOS
Log in to the VIOS as an admin user, and define the target devices in the ODM by running the following commands:
# mkiscsi -l iscsi0 -g static -t <Storage IQN #1> -n 3260 -i <Storage IP Addr #1>
# mkiscsi -l iscsi0 -g static -t <Storage IQN #2> -n 3260 -i <Storage IP Addr #2>
Running the configuration manager to configure the disk devices on each VIOS
Log in to the VIOS as an admin user, and run the configuration manager (cfgdev).
As a result, all the iSCSI LUNs are discovered in the VIOS as hdisks, and they can be mapped through vSCSI to the client LPARs, as shown in Figure 2-17.
Figure 2-17 Diagram of the iSCSI configuration on the VIOS
2.6 Linux I/O
The I/O stack ranges from the physical device on the storage system up to the file system in the OS. This section focuses only on the OS portion. It does not describe the locations of the physical adapters in regard to LPARs, VIOSs, or other components. The focus is on multipathing. The objectives are to eliminate single points of failure and to add robustness to I/O scheduling, interrupt request (IRQ) balancing, the LVM, and the file systems that are relevant to SAP applications.
2.6.1 Multipathing
The use cases for MPIO with VIOS on PowerVM are as follows:
Reduced planned downtime by using rolling maintenance
Reduced unplanned downtime by eliminating all single points of failure with less hardware
Improved performance
On AIX, MPIO stacks are the dominant deployment option and work independently of the application type. On Linux, MPIO is not widely used yet.
Regarding multipathing on Linux, consider the following important points:
VIOS provides different options to virtualize I/O. Becoming familiar with the pros and cons of each option is essential to achieve the result that you want. The recommended default deployment is to use NPIV in dual-VIOS load-sharing setups. There are no limitations or rules from SAP for non-SAP HANA file systems, but using older technologies imposes higher latency, which is unwanted for database workloads.
Differentiate between boot and SAP file systems. For SAP file systems, a good entry point is the default multipath settings that are provided by the OS vendor. For all IBM Storage Systems, the boot device needs special care.
If performance is OK, do not start tuning because what helps in one scenario can cause severe issues in others.
As a best practice, start with a dual-VIOS concept with load sharing that uses NPIV. NPIV passes all I/O requests directly through to the LPAR instead of creating a virtual device inside the VIOS and then adding that virtual device to the LPAR, which results in a copy. This configuration is easier because you can add LUNs from the storage directly to the LPAR by using the host attachment function. Also, the leaner architecture enables lower latency and less processing impact in the virtualization layer because the VIOS has less work to do and uses fewer cores.
The Linux MPIO driver enables multiple paths to a single device in FC environments for SAN storage. To configure it, complete the following steps.
 
Note: SUSE and Red Hat publish comprehensive documentation for each major release. For example, you can find the documentation for SUSE Linux Enterprise Server 15 SP1 at SUSE Documentation.
1. Enable the daemon for MPIO (if this action was not done during installation).
This task must be performed for all partitions. Start the multipath daemon multipathd at boot time to automatically enable the multipath services. To enable the daemon, run the following command:
systemctl enable multipathd
If the multipath services are enabled (or disabled), rebuild the initrd afterward by running the following command:
dracut --force --add multipath
2. For IBM System Storage™ servers, the default multipath settings work for boot and SAP HANA file systems for standard setups. Decide whether a tailored multipath configuration is needed. Candidates for a tailored multipath.conf file are:
 – Cluster managers demanding specific timeout settings.
 – Storage vendors not providing pretested defaults.
 – Administration tasks that require different timeout settings.
Check whether the /etc/multipath.conf file exists, for example, by running the following command:
ls -l /etc/multipath.conf
If the file does not exist, it can be created by running the following command:
multipath -T > /etc/multipath.conf
Check whether multipath.conf was created with the defaults for the storage back end.
Here is an excerpt from a sample multipath.conf file for SAN Volume Controller:
device {
        vendor "IBM"
        product "2145"
        ...
}
 
Note: When you change multipath versions, you must maintain these manual settings.
As the multipath configuration is not consistent between Linux OS versions, it is a best practice to conduct a series of tests to ensure that the timeout settings match your
specific environment.
 
Note: The configuration of the multipath.conf file changes between versions. Check your setting after each service pack update.
Here are the known service pack or kernel updates where the defaults changed:
 – SUSE Linux Enterprise Server 12 SP2 to SUSE Linux Enterprise Server 12 SP3.
 – Linux kernel 2.6.31 to any newer kernel.
3. Treat special cases.
When using cluster software, you might need to change the defaults to what is described in the Linux distribution guides. For SUSE, these guides are published for each release and sometimes for selected service packs. The one for SUSE Linux Enterprise Server 15 SP1 can be found at SUSE Documentation.
Linux supports different device name types: worldwide identifier (WWID), user-friendly names, and aliases. For SAP landscapes, either use the WWID or create an alias for each WWID. It is not recommended to rely on UUIDs for SAP landscapes.
The server is ready to accept the LUNs for SAP file systems.
4. Verify the bootlist.
Verify the bootlist for multipath boot devices and the LVM filters. Most outages occur due to not checking the boot devices to validate whether they are configured correctly for multipath. Losing the disks for the OS results in losing the SAP application too.
5. Make operational decisions.
Understand the timing of planned and unplanned events, and adjust the timeout and retry settings as needed. Planning maintenance is essential because of the timeout values: the speed at which the maintenance tasks are performed defines the difference between an outage and successful maintenance. Covering unplanned outages of certain types requires understanding their duration and adjusting the timing and retry settings in the multipath.conf file.
2.6.2 Sample multipath configuration
Here are configuration details for our sample multipath configuration:
SUSE Linux Enterprise Server 12 SP3, kernel 4.4.162-94.72.
Update powerpc-utils to at least SUSE Linux Enterprise Server 12-SP3 (src) powerpc-utils-1.3.3-7.6.2.
Update to the latest multipath-tools rpm.
VIOS 3.1 using NPIV.
These details assume that you are using IBM Spectrum Virtualize.
Modifying the filter in /etc/lvm/lvm.conf
When you use an LV on active/passive multipath arrays (as opposed to active/enabled multipath arrays, such as IBM Spectrum Virtualize), the underlying devices must be excluded from LVM scans. To do this task, configure filters in /etc/lvm/lvm.conf by completing the following steps:
1. Look at the device entries in which WWID patterns occur. In most cases, you find a pattern matching /dev/mapper/360 or another 3-digit number. If such a pattern occurs, adjust the filter as described in step 2. Otherwise, use the filter that is described in Troubleshooting boot issues (multipath with lvm).
2. Between SUSE Linux Enterprise Server 12 SP2 and SP3, changes were made to /etc/lvm/lvm.conf regarding how multipath_component_detection = 1 is handled. To address this change, use the following filters:
filter = [ "a|/dev/mapper/360.*|", "r/.*/"]
As most disks with WWIDs start with 360, this filter helps for almost all systems that use FC-attached storage on SUSE Linux Enterprise Server 12 SP3 and later.
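After you adjust the filter, a quick check (the exact output depends on your system) is to run the following command and confirm that only /dev/mapper devices, and no single-path /dev/sd* devices, are reported as PVs:
pvs -o pv_name,vg_name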
Adjusting multipath.conf for LUNs that are used for SAP application file systems
 
Note: Adjust the file only if you must. Otherwise, use the default settings for the SAP-related LUNs, and ensure that the boot devices are multipath-capable.
For a sample IBM Spectrum Virtualize storage subsystem, the default configuration must include the following settings:
path_grouping_policy     group_by_prio
prio                     alua
rr_weight                uniform              #for HDD
                         priorities           #for Flash/SSD
path_selector            "service-time 0"     #performance optimization
There are multiple parameters in /etc/multipath.conf that affect error detection and failover times. These parameters must be set correctly to ensure that the OS and the applications on the LPAR are not impacted by the failure of a single path or during rolling maintenance activities on the VIOS or storage subsystem.
Number of I/Os that are routed to a single path before switching to the next one
For systems running kernels older than 2.6.31, use the following string in your multipath.conf file:
rr_min_io               1000
Otherwise, use the following string instead:
rr_min_io_rq          16
rr_min_io specifies the number of I/Os that are routed to one path before switching to the next path in the same path group (for systems running kernels older than 2.6.31). Newer systems use rr_min_io_rq. Larger values (rr_min_io_rq > 32) can improve throughput, but the resulting deeper queues have a bigger impact on failure recovery.
In addition to these changes, you must adjust the queue depth of the devices by running a command, for example:
echo 64 > /sys/bus/scsi/devices/<device>/queue_depth
Also, increase /sys/block/<device>/queue/nr_requests if the default (128) results in blocked I/O submissions. This action indirectly helps to optimize the blocking inside SAP HANA.
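For example, to raise the request queue of a specific block device to 512 (an illustrative value; tune it to your workload and validate the effect under load), run the following command:
echo 512 > /sys/block/<device>/queue/nr_requests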
 
For SAP applications: Go with the default of rr_min_io_rq = 16 unless you value performance higher than recovery aspects.
For high availability setups
The parameter no_path_retry specifies the number of retries until queuing for that path is disabled. The fail (or 0) value prevents queuing and results in immediate failure. The default for no_path_retry in SUSE Linux Enterprise Server 12 is fail, but in SUSE Linux Enterprise Server 11, it is undefined.
Here is the string that is used to set no_path_retry to fail:
no_path_retry       "fail"
 
For SAP applications: For HA clusters, including SAP HANA Auto Host Failover, the typical setting is no_path_retry = "fail" so that the takeover is not hindered.
Check with your HA vendor for up-to-date information and best practices if required for
disk monitoring.
Timeout tuning for special purposes
Changing the timeout tuning helps in one situation, but can make another situation worse. When changing the timeout tuning, you must have the skills and an understanding of the dependencies, for example, of the no_path_retry parameter.
If no_path_retry is set to an integer greater than zero and is not set to queue, there are three different factors in a simplified model that define the timeout in the Linux kernel before an inaccessible volume leads to an I/O error. These factors are summarized in the following equation:
Time until timeout = number_of_active_paths * polling_interval * no_path_retry

where number_of_active_paths is the number of recently seen active paths.
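As an illustrative calculation, assume eight recently seen active paths, a polling_interval of 5 seconds, and no_path_retry set to 4:
Time until timeout = 8 * 5 * 4 = 160 seconds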
The number of recently seen active paths depends on the number of paths that are set by the SAN and zoning configuration, and the MPIO parameter dev_loss_tmo:
dev_loss_tmo         typically between 120-300 for SAP applications
If a failure is detected on a multipath link, the SCSI layer waits for a timeout of dev_loss_tmo seconds before the multipath link is marked as failed. When the path is marked as failed, any I/O on that failed path is also marked as failed.
When a link problem is detected, the SCSI layer waits for a timeout of fast_io_fail_tmo seconds before the I/O to devices on that path is marked as failed. If I/O is in a blocked queue, the I/O does not fail until the dev_loss_tmo time elapses and the queue is unblocked. The value must be smaller than the value of dev_loss_tmo.
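As a minimal sketch, these two timeouts might appear in the defaults section of multipath.conf as follows. The values are assumptions only and must be validated against your SAN, storage, and cluster requirements:
defaults {
        fast_io_fail_tmo    5
        dev_loss_tmo        120
}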
If one or more paths fail and more than dev_loss_tmo seconds pass before another path failure event, the number of recently seen active paths is reduced first. The LUN is still accessible as long as active paths are available, but because the number of recently seen active paths is now smaller, the time until an I/O error is shown after a subsequent path failure event is also reduced.
The parameter polling_interval is the interval between two path checks in seconds. For properly functioning paths, the interval between checks gradually increases to max_polling_interval.
 
For SAP applications: Using the defaults must be the starting point. Changes need special considerations and must be made based only on expert knowledge.
User-friendly names
When using aliases instead of WWIDs, set user_friendly_names to yes and add the list of WWID aliases to multipath.conf. Typically, this parameter is added to the defaults section and not to the storage-specific portion.
Here is an example of using the user_friendly_names parameter:
user_friendly_names "yes"
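A minimal sketch of a multipaths section that assigns an alias to a WWID follows. The WWID is taken from the examples later in this section, and the alias name is illustrative:
multipaths {
        multipath {
                wwid    360050768018087c52000000000000d6f
                alias   hana_data_hn1
        }
}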
Latency optimization
The service-time 0 value selects the path that is expected to have the lowest service time, which reduces latency. It is a best practice for SAP applications when performance must be optimized.
Here is an example of using the service-time-0 value:
path_selector "service-time 0"
Activating the changed multipath.conf file
After you change multipath.conf, run the following command:
systemctl reload multipathd
Then, verify that no errors occurred by running the following command:
dmesg -T | tail -16
To ensure that you have the latest configuration, update initrd by running the
following command:
dracut --force --add multipath
Troubleshooting
If you have issues, use the following procedures:
Enabling MPIO for an existing boot device
The preferred method for enabling MPIO is to enable multipath for all devices during installation. If this step was omitted, complete the following steps:
1. Mount the devices by using the /dev/disk/by-id path that you used during
the installation.
2. Open or create /etc/dracut.conf.d/10-mp.conf and add the following line (mind the leading white space):
force_drivers+=" dm-multipath"
If you have problems with this process in SUSE Linux Enterprise Server 12, see Systemd-udev-settle timing out.
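After you add this line, rebuild the initrd so that the multipath driver is included at boot time by reusing the command from earlier in this section:
dracut --force --add multipath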
3. Show your boot list by running the following command:
#bootlist -m normal -o
sdat
 
# bootlist -m normal -r
/vdevice/vfc-client@30000002/disk@500507680c326db2
 
Only a single path (sdat) is known to the bootlist, which makes it a single point of failure for the root device. In this case, the bootlist must be extended as described in the following steps. The number of boot devices is limited and depends on the OS that is used.
4. Find your root device by running the following command. The root VG is named system, and the PV device name provides the WWID:
pvs | grep system
PV                                              VG       Fmt Attr  PSize PFree
/dev/mapper/360050768018087c52000000000000d68-part2 system lvm2    a-- 49.99g 4.00m
/dev/mapper/360050768018087c52000000000000d6f hn_lg_vg lvm2    a-- 32.00g 0
5. Find the paths for the root device by running multipath -ll, as shown in Example 2-9.
Example 2-9 Paths for the root device
# multipath -ll
360050768018087c52000000000000d6f dm-2 IBM,2145
size=32G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:0:3 sdc 8:32 active ready running
| |- 1:0:12:3 sddg 70:224 active ready running
| |- 1:0:4:3 sdaf 65:240 active ready running
| |- 1:0:8:3 sdbt 68:112 active ready running
| |- 2:0:0:3 sds 65:32 active ready running
| |- 2:0:12:3 sdej 128:176 active ready running
| |- 2:0:4:3 sdbg 67:160 active ready running
| `- 2:0:8:3 sdcv 70:48 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
|- 1:0:10:3 sdcm 69:160 active ready running
|- 1:0:14:3 sdea 128:32 active ready running
|- 1:0:2:3 sdm 8:192 active ready running
|- 1:0:6:3 sdaz 67:48 active ready running
|- 2:0:10:3 sddo 71:96 active ready running
|- 2:0:14:3 sdew 129:128 active ready running
|- 2:0:2:3 sdam 66:96 active ready running
`- 2:0:6:3 sdcb 68:240 active ready running
360050768018087c52000000000000d68 dm-9 IBM,2145
size=50G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:0 sdj 8:144 active ready running
| |- 1:0:13:0 sddv 71:208 active ready running
| |- 1:0:5:0 sdat 66:208 active ready running
| |- 1:0:9:0 sdcg 69:64 active ready running
| |- 2:0:1:0 sdag 66:0 active ready running
| |- 2:0:13:0 sdet 129:80 active ready running
| |- 2:0:5:0 sdbv 68:144 active ready running
| `- 2:0:9:0 sddi 71:0 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
|- 1:0:11:0 sddb 70:144 active ready running
|- 1:0:15:0 sdeo 129:0 active ready running
|- 1:0:3:0 sdz 65:144 active ready running
|- 1:0:7:0 sdbn 68:16 active ready running
|- 2:0:11:0 sded 128:80 active ready running
|- 2:0:15:0 sdfd 129:240 active ready running
|- 2:0:3:0 sdba 67:64 active ready running
`- 2:0:7:0 sdcp 69:208 active ready running
6. The output of the multipath -ll command in Example 2-9 shows the existing boot path (sdat). Now, select more paths. Example 2-10 is based on a SAN Volume Controller, where by default half of the paths are active and the other half are enabled. Select paths from both groups so that you can always boot. In Example 2-10, the paths sdet and sddb are selected in addition to sdat.
Example 2-10 The available path and adding new paths
360050768018087c52000000000000d68 dm-9 IBM,2145
size=50G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 1:0:1:0 sdj 8:144 active ready running
| |- 1:0:13:0 sddv 71:208 active ready running
| |- 1:0:5:0 sdat 66:208 active ready running
| |- 1:0:9:0 sdcg 69:64 active ready running
| |- 2:0:1:0 sdag 66:0 active ready running
| |- 2:0:13:0 sdet 129:80 active ready running
| |- 2:0:5:0 sdbv 68:144 active ready running
| `- 2:0:9:0 sddi 71:0 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
|- 1:0:11:0 sddb 70:144 active ready running
|- 1:0:15:0 sdeo 129:0 active ready running
|- 1:0:3:0 sdz 65:144 active ready running
|- 1:0:7:0 sdbn 68:16 active ready running
|- 2:0:11:0 sded 128:80 active ready running
|- 2:0:15:0 sdfd 129:240 active ready running
|- 2:0:3:0 sdba 67:64 active ready running
`- 2:0:7:0 sdcp 69:208 active ready running
7. Extend the bootlist by running the following command:
# bootlist -m normal -o sdat sdet sddb
sdat
sddv
sddb
 
# bootlist -m normal -r
/vdevice/vfc-client@30000002/disk@500507680c326db2
/vdevice/vfc-client@30000004/disk@500507680c526db2
/vdevice/vfc-client@30000002/disk@500507680c516db4
8. To verify that no errors occurred, run the following command:
dmesg -T | tail -8
Multipath configuration testing
Before you create the test plan, you need to understand the time that MPIO needs to put a path into the faulty state and then back into active mode. The duration differs depending on where the action was taken. For example, rolling VIOS maintenance is quickly detected, but for storage headnodes, the detection takes much longer because propagating the faulty path into MPIO requires more steps.
It is important to test the multipath configuration because there is a difference between the boot process and other file systems. Although the boot process cannot go through all paths during boot, by default all other file systems have higher robustness regarding redundancy.
Here are some sample candidates for testing:
Performance
Performance tuning is a combination of physical redundancy (number of paths), multipath settings, and file system configuration. You must start with file system configuration optimization and physical redundancy before you start the multipath configuration. Complete the following steps:
a. Verify the different service time settings for a performance difference.
b. Verify different options on ratios among the number of active paths, numbers of LUNs per file system, and file system settings, such as blocksize and stripes.
Rolling maintenance
When you pull and replug cables for different components, such as the VIOS, switches, and storage, be careful not to do this task too quickly, or you might encounter a race condition where all the paths are faulty.
Path and storage failure
Define the outcome that you want based on your SLAs, and then test for these outcomes.
HA of shared devices
When a cluster manager is installed for applications that use shared disks, you must test a disk failure with cluster handling, and differentiate among HA deployments on a shared-nothing architecture versus a deployment that is based on shared disks. Contact your cluster vendor for their recommendations.
2.6.3 Linux file systems that are relevant to SAP applications
Several different file system types are implemented within the Linux distributions, as shown in Figure 2-18.
Figure 2-18 I/O stack for FC-attached multipath environments for SAP applications
For SAP workloads, use the default root file systems of the Linux distributor, and for SAP-related file systems, use XFS unless directed otherwise by an SAP Note. For the shared file systems, the most common deployment is an HA NFS server.
Here are the different file system types that are implemented within Linux distributions:
Btrfs
Btrfs is a logging-style, copy-on-write file system. A changed block is written to a new location, and then the links are updated to point to the new block. Changes are not committed until the last write. By default, SUSE Linux Enterprise Server is installed by using Btrfs with snapshots for the root partition. With snapshots, you can easily reset the system to a defined state, for example, to roll back applied updates or to back up files. Before rolling back the system by using a snapshot, ensure that user and application data are not lost or overwritten during the rollback. Additional Btrfs subvolumes are created on the root file system, and these subvolumes can be excluded from the snapshot.
For SAP applications, plan for sufficient space to keep copies so that you can benefit from the features of Btrfs in production situations. For example, by using Btrfs snapshots, you can find configuration changes by comparing a snapshot to the current configuration.
XFS
XFS is optimized for handling large files and provides high performance. In SUSE Linux Enterprise Server, XFS is the default file system for data partitions. XFS is supported on internal disks and SAN-attached storage. Multipathing with a matching file system configuration must be enabled to protect against path losses that would otherwise impact the application, and to achieve optimal performance.
NFS
NFS is used in SAP landscapes to share binary files that are used to transport changes (/usr/sap/trans) or ensure that the same binary files are visible to all related code at the same time (/sapmnt and HA configurations, such as SAP HANA auto host failover). NFS requires a dedicated and redundant storage network with at least 10 Gb network connectivity.
IBM Spectrum Scale (formerly known as GPFS)
IBM Spectrum Scale is a high-performance clustered file system. It can be deployed in shared-disk or shared-nothing distributed parallel modes. IBM Spectrum Scale requires an external storage server (or GPFS cluster) that is attached through InfiniBand or at least
10 Gb network connectivity. IBM Spectrum Scale File Placement Optimizer (FPO) or a self-build client/server IBM Spectrum Scale cluster is not supported for SAP HANA.
For more information, see SAP Note 2055470.
2.6.4 Logical Volume Manager
With LVM, you have a layer of abstraction between the Linux OS and the disk devices. One of the most useful features of LVM is that you can use it to resize (extend or reduce) the various structural elements. The LVM structure consists of the following items:
One or more entire LUNs or partitions are configured as PVs.
A VG is created by using one or more PVs.
One or multiple LVs can then be created in a VG.
The OS then creates a file system by using the LV structure.
Because the VGs and LVs are not physically tied to disk devices, it is possible to dynamically resize and create disks and partitions. To add disk capacity to an LPAR, either create a VG or add disk space to an existing VG, and then either expand an existing LV or create one.
 
Note: Resizing the root VG by adding more PVs is not possible because the LPAR fails during the next restart (bootloader).
To add a new VG or LV, complete the following steps:
1. Map a new LUN to the LPAR.
The required steps depend on the attachment method of the disk. One possibility is that the disk storage is presented by using vSCSI attachment from the VIOS to the client LPAR. In this case, the back-end device must be presented to the VIOS, and the device is attached to the LPAR by using VIOS device mapping commands.
Another possibility is that the client LPAR uses physical or virtual FC adapters. In this case, the disk must be masked to the WWPN of the FC adapter in the storage system. SAN zoning must allow access between the LPAR FC adapters and the storage system.
2. Make the new LUN visible to the Linux OS.
Run the rescan-scsi-bus.sh script to automatically update the logical unit configuration of the LPAR. For more information about how to use this script, run the following command:
rescan-scsi-bus.sh --help
If rescan-scsi-bus.sh does not work, run the following command instead:
echo "- - -" > /sys/class/scsi_host/host0/scan #iterate the "0" over the number of ports
3. Create a PV on the new LUN.
The new device for the disk is now visible to the OS. To initialize the PV for use by the LVM, run the pvcreate command.
4. Assign the new PV to an existing VG or create a VG.
To add one or more PVs to an existing VG, run the vgextend command. This command increases the space that is available for LVs in the VG. To create a VG, run the
vgcreate command.
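A combined sketch of steps 3 and 4 follows. It assumes that the new LUN appears as the multipath device /dev/mapper/<WWID> (substitute the WWID of your LUN):
# pvcreate /dev/mapper/<WWID>
# vgcreate testvg /dev/mapper/<WWID>     # create a VG, or ...
# vgextend testvg /dev/mapper/<WWID>     # ... add the PV to an existing VG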
5. Create an LV in the VG, or extend an existing LV by running the following command:
# lvcreate --size 5G -n testlv /dev/testvg
Logical volume testlv created.
If the VG consists of multiple LUNs, it is beneficial to stripe the LV across all the LUNs. The command-line option -i, --stripes Stripes of the lvcreate command distributes the LV across multiple LUNs. The LV must be striped across all PVs of the VG. A best practice is to set the number of stripes equal to the number of PVs.
The command-line option -I, --stripesize StripeSize of the lvcreate command specifies the stripe size. The stripe size must be a power of 2, but must not exceed the physical extent size.
For file systems running a workload similar to a database log file (/hana/log), a stripe size of 64 KB delivers the best results. For file systems with a larger blocksize (/hana/data), 64 KB is as good as 128 KB. So, for HANA databases, a stripe size of 64 KB is recommended for all file systems.
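For example, a striped LV for the SAP HANA log file system might be created as follows. This sketch assumes that the hdbvg VG from the /etc/fstab entries below consists of four PVs; adjust the size and the stripe count to your layout:
# lvcreate --size 512G --stripes 4 --stripesize 64K --name hana_log_hn1 hdbvg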
6. Create a file system on the new LV by running the following command:
mkfs.xfs -L testlv /dev/testvg/testlv
7. Add the appropriate entries to /etc/fstab to mount the file system:
/dev/hdbvg/usr_sap /usr/sap xfs defaults 1 2
/dev/hdbvg/hana_data_hn1 /hana/data/HN1 xfs defaults 1 2
/dev/hdbvg/hana_log_hn1 /hana/log/HN1 xfs defaults 1 2
/dev/hdbvg/hana_shared_hn1 /hana/shared/HN1 xfs defaults 1 2
8. Mount the file system by running the following command:
mount -a # Will read /etc/fstab and mount file systems which are not mounted
References
In the context of SAP HANA and IBM, SUSE published an SAP Note about how to configure a striped XFS file system. You can reuse this configuration, independently of SAP HANA, for all XFS file systems that are used by SAP applications in multipath SAN environments on Red Hat and SUSE Linux Enterprise Server OSs. You can find this information in SAP Note 1944799.
 

1 Virtualization of NVMe adapters on IBM POWER9 processor-based systems, found at: https://developer.ibm.com/articles/au-aix-virtualization-nvme/
2 IBM Support SAN Zoning Best Practices, found at: https://www.ibm.com/support/pages/san-zoning-best-practices