Saturday, August 17, 2013

Configuring FCoE on IBM Flex nodes and V7000 Storage

Disclaimer

I work for an IBM partner. The opinions expressed in this post are solely mine. Any performance degradation, bug, or problem mentioned here is generic to any vendor, unless otherwise explicitly specified.

Article Revisions

v1.0 - August 17 (17082013) - Initial release
v1.1 - August 17 (17082013) - Small additions to CN4093 section
v1.2 - October 11 (11102013) - Correction to FCoE frames and Ethernet frames (Thanks Anonymous!)

What is FCoE?

FCoE is short for Fiber Channel over Ethernet. It encapsulates FC frames inside Ethernet frames, allowing a server/node to communicate with a storage system through standard Ethernet, instead of investing in dedicated FC infrastructure.

The idea is to combine, or converge as the industry likes to call it, multiple protocols into a single technology, to reduce the datacenter clutter. With Ethernet, you can now transmit data packets (Ethernet), iSCSI (storage protocol) and FCoE (storage protocol).

Issues with FCoE

No Real Optimization

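A payload is the data carried by the protocol from one point to another. A standard Ethernet frame carries a payload of up to 1500 bytes, while an FC frame can carry up to 2112 bytes (roughly 2kB), so a full FC frame does not fit into a standard Ethernet frame. iSCSI has no such problem: it runs over TCP/IP, which segments the data to fit whatever payload size Ethernet offers.
Ethernet equipment commonly supports an increased payload size of up to about 9kB through something called Jumbo Frames. IPv6 allows a maximum payload size of 4GB (minus 1 byte) through its jumbogram option, but that requires modification of the Transport Layer to allow TCP/UDP to carry larger payloads, and it is handled at the IP layer, no longer in the Ethernet frame.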

FCoE runs directly on Ethernet frames, which top out at about 9kB when Jumbo Frames are used. iSCSI, however, was designed to run on the Internet Protocol (IP) on top of Ethernet, which allows it to make use of the large payload size of IPv6 jumbograms.
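To put numbers on the frame sizes, here is a minimal sketch of the arithmetic for one maximum-size FC frame carried over FCoE. The byte counts are my assumption of the usual FC-BB-5 layout (they are not taken from the lab), so check them against your switch documentation.

```python
# Byte counts for one maximum-size FC frame carried over FCoE.
# Assumption: field sizes follow the usual FC-BB-5 layout.
ETH_HEADER = 14        # destination MAC + source MAC + EtherType
VLAN_TAG = 4           # 802.1Q tag (FCoE traffic is VLAN-tagged)
FCOE_HEADER = 14       # version, reserved bits, start-of-frame
FC_HEADER = 24
FC_MAX_PAYLOAD = 2112
FC_CRC = 4
FCOE_TRAILER = 4       # end-of-frame + reserved bytes
ETH_FCS = 4

STANDARD_MTU = 1500    # default Ethernet payload limit


def fcoe_ethernet_payload(fc_payload=FC_MAX_PAYLOAD):
    """Bytes that must fit inside the Ethernet payload (what the MTU limits)."""
    return FCOE_HEADER + FC_HEADER + fc_payload + FC_CRC + FCOE_TRAILER


def fcoe_frame_on_wire(fc_payload=FC_MAX_PAYLOAD):
    """Total frame size including Ethernet header, VLAN tag and FCS."""
    return ETH_HEADER + VLAN_TAG + fcoe_ethernet_payload(fc_payload) + ETH_FCS


print(fcoe_ethernet_payload())                  # 2158
print(fcoe_frame_on_wire())                     # 2180
print(fcoe_ethernet_payload() <= STANDARD_MTU)  # False
```

In other words, a full FC frame needs an Ethernet payload of a bit over 2kB, which is why FCoE ports need at least "baby jumbo" frames even though they never come close to the 9kB limit.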

So, if you're going with an overhead of protocols already, you might as well go with iSCSI on IPv6 (assuming the storage supports it) and enable Jumbo Frames (assuming the network backend supports very large Jumbo Frames), instead of going with FC over Ethernet!

Overhead and Replication

iSCSI has been in the business for a long time (6+ years) and works well with the standard Ethernet payload size, but better with jumbo frames. Many storage systems offer iSCSI and have been offering it for a long time. It also doesn't require any special protocols and it "just works", including across datacenters, as long as the link is stable (but there's latency, obviously).

FCoE is relatively new, and requires a heap of protocols to maintain a main "feature": Being Lossless. All this protocol overhead means extra latency, and from what I've been reading, it's not yet possible to push FCoE between datacenters. This means that whenever you need data replication across datacenters, you'll need to attach your storage box into an FC-only infrastructure, which means investing in an FC infrastructure! (OK, maybe just 2 switches, but they still cost money!)

Note: This IBM document mentions that it is possible to replicate between different V7000 storage systems via FCoE only, but the article above puts the limitation on the router/networking end, not on the storage end. Also, that post is 3 years old, so things might have changed now. Approach with caution, anyway, and validate with your vendors.

Another overhead is the encapsulation and decapsulation process. Disks speak the SCSI protocol. FC puts the SCSI commands and their data inside an FC frame and sends it over to the storage, which strips off the FC framing and then executes the SCSI commands.

With FCoE, the server is inserting a SCSI payload inside an FC payload and that is inserted into an Ethernet payload!
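The nesting can be sketched in a few lines of Python. This is only an illustration of the layering, not how a CNA does it in hardware. The function name is hypothetical, the FC frame is treated as an opaque blob, and the delimiter codes (0x2E for start-of-frame, 0x41 for end-of-frame) and priority 3 are assumptions used as example values.

```python
import struct

FCOE_ETHERTYPE = 0x8906  # EtherType registered for FCoE
VLAN_TPID = 0x8100       # 802.1Q tag identifier


def wrap_fc_in_ethernet(fc_frame, dst_mac, src_mac, vlan_id,
                        priority=3, sof=0x2E, eof=0x41):
    """Wrap an already-built FC frame (FC header + SCSI payload + CRC) for FCoE.

    The Ethernet FCS is left out because the NIC normally appends it.
    """
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be between 0 and 7")
    tci = (priority << 13) | vlan_id            # priority bits + VLAN ID
    ethernet = dst_mac + src_mac + struct.pack(
        "!HHH", VLAN_TPID, tci, FCOE_ETHERTYPE)
    fcoe_header = bytes(13) + bytes([sof])      # version 0 + reserved, then SOF
    fcoe_trailer = bytes([eof]) + bytes(3)      # EOF, then reserved bytes
    return ethernet + fcoe_header + fc_frame + fcoe_trailer


# A dummy maximum-size FC frame: 24-byte header + 2112-byte payload + 4-byte CRC.
dummy_fc_frame = bytes(24 + 2112 + 4)
frame = wrap_fc_in_ethernet(dummy_fc_frame,
                            dst_mac=bytes.fromhex("0efc00011000"),
                            src_mac=bytes.fromhex("0efc00011101"),
                            vlan_id=1002)
print(len(frame))  # 2176 bytes before the NIC adds the 4-byte FCS
```

The SCSI commands never see any of this. They sit inside the FC frame, which sits inside the FCoE header and trailer, which sit inside a VLAN-tagged Ethernet frame.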

Multi-Protocol Failure

Currently, FCoE is enabled on Converged Network Adapters (CNAs) that offer standard Ethernet functions (normal network access) + iSCSI + FCoE. When enabling FCoE, the CNA presents to the Operating System (OS) a bunch of storage adapters of type FC.

What happens when you have a failure on the adapter? You lose both network access and storage access. What happens if the FCoE switch fails? You lose both network access and storage access.

A related scenario is a storage upgrade where one path needs to go offline to move the equipment from old stuff to new stuff, which affects both network and storage. One more scenario is your usual network administrator mistake: the spanning tree configuration goes wrong, someone adds a new VLAN, or someone plugs a cable into a non-configured switch, and the network goes into a loop (think of it as a denial of service attack).
It's completely valid to ask why you would need storage access if the network has failed anyway. The issue here is that the sudden loss of storage could also lead to data corruption.

That's why I personally prefer to keep the two activities separate: Network and Storage. It can still be done if you buy separate switches for FCoE and Ethernet, but then where is the "convergence" of your datacenter and its cost reductions?

Port Reservation

I don't know about other vendors, but on the IBM Flex chassis switches, using FCoE requires reserving 2 external ports from the switch (must be Omni Ports), even if you're using a V7000 Flex plugged into the chassis.

The FCoE protocol requires having an FC Forwarder (FCF) even if the traffic is internal to the chassis. These ports have to be reserved and configured in pairs. 2 ports are needed for every storage system to be connected via FCoE.

You do not need to plug SFPs into these reserved ports.

Limited Communications to Storage Systems

FCoE communicates through VLANs on the Ethernet network. Each NIC must belong to one VLAN when talking to an FCoE target. Because of that, a NIC can only talk to one storage system. If you need a node to talk to multiple storage systems, you'll need to assign each pair of NICs to a separate FCoE VLAN belonging to each FCoE storage system.

This limitation is not there for FC infrastructures, as a node's FC adapter registers itself on the FC SAN fabric, and then the administrator zones (groups) each adapter with a storage system. An adapter can be grouped with multiple storage systems, as long as all storage systems use the same FC adapter settings.

The FCoE connectivity limitation can be avoided by virtualizing various storage systems under one storage system, and exposing that one system to the nodes. IBM's SAN Volume Controller and its little brother the V7000 can do that.

Administration Role Separation

In large organizations, the roles of a network admin and a storage admin are separated. With Network Convergence, who will be responsible for configuring the network switches? Will one admin take responsibility for both network and storage?

Lab Setup

Alright, enough blabbing. Let's get to business. This is the lab setup for this experiment:

  1. IBM Enterprise Flex Chassis
  2. Two x240 nodes (Intel processors)
    1. Windows Server 2012 was preinstalled by a colleague, so I used it for tests
    2. Installed ESXi 5.1 U1 (IBM Customized image) for Boot from SAN tests
  3. One 4-port CN4054 CNA on each node
    1. Firmware: 4.4.180.3
    2. Feature on Demand (FoD) to enable FCoE
  4. V7000 Flex storage (mounted into the chassis)
    1. Firmware: 6.4.1.3
  5. Two CN4093 converged switches
    1. Firmware: 7.5.3
    2. Base license, allowing use of only 2 ports on the 4-port cards

IBM's Flex chassis can hold nodes, Ethernet switches, FC switches, and storage, all in a 10U chassis, and the communication between the components is internal to the chassis at a minimum of 10Gbps. End of shameless plug.

Note0: The 4-port CNA is made by Emulex, and it has the same chipset found on the 2-port LAN on Motherboard (LoM) built into some x240 nodes.

Note1: The firmware levels above are important and you should meet these as a minimum. As of this writing, the storage has newer firmware, but I kept it at this level as it's the minimum required and for testing purposes.

Configuration Overview

  1. Configure x240 nodes and their CNAs
    1. Understanding the CNA
    2. Possible NIC Configurations
    3. Configure FCoE Feature on the NICs
      I won't cover OS configuration nor multipathing driver installation
    4. Configure nodes for SAN Boot via FCoE
  2. Configure V7000 Storage
  3. Configure the CN4093 Converged Switches
    1. Sample Configuration
    2. Configuration Explanation
  4. Profit!

If you need help upgrading the firmware of any component, refer to the device's user manual in the device links posted above. I won't cover these here.

Note: Throughout the guide, screenshots and configuration, I have masked the WWPNs and MACs of the devices used in the lab, because I'm paranoid. Deal with it.

1) Configuring x240 nodes and their CNAs

This is easy, but you could lose yourself within the forest of menus, so I have a few screenshots to make you happy. You can either follow the text description, or spoon-feed yourself with my awesome screenshots.

Understanding the CNA

I'll quickly explain how the CNA is going to function, so that you don't get confused when you configure it.

The 4-port 10Gbps CNA and the 2-port LoM have 4 and 2 physical ports respectively. When Multichannel functionality is enabled, the CNA automagically splits each physical port into 4 virtual ports (vNICs).

So physical port 1 will have 4 vNICs: A1 = A1.1 + A1.2 + A1.3 + A1.4. Each vNIC can be allocated bandwidth, not exceeding 10Gb, and the total bandwidth of 10Gb is shared among all 4 vNICs, so you cannot over-commit the bandwidth. So, in an OS, you'll see 8 NICs if you have a 2-port LOM, and 16 NICs if you have a 4-port CNA (4 vNICs per physical port).
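The vNIC rules above boil down to simple arithmetic. Here is a minimal sketch of it. The function names and the vNIC labels are made up for illustration, and the check only mirrors what the text says (4 vNICs per physical port, 10Gb shared, no over-commit).

```python
VNICS_PER_PORT = 4
PORT_SPEED_GBPS = 10


def visible_nics(physical_ports):
    """Number of NICs the OS sees when Multichannel is enabled."""
    return physical_ports * VNICS_PER_PORT


def vnic_bandwidth_gbps(percent_by_vnic):
    """Convert per-vNIC percentages for ONE physical port into Gbps.

    Raises ValueError if the port would be over-committed.
    """
    if len(percent_by_vnic) > VNICS_PER_PORT:
        raise ValueError("a physical port has only 4 vNICs")
    total = sum(percent_by_vnic.values())
    if total > 100:
        raise ValueError("bandwidth over-committed: %d%%" % total)
    return {name: PORT_SPEED_GBPS * pct / 100
            for name, pct in percent_by_vnic.items()}


print(visible_nics(2))  # 8  (2-port LoM)
print(visible_nics(4))  # 16 (4-port CNA)
print(vnic_bandwidth_gbps({"A1.1": 25, "A1.2": 25, "A1.3": 25, "A1.4": 25}))
```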

You can change the bandwidth allocation dynamically from the switch for any port, live. It's up to you how much bandwidth is allocated to the FCoE port. If you have a license to use all 4 ports, I suggest you use Ethernet on the 1st and 2nd NICs, and FCoE on the 3rd and 4th. This way, you'll be able to allocate full 10Gb to FCoE.

Use the NICs in sequence (1+2, 3+4) to make sure Ethernet passes through both CN4093 switches, and FCoE passes through both CN4093 switches. Ports 1 and 3 communicate with switch0 located in Bay1, while ports 2 and 4 communicate with switch1 located in Bay2.
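If you want to sanity-check a port pairing before touching the UEFI, the mapping above is easy to model. This is just a sketch of the rule as described in this post (odd ports to switch0 in Bay1, even ports to switch1 in Bay2), with hypothetical function names.

```python
def switch_for_port(port):
    """Return the switch a CNA port is wired to, per the mapping in this post."""
    if port not in (1, 2, 3, 4):
        raise ValueError("the CN4054 has ports 1 to 4")
    return "switch0 (Bay1)" if port % 2 == 1 else "switch1 (Bay2)"


def pair_is_redundant(port_a, port_b):
    """True if the two ports land on different switches."""
    return switch_for_port(port_a) != switch_for_port(port_b)


print(pair_is_redundant(1, 2))  # True:  Ethernet on 1+2 uses both switches
print(pair_is_redundant(3, 4))  # True:  FCoE on 3+4 uses both switches
print(pair_is_redundant(1, 3))  # False: both ports go to switch0
```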

Possible NIC Configurations

  1. Use physical NICs
  2. Use virtual NICs
  3. Use a mix of pNICs and vNICs

Remember that a 2-port LOM will have each of its physical ports connect to 1 switch. port0 to switch0 and port1 to switch1. So, if you have 2 switches only, you have to use option (2): vNICs.

vNICs are mandatory if you want to share Ethernet and FCoE on the same pipe and you want to guarantee bandwidth for FCoE. If you do not use vNICs, FCoE and Ethernet will compete on the bandwidth. If your servers are busy, it may lead to delayed I/Os and performance degradation.

My favorite configuration is if you have a 4-port adapter, and Upgrade1 switch licenses for your 2 switches, then you can use 2 ports as pNICs for FCoE and 2 ports as pNICs for Ethernet. No need for vNIC configuration.

Alternatively, you can also enable vNICs on the first 2 ports, and leave the 3rd and 4th ports as pNICs. Or the opposite. So you can mix, but they'll have to be in pairs.

If you have a 4-port adapter, with the base license of the switches, your options are the same as the LOM, in the first paragraph.
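The decision above can be summarized as a tiny helper. It is a sketch of my reasoning in this section only, assuming you want both Ethernet and FCoE on the node, and assuming the base switch license exposes 2 adapter ports while Upgrade1 exposes all 4. The function names are hypothetical.

```python
def usable_ports(adapter_ports, upgrade1_license):
    """Adapter ports that can actually be used with the given switch license."""
    if adapter_ports not in (2, 4):
        raise ValueError("expected a 2-port LoM or a 4-port CNA")
    licensed = 4 if upgrade1_license else 2
    return min(adapter_ports, licensed)


def needs_vnics(adapter_ports, upgrade1_license):
    """True if Ethernet and FCoE must share ports, which means vNICs."""
    return usable_ports(adapter_ports, upgrade1_license) == 2


print(needs_vnics(2, upgrade1_license=False))  # True:  2-port LoM
print(needs_vnics(4, upgrade1_license=False))  # True:  4-port CNA, base license
print(needs_vnics(4, upgrade1_license=True))   # False: pNICs are enough
```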

Configure FCoE Feature on the NICs

  1. Power on the node and press F1 to login to the UEFI setup
  2. UEFI main menu -> System Settings -> Network -> Select 1st NIC (PFA 17:0:0 here) -> Emulex 10G NIC
    You're now at the Emulex NIC Selection menu
  3. Notice the link speed. It should report a number.
  4. Switch Configuration: Change it to IBM Virtual Fabric -- default: Switch Independent
  5. Personality: Change it to FCoE -- default: NIC
  6. Multichannel: Enable if you want to enable vNICs
  7. Controller Configuration -> View Configuration
  8. The 2nd NIC should report itself as FCoE. Only 1 NIC will have FCoE functions.
    Notice that the numbering of the NICs is all even. These NICs belong to switch0 located in Bay1.
  9. Press Esc until you're back at the Emulex NIC Selection menu.
  10. Feature on Demand -> Install FCoE license
  11. You're now done with the first NIC. The 2nd NIC will have the same settings as the 1st. You will need to repeat the steps above for the 3rd NIC, and that NIC's settings will be applied to the 4th.
  12. Press Esc until you're back at the Network menu and select the 2nd NIC.
    Notice that the NICs have odd numbers. These are mapped to switch1 located in Bay2.
  13. Esc to the System Settings menu -> Emulex Configuration Utility
    If you do not see this option, Esc to UEFI main menu, save, then exit to reboot the node.
  14. Highlight the 1st NIC (001) but don't click on it. Write down the NIC's Port Name and node name in a text file for later use. Highlight the other NICs and write their PNs.
    If you don't have an Upgrade1 license for your CN4093 switches, you won't be able to use the 3rd and 4th NICs, so you can ignore them.
  15. Click on the 1st NIC. You're now at the Emulex Adapter Configuration menu.
  16. Configure DCBX Mode: Change it to CEE -- default: CIN
  17. Later on when you're done configuring the storage and the switch, come back here and run Scan for Fiber Devices and you should see the V7000 listed (ID 2145)
    Also, scroll down and click on Display Adapter Info sub-menu and you'll see the FCoE VLAN ID, if your switch was configured properly. This is auto-discovered.
  18. Esc to the Emulex Adapter Configuration menu, and select the 2nd NIC (002) then repeat the same steps.
  19. Esc to the UEFI main menu, save and reboot. Come back to Scan for Fiber Devices later, once the storage and switches are configured.

With those easy steps, you have completed ONE node. Repeat the same for all nodes. If you're fortunate enough to have ordered the Flex System Manager node, then it's your lucky day! You can create a Configuration Template of the configured node, which would capture its hardware component configurations, and deploy its hardware configuration to other nodes. It's magic.

If you intend to use the Configuration Templates, I recommend that you configure all the components (internal disk RAID setup, time, boot order, ...etc.), then create the template out of the node.

[Screenshot: UEFI -> System Settings menu]

[Screenshot: Network -> Select NIC]

[Screenshot: Click on that to get the juicy settings]

[Screenshot: Change settings as listed. Multichan is for vNICs]

[Screenshot: Showing the available options]

[Screenshot: Showing the available options]

[Screenshot: Click it!]

[Screenshot: FCoE vNIC is always the 2nd]

[Screenshot: FCoE requires a license. Install it.]

[Screenshot: To the next adapter]

[Screenshot: The 2nd NIC follows the settings of the 1st]

[Screenshot: Emulex Configuration Utility for FCoE HBA Settings]

[Screenshot: Select 1st NIC. Note Port Name for FC zoning]

[Screenshot: Change settings to CEE]

[Screenshot: After storage and switch config is done, scan fiber devices]

Configure nodes for SAN boot via FCoE

Each canister/controller will have 1 port looking at one switch and the other port looking at the other switch, which means on each switch you'll see both controllers.

This guide is specific to V7000 and V7000 Flex and VMware ESXi 5.1 (IBM Customized Image). For other storage types, I highly recommend you read and follow the steps in the "Storage and Network Convergence Using FCoE and iSCSI" redbook (link in references). It explains booting from SAN with FCoE and iSCSI, and has excellent tips.

  1. Configure the FCoE switches and make sure that the storage and nodes are functioning properly.
  2. Create a volume and assign it to the node that will boot from SAN. It must be the first volume assigned to the node (LUN 0/SCSI ID 0).
  3. Boot the node into UEFI -> System Settings menu -> Emulex Configuration Utility
  4. Select the 1st adapter
  5. Set Boot from SAN: Change it to Enable
  6. Validate storage connectivity and volume/LUN assignment: Navigate to Add Boot Device -> Select 1st Controller
    If you don't see the storage or LUN 0000, then you need to finish configuring the switches, assign a LUN to the node, then come back here.
    Do not select a boot device here. This is only for validation of connectivity.
  7. Configure HBA and Boot Parameters -> Boot Target Scan Method: Select Boot Path Discovered Targets
    Commit Changes.
  8. Esc to Adapter Selection menu and select the 2nd adapter, and repeat the same steps.
  9. Configuring FCoE SAN boot should be sufficient on 2 ports.
  10. Esc to System Settings menu -> Devices and I/O Ports -> Enable/Disable Onboard Devices
  11. SAS Controller: Disable to disable booting from local disks on the node. Do this even if you don't have any local disks.
  12. Esc to Devices and I/O Ports -> Device Boot Priority
  13. Drag the SAS Controller to the bottom of the list. Save/Commit.
  14. Esc to Main Menu -> Boot Manager -> Add Boot Option -> Generic Boot Option
  15. Add Hard Disk 0, 1, 2, 3
    If you configure 2 FCoE ports, you'll have 4 possible paths to boot from. By selecting 4 disks, the UEFI will configure each path into a Hard Disk, and boot from the first available one.
  16. Esc to Boot Manager -> Delete Boot Option: Delete anything that you don't need (PXE, Floppy)
  17. Esc to main menu -> Save
  18. Reboot and install OS

Note: During the adapter preparation phase, the UEFI probes the FCoE ports to see which ones are online, then nominates and uses only one of them.

Steps 7 and 15 allow high flexibility and reduce configuration time for implementations that have many nodes. The typical method of implementation is defining the boot LUN and path for each node. So if you have 10 nodes, and 2 FCoE ports, you'd need to repeat those configurations 40 times! Using Boot Discovery and auto Hard Disk assignment by UEFI, you avoid this headache.

It does add some extra time to the boot process, but that's a small price to pay for the flexibility.
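For the curious, this is where the 4 paths and the 40 repetitions come from. A quick sketch, assuming 2 storage controllers (canisters) are visible from each FCoE port, as on the V7000 in this lab. The function names are hypothetical.

```python
def boot_paths_per_node(fcoe_ports, storage_controllers=2):
    """Each FCoE port can reach each storage controller."""
    return fcoe_ports * storage_controllers


def manual_boot_entries(nodes, fcoe_ports, storage_controllers=2):
    """Boot targets you would define by hand without Boot Path Discovery."""
    return nodes * boot_paths_per_node(fcoe_ports, storage_controllers)


print(boot_paths_per_node(2))      # 4  -> Hard Disk 0, 1, 2, 3
print(manual_boot_entries(10, 2))  # 40
```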

[Screenshot: Emulex Configuration Utility]

[Screenshot: Adapter Selection]

[Screenshot: Enable Boot from SAN for both adapters]

[Screenshot: You should be able to see storage and LUNs here]

[Screenshot: Do not add the LUNs. Just validate connectivity.]

[Screenshot: Configure HBA and Boot Parameters]

[Screenshot: Boot Path Discovered Targets]

[Screenshot: Add Boot Option -> Generic Boot Option]

[Screenshot: Add Hard Disk 0, 1, 2 and 3 for a total of 4 paths]

[Screenshot: Devices and I/O Ports]

[Screenshot: Enable / Disable Onboard Devices]

[Screenshot: Disable SAS Controller]


2) Configure V7000 Storage

If you have purchased the V7000/V7000 Flex with the FCoE daughter cards, there's no configuration for FCoE. If you bought the daughter cards at a later stage, you'll need to activate them from the canisters. This won't be covered here. Please refer to the online manual.

If you log in to the V7000's web interface, you'll see each canister's (controller's) Port Numbers, for both FC and Ethernet. These numbers only show up once the switch is configured.

[Screenshot: V7000 Flex canister/controller 1]

[Screenshot: V7000 Flex canister/controller 2]

[Screenshot: Note that the port type is FC]



3) Configure the CN4093 Switches

As mentioned before, a minimum of 2 external Omni ports must be reserved, even if you're using a V7000 Flex inside the same chassis as the nodes.

This switch configuration will assume default bandwidth allocations. I highly advise you to read the CN4093 redbook (link in references) for optimizations.

I'll first write the entire switch config, then explain each section.

Log in to the switch in "iscli" mode, then type "enable" to access the enable mode. Now type "config terminal" to be able to modify the configuration.

version "7.5.3"
switch-type "IBM Flex System Fabric CN4093 10Gb Converged Scalable Switch"
!
system port EXT15-EXT16 type fc
!
interface port INTA1
name "Flex System Manager node"
no flowcontrol
exit
!
interface port INTA2
name "Power p260 node"
no flowcontrol
exit
!
interface port INTA3
name "x240 node1"
tagging
no flowcontrol
exit
!
interface port INTA4
name "x240 node2"
tagging
no flowcontrol
exit
!
interface port INTA5
tagging
no flowcontrol
exit
!
interface port INTA6
tagging
no flowcontrol
exit
!
interface port INTA7
name "v7000 flex node1"
tagging
pvid 1002
no flowcontrol
exit
!
interface port INTA8
name "v7000 flex node2"
tagging
pvid 1002
no flowcontrol
exit
!
interface port INTA9
tagging
no flowcontrol
exit
!
interface port INTA10
tagging
no flowcontrol
exit
!
interface port INTA11
tagging
no flowcontrol
exit
!
interface port INTA12
tagging
no flowcontrol
exit
!
interface port INTA13
tagging
no flowcontrol
exit
!
interface port INTA14
tagging
no flowcontrol
exit
!
vlan 1
member INTA1-INTA6,INTA9-INTA14,EXT1-EXT2,EXT11-EXT16
no member INTA7-INTA8
!
vlan 1002
enable
name "fcoe"
member INTA3-INTA4,INTA7-INTA8,EXT15-EXT16
fcf enable
!
!
vnic enable
vnic port INTA3 index 1
bandwidth 25
enable
exit
!
vnic port INTA4 index 1
bandwidth 25
enable
exit
!
vnic vnicgroup 1
vlan 3001
enable
member INTA3.1
member INTA4.1
exit
!
spanning-tree stp 80 vlan 3001
!
spanning-tree stp 113 vlan 1002
!
!
!
!
fcoe fips enable
!
fcoe fips port INTA3 fcf-mode off
fcoe fips port INTA4 fcf-mode off
fcoe fips port INTA7 fcf-mode on
fcoe fips port INTA8 fcf-mode on
fcoe fips port EXT15 fcf-mode on
fcoe fips port EXT16 fcf-mode on
!
!
cee enable
!
!
fcalias v7k_node1_p1 wwn 50:00:00:00:00:04:00:76
fcalias v7k_node2_p1 wwn 50:00:00:00:00:04:00:77
fcalias node3 wwn 10:00:00:00:00:00:00:5d
fcalias node4 wwn 10:00:00:00:00:00:00:6b
!
zone name v7k_node3
        member fcalias v7k_node1_p1
        member fcalias v7k_node2_p1
        member fcalias node3
zone name v7k_cluster
        member fcalias v7k_node1_p1
        member fcalias v7k_node2_p1
zone name v7k_node4
        member fcalias node4
        member fcalias v7k_node2_p1
        member fcalias v7k_node1_p1
zoneset name ActiveConfig
member v7k_node3
member v7k_cluster
member v7k_node4
zoneset activate name ActiveConfig
!
no ip routing
!
!
end

Configuration Explanation

system port EXT15-EXT16 type fc
This changes the type of the Omni ports from being Ethernet ports to FC ports. This is required to bind the ports to a storage system, whether the storage is internal to the chassis or external. If your storage is external, these are the ports where you have to plug the FC SFPs and cables to your external SAN fabric.

interface port INTA1-INTA14
name "port name"
no flowcontrol
tagging
pvid 1002
interface port : defines which ports you want to work on. You can specify 1 port or a range. If you have Upgrade1 license, you can also define INTA1-INTB14 to modify all 28 ports in one shot.

name "port name" : It's better that you do this on a per port basis, to give each port a unique name, to know which system is using that port.

no flowcontrol : Disables traffic flowcontrol. A requirement for FCoE.

tagging : Enable VLAN tagging on a port, allowing that port to belong to multiple VLANs. Do not enable this on ports that will not use FCoE, nor require VLAN tagging. An example of this is a standalone Windows/Linux node.

pvid 1002 : Set the Port VLAN ID (native VLAN) on the port. The default is 1. On the V7000 Flex ports, this has to be changed to the FCoE VLAN. If you do not have storage in the chassis, no internal port needs this PVID set.

vlan 1
member INTA1-INTA6,INTA9-INTA14,EXT1-EXT2,EXT11-EXT16
no member INTA7-INTA8
!
vlan 1002
enable
name "fcoe"
member INTA3-INTA4,INTA7-INTA8,EXT15-EXT16
fcf enable
!
These are VLAN definitions, and which ports belong to the VLAN and which don't.
1002 is the preferred VLAN ID for FCoE. You can change this to whatever you want, but make sure the customer network doesn't have the same ID on the Ethernet network to not cause confusion for your nodes.

fcf enable : Enable the Fiber Channel Forwarder (FCF) function on this VLAN. This is a must on the FCoE VLANs if you have a V7000 Flex or an upstream (Top of Rack) switch that understands FCoE. If you're connecting the chassis to a SAN fabric, you need to enable NPV mode. See the CN4093 redbook for details.

vnic enable
vnic port INTA3 index 1
bandwidth 25
enable
exit
vnic enable : This is only needed if you need vNICs and want to enable it.

vnic port index 1 : This is vNIC1 of the internal physical port 3. In other words, it's INTA3.1.
You only need to set this, if you want to use this specific vNIC. If you do not set these settings, it'll appear as disconnected on the OS.

bandwidth 25 : Allocate 25% of the 10Gb bandwidth, which is 2.5 Gbps to this vNIC.

Note: You do not allocate bandwidth nor define a vNIC index for the FCoE port.

vnic vnicgroup 1
vlan 3001
enable
member INTA3.1
member INTA4.1
exit
vnic vnicgroup : Create a vNIC Group to add members to it. This is a must for vNIC configurations. Not required for non-vNIC setup.
The group members can be vNICs, internal physical ports, and external ports. In the example above, only internal ports were added. No external ports were configured.

vlan 3001 : Each vNIC Group requires its own VLAN, and this must not be an existing VLAN. This is only for internal communication, and will not conflict with the customer side VLANs.

vNICs not added to a vNIC Group, will appear as disconnected.
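If you have many ports to configure, the vNIC stanzas are repetitive enough to generate. The sketch below only prints the same commands used in the sample configuration above. The function and its parameters are hypothetical helpers of mine, not part of any IBM tool, so review the output before pasting it into a switch.

```python
def vnic_config(ports, index, bandwidth_percent, group, vlan):
    """Build the vNIC and vNIC Group commands for a list of internal ports."""
    if not 1 <= index <= 4:
        raise ValueError("vNIC index must be between 1 and 4")
    if not 1 <= bandwidth_percent <= 100:
        raise ValueError("bandwidth is a percentage of the 10Gb port")
    lines = ["vnic enable"]
    for port in ports:
        lines += [
            "vnic port %s index %d" % (port, index),
            "bandwidth %d" % bandwidth_percent,
            "enable",
            "exit",
            "!",
        ]
    lines += ["vnic vnicgroup %d" % group, "vlan %d" % vlan, "enable"]
    lines += ["member %s.%d" % (port, index) for port in ports]
    lines.append("exit")
    return "\n".join(lines)


print(vnic_config(["INTA3", "INTA4"], index=1, bandwidth_percent=25,
                  group=1, vlan=3001))
```

With the values shown, the output matches the vnic section of the sample configuration.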

spanning-tree stp 80 vlan 3001
If spanning tree is enabled, this will place the VLAN 3001 in its own Spanning Tree Group number 80. The firmware will by default assign each VLAN into its own STG without having to do this manually.

fcoe fips enable
!
fcoe fips port INTA3 fcf-mode off
fcoe fips port INTA4 fcf-mode off
fcoe fips port INTA7 fcf-mode on
fcoe fips port INTA8 fcf-mode on
fcoe fips port EXT15 fcf-mode on
fcoe fips port EXT16 fcf-mode on
!
cee enable

fcoe fips enable : Enable FCoE Initialization Protocol (FIP) snooping, which will detect which ports support FCoE and which don't.

fcf-mode off/on/auto : It should be off for the internal ports of the compute nodes, and on for the storage and FC ports. You can also avoid messing things up, and set this to auto on all ports.

cee enable : Enable Converged Enhanced Ethernet to allow FC packet encapsulation over Ethernet.
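The fcf-mode rule is simple enough to express as a lookup table. A minimal sketch, assuming the three port roles I describe above (the role names are my own labels). It only emits the same commands as the sample configuration.

```python
# Role names are labels made up for this sketch.
FCF_MODE_BY_ROLE = {
    "compute": "off",   # internal ports facing the x240 nodes
    "storage": "on",    # internal ports facing the V7000 Flex canisters
    "fc": "on",         # external Omni ports switched to type fc
}


def fips_config(port_roles):
    """Build the FIP snooping commands for a {port: role} mapping."""
    lines = ["fcoe fips enable", "!"]
    for port, role in port_roles.items():
        lines.append("fcoe fips port %s fcf-mode %s"
                     % (port, FCF_MODE_BY_ROLE[role]))
    return "\n".join(lines)


print(fips_config({
    "INTA3": "compute", "INTA4": "compute",
    "INTA7": "storage", "INTA8": "storage",
    "EXT15": "fc", "EXT16": "fc",
}))
```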

fcalias
Define an alias to make it easy to identify nodes and storage ports.

no fcalias wwn
To remove an already configured fcalias.

zone name
Create a zone and add aliases to this zone.

zoneset name
zoneset activate name
Create a zoneset, which is a group of zones to enable this set for the entire switch.
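If zoning is new to you, think of each zone as a set of aliases: two ports can talk only if they share at least one zone in the active zoneset. The sketch below models the three zones from the sample configuration. It is a simplified mental model, not how the switch implements it.

```python
# Zones copied from the sample configuration above.
ZONES = {
    "v7k_node3": {"v7k_node1_p1", "v7k_node2_p1", "node3"},
    "v7k_cluster": {"v7k_node1_p1", "v7k_node2_p1"},
    "v7k_node4": {"node4", "v7k_node2_p1", "v7k_node1_p1"},
}


def can_communicate(alias_a, alias_b, zones=ZONES):
    """True if both aliases are members of at least one common zone."""
    return any(alias_a in members and alias_b in members
               for members in zones.values())


print(can_communicate("node3", "v7k_node1_p1"))  # True
print(can_communicate("node4", "v7k_node2_p1"))  # True
print(can_communicate("node3", "node4"))         # False: no shared zone
```

Keeping each node in its own zone with the storage ports means the nodes see the V7000 but not each other.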

no ip routing
Disable Layer3 routing, and make the switch a Layer2 switch only.

show fcoe database
-----------------------------------------------------------------------
 VLAN  FCID                  WWN                     MAC         Port
-----------------------------------------------------------------------
 1002  011000     50:00:00:00:00:04:00:77      0e:fc:00:01:10:00   INTA8
 1002  011100     50:00:00:00:00:04:00:76      0e:fc:00:01:11:00   INTA7
 1002  011101     10:00:00:00:00:00:00:5d      0e:fc:00:01:11:01   INTA3

 Total number of entries = 3

-----------------------------------------------------------------------
Displays the currently established FCoE connections on the switch. It doesn't show any node-storage associations. It shows the nodes/storage that have been detected to have FCoE. The table above is a sample output.
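If you need to check many switches, the output is easy to parse. Here is a minimal sketch that reads the sample output above into dictionaries. The regular expression is based only on this one sample, so other firmware levels may format the table differently. The `fpma_mac` helper assumes the default FC-MAP prefix of 0e:fc:00, which is what the sample MAC addresses appear to use.

```python
import re

ENTRY = re.compile(
    r"^\s*(?P<vlan>\d+)\s+(?P<fcid>[0-9a-f]{6})\s+"
    r"(?P<wwn>(?:[0-9a-f]{2}:){7}[0-9a-f]{2})\s+"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+(?P<port>\S+)\s*$",
    re.IGNORECASE,
)


def parse_fcoe_database(text):
    """Return one dict per FCoE login found in 'show fcoe database' output."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            entries.append(match.groupdict())
    return entries


def fpma_mac(fcid, fc_map="0efc00"):
    """MAC address expected for an FCID: FC-MAP prefix followed by the FCID."""
    raw = (fc_map + fcid).lower()
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


SAMPLE = """\
 VLAN  FCID                  WWN                     MAC         Port
-----------------------------------------------------------------------
 1002  011000     50:00:00:00:00:04:00:77      0e:fc:00:01:10:00   INTA8
 1002  011100     50:00:00:00:00:04:00:76      0e:fc:00:01:11:00   INTA7
 1002  011101     10:00:00:00:00:00:00:5d      0e:fc:00:01:11:01   INTA3

 Total number of entries = 3
"""

for entry in parse_fcoe_database(SAMPLE):
    print(entry["port"], entry["wwn"], fpma_mac(entry["fcid"]) == entry["mac"])
```

A node missing from this list has not logged in to the fabric, so check its CNA settings and the switch port configuration before blaming the zoning.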

show zone
List the configured zones on the switch.

For details and explanations of each command, or extra details, do read the CN4093 redbook (linked below in the references).

Note: The above configuration should be the same for the 2nd CN4093 switch, except for the FCalias parts as the WWPNs will be different.

References

  1. IBM V7000 Storage
    1. IBM Storwize V7000 Information Center
    2. Configuration Limits and Restrictions for IBM Storwize V7000
    3. Implementing the IBM Storwize V7000 V6.3
    4. IBM Flex System V7000 Storage Node Introduction and Implementation Guide
  2. Internet Small Computer Systems Interface (iSCSI)
    1. iSCSI Standard by IETF
    2. Comparing Performance Between iSCSI, FCoE and FC
  3. FCoE
    1. Storage and Network Convergence Using FCoE and iSCSI (redbook)
    2. FCoE Between Datacenters
    3. Fixing Stupid, an FCoE Response
    4. FCoE: Additional Considerations (T11 Fiber Channel Committee)
    5. FCoE Questions and Answers (Cisco)
    6. Datacenter Bridging Exchange (DCBX)
  4. Fiber Channel
    1. Fiber Channel Generations (16 Gbps FC)
    2. FC vs iSCSI (Trusted Network Solutions)
    3. FC Frames
  5. IBM CN4093 and EN4093R
    1. Application Guide for EN4093 and EN4093R - Second Edition
    2. Application Guide for CN4093 - First Edition
    3. IBM Networking OS 7.5 Release Notes for CN4093
  6. Emulex
    1. Emulex Universal Multichannel Reference Guide (Guide for the CN4054 VFA)
    2. White papers and documents for cards by Emulex made for IBM
    3. More white papers
    4. Emulex Virtual Fabric Adapter drivers, firmware and user guide
  7. Network Frames
    1. IPv6 Packets
    2. FCoE Frames
    3. Jumbo Frames
    4. Ethernet Frames
    5. Internet Protocol (IP)

39 comments:

Leo said...

Any experience with adding a top-of-rack switch to the mix? Say, an IBM 8264 between the flex chassis and the V7000?

I'm interested to see what changes need to be made at the ToR to keep all this working the same.

Thanks ahead of time.

MBH said...

Hello Leo,

I don't have hands-on experience with that, yet, but we did design an exact solution for a customer (waiting for them to sign the deal).

In an email sent to IBM Techline, the engineer said that the ToR should operate in Full Fabric mode (requiring firmware 7.7).

But if you're using a ToR, you won't need a CN4093 switch in the chassis. Instead, you should use an EN4093R and simply forward the FCoE traffic to the ToR.

On the ToR, you create the FC zones and aliases, like I did in the CN.

Tom B. said...

Hey There -

I would recommend against using the PVID command when using FCoE.

The reason I say this, is that at the FIP protocol layer there can be instances where PVID will strip out the VLAN tag on egress from the switch.

PVID by definition says 'if tag matches PVID, remove'. So if you set the PVID to 1002 and your FCoE traffic is on 1002 every time a FCoE packet traverses the switch when it exits to the initiator or device you will find unusual results. I've seen everything from 'small files copy, big ones won't' to 'no communication'.

My recommendation would be to never use PVID for FCoE traffic. If you need to use it, say on the FSM on the Flex chassis, I'd recommend the TAG PVID command as it will not remove the tag.

I've had many support issues come up where the issues with FCoE traffic were resolved by not doing the PVID command.

MBH said...

Hello Tom!

Thank you for the input. The configuration was done by IBM storage and networking engineers, but I do understand the point you're presenting.

On VMware, I've been noticing some discard messages, but I couldn't pin point the issue.

In my case, the V7000 is in the same chassis, not external. So the traffic is flowing within the chassis switches.

The V7000 Flex seems to require having its packets tagged with PVID, as the storage doesn't have any VLAN tagging capability. I'll use PVID tagging and see if it resolves my problems with VMware.

Thank you!

Anonymous said...

..." let's assume for now that FCoE will run purely over IPv4 packets."

From what I know, the FCoE protocol stack doesn't rely on IP, it just runs over Ethernet.

MBH said...

Anonymous,

Thank you for the correction. You are right. With reading about packets then jumping to IPv6, I didn't notice that IPv6's JumboFrames is an exception, and the norm is for Jumbo Frames to be on Ethernet packets not IP.

I'll review the post for corrections, and add resources for those who need more info.

MBH said...

Corrections made & resources added under 'Network Frames'.

Thank you Anonymous for the correction, and I hope I didn't make more mistakes!

Jordi said...

Hi MBH.

Thanks to you for your great overall post. It's almost impossible not to make a small bug with the vast amount of detailed information you offer within your post.

Caparros

Anonymous said...

What an awesome article, thanks.

Minh

Anonymous said...

Superb Document. Works perfectly. Thank you so much....

Regards,
Fnetmohan

fcoe.info said...

Just finished a config a few minutes ago with two CN4093 switches and a directly attached NetApp Filer. Our switches arrived with version 7.5.5.0 and following this configuration guide we were not able to get the nodes/blades to successfully show up in the FLOGI database.

One of my colleagues opened a ticket with IBM who recommended setting the PVID=1 as well as to upgrade the firmware to 7.8.8.0. Once we did these two things it all worked just as desired.

MBH said...

Hi John,

What kind of NIC virtualization did you use in the UEFI settings? IBM Virtual Fabric or Unified?

Joao Almeida said...

Hi I used this as a template for a UFP variation and it works perfectly.

Thanks for the great document.
Cheers

FCoE-n00b said...

Hey,

This post is great, and some of the comments are pretty interesting too. I'm trying a similar setup with a couple of differences: CN4093 switches are running latest & greatest firmware (7.7.8) and storage is outside the Flex (a V3700 and not a V7000). The storage is connected over 10G Omni ports and we're planning on running FCoE end-to-end with the CN in FCF mode.

The thing is I'm stuck right now. I can't get a basic FCoE config to work and the 10G Ethernet ports on the V3700 are shown as "Inactive Unconfigured" instead of "Active". I believe that, if FCoE and FIPS are working fine on the CN4093, the storage should be able to show these ports as Active, even if the zoning is wrong, or if nodes are not present or properly configured, right? I've played with the PVID/tagging settings, removed the config and restarted from scratch just with a basic FCoE config but didn't manage to get a different result.

Did any of you guys experience this problem?

Thanx!

MBH said...

Hello!

Can you paste your config on pastebin.com & share the link here?

You can set an expiration date to the paste so it doesn't stay there forever, if you wish.

FCoE-n00b said...

Hi again,

Actually I got it working yesterday. I tested many different things but in the end I believe it's all related to the way the PVID is handled... let's see if I can explain myself:

The SAN is connected to two Ethernet/FCoE ports configured as tagged with PVID 1 (default). With this setup I got it working, and alternatively I was able to make it work with PVID 1002 and with tag-pvid enabled. This is an alternate config to get rid of VLAN 1.

After I got it working on SW1 I moved on to fix SW2. SW1 was reset to factory defaults before deploying a minimal config to get FCoE working. SW2, on the other hand, was already configured for different VLANs, trunks, LACP, etc. I deployed the same config towards the V3700 and it didn't work. The difference between SW2 and SW1 at this point was the VLAN configuration of the FC-type ports. I had somehow configured the FC ports to be just on VLAN 1002 (untagged and PVID 1002), while SW1 was working with tagged FC ports and PVID 1. I redid the config on SW2, setting the FC-type ports to tagged with PVID 1 and member of VLAN 1002, and that did the trick. I'm still trying to figure out why the trunking/VLAN configuration of the FC ports makes a difference... I mean, FC ports are not Ethernet ports, I think they just need to be members of the FCoE VLAN, so why do they also need to be on VLAN 1 and with tagging enabled? Perhaps a bug? This seems like a grey area in the IBM docs.

One last thing... somehow related. The SAN FCoE ports in my deployment are external ports. They run STP by default, which means you need to either set them to "edge" mode (portfast in Cisco-slang) or disable STP altogether. I disabled STP on the SAN-facing ports as I figure it's almost impossible to loop the network through an FCoE-enabled SAN canister.

Your thoughts and comments are welcome!

Take care

MBH said...

In my communications with IBM Labs, they were specific in setting the PVID of the FCoE omni ports to the same as the VLAN ID for FCoE.

You may want to clear the switch 2 settings and do the config. I think the command in iscli mode was: clear running-config
Then do a write.

Also, you don't need spanning tree if your storage is directly hooked to the chassis switches, or if it's built into the chassis. You need it if you're connected to a converged top of rack switch.

I'd still appreciate it if you could share the config.

FCoE-n00b said...

Hi there,

I tried once more to change from PVID 1 + tagged 1002 to PVID 1002 on the port to the SAN. As soon as I changed PVID to 1002 the FCoE link dropped. I only recovered it when I went back to PVID 1 + tagged 1002. So I'm basically unable to get FCoE running with PVID 1002 on the SAN ports.

I don't know what to tell you, really. The IBM documentation seems to be in line with the information you've received from IBM Labs. My own experience shows a different behaviour, however.

It seems like FIPS sees FCoE traffic on the link and automatically encapsulates it with VLAN ID 1002 (since the UEFI/server/NIC and the SAN canisters are not necessarily aware of the actual FCoE VLAN ID). Now, if the PVID matches the FCoE VLAN ID, then the incoming FCoE traffic gets tagged (by FIPS) and then the tag is stripped (because of the PVID), leaving the FCoE traffic in limbo somehow. I have no way of proving this, of course... it's just an educated guess based on my own experience these recent days and the feedback I got from Emulex. This is in line with the comment made by Tom B. above, BTW.

Hopefully this will help someone else that runs into the same kind of trouble.

Take care!

Usman said...

Hi,

I have a similar kind of scenario in which there are x240 nodes with 2-port LOM. The customer wants to use FCoE but also has two brocade SAN switches which connect to CN4093 and a network switch for TCP/IP traffic.

The customer wants to use the brocade switches for zoning instead of the CN4093.

The current connectivity is that the two SFP+ ports connect to the network switch and four Omni ports connect to the brocade SAN switch.

What will be the configuration in such a scenario?

MBH said...

Usman,
You'll want to configure NPIV instead of FCF mode on the CN4093 switches since the Brocade Top of Rack (ToR) will handle the zoning.

Unknown said...

hello,

I am really confused about how to configure the CN4022 FCoE adapter. I know it has to use PVID 1 for FCoE traffic, but after setting that, network traffic stopped. There are two VLANs: 1002 for FC and 200 for Ethernet.
Your early support will be highly appreciated.

MBH said...

Muhammad,

The CN4022 does not support booting via FCoE. You can only use it if you have an OS installed to a local disk, and after the OS boots, the CN4022 drivers will allow you to access FCoE storage devices.

In the UEFI (BIOS) settings for CN4022, enable CEE but don't define VLANs. The FCoE VLAN will be automatically picked up thanks to CEE.

Unknown said...

hello,

I am not trying to boot from SAN or FCoE. Both 2-port adapters are set to pNIC mode in UEFI because we don't need vNICs. The only problem is a limitation of the Broadcom NetXtreme CN4022 2-port adapter: according to IBM, the port has to be set to PVID 1 for FCoE on the default VLAN 1002. With that setting FCoE works properly, but network traffic stops working. When I change the PVID from 1 to 200, network traffic works fine but FC stops. My question is how to make both work together with the EN4093 switch. See the MIGR number below:
http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5094204

MBH said...

Muhammad,
I understand now. Keep PVID on EN4093 as 1 & enable VLAN Tagging on the internal port, then add the internal port and external ports to VLAN 200 & VLAN 1 (added by default).

On the OS, use NIC teaming/bonding and create a virtual interface that is VLAN tagged for VLAN 200.

If you're using Linux, you'll need to do NIC Bonding then VLAN Tag on top of the bond interface. There are many guides for that and they're easy.
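
As a rough sketch of the Linux steps, the helper below only builds and prints the iproute2 commands for a bond with a VLAN 200 interface on top; it does not run them. The interface names `eth0`/`eth1`, the bond name `bond0`, the `active-backup` mode and the `miimon` value are assumptions for illustration, and the resulting setup is not persistent across reboots.

```python
from typing import List


def bond_vlan_commands(nics: List[str], vlan_id: int,
                       bond: str = "bond0",
                       mode: str = "active-backup") -> List[str]:
    """Build (but do not run) iproute2 commands for bond + VLAN on top."""
    # Create the bond first, with link monitoring every 100 ms (assumed value).
    cmds = [f"ip link add {bond} type bond mode {mode} miimon 100"]
    for nic in nics:
        # A NIC has to be down before it can be enslaved to the bond.
        cmds.append(f"ip link set {nic} down")
        cmds.append(f"ip link set {nic} master {bond}")
    cmds.append(f"ip link set {bond} up")
    # The tagged interface sits on the bond, not on the physical NICs.
    vlan_if = f"{bond}.{vlan_id}"
    cmds.append(f"ip link add link {bond} name {vlan_if} type vlan id {vlan_id}")
    cmds.append(f"ip link set {vlan_if} up")
    return cmds


if __name__ == "__main__":
    # Hypothetical NIC names; check yours with `ip link show`.
    for cmd in bond_vlan_commands(["eth0", "eth1"], vlan_id=200):
        print(cmd)
```

The IP address for VLAN 200 would then go on the `bond0.200` interface rather than on the physical NICs.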

If you're on Windows 2008 or below, use the Broadcom Advanced Communication Suite (BACS) to do the NIC Teaming and then create a virtual NIC interface and VLAN tag the virtual interface.

If you're using Windows 2012 or 2012 R2, it has built in NIC Teaming that does the same thing, or you can use BACS. It's up to you.

You can set the VLAN on the physical NIC directly from inside Windows without using NIC Teaming, but that's not a proper way to handle networking in a Flex chassis.
If you want to test it, right-click the physical network adapter in the network adapters window in Windows, choose Properties, then at the top of the dialog, choose Configure. Then in one of the tabs (maybe Advanced) you'll find the NIC options, and one of them is the VLAN.

This does not provide redundancy or failure protection, which is why I prefer using NIC Teaming/Bonding.

I hope this helps.

Unknown said...


Thanks Leo, these tips are really helpful.

Unknown said...

Hi,

We are having an issue and would like to seek advice.

The WWN of one of our customer's x240 nodes (INTA3) is not showing up on the CN4093. Earlier it was coming and going, but it has now vanished completely after we replaced the board due to issues, first in a RAM slot and then in the IMM slot.

The configuration of both CN4093 switches is attached for your review. Also, the FoD license was upgraded successfully.

Leo Son said...

Hello Mhtech.

We have four x240 nodes, CN4093 and Flex V7000. We are currently using iSCSI and getting bad performance. Everything is running over vNIC. I want to do some experiments with FCoE on only one x240 node. Is a mixed environment possible?

Eduardo said...

Good afternoon to everyone.


We have a problem configuring a pair of EN4093R switches and a CN4022 card.

The CN4022 cards are configured in switch independent mode.

The problem is with the LPVID configuration on the CN4022: I cannot find the option to set this value on each virtual card.

Has anyone had a similar experience with this?

Best Regards

MBH said...

Leo Son,
Apologies for the late reply. It's better to use iSCSI rather than FCoE, but make sure you follow the redbook (Application Guide) for the best practices.

Since the traffic is all internal, I highly suggest you configure VLAG on the CN4093 switch and connect the 40 Gb ports (if you have the license or unused server slots) or connect at least 2x 10 Gb ports. You can use Direct Attach Cables and save money, instead of buying SFPs and FC cables.

FCoE has a bigger overhead than iSCSI, so your efficiency will drop. If you have the Emulex cards in the x240 nodes, then you can enable the proper NIC Mode to support hardware iSCSI offloading.

As for your last question, yes it is possible to mix the environments, but 1 server can connect only through 1 type: either iSCSI or FCoE, but not both. I do not recommend mixing, since it will complicate the switch configuration a lot and it will be difficult to troubleshoot.

Eduardo,
Apologies for the late reply. The C4022 is a Broadcom card and doesn't have the same option as Emulex cards.

Also, you don't have to set the LPVID per vNIC on the card, since you can configure the VLAN Groups/Tagging on the switch.

Anonymous said...

Hi,

I am new to this area. At my location, I have 4 IBM Flex nodes with CN4052 HBAs, an IBM Flex chassis, 2 10Gb EN4093R FCoE / iSCSI switches and V7000 storage. We are using the 10G iSCSI software initiator for host-to-storage connectivity and everything is working fine.

Now, I want to enable 10G FCoE settings in all Flex hosts to connect the v3700 storage. Could you please help me with how to do FCoE zoning on the EN4093R switch and the HBA settings in the Flex nodes?

Thanks in Advance,

JP Johnson

JP Johnson said...

Hi,

I am new to this area. We have the following at our premises:

1. IBM Flex Chassis - 1 No
2. IBM EN4093R 10G FcoE / iSCSI Switches - 2 Nos
3. IBM Flex Nodes with CN 4052 HBA - 4 Nos
4. IBM V7000 Storage - 1 No.

Here, we are using the 10G iSCSI software adapter / IQN for LUN connectivity from all our servers. Now, we want to use the FCoE option and do zoning on the EN4093R switch. Could you please help me with how to do the zoning and enable the HBA setting in the nodes?

Thanks in Advance,

JP

Osku Äs said...

Dear Sir,
I found this post after upgrading hardware.
Would you have any recollection of how to activate afterwards-installed 10G cards? The V7000 gen1 manuals no longer seem to be available at the IBM URL.

At the moment, cards are not showing up anywhere.

Thank you.

MBH said...

Osku As,

Hello there. I'm not sure what you're referring to. Are you referring to the 10 Gb cards inside the Flex node or the 10 Gb cards inside the V7000 itself?

If inside the Flex, you need to make sure you update the flex nodes to the latest UEFI as well as the firmware of the Converged Network Adapters (CNAs). Then you enter the UEFI and you should find the CNAs there. Make sure their profile is set to FCoE, because maybe after a firmware update, their settings were reset and were set to normal Ethernet profile.

If you mean the 10 Gb NICs inside the V7000, then maybe you need to open a support ticket with IBM, because the firmware of the V7000 should auto detect the storage's interfaces.

Osku Äs said...

I meant the very V7000 node canisters.
It looks like what I try to do is impossible, though; we have 2076-124 and it might have 10GbE support disabled - unlike 2076-324.

dmesg logs on support package show probed be2net kernel module and card entries, but that's as far as any sign of life can be seen on them.

MBH said...

Osku As,

The support is never disabled on the interfaces as far as I understand, but maybe your ports are disabled? Did you try to login from the user interface and go to the port settings and make sure they're enabled, and set on the correct VLAN?

I just checked the IBM product info page here, and it looks like your model doesn't support 10 Gb at all. It supports only 8 Gb FC and 1 Gb Ethernet.

Osku Äs said...

I tried the port settings, alas the ports aren't appearing there at all - so no chance to set enabled nor set VLAN.

Yes, the support page kind of summarizes it; -124s "don't have" 10Gb support. -324s have, although the 324 node looks exactly the same as 124 node - even as in so far, that I have a scrap -324 node, bought without the boot ssd - just taking the boot ssd drive from -124 to the -324 makes the -324 think it's a -124.

These seem to be defined in the boot drive as /data/model and /data/modelid; a -124 has "100" and "8" in them, and I supposed -324 has "300" and "10". That conclusion I drew from /compass/svc_nodetypes.xml from the same ssd bootdrive.

Of course, one's not supposed to go there, even less touch anything.

Osku Äs said...

Likewise, one can see the -124 and -324 difference in /compass/fru.tgz;
100.fru misses ADAPTER_6 -lines that are present on 300.fru.

As a very first ADAPTER_6 line, there's a comment about "10G Ethernet card"

Osku Äs said...

Changing aforementioned /data/model and /data/modelid makes the system recognize 10GbE cards.
Cards configure just fine afterwards.

However, now stuck with Warning 820: wrong type of canisters. System still thinks enclosure is 2076-124, and is slightly unhappy about 2076-324 canisters.

Now I'm wondering how to get rid of this, since "Run Fix" says "Event marked as fixed", and the same error 820 comes straight back.

Osku Äs said...

This might be "decided" on /compass/svc_nodetypes.xml. Possibly telling 2076-124 mtm for node type 10 would suppress warning 820.