Table of Contents
- Pre-Introduction
- Introduction
- Manuals and Files
- Environment
- Hardware
- Software Versions and Firmwares
- Hardware Connectivity and Layout
- Required Papers and Licenses
- BOFM Advanced Tool Download and IBM Director 6.3 Plugin
- IBM Java vs Sun/Oracle Java
- OFM and AMM Tips and Limitations
- Installing on Windows: Server-Client Combined Bundle
- Installing OFM on Linux: Server Process as a Service
- Preparing the Spare Blade Server
- OFM Templates
- Using BOFM Advanced
Pre-Introduction
Initially, the intention of this post was to detail a full implementation of HS22 blade servers, V7000 storage, DS4800 storage, BladeCenter Open Fabric Manager (BOFM) and SUSE Linux Enterprise Server (SLES) 11 SP1 and SP2, but I decided to break it down for easier writing and understanding.
If you do not find what you want here, email me, and hopefully I'll be able to answer you and add your question and its answer to this post for everyone's benefit.
Introduction
IBM BladeCenter Open Fabric Manager (BOFM or OFM) is a feature of IBM's BladeCenter chassis that lets you change the WWNN, WWPN, NIC MAC and virtual adapter WWN/MAC addresses for each blade slot, and assign specific boot targets to each blade slot. This is useful when a blade server fails: another blade can be placed in its slot and powered on, and it will boot the previous server's image from SAN because it takes over that server's WWNN/WWPN/MACs. Booting from SAN is essential; otherwise the solution is almost useless.
This post details the implementation of BOFM v4.1 Advanced and configuring the tool as a service in a Linux environment.
Note: I was tasked to install and configure BOFM on a pre-existing environment that is running in production, so I'll be using the existing WWNs to avoid altering the fabric zoning or the host mappings.
The information has been laid out in order for a reason. Read it all so you don't miss anything.
Manuals and Files
Finding some manuals was hard, and IBM has 2 versions of the 4.1 (June 2011) manual: a public version and one that is included with the BOFM Advanced Utility package. There are slight differences between them.
I've uploaded all files to my account for you to download, to save you the headache of finding them.
- BOFM_IUG_5ed.pdf: 4.1 Manual - came with the ZIP file of the Advanced Utility
- jr1bs_bofm_pdf.pdf: 4.1 Manual - public version
- LicensingFAQ.pdf
- LicensingSteps.pdf
- bladecenter_interoperability_guide_2012-march.pdf
- ofm: wrapper install script
- wrapper.conf: wrapper config file
- BC1template_final.csv: BladeChassis 1 BOFM template
- BC2template_final.csv: BladeChassis 2 BOFM template
You can download all these files from my account. The linked files above point at their original source.
Environment
Hardware
Some items were just purchased and others were refurbished.
- 2x IBM BladeChassis H 8852-4TG
- 19x IBM HS22 blade servers 7870-H2G
- QLogic 8Gbps expansion cards for the HS22 blades (part number 44X1947)
- 9x IBM HS21 blade servers 8853-C2G
- Emulex 4Gbps expansion cards for the HS21 blades (part number 39Y9183)
- Built in NIC cards. No expansions added.
- 4x Advanced Management Modules (AMM) (part number 80Y9080)
- 1x IBM Storwize V7000 storage unit
- 1x IBM DS4800 storage unit
- 4x IBM BNT 1Gbps switch modules (part number 32R1866)
- 4x IBM Brocade 8Gbps SAN switches (part number 44X1924)
Software Versions and Firmwares
Do not update to the latest version available!
Make sure that you apply firmware versions that are listed in the BOFM interoperability guide (compatibility matrix), or even better, check the changelog of the latest firmware and make sure it supports BOFM. If there's no mention, ask your IBM vendor to contact IBM TechLine and verify whether the latest firmware supports BOFM (regardless of the BOFM compatibility matrix).
The BOFM compatibility list isn't always kept up to date, so at the time of your implementation, important bug fixes may have been introduced in firmware newer than what it lists.
The list below does not mean everything was up to date (that wasn't my task). It means that I had these versions during my implementation and they worked for me.
The list has been produced from the AMM's firmware VPD page.
- AMM Firmware BPET62P (62)
- HS22 BIOS: P9E156C (1.17)
- HS22 Diagnostics: DSYT92O (4.01)
- HS22 blade management processor: YUOOD4G (1.32)
- HS22 QLogic UEFI Driver: 2.27
- HS22 QLogic BIOS Driver: 2.09
- HS22 QLogic FCode Driver: 3.14
- HS22 QLogic firmware: 5.03.09
- HS21 BIOS: BCE148BUS (1.21)
- HS21 Diagnostics: BCYT30AUS (1.08)
- HS21 blade management processor: BCBT63A (1.23)
- BNT switches: WMZ04000 (0502)
- Brocade switches: BREFSM (632b)
- BladeCenter Open Fabric Manager Advanced (BOFM) OFM41K
- VMware vSphere 5 update 1
- SUSE Linux Enterprise Server (SLES) 11 SP1 and SP2 (3.2 kernel)
- IBM Java 1.6.0 (what was available in SLES SP2 DVD)
- Windows 2008 R2
- IBM Java 1.6.0 (came with IBM Director v6.3)
- IBM Systems Director v6.3
- SAP ECC 6 (ERP application)
- Wincor (Point of Sale application)
Hardware Connectivity and Layout
The customer has 3 VMware servers (HS22 blades). 2 of which are zoned to the V7000 and 1 is isolated and zoned to the DS4800.
Note: Due to the settings required on the HBA, a server zoned for the V7000 cannot be zoned with a different storage unit (like DS4000, DS5000, DS3000) at the same time.
The customer wanted to use the DS4800 for all systems along with the V7000, but that requires a storage virtualization license on the V7000, which was outside of the customer's budget, so servers were split: SAP Production zoned to V7000, and Development and Quality Assurance servers zoned to the DS4800.
All SAP production servers are HS22 blades. The HS21 blades are used for SAP Dev & QA, and all Wincor systems. Wincor systems are also split between the V7000 & DS4800.
Required Papers and Licenses
You will need access to the papers shipped by IBM: the license activation codes for BOFM Standard and BOFM Advanced. The BOFM Standard license is a requirement for the Advanced license to work.
The BOFM Standard license paper has the following title: "BladeCenter Open Fabric Manager license entitlement information" -- Part number: 2019B1X, and the authorization code is made of 25 characters: ABCDE-ABCDE-ABCDE-ABCDE-ABCDE
The BOFM Advanced license paper has the following title: "Activation Services information" -- Part number: 4812S3X, and the authorization code is made of 12 characters: IBM00000-0000
To get your licenses and download the BOFM Advanced Utility, do the following:
- Create an account for your organization at http://licensing.datacentertech.net (as instructed in the papers)
- Add your BladeChassis type and serial number. Both can be found in the AMM -> MM Control -> License Manager -> select any item & click edit
- Activate the BOFM Standard license by adding your 25-character authorization code to the website above
- Activate the BOFM Advanced license by adding your 12-character activation code to the same website
- Register for support for your BOFM Advanced Edition on http://www.serversoftwaresubscription.com
- Upon registration, you'll be able to download the BOFM Advanced Utility/Tool from the site in the previous step
BOFM Advanced Tool Download and IBM Director 6.3 Plugin
None of IBM's documentation mentions this, but version 6.3 of IBM Systems Director does not have a plugin for BOFM. The plugin has been discontinued and the Advanced tool is now a standalone utility. You do not need any version of IBM Director at all; Director is listed here only because I was tasked to install it.
You can download the latest version of BOFM Advanced utility/tool from http://www.serversoftwaresubscription.com after you register your license.
The direct download link to BOFM Advanced v4.1 (Windows & Linux): http://www.serversoftwaresubscription.com/Downloads/46D0959GMER.zip
You will need to apply the licenses to your chassis (whether temporary or permanent) for the tool to connect to the chassis and function properly.
IBM Java vs Sun/Oracle Java
I was surprised to see that SLES ships with IBM Java rather than Sun's, but it turned out to be for the best. IBM Java has some different implementations and options for system signals (interrupts) and if you use Sun's Java, things may not function as they should.
So my advice to you is to stick with IBM Java and make sure you have version 1.6.0 or higher. I do not know if version 1.7 will work for you. It wasn't available in SUSE's repository, and on Windows, I used the JRE that was shipped with IBM Director.
I initially installed the BOFM Advanced Utility on the same server as IBM Director, which ran Windows 2008 R2, then later moved the setup to a virtual machine running SLES 11 SP2. I'll explain why later.
OFM and AMM Tips and Limitations
These are some tips and limitations of the Open Fabric Manager and the Advanced Management Module of the Blade Chassis:
- Create a separate user for OFM to use, so that if the user is locked, it doesn't lock you out of the AMM.
- The password has a max length of 15 characters. The AMM accepts special characters but the ftp login used by the BOFM tool doesn't, so stick to alpha-numeric characters only.
- The spare blade that other blades fail over onto must have OFM disabled on it.
- Always check the zoning! Make sure the boot targets are the same as zoned in the fabric, otherwise you'll lose connectivity or the links will keep showing as degraded.
- Follow the OFM guide rules on what parameters to set on the AMM.
- Make sure the TCP max commands in AMM under Network Protocols is set to 20 (as the guide says) otherwise it won't let you open the OFM page to apply the settings.
- Configuring SNMP is required. When the tool initially connects to the AMM, it sets the 3rd SNMP server as the OFM server's IP address.
- Use IP addresses instead of hostnames to avoid DNS query delays or to be able to reach the systems in case the DNS system wasn't working.
- If your server has multiple IPs and you'd like to bind the server process to a specific IP, modify the file server.prop found in "C:\ofm\data\" or "/opt/ofm/data/" and change "localIPAddress=localhost" from localhost to the desired IP.
- To connect the OFM Console to a different server, open the file "C:\ofm\data\OFMConsole.prop" or "/opt/ofm/data/OFMConsole.prop" and change "ServerIP=localhost" to the server's IP.
- Use a text editor to edit the template files rather than Excel/Calc because those may add commas to empty lines, which OFM rejects.
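The two property changes from the tips above (binding the server to one IP and pointing the console at a remote server) can be scripted. This is only a minimal sketch: the function name set_prop and the 192.0.2.10 address are placeholders of mine, and it assumes each key sits on its own "key=value" line as quoted above.

```shell
#!/bin/sh
# set_prop FILE KEY VALUE
# Replaces the value on the "KEY=..." line of a .prop file and keeps the
# previous version as FILE.bak. VALUE must not contain "|" or "&" because
# it is passed straight into a sed replacement.
set_prop() {
    file=$1 key=$2 value=$3
    # Refuse to touch the file if the key is not present at all
    if ! grep -q "^${key}=" "$file"; then
        echo "no ${key}= line in $file" >&2
        return 1
    fi
    cp "$file" "$file.bak"
    sed "s|^${key}=.*|${key}=${value}|" "$file.bak" > "$file"
}

# Example calls (paths from the tips above, placeholder IP address):
# set_prop /opt/ofm/data/server.prop     localIPAddress 192.0.2.10
# set_prop /opt/ofm/data/OFMConsole.prop ServerIP       192.0.2.10
```

Restart the server process (or the console) after changing its file so the new value is read.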
Installing on Windows: Server-Client Combined Bundle
The ZIP package has 2 installers: one for Linux and another for Windows. Make sure that Java is installed and that it's on the system PATH environment variable. You can test that by running "cmd" then typing "java -version" -- if it says the command couldn't be found, then Java isn't properly set up; otherwise it'll print the Java version.
After installing the package on Windows, open "cmd", navigate to the installation directory and into the "Combined" directory, then run:
java -jar "C:\Program Files\OFM\V41\combined\OFMcombined.jar"
That will launch the OFM server process then the client interface afterwards which will connect to the same server process. It may take some time (2-5 minutes) and that is normal, especially after you configure the chassis login info.
This is useful for your initial configuration, template design and deployment, and also for testing the fail-over and fail-back features.
The "Combined" package runs the server process together with the client, and as soon as you close the client user interface, it terminates the server process. For permanent monitoring, you'll need to configure the server process to run as a service/daemon, and configure the client to connect to that specific server process.
This guide goes through that step, but on Linux. If you need it on Windows, look up guides online. If you work it out, feel free to contact me and I'll link to your post or add your findings here and credit you for it (obviously).
Installing OFM on Linux: Server Process as a Service
Before we begin, I should explain why I chose Linux, and in this particular case, on VMware. The client has a VMware environment already in place and most of their systems are running Linux. Because OFM Advanced runs as a continuous service, it's of utmost importance to make sure the service is available at all times, and that the server doesn't reboot for updates or whatnot whenever it feels like it (like the default behavior of Windows).
Also, with VMware in place, the customer makes use of VMware's High Availability cluster, so that if one physical server fails, the virtual machine will start automatically on one of the other available systems. This is better than relying on 1 physical server, because if it fails, OFM will no longer function and production systems will be at risk if one of them fails.
Alright! With that explained, let's get to the juicy stuff!
Unpack the ZIP file then make sure the binary is executable: chmod +x OFM41K.bin
Then execute it to start the installation: ./OFM41K.bin
I suggest you install the process as "root" because the process needs to bind to a port.
For some reason, the data directory wasn't created during installation, and I only noticed after running the server process and seeing the errors. So, to create the data directory, run:
mkdir -p /opt/ofm/data/
The installation will deploy the files in the same directory where you ran the installer, or the user's home directory.
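The checks implied by the steps above (Java reachable on the PATH, data directory present) can be wrapped in two small helpers. This is a sketch only; the function names require_cmd and ensure_dir are mine and are not part of OFM.

```shell
#!/bin/sh
# require_cmd NAME: succeed only if NAME can be found on the PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# ensure_dir DIR: create DIR (and its parents) if missing,
# then confirm the current user can write to it.
ensure_dir() {
    mkdir -p "$1" && [ -w "$1" ]
}

# Example use on the OFM server, as root:
# require_cmd java || echo "java is not on the PATH" >&2
# java -version
# ensure_dir /opt/ofm/data || echo "cannot create /opt/ofm/data" >&2
```

Running these before the first start of the server process avoids discovering the missing data directory from the error output.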
If your server has multiple IPs, then follow the instructions in the Tips & Limitations section above to bind the process to a specific IP.
There isn't much to do now apart from configuring the server component to run as a daemon on Linux. For that, I used the Java Service Wrapper (JSW): http://wrapper.tanukisoftware.com/doc/english/download.jsp#stable
Make sure you download the latest stable release, not just the latest one!
Unpack the file anywhere you like. Copy the following files from it and put them in the OFM server directory: {OFM_INSTALLATION_DIR}/usr/OFM/V41/server/
- {WRAPPER_DIR}/bin/wrapper
- {WRAPPER_DIR}/src/bin/sh.script.in
- {WRAPPER_DIR}/conf/wrapper.conf
Rename sh.script.in to ofm: mv sh.script.in ofm
So, now in the OFM "server" directory, you should have the following files: wrapper, ofm, wrapper.conf, and the OFM original files.
Before we start modifying stuff, you need to copy a few library files to OFM's lib directory {OFM_INSTALLATION_DIR}/usr/OFM/V41/lib/:
- {WRAPPER_DIR}/lib/libwrapper.so
- {WRAPPER_DIR}/lib/wrapper.jar
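The copy and rename steps above can be collected into one function. This is a sketch under my own assumptions: the function name stage_wrapper is hypothetical, the first argument is wherever you unpacked the wrapper archive, and the second is the OFM directory that contains server/ and lib/ (the .../usr/OFM/V41 directory mentioned above).

```shell
#!/bin/sh
# stage_wrapper WRAPPER_DIR OFM_DIR
# Copies the Java Service Wrapper files into the OFM installation.
stage_wrapper() {
    wrapper_dir=$1 ofm_dir=$2
    # Launcher binary, service script (renamed to "ofm") and config file
    # go next to the OFM server files
    cp "$wrapper_dir/bin/wrapper"          "$ofm_dir/server/wrapper"      || return 1
    cp "$wrapper_dir/src/bin/sh.script.in" "$ofm_dir/server/ofm"          || return 1
    cp "$wrapper_dir/conf/wrapper.conf"    "$ofm_dir/server/wrapper.conf" || return 1
    # Native library and wrapper classes go into OFM's lib directory
    cp "$wrapper_dir/lib/libwrapper.so" "$wrapper_dir/lib/wrapper.jar" "$ofm_dir/lib/" || return 1
    # Both the launcher and the service script must be executable
    chmod +x "$ofm_dir/server/wrapper" "$ofm_dir/server/ofm"
}

# Example call; both paths are placeholders for your own locations:
# stage_wrapper /path/to/unpacked/wrapper /path/to/usr/OFM/V41
```

Copying sh.script.in directly to the name ofm does the rename in the same step.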
The last thing to do now is to change the wrapper.conf file to match OFM's requirements and use the Wrapper Class to run it as a service: Download my copy of wrapper.conf and the "ofm" script, and either overwrite your wrapper.conf or modify it manually. Whatever you feel like, but make sure you read the entire file to see if you want to enable a certain function that I hadn't enabled for my setup (like notifications).
You may make the same mistake I did, so I'll explain a few lines of that config:
- "wrapper.java.mainclass=org.tanukisoftware.wrapper.WrapperSimpleApp" should be as is and not be replaced with the name of the OFM class. The wrapper will use its own SimpleApp class to implement the service.
- Keep the libraries listed in the same order. I followed the same order as the OFM jar file's meta data.
- "wrapper.app.parameter.1=com.ibm.ofm.server.OfmServer" This points at the OFM class (found inside the OFM jar file).
- "wrapper.app.parameter.2=-c" This runs OFM in console mode.
If you do not configure this, you'll get an exception: java.lang.IllegalArgumentException: Signal already used by VM: INT
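Since these three lines are easy to get wrong, a quick check of wrapper.conf before registering the service can help. The following is a sketch; check_wrapper_conf is a name I made up, and it only looks for the three settings quoted above, nothing else in the file.

```shell
#!/bin/sh
# check_wrapper_conf FILE
# Succeeds only if FILE contains each required setting as an exact,
# uncommented line. Trailing spaces or Windows line endings count as a mismatch.
check_wrapper_conf() {
    conf=$1 rc=0
    for line in \
        'wrapper.java.mainclass=org.tanukisoftware.wrapper.WrapperSimpleApp' \
        'wrapper.app.parameter.1=com.ibm.ofm.server.OfmServer' \
        'wrapper.app.parameter.2=-c'
    do
        # -x matches the whole line, -F treats the pattern as a fixed string
        if ! grep -qxF -- "$line" "$conf"; then
            echo "missing or different: $line" >&2
            rc=1
        fi
    done
    return $rc
}

# Example call from the OFM "server" directory:
# check_wrapper_conf ./wrapper.conf && echo "wrapper.conf looks right"
```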
O.K.! Almost done, now what's left is to register it as a Linux service:
Run the script file that you renamed to "ofm" like this: ./ofm install
If all goes well, it'll register a new service called "ofm" and you can verify it with: chkconfig ofm -l
The output should be something like this:
ofm 0:off 1:off 2:on 3:on 4:on 5:on 6:off
If you get an error when trying to run the ofm script, make sure it's executable: "chmod +x ./ofm" (without the double quotes), then try to run the install command above again.
You should now be able to start & stop the service manually, and it will start automatically whenever the server starts: "service ofm start" / "service ofm stop" / "service ofm status"
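If you want to check the chkconfig output from a script rather than by eye, the line shown above can be parsed like this. It's a sketch that assumes the "N:on"/"N:off" format printed above; runlevel_on is a hypothetical helper name.

```shell
#!/bin/sh
# runlevel_on "CHKCONFIG_LINE" LEVEL
# Succeeds if the chkconfig output line reports runlevel LEVEL as "on".
runlevel_on() {
    # Collapse tabs and repeated spaces so fields are separated by one space
    line=$(printf '%s' "$1" | tr -s '[:space:]' ' ')
    case " $line " in
        *" $2:on "*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example use on the OFM server (needs chkconfig and the registered service):
# line=$(chkconfig ofm -l)
# for level in 3 5; do
#     runlevel_on "$line" "$level" || echo "ofm is not enabled in runlevel $level" >&2
# done
```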
Preparing the Spare Blade Server
In my case, I had 2 spare blades: HS22 and HS21, because the customer was using production systems on both models. Systems running on HS22 will fail over to the HS22 and HS21 systems onto the HS21. This keeps things simple in terms of configurations and we avoid any sort of hardware conflicts.
I haven't tried mapping QLogic WWNN/WWPN onto an Emulex card. It should work as long as you configure everything properly (the cards accept both WWNN & WWPN values and replace the defaults with them), but I don't recommend this at all.
- You need to make sure that you do not configure the spare blade for OFM. If you have, then disable OFM on it in the config then reapply the config. OFM will not fail over to a blade that has OFM enabled on it.
- Enter the BIOS of the spare blade and change the boot sequence to: Legacy Mode, then Hard Disk 0. Legacy must be the first entry in the entire list.
Your spare blade is now ready to boot any OS from SAN, whenever a blade is failed over to it.
OFM Templates
The OFM templates are available in the Manuals and Files section above to see how I set up the servers for my client's specific environment. The MAC addresses & WWN numbers are the client's and I got permission to use them here as is.
Read the BOFM manual(s), as they properly describe how to make the templates, the available options and what each option means. So, I'll skip describing that in this post.
As I mentioned in the introduction, this implementation of BOFM was for an environment that is in production, and changing the WWNs/WWPNs of each server in the storage and the SAN switches would have required A LOT of downtime, which the client wasn't willing to accept, nor is it required in the first place.
BOFM generates its own WWNs & MAC addresses but you don't have to use those. You can change the addresses to anything you want and that's what I did: I changed the addresses in the templates to match the existing servers' addresses, which meant that we do not have to change anything in the SAN fabric nor the storage host mappings.
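To reuse existing addresses you first have to collect them. On a running Linux host, the Fibre Channel names can be read from sysfs; the sketch below prints them in colon-separated form. The function name format_wwn is mine, and whether your template expects colons or another notation is something to check against the manual and the generated CSV.

```shell
#!/bin/sh
# format_wwn HEX: turn the sysfs form (0x followed by 16 hex digits)
# into colon-separated byte pairs.
format_wwn() {
    printf '%s\n' "$1" | sed -e 's/^0x//' -e 's/../&:/g' -e 's/:$//'
}

# Each Fibre Channel port appears as /sys/class/fc_host/hostN on Linux.
for host in /sys/class/fc_host/host*; do
    [ -r "$host/port_name" ] || continue   # no FC HBA here: nothing is printed
    wwnn=$(format_wwn "$(cat "$host/node_name")")
    wwpn=$(format_wwn "$(cat "$host/port_name")")
    echo "$(basename "$host") WWNN=$wwnn WWPN=$wwpn"
done
```

The MAC addresses can be read the same way from /sys/class/net/<interface>/address.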
After applying the BOFM templates, you have to restart the server for BOFM to be enabled on the blade servers and for the WWN/MAC to function properly, even if you kept the same addresses.
BOFM allows you to configure multiple WWNs/MACs and even ones for virtual NIC adapters, but in my case, I only needed to configure 2 WWNs/WWPNs & 2 MACs per blade server.
Here's a screenshot of the Advanced Management Module (AMM) after applying BOFM templates:
Here's a screenshot showing the IBM Storwize V7000's identity after applying the BOFM (meaning that the server is seeing the storage properly):
Using BOFM Advanced
Step 1: Run the client you installed and configured to connect to the OFM server process.
Step 2: Click on the Inventory tab, right click and choose Host Discovery then fill in the info.
If the tool doesn't automatically fetch the inventory, right click the newly added chassis host and select Get Inventory. You can monitor the progress in the events window at the bottom and wait till it's done.
Make sure you save the user/pass into the tool otherwise the failover monitor will not have access to the chassis.
Step 3: Click on the Templates tab. Here you'll create the templates and deploy them.
Step 4: You'll need to create them in order. The Address Manager is the CSV template I have attached above. You can import the CSV template after modifying it, too.
You can either use the templates I provided above or generate ones from the tool then modify them. If you're using a version of OFM newer than 4.1, it's better to generate new ones and modify them to make sure they're compatible.
Make sure you add the MAC addresses and not just the WWNs, otherwise an OS will treat the adapters as if they're new ones and you'll lose the IP configuration.
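If a template has already been through Excel/Calc, the comma-only lines mentioned in the tips section can be stripped before importing it. A minimal sketch, with strip_comma_lines being a helper name of my own; it leaves genuinely empty lines and all data rows untouched.

```shell
#!/bin/sh
# strip_comma_lines IN OUT
# Copies IN to OUT, dropping lines that contain nothing but commas,
# optionally followed by whitespace or a Windows carriage return.
strip_comma_lines() {
    sed -E '/^,+[[:space:]]*$/d' "$1" > "$2"
}

# Example call; the output file name is a placeholder:
# strip_comma_lines BC1template_final.csv BC1template_clean.csv
```

Compare the two files with diff afterwards to confirm that only the comma-only lines were removed.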
Step 5: After adding the proper info (WWNs/MACs), create the Standby Pool template. This does not require a CSV file and it'll deploy into the screen directly.
Here you will select the Failover rules for the spare blade and which spare blades to use. You can select multiple spare blades from multiple chassis.
Because the HS22 blades have QLogic cards and are configured to use the V7000, and the HS21 blades have Emulex cards and are configured to use the DS4800, I created two separate standby pools: one for V7000/HS22 and one for DS4800/HS21.
Later when selecting the blades to protect (enable failover protection for), a standby pool must be selected, which means the blades in that pool must be compatible with the blade being protected and configured to use the same storage.
In my setup, I used only 1 spare blade.
Step 6: Now it's time to configure the Failover Monitor(s). Each monitor associates one or more blades with one pool.
Select one or more blades that you'd like to be monitored for failures, choose which of the following failure types to monitor, then click Save:
- Power Off
- CPU Failure
- Blade Communication Errors
- Blade Removal (from its slot in the chassis)
- Hard disk failure
- Blade Denied Power (if one or more power supplies were dead)
- Memory Failure
- Voltage Warnings (happens with faulty motherboards)
- PFA Events (Predictive Failure Analysis reports events that will eventually lead to hardware failure)
Select the Standby Pool and the restrictions that control how the spare blade to fail over to is chosen.
Click Finish to save the Failover Monitor template.
You can add/remove spares from the Standby Pool at any time, and I recommend you re-do the Failover Monitor after modifying the pool.
The Failover Monitor can be paused and resumed to prevent any failovers during maintenance windows.
Manual failover is also possible. First pause the Failover Monitor, then right click the Standby Pool, choose Manual Failover and choose a blade to fail over to. Once you do that, OFM will shut down the main blade then boot up the spare.
Once you're done testing, you can revert the settings (since OFM wrote the WWN/MAC to the spare) by manually failing over to the original blade.