Tuesday, March 29, 2011

PXE Installation of VMware ESXi 4.1

Introduction

Installing ESXi on multiple hosts at the same time over the network is achieved through PXE booting. Unfortunately, many of the guides I found online take a long route to set up PXE booting on Linux: they install multiple programs, each with its own config file, which complicates matters.

I chose to use DNSmasq because it provides DHCP, DNS, PXE & TFTP services all in one program. In addition, its developer, Simon, kindly added a feature that assigns IPs sequentially rather than based on the MAC address. Read here for details.

This mini-guide assumes the use of Linux. If you're a Windows user, I suggest the 3C Daemon tool from 3Com, which offers DHCP, FTP, TFTP & PXE services for Windows.

I have set up a virtual machine dedicated to PXE booting & installation to make it portable and easy to share with others. Feel free to run your tests on a VM or a physical box.

Requirements

  • Linux OS. My choice was Debian.
  • VMware ESXi Hypervisor ISO file.
  • Internet connection.
  • pxelinux.0 file from syslinux version 3.
  • Chocolate chip cookies. mmmmm.

Installation & Configuration

0] Install the operating system (Debian) and set up a static IP on the NIC.
1] Edit the file: /etc/network/interfaces -- My editor of choice is nano.
auto lo
iface lo inet loopback

allow-hotplug eth0
iface eth0 inet static
   address 10.172.0.250
   netmask 255.255.255.0
   gateway 10.172.0.254

2] Run the command: service networking restart
Note: During the initial setup, use an IP that matches your own network so that you can download the packages. Once you're done with this guide, change it to the address above to avoid conflicts with any network.

3] Install apache and dnsmasq: apt-get install apache2 dnsmasq
4] Edit: /etc/dnsmasq.conf
dhcp-range=10.172.0.1,10.172.0.100,255.255.255.0,infinite
dhcp-option=66,10.172.0.250
dhcp-option=67,"pxelinux.0"
dhcp-boot=/srv/tftp/pxelinux.0
enable-tftp
tftp-root=/srv/tftp

Note 0: The IPs above do not need to match your network.
Note 1: "infinite" is the lease time. The ESXi installer sends a DHCP lease release when it finishes, which would allow the IP to be handed to another host. I did not want that to happen because I have scripts assigning hosts their IPs sequentially.
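
Because the netmask in dhcp-range is 255.255.255.0, the start and end addresses should share their first three octets. The following is a minimal sketch of a sanity check for that, assuming a POSIX shell. The function name check_dhcp_range is hypothetical and is not part of dnsmasq.

```shell
# check_dhcp_range CONF
# Hypothetical helper (not a dnsmasq tool): reads the first dhcp-range line
# in CONF and checks that the start and end addresses share the same first
# three octets, as a 255.255.255.0 netmask implies.
check_dhcp_range() {
    conf=$1
    line=$(sed -n '/^dhcp-range=/{p;q;}' "$conf")
    if [ -z "$line" ]; then
        echo "no dhcp-range line in $conf" >&2
        return 2
    fi
    range=${line#dhcp-range=}
    start=${range%%,*}      # first comma-separated field
    rest=${range#*,}
    end=${rest%%,*}         # second comma-separated field
    if [ "${start%.*}" = "${end%.*}" ]; then
        echo "ok: $start - $end"
    else
        echo "mismatch: $start and $end are not in the same /24" >&2
        return 1
    fi
}
```

Usage: check_dhcp_range /etc/dnsmasq.conf before restarting the service. It only looks at the first dhcp-range line and only makes sense for a /24 netmask.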

5] Create the directory structure: mkdir -p /srv/tftp/pxelinux.cfg

6] Download syslinux v3, extract pxelinux.0 & put it in /srv/tftp: wget <URL>
7] Extract the files: tar -xf <File name>
8] Copy pxelinux.0: cp ./syslinux-3.86/core/pxelinux.0 /srv/tftp/
9] Create PXE boot file: nano /srv/tftp/pxelinux.cfg/default and edit it:
default esxi_scripted
label esxi_scripted
   kernel vmware/esxi411/mboot.c32
   append vmware/esxi411/vmkboot.gz ks=http://10.172.0.250/ks.php --- vmware/esxi411/vmkernel.gz --- vmware/esxi411/sys.vgz --- vmware/esxi411/cim.vgz --- vmware/esxi411/ienviron.vgz --- vmware/esxi411/install.vgz

prompt 0
timeout 10

Note: Make sure all of the append parameters are on one line. They may appear wrapped here because of the narrow page width.
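
A wrapped append line is easy to miss after copying and pasting. The sketch below is a rough heuristic, not a real pxelinux parser: it flags any line whose first word is not one of the directives used in this guide. The function name check_pxe_cfg is hypothetical.

```shell
# check_pxe_cfg FILE
# Hypothetical helper: reports lines whose first word is not one of the
# directives used in this guide (for example a wrapped "append"
# continuation starting with "---"). Returns 1 if any were found.
check_pxe_cfg() {
    awk '
        NF == 0    { next }                 # skip blank lines
        $1 ~ /^#/  { next }                 # skip comments
        tolower($1) !~ /^(default|label|kernel|append|prompt|timeout)$/ {
            printf "line %d looks like a wrapped continuation: %s\n", NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Usage: check_pxe_cfg /srv/tftp/pxelinux.cfg/default. If you use other pxelinux directives in your file, add them to the list in the pattern.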

A] Edit: /var/www/ks.php
accepteula
rootpw password
autopart --firstdisk --overwritevmfs
install url http://10.172.0.250/vmware/esxi411
network --bootproto=dhcp --device=vmnic0
reboot

The above is a kickstart script which the ESXi installer will execute. These are the defaults that are found in the PXE Guide by VMware.
Note: This will install to the first disk detected by the BIOS and will overwrite existing VMFS filesystems.

File Preparation

B] mkdir -p /srv/tftp/vmware/esxi411
C] Copy the contents of the ISO file to the directory above. You can mount an ISO by: mount -o loop /path/to/isofile /mnt. The files will be in /mnt: cp -Rv /mnt/* /srv/tftp/vmware/esxi411/
D] Link to vmware directory: ln -s /srv/tftp/vmware /var/www/vmware
E] service dnsmasq restart

By now, things should be good to go!
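
Before booting a host, it can help to confirm that the files referenced earlier in this guide are where the PXE client will look for them. This is a minimal sketch assuming the paths used above; the function name check_pxe_tree is hypothetical.

```shell
# check_pxe_tree [TFTP_ROOT]
# Hypothetical helper: checks that the files this guide relies on exist
# under the TFTP root (default: /srv/tftp). Returns 1 if any are missing.
check_pxe_tree() {
    root=${1:-/srv/tftp}
    missing=0
    for f in pxelinux.0 pxelinux.cfg/default vmware/esxi411/mboot.c32; do
        if [ -f "$root/$f" ]; then
            echo "found    $root/$f"
        else
            echo "MISSING  $root/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: run check_pxe_tree with no arguments on the PXE server. It only checks that the files exist, not that their contents are valid.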

Caveats

  • If you set this up in a VM, keep the NIC disabled by default to avoid wiping systems by mistake and broadcasting DHCP over the LAN.
  • If using a VM, the physical adapter must have a static IP rather than one assigned by DHCP.
  • To reset the list of leases: echo "" > /var/lib/misc/dnsmasq.leases
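
The lease reset can also be wrapped in a small function. This is a sketch under one assumption worth verifying on your system: as far as I know, dnsmasq keeps its leases in memory and rewrites the file while running, so the file should be emptied while the service is stopped. The function name reset_leases is hypothetical.

```shell
# reset_leases [LEASE_FILE]
# Hypothetical helper: truncates the lease file to zero bytes
# (default: /var/lib/misc/dnsmasq.leases). Unlike `echo "" > file`,
# this does not leave a blank line behind.
reset_leases() {
    leases=${1:-/var/lib/misc/dnsmasq.leases}
    : > "$leases"
}
```

Usage: service dnsmasq stop, then reset_leases, then service dnsmasq start.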

This shows a very basic, default setup for installing ESXi on multiple boxes. Hopefully I'll have the time to post my custom scripts that integrate into the kickstart script to auto-assign IPs, VLANs, and a few more tasks to streamline the installation.

Sunday, March 13, 2011

BarCamp Kuwait Two

The 2nd BarCamp is being planned, and we're looking for people interested in presenting so that we can make our reservations. You're welcome too if you'd just like to attend!

Details about the event: http://goo.gl/DuLTu -- Please make sure you fill in the form at the end of the page!

What is BarCamp? An ad-hoc gathering where people present their projects and experience in the IT field. We usually have key speakers and then The Grid, where other speakers arriving at the event reserve a slot to give their talk. Slots are first come, first served.

Saturday, March 12, 2011

DNSmasq Offers Sequential IP Addressing

A few days back, at my request, Simon, the developer of DNSmasq, added an option for DNSmasq to serve IPs sequentially rather than based on a hash of the MAC address.

He was kind enough to implement it, allowing me to use DNSmasq as a DNS, DHCP, PXE & TFTP daemon for my VMware ESXi automated deployments.

This feature is available in version 2.58 test 4. I tested it on a bunch of virtual machines simulating an ESXi installation and everything went smoothly.

I have to note: If you're using it for ESXi deployments, you may want to set the lease expiration time to infinite because the ESXi installer sends a lease release after the installation is done, causing subsequent hosts to get the same IP.
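
The post above does not name the option. In the dnsmasq man page it is documented as dhcp-sequential-ip, so that is the name assumed in this sketch. The function below, whose name enable_sequential_ip is hypothetical, appends the option to a config file only if it is not already there.

```shell
# enable_sequential_ip [CONF]
# Hypothetical helper: appends "dhcp-sequential-ip" to CONF
# (default: /etc/dnsmasq.conf) unless that exact line already exists.
# The option name is taken from the dnsmasq man page, not from this post.
enable_sequential_ip() {
    conf=${1:-/etc/dnsmasq.conf}
    if grep -qx 'dhcp-sequential-ip' "$conf" 2>/dev/null; then
        echo "already enabled in $conf"
    else
        echo 'dhcp-sequential-ip' >> "$conf"
        echo "enabled in $conf"
    fi
}
```

Usage: run enable_sequential_ip, then restart dnsmasq. Combine it with an infinite lease time in dhcp-range for ESXi deployments, as noted above.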

Big thanks go to Simon!

Friday, March 4, 2011

VMware ESXi 4.1 on IBM BladeCenter with Nortel Switches

Update: We resolved the issue permanently and I now understand what was going on, but I have little time to post everything right now. If you're in a hurry and want help, leave a comment or email me.

One of our customers purchased two BladeCenter H Chassis for deploying VMware on them. Each chassis came with two Nortel switches providing 2 internal ports and 9 external ports. The external network ports are: 3x 10Gbit and 6x 1Gbit.

The customer has a physically isolated DMZ network, so one 1Gbit port from each switch was dedicated to a DMZ switch (VLAN1).

The client had purchased only 2x 10Gbit SFPs, so the third port was left empty and is not used in this setup.

To make use of VMware's Virtual Switch Tagging (VST) network concept, the ports on the switches that the blades connect to must be configured as trunks that allow the required VLANs to pass. The Nortel switch in the BladeCenter must also be configured to pass those VLANs on both its external and internal ports.

The following VLANs were created:

  • Management
  • vMotion
  • Fault Tolerance (FT)
  • Virtual Machines

After configuring the external and internal ports of the Nortel switch to be part of those VLANs, a strange problem popped up: I couldn't ping or reach any of the ESXi hosts in any way, unless I pinged my workstation from within the ESXi server first!

To make matters clear, here's how things were connected:
My workstation -> Server Farm Switch
BladeCenter -> Server Farm Switch

Ping from workstation to any ESXi host: Fails
After 1 ping from an ESXi host to my workstation: Succeeds, and all pings from my workstation to that specific ESXi host go through.

Also, even after traffic is established and I connect using vSphere Client, it disconnects me after about 15 minutes and I can no longer communicate with that host until I ping my workstation from that host again!

After poking around for hours, the solution was to take one external port (1Gbit) out of all VLANs except 1. That is, that port must not belong to any VLAN except VLAN1 (untagged). Doing so allowed us to communicate with all servers smoothly.

I still don't understand why that worked, or whether traffic is now passing through the tagged external ports or through that specific untagged port. I'll investigate further next week and update this post.