Tuesday, March 26, 2013

OpenQRM for ESXi management? A road trip approach!

Recently I was pointed towards OpenQRM for hypervisor management (I realize it's good for a lot more than just that). Wondering what it could do, I went ahead and set up a PoC... with a surprising outcome. Please keep in mind that I have not read the OpenQRM documentation (yet), which I'm sure is extensive and thorough.

In keeping with the blog title, I shall elaborate a little on the test lab "infrastructure". Well, infrastructure is a bit over the top: it's all being run from an aging laptop. The laptop in question is a Lenovo ThinkPad T400, circa 2009, powered by an Intel Core 2 Duo P8700 CPU (not the best choice when it comes to virtualizing hypervisors, I tell ya!), 8GB RAM and a WD3200BEVS-0 (320GB Western Digital notebook drive, 5400 rpm, 8MB cache). You can see where this is heading. Running Ubuntu 12.04 LTS, it's been serving me quite well as my day-to-day laptop for internet, mail, remote work and small virtualization projects on top of VMware Workstation, VirtualBox and so forth. Recently, however, I went a little further.

The PoC


At first I went ahead and installed ESXi 5.1 in VMware Workstation 9, just to see what would happen. In a nutshell, it works and will allow you to run nested 32-bit VMs.

A few years ago I had set up a small PoC of vCenter 4.1 connecting to a 2-node Oracle 10g Real Application Cluster (RAC), including two ESXi 4.1 server VMs, all running inside VirtualBox, spread across two hosts with 4GB RAM each. Why would I do such a thing? Well, one reason was the so-called BIC factor. BIC == Because I Can. Labs are fun. The other was that in a client project we were seeing some DB constraints and I wanted to see how vCenter would behave alongside a load-balanced DB cluster. I later used the same cluster setup to connect the (back then) brand new HP Quality Center 11, only to find a minor flaw in their documentation, but that's a different story. It worked just nicely, but was hellishly slow. Both hosts were swapping like mad, and the small Debian VM providing one iSCSI LUN to the ESXi hosts for persistent storage and one to the RAC for its DBs was screaming at me to terminate it; it was in major pain. But the main outcome was that it did in fact work!

So in order to have a bunch of hosts for OpenQRM to connect to, I set up a complete lab, starting with a Win2K8R2 64-bit VM to act as Active Directory controller, DNS server, vCenter Client and PowerCLI terminal. 1GB of RAM is more than enough; I've come to realize that Windows boxes are not so bad on memory consumption either. Secondly, I deployed the vCenter Server Appliance, only to run into a few pitfalls. Using Tom Fojta's blog and Duncan Epping's Yellow-Bricks for pointers, I went ahead, deployed it and downsized it to 2GB right away.

If you do that, watch out for the following:
- setting the hostname/FQDN using the admin web GUI might fail
- recreating SSL certificates to reflect the hostname/IP address change might fail
- Active Directory setup might fail

I suggest running the appliance with lots of RAM at first; swapping won't be too bad at this stage and you will be able to configure it fully in due time. Once you have made sure that everything works the way you want it, go ahead and downsize to no less than 2GB RAM. That's about the break-even point at which the swapping inside the VM gets so bad that it's virtually unusable (load dropping from 10 to about 6 after 20(!!!) minutes of uptime).

Back on topic: the next thing was to get Auto Deploy up and running. I had a DHCP/BOOTP server VM from a different PoC lying around, which was even preconfigured to serve ESXi 5.1 images, but I decided to go with the TFTP server on the vCenter appliance and used the VM for DHCP only (I know the appliance comes with its own DHCP server). I fired up the Auto Deploy service, created a VM for ESXi, generated a MAC address and added it to the DHCP config so it would be assigned to the right group and network range, and gave it a go... the VM's PXE client came up and complained that there was no server profile. So next I went to the AD server and, following the Auto Deploy Proof of Concept setup guide, added my ESXi image profile and my Auto Deploy rules, and then deployed my first host.

Once it was up and running I went through the painfully slow process of creating and adjusting a host profile; by then my laptop was "swapping like a pig", so to speak. Suffice it to say the host profile editor in the vCenter Client will not work in this setup. It will fail with "vcenter server took too long to respond". The vSphere web client can still be used, although I would like to keep it out of the setup as it adds a minimum of 800MB of memory consumption on the vCenter appliance and some unknown amount of resource consumption on the Flash plugin side (yes, the client browser is on the same laptop).

After some fiddling around with the host profiles the hosts came up just fine: keyboard layout German, reflecting my physical keyboard on the host, root password set to 'start#123' (this one is actually important and I will get to it in a few moments), NTP configured, up and running, cluster in vCenter configured and hosts being added to it by FQDN, the whole shebang.
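For the DHCP step mentioned above (generating a MAC address for the ESXi VM and adding it to the DHCP config), here is a minimal Python sketch. It assumes an ISC dhcpd style config, which the post does not actually name, and picks the MAC from the range VMware reserves for manually assigned addresses (00:50:56:00:00:00 to 00:50:56:3F:FF:FF). The host name `esxi01` and the IP address are made up for illustration.

```python
import random

# First three bytes of VMware's prefix for manually assigned MAC addresses.
VMWARE_OUI = (0x00, 0x50, 0x56)


def random_vmware_static_mac(rng=random):
    """Return a random MAC inside 00:50:56:00:00:00 - 00:50:56:3F:FF:FF."""
    tail = (
        rng.randint(0x00, 0x3F),  # fourth byte is capped at 0x3F for static MACs
        rng.randint(0x00, 0xFF),
        rng.randint(0x00, 0xFF),
    )
    return ":".join(f"{byte:02x}" for byte in VMWARE_OUI + tail)


def dhcpd_host_entry(name, mac, ip):
    """Render a host declaration in ISC dhcpd syntax (assumed DHCP server)."""
    return (
        f"host {name} {{\n"
        f"    hardware ethernet {mac};\n"
        f"    fixed-address {ip};\n"
        f"}}\n"
    )


if __name__ == "__main__":
    # Hypothetical host name and address, for illustration only.
    mac = random_vmware_static_mac()
    print(dhcpd_host_entry("esxi01", mac, "192.168.10.21"))
```

The generated stanza would go inside whatever group or subnet block already carries the PXE boot options, and the same MAC has to be set manually on the VM's network adapter.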

Introducing OpenQRM

Luckily, at this point the OpenQRM VM was already set up. I was seeing some unpleasant load situations on this poor laptop already...

ronald@thinkpad:~$ uptime 
 16:54:28 up 3 days, 18:39,  5 users,  load average: 22.81, 19.91, 13.92
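
For a sense of scale, the load figures can be put in relation to the number of cores. The sketch below is only an illustration: it parses a line in the format shown above and divides by two, since the P8700 is a dual-core CPU. The helper names are made up for this example.

```python
def parse_load_averages(uptime_line):
    """Extract the 1, 5 and 15 minute load averages from an uptime line."""
    _, marker, tail = uptime_line.partition("load average:")
    if not marker:
        raise ValueError("no 'load average:' field found")
    return tuple(float(value) for value in tail.split(","))


def load_per_core(load, cores):
    """Load average divided by core count; above 1.0 means work is queuing."""
    return load / cores


if __name__ == "__main__":
    line = " 16:54:28 up 3 days, 18:39,  5 users,  load average: 22.81, 19.91, 13.92"
    one_min, five_min, fifteen_min = parse_load_averages(line)
    # The Core 2 Duo P8700 has two cores.
    print(f"1-minute load per core: {load_per_core(one_min, 2):.2f}")
```

Anything well above 1.0 per core means processes are waiting for CPU or disk, which fits the swapping described here.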

The VM in question is a Debian 64-bit minimal installation with OpenSSH server, 512MB RAM and 1 CPU. It was actually provisioned first, but with the VMs still to come in mind I wanted to find a compromise between what OpenQRM might require and what I could spare. The installer is fully automatic: you just download a tgz of less than 3 MB, unpack it and run it. It is targeted at Debian and CentOS distributions and will install required dependencies automatically, download boot images for server provisioning and pretty much do a bang-up job of configuring itself. Props to the OpenQRM guys for this fantastic installer! Get it fired up, then get yourself a coffee, lean back and watch the show (or do something more sustainable in the meantime, a half-hour run on the treadmill for instance).

Once the install was finished I logged into the GUI, enabled a few plugins at random, including VMware and its dependencies, and started playing around a little. The VMware plugin in OpenQRM will scan the network for available ESX servers and will then ask you to provide login credentials so the hosts can be added to its environment and be managed by it. While the discovery was running, my browser kept asking me whether I wanted to terminate the script, as it seemed to be running for too long.

First I'd like to note that it did indeed discover both ESX hosts and the vCenter. Well done.

The next thing was to supply login credentials, and that's where things went a little sideways. Entering my trusty and all-time favorite password 'start#123', I was greeted with an error message saying that only the following characters were supported:

[A-Za-z1-0]

That was a major disappointment and an instant mood kill!!! Having gone to the aforementioned lengths to set up the environment only to find this out was reason enough to finish up for the day and have dinner. Still, I didn't give up hope yet. I went away and re-enabled the vSphere web client, went to the host profile editor and changed the ESX hosts' passwords to a more suitable 'Start123', which, given the limited resources, took me about 45 minutes. By then I was being pushed to finally join the rest of the family at the dinner table... and thus the story ends here.
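
To check a password against that restriction before typing it into the plugin, a small sketch like the following would do. It is not OpenQRM's actual validation code, which I have not looked at; it assumes the character class in the error message is meant as letters and digits only, i.e. `[A-Za-z0-9]`, since `1-0` is not a valid range when taken literally.

```python
import re

# Assumed intent of the error message: ASCII letters and digits only.
ALNUM_ONLY = re.compile(r"[A-Za-z0-9]+")


def is_accepted(password):
    """True if the password is non-empty and consists only of letters and digits."""
    return ALNUM_ONLY.fullmatch(password) is not None


def rejected_characters(password):
    """Sorted list of the distinct characters that fall outside the allowed set."""
    return sorted({ch for ch in password if not ALNUM_ONLY.fullmatch(ch)})


if __name__ == "__main__":
    for candidate in ("start#123", "Start123"):
        verdict = "ok" if is_accepted(candidate) else "rejected"
        print(candidate, verdict, rejected_characters(candidate))
```

With the two passwords from this post, 'start#123' is rejected because of the '#', while 'Start123' passes.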

Hope you enjoyed reading my first tech blog post as much as I enjoyed writing it. I shall add a few more in the days to come and talk a little about the things I'm doing. Maybe I'll help someone along the way. Or maybe it's just a brain dump for myself. In any case, I hereby welcome myself to the tech blogosphere! :)

Cheers, Ronald!
