VMware Training

Not sure if this list is the best place, but it is probably the only list that I'm on that won't give me a bunch of grief about the chosen technology.

I looked at VMware's site, and there are a ton of options. I'm wondering if anyone has some basic suggestions or experiences.

I'm a Linux admin by trade (RH based), with "ok" networking ability. I'm sufficiently versed in deploying scripted ESXi (including 5.x) installations for a specific environment, including vSwitches/SAN config (but only with NFS datastores backed by a NetApp; unfortunately, no block-based stores).

I'd like to get experience deploying vCenter clusters, down to DRS/HA config, other block-based storage, and anything else a large environment needs.

Thoughts or experiences?

> Not sure if this list is the best place, but it is probably the only list
> that I'm on that won't give me a bunch of grief about the chosen technology.
>
> I looked at VMware's site, and there are a ton of options. I'm wondering
> if anyone has some basic suggestions or experiences.
>
> I'm a Linux admin by trade (RH based), with "ok" networking ability. I'm
> sufficiently versed in deploying scripted ESXi (including 5.x)
> installations for a specific environment, including vSwitches/SAN config
> (but only with NFS datastores backed by a NetApp; unfortunately, no
> block-based stores).

If you want block storage, just export an iSCSI device to the ESXi machines
(tgtadm on Red Hat is all you need, plus a few gigs of free space). VMFS is
cluster-aware, so you can export the same volume to independent ESXi hosts,
and as long as you don't access the same files, you're good to go.
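For reference, here is a minimal sketch of that setup on a RHEL/CentOS 6 box
with scsi-target-utils; the IQN, backing-file path, LUN size, and the vmhba33
adapter name are made-up lab values, and binding to ALL initiators is only
sane on an isolated lab network:

    # Back the LUN with a ~20 GB sparse file
    mkdir -p /srv/iscsi
    dd if=/dev/zero of=/srv/iscsi/lun0.img bs=1M count=0 seek=20480

    # Create a target, attach the file as LUN 1, and let initiators connect
    service tgtd start
    tgtadm --lld iscsi --mode target --op new --tid 1 \
           --targetname iqn.2014-01.lab.example:esxi-datastore1
    tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 \
           --backing-store /srv/iscsi/lun0.img
    tgtadm --lld iscsi --mode target --op bind --tid 1 --initiator-address ALL

    # On each ESXi 5.x host: enable the software initiator, add the target, rescan
    esxcli iscsi software set --enabled=true
    esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=192.168.10.5
    esxcli storage core adapter rescan --all

Keep in mind that tgtadm changes are runtime-only; dump them with tgt-admin
--dump into /etc/tgt/targets.conf if you want the target to survive a reboot,
then format the new LUN as a VMFS datastore from the vSphere client.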

> I'd like to get experience deploying vCenter clusters, down to DRS/HA
> config, other block-based storage, and anything else a large environment
> needs.

All you need is licenses (Enterprise Plus to get all the nice features) and
a vCenter server. If you already have it, just create a new cluster and
follow the prompts in the wizards and play with all the options.

> Thoughts or experiences?

When I first started with this it seemed like rocket science, but once you
create a cluster and do DRS/HA/dvSwitch/etc. it's all pretty basic:
- HA in VMware means that if a host fails, the VMs will be restarted on a
different host.
- DRS means automated live migration of virtual machines based on load.
- dvSwitch is a distributed virtual switch whereby you have a consistent
configuration across the hosts, which you configure from the vCenter server.

If you know Red Hat, then from experience, in a few days you can learn the
ins and outs of how a VMware cluster works.

With ESXi 5.1+ you can run ESXi inside an ESXi host so if you have a lot of
memory on a host you can create your own little lab with all the features
and experiment with them.

If you want to certify, then official training is a mandatory requirement.

HTH,
Eugeniu

Hey Phil,
I recently did the VCP certification/course through VMware; however, I have
been working with the technology for the past 5 years. Based on your desire
to gain experience with it, my recommendation is to load up VMware
Workstation on your computer and deploy ESXi instances as the guests. This is
cost-feasible, and although performance won't be production grade, you have
the ability to play with clusters, DRS/HA config, OpenSAN (for your
block-based storage), etc. There is a myriad of training docs available, but
if you do want the certification itself, you'll have to go through the
official course(s).

Cheers,
Matt Chung

My understanding of "cluster-aware filesystem" was "can be mounted at the
physical block level by multiple operating system instances with complete
safety". That seems to conflict with what you suggest, Eugeniu; am I
missing something (as I often do)?

Cheers,
-- jra

Seeing as you are a Linux admin: VMware's professional training offerings are
basic "point and click" things, not very Linux-admin friendly; there are no
advanced subjects or even CLI usage in "Install, Configure, Manage". If you
are already at the level of doing scripted ESXi installs and configuring
hosts for SAN storage and networking according to VMware's best practices,
then you should be able to work out the little that is left by reading the
ample documentation and a few whitepapers, unless you need proof of
completing a class as a certification prerequisite.

One way to get the extra experience would be to start by putting together the
simplest two- or three-node cluster you can muster; try various
configurations and put it through its paces: make it break in every
conceivable way, then fix it....

There is almost nothing extra to do for DRS/HA config, other than to
design the networking, storage, compute, and DNS properly to be resilient
and support them.

You literally just check a box to turn on DRS, and a box to turn on HA,
select an admission policy, and select automation level and migration
threshold.

Of course, there are advanced options and 'exotic' clusters where you need to
know the magic option names. You may also need to specify additional
isolation IP addresses, or tweak timeouts for VMware Tools heartbeat
monitoring, to cut down on unwanted false HA restarts.
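To give a flavour of what those magic names look like, here are a couple of
the documented vSphere 5.x HA advanced options (set in the cluster's HA
settings); the address below is a made-up lab value:

    das.isolationaddress0 = 192.168.10.1      (extra address to ping before a host declares itself isolated)
    das.usedefaultisolationaddress = false    (don't rely on the default gateway answering ICMP)
    das.ignoreRedundantNetWarning = true      (silence the "no management network redundancy" nag in a lab)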

These are not things you will find in the training classes; you need to read
the documentation and the literature on various blogs. If you have the time,
it would probably be best to read some of Duncan Epping's and Scott Lowe's
books to further solidify your understanding.

Ultimately, you are not going to be able to do this realistically without
real servers comparable to what you would see in the real world, so a laptop
running ESXi may not be enough.

You could also find a company to lease you some lab hours to tinker with
other storage technology; I'm sure by now there are online, cloud-based
rent-a-labs with EMC VNX / Dell EqualLogic / HP storage hardware.

> vSwitches/SAN config (but only with NFS datastores backed by a NetApp,
> unfortunately, no block-based stores).

Also, NetApp units running current software, at least, can very easily create
an extra block-based LUN on top of a volume to be served out as a block
target. You might want to ask your storage vendor's support what it would
take to get the keycode to turn on the FC or iSCSI licenses, so you can
present an extra 40GB scratch volume. Or you could download the NetApp
simulator to play with :open_mouth:
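For what it's worth, on a 7-Mode NetApp the whole exercise is only a few
commands once the iSCSI license is enabled; the volume path, LUN name,
igroup, and initiator IQN below are made-up examples:

    iscsi start
    lun create -s 40g -t vmware /vol/scratch/esxi_scratch.lun
    igroup create -i -t vmware esxi_lab iqn.1998-01.com.vmware:esxi01-1234abcd
    lun map /vol/scratch/esxi_scratch.lun esxi_lab

After that it shows up on the host like any other iSCSI LUN and can be
formatted as VMFS.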

All the ESXi documentation is online, and all the relevant software has a
60-day evaluation grace period after install. You just need to work through
it: read the installation directions, get things working in the lab, then
start trying out more complicated scenarios and the advanced knobs later;
see how things work.

Buying or scavenging a used server is probably easiest for long-term playing;
look for something with 32GB of RAM and 4 or more 2.5" SAS drives. Try to
have 100GB of total disk space in a hardware RAID10 or RAID0 with 256MB or so
of controller writeback cache, or an SSD; the idea is to have enough space to
install vCenter, Operations Manager, and a few VMs.

A three-year-old Dell 11G R610 or HP DL360 G6 likely falls into this
category. Install ESXi on the server and create 3 virtual machines that will
be "nested" ESXi servers; the OS of those VMs will be ESXi itself.

See:
http://www.virtuallyghetto.com/2012/08/how-to-enable-nested-esxi-other.html
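As a rough illustration of what that article walks through on 5.1+: the
per-VM .vmx keys below expose hardware virtualization to the guest and mark
the guest OS as ESXi, and the vSwitch needs promiscuous mode and forged
transmits allowed so the nested VMs' traffic can pass (vSwitch0 is an assumed
name; treat this as a sketch, not a checklist):

    vhv.enable = "TRUE"
    guestOS = "vmkernel5"

    esxcli network vswitch standard policy security set --vswitch-name=vSwitch0 \
        --allow-promiscuous=true --allow-forged-transmits=true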

If you would rather build a desktop tower for ESXi, look for a desktop
motherboard with a 64-bit Intel processor, DDR2 ECC memory support, at least
32GB of RAM, VT-d support, and onboard Broadcom or Intel networking.
Network controller and storage controller choices are key; exotic hardware
won't work.

Considering that vCenter itself wants a minimum of 12GB of RAM, about 32GB of
RAM is great in case you want to test out both the vCenter virtual appliance
and the standard install on Windows.

As an alternative to the official VMware HCL, there's a "white box" HCL:
http://www.vm-help.com/esx40i/esx40_whitebox_HCL.php

I would look to something such as the Iomega StorCenter PX6 or PX4, or the
Synology DS1512+, as an inexpensive shared storage solution for playing
around with iSCSI-based block targets. I think the Iomegas may be the
least-cost physical arrays on the official VMware HCL with VAAI support.

If you run your cluster's ESXi servers as nested virtual machines on one
server, you can also use another VM running on the local disks of that ESXi
server to present shared storage.

Some software options are Linux, Nexenta, FreeNAS, Open-E, HP LeftHand,
Isilon, FalconStor, and Nutanix (I would look at the first three primarily).

Or you can use a spare Linux machine for shared storage; I would suggest an
SSD for this, or, when using disks, something with enough spindles in an
appropriate RAID level to give you at least 400 or so sustained random IOPS,
so you can run 3 or 4 active VMs to play with without the whole thing being
appallingly slow.

FreeNAS / Nexenta / ZFS are also great options on a 64-bit system with 16GB+
of RAM to give to Solaris, although finding a hardware configuration on which
Solaris x86 will run properly out of the box can be challenging.

Of course, if you have a spare Linux machine, you can also use that to play
around with VMs on NFS or iSCSI.
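The NFS flavour of that is only a few lines on a stock RHEL/CentOS box; the
export path, subnet, server address, and datastore name here are made up:

    mkdir -p /export/vmstore
    echo '/export/vmstore 192.168.10.0/24(rw,no_root_squash,sync)' >> /etc/exports
    service nfs start && exportfs -ra

    # then, on each ESXi host, mount it as a datastore:
    esxcli storage nfs add --host=192.168.10.5 --share=/export/vmstore --volume-name=lab-nfs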


--

-JH

> From: "Eugeniu Patrascu" <eugen@imacandi.net>
[snip]

> My understanding of "cluster-aware filesystem" was "can be mounted at the
> physical block level by multiple operating system instances with complete
> safety". That seems to conflict with what you suggest, Eugeniu; am I
> missing something (as I often do)?

When one of the hosts has a virtual disk file open for write access on a VMFS
cluster-aware filesystem, it is locked to that particular host, and a process
on a different host is denied the ability to write to the file, or even to
open the file for read access.

Another host cannot even read or write metadata about the file's directory
entry; attempts to do so are rejected with an error.
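You can see this from the ESXi shell if you're curious; vmkfstools -D dumps
the on-disk lock record for a file (the path below is a made-up example):

    vmkfstools -D /vmfs/volumes/datastore1/testvm/testvm-flat.vmdk

The "owner" field in the output identifies the host currently holding the
lock; an all-zero owner means the file isn't locked.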

So you don't really have to worry all that much about "as long as you don't
access the same files", although you certainly should not try to, either.

Only the software in ESXi can access the VMFS --- there is no ability to
run arbitrary applications.

(Which is also why I like NFS more than shared block storage: you can
conceptually use a storage array feature such as FlexClone to make a
copy-on-write clone of a file, take a storage-level snapshot, and then do a
granular restore of a specific VM, without having to restore the entire
volume as a unit.

You can't pull that off with a clustered filesystem on a block target!)

Also, the VMFS filesystem is cluster-aware by means of exclusion (SCSI
reservations) and separate journaling.

Metadata locks are global in the VMFS cluster-aware filesystem. Only one host
is allowed to write to any of the metadata on the entire volume at a time,
unless you have the VAAI VMFS extensions and your storage vendor supports ATS
(atomic test and set); this results in a performance bottleneck.

For that reason, while VMFS is cluster-aware, you cannot necessarily scale to
a large number of cluster nodes, or more than a few dozen open files, before
performance degrades due to the metadata bottleneck.
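You can check whether your array actually offloads that locking; on ESXi 5.x
something like the following shows the per-device VAAI status, including ATS
(the naa identifier is a placeholder):

    esxcli storage core device vaai status get -d naa.60a98000646e6f746170313233

Look for "ATS Status: supported" in the output; "unsupported" means metadata
locking falls back to SCSI reservations.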

Another consideration: in the event of a power outage that simultaneously
impacts your storage array and all your hosts, you may very well be unable to
regain access to some of your files until the specific host that had each
file locked comes back up, or until you wait out a ~30 to ~60 minute timeout
period.

Why bother with a clustering FS, then, if you cannot actually /use it/ as one?
- jra

It means your VMs can run on any host and access the files they require. If
this were not the case, then you could not tolerate a hardware failure and
expect your VMs to survive. It also means you can do things like evacuate a
host and take it down for maintenance.

Of course you could build your application in a way that can tolerate the
failure of a host, and just use local storage for your guests. It really
depends on what you are trying to achieve though.

> From: "Eugeniu Patrascu" <eugen@imacandi.net>

> > If you want block storage, just export an iSCSI device to the ESXi
> > machines (tgtadm on RedHat is all you need and a few gigs of free space).
> > VMFS is cluster aware so you can export the same volume to independent
> > ESXi hosts and as long you don't access the same files, you're good to go.

> My understanding of "cluster-aware filesystem" was "can be mounted at the
> physical block level by multiple operating system instances with complete
> safety". That seems to conflict with what you suggest, Eugeniu; am I
> missing something (as I often do)?

What you are saying is true, and from VMware's point of view an iSCSI volume
is a physical disk. You can mount the same iSCSI disk on many VMware hosts;
just write into different directories on the disk.

Am I missing something in your question?

Eugeniu

I guess. You and Jimmy seem to be asserting that, in fact, you *cannot*
mount a given physical volume, with a clustering FS in its partition,
onto multiple running OS images at the same time... at which point, why
bother using a clustering FS?

The point of clustering FSs (like Gluster, say), as I understood it,
was that they could be mounted by multiple machines simultaneously: that
there was no presumed state between the physical blocks and the FS driver
inside each OS, which would cause things to Fail Spectacularly if more
than one machine was simultaneously using them in realtime.

You and Jimmy seem to be suggesting that multiple OSs need to be semaphored.

One of three understandings here is wrong. :slight_smile:

Cheers,
-- jra

> From: "Eugeniu Patrascu" <eugen@imacandi.net>
>
> > My understanding of "cluster-aware filesystem" was "can be mounted at the
> > physical block level by multiple operating system instances with complete
> > safety". That seems to conflict with what you suggest, Eugeniu; am I
> > missing something (as I often do)?
>
> What you are saying is true and from VMware's point of view, an ISCSI
> volume is a physical disk.
> And you can mount the same ISCSI disk on many VMware hosts. Just write
> into different directories on the disk.
>
> Am I missing something in your question ?

> I guess. You and Jimmy seem to be asserting that, in fact, you *cannot*
> mount a given physical volume, with a clustering FS in its partition,
> onto multiple running OS images at the same time... at which point, why
> bother using a clustering FS?

OK, let me give it another try:

You have a machine that exports an iSCSI disk (such as a SAN or a plain Linux
box).

You have 2, 3, 5, X machines (hosts) that run ESXi.

You can mount that iSCSI disk on all ESXi hosts at the same time, use it as a
datastore for VMs, and run the VMs from there.

What I said (and maybe this caused some confusion) is that you should not
access the same files from different hosts at the same time, but you can run
VM1 on host1, VM2 on host2, and so on, without any issues, from the same
iSCSI target.

> The point of clustering FSs (like Gluster, say), as I understood it,
> was that they could be mounted by multiple machines simultaneously: that
> there was no presumed state between the physical blocks and the FS driver
> inside each OS, which would cause things to Fail Spectacularly if more
> than one machine was simultaneously using them in realtime.

In the scenario described above, and in how VMware ESXi works in general,
only VMware accesses the filesystem (which is called VMFS). The hard drives
of the virtual machines are actually represented by files on the VMFS, so the
virtual machines do not touch the hosts' VMFS filesystem directly.

> You and Jimmy seem to be suggesting that multiple OSs need to be
> semaphored.

He says that multiple ESXi hosts need to be semaphored when they update the
metadata on the VMFS. I don't have an opinion on this; no matter how much I
abused VMware, the filesystem stayed intact.

> One of three understandings here is wrong. :slight_smile:

I hope I cleared up some of the confusion.

Eugeniu

> Why bother with a clustering FS, then, if you cannot actually /use it/ as
> one?

It is used as one. It is also a lot more convenient to have a shared
filesystem than a distributed volume manager; you could think of VMDK files
on a VMFS volume as VMware's alternative to clustered Linux LVM. Just because
you have some sort of clustered volume manager doesn't make your guest
operating systems cluster-aware.

With VMFS, if two guest operating systems try to open the same disk,
hypothetically, the most likely reason would be that there is a split brain
in your HA cluster and two hosts are trying to start up the same VM.

The locking restrictions are for your own protection. If the filesystem
inside your virtual disks is not a clustered filesystem, two instances of a
VM simultaneously mounting the same NTFS volume and writing some things is an
absolute disaster.

Under normal circumstances, two applications should never be writing to
the same file. This is true on clustered filesystems.
This is true when running multiple applications on a single computer.

There is such a thing as 'shared disk mode', for instance if your block
target is Fibre Channel and you are using Microsoft Cluster Services in VMs,
but an extra virtual SCSI adapter on a VM in shared mode is something that
you have to explicitly configure (it's not the default).
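As an illustration of how non-default that is: a cross-host MSCS setup
typically means hand-adding a second virtual SCSI controller with physical
bus sharing for the shared disks, along the lines of the .vmx keys below (the
controller number and adapter type are assumptions for this sketch):

    scsi1.present = "TRUE"
    scsi1.virtualDev = "lsilogicsas"
    scsi1.sharedBus = "physical"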

There are many good things to be said about having a single-purpose
filesystem that can be placed on a shared storage device which multiple hosts
can mount simultaneously, viewing the same file/folder structure and touching
the resources corresponding to the applications running on each node...

It does not require cluster votes and majority-node-set quorums to keep the
cluster from going down entirely, and it does not need questionable
techniques such as STONITH ("shoot the other node in the head") as a fencing
strategy for providing exclusive access to resources.

Different hosts can access the files corresponding to the resources running
on that host, and HA is able to fail virtual disks over, so the highly
specialized filesystem achieves all the objectives for which it needed to be
a clustered filesystem.


Why should "two applications should never be writing to the same file"? In a real clustered *file*system this is exactly what you want. The same logical volume mounted across host cluster members, perhaps geodistantly located, each having access at the record level to the data. This permits HA and for the application to be distributed across cluster nodes. If a host node is lost then the application stays running. If the physical volume is unavailable then logically shadowing the volume across node members or storage controllers / SANs permits fault tolerance. You don’t need to “fail disks over” (really logical volumes) as they are resilient from the start, they just don’t fail. When the shadow members return they replay journals or resilver if the journals are lost.

I’d note that this can be accomplished just so long as you have a common disk format across the OS nodes.

These problems were all resolved 40 years ago in mainframe and supermini systems. They’re not new. VMware has been slowly reinventing — more accurately rediscovering — well known HA techniques as it’s trying to mature. And it still has a lot of catching up to do. It’s the same tale that microcomputers have been doing for decades as they’ve come into use as servers.

However I’m not sure what all of this has to do with network operations. :wink:

-d

> The locking restrictions are for your own protection. If the filesystem
> inside your virtual disks is not a clustered filesystem;
> two instances of a VM simultaneously mounting the same NTFS volume and
> writing some things, is an absolute disaster.
>
>
> Under normal circumstances, two applications should never be writing to
> the same file. This is true on clustered filesystems.
> This is true when running multiple applications on a single computer.

> Why should "two applications should never be writing to the same file"? In
> a real clustered *file*system this is exactly what you want. The same
> logical volume mounted across host cluster members, perhaps geodistantly
> located, each having access at the record level to the data. This permits
> HA and for the application to be distributed across cluster nodes. If a
> host node is lost then the application stays running. If the physical
> volume is unavailable then logically shadowing the volume across node
> members or storage controllers / SANs permits fault tolerance. You don't
> need to "fail disks over" (really logical volumes) as they are resilient
> from the start, they just don't fail. When the shadow members return they
> replay journals or resilver if the journals are lost.

There is a lot of misunderstanding here about how ESXi works in a
multiple-host environment.

There are a lot of abstraction layers: physical disk -> VMFS -> VMDK files
that represent the VM HDD -> VM -> VM filesystem (NTFS, ext3/4, XFS, etc.).

The physical disk can be whatever device a controller presents (such as a
4-way FC connection to the same LUN).

What we are discussing here is the VMFS capabilities.

Also, what I am saying is that the VM will be very upset when its HDD
contents are changed without notice. This is why ESXi has a lock per VM that
notifies other ESXi hosts trying to access a particular VM's folder that
"hey, it's in use, leave it alone".

And speaking of clustered filesystems: while you may read and write on them
at the same time, except for plain file storage I do not know of any
application that has no objection to the files it works with having their
contents modified underneath it - think database systems.

> I'd note that this can be accomplished just so long as you have a common
> disk format across the OS nodes.
>
> These problems were all resolved 40 years ago in mainframe and supermini
> systems. They're not new. VMware has been slowly reinventing -- more
> accurately rediscovering -- well known HA techniques as it's trying to
> mature. And it still has a lot of catching up to do. It's the same tale
> that microcomputers have been doing for decades as they've come into use
> as servers.

Depending on the use case you may be right or wrong :slight_smile:

> However I'm not sure what all of this has to do with network operations. :wink:

What, you want political discussions instead? :slight_smile:

Thanks for the responses, everyone. I will be petitioning my manager for the vSphere: Install, Configure, Manage v5.5 course.

My home lab currently consists of a custom dual-Opteron box with lots of disk, an HP P2000, and a massive CoRAID array. Looks like I'll have to scrounge up a couple of other hosts for ESXi, since my custom system is running CentOS and ESXi under KVM still looks like a no-go.

As a note to this: if you get it approved, make sure that the trainer has
(a lot of) real-life experience implementing vSphere. It makes a big
difference when you run into trouble with the labs or when you have questions
related to best practices.

Eugeniu