Block storage has a long history of being a bit fussy when it comes to performance and scalability, especially when shared by multiple servers, as with VMware’s VMFS. To help mitigate some of the headaches of using shared block storage arrays, VAAI (vStorage APIs for Array Integration) was developed. VAAI offers a number of features, called primitives, that are supported by most modern storage arrays (check the VMware Hardware Compatibility List for yours). As of vSphere 5.0, all of the primitives in use have been ratified by T10.

In the video I embedded above, I walk through the steps necessary to determine if your block storage LUNs are capable of supporting VAAI, and then show you how to easily see the VAAI primitives in action using esxtop and a vSphere 5.5 environment. Additionally, the remainder of this post will serve as a learning primer for understanding what VAAI is and how you can correlate that knowledge to the video.

What Primitives Can You Use?

To find out the specific primitives supported by your block storage device, you can use the following esxcli command:

esxcli storage core device vaai status get
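To make sense of the output, here’s a sketch of what one LUN might report. The device identifier and status strings below are illustrative sample data, not captured from a real host, and the grep/awk one-liner simply filters down to the primitives reported as supported:

```shell
# Hypothetical output from "esxcli storage core device vaai status get"
# for one LUN; the naa identifier and status values are illustrative.
sample_output='naa.600508b1001c9c10:
   VAAI Plugin Name:
   ATS Status: supported
   Clone Status: supported
   Zero Status: supported
   Delete Status: unsupported'

# Pull out only the primitives the array reports as supported
printf '%s\n' "$sample_output" | grep ': supported' | awk -F: '{gsub(/^ +/, "", $1); print $1}'
```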

VAAI Primitive Names, Aliases, and Opcodes

Here are the names and “known aliases” of the four major primitives. For a full list of operation codes (opcodes), visit the T10 directory of SCSI Operation Codes.

Atomic Test & Set
Aliases: ATS, Compare and Write, Hardware Accelerated Locking, Hardware Assisted Locking
esxcli name: ATS
opcode: 0x89

Extended Copy
Aliases: Full Copy, Clone Blocks, XCOPY, Hardware Accelerated Move
esxcli name: Clone
opcode: 0x83

Write Same
Aliases: Zero Blocks, Hardware Accelerated Init
esxcli name: Zero
opcode: 0x93

Block Delete
Aliases: SCSI UNMAP, Space Reclaim, Block Reclaim
esxcli name: Delete
opcode: 0x42
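If you find yourself translating between the esxcli names and the opcodes, a tiny lookup helper saves a trip back to the T10 tables. This is just a sketch built from the list above:

```shell
# Map each esxcli VAAI primitive name to its T10 SCSI opcode,
# per the list above.
vaai_opcode() {
  case "$1" in
    ATS)    echo "0x89" ;;  # COMPARE AND WRITE
    Clone)  echo "0x83" ;;  # EXTENDED COPY
    Zero)   echo "0x93" ;;  # WRITE SAME (16)
    Delete) echo "0x42" ;;  # UNMAP
    *)      echo "unknown" ;;
  esac
}

vaai_opcode Clone   # prints 0x83
```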

Here’s an example from the SCSI opcode list for ATS (Compare and Write):

You might think that the vast number of names and aliases can get confusing. You would be correct.

VAAI in ESXTOP

Below is a screenshot of an esxtop screen with one of my ESXi 5.5 hosts showing the various VAAI primitives.

Here’s how those columns correlate to the VAAI primitives:

Atomic Test & Set
ATS – commands
ATSF – command failures

Extended Copy
Clone_RD – reads offloaded
Clone_WR – writes offloaded
Clone_F – failures
MBC_RD/s – megabytes of data read per second
MBC_WR/s – megabytes of data written per second

Write Same
Zero – commands
Zero_F – command failures
MBZERO/s – megabytes zeroed per second

Block Delete
Delete – commands
Delete_F – command failures
MBDEL/s – megabytes deleted per second
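When eyeballing ATS against ATSF, the failure rate usually matters more than the raw counts. Here’s a quick back-of-the-envelope calculation; the counter values are made up for illustration:

```shell
# Illustrative counter values as read from the esxtop ATS/ATSF columns
ats=12840   # ATS commands issued
atsf=3      # ATS command failures

# POSIX shell only does integer math, so lean on awk for the percentage
awk -v a="$ats" -v f="$atsf" 'BEGIN { printf "ATS failure rate: %.3f%%\n", 100 * f / a }'
```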

What Do The VAAI Primitives Do?

Here’s a brief overview of the various primitives covered in my esxtop video.

Atomic Test & Set (ATS)

ATS can also be referred to as Hardware Assisted Locking or Hardware Accelerated Locking

Sounds like a nuclear weapon, doesn’t it? Rather than relying on a SCSI reservation to lock an entire LUN, which was previously required so a host could commit changes without a neighboring host writing to the same file simultaneously, ATS can simply lock a segment of the VMFS metadata. This avoids many of the issues where activity on a single virtual machine would degrade performance for every other virtual machine sharing the LUN.

Note: Jason Boche has a great post called “VAAI and the Unlimited VMs per Datastore Urban Myth” that you should also read.
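The compare-and-write behavior behind ATS can be sketched in a few lines. This is purely conceptual: a file stands in for a segment of VMFS metadata, and unlike a real array, a shell script cannot make the compare and the write atomic, so treat it only as an illustration of the semantics:

```shell
# A file stands in for one segment of VMFS metadata (illustrative only).
block=/tmp/vmfs_lock_block
printf 'free' > "$block"

# Write the new value only if the current value matches what we expect;
# on a real array this compare+write happens as one atomic SCSI command.
compare_and_write() {
  expected=$1 new=$2
  if [ "$(cat "$block")" = "$expected" ]; then
    printf '%s' "$new" > "$block"
    echo "lock acquired"
  else
    echo "miscompare: lock held elsewhere"
  fi
}

compare_and_write free host-a   # succeeds: host-a now owns the segment
compare_and_write free host-b   # miscompare: host-a already owns it
```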

Extended Copy

Also goes by Full Copy, Clone Blocks, XCOPY, and Hardware Accelerated Move

When you want to copy data, such as making a virtual machine clone, the storage array is asked to copy the data on the host’s behalf. Without this, the VMkernel’s data mover must handle reading and writing the data. Making the array handle this is typically quicker because there is very little data moving around on the network – it’s more like a local copy (from the perspective of the storage array).
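To put rough numbers on that, consider cloning a VMDK with and without the offload. The size below is illustrative:

```shell
# Cloning without XCOPY: the host's data mover reads every block and then
# writes it back out, so the data crosses the fabric twice. With XCOPY the
# array copies internally. The VMDK size here is illustrative.
vmdk_gb=40
echo "Without XCOPY: $(( vmdk_gb * 2 )) GB moved through the host data mover"
echo "With XCOPY:    only the copy commands themselves cross the fabric"
```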

Write Same

Can also be called Zero Blocks or Hardware Accelerated Init

When using a thick VMDK, meaning we’re both provisioning and allocating the blocks on the back-end storage array in one swoop, writing zeroes can take a fair bit of time. This is especially true for large, “eager zeroed” thick disks, which write out all of their zeroes the moment they are created. This is in contrast to “lazy zeroed” disks, which only write zeroes to the array when a block is first written to by the virtual machine.

Using Write Same allows the vSphere host to issue a command telling the storage array to write a particular quantity of zeroes itself, rather than the host having to push every zeroed block across the fabric.
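Some rough math shows why this helps: the host sends one small zero-pattern block per command and the array repeats it across the range, instead of every zeroed byte traveling over the wire. The disk size, per-command transfer size, and block size below are all illustrative:

```shell
disk_bytes=$(( 40 * 1024 * 1024 * 1024 ))   # 40 GB eager-zeroed disk (illustrative)
per_cmd_bytes=$(( 1024 * 1024 ))            # zero 1 MB per Write Same command
cmds=$(( disk_bytes / per_cmd_bytes ))
sent=$(( cmds * 512 ))                      # one 512-byte pattern block per command

echo "Commands issued: $cmds"
echo "Zero payload on the wire: $(( sent / 1024 / 1024 )) MB instead of $(( disk_bytes / 1024 / 1024 )) MB"
```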

Block Delete

Also referred to as SCSI UNMAP, Space Reclaim, or Block Reclaim

Thin provisioned storage can be quite handy for saving space, but it is usually a one-way street: as more blocks are written to the thin VMDK file, the larger the file gets. Eventually, the thin VMDK may grow to the full size it was provisioned at. At that point it’s not really thin anymore, since it’s consuming all of the space you allowed it to consume. But what about data that has been deleted by the virtual machine? Block Delete is a method to reclaim some of that wasted “dead” space from a LUN and return it to the free space pool.
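In vSphere 5.5 the reclaim is kicked off manually with esxcli, and the dead space itself is just the gap between what the array has allocated for the thin VMDK and what the guest is actually using. The datastore name and the figures below are illustrative:

```shell
# Manual space reclaim in vSphere 5.5 (datastore name is illustrative):
#   esxcli storage vmfs unmap -l datastore1

# Dead space: blocks the array still holds for data the guest has deleted
allocated_gb=90      # blocks the array has allocated to the thin VMDK
guest_used_gb=55     # blocks the guest OS is actually using
echo "Reclaimable dead space: $(( allocated_gb - guest_used_gb )) GB"
```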