PowerShell How-To

Understanding Parallel Runspaces in PowerShell

Allowing multiple segments of code to run concurrently can increase a script's overall efficiency.

Most of the time when writing scripts, performing tasks synchronously is good enough. After all, it's working, right? Also, code in a script usually depends on other parts of the script, which means some parts must finish before others begin. But what if a script has a bunch of long-running work to do that doesn't depend on the rest of the code? In that case, we can introduce a level of multithreading or asynchronous behavior that makes the script perform tasks in parallel.

A common way to perform parallel tasks in PowerShell is to use background jobs, but it's not the only way. Another, faster way is to use parallel runspaces. In a typical PowerShell session, your console runs in a single runspace. Think of a runspace as a container where everything is stored. Inside this runspace, code is executed synchronously, one command after another. But PowerShell allows you to create your own runspaces, and to create many of them at the same time! Being able to invoke multiple runspaces at once gives the scripter the ability to run code inside each runspace independently of the others.
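To make the idea of several runspaces concrete, here is a minimal sketch that uses only the .NET types that ship with PowerShell (`[runspacefactory]` and `[powershell]`). The workload (squaring a number after a short sleep), the pool size of three, and the variable names are illustrative assumptions, not something taken from a particular module.

```powershell
# Sketch only: five independent work items run on a pool capped at three runspaces.
$pool = [runspacefactory]::CreateRunspacePool(1, 3)   # minimum 1, maximum 3 runspaces
$pool.Open()

$work = foreach ($id in 1..5) {
    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    # Each work item carries its own script and argument; nothing is shared.
    $null = $ps.AddScript({ param($n) Start-Sleep -Milliseconds 200; $n * $n }).AddArgument($id)
    # BeginInvoke returns immediately, so the loop does not wait for the work to finish.
    [pscustomobject]@{ Shell = $ps; Handle = $ps.BeginInvoke() }
}

# EndInvoke blocks until that particular item has finished and returns its output.
$results = foreach ($item in $work) {
    $item.Shell.EndInvoke($item.Handle)
    $item.Shell.Dispose()
}

$pool.Close()
$pool.Dispose()
$results
```

Results are collected in the order the items were queued, regardless of which runspace finished first. This is roughly the plumbing that a helper module hides from you.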

I'm assuming you're not a software developer. If so, you probably don't want to get into the intricacies of runspace pools and runspace factories. You just want an easy way to speed up your script without taking down the machine it's running on. In that case, let's look into Boe Prox's PowerShell module PoshRSJob. PoshRSJob is a PowerShell module that was built to ease the pain that many people go through when attempting to create their own PowerShell runspaces. It was built to mimic traditional PowerShell background jobs and follows the same pattern.
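As a rough illustration of that "same pattern", the sketch below runs the same trivial script block once as a built-in background job and once as an RSJob. It assumes the machine can reach the PowerShell Gallery to install the module; the variable names and the output strings are made up for the example.

```powershell
# One-time setup. -Force skips the untrusted-repository prompt; drop it if you
# would rather confirm the install by hand.
if (-not (Get-Module -ListAvailable -Name PoshRSJob)) {
    Install-Module -Name PoshRSJob -Scope CurrentUser -Force
}
Import-Module -Name PoshRSJob

# Built-in background job: start, wait, receive, remove.
$job = Start-Job -ScriptBlock { 'Hello from job' }
$jobOutput = $job | Wait-Job | Receive-Job
$job | Remove-Job

# PoshRSJob follows the same four steps with RSJob nouns.
$rsJob = Start-RSJob -ScriptBlock { 'Hello from RSJob' }
$rsJobOutput = $rsJob | Wait-RSJob | Receive-RSJob
$rsJob | Remove-RSJob

$jobOutput
$rsJobOutput
```

If you already know `Start-Job`, `Wait-Job`, `Receive-Job` and `Remove-Job`, the RSJob cmdlets should feel familiar.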

To demonstrate how to bring up parallel runspaces, let's build a simple script that, for now, just does a few folder copies. This script just copies all the files from point A to point B. The file structure may look something like this:

C:\Profiles
    \F1
    \F2
    \F3
    \F4
    ....

Perhaps the Profiles folder contains not just four of these folders, but thousands of them. It's possible to copy the entire contents of Profiles by using the Copy-Item cmdlet.

Copy-Item -Path C:\Profiles -Recurse -Destination C:\DestinationHere

But be prepared to wait a long time if each of those folders contains gigabytes of data. Luckily, there's a better way to do this through the use of parallel runspaces and the PoshRSJob module. This is a perfect candidate for parallelization because no folder depends on any other. We're simply copying each folder, one at a time, to the same destination folder. We can just as easily copy C:\Profiles\F1 at the same time as C:\Profiles\F2 and have no conflicts. Recognizing situations like this is key to figuring out where to implement parallelization.

To demonstrate parallel runspaces with the PoshRSJob module, we first need to list the full path of each folder under C:\Profiles. We'll create a separate runspace for each folder as it's copied to the destination. We can build that list by using Get-ChildItem.

$profileFolders = Get-ChildItem -Path C:\Profiles -Directory | Select-Object -ExpandProperty FullName

Once we have a list of all of the folders to copy, we can then begin reading each directory name and creating a runspace (RSJob) for each folder. But first, we should create the code necessary to execute for each folder. The code is simple since, in each runspace, we're just copying a single folder.

$scriptBlock = { Copy-Item -Path $using:profilePath -Recurse -Destination C:\DestinationHere }

Notice that I'm able to use the $using: scope modifier here, which pulls the value of a variable from the calling session into the runspace. This is a nice feature of PoshRSJob. The profilePath variable will represent the individual folder that's being copied.

foreach ($profilePath in $profileFolders) { Start-RSJob -Name $profilePath -ScriptBlock $scriptBlock -Throttle 10 }

Once this starts, you'll see that instead of waiting for each folder to copy, Start-RSJob just keeps going. I've chosen to use the Throttle parameter to allow only 10 jobs (runspaces) to run at once. If more are required, PoshRSJob will queue them up and wait for the previous ones to finish.
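Because Start-RSJob returns right away, the script still needs a way to find out when the copies are done. The sketch below shows one possible shape for that, using the pipeline form from the PoshRSJob README: every folder is piped into a single Start-RSJob call and `$_` is the current folder. As far as I understand the module, a throttle is tied to the runspace pool created by each Start-RSJob call, so one piped call is the safer way to have a single limit govern all the folders; treat that as an assumption to verify against your version. The temporary folders, the throttle of 2, and the variable names are stand-ins for the example, and the module is assumed to be installed already.

```powershell
Import-Module -Name PoshRSJob   # assumes the module is already installed

# Throwaway stand-ins for C:\Profiles and C:\DestinationHere (hypothetical test data).
$root        = Join-Path ([System.IO.Path]::GetTempPath()) ("rsjob-demo-" + [guid]::NewGuid())
$source      = Join-Path $root 'Profiles'
$destination = Join-Path $root 'DestinationHere'
$null = New-Item -ItemType Directory -Path $destination -Force
foreach ($name in 'F1', 'F2', 'F3', 'F4') {
    $folder = Join-Path $source $name
    $null = New-Item -ItemType Directory -Path $folder -Force
    Set-Content -Path (Join-Path $folder 'data.txt') -Value "contents of $name"
}

# Pipe every folder into ONE Start-RSJob call; $_ is the current folder path.
$jobs = Get-ChildItem -Path $source -Directory |
    Select-Object -ExpandProperty FullName |
    Start-RSJob -Throttle 2 -ScriptBlock {
        Copy-Item -Path $_ -Recurse -Destination $using:destination
        "Copied $_"
    }

# Block until every job has finished, collect the output, then remove the job objects.
$copied = $jobs | Wait-RSJob | Receive-RSJob
$jobs | Remove-RSJob
$copied
```

For the real copy, point `$source` and `$destination` at C:\Profiles and C:\DestinationHere, drop the setup loop, and raise the throttle to whatever your disks can sustain.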

By using parallel runspaces, I'm able to copy 10 folders at a time, which can speed this up considerably! But watch your disk I/O. If you're not careful, you might just bite off more than your machine can chew!