Speeding Up the Pipeline

The PowerShell Pipeline is robust but tends to be slow. With a couple of tricks you can speed it up tremendously and make it as fast as classic foreach loops.

As this article shows, the slowness of the PowerShell Pipeline may be a design issue, and simply by replacing Foreach-Object with the new Foreach-ObjectFast, and Where-Object with the new Where-ObjectFast, a test script that originally took 15 seconds to execute now runs in 0.24 seconds. Any code can be turbo-charged without any significant code changes.

Foreach-ObjectFast and Where-ObjectFast are part of the module PSOneTools and can conveniently be downloaded and installed from the PowerShell Gallery:

Install-Module -Name PSOneTools -Scope CurrentUser -Force


Problem: Slow Pipeline

Here is a sample script that produces a large string array. It could just as well process database records or traverse the file system. The key point is that the script uses a loop with a lot of iterations:

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$result = 1..100000 | ForEach-Object {
  "I am at $_"
}

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


Tests on various systems revealed that there are apparently two fundamental groups of systems: Type A systems experience an insane performance hit whenever Foreach-Object is used. Type B systems are less affected.

Computer Type    Execution Time
Type A           15.0 s
Type B            0.6 s

Simply run the script above and check for yourself: if the script runs in under a second, you are Type B. If it takes 15-30 seconds, you are Type A. This raises the question of what causes the difference and makes a system Type A or Type B. We'll look into this later. Let's first find out just how much slower the PowerShell Pipeline is than it could be.

Classic ForEach is the Gold Standard

To examine the optimal performance time, I rewrote the script and replaced Foreach-Object and the PowerShell Pipeline with a classic foreach loop. foreach loops are considered to be the fastest loop available:

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$data = 1..100000
$result = foreach($_ in $data)
{
  "I am at $_"
}

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


The result is eye-opening:

Computer Type    Foreach-Object    foreach    Factor
Type A           15.0 s            0.09 s     167 x
Type B            0.6 s            0.09 s     6.5 x

On computers of Type A, foreach is 167x faster than Foreach-Object. On computers of Type B, foreach is still 6.5x faster. And apparently, foreach is immune to the issue present on Type A computers: it is equally fast on both systems. Regardless of the type of computer you are using, the PowerShell Pipeline is clearly performing poorly.

Why the PowerShell Pipeline Rocks!

Abandoning the PowerShell Pipeline and routinely moving to foreach instead is most definitely not a panacea or universal cure. There is a reason why the PowerShell Pipeline was created: it is an important alternative to foreach loops. Both have their own use cases:

• Downloading: foreach works like downloading a video: you first have to fit all the data you'd like to process into memory. This can eat a lot of memory, and because foreach can't start working before the input data is complete, you may not see any results for quite some time. It really compares well to downloading a video: you need a lot of storage, and you have to wait until the entire video has arrived before you can start watching it. foreach is best if you already have the data collected in some variable.

• Streaming: Foreach-Object and the pipeline work like streaming a video: while the video is streamed, you can already watch it, and you don't need to store the video anywhere. All you need is the memory to hold the current frame you are viewing. That's why the pipeline appears faster to many users: the initial results become visible very quickly. Foreach-Object is best to save resources, to process data directly from an upstream cmdlet, and to give quick feedback to the user.
ForEach: No Performance Guarantee

There is no guarantee that using foreach speeds up your scripts, so with foreach, you may be wasting the pipeline benefits and get no speed improvement in return. Here is a real-world test: let's assume you want to search for log files changed in the past 12 hours. With the pipeline and Where-Object you can easily filter out the files you want. Your script needs very little memory since it only needs to hold one file at a time. Plus, the first visible results come in quickly because the pipeline emits results in real-time as they are produced:

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

# search for all log files changed in the last 12 hours
$cutOff = (Get-Date).AddHours(-12)

# get all files
Get-ChildItem -Path $env:windir -Filter *.log -Recurse -ErrorAction SilentlyContinue -Force |
# find only files changed within past 12 hours
Where-Object { $_.LastWriteTime -ge $cutOff } |
# store them also in $result
Tee-Object -Variable result |
# output path
Select-Object -ExpandProperty FullName

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


On my system, this script took around 14.6 seconds in total, yet the first results were emitted within just fractions of a second, almost instantaneously. That's why the PowerShell Pipeline appears very responsive to the user. Let's rewrite it with a classic foreach loop instead to see the performance difference. foreach needs to "download" the entire file collection first before it can start looping and filtering out the relevant files, so this script burns a lot more memory:

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

# search for all log files changed in the last 12 hours
$cutOff = (Get-Date).AddHours(-12)

# get all files
$files = Get-ChildItem -Path $env:windir -Filter *.log -Recurse -ErrorAction SilentlyContinue -Force

# use a list to store the original results
[System.Collections.ArrayList]$result = @()

foreach($_ in $files)
{
  # find only files changed within past 12 hours
  if ($_.LastWriteTime -ge $cutOff)
  {
    $null = $result.Add($_)
    $_.FullName
  }
}

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


Surprisingly, the foreach approach isn’t any faster in this example than using Foreach-Object and also takes roughly 15 seconds. For the user, things became even worse because the first results also took about 15 seconds to become visible. The pipeline approach in comparison emitted the first results almost instantaneously.

So foreach can be faster overall, but it is not guaranteed to be any faster. Here is why:

• The performance difference between foreach and Foreach-Object is very small - but it applies per iteration.
• If your loop doesn't iterate very often, the performance difference is minute.
• If your loop does iterate often (in other words: if your pipeline processes a lot of objects), the performance difference can be insanely huge and slow down a script by many minutes.
• In the example, both scripts take roughly the same time because Get-ChildItem does the heavy lifting by filtering out the .log files. The loop processes only 50 or so log files.

The only thing that is guaranteed with foreach is therefore that you will be burning a lot of memory and have to wait a long time for the first results to appear.

That's why the PowerShell Pipeline often makes a lot more sense - if only it were faster.

Investigating the Cause

Instead of abandoning the PowerShell Pipeline in favor of foreach, let’s investigate what exactly slows down the pipeline so much. We know by now that the fundamental problem is a very small performance drop that hits repeatedly per iteration and piles up to a monster slow-down when you have a lot of iterations.

So take another look at the initial script, which took a hefty 15 seconds to complete a loop with 100,000 iterations on computers of Type A, and even on computers of Type B still took more than 6x longer than foreach:

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$result = 1..100000 | ForEach-Object -Process {
  "I am at $_"
}

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


Can we use the pipeline and just replace Foreach-Object to find out whether the delay is caused by the pipeline in general or by Foreach-Object specifically? Yes we can:

Calling Scriptblocks Directly

Foreach-Object is just a way to execute an anonymous scriptblock that you submit to the parameter -Process. So instead of using Foreach-Object, you could just as well invoke a scriptblock directly, provided your code is located inside its process{} block:

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$result = 1..100000 | & { process {
  "I am at $_"
}}

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


To everyone's complete surprise, the moment you do this, your pipeline suddenly becomes lightning-fast!

Computer Type    Foreach-Object    direct ScriptBlock    Factor
Type A           15.0 s            0.13 s                115 x
Type B            0.6 s            0.13 s                4.6 x

By replacing Foreach-Object with a direct scriptblock call, the entire pipeline suddenly performs roughly at the same speed as the "perfect" foreach loop, plus you get the pipeline benefits (low resource usage, fast initial response) for free.

Also interesting: for direct scriptblock calls, the type of computer system doesn’t matter. They both perform the same. Here are some conclusions:

• It seems the cmdlet Foreach-Object is causing delays. The general pipeline concept is not to blame.
• Apparently, it is specifically the cmdlet Foreach-Object that has a special vulnerability on computers of Type A. Neither foreach nor direct scriptblock calls seem to care about the computer system the way Foreach-Object does.

It is not the Parameter Binder

The first suspicion is that the Parameter Binder inside Foreach-Object is causing the delays because when you call a scriptblock directly, you are skipping the Parameter Binder:

Foreach-Object is a so-called Advanced Function and comes with a lot of bells and whistles, including the Parameter Binder: you could actually pipe objects of different types to Foreach-Object, and the sophisticated Parameter Binder in Advanced Functions would automatically assign them to the appropriate parameters. Only, you seldom need this, and you most definitely do not need it in Foreach-Object.

When you call a scriptblock directly like we did in the last example, you are using a Simple Function without the sophisticated Parameter Binder.

To test the impact of the Parameter Binder on performance, let’s change the scriptblock and enable the Parameter Binder: whenever you add Attributes to a scriptblock, PowerShell enables all the functionality required by Advanced Functions:

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$result = 1..100000 | & {
  param
  (
    [Parameter(Mandatory,ValueFromPipeline)]
    [int]
    $InputObject
  )

  process
  {
    "I am at $InputObject"
  }
}

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


And yes: with the Parameter Binder enabled, execution time triples; the code takes 0.39 seconds instead of 0.13 seconds. It is still lightning fast compared to Foreach-Object, though.

So while the Parameter Binder does negatively impact performance, it is by far not the most relevant part.

Gotcha: ScriptBlock Invocation

The next unique feature of Foreach-Object is that it accepts arbitrary code to execute and receives the code via a parameter. So it needs to internally execute the scriptblock you submitted.

Testing Hard-Coded Functions

Let’s next examine if there is a penalty when you use functions in general inside the pipeline. So I created a pipeline-aware Foreach-ObjectHardCoded that does what the scriptblock did before:

function Foreach-ObjectHardCoded
{
  param
  (
    [Parameter(Mandatory,ValueFromPipeline)]
    [int]
    $InputObject
  )

  process
  {
    "I am at $InputObject"
  }
}

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$result = 1..100000 | Foreach-ObjectHardCoded

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


Foreach-ObjectHardCoded takes only 0.36 seconds to run, both on computers of Type A and Type B. So the majority of the performance penalty is not related to using functions in general, Advanced Functions, or the Parameter Binder.

The performance penalty must be related to the way Foreach-Object invokes the scriptblock that users pass.

Testing Functions with Dynamic Scriptblocks

To prove the theory, I wrote Foreach-ObjectDynamic which - like the original Foreach-Object - accepts a scriptblock as parameter:

function Foreach-ObjectDynamic
{
  param
  (
    [Parameter(Mandatory,ValueFromPipeline)]
    [int]
    $InputObject,

    [Parameter(Mandatory)]
    [ScriptBlock]
    $Process
  )

  process
  {
    $Process.InvokeReturnAsIs($InputObject)
  }

}

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$result = 1..100000 | Foreach-ObjectDynamic -Process {
  "I am at $_"
}

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


Boom - here the performance penalty strikes again: Foreach-ObjectDynamic shows the very same execution times as Foreach-Object, and also shows the insane difference between computers of Type A and Type B. So clearly, the way a scriptblock is invoked inside of Foreach-Object is causing all the trouble.

Result: Why it's Slow

Foreach-Object is so slow because it invokes the submitted scriptblock for each and every iteration, again and again. For each iteration, a new internal pipeline is created, and there is plenty of spin-up and spin-down per iteration. Apparently, on computers of Type A, this spin-up and spin-down is especially expensive, causing extreme delays and performance hits.

All of this does not happen when a scriptblock is called directly, or when you create a pipeline-aware function. It only occurs when a scriptblock is freshly invoked for each and every pipeline object.

Foreach-ObjectFast

To optimize Foreach-Object and make it just as fast as foreach, why not use the same techniques that are employed when you call a scriptblock directly or run a pipeline-aware function? So here is the optimized version Foreach-ObjectFast, more than 100x faster on computers of Type A, and still 2.5x faster on computers of Type B:

function Foreach-ObjectFast
{
  param
  (
    [ScriptBlock]
    $Process,

    [ScriptBlock]
    $Begin,

    [ScriptBlock]
    $End
  )

  begin
  {
    # construct a hard-coded anonymous simple function from
    # the submitted scriptblocks:
    $code = @"
& {
  begin
  {
    $Begin
  }
  process
  {
    $Process
  }
  end
  {
    $End
  }
}
"@
    # turn code into a scriptblock and invoke it
    # via a steppable pipeline so we can feed in data
    # as it comes in via the pipeline:
    $pip = [ScriptBlock]::Create($code).GetSteppablePipeline()
    $pip.Begin($true)
  }

  process
  {
    # forward incoming pipeline data to the custom scriptblock:
    $pip.Process($_)
  }

  end
  {
    $pip.End()
  }
}

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$result = 1..100000 | Foreach-ObjectFast -Process {
  "I am at $_"
}

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


Foreach-ObjectFast takes a mere 0.2 seconds compared to the 15 seconds that Foreach-Object required on computers of Type A. Now the pipeline is on equal footing with foreach loops, plus you get all the streaming benefits of the pipeline on top. And Foreach-ObjectFast does not seem to be vulnerable to the Type A vs. Type B distinction: it performs well on all computers.

Awesome stuff.

How it works…

The main speed penalty occurs because Foreach-Object invokes the submitted scriptblock via InvokeReturnAsIs() for each and every iteration, missing all the optimization and compiler strategies that speed up code executed repeatedly.

The truth is that Foreach-Object internally uses InvokeUsingCmdlet() rather than InvokeReturnAsIs(), but the former is a private method and not publicly available. In the end, both call InvokeAsPipe(), so invoking the scriptblock (one way or another) for each pipeline object appears to be the reason for the delay.

To eliminate the speed penalty, a new way is required to call the submitted scriptblock in a more efficient way that preserves the optimization strategies available from the PowerShell engine.

So essentially, Foreach-ObjectFast takes the submitted scriptblocks once and constructs a single scriptblock from them. It then uses a Steppable Pipeline to invoke this scriptblock multiple times, just as if it were a function inside a regular pipeline.

That's all there is to it. Hopefully, future versions of PowerShell integrate this approach.

Begin, Process, End…

Let’s play a bit with Foreach-ObjectFast and try and optimize some real-world code to better understand where you can expect speed increases - and where you shouldn’t bother optimizing.

Foreach-ObjectFast supports all three scriptblock sections: begin, process, and end. This piece of code counts the files in the Windows folder and takes a long time (82 seconds in my case, to be exact):

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$count = Get-ChildItem -Path c:\windows -Recurse -ErrorAction SilentlyContinue -Force |
  ForEach-Object -Begin { $c = 0 } -Process { $c++ } -End { $c }

$report = '{0} files found in {1:n2} seconds'
$report -f $count, $stopwatch.Elapsed.TotalSeconds


By simply replacing Foreach-Object with Foreach-ObjectFast, execution time dropped from 82 seconds to 26. That's a factor of 3.2. Even for the less vulnerable computers of Type B, the speed improvement isn't bad:

Computer Type    Foreach-Object    Foreach-ObjectFast    Factor
Type A           82.0 s            25.8 s                3.2 x
Type B           35.3 s            25.8 s                1.4 x

What About Where-Object?

Where-Object is yet another important pipeline cmdlet and very frequently used. The same time penalties apply. Fortunately, Where-Object is just a special case of Foreach-Object:

# Where-Object is just...
$r1 = Get-Service | Where-Object { $_.Status -eq 'Running' }

# a special case of Foreach-Object
$r2 = Get-Service | Foreach-Object { if ($_.Status -eq 'Running') { $_ } }

Compare-Object -ReferenceObject $r1 -DifferenceObject$r2 -IncludeEqual


Where-ObjectFast

That’s why you can apply the very same optimization strategies to Where-Object and create a lightning fast Where-ObjectFast as well:

function Where-ObjectFast
{
  param
  (
    [ScriptBlock]
    $FilterScript
  )

  begin
  {
    # construct a hard-coded anonymous simple function:
    $code = @"
& {
  process
  {
    if ($FilterScript) { $_ }
  }
}
"@
    # turn code into a scriptblock and invoke it
    # via a steppable pipeline so we can feed in data
    # as it comes in via the pipeline:
    $pip = [ScriptBlock]::Create($code).GetSteppablePipeline()
    $pip.Begin($true)
  }

  process
  {
    # forward incoming pipeline data to the custom scriptblock:
    $pip.Process($_)
  }

  end
  {
    $pip.End()
  }
}

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$result = 1..100000 | Where-ObjectFast -FilterScript {
  $_ % 5
}

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


The code produces a list of numbers, minus those that are divisible by 5. To get a taste of the speed difference, replace Where-ObjectFast with Where-Object: what took 0.23 seconds before now takes almost long enough to grab a new cup of coffee - 15 seconds, to be exact. Where-ObjectFast is 65x faster, or put differently: a hefty 6,500% speed difference.

Where-Object has a built-in Simple Mode where you can specify a property instead of a scriptblock. Where-ObjectFast does not implement this mode because it is not related to performance optimization. That's why Where-ObjectFast is not fully compatible with Where-Object. If you feel like it, maybe you can add the Simple Mode.
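For reference, this is what Where-Object's Simple Mode (the comparison-statement syntax introduced in PowerShell 3) looks like, next to the scriptblock form that Where-ObjectFast supports; a quick sketch:

```powershell
# Simple Mode: property name and operator as parameters, no scriptblock
Get-Service | Where-Object Status -eq 'Running'

# equivalent scriptblock syntax - the only form Where-ObjectFast accepts:
Get-Service | Where-ObjectFast { $_.Status -eq 'Running' }
```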

Real-World Impact

To test the real-world impact of Where-ObjectFast, let's return to the script from the beginning that found log files changed within the past 12 hours. The script takes 15 seconds in my case. We'll optimize it now simply by replacing Where-Object with Where-ObjectFast:

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

# search for all log files changed in the last 12 hours
$cutOff = (Get-Date).AddHours(-12)

# get all files
Get-ChildItem -Path $env:windir -Filter *.log -Recurse -ErrorAction SilentlyContinue -Force |
# find only files changed within past 12 hours
Where-ObjectFast { $_.LastWriteTime -ge $cutOff } |
# store them also in $result
Tee-Object -Variable result |
# output path
Select-Object -ExpandProperty FullName

$report = '{0} elements in {1:n2} seconds'
$report -f $result.Count, $stopwatch.Elapsed.TotalSeconds


Surprisingly, the speed increase is negligible. Or rather, not so surprisingly: at the beginning of this article, this script served as an example where foreach would not improve the speed, and since Foreach-ObjectFast and Where-ObjectFast just aim to be as fast as foreach, they can't improve the speed either.

Optimization is Per Iteration

The reason is simple, as you know by now: the more iterations you have, the higher the speed improvement. Since Get-ChildItem prefilters the file list and returns only 50 or so log files in total, there are really only very few iterations, so the optimization does not produce any significant effect:

Get-ChildItem -Path $env:windir -Filter *.log -Recurse -ErrorAction SilentlyContinue -Force |
  Measure-Object |
  Select-Object -ExpandProperty Count


So if you don't see huge speed improvements immediately, keep in mind: Foreach-Object and Where-Object have a tiny speed penalty per iteration. This can quickly add up to minutes of wasted time, depending on how many objects traverse the pipeline. If your pipeline just processes a handful of objects, though, don't bother optimizing it.

What are "Type A" and "Type B" Computers?

One mystery remains: field data indicated that some computers (called Type A) are way more vulnerable to the Foreach-Object performance penalty than others (called Type B). Way more. The initial script took 0.6 seconds on Type B but a hefty 15 seconds (!) on Type A. So here is what turns a computer into Type A or Type B:

ScriptBlock Logging Hits ScriptBlock Invocation

Type A computers are machines where full scriptblock logging is enabled. Scriptblock logging is an important security feature and logs the sources of each invoked scriptblock. That comes at a cost. Foreach-Object was invented before scriptblock logging came along, so back then it did not seem like a bad idea to invoke a scriptblock per iteration. Now, with scriptblock logging around, this really hurts.

Foreach-ObjectFast is not negatively affected by scriptblock logging, nor are direct scriptblock calls or pipeline-aware functions, because they all have one thing in common: when you invoke them, the scriptblock is optimized and logged just once, before the pipeline starts. Foreach-Object and Where-Object, however, seem to invoke the logging logic for each and every invocation.

Enabling and Disabling ScriptBlock Logging

If you want to test-drive the effect yourself, run this to enable full scriptblock logging:

#requires -RunAsAdministrator

$path = "Registry::HKLM\Software\Policies\Microsoft\Windows\PowerShell\ScriptBlockLogging"
$exists = Test-Path -Path $path
if (!$exists) { $null = New-Item -Path $path -Force }
Set-ItemProperty -Path $path -Name EnableScriptBlockLogging -Type DWord -Value 1
Set-ItemProperty -Path $path -Name EnableScriptBlockInvocationLogging -Type DWord -Value 1  Don’t forget to restart your computer. Now your computer is a Type A and pipeline performance is hugely degraded. To turn full scriptblock logging off again and make your machine a Type B, run this: #requires -RunAsAdmin$path = "Registry::HKLM\Software\Policies\Microsoft\Windows\PowerShell\ScriptBlockLogging"
Remove-ItemProperty -Path $path -Name EnableScriptBlockLogging -ErrorAction SilentlyContinue
Remove-ItemProperty -Path $path -Name EnableScriptBlockInvocationLogging -ErrorAction SilentlyContinue



Again, restart your computer just to make sure. Changing the registry keys seems to enable and disable scriptblock logging eventually, but only a restart makes sure the settings take effect.
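To check which type your machine currently is, you can read the policy values back; this is a quick sketch using only built-in cmdlets (on Type B machines, the key or its properties simply don't exist, so nothing is returned):

```powershell
$path = "Registry::HKLM\Software\Policies\Microsoft\Windows\PowerShell\ScriptBlockLogging"

# returns the logging values on Type A machines; silent on Type B
Get-ItemProperty -Path $path -ErrorAction SilentlyContinue |
  Select-Object EnableScriptBlockLogging, EnableScriptBlockInvocationLogging
```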

Call for Action

There is a lot to gain here, and a lot of work ahead as well. In a perfect world, both Foreach-Object and Where-Object would use a steppable pipeline instead of calling Invoke…() per iteration. Since all parameters could remain intact, the change would not be expected to introduce any breaking changes. Since both cmdlets are among the most frequently used, even a small performance gain overall would speed up PowerShell code tremendously.

Because of the way Foreach-ObjectFast and Where-ObjectFast work, there may be a different debugging experience, and internal variables such as $MyInvocation may yield different information. For most every-day scenarios, this won't matter. In special cases, though, you may have to review your code if you replace the built-in Foreach-Object and Where-Object cmdlets.

The great thing about PowerShell being open-source is that anyone with an idea can submit suggestions and even help implement improvements. I just submitted a Feature Request. Please feel free to comment on or enhance it!

I have tried to identify some of the major speed-limiting factors of Foreach-Object, but there surely is more to investigate. I took the initial script from this example, raised the iterations to 1,000,000, and ran averaged tests on a number of approaches, testing only Type B computers (the ones with the "fast" Foreach-Object and scriptblock logging disabled):

Approach                                                      Factor    Type B
Foreach-Object                                                1.0 x     8.7 s
Foreach-ObjectFast                                            3.9 x     2.2 s
. { process { ... }}                                          6.2 x     1.4 s
& { process { ... }}                                          6.7 x     1.3 s
function abc { [CmdletBinding()]param($x) process {($x)} }    2.4 x     3.7 s
function abc { process { $_ }}                                6.7 x     1.3 s

Foreach-ObjectFast performs about 4x faster than Foreach-Object; however, both a direct scriptblock invocation in its own scope and a simple function with a process{} block are still way faster, running 6.7 times faster than Foreach-Object.

So even with a steppable pipeline, there is still a substantial penalty for invoking a scriptblock that was submitted via a parameter, and room to further improve Foreach-ObjectFast.

You don’t have to wait for this. If you want to maximize the speed of your pipelines, provided they have a significant number of iterations, either replace Foreach-Object with a direct scriptblock call (& { process { ... }}), or hard-code your scriptblock into a simple function with a process block.
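As a sketch, the two workarounds look like this (the doubling inside the blocks is just a stand-in for your per-object work, and the function name Double-Value is made up for illustration):

```powershell
# option 1: direct scriptblock call with a process{} block
$result = 1..100000 | & { process { $_ * 2 } }

# option 2: a simple (non-advanced) function that hard-codes the scriptblock
function Double-Value { process { $_ * 2 } }
$result = 1..100000 | Double-Value
```

Both avoid the per-iteration scriptblock invocation that makes Foreach-Object expensive, because the process{} block is compiled and logged only once when the pipeline starts.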

If you’d like to suggest further improvements, please visit the project, fork and create pull requests.

Maybe by the time psconf.eu 2020 opens, we'll see some of this find its way into PowerShell. Meanwhile, it's really simple to include Foreach-ObjectFast and Where-ObjectFast in your scripts and benefit from a turbo charger without having to re-code anything.