Traditional looping constructs like “foreach” and “do...while” cannot stream: you have to wait until all results have been collected. With a simple trick, you can add streaming.
By embedding traditional loops into a ScriptBlock you get real-time streaming. This way, you can process the results immediately as they become available, and add an async touch to your scripts.
Let’s go step-by-step and examine first the benefits of streaming, then look at when it makes sense to add streaming to classic loops.
Why is Streaming Important?
The PowerShell Pipeline has built-in real-time streaming, so you receive results when they are created:
Get-ChildItem -Path c:\windows -Recurse -Filter *.log -File -ErrorAction SilentlyContinue |
    ForEach-Object { 'processing {0}' -f $_.FullName } |
    Out-GridView
While Get-ChildItem is traversing your Windows folder, the pipeline starts emitting results as they become available. You can see this in Out-GridView: it adds new lines as they are emitted by Get-ChildItem.
This makes a script very responsive (a user sees initial results very quickly) and saves a lot of memory (only one file object at a time needs to be accommodated in memory).
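This interleaving is easy to verify without touching the file system. The following minimal sketch uses the numbers 1 to 3 as stand-in data and a list named $log (both chosen only for illustration) to record the order in which the two pipeline stages run:

```powershell
# $log records the order of events; the name is illustrative
$log = [System.Collections.Generic.List[string]]::new()

# first stage produces an element, second stage consumes it
1..3 |
    ForEach-Object { $log.Add("produced $_"); $_ } |
    ForEach-Object { $log.Add("consumed $_") }

$log
```

Each element reaches the second stage before the next one is produced, so the log alternates between "produced" and "consumed" entries.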
Classic Loops Lack Streaming
PowerShell also supports all classic loop constructs such as foreach and do...while. While they perform better overall, they do not support streaming and return their results only when everything is processed:
$files = Get-ChildItem -Path c:\windows -Recurse -Filter *.log -File -ErrorAction SilentlyContinue
$results = foreach($_ in $files)
{ 'processing {0}' -f $_.FullName }
$results | Out-GridView
Now you have to wait a long time for Get-ChildItem to produce the results. Once they are in, foreach can process the data in $files very fast, and the overall run time of this script is shorter than with the pipeline approach. However, to a user this approach appears much slower because there is a long wait until the first results appear.
You can see this in Out-GridView: it opens only after a long delay, then shows all results almost instantly.
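The buffering can be made visible with a small experiment. This is a sketch with made-up input (the numbers 1 to 3) and an illustrative $log list that records when an element is produced by the loop and when it is consumed afterwards:

```powershell
# $log records the order of events; the name is illustrative
$log = [System.Collections.Generic.List[string]]::new()

# the loop must finish before $results can be used
$results = foreach ($n in 1..3)
{
    $log.Add("produced $n")
    $n
}

# only now can downstream code see the data
$results | ForEach-Object { $log.Add("consumed $_") }

$log
```

All "produced" entries appear before the first "consumed" entry: nothing reaches the consumer until the loop is done.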
Overview: Streaming vs. Variables
If you must use a classic loop construct like foreach or do...while (and there are good reasons for it), you must save the results to a variable. However, you can add streaming behavior simply by embedding the loop in a ScriptBlock. ScriptBlocks support streaming by default.
Let’s first look at why that might be a good idea.
Passing Results in Real-Time
I rewrote the code and embedded the loop in a ScriptBlock. No longer do I need to assign the results of foreach to a variable like $results. Instead, the enclosing ScriptBlock can stream the results in real-time to downstream commands such as Out-GridView:
$files = Get-ChildItem -Path c:\windows -Recurse -Filter *.log -File -ErrorAction SilentlyContinue
& {
    foreach($_ in $files)
    { 'processing {0}' -f $_.FullName }
} | Out-GridView
It works but isn’t any faster and still has a long initial delay. This comes as no surprise: it simply takes Get-ChildItem a long time to gather all the data in $files in the first place. By the time foreach starts to do something (and in fact now outputs results in real-time via streaming), the initial delay has already taken place.
That’s why it is important to first understand when streaming can help you, and when you shouldn’t bother considering it:
Classic Loops Must Have All Data
There is no way to work around the fact that a classic loop like foreach, when it iterates over a variable, can only start running once all of the data is present in that variable. If you want to change that, you must rewrite your code and use the PowerShell Pipeline and ForEach-Object instead.
So classic loops only make sense when the data is already present in some variable, or is emitted directly from some command.
A “Real” Streaming Example
To see streaming in action, let’s assume the data is already present in some variable $files, and also add a Start-Sleep to the code to pretend it is doing something very expensive with the data:
# $files is supposed to be filled with data already
$result = foreach($_ in $files)
{
    # artificially slowing things down a bit
    'processing {0}' -f $_.FullName
    Start-Sleep -Seconds 1
}
$result | Out-GridView
You now have to wait a very long time for the results to appear because foreach returns its result only when it has processed all data.
“That’s not true!”, you may argue. When you run the foreach loop from the example above on its own, without the variable assignment and without Out-GridView, it does return data in real-time as it becomes available. Please look again: we are talking about returning data so that your script can do something with it. foreach returns its result only when it has completed the entire loop. What you see when you run the bare loop is something different: whenever you output data directly to the console (not assigning it to a variable, not piping it to another command), PowerShell emits the results to the console immediately as they become available.
Enabling Streaming
Now let’s turn on streaming for the foreach loop by embedding the loop into a ScriptBlock:
& {
    # $files is supposed to be filled with data already
    foreach($_ in $files)
    {
        # artificially slowing things down a bit
        'processing {0}' -f $_.FullName
        Start-Sleep -Seconds 1
    }
} | Out-GridView
This time, the results are passed on to Out-GridView in real-time, without having to first collect them in a variable. The user gets the first results instantaneously, and a programmer could start filtering out data immediately to conserve memory.
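To confirm that the enclosing ScriptBlock really hands each element downstream as soon as the loop emits it, here is a minimal sketch. The input (the numbers 1 to 3) and the $log list are illustrative stand-ins for $files and Out-GridView:

```powershell
# $log records the order of events; the name is illustrative
$log = [System.Collections.Generic.List[string]]::new()

# the same foreach loop, now wrapped in a ScriptBlock and piped onward
& {
    foreach ($n in 1..3)
    {
        $log.Add("produced $n")
        $n
    }
} | ForEach-Object { $log.Add("consumed $_") }

$log
```

The log now alternates between "produced" and "consumed": each element is handled downstream before the loop moves on to the next iteration.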
When Adding Streaming Makes Sense
After some chewing on the code, you’ll soon realize that embedding foreach inside a ScriptBlock isn’t very useful: you could have used a ScriptBlock in the first place and abandoned foreach altogether:
# $files is supposed to be filled with data already
$files | & {
    process
    {
        # artificially slowing things down a bit
        'processing {0}' -f $_.FullName
        Start-Sleep -Seconds 1
    }
} | Out-GridView
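A ScriptBlock that receives pipeline input can contain begin, process, and end blocks, just like a function. The following sketch, using made-up numeric input instead of $files, shows which block runs when:

```powershell
# collect the output only to make the order easy to inspect
$output = 1..3 | & {
    # runs once, before the first element arrives
    begin   { 'start' }
    # runs once per incoming pipeline element
    process { "item $_" }
    # runs once, after the last element was processed
    end     { 'done' }
}

$output
```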
All you need to do is place your looping code inside a process{} block so it gets repeated for each incoming pipeline element. So where does adding streaming make sense?
Do...While Loops Do Matter
There is one loop that can’t easily be replaced with a pipeline: do...while. It is special because it determines freshly before or after each iteration whether it should iterate again. This is often used for reading database records or file content until the data source encounters an End of File.
You can’t convert do...while loops easily to a pipeline because typically you don’t have data to start the pipeline with. Instead, the loop itself produces the data.
With the trick above, you can keep using do...while loops and just enable streaming by embedding the loop into a ScriptBlock.
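As a minimal sketch of this idea, the following code drains a queue with a do...while loop. The queue and its three values are invented for illustration and merely stand in for a data source that is read until it is exhausted; $log again records the order of events:

```powershell
# a queue stands in for a data source that is read until it is empty
$queue = [System.Collections.Generic.Queue[int]]::new()
10, 20, 30 | ForEach-Object { $queue.Enqueue($_) }

$log = [System.Collections.Generic.List[string]]::new()

& {
    do
    {
        # the loop itself produces the data
        $item = $queue.Dequeue()
        $log.Add("read $item")
        $item
    } while ($queue.Count -gt 0)
} | ForEach-Object { $log.Add("handled $_") }

$log
```

Every value is handled downstream right after it was read, even though there was no collection to start a pipeline with.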
Here are two real-world scenarios:
Reading Large Files
Assume you want to read large log files with maximum control over the read process. Here is an example:
# take a text file to play with
# replace with path to any text file you want
# ensure the file exists:
$Path = "$pshome\types.ps1xml"
[IO.StreamReader]$reader = [System.IO.StreamReader]::new($Path)
$result = while (-not $reader.EndOfStream)
{
    # read current line
    $reader.ReadLine()
    # add artificial delay to pretend this was a HUGE file
    Start-Sleep -Milliseconds 300
}
# close and dispose the streamreader properly:
$reader.Close()
$reader.Dispose()
$result | Out-GridView
I added an artificial delay to the loop to mimic reading a really huge file. When you run the code, a StreamReader reads the file line-by-line. A While loop checks at the beginning of each iteration whether all lines have been read.
There are plenty of ways to read text files, and PowerShell sports its own Get-Content which is really simple to use. I chose to use a StreamReader here solely to have a use case where checking some End of File property is required.
Since While does not support streaming, all results must be stored in a variable like $result first, and the user sees the result only when all lines have been read. With large files (or thanks to the artificial delay in the example) this can take very long.
Adding Streaming
Let’s add real-time streaming to While by embedding the loop into a ScriptBlock:
# take a text file to play with
# replace with path to any text file you want
# ensure the file exists:
$Path = "$pshome\types.ps1xml"
[IO.StreamReader]$reader = [System.IO.StreamReader]::new($Path)
# embed loop in scriptblock:
& {
    while (-not $reader.EndOfStream)
    {
        # read current line
        $reader.ReadLine()
        # add artificial delay to pretend this was a HUGE file
        Start-Sleep -Milliseconds 10
    }
# process results in real-time as they become available:
} | Out-GridView
# close and dispose the streamreader properly:
$reader.Close()
$reader.Dispose()
When you run this, the results are emitted to Out-GridView immediately. There is no longer any need to store all results in a variable like $result, and a scripter could add filters like Where-Object to immediately pick out the useful lines.
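Here is a sketch of what such a filter could look like. To keep it self-contained, it writes four made-up log lines to a temporary file instead of reading types.ps1xml. The try/finally block is an addition of this sketch, not part of the example above: it makes sure the reader is disposed even if something in the pipeline fails.

```powershell
# create a small sample file (content is invented for this sketch)
$Path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $Path -Value 'INFO start', 'ERROR disk full', 'INFO done', 'ERROR timeout'

$reader = [System.IO.StreamReader]::new($Path)
try
{
    # lines are filtered as they are read; only matches are kept in memory
    $errors = & {
        while (-not $reader.EndOfStream)
        {
            $reader.ReadLine()
        }
    } | Where-Object { $_ -like 'ERROR*' }
}
finally
{
    # release the file handle, then remove the sample file
    $reader.Dispose()
    Remove-Item -Path $Path
}

$errors
```

Only the two lines starting with ERROR end up in $errors; the other lines are discarded while the file is still being read.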
Reading Databases
When you execute a SQL statement on a database, you receive records one by one until the database signals EOF (End-of-File). PowerShell does not know in advance how often the loop is going to iterate, which is why a head-controlled While loop is used.
There are plenty of ways to connect to a database and read its content. I chose an approach via COM objects just to come up with an example where it is required to check an EOF (End-of-File) property in a While loop. I am sure there are smarter ways to read databases.
Querying Database via SQL
The next script assumes a local SQL Server instance. It connects to the master database and queries sys.databases, which returns one record per database on that instance. If you’d like to connect to a different database, change $connectionString accordingly.
# define your database details:
$InstanceId = "$env:computername\FIRSTDB"
$Database = "master"
$connectionString = "Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=$Database;Data Source=$InstanceId"
$sql = "select * from sys.databases"
# connect to database
$connection = New-Object -ComObject ADODB.Connection
$connection.Open($connectionString)
$rs = $connection.Execute($sql)
# loop through records
$result = while ($rs.Eof -eq $false)
{
    # turn each record into an object:
    $hash = @{}
    foreach($field in $rs.Fields)
    {
        $hash[$field.Name] = $field.Value
    }
    [PSCustomObject]$hash
    $rs.MoveNext()
}
# close database
$rs.Close()
$connection.Close()
# emit results
$result | Out-GridView
Since While loops can’t stream, you get the results only when all records have been processed. For a small system view like this one, the delay does not matter, but when you are querying real tables with thousands of records, it does.
Adding Streaming
Now let’s add just a tiny bit of code to make the While loop stream. This way, you get the results from your database in real-time as they are retrieved:
# define your database details:
$InstanceId = "$env:computername\FIRSTDB"
$Database = "master"
$connectionString = "Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=$Database;Data Source=$InstanceId"
$sql = "select * from sys.databases"
# connect to database
$connection = New-Object -ComObject ADODB.Connection
$connection.Open($connectionString)
$rs = $connection.Execute($sql)
# loop through records
# embed code in scriptblock
& {
    while ($rs.Eof -eq $false)
    {
        # turn each record into an object:
        $hash = @{}
        foreach($field in $rs.Fields)
        {
            $hash[$field.Name] = $field.Value
        }
        [PSCustomObject]$hash
        $rs.MoveNext()
    }
# emit the results in real-time to the next command
# i.e. Out-GridView:
} | Out-GridView
# close database
$rs.Close()
$connection.Close()
By embedding the loop inside a ScriptBlock, you can stream its result directly to Out-GridView in real-time. There is no need to hog memory in $result, and the user gets immediate feedback.
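If you have no SQL Server at hand, the same pattern can be tried with an in-memory table. Note that this sketch swaps the ADODB COM objects for an ADO.NET DataTable and its DataTableReader, and the three rows are made up. The head-controlled While loop and the enclosing ScriptBlock are the same idea as above:

```powershell
# build a small in-memory table (rows are invented for this sketch)
$table = [System.Data.DataTable]::new('databases')
[void]$table.Columns.Add('name', [string])
[void]$table.Columns.Add('database_id', [int])
[void]$table.Rows.Add('master', 1)
[void]$table.Rows.Add('tempdb', 2)
[void]$table.Rows.Add('model', 3)

$reader = $table.CreateDataReader()
try
{
    $names = & {
        # Read() advances to the next record and returns $false at the end
        while ($reader.Read())
        {
            # turn each record into an object:
            $hash = [ordered]@{}
            for ($i = 0; $i -lt $reader.FieldCount; $i++)
            {
                $hash[$reader.GetName($i)] = $reader.GetValue($i)
            }
            [PSCustomObject]$hash
        }
    } | Where-Object database_id -gt 1 | ForEach-Object name
}
finally
{
    $reader.Dispose()
}

$names
```

Each record is filtered as soon as it is read, so only the names of the matching records are kept.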