Improving filter performance in PowerShell by using collections

Background

When you think of filtering in PowerShell, the first thing that often comes to mind is the venerable Where-Object cmdlet. It's used widely and is an easy way to remove elements from a pipeline based on conditions you supply. Consider this basic example:

PS> 1..10 | Where-Object {$_ -eq 7}
7

With an array of only 10 objects, this task takes virtually no time to complete. To see how the cost grows, we can use Measure-Command to time a similar filter over 1,000,000 numbers:

PS> $Results = Measure-Command { 1..1000000 | Where-Object {$_ -eq 842} }

PS> $Results.TotalSeconds
6.2798882

It took over 6 seconds to complete. If this command sat inside a foreach loop (or even a nested foreach loop), that cost would quickly add up in overall run time.

Enter the .FindAll() method.

FindAll will be familiar if you've written C#, and it's available on several .NET collection types. In this case we're going to look at the [System.Collections.Generic.List[T]] class. (You can read more about it in Microsoft's documentation.) When we call this method from PowerShell, we pass it a script block that declares a single parameter and evaluates to true or false for each element. PowerShell converts that script block into the predicate delegate that FindAll expects, and the method returns every element for which the predicate is true. Here is the earlier example using FindAll instead of Where-Object:

PS> $Results = Measure-Command {
    # Create the list
    $List = [System.Collections.Generic.List[Object]]::new(1..1000000)
    # Query the list for the number 842
    $List.FindAll({ param($item); $item -eq 842 })
}

PS> $Results.TotalSeconds
3.2298538

That cuts the run time by almost 50% for the same filter, even though the measurement also includes the time spent building the list!
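
To make the script block conversion more visible, here is a minimal sketch that casts the script block to the delegate type explicitly. The variable names ($IsEven, $Evens) and the small 1..20 range are assumptions chosen for illustration; the example above relies on PowerShell performing the same conversion implicitly.

```powershell
# Build a small, strongly typed list so the example runs instantly.
$List = [System.Collections.Generic.List[int]]::new([int[]](1..20))

# Cast the script block to the Predicate[int] delegate that FindAll expects.
# PowerShell normally does this for us when a script block is passed directly.
$IsEven = [Predicate[int]]{ param($item) $item % 2 -eq 0 }

# FindAll returns a new List[int] containing every element the predicate accepts.
$Evens = $List.FindAll($IsEven)

# Show the matches as a comma-separated string.
$Evens -join ','
```

Storing the predicate in a variable like this is optional, but it lets the same filter be reused across several lists of the same element type.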

Conclusion

This method isn't one you'll use every day, but under the right circumstances, it can be a powerful way to query a collection for specific criteria and improve your script performance.
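
One such circumstance is the loop scenario mentioned earlier. The sketch below builds the list once and then queries it once per loop iteration. The lookup values in $Targets and the 1..10000 range are hypothetical, picked only to keep the example short; timings will vary by machine, so measure with your own data before relying on it.

```powershell
# Build the list once, outside the loop, so its construction cost is paid a single time.
$List = [System.Collections.Generic.List[int]]::new([int[]](1..10000))

# Hypothetical lookup values, chosen only for illustration.
$Targets = 842, 5000, 9999

$Found = foreach ($Target in $Targets) {
    # The script block reads $Target from the enclosing scope on each iteration.
    $List.FindAll({ param($item) $item -eq $Target })
}

# Each FindAll result is unrolled into $Found as it is emitted by the loop.
$Found -join ','
```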