How to speed up PowerShell Get-ChildItem over UNC

Okay, this is how I’m doing it, and it seems to work.

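# List the files with cmd's dir (through the helper BAT file shown below).
# Each output line is one full path.
$files = cmd /c "$GETFILESBAT \\$server\logs\$filemask"
foreach( $f in $files ) {
    # Skip any blank lines in the dir output.
    if( $f.length -gt 0 ) {
        select-string -Path $f -pattern $regex
    }
}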

Then $GETFILESBAT points to this:

@dir /a-d /b /s %1
@exit

The PowerShell script writes this BAT file and deletes it afterwards, so everything lives in a single PowerShell script, even though the listing itself is done by cmd.exe rather than by PowerShell.
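
The temporary BAT file may not be strictly necessary, since the test script at the end of this answer calls dir through cmd /c directly. Here is a minimal sketch of that variant. The function name Get-DirFileList is hypothetical, and the sketch assumes Windows with cmd.exe on the path. Unlike the snippet above, it quotes the path so that folders with spaces work.

```powershell
# Hypothetical helper (not from the original answer): list files through
# "cmd /c dir" without writing a temporary BAT file. Assumes Windows.
function Get-DirFileList {
    param(
        [Parameter(Mandatory = $true)] [string]$Path,   # for example \\server\logs
        [string]$FileMask = '*.*'
    )
    # /a-d = files only, /b = bare format, /s = recurse and print full paths.
    # 2>nul suppresses cmd's "File Not Found" message when nothing matches.
    cmd /c "dir /a-d /b /s `"$Path\$FileMask`" 2>nul" |
        Where-Object { $_.Length -gt 0 }                # drop blank lines
}
```

It could then feed the same search as before, for example: Get-DirFileList -Path "\\$server\logs" -FileMask $filemask | ForEach-Object { Select-String -LiteralPath $_ -Pattern $regex }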

My preliminary timings show this to be far faster than Get-ChildItem: about two thousand times faster in the worst case measured below.

I tested Get-ChildItem (gci) vs. cmd dir vs. Microsoft.VisualBasic.FileIO.FileSystem.GetFiles, the latter from the link @Shawn Melton referenced.

The bottom line is that, for daily use on local drives, GetFiles is the fastest. By far. CMD DIR is respectable. Once you introduce a slower network connection with many files, CMD DIR is slightly faster than GetFiles. Then Get-ChildItem… wow, this ranges from not too bad to horrible, depending on the number of files involved and the speed of the connection.

Some test runs. I’ve moved GCI around in the tests to make sure the results were consistent.

10 iterations of scanning c:\windows\temp for *.tmp files

.\test.ps1 "c:\windows\temp" "*.tmp" 10
GetFiles ... 00:00:00.0570057
CMD dir  ... 00:00:00.5360536
GCI      ... 00:00:01.1391139

GetFiles is 10x faster than CMD dir, which itself is more than 2x faster than GCI.

10 iterations of scanning c:\windows\temp for *.tmp files with recursion

.\test.ps1 "c:\windows\temp" "*.tmp" 10 -recurse
GetFiles ... 00:00:00.7020180
CMD dir  ... 00:00:00.7644196
GCI      ... 00:00:04.7737224

GetFiles is a little faster than CMD dir, and both are roughly 6x to 7x faster than GCI.

10 iterations of scanning an on-site server on another domain for application log files

.\test.ps1 "\\closeserver\logs\subdir" "appname*.*" 10
GetFiles ... 00:00:00.3590359
CMD dir  ... 00:00:00.6270627
GCI      ... 00:00:06.0796079

GetFiles is about 2x faster than CMD dir, which is itself about 10x faster than GCI.

One iteration of scanning a distant server on another domain for application log files, with many files involved

.\test.ps1 "\\distantserver.company.com\logs\subdir" "appname.2011082*.*"
CMD dir  ... 00:00:00.3340334
GetFiles ... 00:00:00.4360436
GCI      ... 00:11:09.5525579

CMD dir is fastest going to the distant server with many files, but GetFiles is respectably close. GCI on the other hand is a couple of thousand times slower.
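
As a quick check of the "couple of thousand times" figure, the ratios can be computed straight from the timings printed above. This is only arithmetic on the three values from that run. The variable names are illustrative.

```powershell
# Timings copied from the single-iteration distant-server run above.
$culture  = [cultureinfo]::InvariantCulture
$cmdDir   = [timespan]::Parse('00:00:00.3340334', $culture)
$getFiles = [timespan]::Parse('00:00:00.4360436', $culture)
$gci      = [timespan]::Parse('00:11:09.5525579', $culture)

# How many times longer GCI took than each alternative, rounded.
$gciVsCmdDir   = [math]::Round($gci.TotalSeconds / $cmdDir.TotalSeconds)
$gciVsGetFiles = [math]::Round($gci.TotalSeconds / $getFiles.TotalSeconds)
"GCI vs CMD dir : ${gciVsCmdDir}x"
"GCI vs GetFiles: ${gciVsGetFiles}x"
```

That works out to roughly 2,000x against CMD dir and roughly 1,500x against GetFiles.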

Two iterations of scanning a distant server on another domain for application log files, with many files

.\test.ps1 "\\distantserver.company.com\logs\subdir" "appname.20110822*.*" 2
CMD dir  ... 00:00:00.9360240
GetFiles ... 00:00:01.4976384
GCI      ... 00:22:17.3068616

More or less linear increase as test iterations increase.

10 iterations of scanning a distant server on another domain for application log files, with fewer files

.\test.ps1 "\\distantserver.company.com\logs\othersubdir" "appname.2011082*.*" 10
GetFiles ... 00:00:00.5304170
CMD dir  ... 00:00:00.6240200
GCI      ... 00:00:01.9656630

Here GCI is not too bad, but GetFiles is still more than 3x faster than GCI, and CMD dir is close behind GetFiles.

Conclusion

GCI needs a -raw or -fast option that does not try to do so much. In the meantime, GetFiles is a healthy alternative: it is only occasionally a little slower than CMD dir and usually faster (possibly because CMD dir has to spawn cmd.exe).
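
For a PowerShell-only version of the search at the top of this answer, GetFiles can replace the BAT file. The following is a sketch only. The function name Search-FilesFast is hypothetical, and it always recurses, mirroring the /s switch used in the BAT file.

```powershell
# Load the assembly that provides Microsoft.VisualBasic.FileIO.FileSystem.
Add-Type -AssemblyName Microsoft.VisualBasic

# Hypothetical helper (not from the original answer): list files with
# GetFiles, then run Select-String over each one.
function Search-FilesFast {
    param(
        [Parameter(Mandatory = $true)] [string]$Path,      # for example \\server\logs
        [Parameter(Mandatory = $true)] [string]$FileMask,  # for example appname*.*
        [Parameter(Mandatory = $true)] [string]$Regex
    )
    $option = [Microsoft.VisualBasic.FileIO.SearchOption]::SearchAllSubDirectories
    # GetFiles returns full paths as plain strings, so nothing is built per file
    # beyond the name itself.
    $files = [Microsoft.VisualBasic.FileIO.FileSystem]::GetFiles($Path, $option, $FileMask)
    foreach ($f in $files) {
        # -LiteralPath avoids wildcard expansion of names containing [ or ].
        Select-String -LiteralPath $f -Pattern $Regex
    }
}
```

A call would look like: Search-FilesFast -Path "\\$server\logs" -FileMask $filemask -Regex $regex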

For reference, here’s the test.ps1 code.

# test.ps1: times three ways of listing files, repeating each one $n times.
param ( [string]$path, [string]$filemask, [switch]$recurse, [int]$n=1 )
# Load the assembly that provides Microsoft.VisualBasic.FileIO.FileSystem.
Add-Type -AssemblyName Microsoft.VisualBasic
# --- 1. FileSystem.GetFiles ---
write-host "GetFiles... " -nonewline
$dt = get-date
for($i=0; $i -lt $n; $i++) {
  if( $recurse ) {
    [Microsoft.VisualBasic.FileIO.FileSystem]::GetFiles( $path,
      [Microsoft.VisualBasic.FileIO.SearchOption]::SearchAllSubDirectories, $filemask
    ) | out-file ".\testfiles1.txt"
  }
  else {
    [Microsoft.VisualBasic.FileIO.FileSystem]::GetFiles( $path,
      [Microsoft.VisualBasic.FileIO.SearchOption]::SearchTopLevelOnly, $filemask
    ) | out-file ".\testfiles1.txt"
  }
}
$dt2 = get-date
write-host $dt2.subtract($dt)
# --- 2. cmd.exe dir (/a-d = files only, /b = bare names, /s = recurse) ---
write-host "CMD dir... " -nonewline
$dt = get-date
for($i=0; $i -lt $n; $i++) {
  if( $recurse ) { cmd /c "dir /a-d /b /s $path\$filemask" | out-file ".\testfiles2.txt" }
  else { cmd /c "dir /a-d /b $path\$filemask" | out-file ".\testfiles2.txt" }
}
$dt2 = get-date
write-host $dt2.subtract($dt)
# --- 3. Get-ChildItem ---
write-host "GCI... " -nonewline
$dt = get-date
for($i=0; $i -lt $n; $i++) {
  if( $recurse ) {
    get-childitem "$path\*" -include $filemask -recurse | out-file ".\testfiles0.txt"
  }
  else { get-childitem "$path\*" -include $filemask | out-file ".\testfiles0.txt" }
}
$dt2 = get-date
write-host $dt2.subtract($dt)
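
The three timing sections repeat the same get-date bookkeeping. As an alternative sketch, the built-in Measure-Command cmdlet can do the timing. The helper name Measure-Listing is hypothetical, and unlike test.ps1 this version discards the listing instead of writing it to a file, so its numbers would not be directly comparable with the ones above.

```powershell
# Hypothetical helper (not part of test.ps1): run a script block a number of
# times and report the total elapsed time in the same "label ... time" layout.
function Measure-Listing {
    param(
        [Parameter(Mandatory = $true)] [string]$Label,
        [Parameter(Mandatory = $true)] [scriptblock]$Action,
        [int]$Iterations = 1
    )
    $elapsed = Measure-Command {
        for ($i = 0; $i -lt $Iterations; $i++) {
            & $Action | Out-Null    # discard the listing; only the time matters
        }
    }
    '{0,-8} ... {1}' -f $Label, $elapsed
}
```

A call might look like: Measure-Listing -Label 'GCI' -Iterations 10 -Action { Get-ChildItem 'c:\windows\temp\*' -Include '*.tmp' }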
