How do I compress files over 2 GB using PowerShell?

To explain the limitation noted in the PowerShell docs for Compress-Archive:

The Compress-Archive cmdlet uses the Microsoft .NET API System.IO.Compression.ZipArchive to compress files. The maximum file size is 2 GB because there’s a limitation of the underlying API.

This happens because the cmdlet uses a MemoryStream to hold the bytes in memory and then write them to the file. Inspecting the InnerException produced by the cmdlet, we can see:

System.IO.IOException: Stream was too long.
   at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   at CallSite.Target(Closure , CallSite , Object , Object , Int32 , Object )
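
To see that inner exception yourself, one option is to catch the error and walk the exception chain (a sketch, assuming C:\temp\bigfile.bin is an existing file larger than 2 GB):

try {
    Compress-Archive -Path C:\temp\bigfile.bin -DestinationPath C:\temp\bigfile.zip -ErrorAction Stop
}
catch {
    # the cmdlet wraps the IOException shown above; walk the chain to reach it
    $_.Exception.InnerException
}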

We would also see a similar issue if we attempt to read all bytes from a file larger than 2 GB:

Exception calling "ReadAllBytes" with "1" argument(s): "The file is too long.
This operation is currently limited to supporting files less than 2 gigabytes in size."
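
For example, against the same hypothetical C:\temp\bigfile.bin:

# fails for any file of 2 GB or more
[System.IO.File]::ReadAllBytes('C:\temp\bigfile.bin')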

Not coincidentally, System.Array has the same limitation (a MemoryStream is backed by a byte array):

.NET Framework only: By default, the maximum size of an Array is 2 gigabytes (GB).
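
This limit is easy to see from PowerShell; requesting a byte array of Int32.MaxValue elements is rejected by the runtime regardless of how much memory is available:

# the largest allowed byte[] is slightly smaller than [int]::MaxValue elements,
# so this throws an OutOfMemoryException even on machines with plenty of RAM
[byte[]]::new([int]::MaxValue)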


There is also another limitation, pointed out in this question: Compress-Archive can't compress a file while another process holds a handle on it.

How to reproduce?

# cd to a temporary folder, create a working subfolder and
# start a Job which will write to a file
# (note: in PowerShell 7+ the job inherits the caller's working directory)
New-Item .\temp -ItemType Directory -Force | Out-Null
$job = Start-Job {
    0..1000 | ForEach-Object {
        "Iteration ${_}:" + ('A' * 1kb)
        Start-Sleep -Milliseconds 200
    } | Set-Content .\temp\test.txt
}

Start-Sleep -Seconds 1
# attempt to compress
Compress-Archive .\temp\test.txt -DestinationPath test.zip
# Exception:
# The process cannot access the file '..\test.txt' because it is being used by another process.
$job | Stop-Job -PassThru | Remove-Job
Remove-Item .\temp -Recurse

To overcome this issue, and also to emulate Explorer's behavior when compressing files that are in use by another process, the function posted below defaults to [FileShare] 'ReadWrite, Delete' when opening a FileStream.
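
For illustration, this is roughly the kind of stream the function opens; the open can succeed even while another process is still writing to, or deleting, the file (hypothetical path):

# .NET APIs resolve relative paths against the process working directory,
# so build a full path explicitly
$fs = [System.IO.File]::Open(
    "$PWD\inuse.log",
    [System.IO.FileMode]::Open,
    [System.IO.FileAccess]::Read,
    [System.IO.FileShare] 'ReadWrite, Delete')
$fs.Dispose()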


To get around the 2 GB limitation there are two workarounds:

  • The easy workaround is to use the ZipFile.CreateFromDirectory method. There are 3 limitations to this static method:
    1. The source must be a directory; a single file cannot be compressed.
    2. All files in the source folder (recursively) will be compressed; we can't pick or filter which files to compress.
    3. It's not possible to update the entries of an existing zip archive.

Worth noting: to use the ZipFile class in Windows PowerShell (.NET Framework), the System.IO.Compression.FileSystem assembly must be loaded first. See the inline comment.

# Only needed if using Windows PowerShell (.NET Framework):
Add-Type -AssemblyName System.IO.Compression.FileSystem

[IO.Compression.ZipFile]::CreateFromDirectory($sourceDirectory, $destinationArchive)
  • The code-it-yourself workaround is to use a function that does the manual work of creating the ZipArchive and the corresponding ZipEntries:
using namespace System.IO
using namespace System.IO.Compression
using namespace System.Collections.Generic

Add-Type -AssemblyName System.IO.Compression

function Compress-File {
    [CmdletBinding(DefaultParameterSetName="Force")]
    param(
        [parameter(Position = 0, Mandatory, ValueFromPipeline, ValueFromPipelineByPropertyName)]
        [Alias('FullName')]
        [ValidateScript({
            if(Test-Path -LiteralPath $_) {
                return $true
            }

            throw [InvalidOperationException]::new(
                "The path '$_' either does not exist or is not a valid file system path."
            )
        })]
        [string] $Path,

        [parameter(Position = 1, Mandatory)]
        [string] $DestinationPath,

        [parameter()]
        [CompressionLevel] $CompressionLevel = [CompressionLevel]::Optimal,

        [parameter(ParameterSetName="Update")]
        [switch] $Update,

        [parameter(ParameterSetName="Force")]
        [switch] $Force,

        [parameter()]
        [switch] $PassThru
    )

    begin {
        $DestinationPath = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($DestinationPath)
        $Path = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($Path)

        if($Force.IsPresent) {
            $fsMode = [FileMode]::Create
        }
        elseif($Update.IsPresent) {
            $fsMode = [FileMode]::OpenOrCreate
        }
        else {
            $fsMode = [FileMode]::CreateNew
        }

        $ExpectingInput = $null
    }
    process {
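        # open the destination archive only once, when the first input object arrives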
        if(-not $ExpectingInput) {
            try {
                $destfs = [File]::Open($DestinationPath, $fsMode)
                $zip    = [ZipArchive]::new($destfs, [ZipArchiveMode]::Update)
                $ExpectingInput = $true
            }
            catch {
                $zip, $destfs | ForEach-Object Dispose
                $PSCmdlet.ThrowTerminatingError($_)
            }
        }

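        # the Archive attribute is normally set on files but not on directories,
        # so it is used here to tell them apart; $here is the base path used to
        # compute each entry's relative path inside the archive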
        if([File]::GetAttributes($Path) -band [FileAttributes]::Archive) {
            [FileInfo] $Path = $Path
            $here = $Path.Directory.FullName
        }
        else {
            [DirectoryInfo] $Path = $Path
            $here = $Path.Parent.FullName
        }

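        # breadth-first traversal: directories found along the way are queued and expanded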
        $queue = [Queue[FileSystemInfo]]::new()
        $queue.Enqueue($Path)

        while($queue.Count) {
            try {
                $current = $queue.Dequeue()
                if($current -is [DirectoryInfo]) {
                    $current = $current.EnumerateFileSystemInfos()
                }
            }
            catch {
                $PSCmdlet.WriteError($_)
                continue
            }

            foreach($item in $current) {
                try {
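                    # never add the destination archive to itself (it may live under the source tree)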
                    if($item.FullName -eq $DestinationPath) {
                        continue
                    }

                    $relative = $item.FullName.Substring($here.Length + 1)
                    $entry    = $zip.GetEntry($relative)

                    if($item -is [DirectoryInfo]) {
                        $queue.Enqueue($item)
                        if(-not $entry) {
                            $entry = $zip.CreateEntry($relative + '\', $CompressionLevel)
                        }
                        continue
                    }

                    if(-not $entry) {
                        $entry = $zip.CreateEntry($relative, $CompressionLevel)
                    }

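                    # open the source allowing other readers, writers and deleters,
                    # so in-use files can still be compressed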
                    $sourcefs = $item.Open([FileMode]::Open, [FileAccess]::Read, [FileShare] 'ReadWrite, Delete')
                    $entryfs  = $entry.Open()
                    $sourcefs.CopyTo($entryfs)
                }
                catch {
                    $PSCmdlet.WriteError($_)
                }
                finally {
                    $entryfs, $sourcefs | ForEach-Object Dispose
                }
            }
        }

        if($PassThru.IsPresent) { $Path }
    }
    end {
        $zip, $destfs | ForEach-Object Dispose
    }
}

This function should handle the same scenarios as the CreateFromDirectory method, but it also lets us filter a folder for specific files to compress while keeping the file / folder structure untouched.

Examples

  • Compressing multiple files:
Get-ChildItem .\path -Recurse -Include *.ext, *.ext2 |
    Compress-File -DestinationPath dest.zip
  • Compressing multiple directories:
Get-ChildItem .\path -Recurse -Directory |
    Compress-File -DestinationPath dest.zip
  • Replacing an existing Zip Archive:
Compress-File -Path .\myDir -DestinationPath dest.zip -Force
  • Adding and updating new entries to an existing Zip Archive:
Get-ChildItem .\path -Recurse -Directory |
    Compress-File -DestinationPath dest.zip -Update
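  • Compressing a single file bigger than 2 GB (hypothetical file name, to address the original question):
Compress-File -Path .\bigfile.bin -DestinationPath dest.zip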

Performance Measurements

I have decided to add some performance tests comparing this function with Compress-Archive since, in addition to the 2 GB limitation, it's also worth noting that the built-in cmdlet is insanely slow in PowerShell Core. Source code for this test can be found here.
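
The linked code is the actual benchmark; as a rough idea of how such a comparison can be made with Measure-Command (assuming a .\testfiles folder populated with sample data):

$tests = [ordered]@{
    'Compress-File (Optimal)'    = { Compress-File -Path .\testfiles -DestinationPath cf.zip -Force }
    'Compress-Archive (Optimal)' = { Compress-Archive -Path .\testfiles -DestinationPath ca.zip -CompressionLevel Optimal -Force }
}

foreach($run in 1..5) {
    foreach($test in $tests.GetEnumerator()) {
        # remove the archives from the previous run, then time this one
        Remove-Item cf.zip, ca.zip -ErrorAction Ignore
        [pscustomobject]@{
            TestRun           = $run
            Test              = $test.Key
            TotalMilliseconds = [math]::Round((Measure-Command { & $test.Value }).TotalMilliseconds, 2)
        }
    }
}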

Average Results

Test                        Average  RelativeSpeed
----                        -------  -------------
Compress-File (Optimal)     1178.75  1x
Compress-Archive (Optimal)  34179.89 29.00x

Results per Test Run

TestRun Test                        TotalMilliseconds
------- ----                        -----------------
      3 Compress-File (Optimal)              1132.38
      4 Compress-File (Optimal)              1151.72
      2 Compress-File (Optimal)              1156.69
      5 Compress-File (Optimal)              1157.54
      1 Compress-File (Optimal)              1295.44
      2 Compress-Archive (Optimal)           33884.40
      4 Compress-Archive (Optimal)           33907.80
      3 Compress-Archive (Optimal)           33940.75
      5 Compress-Archive (Optimal)           34264.44
      1 Compress-Archive (Optimal)           34902.04

Further updates to this function will be posted to this GitHub repo.
