To explain the limitation noted in the PowerShell Docs for `Compress-Archive`:

> The `Compress-Archive` cmdlet uses the Microsoft .NET API `System.IO.Compression.ZipArchive` to compress files. The maximum file size is 2 GB because there's a limitation of the underlying API.
This happens because the cmdlet uses a `MemoryStream` to hold the bytes in memory and then writes them to the file. Inspecting the `InnerException` produced by the cmdlet, we can see:

```
System.IO.IOException: Stream was too long.
   at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   at CallSite.Target(Closure , CallSite , Object , Object , Int32 , Object )
```
We would also see a similar error if we attempted to read all bytes from a file larger than 2 GB:

```
Exception calling "ReadAllBytes" with "1" argument(s): "The file is too long.
This operation is currently limited to supporting files less than 2 gigabytes in size."
```
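As a side note, the `ReadAllBytes` cap only applies when loading the whole file into a single array; reading the file through a `FileStream` in chunks has no such limit. A minimal sketch (the file name is hypothetical, and a small stand-in file is used instead of a 2 GB one):

```powershell
# Sketch: stream a file in chunks instead of [IO.File]::ReadAllBytes,
# which is capped at 2 GB.
$path = Join-Path ([IO.Path]::GetTempPath()) 'chunkdemo.bin'
[IO.File]::WriteAllBytes($path, [byte[]]::new(1mb))   # small stand-in file
$fs = [IO.File]::OpenRead($path)
try {
    $buffer = [byte[]]::new(128kb)
    $total  = 0
    while (($read = $fs.Read($buffer, 0, $buffer.Length)) -gt 0) {
        # process $buffer[0..($read - 1)] here instead of holding the whole file
        $total += $read
    }
    $total   # 1048576
}
finally {
    $fs.Dispose()
}
Remove-Item $path
```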
Coincidentally, we see the same limitation with `System.Array`:

> .NET Framework only: By default, the maximum size of an Array is 2 gigabytes (GB).
There is also another limitation pointed out in this question: `Compress-Archive` can't compress a file if another process has a handle on it.
How to reproduce?
```powershell
# cd to a temporary folder and
# start a Job which will write to a file
$job = Start-Job {
    0..1000 | ForEach-Object {
        "Iteration ${_}:" + ('A' * 1kb)
        Start-Sleep -Milliseconds 200
    } | Set-Content .\temp\test.txt
}
Start-Sleep -Seconds 1

# attempt to compress
Compress-Archive .\temp\test.txt -DestinationPath test.zip
# Exception:
# The process cannot access the file '..\test.txt' because it is being used by another process.

$job | Stop-Job -PassThru | Remove-Job
Remove-Item .\temp -Recurse
To overcome this issue, and also to emulate Explorer's behavior when compressing files that are in use by another process, the function posted below defaults to `[FileShare] 'ReadWrite, Delete'` when opening a `FileStream`.
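To see why that share mode matters, here is a minimal sketch (the file name is hypothetical): a writer keeps an open handle on a file, yet a reader that shares `'ReadWrite, Delete'` can still open it, because its share set permits the writer's existing `Write` access.

```powershell
# A writer holds the file open with Write access, granting Read sharing
$path   = Join-Path ([IO.Path]::GetTempPath()) 'sharedemo.txt'
$writer = [IO.File]::Open($path, [IO.FileMode]::Create, [IO.FileAccess]::Write, [IO.FileShare]::Read)
try {
    $bytes = [Text.Encoding]::UTF8.GetBytes('hello')
    $writer.Write($bytes, 0, $bytes.Length)
    $writer.Flush()

    # Opening with a share that permits the writer's existing Write access
    # succeeds; a plain [FileShare]::Read here would fail with a sharing violation
    $reader = [IO.File]::Open($path, [IO.FileMode]::Open, [IO.FileAccess]::Read, [IO.FileShare] 'ReadWrite, Delete')
    try {
        $buffer = [byte[]]::new(5)
        $null   = $reader.Read($buffer, 0, 5)
        $text   = [Text.Encoding]::UTF8.GetString($buffer)
        $text   # hello
    }
    finally { $reader.Dispose() }
}
finally {
    $writer.Dispose()
    Remove-Item $path
}
```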
To get around these issues, there are two workarounds:
- The easy workaround is to use the `ZipFile.CreateFromDirectory` method. There are 3 limitations when using this static method:
    - The source must be a directory; a single file cannot be compressed.
    - All files in the source folder (recursively) will be compressed; we can't pick or filter the files to compress.
    - It's not possible to update the entries of an existing Zip Archive.

Worth noting, if you need to use the `ZipFile` class in Windows PowerShell (.NET Framework), there must be a reference to `System.IO.Compression.FileSystem`. See the inline comments.
```powershell
# Only needed if using Windows PowerShell (.NET Framework):
Add-Type -AssemblyName System.IO.Compression.FileSystem
[IO.Compression.ZipFile]::CreateFromDirectory($sourceDirectory, $destinationArchive)
```
- The code-it-yourself workaround is to use a function that does the manual process of creating the `ZipArchive` and the corresponding `ZipEntry` instances:
```powershell
using namespace System.IO
using namespace System.IO.Compression
using namespace System.Collections.Generic

# Only needed if using Windows PowerShell (.NET Framework):
Add-Type -AssemblyName System.IO.Compression

function Compress-File {
    [CmdletBinding(DefaultParameterSetName = 'Force')]
    param(
        [Parameter(Position = 0, Mandatory, ValueFromPipeline, ValueFromPipelineByPropertyName)]
        [Alias('FullName')]
        [ValidateScript({
            if (Test-Path -LiteralPath $_) {
                return $true
            }
            throw [InvalidOperationException]::new(
                "The path '$_' either does not exist or is not a valid file system path."
            )
        })]
        [string] $Path,

        [Parameter(Position = 1, Mandatory)]
        [string] $DestinationPath,

        [Parameter()]
        [CompressionLevel] $CompressionLevel = [CompressionLevel]::Optimal,

        [Parameter(ParameterSetName = 'Update')]
        [switch] $Update,

        [Parameter(ParameterSetName = 'Force')]
        [switch] $Force,

        [Parameter()]
        [switch] $PassThru
    )

    begin {
        $DestinationPath = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($DestinationPath)

        if ($Force.IsPresent) {
            $fsMode = [FileMode]::Create
        }
        elseif ($Update.IsPresent) {
            $fsMode = [FileMode]::OpenOrCreate
        }
        else {
            $fsMode = [FileMode]::CreateNew
        }

        $ExpectingInput = $null
    }

    process {
        # The destination stream and ZipArchive are created only once,
        # when the first input path is processed
        if (-not $ExpectingInput) {
            try {
                $destfs = [File]::Open($DestinationPath, $fsMode)
                $zip    = [ZipArchive]::new($destfs, [ZipArchiveMode]::Update)
                $ExpectingInput = $true
            }
            catch {
                $zip, $destfs | ForEach-Object Dispose
                $PSCmdlet.ThrowTerminatingError($_)
            }
        }

        # Resolve here, not in `begin`: with pipeline input
        # a new path is bound on each `process` invocation
        $Path = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($Path)

        if ([File]::GetAttributes($Path) -band [FileAttributes]::Archive) {
            [FileInfo] $Path = $Path
            $here = $Path.Directory.FullName
        }
        else {
            [DirectoryInfo] $Path = $Path
            $here = $Path.Parent.FullName
        }

        $queue = [Queue[FileSystemInfo]]::new()
        $queue.Enqueue($Path)

        while ($queue.Count) {
            try {
                $current = $queue.Dequeue()
                if ($current -is [DirectoryInfo]) {
                    $current = $current.EnumerateFileSystemInfos()
                }
            }
            catch {
                $PSCmdlet.WriteError($_)
                continue
            }

            foreach ($item in $current) {
                try {
                    # Skip the destination archive itself
                    if ($item.FullName -eq $DestinationPath) {
                        continue
                    }

                    $relative = $item.FullName.Substring($here.Length + 1)
                    $entry    = $zip.GetEntry($relative)

                    if ($item -is [DirectoryInfo]) {
                        $queue.Enqueue($item)
                        if (-not $entry) {
                            $entry = $zip.CreateEntry($relative + '\', $CompressionLevel)
                        }
                        continue
                    }

                    if (-not $entry) {
                        $entry = $zip.CreateEntry($relative, $CompressionLevel)
                    }

                    # 'ReadWrite, Delete' sharing allows reading files that are
                    # held open by another process
                    $sourcefs = $item.Open([FileMode]::Open, [FileAccess]::Read, [FileShare] 'ReadWrite, Delete')
                    $entryfs  = $entry.Open()
                    $sourcefs.CopyTo($entryfs)
                }
                catch {
                    $PSCmdlet.WriteError($_)
                }
                finally {
                    $entryfs, $sourcefs | ForEach-Object Dispose
                }
            }
        }

        if ($PassThru.IsPresent) {
            $Path
        }
    }

    end {
        $zip, $destfs | ForEach-Object Dispose
    }
}
```
This function can handle the same input as the `CreateFromDirectory` method, but also allows us to filter a folder for specific files to compress, while keeping the file / folder structure untouched.
Examples
- Compressing multiple files:

```powershell
Get-ChildItem .\path -Recurse -Include *.ext, *.ext2 |
    Compress-File -DestinationPath dest.zip
```

- Compressing multiple directories:

```powershell
Get-ChildItem .\path -Recurse -Directory |
    Compress-File -DestinationPath dest.zip
```

- Replacing an existing Zip Archive:

```powershell
Compress-File -Path .\myDir -DestinationPath dest.zip -Force
```

- Adding and updating new entries to an existing Zip Archive:

```powershell
Get-ChildItem .\path -Recurse -Directory |
    Compress-File -DestinationPath dest.zip -Update
```
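To sanity-check an archive produced by either workaround, the same `ZipFile` class can read it back. A self-contained sketch (the paths are hypothetical, and a throwaway source folder is created just for the demonstration):

```powershell
# Create a small archive, then list its entries with ZipFile.OpenRead
$src  = Join-Path ([IO.Path]::GetTempPath()) 'zipdemo'
$dest = Join-Path ([IO.Path]::GetTempPath()) 'zipdemo.zip'
$null = New-Item $src -ItemType Directory -Force
Set-Content -LiteralPath (Join-Path $src 'a.txt') -Value 'hello'
Remove-Item -LiteralPath $dest -ErrorAction Ignore

# Only needed if using Windows PowerShell (.NET Framework):
Add-Type -AssemblyName System.IO.Compression.FileSystem
[IO.Compression.ZipFile]::CreateFromDirectory($src, $dest)

$zip = [IO.Compression.ZipFile]::OpenRead($dest)
try {
    $names = $zip.Entries.FullName
    $names   # a.txt
}
finally { $zip.Dispose() }
Remove-Item -LiteralPath $src -Recurse
Remove-Item -LiteralPath $dest
```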
Performance Measurements
I have decided to add some performance tests comparing this function with `Compress-Archive` because, in addition to the 2 GB limitation, it's also worth noting that the built-in cmdlet is dramatically slower in PowerShell Core. Source code for this test can be found here.
Average Results
```
Test                        Average RelativeSpeed
----                        ------- -------------
Compress-File (Optimal)     1178.75 1x
Compress-Archive (Optimal) 34179.89 29.00x
```
Results per Test Run
```
TestRun Test                       TotalMilliseconds
------- ----                       -----------------
      3 Compress-File (Optimal)              1132.38
      4 Compress-File (Optimal)              1151.72
      2 Compress-File (Optimal)              1156.69
      5 Compress-File (Optimal)              1157.54
      1 Compress-File (Optimal)              1295.44
      2 Compress-Archive (Optimal)          33884.40
      4 Compress-Archive (Optimal)          33907.80
      3 Compress-Archive (Optimal)          33940.75
      5 Compress-Archive (Optimal)          34264.44
      1 Compress-Archive (Optimal)          34902.04
```
Further updates to this function will be posted to this GitHub repo.