TheDeveloperBlog.com

7-Zip DEFLATE Compression Ratios

This article shows the results of DEFLATE compression with 7-Zip at different settings. 262 files were tested. The smallest archive came from 258 fast bytes and 13 passes.

DEFLATE is used to compress GZIP files.

This algorithm is implemented efficiently in 7-Zip. We can turn the "fast bytes" and "number of passes" knobs to tune how 7-Zip applies DEFLATE.

128 fast bytes

10 passes: 915282 bytes [biggest]
11 passes: 915020 bytes
12 passes: 914958 bytes
13 passes: 914898 bytes
14 passes: 914938 bytes
15 passes: 914899 bytes

258 fast bytes

10 passes: 915277 bytes
11 passes: 915017 bytes
12 passes: 914953 bytes
13 passes: 914897 bytes [smallest]
14 passes: 914933 bytes
15 passes: 914898 bytes [second smallest]
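The two tables can be checked with a few lines of Python. This is only a sketch: the dictionary restates the byte counts listed above, and the variable names are my own.

```python
# Archive sizes in bytes, copied from the two tables above.
# Keys are (fast_bytes, passes); the names are illustrative only.
sizes = {
    (128, 10): 915282, (128, 11): 915020, (128, 12): 914958,
    (128, 13): 914898, (128, 14): 914938, (128, 15): 914899,
    (258, 10): 915277, (258, 11): 915017, (258, 12): 914953,
    (258, 13): 914897, (258, 14): 914933, (258, 15): 914898,
}

# Setting that produced the smallest archive.
best = min(sizes, key=sizes.get)
print("smallest:", best, sizes[best])

# Bytes saved by 258 fast bytes over 128, for each pass count.
gain_from_258 = {p: sizes[(128, p)] - sizes[(258, p)] for p in range(10, 16)}
print(gain_from_258)
```

Every difference is positive but tiny (between 1 and 5 bytes), so the fast bytes knob matters far less here than the number of passes.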

Intro. For GZIP, ZIP, and DEFLATE files, 7-Zip doesn't offer as many options as it does for the 7z format. However, it lets you adjust the maximum fast bytes and the number of passes. For simple tasks, you can use the -mx options on the command line.

Here: The starting point is 7-Zip's ultra compression for GZIP. It already captures the majority of the gains.

7za.exe a -tgzip archive.gz input -mx=9

7za.exe:     the 7-Zip executable
a:           the add command, which creates or updates an archive
-tgzip:      specifies the GZIP archive type, which uses DEFLATE
archive.gz:  the target file
             will be created or overwritten
input:       the input file to be compressed
-mx=9:       specifies ultra compression

Options. As the 7-Zip manual states, the two DEFLATE options are "mpass" for the number of passes and "mfb" for fast bytes. We replace the -mx=9 switch with combinations of these two switches.

Info: I tested the 7-Zip 4.60 beta version on Windows Vista. The files tested were small HTML files.

As you increase the number of passes from 10 to 13, the archive gets smaller. Beyond 13 passes the size stops improving: 14 and 15 passes were both slightly larger than 13. And specifying more fast bytes, 258 instead of 128, never produced a larger archive. These commands were run.

Commands used

7za.exe a -tgzip file2 file1 -mpass=10 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=10 -mfb=258

7za.exe a -tgzip file2 file1 -mpass=11 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=11 -mfb=258

7za.exe a -tgzip file2 file1 -mpass=12 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=12 -mfb=258

7za.exe a -tgzip file2 file1 -mpass=13 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=13 -mfb=258

7za.exe a -tgzip file2 file1 -mpass=14 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=14 -mfb=258

7za.exe a -tgzip file2 file1 -mpass=15 -mfb=128
7za.exe a -tgzip file2 file1 -mpass=15 -mfb=258
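Typing twelve commands by hand is error-prone, so here is a rough Python sketch that generates the same grid. The function name build_commands and its parameters are my own invention. The "a" (add) command is included because 7-Zip's command line expects a command before the switches. The sketch only builds the argument lists and does not run 7-Zip.

```python
from itertools import product

def build_commands(archive="file2", source="file1",
                   passes=range(10, 16), fast_bytes=(128, 258)):
    """Return one 7za argument list per (passes, fast bytes) combination."""
    commands = []
    for p, fb in product(passes, fast_bytes):
        # "a" is 7-Zip's add-to-archive command.
        commands.append(["7za.exe", "a", "-tgzip", archive, source,
                         f"-mpass={p}", f"-mfb={fb}"])
    return commands

for cmd in build_commands():
    print(" ".join(cmd))
```

To actually run them, each list could be passed to subprocess.run. Deleting or renaming the archive between runs keeps the results separate.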

Discussion. Adding passes and fast bytes to the already excellent compression ratio of ultra mode in 7-Zip resulted in a file size decrease of 0.042%. In other words, it saved 385 bytes in a 915282 byte archive (915282 minus 914897).
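The figures can be rechecked from the two sizes in the tables. This small sketch takes the 915282-byte archive as the baseline and the 914897-byte archive as the best result. The variable names are mine.

```python
baseline = 915282   # largest result in the tables (10 passes, 128 fast bytes)
smallest = 914897   # smallest result in the tables (13 passes, 258 fast bytes)

saved = baseline - smallest
percent = 100 * saved / baseline
print(saved, f"{percent:.3f}%")   # prints: 385 0.042%
```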

Certainly, this isn't impressive, but when dealing with compression, understanding the knobs is important. In this case, going beyond ultra mode in 7-Zip wasn't useful. It is mainly a waste of time.

Note: Most GZIP implementations, including those in the .NET Framework, produce output that is commonly about 10% larger than 7-Zip's.
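If you want to measure such a baseline yourself, one option is Python's built-in gzip module, which is a different implementation from 7-Zip's. The sketch below is only a way to get comparison numbers. The helper name gzip_sizes and the sample data are made up for illustration, and I make no claim about how its output compares to 7-Zip on your files.

```python
import gzip

def gzip_sizes(data, levels=(1, 6, 9)):
    """Return {level: compressed size in bytes} using Python's gzip module."""
    # mtime=0 keeps the GZIP header identical between runs.
    return {lvl: len(gzip.compress(data, compresslevel=lvl, mtime=0))
            for lvl in levels}

# Stand-in for a small HTML file; replace with open(path, "rb").read().
sample = b"<html><body><p>Example paragraph.</p></body></html>\n" * 200
print(len(sample), gzip_sizes(sample))
```

Run it on the same input you give to 7-Zip and compare the byte counts directly.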

If you create archives frequently, don't go beyond the -mx=9 switch. Also, if there are more important improvements to make, pursue those first. However, if your data is going to be compressed once and left, consider aggressive options.

Adding more passes: I found 7-Zip accepts many more passes. I even bumped it up to 100. But there was no improvement past 15 passes.

File names. You don't always need the original file name. In that case, rename the original files to a single-character file name before you archive them. The GZIP format can store the original name in the archive header, so a shorter name saves several bytes.
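The effect of the stored name is easy to see in isolation. This sketch uses Python's gzip module rather than 7-Zip, so it only demonstrates the GZIP header behavior. The helper gzip_size_with_name and both file names are hypothetical.

```python
import gzip
import io

def gzip_size_with_name(data, stored_name):
    """Compress data in memory and return the archive size in bytes."""
    buf = io.BytesIO()
    # filename= is what Python writes into the GZIP header's name field.
    with gzip.GzipFile(filename=stored_name, mode="wb",
                       fileobj=buf, mtime=0) as gz:
        gz.write(data)
    return len(buf.getvalue())

data = b"<html><body>Example</body></html>\n" * 50
long_name, short_name = "compression-ratios.html", "a"
saved = (gzip_size_with_name(data, long_name)
         - gzip_size_with_name(data, short_name))
print(saved)
```

The compressed data is the same in both cases, so the saving equals the difference in name length.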

Summary. Here we saw that with 7-Zip, there are no substantial gains in DEFLATE when you go beyond the top preset option of ultra. However, knowing this is useful to some extent, and the knowledge can save time and frustration.

Note: My experiments here shaved 0.042% off my archive's final size, which is better than nothing, but not dramatic.

