TheDeveloperBlog.com

PPMd Compression Sizes in 7-Zip

This article measures how well the PPMd (Prediction by Partial Matching) algorithm in 7-Zip compresses a text file.

PPMd is a lossless compression algorithm. It is particularly effective on files containing natural-language text. The open-source 7-Zip compression utility offers several compression methods, and PPMd is one of them.

Original file:

URL: http://shakespeare.mit.edu/macbeth/full
Compressed gzip on server: 55186 bytes
Decompressed html on disk: 195747 bytes

PPMd Test:

Format:            7z
Compression level: Ultra
Dictionary size:   192 MB
Word size:         32
Solid block size:  'solid'
SIZE:              37211 bytes

GZIP Test:

Format:            GZip
Compression level: Ultra
Dictionary size:   32 KB
Word size:         258 bytes
SIZE:              52530 bytes

LZMA Test:

Format:            7z
Compression level: Ultra
Dictionary size:   64 MB
Word size:         273
Solid block size:  'solid'
SIZE:              47921 bytes

BZIP2 Test:

Format:            7z
Compression level: Ultra
Dictionary size:   900 KB
Solid block size:  'solid'
SIZE:              39892 bytes

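Intro. PPM stands for Prediction by Partial Matching, and PPMd is the variant of it implemented in 7-Zip. The algorithm uses the symbols it has most recently encountered in the stream as context to predict the symbol that comes next, so that well-predicted symbols can be encoded in fewer bits.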

Also: Wikipedia describes this approach as an adaptive statistical data compression technique in its article "Prediction by partial matching".
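Sketch: The "partial matching" idea can be illustrated with a toy predictor in Python. This is only a simplified illustration, not the PPMd implementation in 7-Zip: a real PPM coder also assigns escape probabilities and feeds its predictions to an arithmetic coder. The function names and the maximum order of 2 are assumptions chosen for the example.

```python
from collections import Counter, defaultdict

def train(text, max_order=2):
    """Count which symbol follows each context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i, symbol in enumerate(text):
        for order in range(max_order + 1):
            if i >= order:
                context = text[i - order:i]
                counts[context][symbol] += 1
    return counts

def predict(counts, history, max_order=2):
    """Try the longest context first and fall back to shorter ones.

    Returns (predicted_symbol, order_used), or (None, -1) if nothing was learned.
    """
    for order in range(min(max_order, len(history)), -1, -1):
        context = history[len(history) - order:]
        if context in counts:
            symbol, _ = counts[context].most_common(1)[0]
            return symbol, order
    return None, -1

counts = train("abracadabra")
print(predict(counts, "ab"))  # the full order-2 context "ab" was seen
print(predict(counts, "xa"))  # "xa" was never seen, so only "a" matches
print(predict(counts, "zz"))  # nothing matches, so order 0 is used
```

When the longest context has never been seen, the predictor falls back to a shorter one. That fallback is the partial match that gives the technique its name.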

7-Zip. It is simple to use PPMd in recent versions of the 7-Zip open-source compression utility for Windows. To compress a file with the PPMd algorithm, right-click the file in Windows Explorer and select 7-Zip > Add to archive.

Then: When the archive format is specified as 7z, you can select "PPMd" for the compression method.
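Sketch: The same choice can probably be scripted through the 7-Zip command-line tool. The Python example below only builds the argument list. It assumes the executable is named 7z and is on the PATH, that the dialog's "Dictionary size" and "Word size" correspond to the PPMd mem and o (model order) parameters, and that the file names are placeholders; none of this comes from the test above.

```python
import shutil
import subprocess

def build_ppmd_command(archive, source, memory="192m", order=32):
    """Return a 7z argument list for a solid, Ultra-level PPMd archive."""
    return [
        "7z", "a",                           # add files to an archive
        "-t7z",                              # archive format: 7z
        f"-m0=PPMd:mem={memory}:o={order}",  # method PPMd, model memory and order
        "-mx=9",                             # compression level: Ultra
        "-ms=on",                            # solid archive
        archive,
        source,
    ]

def compress_with_ppmd(archive, source):
    """Run 7z if it is on PATH; return False when the tool is missing."""
    if shutil.which("7z") is None:
        return False
    subprocess.run(build_ppmd_command(archive, source), check=True)
    return True

# Hypothetical file names, printed rather than executed.
print(" ".join(build_ppmd_command("macbeth.7z", "macbeth.html")))
```

Building the list separately from running it makes it easy to inspect the command before any archive is written.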

Results. The document tested was the Macbeth play hosted on the MIT campus network. This document is fairly large (around 200 KB) when uncompressed. On the MIT servers, the document is gzipped and is about 55000 bytes.

When the document was compressed with PPMd, the size was reduced to about 37000 bytes, which was a better ratio than GZIP, BZIP2, and LZMA yielded. The PPMd archive was about 32.57% smaller than the gzipped copy served by MIT (55186 bytes), and about 29% smaller than the GZip archive that 7-Zip produced locally (52530 bytes).
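Sketch: The percentage can be checked with a few lines of Python. The byte counts are the ones listed in the tests above; the helper name percent_saved is an assumption made for the example.

```python
def percent_saved(baseline_bytes, new_bytes):
    """Percentage by which new_bytes is smaller than baseline_bytes."""
    return (baseline_bytes - new_bytes) / baseline_bytes * 100

PPMD_BYTES = 37211

# Sizes reported in the tests above.
baselines = {
    "gzip on server": 55186,
    "GZip in 7-Zip": 52530,
    "LZMA": 47921,
    "BZIP2": 39892,
}

for name, size in baselines.items():
    print(f"PPMd vs {name}: {percent_saved(size, PPMD_BYTES):.2f}% smaller")
```

The first line of output corresponds to the comparison against the copy served by MIT.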

Uses. The PPMd algorithm yields outstanding compression ratios for text files, which carry much of the world's important information. If it were widely supported by web servers and browsers, it could noticeably reduce the bandwidth that many websites use to serve text.

However: Such widespread support is not imminent. For now, PPMd is best suited to compressing private text files or database dumps containing important information.
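Sketch: Before choosing a method for your own text files or dumps, it can help to measure a few codecs on a sample. The Python example below uses only standard-library codecs (zlib for the DEFLATE method behind gzip, bz2, and lzma). PPMd is not part of the standard library, so it is left out here. The sample text is a placeholder and the sizes it produces are not the figures from this article.

```python
import bz2
import lzma
import zlib

def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of data under several standard-library codecs."""
    return {
        "original": len(data),
        "zlib": len(zlib.compress(data, 9)),          # DEFLATE, highest level
        "bz2": len(bz2.compress(data, 9)),            # bzip2, 900 KB blocks
        "lzma": len(lzma.compress(data, preset=9)),   # LZMA in an .xz container
    }

# Placeholder input; replace with the bytes of a real file, for example
# open("dump.sql", "rb").read() for a hypothetical database dump.
sample = b"Tomorrow, and tomorrow, and tomorrow, creeps in this petty pace. " * 200

for name, size in compressed_sizes(sample).items():
    print(f"{name:>8}: {size} bytes")
```

Highly repetitive sample text like this compresses far better than real prose, so the numbers are only useful for comparing codecs against each other on the same input.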

 

Summary. The PPMd compression algorithm implemented in 7-Zip yields excellent results on text files. Ideas from PPM also informed the PAQ family of compressors, which rank among the strongest lossless compressors in terms of compression ratio.

Note: Compressors from the PAQ family have won the Hutter Prize, a text-compression contest that promotes artificial intelligence research.

And: The results show how PPMd can be useful for reducing the compressed size of text files.

