C# Decompress GZIP

This C# program decompresses a GZIP byte array using GZipStream.

Decompress GZIP. GZIP data is often decompressed before use.

A byte array containing GZIP data can be expanded into a byte array that holds the original, uncompressed bytes. A small wrapper method around GZipStream and MemoryStream is enough to do this.

Example. First, this program receives a byte array that contains GZIP data and transforms it into a byte array that contains the original bytes. In this example, we use a specific GZIP-compressed file on the C:\ drive.

To use this program on your computer, you will need to change the path to point to a GZIP file. The program contains a single Decompress method, which receives a GZIP byte array and returns the uncompressed byte array.

C# program that decompresses GZIP file

using System;
using System.IO;
using System.IO.Compression;

class Program
{
    static void Main()
    {
	// Open a compressed file on disk.
	// ... Then decompress it with the method below.
	// ... Then write the length of each array.
	byte[] file = File.ReadAllBytes("C:\\perlgzips\\~stat.gz");
	byte[] decompressed = Decompress(file);
	Console.WriteLine(file.Length);
	Console.WriteLine(decompressed.Length);
    }

    static byte[] Decompress(byte[] gzip)
    {
	// Create a GZIP stream in decompression mode.
	// ... Then read from it into a buffer and write each chunk to a MemoryStream.
	using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
	{
	    const int size = 4096;
	    byte[] buffer = new byte[size];
	    using (MemoryStream memory = new MemoryStream())
	    {
		int count = 0;
		do
		{
		    count = stream.Read(buffer, 0, size);
		    if (count > 0)
		    {
			memory.Write(buffer, 0, count);
		    }
		}
		while (count > 0);
		return memory.ToArray();
	    }
	}
    }
}

Output
    (Please change the filename in the program to a GZIP file.)

9106
36339

In Decompress, the GZipStream object is first instantiated. The backing store is a MemoryStream wrapped around the GZIP buffer. The second argument to the GZipStream is the CompressionMode.Decompress enumerated constant.

Next, a byte buffer array is allocated with 4096 elements. The exact size is not critical: 4096 is a conventional power-of-two buffer size, and a larger or smaller buffer mainly changes how many Read calls the loop makes.

Then: The loop reads decompressed bytes from the GZipStream into the buffer and writes each chunk to the MemoryStream. Read returns 0 at the end of the stream, which ends the loop.
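
Alternative: On .NET Framework 4 and later, the manual read and write loop can be delegated to Stream.CopyTo. The following is a minimal sketch under that assumption, not a replacement for the program above. The class name CopyToExample and the Compress helper are hypothetical and exist only to create test data in memory, so no file on disk is needed.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class CopyToExample
{
    // Helper (assumption, not in the article): makes GZIP test data from raw bytes.
    public static byte[] Compress(byte[] raw)
    {
        using (MemoryStream memory = new MemoryStream())
        {
            // Dispose the GZipStream before ToArray so the GZIP footer is flushed.
            using (GZipStream gzip = new GZipStream(memory, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return memory.ToArray();
        }
    }

    // Same contract as Decompress above; CopyTo performs the buffered loop internally.
    public static byte[] Decompress(byte[] gzip)
    {
        using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
        using (MemoryStream memory = new MemoryStream())
        {
            stream.CopyTo(memory);
            return memory.ToArray();
        }
    }

    static void Main()
    {
        // Round trip: compress 1000 bytes, then expand them again.
        byte[] original = Encoding.UTF8.GetBytes(new string('a', 1000));
        byte[] restored = Decompress(Compress(original));
        Console.WriteLine(original.Length);
        Console.WriteLine(restored.Length);
    }
}
```

Both lines print 1000, which shows that the round trip restores every byte. The explicit loop in the original method remains useful when you want to control the buffer size yourself.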

Use. This code takes a GZIP byte array and expands it into another byte array. Because GZIP compression is widely used for websites, you can store web pages as byte arrays in compressed form and decompress them only when required.

Tip: Because the GZIP version is more compact, this form can be used to store the pages on the disk.
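
Sketch: Storing a page on disk in GZIP form might look like the following. This is an illustration under stated assumptions, not code from the programs in this article: the class PageStoreExample, the methods SavePage and LoadPage, and the temporary file name are all hypothetical, and the text is assumed to be UTF-8.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class PageStoreExample
{
    // Writes the page text to disk as a GZIP file (UTF-8 is an assumption).
    public static void SavePage(string path, string html)
    {
        byte[] raw = Encoding.UTF8.GetBytes(html);
        using (FileStream file = File.Create(path))
        using (GZipStream gzip = new GZipStream(file, CompressionMode.Compress))
        {
            gzip.Write(raw, 0, raw.Length);
        }
    }

    // Opens the GZIP file and returns the expanded text.
    public static string LoadPage(string path)
    {
        using (FileStream file = File.OpenRead(path))
        using (GZipStream gzip = new GZipStream(file, CompressionMode.Decompress))
        using (StreamReader reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Hypothetical file name in the temp directory.
        string path = Path.Combine(Path.GetTempPath(), "page-example.html.gz");
        string html = "<!DOCTYPE html><html><body>" + new string('x', 2000) + "</body></html>";

        SavePage(path, html);
        string restored = LoadPage(path);

        // The file on disk is smaller than the text, and the text survives the round trip.
        Console.WriteLine(new FileInfo(path).Length < html.Length);
        Console.WriteLine(restored == html);
        File.Delete(path);
    }
}
```

Here the GZipStream wraps a FileStream directly, so the compressed bytes never need to be held in a separate array. Highly repetitive text like this sample compresses far better than a typical page would.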

Decompress web page. This C# console program decompresses web pages in GZIP format. It uses types from the System.IO, System.IO.Compression and System.Net namespaces. When you pass it a URL on the command line, it requests the page in GZIP form and downloads it as a byte array.

Next: It passes that byte array to the Decompress method. Finally, it converts the decompressed byte array to a string.

C# program that decompresses web pages

using System;
using System.IO;
using System.IO.Compression;
using System.Net;

class Program
{
    static byte[] Decompress(byte[] gzip)
    {
	using (GZipStream stream = new GZipStream(new MemoryStream(gzip),
						  CompressionMode.Decompress))
	{
	    const int size = 4096;
	    byte[] buffer = new byte[size];
	    using (MemoryStream memory = new MemoryStream())
	    {
		int count = 0;
		do
		{
		    count = stream.Read(buffer, 0, size);
		    if (count > 0)
		    {
			memory.Write(buffer, 0, count);
		    }
		}
		while (count > 0);
		return memory.ToArray();
	    }
	}
    }

    static void Main(string[] args)
    {
	try
	{
	    Console.WriteLine("*** Decompress web page ***");
	    Console.WriteLine("    Specify file to download");
	    Console.WriteLine("Downloading: {0}", args[0]);

	    // Download url.
	    using (WebClient client = new WebClient())
	    {
		client.Headers[HttpRequestHeader.AcceptEncoding] = "gzip";
		byte[] data = client.DownloadData(args[0]);
		byte[] decompress = Decompress(data);
		// Decode as UTF-8; ASCII decoding would turn non-ASCII bytes into '?'.
		string text = System.Text.Encoding.UTF8.GetString(decompress);

		Console.WriteLine("Size from network: {0}", data.Length);
		Console.WriteLine("Size decompressed: {0}", decompress.Length);
		Console.WriteLine("First chars:       {0}", text.Substring(0, 5));
	    }
	}
	finally
	{
	    Console.WriteLine("[Done]");
	    Console.ReadLine();
	}
    }
}

Output
    [Argument = http://en.wikipedia.org/]

*** Decompress web page ***
    Specify file to download
Downloading: http://en.wikipedia.org/
Size from network: 15228
Size decompressed: 56362
First chars:       <!DOC
[Done]

The compressed page from the example required 15,228 bytes. The expanded form required 56,362 bytes, about 3.7 times larger. Requesting the GZIP form with WebClient reduces the amount of data sent over the network, which usually improves overall performance as well.

Tip: Expanding a page in memory is typically much faster than downloading an additional 41,000 bytes.

Note: This console program demonstrates how you can download a GZIP page and expand it in memory.

But: It does not contain adequate error handling, so it will fail if the server returns a response that is not GZIP-compressed.
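
Sketch: One way to tolerate servers that ignore the Accept-Encoding header is to check for the GZIP signature before decompressing. GZIP data begins with the two bytes 0x1F and 0x8B. The class GzipCheckExample and the methods LooksLikeGzip and DecompressIfNeeded below are hypothetical names for this illustration and are not part of the program above.

```csharp
using System;
using System.IO;
using System.IO.Compression;

class GzipCheckExample
{
    // True if the array starts with the GZIP signature bytes 0x1F 0x8B.
    public static bool LooksLikeGzip(byte[] data)
    {
        return data != null && data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
    }

    // Expands GZIP input; returns any other input unchanged.
    public static byte[] DecompressIfNeeded(byte[] data)
    {
        if (!LooksLikeGzip(data))
        {
            return data;
        }
        using (GZipStream stream = new GZipStream(new MemoryStream(data), CompressionMode.Decompress))
        using (MemoryStream memory = new MemoryStream())
        {
            stream.CopyTo(memory);
            return memory.ToArray();
        }
    }

    static void Main()
    {
        // The ASCII bytes for "<!DOC": an uncompressed response.
        byte[] plain = { 60, 33, 68, 79, 67 };
        Console.WriteLine(LooksLikeGzip(plain));
        Console.WriteLine(DecompressIfNeeded(plain).Length);
    }
}
```

This prints False and then 5, because the plain bytes are passed through untouched. The check only inspects the signature, so truncated or corrupt GZIP data can still cause GZipStream to throw, and a complete program would also need to handle a missing command-line argument and network errors.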

Summary. We decompressed an array of GZIP bytes into an array of the original bytes. The C# method shown receives a GZIP byte array and returns the original byte array. It converts between the two arrays by reading from a GZipStream and writing to a MemoryStream.

And: We noted that website pages can be kept as compressed byte arrays to improve efficiency. This code is useful there.
