Gzip, HTTP compression and the holy details

As far as I can remember, I’ve used Gzip compression on every website we build; it’s sort of a no-brainer. You build a website, you use Gzip, you be happy.

But a few days back, I decided to learn more about Gzip than just the fact that it is used for HTTP compression, so I went digging.

Gzip

The G in Gzip stands for GNU, which is a collaborative free-software project. Gzip is simply a file format that uses the DEFLATE algorithm (LZ77 + Huffman coding) to perform compression. Fun fact: DEFLATE is the same algorithm that is used inside PNG files.
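To see the format in action, here is a minimal sketch using Python's standard gzip module. The sample payload is made up for illustration; the point is only that the data round-trips and that the stream starts with the Gzip magic bytes followed by the method byte for DEFLATE.

```python
import gzip

# Hypothetical sample payload; any bytes would do.
original = b"You build a website, you use Gzip, you be happy. " * 20

compressed = gzip.compress(original)
restored = gzip.decompress(compressed)

# Every gzip stream starts with the magic bytes 0x1f 0x8b,
# followed by the compression method (8 means DEFLATE).
magic = compressed[:2]
method = compressed[2]

print(len(original), "->", len(compressed), "bytes")
print(magic.hex(), method)
```

Repetitive text like this shrinks a lot, which is exactly why it makes a friendly demo and a poor benchmark.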

Gzip normally compresses only a single file. Compressed archives are created by first bundling the individual files into a TAR (Tape Archive) file and then compressing that file with Gzip; the result is a .tar.gz or .tgz file, also known as a tarball.
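The bundle-then-compress idea can be sketched with Python's standard tarfile module. The file names and contents below are hypothetical, and everything happens in memory so nothing is written to disk.

```python
import io
import tarfile

# Hypothetical file names and contents, used only for illustration.
files = {
    "index.html": b"<h1>Hello</h1>",
    "style.css": b"h1 { color: teal; }",
}

buffer = io.BytesIO()
# Mode "w:gz" writes a TAR archive and gzips it in one step.
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    for name, content in files.items():
        info = tarfile.TarInfo(name=name)
        info.size = len(content)
        archive.addfile(info, io.BytesIO(content))

tarball = buffer.getvalue()

# Reading it back: gunzip first, then walk the TAR members.
with tarfile.open(fileobj=io.BytesIO(tarball), mode="r:gz") as archive:
    names = archive.getnames()
    css = archive.extractfile("style.css").read()

print(names, len(tarball), "bytes")
```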

HTTP Compression

In the simplest terms, HTTP compression means the server sends its data to the browser (client) in compressed form when the client asks for it. The client then decompresses that data before showing it to the user.

HTTP compression can work in either of two ways – Lower Level or Higher Level.

Lower Level – The Transfer-Encoding header field is used to indicate that the message being received was compressed for transfer.

Higher Level – The Content-Encoding header field is used to indicate that the data being received is in compressed form.

Some browsers do not advertise support for Transfer-Encoding compression, to avoid triggering bugs in servers, and hence the Content-Encoding approach is the preferred method.

Gzip is one of the three standard HTTP compression formats:

Gzip – GNU Zip program

Compress – UNIX file compression program

Zlib – DEFLATE data in the zlib wrapper (sent in HTTP under the token “deflate”)

Zlib at one point looked like the better choice because its wrapper is smaller: Gzip adds at least 18 bytes of header and trailer, while zlib adds only 6. It is not widely used, though, because Microsoft IE does not implement the zlib standard correctly.
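The size of each wrapper can be measured directly. This is a small sketch using Python's zlib module, where the wbits argument picks the wrapper around the same DEFLATE stream (-15 for none, 15 for zlib, 31 for gzip). The payload and the helper name deflate are my own, chosen for illustration.

```python
import zlib

# Hypothetical payload; the overhead does not depend on its content.
data = b"HTTP compression and the holy details. " * 50


def deflate(payload: bytes, wbits: int) -> bytes:
    """Compress payload with DEFLATE; wbits selects the wrapper."""
    compressor = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return compressor.compress(payload) + compressor.flush()


raw_size = len(deflate(data, -15))   # raw DEFLATE, no wrapper
zlib_size = len(deflate(data, 15))   # zlib header + Adler-32 trailer
gzip_size = len(deflate(data, 31))   # gzip header + CRC-32 and size trailer

zlib_overhead = zlib_size - raw_size
gzip_overhead = gzip_size - raw_size
print("zlib adds", zlib_overhead, "bytes; gzip adds", gzip_overhead, "bytes")
```

Either way the difference is a handful of bytes per response, which is why compatibility, not size, ended up deciding the matter.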

Gzip is useful for compressing text-based files such as XHTML, CSS, JS and plain text, but it is of no use on an already compressed file or an image format like PNG. Such files already use a compression technique, so Gzip gains nothing and still adds its own header and trailer to the file.
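A quick way to convince yourself is to compress something repetitive and something that looks like compressed data. In this sketch, seeded pseudo-random bytes stand in for an already compressed file (that substitution is my assumption; a real PNG or ZIP behaves similarly because its bytes are close to random).

```python
import gzip
import random

# Repetitive CSS-like text: the kind of content Gzip is good at.
text = b"body { margin: 0; padding: 0; } " * 128

# Seeded pseudo-random bytes stand in for already-compressed data.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(text)))

text_gz = gzip.compress(text)
noise_gz = gzip.compress(noise)

print("text: ", len(text), "->", len(text_gz))
print("noise:", len(noise), "->", len(noise_gz))
```

The text shrinks dramatically, while the noise comes out slightly larger than it went in.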

Apache and Gzip

Apache, one of the most widely used servers, supports Gzip compression via the mod_deflate and mod_gzip modules.

mod_deflate: It comes bundled with Apache 2.x. It is faster in terms of compression and decompression and uses fewer resources. It is also better documented and easier to configure.

mod_gzip: It is an additional third-party module for Apache. It is slower in terms of compression and decompression and uses slightly more resources.

mod_deflate is the most commonly adopted way of implementing Gzip on Apache. Its early releases had a reputation for lower compression ratios because the compression level was fixed; in later Apache 2 releases the level can be configured with the DeflateCompressionLevel directive.

NGINX has Gzip support built in.

Server – Client – Compression

Browser: Hey yo server, check out the Accept-Encoding field in my request headers; I’d like the data in zipped format.

Accept-Encoding: gzip, deflate

Server: Wasup Browser, sure thing. Check out the Content-Encoding field that I sent with the data; the data is zipped. Peace.

Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip

If at any point the Content-Encoding field says “identity”, or the field is missing altogether, it means the data is in its original uncompressed form.
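The conversation above can be sketched in a few lines of Python. This is not how Apache or NGINX do it internally; the function names accepts_gzip and respond are hypothetical, and the header parsing is deliberately simplified (it ignores the “*” wildcard and only understands a plain q value).

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value allows gzip (q > 0)."""
    for item in accept_encoding.split(","):
        coding, _, params = item.strip().partition(";")
        if coding.strip().lower() != "gzip":
            continue
        params = params.strip()
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def respond(accept_encoding: str, body: bytes):
    """Build (headers, payload) the way a compressing server might."""
    headers = {"Content-Type": "text/html; charset=UTF-8"}
    if accepts_gzip(accept_encoding):
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body


# Hypothetical page served to two different clients.
page = b"<html><body>" + b"<p>Peace.</p>" * 100 + b"</body></html>"
headers, payload = respond("gzip, deflate", page)
plain_headers, plain_payload = respond("identity", page)

print(headers)
print(plain_headers)
```

On the client side, the browser's half of the deal is just gzip.decompress(payload) whenever Content-Encoding says gzip.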


After this, I read about the common configuration used to compress different sorts of files with both mod_deflate and mod_gzip. You can find it in the references listed below.

Ref:

https://en.m.wikipedia.org/wiki/Gzip
http://www.gzip.org/
https://www.freebsd.org/cgi/man.cgi?gzip
https://en.m.wikipedia.org/wiki/HTTP_compression
https://www.giftofspeed.com/enable-gzip-compression/
