I was working on speeding up the transmission and reducing the bandwidth of a web service I’ve been building in EC2. I’m using nginx (pronounced "engine-x") as my load balancer, and the cloud because it is awesome.
A little overview for people who have lives and think an HTTP Header is a football term:
When you send a request to a server via your browser, a series of headers (little variables) is sent across, telling the server what type of browser you have, what language you prefer, and a whole host of other interesting info.
When you get data back from the web server, it also sends back the status of the request (200 is OK, 404 means there is nothing at that URL, etc.) plus some headers of its own: the length of the content, the type of content, and so on.
When a server is using mod_deflate or mod_gzip in Apache (or the gzip module for nginx), it is capable of sending back its content in a compressed (gzipped) form. All modern browsers support transparent decompression on the browser side. What this means is that you see a normal HTML page which is 70k, but only 15 or 20k went over the pipe because of compression. Cool, right? Chances are it just happened as you loaded this page.
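To get a feel for the savings, here is a minimal sketch that gzips a made-up page and unpacks it again. It assumes PHP 5.4+ with the zlib extension; the sample markup and variable names are invented for illustration, and real pages will compress more or less than this one.

```php
<?php
// A made-up, repetitive HTML page. Markup compresses well because tags repeat.
$html = '<html><body>'
    . str_repeat('<p class="row">Hello, compressed world!</p>', 500)
    . '</body></html>';

// gzencode() produces the gzip format that goes with "Content-Encoding: gzip".
$compressed = gzencode($html);

// gzdecode() reverses it, which is what the browser does behind the scenes.
$restored = gzdecode($compressed);

printf("original: %d bytes, gzipped: %d bytes\n", strlen($html), strlen($compressed));
```

The exact ratio depends entirely on the content, so treat the printed numbers as a rough illustration rather than a benchmark.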
Now how does a web server know you can accept gzip’d content? Well, you send one of those headers that looks like this:
Accept-Encoding: gzip,deflate
and the server responds with
Content-Encoding: gzip
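The server side of that handshake boils down to inspecting the Accept-Encoding value. Below is a rough sketch of such a check; `client_accepts_gzip` is a hypothetical helper written for this post, not something nginx, Apache, or PHP ships, and it deliberately ignores wildcards (`*`) and aliases like `x-gzip`.

```php
<?php
/**
 * Hypothetical helper: true if an Accept-Encoding header value lists gzip
 * with a non-zero quality value.
 */
function client_accepts_gzip($acceptEncoding) {
    foreach (explode(',', $acceptEncoding) as $part) {
        // Each part looks like "gzip" or "gzip;q=0.5".
        $pieces = explode(';', trim($part));
        $coding = strtolower(trim($pieces[0]));
        if ($coding !== 'gzip') {
            continue;
        }
        // A quality value of 0 means "explicitly not acceptable".
        foreach (array_slice($pieces, 1) as $param) {
            $param = strtolower(str_replace(' ', '', $param));
            if (strpos($param, 'q=') === 0 && (float) substr($param, 2) == 0.0) {
                return false;
            }
        }
        return true;
    }
    return false;
}

var_dump(client_accepts_gzip('gzip,deflate'));
```

A real server does more than this (it also looks at content types, response sizes, and the HTTP version), but the header check is the heart of the negotiation.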
By default, PHP does not transparently handle gzip encoding when it makes HTTP requests, but it can be done. See the following:
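// gzinflate() is only available when PHP is built with zlib.
$gz_on = function_exists('gzinflate');
// Always define $headers so the request below works with gzip on or off.
$headers = array();
if ($gz_on) {
  // Tells the drupal_http_request function to send this header over.
  // Only gzip is advertised, because gzip is the only encoding handled below.
  $headers['Accept-Encoding'] = 'gzip';
}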
$return = drupal_http_request($url,$headers,'GET');
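// This checks to make sure that we actually got gzip'd content back.
if ($gz_on && isset($return->headers['Content-Encoding']) && stristr($return->headers['Content-Encoding'], 'gzip')) {
  // gzinflate() expects a raw deflate stream, so skip the fixed 10-byte gzip header.
  // (On PHP 5.4+, gzdecode($return->data) handles the header for you.)
  $string = substr($return->data, 10);
  $output = gzinflate($string);
} else {
  $output = $return->data;
}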
Go ahead and try it with $gz_on false and true. You'll see that $return->data has a different length depending on whether it is compressed, but $output will be the same.
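If you are curious why the first 10 characters get thrown away in the snippet above, you can check the trick in isolation, without Drupal or a web server. This is a sketch under a few assumptions: PHP 5.4+ with zlib, and a gzip stream that carries only the fixed 10-byte header (which is what gzencode() emits; streams with optional header fields such as a file name have longer headers). The variable names are made up for the demo.

```php
<?php
$original = str_repeat('Happy Header Hacking! ', 100);

// Stand-in for the body of a response sent with "Content-Encoding: gzip".
$gz = gzencode($original);

// Every gzip stream starts with the magic bytes 0x1f 0x8b.
$isGzip = (substr($gz, 0, 2) === "\x1f\x8b");

// Drop the 10-byte header and the 8-byte trailer (CRC32 plus length),
// leaving the raw deflate stream that gzinflate() understands.
$viaInflate = gzinflate(substr($gz, 10, -8));

// gzdecode() parses the header and trailer itself, so no slicing is needed.
$viaDecode = gzdecode($gz);

var_dump($isGzip, $viaInflate === $original, $viaDecode === $original);
```

Where gzdecode() is available it is probably the safer choice, since it does not rely on the header being exactly 10 bytes long.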
One caveat when working with nginx (this one hurt after 2 hrs):
From the nginx manual (this is the description of the gzip_http_version directive):
Turns gzip compression on or off depending on the HTTP request version.
When HTTP version 1.0 is used, the Vary: Accept-Encoding header is not set. As this can lead to proxy cache corruption, consider adding it with add_header. Also note that the Content-Length header is not set when using either version. Keepalives will therefore be impossible with version 1.0, while for 1.1 it is handled by chunked transfers.
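Drupal uses HTTP 1.0 for its requests, so with nginx's default settings (gzip_http_version defaults to 1.1) those requests never get a gzipped response back. I don't know why Drupal uses version 1.0, but I filed an issue about it, because 1.1 is a lot better and pretty much ubiquitous, IIRC.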
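For reference, the nginx side of the workaround might look something like the fragment below. It is a sketch, not my exact config: it assumes you are fine with compressing responses for HTTP/1.0 clients, and it uses add_header for the Vary header because that is what the manual excerpt suggests.

```nginx
# Goes inside the http, server, or location block that serves the content.
gzip on;

# The default is 1.1; lowering it lets HTTP/1.0 clients get gzipped responses too.
gzip_http_version 1.0;

# Suggested by the manual so that proxy caches keep compressed and
# uncompressed copies apart.
add_header Vary Accept-Encoding;
```

Keep the manual's warning in mind: with HTTP/1.0 the Content-Length header is not set on compressed responses, so keepalives won't work for those clients.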
Happy Header Hacking!