I have finished the first version of my compressed content module, as described in RFC: Mod Perl compressed content. I decided not to go with mod_gzip, mainly because I wanted a pure Perl solution and I couldn't guarantee mod_gzip would do exactly what I wanted.

This is my first mod_perl module, so I would appreciate feedback on its quality, as well as on whether it's at all worth feeding to CPAN (or not :P).

Note that I have tested it successfully both on Redhat and on Win2k.
package Apache::Precompress;

use strict;
use Compress::Zlib 1.0;
use Apache::Log;
use Apache::Constants qw(:common);
use vars qw($VERSION);

$VERSION = sprintf '%d.%d', q$Revision: 0.1 $ =~ /: (\d+)\.(\d+)/;

sub handler {
    my $r = shift;
    my $buffer;
    my $fh;

    # Quick file check
    unless (-e $r->filename . '.gz') {
        error($r->log, "Cannot open " . $r->filename . ".gz\n");
        return NOT_FOUND;
    }

    if ($r->dir_config->get('SSI') || $r->header_in('Accept-Encoding') !~ /gzip/) {
        # Called via SSI, or the client cannot take gzip: decompress on the fly
        $r->send_http_header;
        my $gz = gzopen($r->filename() . '.gz', "rb")
            or return error($r->log, "Cannot open " . $r->filename . ".gz: $gzerrno\n");
        while ($gz->gzread($buffer, 4096) > 0) {
            $r->print($buffer);
        }
        if ($gzerrno != Z_STREAM_END) {
            return error($r->log, "Error reading from " . $r->filename . ".gz: $gzerrno\n");
        }
        $gz->gzclose();
    }
    else {
        # The client accepts gzip: send the compressed file as-is
        $r->content_encoding('gzip');
        $r->send_http_header;
        open(FILE, $r->filename . '.gz') || return NOT_FOUND;
        binmode(FILE);
        while (read(FILE, $buffer, 4096) > 0) {
            $r->print($buffer);
        }
        close(FILE);
    }
    return OK;
}

# Log the message and hand back the status code for the caller to return
sub error {
    my $handle = shift;
    my $msg    = shift;
    $handle->error($msg);
    return SERVER_ERROR;
}

1;
__END__

=head1 NAME

Apache::Precompress - Deliver already compressed files or decompress on the fly

=head1 SYNOPSIS

    PerlModule Apache::Precompress

    # Handle regular files, ie index.html.gz
    # Incoming request would be index.html
    <Directory "your-docroot/compressedfilesdir">
        SetHandler perl-script
        PerlHandler Apache::Precompress
    </Directory>

    # Handle files by given extension .gzhtml
    <FilesMatch "\.gzhtml$">
        SetHandler perl-script
        PerlHandler Apache::Precompress
    </FilesMatch>

    # You want to use SSI but your templates are compressed
    AddHandler server-parsed .html
    <FilesMatch "\.shtml$">
        Options +Includes
        PerlSetVar SSI 1
    </FilesMatch>

=head1 DESCRIPTION

This module lets you send pre-compressed files as though they were not.
For those clients that do not support compressed content, the file is
de-compressed on the fly.

This module overcomes the overhead of having to compress data on the fly
by keeping the data compressed on disk at all times. The driving force
behind this approach was that I couldn't afford to upgrade my ISP account
to have more disk space. The effective savings on bandwidth are also
quite handy.

This module will not allow the file to have SSI directives parsed out.
See the TO DO section. If you have got SSI turned on then you simply need
to use Options -Includes inside your directives.

=head1 Note

The intent of this module is to hide the fact that the content has been
precompressed from the client. At no time should the client expect to
call a file by anything other than its normal extension. Additionally,
the content should not link to other content other than in the normal
way, ie:

    <a href="/compressed/test.html">Valid</a>

and not

    <a href="/compressed/test.html.gz">Invalid</a>

=head1 TO DO

The SSI handling requires the setting of a variable, as otherwise we end
up with compressed content within the middle of an uncompressed page. We
should be able to tell if we are called via SSI by some other means.

Also, support for Apache::SSI would be useful.

=head1 AUTHOR

Simon Proctor, www.simonproctor.com

Based on the work of Apache::Compress

=head1 COPYRIGHT

Copyright (C) 2002 Simon Proctor. All Rights Reserved.

This module is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

=head1 THANKS TO

belg4mit for valuable feedback

=cut


Update #1: Added read call and extra POD stuff
Update #2

I've added code to check for SSI. However it requires the setting of a Perl var (see POD) for correct decompressing.

Is there a better way of checking if the module is called via SSI? Without this Perl var test, the server includes the content but leaves it compressed, so you can end up with an uncompressed file interspersed with compressed content.

Not pretty :)
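
One possibility, untested against a live server: mod_include normally fetches an included document through a subrequest, and the mod_perl 1 request object has `$r->is_main`, which is false for subrequests. Below is a minimal sketch of the decision logic, assuming that holds for this setup. `must_decompress` is a hypothetical helper name, not part of the module above.

```perl
use strict;
use warnings;

# Decide whether the body has to be sent uncompressed.
#   $is_main         - true for a top-level request, false for a subrequest
#                      (assumption: SSI includes arrive as subrequests)
#   $accept_encoding - value of the Accept-Encoding header, may be undef
sub must_decompress {
    my ($is_main, $accept_encoding) = @_;

    # Content pulled into another page must never be gzip-encoded.
    return 1 unless $is_main;

    # Otherwise decompress only when the client did not offer gzip.
    return (defined $accept_encoding && $accept_encoding =~ /\bgzip\b/) ? 0 : 1;
}

# Inside the handler this would be called roughly as:
#   if (must_decompress($r->is_main, $r->header_in('Accept-Encoding'))) { ... }
```

If this works, the PerlSetVar SSI switch would no longer be needed, but a subrequest issued for any other reason would also get decompressed output.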

Re: Compressed content module
by belg4mit (Prior) on Dec 01, 2002 at 22:36 UTC
    Looks fine. There are a few stylistic, efficiency, and configurability points, though. I'd inline the test of Accept-Encoding myself. (Update) It would probably be better to use read with something like a 4k chunk at a time instead of reading linewise (Update: when passing compressed content), especially considering it's a binary file and $/ may not even occur in the file on some platforms (think \r\n). Finally, I'd consider letting the user somehow define the extension(s), or perhaps MIME types, to support decompression for.
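
To illustrate the chunked read, here is a minimal standalone sketch. `copy_chunked` is a hypothetical helper name, and it writes to a plain filehandle; in the handler the `print` would be `$r->print($buffer)` instead.

```perl
use strict;
use warnings;

# Copy everything from $in to $out in fixed-size binary chunks.
# Returns the number of bytes copied; dies on a read or write error.
sub copy_chunked {
    my ($in, $out, $chunk_size) = @_;
    $chunk_size ||= 4096;

    # Binary mode keeps \r\n and friends untouched on every platform.
    binmode($in);
    binmode($out);

    my $total = 0;
    while (1) {
        my $n = read($in, my $buffer, $chunk_size);
        die "read failed: $!" unless defined $n;
        last if $n == 0;    # end of file
        print {$out} $buffer or die "write failed: $!";
        $total += $n;
    }
    return $total;
}
```

Because `read` never looks for `$/`, the memory used per iteration stays bounded by the chunk size no matter what bytes the file contains.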

    UPDATE: As for SSI you might mention that the user could instead construct a skeleton document with the existing SSI and then include a body handled by this module as an SSI.

    --
    I'm not belgian but I play one on TV.

      Thanks for the feedback. I'll update the code and the root node. One question, though: you mentioned letting the user define the extensions, but the module already does this straight out of the box:
      <FilesMatch "\.chtml$">
          SetHandler perl-script
          PerlHandler Apache::Precompress
      </FilesMatch>
      Is this something along the lines of what you meant (I kinda like the chtml extension :) )? If not could you give me an example?

      Thanks once again, SP
        Doh! I forgot about File and friends. That is one way for the user to do so, and a good one too. I'd actually slightly mangled the module logic in my head there (the test for gz suffix). I'd recommend giving an example of that in the documentation (BTW MS uses CHTML for compiled HTML, there's also cHTML - compact HTML). Also, you might want to remove the test for gz extension and look for magic a la file(1). It's more UN*Xy, even if it is more expensive.
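
A rough sketch of the magic check, assuming the only format of interest is gzip, whose files start with the two bytes 0x1f 0x8b. `looks_like_gzip` is a hypothetical helper name, and this checks far less than file(1) does.

```perl
use strict;
use warnings;

# Return 1 if the file at $path starts with the gzip magic bytes
# 0x1f 0x8b, and 0 otherwise (including when it cannot be opened).
sub looks_like_gzip {
    my ($path) = @_;

    open(my $fh, '<', $path) or return 0;
    binmode($fh);
    my $n = read($fh, my $magic, 2);
    close($fh);

    return (defined $n && $n == 2 && $magic eq "\x1f\x8b") ? 1 : 0;
}
```

The extra cost is one open and a two-byte read per request, on top of the open the handler already does to send the file.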

        --
        I'm not belgian but I play one on TV.