itsscott has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks!

I have an old CGI that doesn't know how to gzip its output, and I don't currently have time to rewrite it to use our new library, which would gzip all the output.

The CGI gets thousands of views a day, and since its output isn't gzipped, the bandwidth usage is wasteful! It's also slower to load for the visitor (of course!).

Is there a small Perl program that I could give the same name as the existing CGI (after renaming the old CGI to something else), so that every call to that CGI has its stdout gzipped and sent on by the Perl script?

The server is Apache 2, if that helps.

Thanks in advance

Replies are listed 'Best First'.
Re: Wrapper to Gzip CGI output
by tobyink (Canon) on Sep 07, 2014 at 00:44 UTC
Re: Wrapper to Gzip CGI output
by Anonymous Monk on Sep 07, 2014 at 00:46 UTC

    If the script was written with the OO interface of CGI, then CGI::Compress::Gzip might be worth a try.

    See Also: OFT: Gzipping output from a CGI

    If you're using mod_perl that should also give you some more options.

    Taking a step back - What is the size of the pages being generated? Is the bandwidth usage actually too much, or does it just "seem" that way to you? Do you know that the slow page load times are because of the size of the page, and not the execution time of the script or other factors? Benchmarks, benchmarks!

      It's a compiled ANSI C CGI that uses our own library, so there are no off-the-shelf tools I can use. Adding gzip support to it would require a lot of rewriting, as you have to know the content length of the document. We put out about 1.5TB of output a month, so my ideal solution is a wrapper that receives the data from the CGI, compresses it, and sends it to the visitor. That should save us about 50% or more in bandwidth usage.

        So, you need a wrapper. Do you know the CGI protocol? If not, read it now (RFC3875). Also read http://en.wikipedia.org/wiki/HTTP_compression. From there, it is quite obvious how to implement such a wrapper:

        The wrapper needs to pass the entire environment and standard input unmodified to the original CGI. It has to write all CGI response headers unmodified to standard output, plus it has to write a Content-Encoding: gzip header. (One exception: if the CGI sends a Content-Length header, that value describes the uncompressed body, so it has to be dropped or recomputed.) Then, it has to read all of the CGI output, compress it, and write the compressed data to standard output.

        There is only one trap left. Not all HTTP clients accept gzip compressed data. Luckily, this special case is simple to detect, and simple to handle: Clients that can handle gzip compressed data send an Accept-Encoding header that contains the word gzip. The webserver places the value of that header in the environment variable HTTP_ACCEPT_ENCODING. If that variable does not exist at all or does not contain gzip, simply replace the wrapper with the actual CGI (exec("/path/to/real.cgi")). Do this test as early as possible.
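        A minimal Perl sketch of that early test, under stated assumptions: the sub name client_accepts_gzip and the GATEWAY_INTERFACE guard are illustrative choices, and /path/to/real.cgi is the placeholder path from above. It is slightly stricter than "contains the word gzip" in that it also treats "gzip;q=0" as a refusal; it does not handle the "*" wildcard.

```perl
use strict;
use warnings;

# Placeholder path of the renamed original CGI (adjust to your setup).
my $REAL_CGI = '/path/to/real.cgi';

# Returns 1 if the Accept-Encoding value lists gzip (or x-gzip)
# without an explicit "q=0", otherwise 0.
sub client_accepts_gzip {
    my ($accept) = @_;
    return 0 unless defined $accept;
    for my $item (split /,/, $accept) {
        $item =~ s/^\s+|\s+$//g;                      # trim whitespace
        my ($coding, @params) = split /\s*;\s*/, $item;
        next unless defined $coding && lc($coding) =~ /^(?:x-)?gzip$/;
        # "gzip;q=0" means the client explicitly refuses gzip.
        return 0 if grep { /^q\s*=\s*0(?:\.0*)?$/i } @params;
        return 1;
    }
    return 0;
}

# Only act when started by a web server (CGI sets GATEWAY_INTERFACE).
# exec replaces the wrapper process with the real CGI, which inherits
# standard input, standard error and the environment.
if ($ENV{GATEWAY_INTERFACE}
    && !client_accepts_gzip($ENV{HTTP_ACCEPT_ENCODING})) {
    exec { $REAL_CGI } $REAL_CGI
        or die "cannot exec $REAL_CGI: $!\n";
}
```

        Because exec never returns on success, everything after this test can assume that the client accepts gzip.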

        Back to the wrapper implementation (after the Accept-Encoding test): Unless you actively prevent it, standard input is automatically inherited by child processes, and so is the environment. So you don't have to take care of these. Standard error is also inherited, which is a good thing, too. All that's left to do is to fork() a child process, exec("/path/to/real.cgi") there, and read in whatever the child process writes to its standard output. (You may need pipe in C.) Read line by line and write to standard output whatever you read from the child, until you read an empty line. Don't write out that empty line; instead, write the Content-Encoding header line plus an empty line to standard output. After that, read large chunks (not lines!) from the child process with the real CGI, compress them, and write the compressed data to standard output.
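        The steps above could look roughly like this in Perl. This is a sketch, not a finished wrapper: the sub name gzip_cgi_response and the 64 KB chunk size are arbitrary choices, and a piped open stands in for the explicit fork()/exec()/pipe sequence (Perl does the forking for you). It uses the core module IO::Compress::Gzip. It also drops any Content-Length header coming from the CGI, since that value describes the uncompressed body.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw($GzipError);

# Copy a CGI response from $in to $out, gzipping the body.
sub gzip_cgi_response {
    my ($in, $out) = @_;
    binmode $_ for $in, $out;

    # Header section: copy line by line, stop at the empty line
    # without writing it out.
    while (defined(my $line = <$in>)) {
        last if $line =~ /^\r?\n\z/;
        next if $line =~ /^Content-Length\s*:/i;   # no longer correct
        print {$out} $line;
    }
    print {$out} "Content-Encoding: gzip\r\n\r\n";

    # Body: read large chunks (not lines), compress, write.
    my $gz = IO::Compress::Gzip->new($out)
        or die "gzip setup failed: $GzipError\n";
    my $chunk;
    while (1) {
        my $n = read($in, $chunk, 65536);
        die "read from CGI failed: $!\n" unless defined $n;
        last if $n == 0;
        $gz->print($chunk) or die "gzip write failed: $GzipError\n";
    }
    $gz->close or die "gzip close failed: $GzipError\n";
    return 1;
}

# Only act when started by a web server (CGI sets GATEWAY_INTERFACE).
if ($ENV{GATEWAY_INTERFACE}) {
    my $REAL_CGI = '/path/to/real.cgi';   # placeholder path
    # Forks and execs the real CGI; the child inherits standard input,
    # standard error and the environment.
    open(my $child, '-|', $REAL_CGI)
        or die "cannot start $REAL_CGI: $!\n";
    gzip_cgi_response($child, \*STDOUT);
    close $child or warn "real CGI failed, wait status $?\n";
}
```

        Taking filehandles as arguments keeps the copying logic independent of the child process, so it can be tried out against a canned response before it goes anywhere near the web server.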

        Add some error handling. Write a minimalistic 500 Internal Server Error page whenever something went wrong, and write details to standard error so you can see actual error messages in the webserver's error log.
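        One possible shape for that error handling in Perl, again only as a sketch: error_response and run_with_500 are made-up names, and the code assumes the failure happens before any output has been sent. Once part of the response is out, a 500 page can no longer replace it, so a real wrapper might buffer the header section first.

```perl
use strict;
use warnings;

# Build a minimalistic CGI error response. The Status header tells
# the web server which HTTP status to send.
sub error_response {
    my $body = "500 Internal Server Error\n";
    return "Status: 500 Internal Server Error\r\n"
         . "Content-Type: text/plain\r\n"
         . "Content-Length: " . length($body) . "\r\n\r\n"
         . $body;
}

# Run $work. If it dies, write the details to $err (normally STDERR,
# which ends up in the web server's error log) and the 500 page to
# $out (normally STDOUT). Returns 1 on success, 0 on failure.
sub run_with_500 {
    my ($work, $out, $err) = @_;
    my $ok = eval { $work->(); 1 };
    return 1 if $ok;
    my $why = $@ || 'unknown error';
    $why =~ s/\s*\z/\n/;                  # exactly one trailing newline
    print {$err} "gzip wrapper: $why";
    print {$out} error_response();
    return 0;
}

# Usage inside the wrapper, around whatever does the actual work:
#   run_with_500(sub { ... }, \*STDOUT, \*STDERR);
```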

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)