Re^2: Wrapper to Gzip CGI output

Replies are listed 'Best First'.

Re^3: Wrapper to Gzip CGI output
by afoken (Chancellor) on Sep 07, 2014 at 08:44 UTC

So, you need a wrapper. Do you know the CGI protocol? If not, read the specification now (RFC 3875). Also read http://en.wikipedia.org/wiki/HTTP_compression. From there, it is quite obvious how to implement such a wrapper:

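The wrapper needs to pass the entire environment and standard input unmodified to the original CGI. It has to write all CGI response headers unmodified to standard output, plus it has to write a `Content-Encoding: gzip` header. (One exception: if the CGI emits a `Content-Length` header, that value describes the uncompressed body, so it must be dropped or corrected.) Then, it has to read the rest of the CGI output (the response body), compress it, and write the compressed data to standard output.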
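
To make that contract concrete, here is a minimal Python sketch that applies it to a complete CGI response held in memory. The function name `gzip_cgi_response` is made up for illustration, and buffering the whole response is a simplification; it also passes every header through as-is, including any `Content-Length`.

```python
import gzip


def gzip_cgi_response(raw: bytes) -> bytes:
    """Add Content-Encoding: gzip to a CGI response and gzip its body.

    `raw` is the complete output of the real CGI: header lines, one
    empty line, then the body. Header lines are passed through as-is.
    """
    # The header block ends at the first empty line; accept LF and CRLF.
    candidates = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(pos, sep) for pos, sep in candidates if pos != -1]
    if not found:
        raise ValueError("no empty line after the CGI headers")
    pos, sep = min(found)              # earliest separator wins
    newline = sep[: len(sep) // 2]     # b"\n" or b"\r\n"

    headers = raw[:pos]
    body = raw[pos + len(sep):]

    return (
        headers + newline
        + b"Content-Encoding: gzip" + sep   # extra header, then empty line
        + gzip.compress(body)
    )
```

For example, `gzip_cgi_response(b"Content-Type: text/plain\n\nhello\n")` returns the original header line, the added `Content-Encoding: gzip` line, an empty line, and the gzip-compressed bytes of `hello\n`.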

There is only one trap left. Not all HTTP clients accept gzip compressed data. Luckily, this special case is simple to detect, and simple to handle: Clients that can handle gzip compressed data send an `Accept-Encoding` header that contains the word `gzip`. The webserver places the value of that header in the environment variable `HTTP_ACCEPT_ENCODING`. If that variable does not exist at all or does not contain `gzip`, simply replace the wrapper with the actual CGI (`exec("/path/to/real.cgi")`). Do this test as early as possible.
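
A minimal sketch of that early test in Python. The names `client_accepts_gzip` and `exec_real_cgi_unless_gzip` are hypothetical, and `/path/to/real.cgi` is the placeholder path from above. The sketch assumes one refinement over a plain substring test: it treats `gzip;q=0` as a refusal, because a q-value of zero means "not acceptable" in HTTP.

```python
import os

REAL_CGI = "/path/to/real.cgi"  # placeholder path, adjust for your setup


def client_accepts_gzip(environ) -> bool:
    """Return True if HTTP_ACCEPT_ENCODING lists gzip with a non-zero q-value."""
    header = environ.get("HTTP_ACCEPT_ENCODING", "")
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "gzip":
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q-value as a refusal
        return q > 0
    return False


def exec_real_cgi_unless_gzip(environ=os.environ):
    """Replace the wrapper process with the real CGI if gzip is not accepted."""
    if not client_accepts_gzip(environ):
        # On success execv does not return; environment, stdin, stdout
        # and stderr are inherited by the real CGI unchanged.
        os.execv(REAL_CGI, [REAL_CGI])
```

Calling `exec_real_cgi_unless_gzip()` as the very first step of the wrapper keeps the non-gzip path as cheap as possible.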

Back to the wrapper implementation (after the `Accept-Encoding` test): Unless you actively prevent it, standard input is automatically inherited by child processes, and so is the environment, so you don't have to take care of these. Standard error is also inherited, which is a good thing, too. All that's left to do is to `fork()` a child process, `exec("/path/to/real.cgi")` there, and read whatever the child process writes to its standard output. (You may need `pipe` in C.) Read line by line and write whatever you read from the child to standard output, until you read an empty line. Don't write out that empty line; instead, write the `Content-Encoding` header line plus an empty line to standard output. After that, read large chunks (not lines!) from the child process running the real CGI, compress them, and write the compressed data to standard output.
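
The steps above could be sketched in Python roughly as follows. This is only an illustration: the function names, the 64 KiB chunk size and the exit code 127 are my assumptions, `/path/to/real.cgi` is the placeholder from above, and `zlib.compressobj(wbits=31)` is used because that setting makes zlib emit the gzip container format.

```python
import os
import sys
import zlib

REAL_CGI = "/path/to/real.cgi"  # placeholder path, adjust for your setup
CHUNK_SIZE = 64 * 1024          # assumed chunk size for reading the body


def copy_headers(src, dst):
    """Copy header lines from src to dst, then add Content-Encoding."""
    newline = b"\n"
    while True:
        line = src.readline()
        if line in (b"", b"\n", b"\r\n"):  # EOF, or the empty line after the headers
            if line:
                newline = line             # reuse the CGI's own line ending
            break
        dst.write(line)
    # Write the extra header plus the empty line we swallowed above.
    dst.write(b"Content-Encoding: gzip" + newline + newline)


def copy_body_gzipped(src, dst):
    """Read large chunks from src and write them gzip-compressed to dst."""
    compressor = zlib.compressobj(wbits=31)  # 31 = gzip header and trailer
    while True:
        chunk = src.read(CHUNK_SIZE)
        if not chunk:
            break
        dst.write(compressor.compress(chunk))
    dst.write(compressor.flush())


def run_wrapped(cgi_path=REAL_CGI, out=None):
    """fork() and exec the real CGI, then filter its standard output."""
    if out is None:
        out = sys.stdout.buffer
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: connect stdout to the pipe; stdin, stderr and the
        # environment are inherited untouched.
        os.close(read_fd)
        os.dup2(write_fd, 1)
        os.close(write_fd)
        try:
            os.execv(cgi_path, [cgi_path])
        finally:
            os._exit(127)  # only reached if execv failed
    # Parent: close the write end so EOF arrives when the child exits.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as child_out:
        copy_headers(child_out, out)
        copy_body_gzipped(child_out, out)
    out.flush()
    _, status = os.waitpid(pid, 0)
    return status
```

The two copy functions take ordinary binary file objects, so they can be tried out on `io.BytesIO` buffers without forking anything.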

Add some error handling. Write a minimalistic `500 Internal Server Error` page whenever something goes wrong, and write details to standard error so you can see actual error messages in the webserver's error log.
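
One possible shape for that error handling, again as a hedged Python sketch. `run_with_error_page` and `error_response` are hypothetical names, and `handler` stands for whatever function does the actual wrapping; the `Status:` line is the CGI way (RFC 3875) of asking the webserver for a particular HTTP status.

```python
import sys
import traceback


def error_response() -> bytes:
    """Build a minimal CGI response that asks the server for a 500 status."""
    return (
        b"Status: 500 Internal Server Error\n"
        b"Content-Type: text/plain\n"
        b"\n"
        b"Internal Server Error\n"
    )


def run_with_error_page(handler, out=None, log=None) -> int:
    """Call handler(out); on any exception, log details and emit a 500 page.

    Returns 0 on success and 1 on failure, usable as a process exit code.
    """
    if out is None:
        out = sys.stdout.buffer
    if log is None:
        log = sys.stderr
    try:
        handler(out)
    except Exception:
        # Details go to standard error, i.e. the webserver's error log.
        traceback.print_exc(file=log)
        # The client only gets the bare minimum.
        out.write(error_response())
        out.flush()
        return 1
    return 0
```

Note that this only produces a clean error page if the failure happens before any output has been written; once headers have gone out, the status can no longer be changed.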

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
