pdc has asked for the wisdom of the Perl Monks concerning the following question:

Hello wise monks!

I'm writing a script to monitor a webapp, which involves following a chain of redirects. One of these redirect steps is behind ModSecurity on Apache, which rejects my normal requests from $mechanize->get(...). I've managed to get it to work with Firefox and openssl s_client, but I'm not sure how to make Mechanize send the request in a format that ModSecurity will accept.

Mechanize wants to send something like this:

GET https://host1.example.com/application-web/function?param1=abc&param2=def
Accept-Encoding: gzip
User-Agent: WWW-Mechanize/1.34

ModSecurity will only accept something like this:

GET /application-web/function?param1=abc&param2=def HTTP/1.1
Host: host1.example.com
User-Agent: WWW-Mechanize/1.34
Accept: text/html,application/xhtml+xml,application/xml
Accept-Encoding: gzip

Notice that ModSecurity wants the host in a separate Host: header rather than in the URL on the GET line, which is the first issue I'm having trouble with.
It also requires an HTTP version on the GET line.
(It also needs an Accept: header, but I believe that can be addressed once the other issues are.)

I've looked at the documentation for HTTP::Request but I'm still not sure if there's a way to do this. I'm ok using lower-level stuff like LWP and HTTP::Request, but I need to share a cookie jar with the other requests in my redirect chain (or rewrite all those to use a different module which can handle my specific need and works with cookies). Changing the ModSecurity policies is not an option at this time.

Thanks in advance.

Re: Changing WWW::Mechanize Request to obey ModSecurity rules
by ikegami (Patriarch) on Feb 26, 2009 at 04:43 UTC

    Mechanize wants to send something like this:

    Wants to send, or actually did send? What's in the HTTP::Request isn't what gets sent. You need to listen to the wire to see what's actually sent. You didn't specify what version of LWP you are using, but the latest one claims to use HTTP/1.1 by default (which will use the Host header as in your second snippet).

      I was basing it on something like this:

      my $response = $mech->get($location);
      print "REQUEST: \n" . $response->request->as_string . " --REQUEST\n";
      It looks like you're right about it being different, which makes things a bit more difficult.

      Normally I'd use Wireshark to look at this, but I'm not really a network admin and I'm not sure if there's a way to look directly at the wire and decrypt the SSL content.

        Create a little program that creates a server socket, accepts a connection, and dumps the incoming data. Connect to that instead of a real HTTP server.
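        A minimal sketch of such a dump server, using only core modules. The names read_request_head and run_dump_server and the port number are my own choices for illustration, not anything from LWP or Mechanize. It speaks plain TCP with no SSL, so point the script at an http:// URL on localhost (same path and headers) rather than the real https:// one; otherwise all you would see is the TLS handshake.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Read one HTTP request head (request line plus headers) from a connected
# socket, stopping at the blank line that ends the header block.
sub read_request_head {
    my ($conn) = @_;
    my $head = '';
    while (defined(my $line = <$conn>)) {
        $head .= $line;
        last if $line =~ /^\r?\n\z/;
    }
    return $head;
}

# Listen on 127.0.0.1:$port, print the head of every request received,
# and answer with an empty 200 response so the client does not hang.
sub run_dump_server {
    my ($port) = @_;
    my $server = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1',
        LocalPort => $port,
        Proto     => 'tcp',
        Listen    => 5,
        ReuseAddr => 1,
    ) or die "Cannot listen on port $port: $@";

    while (my $conn = $server->accept) {
        print "---- request ----\n", read_request_head($conn), "---- end ----\n";
        print $conn "HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n";
        close $conn;
    }
}

# Start only when a port is given, e.g.: perl dump_server.pl 8080
run_dump_server($ARGV[0]) if @ARGV;
```

        With that running on port 8080, something like $mech->get('http://localhost:8080/application-web/function?param1=abc&param2=def') should print the request line and headers as they actually go out on the wire.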
Re: Changing WWW::Mechanize Request to obey ModSecurity rules
by Anonymous Monk on Feb 26, 2009 at 04:29 UTC
    WWW-Mechanize/1.34
    Upgrade to WWW-Mechanize-1.54. While you're at it, upgrade to libwww-perl-5.825.

    ModSecurity will only accept something like this
    What exactly are you basing that on? That assumption is almost certainly wrong.

      Try
      package MyMechanize;
      use base 'WWW::Mechanize';

      sub _make_request {
          my $self    = shift;
          my $request = shift;
          $request->protocol('HTTP/1.1'); # modify
          $self->SUPER::_make_request($request, @_);
      }

      package main;

      my $ua = MyMechanize->new;
      $ua->add_handler("request_send",  sub { print "request_send\n";  shift->dump; return });
      $ua->add_handler("response_done", sub { print "response_done\n"; shift->dump; return });

      my $uri = URI->new("https://www.modsecurity.org/");
      $ua->get(
          $uri,
          'host'   => $uri->host,
          'Accept' => 'text/html,application/xhtml+xml,application/xml',
      );

      __END__
      request_send
      GET https://www.modsecurity.org/ HTTP/1.1
      Accept: text/html,application/xhtml+xml,application/xml
      Accept-Encoding: gzip
      Host: www.modsecurity.org
      User-Agent: WWW-Mechanize/1.54

      (no content)
      response_done
      HTTP/1.1 200 OK
      Connection: close
      Date: Thu, 26 Feb 2009 05:37:00 GMT
      Accept-Ranges: bytes
      Server: Apache........
      .......
      .....

      Thank you anonymous monk! Upgrading to 1.54 (along with adding the Accept header) gets me at least one step farther on this chain.

      For the record: I determined what ModSecurity would and would not reject by capturing the request Firefox sends (with the Live HTTP Headers plugin), pasting combinations of its lines into an openssl s_client session, and working out which lines/features were necessary to get a 302 redirect instead of a 400 Bad Request.