10.16.0.2 - - [19/Jan/2008:03:45:06 -0800] "GGG99994" 200 752 "-" "-" "10.16.0.2"
(here we have no method, no discernible request, and no version)

Not exactly true. You have a method. It's just a really weird (and probably invalid) one. I'm not sure why your server would 200 it; I can only presume some slightly odd config.

Take it in individual steps. First try splitting out into the 3 main pieces:

    my ($method, $uri, $proto, @extra) = split /\s+/, $request;
    die "Unexpected extra bits in request: @extra" if @extra > 0;
    die "No method" unless defined $method;
    # Or whatever other error-handling mechanism you want

You shouldn't have any extra bits, because if you do, your $method, $uri, and $proto may not hold what you expect them to, so that needs error-checking.

As well, you should have a method. The minimal possible HTTP request AFAIK would be a method of " ", with nothing else. That would leave all the vars undefined, and probably isn't something you care about anyway, so another error there.

The protocol may not be there. But expect that in higher level code, or defined-or it to an empty string here if you prefer.
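Putting those checks together, here's a rough sketch of what I mean. The sub name parse_request_line is just something I made up for illustration, and the //= (defined-or assignment) needs Perl 5.10 or later; swap in a plain defined() test on older perls. Dying on bad input is only one option, as noted above.

```perl
use strict;
use warnings;

# Hypothetical helper: split a logged request line into method, URI and
# protocol, defaulting the optional parts to empty strings.
sub parse_request_line {
    my ($request) = @_;

    my ($method, $uri, $proto, @extra) = split /\s+/, $request;

    # More than three fields means the first three can't be trusted.
    die "Unexpected extra bits in request: @extra\n" if @extra;

    # An empty or all-whitespace request yields no fields at all.
    die "No method\n" unless defined $method;

    # URI and protocol may legitimately be missing; defined-or them.
    $uri   //= '';
    $proto //= '';

    return ($method, $uri, $proto);
}

# The odd request from the log line parses as a bare method.
my ($method, $uri, $proto) = parse_request_line('GGG99994');
print "method=$method uri='$uri' proto='$proto'\n";
```

Whether a weird method like that counts as an error is a decision for your higher-level code; this only makes sure the three variables are safe to use.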

That leaves the URI. Using URI::Split as suggested above in Re: splitting a string that appears inconsistently in structure would be better than trying to split it up manually. Imagine, for instance, the case of having a '?' in the password; a simple regexp would give you a wrong answer then.

Note that the $uri can be undefined. A request of just "GET " is interpreted as "GET /" (similarly with POST), and would leave $uri undefined after that split. You probably want to make sure it's defined (as an empty string in this case) before you pass it to uri_split(). The URI::Split docs say:

The $path part is always present (but can be the empty string) and is thus never returned as "undef".

So take care not to blow up if it's empty.
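For the URI step, something along these lines should do. uri_split() is the real function exported by URI::Split (it comes with the URI distribution on CPAN); the wrapper sub split_request_uri and the hash keys are names I picked for the example, and the // in the printf again assumes Perl 5.10 or later.

```perl
use strict;
use warnings;
use URI::Split qw(uri_split);

# Hypothetical helper: break the request target into URI components,
# treating a missing target as the empty string.
sub split_request_uri {
    my ($uri) = @_;
    $uri = '' unless defined $uri;

    # uri_split returns scheme, authority, path, query and fragment;
    # everything except the path may come back undef.
    my ($scheme, $auth, $path, $query, $frag) = uri_split($uri);

    return {
        scheme => $scheme,
        auth   => $auth,
        path   => $path,
        query  => $query,
        frag   => $frag,
    };
}

my $parts = split_request_uri('/search?q=perl');
printf "path=%s query=%s\n", $parts->{path}, $parts->{query} // '(none)';
```

Test each of the other components with defined() before using it, and remember that the path can be an empty string even though it's never undef.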


In reply to Re^3: splitting a string that appears inconsistently in structure by fullermd
in thread splitting a string that appears inconsistently in structure by TheGorf
