Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I need to get the directory path from the HTTP_REFERER variable, but I am having trouble with the regexp to do this: $mypath = everything from the beginning of the string to the last "/" found in the string. Thanks!

Replies are listed 'Best First'.
Re: regexp to get path from URL
by chromatic (Archbishop) on Sep 29, 2000 at 21:50 UTC
    The URI module has a path() method... could come in handy.
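
    A minimal sketch of the module route, assuming the URI distribution from CPAN is installed. The sample referer and the dir_from_referer helper name are made up for illustration. path() returns the path component without the query string, and the substitution then trims whatever follows the last "/".

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: return the directory part of a URL's path.
sub dir_from_referer {
    my ($referer) = @_;
    my $path = URI->new($referer)->path;   # e.g. "/foo/bar/baz.html"
    $path =~ s{[^/]*\z}{};                 # drop whatever follows the last "/"
    return $path;
}

print dir_from_referer('http://www.example.com/foo/bar/baz.html?x=1'), "\n";
```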

    If you *know* the path is sane, you might do something like this:

    (my $path = $ENV{HTTP_REFERRER}) =~ s!//(.+?)/[^/]+\?*?!$1!;

    I would rather have a module do it, though I've also split on the forward slashes...
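
    The split approach might look something like the sketch below. It assumes an absolute URL of the form scheme://host/path; the dir_by_split name and the sample URL are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical helper: directory part of an absolute URL, found by splitting on "/".
sub dir_by_split {
    my ($url) = @_;
    $url =~ s/[?#].*//s;                  # discard query string and fragment
    my @parts = split m{/}, $url, -1;     # -1 keeps a trailing empty field
    # @parts is (scheme:, '', host, dir..., file)
    return '' if @parts < 4;              # no path at all, e.g. "http://host"
    pop @parts;                           # remove the file name (may be empty)
    return join '/', '', @parts[3 .. $#parts], '';
}

print dir_by_split('http://www.example.com/foo/bar/baz.html?x=1'), "\n";
```

    Stripping the query string first matters, since a "/" inside it would otherwise be treated as a path separator.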

    Update: In the interests of historical accuracy, I'll simply humour Ovid by providing a more workable regex:

    (my $path = $ENV{HTTP_REFERER}) =~ s!\.\w+/(.+?)/[^/]+\?*?!$1!;

    This is your brain on work.

(Ovid) Re: regexp to get path from URL
by Ovid (Cardinal) on Sep 29, 2000 at 22:10 UTC
    chromatic's regex appears to be picking up the domain rather than the path. Did I miss something?

    Heh. chromatic misspelled 'HTTP_REFERER' by spelling 'referrer' correctly :) The beauty of misspelled standards...

    I'd use the following regex, so long as you remember that the referrer can be easily spoofed and this isn't being used for security:

    $ENV{HTTP_REFERER} =~ m!^[^:]+://[^/]+/([^?]+)! or die "No path info!";
    my $path = $1;

    The final [^?]+ strips off the query string. Switch it to .* if you want the query string.
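
    To see the difference between the two variants, here is a small sketch run against a made-up referer (the URL and variable names are assumptions, not part of the original question):

```perl
use strict;
use warnings;

my $referer = 'http://www.example.com/foo/bar/baz.html?x=1';  # sample value

# [^?]+ stops at the first "?", so the query string is left out.
my ($without_query) = $referer =~ m!^[^:]+://[^/]+/([^?]+)!;

# .* runs to the end of the string, so the query string is kept.
my ($with_query) = $referer =~ m!^[^:]+://[^/]+/(.*)!;

print "$without_query\n";   # foo/bar/baz.html
print "$with_query\n";      # foo/bar/baz.html?x=1
```

    Note that neither capture includes the leading "/" of the path, because the slash sits outside the parentheses.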

    The or die on the regex kicks off if we don't have a path. Obviously, if the referrer is something like "http://www.yahoo.com/", you're going to have a problem. However, I assumed that your referrer would probably have a path appended. You can change that to or some_error_subroutine() if you like.

    Alternately, if no path info is acceptable, make sure that $path is set to "" on a regex failure. Otherwise, if anything was already in $1, then that would be assigned to $path and obviously, this is undesirable (thanks to chromatic for pointing that out to me).
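
    One way to sidestep the stale $1 problem is to capture in list context, so a failed match yields an empty list instead of an old capture. A sketch, with path_or_empty as a hypothetical helper name and made-up sample URLs:

```perl
use strict;
use warnings;

# Hypothetical helper: path portion of a referer, or "" when there is none.
sub path_or_empty {
    my ($referer) = @_;
    return '' unless defined $referer;
    # In list context a failed match returns an empty list, so $path stays
    # undef and never picks up a capture left over from an earlier match.
    my ($path) = $referer =~ m!^[^:]+://[^/]+/([^?]+)!;
    return defined $path ? $path : '';
}

my $primed = 'leftover' =~ /(left)/;   # deliberately leave something in $1
print '[', path_or_empty('http://www.yahoo.com/'), "]\n";                    # []
print '[', path_or_empty('http://www.example.com/foo/bar.html?x=1'), "]\n";  # [foo/bar.html]
```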

    Cheers,
    Ovid

    Join the Perlmonks Setiathome Group or just go to the link and check out our stats.