htmanning has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I'm trying to grab the current URL and find the page number at the end. The URLs look like the following:

http://www.domain.com/sub/name/anothername/page2/
http://www.domain.com/sub/name/anothername/page3/
http://www.domain.com/sub/name/anothername/page4/

I'm trying to set the page number to a variable. I tried this, but the .* takes out the page number too.

$current_url = $ENV{REQUEST_URI};
$current_url =~ s/\/sub-dir\/name\/.*\///;

I'm trying to end up with the page number only.

Thanks,

Re: Stripping part of URL
by graff (Chancellor) on Mar 24, 2015 at 02:12 UTC
    If it's true that the thing you really want is always "pageN", then you should just match that:
    $current_url =~ s{.*?/(page\d+).*}{$1};
    Note the use of "?" to invoke a "non-greedy match", so that the initial ".*" will stop matching as soon as there's a slash followed by "page\d+".
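To make the greedy versus non-greedy difference concrete, here is a minimal sketch. The sample path with two "pageN" segments and the page_number() helper are made up for illustration and are not from the original post. It also shows one way to capture only the digits with a plain match, which leaves the original URL untouched:

```perl
use strict;
use warnings;

# Hypothetical input with two "pageN" segments, so the two quantifiers differ.
my $path = "/page1/archive/page2/";

# Non-greedy: the leading .*? stops at the first "/pageN".
(my $lazy = $path) =~ s{.*?/(page\d+).*}{$1};

# Greedy: the leading .* runs to the end, then backtracks to the last "/pageN".
(my $greedy = $path) =~ s{.*/(page\d+).*}{$1};

print "non-greedy: $lazy\n";
print "greedy:     $greedy\n";

# Hypothetical helper: return only the digits after "/page", or undef
# when the URL has no such segment. The URL itself is not modified.
sub page_number {
    my ($url) = @_;
    my ($number) = $url =~ m{/page(\d+)};
    return $number;
}

my $page = page_number("http://www.domain.com/sub/name/anothername/page2/");
print "page number: $page\n";
```

With a single "pageN" segment, as in the URLs from the question, both forms give the same result; the non-greedy form only matters when "/pageN" can appear more than once.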
      This worked! Thank you.

        Hi, This is just my two cents. I tend to write very verbose code (sorry in advance). Most of what I point out here may be obvious.

        $current_url = "http://www.domain.com/sub/name/anothername/page4/";
        print "1. $current_url\n";
        $save = $current_url;

        # Adding "|| die" will alert you to a parse problem.
        $current_url =~ s/.*?\/(page\d+).*/$1/ || die "Can't parse $current_url\n";
        print "$current_url\n";

        # I personally like to do this. It's a lot more code, but it allows
        # you to recover from a parse error
        # (or ignore URLs that don't match your expected format).
        $current_url = $save;
        print "2. $current_url\n";
        if ($current_url =~ /^http[s]{0,1}:\/\/.+\/page(\d+).*$/i) {
            $current_url = $1;
            print "Decimal page number is: $current_url\n";
        }
        else {
            print "parse error on $current_url\n";
            # Do what you may with this issue, but you
            # know it happened...
        }
Re: Stripping part of URL
by Anonymous Monk on Mar 24, 2015 at 02:04 UTC
Re: Stripping part of URL
by jeffa (Bishop) on Mar 24, 2015 at 15:29 UTC

    Just another way to do it:

    use strict;
    use warnings;
    use URI;

    my $uri  = URI->new( $ENV{REQUEST_URI} );
    my $page = ($uri->path_segments)[-2];

    The problem with this code is that the return value of URI::path_segments() depends on whether the URI has a trailing slash. You could make it more robust, at the cost of some added complexity, by using map to drop the empty segments that URI::path_segments() returns:

    my $page = (map $_ ? $_ : (), $uri->path_segments)[-1];
    Kinda defeats the purpose of using methods from a module however. ;)
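As a rough sketch of the trailing-slash issue, the following compares the same URL with and without the final slash. The last_segment() helper is a made-up name for illustration, and it filters with grep { length } instead of the map shown above, which is a swapped-in variant that also keeps a segment consisting of "0". It assumes the URI module from CPAN is installed:

```perl
use strict;
use warnings;
use URI;    # CPAN module, not part of core Perl

# Hypothetical helper: return the last non-empty path segment of a URL.
# path_segments() yields an empty string for the leading slash and another
# one for a trailing slash, so empty segments are dropped first.
sub last_segment {
    my ($url) = @_;
    my @segments = grep { length } URI->new($url)->path_segments;
    return $segments[-1];
}

for my $url (
    "http://www.domain.com/sub/name/anothername/page2/",
    "http://www.domain.com/sub/name/anothername/page2",
) {
    # Raw segment count differs by one depending on the trailing slash.
    my @raw = URI->new($url)->path_segments;
    printf "%s\n  raw segments: %d, last non-empty: %s\n",
        $url, scalar @raw, last_segment($url);
}
```

Both URLs give the same last non-empty segment, even though the raw lists differ in length, which is why indexing with a fixed [-2] only works when the trailing slash is present.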

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)