Regarding your original post, the replies given so far do solve the problem you mentioned.

If the behavior is still not what you expected, you will need to tell us more, because we cannot guess what behavior you expect.

You say the first and third regexps work. Let me show you that the second also works, and that the returned value is '0', just as you want (unless you really mean 'zero' and not '0').

    sub test_url {
        my ( $s, $server ) = @_;
        # return 1;  # Ok to index/spider
        # return 0;  # No, don't index or spider;

        # ignore any common image files
        return 0 if $s =~ /\.(gif|jpg|jpeg|png)?$/;

        # ignore directory listing sorting links
        return 0 if $s =~ /\?(C=N;O=D|C=M;O=A)$/;

        # make sure that the path is limited to the docs path
        return $s =~ m[^/starteam_area/];
    }

    my $res;
    $res = test_url('http://someurl.com/?C=N;O=D');
    print "returned value was - ".$res."\n";
    $res = test_url('http://someurl.com/?C=M;O=A');
    print "returned value was - ".$res."\n";
    $res = test_url('http://someurl.com/');
    print "returned value was - ".$res."\n";
    $res = test_url('http://someurl.com/?C=X;O=A');
    print "returned value was - ".$res."\n";

    ----
    #output
    returned value was - 0
    returned value was - 0
    returned value was - 
    returned value was - 

Note that I replaced $uri with $s because I don't know what kind of structure $uri is.
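A side note on the two blank values at the end of that output: those calls fall through to the final `return`, and a failed match in Perl yields the empty string rather than '0'. If a literal 0 matters to the caller, the result can be forced to 1 or 0. The sketch below is only an illustration; `in_docs_path` is a made-up helper name, not something from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): the same final check
# as test_url, but normalized to an explicit 1 or 0.
sub in_docs_path {
    my ($s) = @_;
    return $s =~ m[^/starteam_area/] ? 1 : 0;
}

# A failed match in scalar context gives Perl's false value,
# which stringifies to '' and is why the output above looks blank.
my $raw = ( 'http://someurl.com/' =~ m[^/starteam_area/] );
print "raw match result: '", $raw, "'\n";

print "normalized: ", in_docs_path('http://someurl.com/'), "\n";
print "normalized: ", in_docs_path('/starteam_area/index.html'), "\n";
```

Whether the caller actually distinguishes '' from 0 depends on how it tests the return value; a plain boolean test treats both as false.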

Re^4: Difficult? regex
by Anonymous Monk on Feb 22, 2008 at 15:43 UTC
    Ah, your last sentence is what led me to the bug. $uri->path from my example returns the URL without the URL parameters (i.e. without everything after the question mark). I didn't discover this until I created some tests similar to the one you posted. Good regex, bad input. Anyway, I ended up using the following regex (from one of the answers) because it is what I was eventually aiming for:
    /\?(C=[NMSD];O=[AD])$/
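A minimal sketch of the "good regex, bad input" problem described above, for anyone who lands here later. It does not use a real URI object: the `split` is only a stand-in for what `$uri->path` hands back (nothing from the question mark onward), and `is_sort_link` is a name invented for this example. If `$uri` really is a URI object, `$uri->path_query` should be the method that keeps the query string, but treat that as an assumption to check against your own setup.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the regex chosen above.
sub is_sort_link {
    my ($s) = @_;
    return $s =~ /\?(C=[NMSD];O=[AD])$/ ? 1 : 0;
}

my $url = 'http://someurl.com/docs/?C=N;O=D';

# Stand-in for the path-only input: keep everything before the '?'.
# (Simplified: scheme and host are left in, which does not affect the match.)
my ($without_query) = split /\?/, $url, 2;

for my $candidate ( $without_query, $url ) {
    printf "%-35s %s\n", $candidate,
        is_sort_link($candidate) ? 'sort link, skip' : 'no match';
}
```

The regex can never match the first candidate, because the query string it looks for has already been stripped from the input.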
    Thank you to everyone for your help.