Why is this greedy matching?

awohld has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Why is this greedy matching? by toolic (Bishop) on Mar 23, 2012 at 14:41 UTC
File::Basename is easier:

    use warnings;
    use strict;
    use File::Basename;

    my $file = 'http://txs.corp.com:8080/area/es/2215.csv.gz';
    my $gzip = basename($file, qr/\.gz/);
    print "$gzip\n";

    __END__
    2215.csv.gz

Demystify regular expressions by installing and using the CPAN module `YAPE::Regex::Explain` (Tip #9 from Basic debugging checklist):

    The regular expression:

    (?-imsx:/(.*?\.gz)$)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      /                        '/'
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        .*?                      any character except \n (0 or more times
                                 (matching the least amount possible))
    ----------------------------------------------------------------------
        \.                       '.'
    ----------------------------------------------------------------------
        gz                       'gz'
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
      $                        before an optional \n, and the end of the
                               string
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------
Re: Why is this greedy matching? by BrowserUk (Patriarch) on Mar 23, 2012 at 15:10 UTC
If you don't want to capture any slashes, say that!: `print 'http://txs.corp.com:8080/area/es/2215.csv.gz' =~ m[ ( [^/]+ $ ) + ]x;;` prints `2215.csv.gz`
Re: Why is this greedy matching? (leftmost) by tye (Sage) on Mar 23, 2012 at 14:48 UTC
"First match" (leftmost) trumps "shortest match" (anti-greedy). You can use a greedy .* to replace "leftmost (sub)match" with "rightmost (sub)match" to end up with "shortest match" in cases like this: `m{.*/(.*\.gz)$}` Note that it doesn't matter whether you make the part inside the parens greedy or not here. - tye
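A minimal sketch of the point above, assuming the original pattern was `/(.*?\.gz)$` (as the YAPE::Regex::Explain output in this thread suggests). The variable names `$lazy` and `$anchored` are only illustrative. It compares the non-greedy capture on its own with the same capture behind a greedy `.*` prefix:

```perl
use strict;
use warnings;

my $url = 'http://txs.corp.com:8080/area/es/2215.csv.gz';

# Non-greedy capture alone: the engine still begins at the leftmost '/'
# (the first slash of 'http://'), so the capture runs from there to the end.
my ($lazy) = $url =~ m{/(.*?\.gz)$};

# Greedy prefix: '.*' consumes as much as it can and backtracks only as far
# as needed, so the literal '/' ends up matching the last slash in the string.
my ($anchored) = $url =~ m{.*/(.*\.gz)$};

print "lazy:     $lazy\n";      # /txs.corp.com:8080/area/es/2215.csv.gz
print "anchored: $anchored\n";  # 2215.csv.gz
```

The non-greedy quantifier only shortens the match from a given starting position. It never moves the starting position to the right, which is why the greedy prefix is what changes the result.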
Re: Why is this greedy matching? by ww (Archbishop) on Mar 23, 2012 at 14:35 UTC
...perhaps because the deathstar in your capture matches everything after the second "`/`" in "`http://`"
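One way to see where the match actually begins is to print the offsets Perl records in `@-`. This is only a sketch, again assuming the pattern in question was `/(.*?\.gz)$`; `$match_start` and `$capture_start` are illustrative names:

```perl
use strict;
use warnings;

my $url = 'http://txs.corp.com:8080/area/es/2215.csv.gz';

my ($match_start, $capture_start);
if ($url =~ m{/(.*?\.gz)$}) {
    # @- holds start offsets: $-[0] for the whole match, $-[1] for group 1.
    ($match_start, $capture_start) = ($-[0], $-[1]);
    print "match starts at offset $match_start, capture at $capture_start\n";
    print "capture: $1\n";
}
```

For this URL the whole match starts at offset 5 (the first slash of `http://`) and the capture at offset 6 (the second slash), so `$1` is `/txs.corp.com:8080/area/es/2215.csv.gz`.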
Re: Why is this greedy matching? by JavaFan (Canon) on Mar 23, 2012 at 20:29 UTC
Leftmost trumps non-greedy.