in reply to This regular expression has me stumped

Not every problem is best solved with a big fat regex.

while (my $line = <FILE>) {
    my @files = map { m!/([\w\.\-]+)\W*$!; $1 } grep { m!/! } split ' ', $line;
    # blah
}

The logic goes: split on whitespace, ignore all tokens that don't contain the file path separator / using grep, then get the last bit after the final / up to the end (allowing optional trailing \W*) using map. The character class [\w\.\-] should match most filenames. Normally I would use [^/], but that is problematic in this case because it would also swallow the trailing punctuation. Should work on your data as described.
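To make the split / grep / map chain concrete, here is a minimal self-contained sketch. The log line is made up for illustration, and the ternary in the map block is a small variation of mine (it drops tokens that fail to match), not part of the snippet above.

```perl
use strict;
use warnings;

# Hypothetical log line, invented for this example.
my $line = 'file /user/name/some/path/to/filename@@ dumped: replaced /user/name/blah/blah/other.txt,';

my @files =
    map  { m!/([\w.\-]+)\W*$! ? $1 : () }   # last path component, minus trailing punctuation
    grep { m!/! }                           # keep only tokens containing a path separator
    split ' ', $line;                       # tokens are separated by whitespace

print "$_\n" for @files;
```

On this sample line the two path-like tokens should come out as filename and other.txt, with the trailing @@ and comma stripped.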

Re^2: This regular expression has me stumped
by tsk1979 (Scribe) on May 01, 2008 at 09:08 UTC
    I was hung up on regexp because I want this via a command line perl -nei.bak..... I checked my log files: /blah/blah/blah/filename can be followed by whitespace, a "@", a "," or a ":". I have searched for perl non greedy and I suspect
    /.*?[@:,\s+]/
    will actually match the whole /blah/blah/blah/filename.ext. The problem here is, how to retain the filename...?

      You almost never want .* A negated character class is generally better. For example m!/([^/]+)$! will grab the last bit of the filepath reliably, but the regex posted above in the map should DWIM.
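      A quick sketch of the difference, using made-up tokens. It also shows why the plain negated class is awkward for this particular data: everything after the last slash is kept, trailing junk included.

```perl
use strict;
use warnings;

# Made-up sample tokens; the second carries trailing punctuation.
my @tokens = ('/blah/blah/blah/filename.ext', '/blah/blah/blah/filename.ext@@');

my @report;
for my $token (@tokens) {
    my ($greedy)  = $token =~ m!/(.*)$!;      # .* starts right after the FIRST slash
    my ($negated) = $token =~ m!/([^/]+)$!;   # [^/]+ can only span the last component
    push @report, [ $token, $greedy, $negated ];
    print "$token\n  dot-star: $greedy\n  negated:  $negated\n";
}
```

      The dot-star capture keeps the whole path after the first slash, while the negated class yields only the last component, which for the second token still ends in @@.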

      You could certainly code the example above as a one liner but it seems a waste of time to me. You can make a reusable 4 line script in less time than it will take fiddling. You can put options like -p -F -n on the shebang. As a one liner it would be like:

      perl -ane 'print map{"$_\n"} map{ } grep { } @F' <file>

      where the map and grep blocks are as above.

      Picking up with the theme you were following, I got this to work. I haven't thought a lot about corner cases, performance or reusability, so Grandfather's and tachyon-II's solutions are probably better.

      Update: apparently I'm just confused on this matter, and commenting out the second s/// had no effect. Original remark: I didn't like doing the substitution twice just to get the end-of-line anchor to work. Perhaps some wiser monks can explain that to me. Update: that was before I added chomp, so never mind.

      #!/usr/bin/perl -W
      $\="\n";
      use strict;
      use warnings;
      while (<DATA>) {
          chomp;
          print $_;
          s/\/(?:[^\@:,\s+]*\/)(.*?)[\@:,\s+]*/\/new\/path\/$1/g;
          #s/\/(?:[^\@:,\s+]*\/)(.*?)[\@:,\s+]*$/\/new\/path\/$1/g;
          print $_;
          print '';
      }
      # produces:
      # C:\chas_sandbox>
      # 683879resp.pl
      # file /user/name/some/path/to/filename@@ dumped: replaced /user/name/blah/blah/filename
      # file /new/path/filename@@ dumped: replaced /new/path/filename
      #
      # @@@@user/some/file/filename.sdc: dumped
      # @@@@user/new/path/filename.sdc: dumped
      __DATA__
      file /user/name/some/path/to/filename@@ dumped: replaced /user/name/blah/blah/filename
      @@@@user/some/file/filename.sdc: dumped


      #my sig used to say 'I humbly seek wisdom. '. Now it says:
      use strict;
      use warnings;
      I humbly seek wisdom.
A no-op in this map block: was Re^2: This regular expression has me stumped
by Narveson (Chaplain) on May 01, 2008 at 22:59 UTC

    No need for the ; $1 in map { m!/([\w\.\-]+)\W*$!; $1 }

    since a match in list context returns the captured substrings and the block of a map is in list context.
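    A small sketch of that behaviour, with made-up tokens. In list context a successful match returns its captures and a failed match returns the empty list, so non-matching tokens simply drop out of the map result.

```perl
use strict;
use warnings;

# Made-up tokens; the last one has a slash but no filename after it.
my @tokens = ('/a/b/report.log:', '/x/y/data.csv', 'dir/');

# No explicit $1 needed: the match itself supplies the list element(s).
my @names = map { m!/([\w.\-]+)\W*$! } @tokens;

print scalar(@names), " names: @names\n";   # prints "2 names: report.log data.csv"
```

    One thing to keep in mind: this relies on the pattern having a capture group. Without one, a successful match in list context returns (1) rather than any matched text.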

      Good point. It was a rather off-the-cuff, untested solution....