in reply to This regular expression has me stumped

For this task a little looking around helps, as does knowing what not to find. Oh, and taking care of loose ends helps too. Consider:

use strict;
use warnings;

my @tests = (
    "First: /home/user/blah/filename and /home/user/blah/filename2 end",
    "/home/user/blah/filename,/home/user/blah/filename2",
    "/home/user/blah/filename; /home/user/blah/filename2",
    "/home/user/blah/filename\@10:30 /home/user/blah/filename2",
);

for my $str (@tests) {
    $str =~ s!(?:^|/)[^\s,@;]*(?<=/)([^\s,@;]+?)(?=[\s,@;]|$)!$1!g;
    print "$str\n";
}

Prints:

First: filename and filename2 end
filename,filename2
filename; filename2
filename@10:30 filename2

Perl is environmentally friendly - it saves trees

Re^2: This regular expression has me stumped
by tsk1979 (Scribe) on May 01, 2008 at 10:18 UTC
    Hmm, your solution looks like it's working! Great. Now the big problem: I cannot make head or tail of the regexp :( Could you explain a little about what exactly happens up there? It made a whooshing sound and flew right by :)

      :-D

      Ok, let's take it a little at a time:

      s! you know, although it's possible you didn't know you can use pretty much any character for the expression delimiters.

      (?:^|/) matches (without capturing) either the start of the string or a /.

      [^\s,@;]* matches as many characters that aren't in the set of terminal characters as can be found.

      (?<=/) looks back and asserts the last character matched was /.

      ([^\s,@;]+?) matches and captures as few non-terminal characters as it can and still find a match. That's the filename that you want.

      (?=[\s,@;]|$) looks ahead and asserts that the next character is a terminal character or the end of the string.

      !$1!g you are probably completely familiar with - replace all the matched stuff with the captured string and do it for every match that can be found.

      So with a little head scratching, the introductory line of my initial reply might make more sense, along with the regex. For further study consult perlretut, perlre and perlreref.
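
The walkthrough above can also be written into the regex itself. What follows is only a sketch: it is the same pattern as above, laid out with the /x modifier so that each piece carries its own comment. The names $path_re and strip_dirs are made up for this illustration and do not appear in the original reply.

```perl
use strict;
use warnings;

# Same pattern as above, spread out with /x so each piece can carry a comment.
# $path_re and strip_dirs are hypothetical names used only for this sketch.
my $path_re = qr{
    (?: ^ | / )        # start of string or a slash, not captured
    [^\s,@;]*          # leading directories: anything but a terminator
    (?<= / )           # the previous character must be a slash
    ( [^\s,@;]+? )     # capture the file name, as short as possible
    (?= [\s,@;] | $ )  # next comes a terminator or the end of the string
}x;

# Replace every path in the string with its captured file name.
sub strip_dirs {
    my ($str) = @_;
    $str =~ s/$path_re/$1/g;
    return $str;
}

print strip_dirs("/home/user/blah/filename; /home/user/blah/filename2"), "\n";
# prints: filename; filename2
```

One thing to watch with /x: variables are still interpolated inside the comments, so it is safest to keep $ and @ out of them.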


      To supplement GrandFather's excellent explanation, here is a script that prints the explanation generated by YAPE::Regex::Explain.
      use warnings;
      use strict;
      use YAPE::Regex::Explain;

      # Pass only the pattern; the s!...!$1!g wrapper is not part of the regex.
      my $re = '(?:^|/)[^\s,@;]*(?<=/)([^\s,@;]+?)(?=[\s,@;]|$)';
      my $parser = YAPE::Regex::Explain->new($re);
      print $parser->explain;
Re^2: This regular expression has me stumped
by tsk1979 (Scribe) on May 02, 2008 at 05:57 UTC
    I found a corner case... :) How about ../filename or ../some/path/filename or ../../some/path/filename?
      Another one /some/silly/path/here/../../another/silly/path/filename
Okay, I know why it is failing
by tsk1979 (Scribe) on May 02, 2008 at 06:16 UTC
    this can work right ? /fjsdklf/fjsldkfs/fsjdklf-fs-0-fsf/../fjskfjs/../../../fsfkslf/filename ../../../../filename ../hello dofghello/two/forut/../filename2
    Will this work ../../../../jfsdfjskdlfjs/../fjsklf/fjksfjskflsd/filename
    I will do replacement for ../filename
    this can work right /fjsdklf/fjsldkfs/fsjdklf-fs-0-fsf/../fjskfjs/../../../fsfkslf/filename
    I will think of even/more/silly/../../harder/cases/../analysis/filename and do it ../twice as well as put/some/path/and/make/it/thrice
    We always assume that the whole path starts with /. But the path can also be some/path/to/filename! In that case this will definitely fail. I am scratching my head as to what kind of check to put in for that. Helllp!! :)
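
To make the corner case concrete, here is a small sketch that runs the pattern from the first reply on relative paths. The helper name strip_dirs_v1 is hypothetical. As far as I can tell, a relative path at the very start of the string still works, because ^ anchors the match there; the trouble shows up when the path appears mid-line.

```perl
use strict;
use warnings;

# The pattern from the first reply, wrapped in a hypothetical helper.
sub strip_dirs_v1 {
    my ($str) = @_;
    $str =~ s!(?:^|/)[^\s,@;]*(?<=/)([^\s,@;]+?)(?=[\s,@;]|$)!$1!g;
    return $str;
}

# At the start of the string, ^ lets the match begin before "some".
print strip_dirs_v1("some/path/to/filename"), "\n";      # prints: filename

# Mid-line the match can only begin at the first slash, so whatever
# precedes that slash ("some" or "..") is left behind.
print strip_dirs_v1("see some/path/to/filename"), "\n";  # prints: see somefilename
print strip_dirs_v1("see ../filename"), "\n";            # prints: see ..filename
```
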
      Solved!
      use strict;
      use warnings;

      foreach my $file (@ARGV) {
          # Three-argument open with a lexical filehandle; report why it failed.
          open(my $in, '<', $file) or die "Cannot open input file '$file': $!\n";
          while (<$in>) {
              s!(?:^|\w*/|\.\./)[^\s,@;:]*(?<=/)([^\s,@;:]+?)(?=[\s,@;:]|$)!$1!g;
              # s!\.\.!!g;
              print;
          }
          close $in;
      }

        Hmm, so that's better than a split/grep/map solution? Why try to do something with one very complicated regex (well you use two) when breaking the task down into small chunks can make it easy to do, easy to understand, easy to debug, easy to maintain.....
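
For comparison, a minimal sketch of what breaking the task into small chunks might look like. This is an assumption about the kind of split/map approach meant here, not code from this thread, and basenames_in_line is a hypothetical name. It splits the line on the same terminator characters used in the regex above, keeps the separators, strips everything up to the last slash from each piece, and glues the line back together.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the last path component of each token.
sub basenames_in_line {
    my ($line) = @_;
    return join '',
        map  { my $t = $_; $t =~ s{^.*/}{}; $t }  # drop everything up to the last slash
        split /([\s,@;:]+)/, $line;               # capturing group keeps the separators
}

print basenames_in_line("../../some/path/filename and some/path/to/filename2"), "\n";
# prints: filename and filename2
```

Because each token is handled on its own, absolute, relative and ../ paths all go through the same one-line substitution. One difference from the regex: a token that ends in a slash is reduced to an empty string here.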