$ perl -E 'my $s="ME170-5/2/8-ME172-2/2/6-ME4028"; while ($s=~m|(\d{1,2})/(\d{1,2})/(\d{1,2})|g){++$x;say "$x Found $1-$2-$3:" . pos($s)}'
I have changed the code, and now I get the index I need:
$ perl -E 'my $s="ME170-5/2/8-ME172-2/2/6-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . (index($s,$1)+1)}'
Thank you,
Avi

Re^3: Extract a pattern from a string
by NetWallah (Canon) on Jun 10, 2012 at 14:46 UTC
    This version is slightly more efficient:
    perl -E 'my $s="ME170-5/2/8-ME172-2/2/6-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . " POS=" . (pos($s)-length($1)+1)}'
    The reason is that 'index' re-scans the string from the start, while 'pos' returns an offset the regex engine has already stored, and computing 'length' is much cheaper than scanning.
    Update: It also does not suffer from the bug ambrus (++) points out below.
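
To make the difference concrete, here is a minimal sketch that collects both offsets side by side. The test string (the same port repeated) and the array names are made up for illustration. Note that pos($s)-length($1) only gives the start of the capture because the capture group spans the entire match.

```perl
use strict;
use warnings;

# Illustrative input: the same port number appears twice.
my $s = "ME170-5/2/8-ME172-5/2/8-ME4028";
my (@by_pos, @by_index);

while ($s =~ m{(\d{1,2}/\d{1,2}/\d{1,2})}g) {
    # pos($s) is the offset just past the match; step back by its length.
    push @by_pos, pos($s) - length($1) + 1;
    # index() searches from the start, so it always finds the first copy.
    push @by_index, index($s, $1) + 1;
}

print "pos/length: @by_pos\n";     # pos/length: 7 19
print "index:      @by_index\n";   # index:      7 7
```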

                 I hope life isn't a big joke, because I don't get it.
                       -SNL

      Hi,
      I wrote these two versions based on your suggestions, for the case where the same port number appears several times in the string.
      Which one would be faster/wiser to use?
      $s="ME170-5/2/8-ME172-5/2/8-ME4028ME172-5/2/8-ME196-5/2/8-ME4002"; while ($s=~m/(\d{1,2}\/\d{1,2}\/\d{1,2})/g) {++$r;print "$r Found $1:" .(pos($s)-length($1)+1) .$nl;}
      _OUTPUT_
      1 Found 5/2/8:7
      2 Found 5/2/8:19
      3 Found 5/2/8:37
      4 Found 5/2/8:49
      $s="ME170-5/2/8-ME172-5/2/8-ME4028ME172-5/2/8-ME196-5/2/8-ME4002"; while ($s=~m/(\d{1,2}\/\d{1,2}\/\d{1,2})/g) {++$e;print "$e Found $1:" .($-[1]+1) .$nl;}
      _OUTPUT_
      1 Found 5/2/8:7
      2 Found 5/2/8:19
      3 Found 5/2/8:37
      4 Found 5/2/8:49
      Thank you
      Avi
        Brilliant! $-[1] is already computed by the regex engine (as part of @-), so yes, it is more efficient than pos+length.
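
For reference, a small sketch of what @- and @+ hold after each match. The pattern here is a variation invented for illustration: it also matches the "ME...-" label in front of the port, so that the capture group no longer spans the whole match. In that situation pos($s)-length($1) would still work only because the group happens to end where the match ends, whereas $-[1] is the start of group 1 regardless of what surrounds it.

```perl
use strict;
use warnings;

my $s = "ME170-5/2/8-ME172-5/2/8-ME4028";
my @rows;

while ($s =~ m{ME\d+-(\d{1,2}/\d{1,2}/\d{1,2})}g) {
    # $-[0]/$+[0]: start/end of the whole match.
    # $-[1]/$+[1]: start/end of capture group 1.
    # pos($s) equals $+[0], the end of the whole match.
    push @rows, sprintf "%s match=%d..%d group=%d..%d pos=%d",
        $1, $-[0], $+[0], $-[1], $+[1], pos($s);
}

print "$_\n" for @rows;
# 5/2/8 match=0..11 group=6..11 pos=11
# 5/2/8 match=12..23 group=18..23 pos=23
```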

                     I hope life isn't a big joke, because I don't get it.
                           -SNL

Re^3: Extract a pattern from a string
by ambrus (Abbot) on Jun 10, 2012 at 22:14 UTC

    Beware of using index this way: it is incorrect, because it won't give you the offset you want if the substring appears more than once in the input. E.g.

    $ perl -E 'my $s="ME170-5/2/10-ME172-5/2/10-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . (index($s,$1)+1)}'
    1 Found 5/2/10:7
    2 Found 5/2/10:7
    $ perl -E 'my $s="ME170-5/2/10-ME172-5/2/1-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . (index($s,$1)+1)}'
    1 Found 5/2/10:7
    2 Found 5/2/1:7

    Instead, if you really want to know the offsets, then use either the pos or the @- match variable to find where the regular expression has matched:

    $ perl -E 'my $s="ME170-5/2/10-ME172-5/2/10-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . $-[1]}'
    1 Found 5/2/10:6
    2 Found 5/2/10:19
    $ perl -E 'my $s="ME170-5/2/10-ME172-5/2/1-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . $-[1]}'
    1 Found 5/2/10:6
    2 Found 5/2/1:19

    However, maybe you don't want to know the positions at all, but instead match the port numbers and dates with a single regular expression that has two captures.
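
A rough sketch of that idea follows. The earlier messages of the thread are not shown here, so which second field is wanted is a guess: this assumes it is the "ME<digits>" label directly in front of each port. Under that assumption both values come out of one match and no offsets are needed.

```perl
use strict;
use warnings;

my $s = "ME170-5/2/10-ME172-5/2/1-ME4028";
my @pairs;

# Assumption: every port is preceded by a label of the form ME<digits>-
# Group 1 captures the label, group 2 the port.
while ($s =~ m{(ME\d+)-(\d{1,2}/\d{1,2}/\d{1,2})}g) {
    push @pairs, "$1 => $2";
}

print "$_\n" for @pairs;
# ME170 => 5/2/10
# ME172 => 5/2/1
```

The trailing ME4028 has no port after it, so it is simply not matched.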

    Also, those newlines and plus signs inside the braces are just a mistake you made when pasting here, right?

      Hi,
      Thank you for the warning. I hit exactly that problem during a run, when the same port number appeared more than once in a line.
      I'll modify my script again using your input.
      Thank you,
      Avi
      P.S. All those newlines and plus signs inside the braces are not mine; they were added when the code was pasted.