Re: Extract a pattern from a string

Replies are listed 'Best First'.
Re^2: Extract a pattern from a string by AnomalousMonk (Archbishop) on Jun 10, 2012 at 17:42 UTC
I used the "|" as regex delimiter, to avoid the "leaning toothpicks" syndrome - i.e. to avoid having to escape the "/". But doing that to avoid LTS puts you in danger of succumbing to STD (Straight Toothpick Distemper) the first time you use an `|` alternation in your regex. Why not just use a pair of nesting delimiters, `{ }` for example, and be immunized against many of these pathologies? `$s =~ m{ (\d{1,2}) / (\d{1,2}) / (\d{1,2}) }xmsg`
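A minimal sketch of that point, assuming a hypothetical helper `find_tokens` and an added `ME\d+` alternative that is not in the original pattern: with `m{ }` and `/x`, both `/` and `|` can appear in the regex unescaped.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every d/d/d triple or MEnnn label.
# The ME\d+ alternative is only here to show an unescaped "|".
sub find_tokens {
    my ($s) = @_;
    my @found;
    while ( $s =~ m{ ( \d{1,2} / \d{1,2} / \d{1,2} | ME\d+ ) }xg ) {
        push @found, $1;
    }
    return @found;
}

print join( ' ', find_tokens('ME170-5/2/8-ME172-2/2/6-ME4028') ), "\n";
# prints: ME170 5/2/8 ME172 2/2/6 ME4028
```

The quantifier braces in `\d{1,2}` are balanced, so they nest inside the `m{ }` delimiters without escaping.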
Re^3: Extract a pattern from a string by NetWallah (Canon) on Jun 11, 2012 at 14:01 UTC
Agreed. (++) vaccines - your best shot at good health. (Immunization slogan). I hope life isn't a big joke, because I don't get it. -SNL
Re^2: Extract a pattern from a string by avim1968 (Acolyte) on Jun 10, 2012 at 11:34 UTC
`$ perl -E 'my $s="ME170-5/2/8-ME172-2/2/6-ME4028"; while ($s=~m|(\d{1,2})/(\d{1,2})/(\d{1,2})|g){++$x;say "$x Found $1-$2-$3:" . pos($s)}'` I have changed the code and now I get the index I need: `$ perl -E 'my $s="ME170-5/2/8-ME172-2/2/6-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . (index($s,$1)+1)}'` Thank you, Avi
Re^3: Extract a pattern from a string by NetWallah (Canon) on Jun 10, 2012 at 14:46 UTC
This version is slightly more efficient: `perl -E 'my $s="ME170-5/2/8-ME172-2/2/6-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . " POS=" . (pos($s)-length($1)+1)}'` The reason is that `index` re-scans the string, while `pos` uses an existing value, and the `length` computation is significantly faster than scanning. Update: It also does not suffer from the bug ambrus (++) points out below.
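The same `pos`/`length` arithmetic as a small script, assuming a hypothetical helper named `starts_via_pos`. It only works this way because the capture group spans the whole match, so `length($1)` is the length of the match.

```perl
use strict;
use warnings;

# Hypothetical helper: 1-based start position of every d/d/d match.
sub starts_via_pos {
    my ($s) = @_;
    my @starts;
    while ( $s =~ m{(\d{1,2}/\d{1,2}/\d{1,2})}g ) {
        # pos($s) is the offset just past the match; step back over it.
        push @starts, pos($s) - length($1) + 1;
    }
    return @starts;
}

print join( ',', starts_via_pos('ME170-5/2/8-ME172-2/2/6-ME4028') ), "\n";
# prints: 7,19
```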
Re^4: Extract a pattern from a string by avim1968 (Acolyte) on Jun 11, 2012 at 05:35 UTC
Hi, I have written these two versions based on what you wrote, for the case of multiple identical port numbers in the string. I need to know which would be faster/wiser to use. `$s="ME170-5/2/8-ME172-5/2/8-ME4028ME172-5/2/8-ME196-5/2/8-ME4002"; while ($s=~m/(\d{1,2}\/\d{1,2}\/\d{1,2})/g) {++$r;print "$r Found $1:" .(pos($s)-length($1)+1) .$nl;}` _OUTPUT_ `1 Found 5/2/8:7` `2 Found 5/2/8:19` `3 Found 5/2/8:37` `4 Found 5/2/8:49` `$s="ME170-5/2/8-ME172-5/2/8-ME4028ME172-5/2/8-ME196-5/2/8-ME4002"; while ($s=~m/(\d{1,2}\/\d{1,2}\/\d{1,2})/g) {++$e;print "$e Found $1:" .($-[1]+1) .$nl;}` _OUTPUT_ `1 Found 5/2/8:7` `2 Found 5/2/8:19` `3 Found 5/2/8:37` `4 Found 5/2/8:49` Thank you, Avi
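One way to answer the "which is faster" question on your own machine is the core `Benchmark` module. This is only a sketch: the sub names, the `compare_speed` wrapper and the iteration count are assumptions, and no timing result is claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $s = 'ME170-5/2/8-ME172-5/2/8-ME4028ME172-5/2/8-ME196-5/2/8-ME4002';

# Variant 1: end of match minus length of the capture.
sub offsets_via_pos {
    my ($str) = @_;
    my @out;
    while ( $str =~ m{(\d{1,2}/\d{1,2}/\d{1,2})}g ) {
        push @out, pos($str) - length($1) + 1;
    }
    return @out;
}

# Variant 2: start of capture group 1, read from @-.
sub offsets_via_match_start {
    my ($str) = @_;
    my @out;
    while ( $str =~ m{(\d{1,2}/\d{1,2}/\d{1,2})}g ) {
        push @out, $-[1] + 1;
    }
    return @out;
}

# Check that both variants agree before timing them.
print join( ',', offsets_via_pos($s) ),         "\n";    # 7,19,37,49
print join( ',', offsets_via_match_start($s) ), "\n";    # 7,19,37,49

# Hypothetical wrapper; the iteration count is an arbitrary choice.
sub compare_speed {
    cmpthese( 200_000, {
        pos_minus_length => sub { my @o = offsets_via_pos($s) },
        match_start      => sub { my @o = offsets_via_match_start($s) },
    } );
}
```

Call `compare_speed()` to print the comparison table. Any difference is likely to be small next to the cost of the match itself, so readability may be the better tie-breaker.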
Re^5: Extract a pattern from a string by NetWallah (Canon) on Jun 11, 2012 at 13:54 UTC
Re^6: Extract a pattern from a string by avim1968 (Acolyte) on Jun 14, 2012 at 05:15 UTC
Re^3: Extract a pattern from a string by ambrus (Abbot) on Jun 10, 2012 at 22:14 UTC
Beware of using `index` this way: it is incorrect, because it won't give you the offset you want if the substring appears more than once in the input. E.g. `$ perl -E 'my $s="ME170-5/2/10-ME172-5/2/10-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . (index($s,$1)+1)}'` prints `1 Found 5/2/10:7` and `2 Found 5/2/10:7`, and `$ perl -E 'my $s="ME170-5/2/10-ME172-5/2/1-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . (index($s,$1)+1)}'` prints `1 Found 5/2/10:7` and `2 Found 5/2/1:7`. Instead, if you really want to know the offsets, use either `pos` or the `@-` match variable to find where the regular expression has matched: `$ perl -E 'my $s="ME170-5/2/10-ME172-5/2/10-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . $-[1]}'` prints `1 Found 5/2/10:6` and `2 Found 5/2/10:19`, and `$ perl -E 'my $s="ME170-5/2/10-ME172-5/2/1-ME4028"; while ($s=~m|(\d{1,2}/\d{1,2}/\d{1,2})|g){++$x;say "$x Found $1:" . $-[1]}'` prints `1 Found 5/2/10:6` and `2 Found 5/2/1:19`. However, maybe you don't want to know the positions at all, but instead want to match the port numbers and dates with a single regular expression that has two captures. Also, those newlines and plus signs inside the braces are just a mistake you made when pasting here, right?
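The difference between the two approaches, side by side as a script. The sub names are hypothetical; offsets are 0-based, as in the `$-[1]` one-liners above.

```perl
use strict;
use warnings;

# index() searches from the start of the string every time,
# so a repeated substring always reports its FIRST occurrence.
sub offsets_via_index {
    my ($s) = @_;
    my @out;
    while ( $s =~ m{(\d{1,2}/\d{1,2}/\d{1,2})}g ) {
        push @out, index( $s, $1 );
    }
    return @out;
}

# $-[1] is where capture group 1 of the current match started.
sub offsets_via_group_start {
    my ($s) = @_;
    my @out;
    while ( $s =~ m{(\d{1,2}/\d{1,2}/\d{1,2})}g ) {
        push @out, $-[1];
    }
    return @out;
}

my $s = 'ME170-5/2/10-ME172-5/2/10-ME4028';
print join( ',', offsets_via_index($s) ),       "\n";    # 6,6
print join( ',', offsets_via_group_start($s) ), "\n";    # 6,19
```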
Re^4: Extract a pattern from a string by avim1968 (Acolyte) on Jun 11, 2012 at 04:03 UTC
Hi, thank you for the warning. I just hit that problem during a run, when I got the same port number more than once in a line. I'll modify my script again with your input. Thank you, Avi. P.S. All those newlines and plus signs inside the braces are not mine; they were added when the code was pasted.
Re^2: Extract a pattern from a string by avim1968 (Acolyte) on Jun 10, 2012 at 08:53 UTC
Hi, thank you very much. I understand now where I was wrong. I took your code as a base and modified it a bit, and it works great, except that the position is off by a few chars. Where do you index it from? I do have a question: why do the matches come out as $1=4 $2=2 $3=5 and not as a single substring "4/2/5"? Is there a way to get it like that? Thank you, Avi
Re^3: Extract a pattern from a string by Anonymous Monk on Jun 10, 2012 at 09:00 UTC
Because that is how the pattern was written: each () corresponds to a $n, so the first () is $1, the second is $2, and so on. Read http://perldoc.perl.org/perlintro.html#Parentheses-for-capturing, perlrequick and/or something from Tutorials.
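A short illustration of that rule, assuming a made-up input string and two hypothetical helpers: three pairs of parentheses give three captures, while one pair around the whole pattern gives the triple as a single substring.

```perl
use strict;
use warnings;

# Three groups: in list context the match returns ($1, $2, $3).
sub parts {
    my ($s) = @_;
    return $s =~ m{(\d{1,2})/(\d{1,2})/(\d{1,2})};
}

# One group around everything: the match returns ($1) only.
sub whole {
    my ($s) = @_;
    return $s =~ m{(\d{1,2}/\d{1,2}/\d{1,2})};
}

my $s = 'ME170-4/2/5-ME172';    # made-up sample input
my @p = parts($s);
my ($w) = whole($s);
print "@p\n";    # 4 2 5
print "$w\n";    # 4/2/5
```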
Re^4: Extract a pattern from a string by avim1968 (Acolyte) on Jun 10, 2012 at 10:01 UTC
Hi, thank you for your answer. I have made the needed changes and now I get the full pattern. Avi
Re^4: Extract a pattern from a string by avim1968 (Acolyte) on Jun 10, 2012 at 09:28 UTC
Hi, except that the position of the substring is off by a few chars. Where do you index it from? Avi
Re^5: Extract a pattern from a string by Anonymous Monk on Jun 10, 2012 at 09:38 UTC