Regular expressions and metacharacters

Jaya has asked for the wisdom of the Perl Monks concerning the following question:

I have a file and I need to match one line and extract it.

The line is as follows:

12..13 13..14 14..15 15..1 15..2 14..3 13..16 16..4 16..17 17..5 17..18 18..6 18..7 12..19 19..20 20..8 20..10 19..9 12..11

I tried every way I could think of to get that line, and I failed. The combinations I tried are:

  for(my $i = 0; $i <= $#filedata; ++$i){
    if($filedata[$i] =~ m/\d\.\.\d/){
       my $branch = $filedata[$i];}}
 print("$branch \n");

That was the last thing I tried. The others are:

if($filedata[$i] =~ m/\d..\d/)
if($filedata[$i] =~ m/\.\./)
if($filedata[$i] =~ m/\d+\.\.\d+/)
if($filedata[$i] =~ m/../)
if($filedata[$i] =~ m/\../)

Suggestions are welcome on how to get this line.

After getting this line I again need to separate it into 12..13, 13..14, 14..15, etc.

Replies are listed 'Best First'.
Re: Regular expressions and metacharacters by halley (Prior) on Apr 07, 2005 at 01:14 UTC
You didn't show us what should NOT be matched. As far as we know, `m/./` might be all that's needed to match the correct non-empty line. Or maybe only lines that contain exactly those numbers, and no others, should be accepted. Or perhaps negative numbers, decimal points, or fractions might be encountered. We'll just have to guess something out of thin air.

  for (@filedata) {
    next if not m/\d+\.\.\d+/;
    my @spans = m/(\d+\.\.\d+)/g;
    print "@spans\n";
  }

-- `[ e d @ h a l l e y . c c ]`

Re: Regular expressions and metacharacters by tlm (Prior) on Apr 07, 2005 at 01:21 UTC
Try

  my $re = qr(\d{1,2}\.\.\d{1,2});
  # $re = qr(\d+\.\.\d+)
  # is more general, and probably better
  # unless you want to be able to detect
  # deviations from the pattern you posted;
  # likewise, the space used in the full
  # regexp below may be needlessly specific;
  # you may want to change it to \s+ to allow
  # for variable-length whitespace between
  # the parts.

  $line =~ /^((?:$re )+$re)/;
  my $captured = $1;
  my @parts = split ' ', $captured;

Update: added line to collect the desired parts, the commented more general alternative for `$re`, and explanatory comments.

the lowliest monk

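The whitespace-tolerant variant mentioned in the comments above could look roughly like the sketch below. The helper name `extract_parts` is made up for illustration and is not from the thread. Two deliberate departures from the posted code: the result of the match is checked before `$1` is used, and `*` replaces `+` so that a line with a single `N..M` part is also accepted.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the "N..M" parts
# found at the start of a line, or an empty list if there are none.
sub extract_parts {
    my ($line) = @_;
    my $re = qr/\d+\.\.\d+/;    # the more general form of $re

    # \s+ instead of a single space tolerates tabs and repeated blanks.
    if ($line =~ /^((?:$re\s+)*$re)/) {
        return split ' ', $1;
    }
    return;                     # no match: empty list in list context
}

my @parts = extract_parts("12..13  13..14\t14..15");
print join(', ', @parts), "\n";
```

Checking the match first matters because `$1` keeps its value from an earlier successful match when the current one fails.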
Re: Regular expressions and metacharacters by inman (Curate) on Apr 07, 2005 at 10:50 UTC
"After getting this line I again need to separate and have 12..13, 13..14, 14..15 etc."

Try the following. Read in the file one line at a time using the outer while loop. For each line, run the inner while loop to extract all of the matches. The inner while tracks the progress of the pattern along the data because of the g modifier. The pattern itself captures the matched data (using parentheses), which can then be pushed onto an array as $1.

  #! /usr/bin/perl -w
  use strict;
  use warnings;

  my @data;

  while (<DATA>) {
    while (/(\d+\.\.\d+)/g) {
      push @data, $1;
    }
  }

  print "@data\n";

  __DATA__
  This is a file that contains lines like this
  12..13 13..14 14..15 15..1 15..2 14..3 13..16
  that we want to extract.
  16..4 16..17 17..5 17..18 18..6 18..7 12..19 19..20 20..8 20..10 19..9 12..11
  That was another one

Re: Regular expressions and metacharacters by ryantate (Friar) on Apr 07, 2005 at 01:15 UTC
If the line `if($filedata[$i] =~ m/../)` did not match, it is highly likely that `@filedata` does not contain what you think it contains, or much of anything at all, since two unescaped periods match any two characters (except, in some cases, a newline). Have you tried printing every line before the `if` statement to make sure the data is actually in `@filedata` as you expect? The final regex you settled on looks like it should work fine, by the way (although, as halley noted, it may give you false positives).

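To make the point about the dot concrete, here is a small sketch with made-up sample data; the helper names `dot_check` and `find_branch` are hypothetical. It also declares `$branch` outside the loop. That is an observation added here rather than something raised in the thread: in the posted code, `my $branch` is scoped to the `if` block, so the `print` after the loop does not see the value assigned inside it.

```perl
use strict;
use warnings;

# Hypothetical helper: reports whether a line matches the unescaped
# and the escaped form of the two-dot pattern.
sub dot_check {
    my ($line) = @_;
    my $any_two  = $line =~ m/../   ? 1 : 0;   # any two non-newline characters
    my $two_dots = $line =~ m/\.\./ ? 1 : 0;   # two literal periods
    return ($any_two, $two_dots);
}

# Hypothetical helper: returns the first line containing N..M, or undef.
sub find_branch {
    my @lines = @_;
    my $branch;                  # declared outside the loop so it outlives it
    for my $line (@lines) {
        if ($line =~ m/\d+\.\.\d+/) {
            $branch = $line;
            last;
        }
    }
    return $branch;
}

# Made-up sample data for illustration only.
my @filedata = ("ab\n", "12..13 13..14\n", "\n");
for my $line (@filedata) {
    my ($any_two, $two_dots) = dot_check($line);
    print "any_two=$any_two two_dots=$two_dots\n";
}
my $branch = find_branch(@filedata);
print "branch: $branch" if defined $branch;
```

Note that `find_branch` stops at the first matching line, whereas the loop in the question would keep the last one; which is wanted depends on the file.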
Re: Regular expressions and metacharacters by lidden (Curate) on Apr 08, 2005 at 02:15 UTC
You were not clear on what you did not want to match, but maybe something like this is what you want.

  my @filedata = (
    '24..4 foo bar',
    '12..13 13..14 14..15 15..1 15..2 14..3 13..16 16..4 16..17 17..5 17..18 18..6 18..7 12..19 19..20 20..8 20..10 19..9 12..11',
    'junk');

  my @result;
  for my $line (@filedata){
    if($line =~ m/^ (?: \s* \d+ \. \. \d+ \s* )+ $/x ){
      @result = split ' ', $line;
      last;
    }
  }

  print join ', ', @result;