Jaya has asked for the wisdom of the Perl Monks concerning the following question:

I have a file and I need to match one line and extract it.

The line is as follows:

12..13 13..14 14..15 15..1 15..2 14..3 13..16 16..4 16..17 17..5 17..18 18..6 18..7 12..19 19..20 20..8 20..10 19..9 12..11

I tried every way I could think of to get that line, and I failed. The combinations I tried are:

    for(my $i = 0; $i <= $#filedata; ++$i){
        if($filedata[$i] =~ m/\d\.\.\d/){
            my $branch = $filedata[$i];
        }
    }
    print("$branch \n");

That was the last thing I tried; the others are:

    if($filedata[$i] =~ m/\d..\d/)
    if($filedata[$i] =~ m/\.\./)
    if($filedata[$i] =~ m/\d+\.\.\d+/)
    if($filedata[$i] =~ m/../)
    if($filedata[$i] =~ m/\../)

Suggestions are welcome on how to get this line.

After getting this line I again need to separate it and have 12..13, 13..14, 14..15 etc.

Replies are listed 'Best First'.
Re: Regular expressions and metacharacters
by halley (Prior) on Apr 07, 2005 at 01:14 UTC
    You didn't show us what should NOT be matched. As far as we know, m/./ might be all that's needed to match the correct non-empty line. Or maybe only lines that give exactly those numbers and no other possible numbers should be accepted. Or perhaps negative numbers, decimal points, or fractions might be encountered. We'll just have to guess something out of thin air.
    for (@filedata) {
        next if not m/\d+\.\.\d+/;
        my @spans = m/(\d+\.\.\d+)/g;   # all spans on this line, in list context
        print "@spans\n";
    }

    --
    [ e d @ h a l l e y . c c ]

Re: Regular expressions and metacharacters
by tlm (Prior) on Apr 07, 2005 at 01:21 UTC

    Try

    my $re = qr(\d{1,2}\.\.\d{1,2});
    # $re = qr(\d+\.\.\d+)
    # is more general, and probably better
    # unless you want to be able to detect
    # deviations from the pattern you posted;
    # likewise, the space used in the full
    # regexp below may be needlessly specific;
    # you may want to change it to \s+ to allow
    # for variable-length whitespace between
    # the parts.

    $line =~ /^((?:$re )+$re)/;
    my $captured = $1;
    my @parts = split ' ', $captured;

    Update: added a line to collect the desired parts, the commented, more general alternative for $re, and explanatory comments.
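
    The variant described in the comments above could look roughly like the sketch below. It uses the more general \d+ form and \s+ between the parts. The sample $line is made up for illustration, and, like the original pattern, this still requires at least two parts on the line.

```perl
use strict;
use warnings;

# Hypothetical sample line with irregular whitespace between the parts.
my $line = "12..13   13..14\t14..15 15..1";

# The more general alternative: any number of digits on each side.
my $re = qr/\d+\.\.\d+/;

my @parts;
if ($line =~ /^((?:$re\s+)+$re)/) {
    # split ' ' splits on any run of whitespace.
    @parts = split ' ', $1;
}
print join(', ', @parts), "\n";
```

    Checking the result of the match before using $1 avoids picking up a stale capture from an earlier successful match.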

    the lowliest monk

Re: Regular expressions and metacharacters
by inman (Curate) on Apr 07, 2005 at 10:50 UTC
    After getting this line I again need to separate it and have 12..13, 13..14, 14..15 etc.

    Try the following. Read in the file one line at a time using the outer while loop. For each line run the inner while loop to extract all of the matches.

    The inner while tracks the progress of the pattern along the data because of the g modifier. The pattern itself captures the matched data (using parentheses) which can then be pushed onto an array as $1.

    #! /usr/bin/perl -w
    use strict;
    use warnings;

    my @data;
    while (<DATA>) {
        while (/(\d+\.\.\d+)/g) {
            push @data, $1;
        }
    }
    print "@data\n";

    __DATA__
    This is a file that contains lines like this
    12..13 13..14 14..15 15..1 15..2 14..3 13..16
    that we want to extract.
    16..4 16..17 17..5 17..18 18..6 18..7 12..19 19..20 20..8 20..10 19..9 12..11
    That was another one

Re: Regular expressions and metacharacters
by ryantate (Friar) on Apr 07, 2005 at 01:15 UTC
    If the line if($filedata[$i] =~ m/../) did not match, it is highly likely that @filedata does not contain what you think it contains, or much of anything at all, since two unescaped periods match any two characters (except, in some cases, a newline). Have you tried printing every line before the if statement to make sure the data is actually in @filedata as you expect?
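
    To illustrate the difference between unescaped and escaped dots, here is a small sketch. The contents of @filedata are invented for the example and are not the original poster's data.

```perl
use strict;
use warnings;

# Hypothetical sample data, not the original poster's file.
my @filedata = ("header line\n", "12..13 13..14\n", "ab\n", "x\n");

# An unescaped dot matches any character except a newline, so /../
# matches every line holding at least two such characters in a row.
my @loose = grep { /../ } @filedata;

# Escaped dots match only literal periods.
my @strict = grep { /\d+\.\.\d+/ } @filedata;

printf "loose matched %d lines, strict matched %d\n",
    scalar @loose, scalar @strict;
```

    If even the loose pattern matches nothing, the array is probably empty or was filled differently than expected.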

    The final regex you settled on looks like it should work fine, by the way (although, as halley noted, it may give you false positives).
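
    One possible reason the posted loop printed nothing, assuming it was run exactly as shown: my $branch is declared inside the if block, so it goes out of scope at the closing brace, and the $branch printed after the loop is no longer that variable. A minimal sketch with the declaration moved before the loop follows. The sample @filedata is hypothetical, and the early last (stop at the first matching line) is my own choice, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the poster's file.
my @filedata = ("some header\n", "12..13 13..14 14..15\n", "trailer\n");

my $branch;    # declared outside the loop so it survives the if block
for my $line (@filedata) {
    if ($line =~ m/\d+\.\.\d+/) {
        $branch = $line;
        last;    # stop at the first matching line
    }
}

my @spans;
if (defined $branch) {
    chomp $branch;
    print "$branch\n";
    # Separate the line into its individual spans.
    @spans = $branch =~ m/(\d+\.\.\d+)/g;
}
print join(', ', @spans), "\n";
```

    Running the original code under use strict would report the out-of-scope $branch at compile time.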

Re: Regular expressions and metacharacters
by lidden (Curate) on Apr 08, 2005 at 02:15 UTC
    You were not clear on what you did not want to match, but maybe something like this is what you want.
    my @filedata = (
        '24..4 foo bar',
        '12..13 13..14 14..15 15..1 15..2 14..3 13..16 16..4 16..17 17..5 17..18 18..6 18..7 12..19 19..20 20..8 20..10 19..9 12..11',
        'junk',
    );
    my @result;
    for my $line (@filedata) {
        if ($line =~ m/^ (?: \s* \d+ \. \. \d+ \s* )+ $/x) {
            @result = split ' ', $line;
            last;
        }
    }
    print join ', ', @result;