rhxk has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,

I have a text file containing some kind of report with about 50-60 columns separated by spaces, and I want to capture some of the values without typing out the whole regular expression.

My question is: if I concatenate it, can I still capture the i'th element? I tried it, but $3* and $4* come up as empty strings. Any suggestions?

if ($line =~ /^([A-Z]{3})-([A-Z])!?((\s+)(\S+)){27}/m) {
    $a = $1;
    $b = $2;
    $c = $4;
    $d = $5;
    $e = $13;
    $f = $33;
    $g = $35;
    $h = $49;
    # do stuff here
}
thanx

2006-05-17 Retitled by planetscape, as per Monastery guidelines


Original title: 'calling reg exp gurus'

Replies are listed 'Best First'.
Re: Capturing columnar data from a text file
by ptum (Priest) on May 16, 2006 at 17:49 UTC

    This is the kind of thing that seems to cry out for use of the split command, if your data is reliably separated by a unique string that doesn't appear inside the column values. Once you've got an array, that is when I would apply a (comparatively more expensive and detailed) regex to the array elements I cared about, if it was still necessary.
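    A minimal sketch of that two-step approach, assuming whitespace-separated columns. The sample line is made up (no real data was posted), and the regex is the leading part of the one in the question, applied only to the first field:

```perl
use strict;
use warnings;

# Hypothetical sample line; the real report layout was not posted.
my $line = "ABC-D! 17 x42 foo 3.14";

# Step 1: split on runs of whitespace to get an array of columns.
my @fields = split ' ', $line;

# Step 2: apply the detailed regex only to the field that needs it.
my ($code, $flag);
if ($fields[0] =~ /^([A-Z]{3})-([A-Z])!?$/) {
    ($code, $flag) = ($1, $2);
}

# Pick other columns of interest by index with an array slice.
my ($second, $fourth) = @fields[1, 3];

print "$code $flag $second $fourth\n";   # prints "ABC D 17 foo"
```

    Indexing into the array sidesteps the capture-numbering problem entirely, since each column has its own position no matter how many columns the line holds.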


    No good deed goes unpunished. -- (attributed to) Oscar Wilde
Re: Capturing columnar data from a text file
by dsheroh (Monsignor) on May 16, 2006 at 18:13 UTC
    If the values are separated by single spaces, then the previous suggestion to look at split is probably the way to go:

    my @fields = split ' ', $line;

    If the lines are fixed width, with each field starting at a specific position, then split may still work for you (you can split on a regex instead of a fixed character to avoid getting empty values wherever there are consecutive spaces), but you might also want to look at using unpack instead.
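    To illustrate the point about consecutive spaces, here is a small sketch comparing three ways of calling split. The input string is a made-up example with a doubled space:

```perl
use strict;
use warnings;

# Hypothetical line with two spaces between the second and third values.
my $line = "a b  c";

my @single = split / /,   $line;  # exactly one space is the delimiter
my @runs   = split /\s+/, $line;  # a run of whitespace is the delimiter
my @awk    = split ' ',   $line;  # special case: like /\s+/, but leading whitespace is skipped too

# The doubled space yields an empty field only in the first case.
printf "%d %d %d\n", scalar @single, scalar @runs, scalar @awk;   # prints "4 3 3"
```

    With the single-space pattern, the empty field shifts every later column index by one, which matters when picking columns by position.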

Re: Capturing columnar data from a text file
by ruzam (Curate) on May 16, 2006 at 21:26 UTC
    If the data is at fixed columns, then the problem cries out for the use of unpack.
    my @cols = unpack("A5 (x2 A10)*", $line);
    # $cols[0] gets first 5 characters.
    # $cols[1] gets next 10 characters (after skipping 2).
    # $cols[2] gets next 10 characters (after skipping 2).
    # $cols[3] gets next 10 characters (after skipping 2).
    # etc...
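    As a worked illustration of that template, the sketch below builds a made-up fixed-width record with sprintf and unpacks it. The field widths and values are assumptions chosen to match the template, not the poster's actual layout:

```perl
use strict;
use warnings;

# Hypothetical fixed-width record: one 5-character field, then
# 10-character fields, each preceded by 2 filler characters.
my $line = sprintf "%-5s  %-10s  %-10s", 'ID001', 'alpha', 'beta';

# A5 reads 5 characters; the group (x2 A10)* repeats "skip 2, read 10"
# until the data runs out. The A format strips trailing spaces.
my @cols = unpack 'A5 (x2 A10)*', $line;

print join('|', @cols), "\n";   # prints "ID001|alpha|beta"
```

    When reading lines from a file, it is probably worth calling chomp first so that the trailing newline is not treated as part of the record.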
Re: Capturing columnar data from a text file
by TedPride (Priest) on May 16, 2006 at 23:10 UTC
    my ($x, $y, $c, $d, $e, $f, $g, $h) = (split / /, $line)[0, 1, 3, 4, 12, 32, 34, 48];
    print $e;
Re: Capturing columnar data from a text file
by johngg (Canon) on May 16, 2006 at 22:38 UTC
    It is telling that the replies from ptum, dsheroh and ruzam all contain the caveats "if your data ... " or "if your lines ... " or the like. I think it would be helpful here if you could post a sample of the data.

    Cheers,

    JohnGG

Re: Capturing columnar data from a text file
by planetscape (Chancellor) on May 17, 2006 at 10:37 UTC