kbradford has asked for the wisdom of the Perl Monks concerning the following question:

I have a single line that is 46 characters long with no delimiters. It looks like:

1222222222222222222223333333333333333333344444

The numbers just show where one field ends and the next begins. Now I have to parse these, but I've never done it with no delimiters. Also, some fields may be left blank, filled with all spaces. I was trying to use the regex:

/([\s.])([\s.]{20})([\s.]{20})([\s.]{5})/;

But it didn't like it. I'm fairly new to Perl, but from what I know this should match ANYTHING (\s for space, . for any other character) and match it for how many spaces are specified. But it sure doesn't work! Any help would be greatly appreciated.

Kevin

Re: Regex Matching
by Hofmator (Curate) on Jul 05, 2001 at 17:22 UTC

    This is not the place to use a regex. They are not good for everything ;-)

    Have a look at unpack and pack with the @ directive.

    -- Hofmator

Re: Regex Matching
by jeroenes (Priest) on Jul 05, 2001 at 17:22 UTC
    The trick lies in the fact that you want to match any character, but the dot loses its special meaning inside square brackets. See perlop and perlre. You could just replace the bracket stuff with a single dot.
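    To see the difference, here is a minimal sketch that runs the original bracketed pattern and a plain-dot version against two records. The records, the variable names, and the added ^ and $ anchors are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample records, built to be exactly 46 characters long.
my $digits = '1' . '2' x 20 . '3' x 20 . '4' x 5;   # ordinary characters
my $dots   = '.' x 46;                               # literal dots only

# Inside [...] the dot is a literal dot, so this class accepts only
# whitespace and '.' characters.
my $class = qr/^([\s.])([\s.]{20})([\s.]{20})([\s.]{5})$/;

# Outside a class the dot matches any character except a newline.
my $plain = qr/^(.)(.{20})(.{20})(.{5})$/;

for my $rec ( $digits, $dots ) {
    printf "%-6s  class: %-8s  plain: %s\n",
        substr( $rec, 0, 6 ),
        ( $rec =~ $class ? 'match' : 'no match' ),
        ( $rec =~ $plain ? 'match' : 'no match' );
}
```

    The digit record fails the bracketed pattern and matches the plain-dot one, while the all-dots record matches both.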

    I would prefer to use substr to get the data (untested, you get the idea mehopes):

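    my @lengths = qw/1 20 20 5/;
    while ( my $line = <> ) {    # some loop-like thing that reads $line
        my @array = ();
        # 4-argument substr removes each field from the front of $line
        push @array, substr( $line, 0, $_, '' ) for @lengths;
        print join( "\t", @array ), "\n";
    }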
    Hope this helps,

    Jeroen
    "We are not alone"(FZ)
    Update: Just stick to unpack as Hofmator says. It gives you all in one line:

    my @array= unpack 'a1a20a20a5', $line;
    I couldn't get it to strip trailing spaces/nulls with A or Z or @, though.
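    For comparison, perlfunc describes the A format as stripping trailing spaces and nulls, so a sketch along these lines may be worth trying. The record below is made up for the example, and the spaces inside the template are only there for readability.

```perl
use strict;
use warnings;

# Hypothetical record: a 1-char code, a 20-char field padded with spaces,
# a 20-char field that is entirely blank, and a 5-char field.
my $line = 'X' . sprintf( '%-20s', 'abc' ) . ( ' ' x 20 ) . '12345';

my @raw     = unpack 'a1 a20 a20 a5', $line;   # 'a' keeps the padding
my @trimmed = unpack 'A1 A20 A20 A5', $line;   # 'A' strips trailing spaces and nulls

# Bracket each field so the padding is visible.
print join( '|', map( "[$_]", @raw ) ),     "\n";
print join( '|', map( "[$_]", @trimmed ) ), "\n";
```

    With A, a field that was all spaces comes back as an empty string, which makes blank fields easy to test for.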
      Using substr seemed to work. Didn't use it in an array though, just set each one to a scalar and stuck it in a while loop. Works great. Thanks guys!

      Kevin

Re: Regex Matching
by MZSanford (Curate) on Jul 05, 2001 at 17:28 UTC
    Assuming the data is in $line, you could do any of the following:

    1. my ($fieldA,$fieldB,$fieldC,$fieldD) = unpack("a1a20a20a5",$line);
    2. $line =~ m/(.)(.{20,20})(.{20,20})(.{5,5})/; ## note, \s is part of .
    3. $fA = substr($line,0,1);$fB = substr($line,1,20); ## etc...

    may the foo be with you
      There is no need to write {20,20} in your regex. {20} means the same, is less typing, and, IMO, easier to read.

      -- Abigail
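      A quick way to check this is to run both spellings against the same record and compare the captures. This is only a sketch: the record is rebuilt from the sample in the question, and the anchors and variable names are assumptions added for the example.

```perl
use strict;
use warnings;

# Sample record in the shape shown in the question: 1 + 20 + 20 + 5 characters.
my $line = '1' . '2' x 20 . '3' x 20 . '4' x 5;

# {20,20} means "at least 20 and at most 20"; {20} means "exactly 20".
my @long  = $line =~ /^(.)(.{20,20})(.{20,20})(.{5,5})$/;
my @short = $line =~ /^(.)(.{20})(.{20})(.{5})$/;

print "same captures\n" if "@long" eq "@short";
print join( ' ', @short ), "\n";
```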

Re: Regex Matching
by tachyon (Chancellor) on Jul 05, 2001 at 17:42 UTC

    Actually, what your regex is trying to match is whitespace (spaces, tabs and so on) or literal '.' characters. In a character class, the '.' char matches a literal '.'. Outside a character class, it matches *anything*, including spaces, excluding only newlines. It will match newlines as well with the /s modifier.

    $_='1222222222222222222223333333333333333333344444';
    /(.)(.{20})(.{20})(.{5})/;
    print "$1 $2 $3 $4";

    # '.' will match a space
    print " " =~ /./ ? "\nmatch space" : "\nno match space";

    # this is how you make . match everything with a /s
    print "\n" =~ /./ ? "\nmatch" : "\nno match";
    print "\n" =~ /./s ? "\nmatch" : "\nno match";

    Hope this helps

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n\w+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print