Regex Matching

kbradford has asked for the wisdom of the Perl Monks concerning the following question:

I have a single line that is 46 characters long with no delimiters. It looks like:

1222222222222222222223333333333333333333344444

The digits just show where one field ends and the next begins. Now I have to parse these, but I've never done it with no delimiters. Also, some fields may be left blank, filled entirely with spaces. I was trying to use the regex:

/([\s.])([\s.]{20})([\s.]{20})([\s.]{5})/;

But it didn't like it. I'm fairly new to Perl, but from what I know this should match ANYTHING (\s for space, . for any other character) and match it for how many spaces are specified. But it sure doesn't work! Any help would be greatly appreciated.

Kevin

Replies are listed 'Best First'.
Re: Regex Matching by Hofmator (Curate) on Jul 05, 2001 at 17:22 UTC
This is not the place to use a regex. They are not good for everything ;-) Have a look at unpack and pack with the @ directive. -- Hofmator
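As a minimal sketch of what the unpack suggestion could look like with the @ directive, assuming the field widths from the question (1, 20, 20, 5); the variable names are hypothetical:

```perl
use strict;
use warnings;

# Sample record built from the widths in the question: 1, 20, 20 and 5.
my $line = '1' . '2' x 20 . '3' x 20 . '4' x 5;

# '@N' moves to absolute offset N; 'aN' then reads N characters unchanged.
my ($type, $first, $second, $last) =
    unpack '@0 a1 @1 a20 @21 a20 @41 a5', $line;

print join('|', $type, $first, $second, $last), "\n";
```

With contiguous fields like these the @ offsets are optional, since each `aN` continues where the previous one stopped; they become useful when only some columns are wanted.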
Re: Regex Matching by jeroenes (Priest) on Jul 05, 2001 at 17:22 UTC
The trick lies in the fact that you want to match any character, but the dot loses its special meaning inside square brackets. See perlop and perlre. You could just replace the bracket stuff with a single dot. I would prefer to use substr to get the data (untested, you get the idea mehopes; this goes inside some loop-like thing): `my @lengths = qw/1 20 20 5/; my @array = (); push @array, substr( $line, 0, $_, '' ) for @lengths; print join "\t", @array;` Hope this helps, Jeroen "We are not alone"(FZ) Update: Just stick to unpack as Hofmator says. It gives you all in one line: `my @array = unpack 'a1a20a20a5', $line;` I couldn't get it to strip trailing spaces/nulls with A or Z or @, though.
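For comparison, perlfunc documents the `A` format as stripping trailing spaces and nulls when unpacking, while `a` returns the data untouched. A small sketch, assuming a made-up record whose two 20-character fields are space-padded (the data and variable names are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical record: second field padded with spaces, third field blank.
my $padded = 'X' . sprintf('%-20s', 'alpha') . sprintf('%-20s', '') . '12345';

my @raw     = unpack 'a1 a20 a20 a5', $padded;  # 'a' keeps the padding
my @trimmed = unpack 'A1 A20 A20 A5', $padded;  # 'A' drops trailing spaces/nulls

print "raw:     [$raw[1]]\n";
print "trimmed: [$trimmed[1]] [$trimmed[2]]\n";
```

A blank field comes back as an empty string under `A`, which makes it easy to test for.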
Re: Re: Regex Matching by kbradford (Novice) on Jul 05, 2001 at 20:54 UTC
Using substr seemed to work. Didn't use it in an array though, just set each one to a scalar and stuck it in a while loop. Works great. Thanks guys! Kevin
Re: Regex Matching by MZSanford (Curate) on Jul 05, 2001 at 17:28 UTC
Assuming the data is in `$line`, you could do any of the following: 1. `my ($fieldA,$fieldB,$fieldC,$fieldD) = unpack("a1a20a20a5",$line);` 2. `$line =~ m/(.)(.{20,20})(.{20,20})(.{5,5})/; ## note, \s is part of .` 3. `$fA = substr($line,0,1);$fB = substr($line,1,20); ## etc...` may the foo be with you
Re: Regex Matching by Abigail (Deacon) on Jul 05, 2001 at 19:26 UTC
There is no need to write `{20,20}` in your regex. `{20}` means the same, is less typing, and, IMO, easier to read. -- Abigail
Re: Regex Matching by tachyon (Chancellor) on Jul 05, 2001 at 17:42 UTC
Actually, what your regex is trying to match is spaces, tabs or literal '.' characters. In a character class the '.' char matches a literal '.'. Outside it matches anything, including spaces, and excludes only newlines. It will match newlines as well with the /s modifier. `$_='1222222222222222222223333333333333333333344444'; /(.)(.{20})(.{20})(.{5})/; print "$1 $2 $3 $4";` A '.' will match a space: `print " " =~ /./ ? "\nmatch space" : "\nno match space";` This is how you make '.' match everything, with /s: `print "\n" =~ /./ ? "\nmatch" : "\nno match"; print "\n" =~ /./s ? "\nmatch" : "\nno match";` Hope this helps cheers tachyon s&&rsenoyhcatreve&&&s&n\w+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
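The difference between `[\s.]` and a bare `.` can be checked one character at a time. A minimal sketch; the test characters and variable names are just illustrative assumptions:

```perl
use strict;
use warnings;

my $class = qr/^[\s.]$/;   # whitespace or a literal '.'
my $any   = qr/^.$/;       # any one character except newline

my @results;
for my $char ('.', ' ', '1') {
    # Record whether each pattern matches this single character.
    push @results, sprintf "[%s] class=%d any=%d",
        $char, ($char =~ $class ? 1 : 0), ($char =~ $any ? 1 : 0);
}
print "$_\n" for @results;
```

The digit matches only the bare dot, which is why the original pattern fails on a line made of digits.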