Your first problem is that you are looking for two single letters but using \d in your pattern each time, which matches a digit; use a character class of [A-Z] (or [A-Za-z] if you expect lower case as well). Second, you are using parentheses to capture your three floating point numbers but you don't seem to be using the captures afterwards. Third, a dot is a regular expression metacharacter matching any character (with caveats, see perlretut and perlre) so you need to escape it as \. to match a literal dot. Fourth, using \t is fine if you are absolutely certain that you will only ever have a single tab as a separator; \s+ for one or more whitespace characters is more robust. Fifth, /SEQ+/ means an 'S', an 'E' then one or more 'Q's, because the + applies only to the 'Q'; and if the 'SEQ' should be at the beginning of the line, the pattern should be anchored with a caret, so /^SEQ.+/ might be better.
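To make those points concrete, here is a small sketch. The sample strings are made up by me for illustration and are not taken from your data, so treat it as a demonstration of the regex behaviour rather than a fix for your script.

```perl
use strict;
use warnings;

# Made-up sample line: two letters, then three numbers, tab-separated.
my $line = "A\tB\t0.123\t4.567\t8.901";

# \d matches a digit, so it cannot match the two leading letters.
my $digits_match_letters = $line =~ /^\d\s+\d/ ? "yes" : "no";

# [A-Z] matches exactly one upper-case letter.
my $class_matches_letters = $line =~ /^[A-Z]\s+[A-Z]/ ? "yes" : "no";

# An unescaped dot matches (almost) any character, so "0x123" gets through.
my $bare_dot    = "0x123" =~ /^\d.\d{3}$/  ? "yes" : "no";
my $escaped_dot = "0x123" =~ /^\d\.\d{3}$/ ? "yes" : "no";

# \t insists on a single tab; \s+ accepts any run of whitespace.
my $tab_only   = "A  B" =~ /^[A-Z]\t[A-Z]/   ? "yes" : "no";
my $any_spaces = "A  B" =~ /^[A-Z]\s+[A-Z]/ ? "yes" : "no";

# In /SEQ+/ the + applies to the Q only, and without ^ it matches anywhere.
my $repeated_q = "SEQQQ" =~ /^SEQ+$/ ? "yes" : "no";
my $unanchored = "xxSEQ" =~ /SEQ+/   ? "yes" : "no";
my $anchored   = "xxSEQ" =~ /^SEQ.+/ ? "yes" : "no";

print "\\d matches the letters:        $digits_match_letters\n";
print "[A-Z] matches the letters:     $class_matches_letters\n";
print "bare dot accepts 0x123:        $bare_dot\n";
print "escaped dot accepts 0x123:     $escaped_dot\n";
print "\\t accepts two spaces:         $tab_only\n";
print "\\s+ accepts two spaces:        $any_spaces\n";
print "/^SEQ+\$/ accepts SEQQQ:        $repeated_q\n";
print "/SEQ+/ accepts xxSEQ:          $unanchored\n";
print "/^SEQ.+/ accepts xxSEQ:        $anchored\n";
```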
You are likely to have far more data lines than header lines, so it will save CPU ticks to test for data lines first.
    ...
    if( $csv_line =~ /([A-Z])\s+([A-Z])(?:\s+\d\.\d{3}){3}/ )
    {
        ...
    }
    elsif( $csv_line =~ /^SEQ.+/ )
    {
        ...
    }
    else
    {
        # What do you do with a line that matches neither pattern?
    }
    ...
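If you do want the numbers as well as the letters, something along these lines might serve as a starting point. It is only a sketch: the sub name classify_line and the sample lines are my own invention, and I have assumed that a data line holds nothing but the five fields, so the pattern is anchored at both ends, which is stricter than the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original thread): classify one line
# and return its kind, followed by any captured fields.
sub classify_line {
    my ($csv_line) = @_;

    # Data line: two upper-case letters, then three numbers of the form d.ddd.
    # All five fields are captured so they can be used afterwards.
    if ( $csv_line =~ /^([A-Z])\s+([A-Z])\s+(\d\.\d{3})\s+(\d\.\d{3})\s+(\d\.\d{3})\s*$/ ) {
        return ( 'data', $1, $2, $3, $4, $5 );
    }
    # Header line: starts with SEQ followed by at least one more character.
    elsif ( $csv_line =~ /^SEQ.+/ ) {
        return ('header');
    }

    # Anything else is reported rather than silently ignored.
    return ('unknown');
}

# Made-up sample input for illustration only.
my @sample = ( "SEQ\tX\tY", "A\tB\t0.123\t4.567\t8.901", "garbage" );

for my $csv_line (@sample) {
    my ( $kind, @fields ) = classify_line($csv_line);
    if ( $kind eq 'data' ) {
        print "data: @fields\n";
    }
    else {
        print "$kind: $csv_line\n";
    }
}
```

The data test comes first for the reason given above: most lines are expected to be data lines, so the header pattern is only tried when the first match fails.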
I hope these points are useful.
Cheers,
JohnGG