in reply to How to split line with varying number of tokens?
If the FROM field is the only real wildcard, then be specific about what you do know and relaxed about what you don't. By anchoring with specifics to the left and the right of the FROM field, you can relax your specification of that one field and still build a relatively robust regular expression:
    while( my $line = <DATA> ) {
        print $line;
        chomp $line;
        my( $reqid, $dest, $from, $date, $time, $npages, $rcv ) = $line =~ m[
            ^                       # Beginning of input line.
            (\d+)\s+                # REQID
            (\w+)\s+                # DEST
            (\S.*?\S)\s+            # FROM (Accept non-space, anything
                                    #   [non-greedily], non-space)
            (\d{1,2}/\d{1,2})\s+    # DATE
            (\d{1,2}:\d{1,2})\s+    # TIME
            (\d+)\s+                # nPages
            (\w+)\s*                # RCV
            $                       # End of input line.
        ]x;
        print "REQID: [$reqid]\tDEST: [$dest]\tFROM: [$from]\n";
        print "DATE: [$date]\tTIME: [$time]\n";
        print "nPages: [$npages]\tRCV: [$rcv]\n\n";
    }
(I'm assuming that the fact that your columns are not vertically aligned is not a typo; i.e., that the fields aren't fixed length. If they are fixed length, this solution would be silly.)
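For comparison, if the fields did turn out to be fixed length, something along these lines with unpack would probably be simpler than a regex. This is only a rough sketch: the column widths in $template, the parse_fixed helper, and the sample record are all made up for illustration, since the real layout isn't shown.

```perl
use strict;
use warnings;

# Hypothetical fixed-width layout. The real column widths are unknown,
# so these numbers are assumptions for illustration only.
# "A" = space-padded text field (trailing spaces stripped on unpack),
# "x" = skip one byte (the separator column).
my $template = 'A5 x A6 x A20 x A5 x A5 x A3 x A*';

# Hypothetical helper: split one fixed-width record into its fields.
sub parse_fixed {
    my( $line ) = @_;
    my @fields = unpack $template, $line;
    s/^\s+// for @fields;   # "A" only strips trailing spaces; trim leading ones too.
    return @fields;
}

# Made-up sample record, built to match the assumed widths above.
my $sample = sprintf '%-5s %-6s %-20s %-5s %-5s %3s %s',
    '1234', 'fax01', 'ACME Corp West', '10/23', '14:05', '12', 'OK';

my( $reqid, $dest, $from, $date, $time, $npages, $rcv ) = parse_fixed( $sample );

print "REQID: [$reqid]\tDEST: [$dest]\tFROM: [$from]\n";
print "DATE: [$date]\tTIME: [$time]\n";
print "nPages: [$npages]\tRCV: [$rcv]\n";
```

With fixed widths, embedded spaces in FROM stop being a problem at all, because field boundaries come from position rather than from whitespace.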
Dave