With (mostly) irregularly formatted data, there is always a portion that can't be chopped up neatly. Try to match most of the lines with a split or a regular expression, and redirect the non-matching lines to another file, so you can examine them by hand or run some other program over them.
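
A rough sketch of that idea in Python, in case it helps. The record layout ("name, age, city"), the pattern `LINE_RE`, and the function names are all made up for illustration; swap in whatever fits your data.

```python
import re

# Hypothetical layout: "name, age, city". Adjust the pattern to your own data.
LINE_RE = re.compile(r"^\s*(\w[\w ]*?)\s*,\s*(\d+)\s*,\s*(\w[\w ]*?)\s*$")

def partition_lines(lines, pattern=LINE_RE):
    """Split lines into (parsed field tuples, raw lines that did not match)."""
    matched, rejects = [], []
    for line in lines:
        m = pattern.match(line)
        if m:
            matched.append(m.groups())
        else:
            rejects.append(line)
    return matched, rejects

def process_file(in_path, rejects_path, pattern=LINE_RE):
    """Parse in_path; write every non-matching line to rejects_path."""
    with open(in_path, encoding="utf-8") as src:
        matched, rejects = partition_lines(
            (ln.rstrip("\n") for ln in src), pattern
        )
    with open(rejects_path, "w", encoding="utf-8") as out:
        out.writelines(r + "\n" for r in rejects)
    return matched
```

Calling `process_file("data.txt", "rejects.txt")` (both file names are placeholders) would return the parsed rows and leave the oddballs in `rejects.txt` for a look by hand or a second, looser pass.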
Arjen