I don't see an end-of-record marker, but you could use just the start-of-record marker (LOCUS?) to process the records in two passes. On the first pass you build a temporary hash with LOCUS, DEFINITION, etc. as keys, and whatever text follows each keyword becomes the value for that key. When you hit a new record, before you start processing it (or when you reach EOF), you run the second pass: turn each value of the temporary hash, which is still just a string, into a more elaborate structure as required. You then have the option of either processing the fully fledged data structure record by record, or growing a collection of them and processing it at the end according to whatever business rules apply. This separates the logic for subfield processing (business rules) from the routine record processing on input. It also does away with very convoluted code, the kind that parses input and processes each record at field and subfield levels all at once.
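
Here is a rough sketch of that idea in Python, just to show the shape of it. It assumes things your post doesn't confirm: that top-level keywords sit at column 0 in uppercase, that indented lines continue the current field, and that LOCUS starts each record. The names `parse_records` and `build_record` are made up for illustration, and the LOCUS handling in `build_record` is only a placeholder for your real subfield rules.

```python
import re

# Assumption: a top-level field starts at column 0 with an uppercase keyword,
# followed by whitespace and the field text (or nothing at all).
KEYWORD = re.compile(r"^([A-Z]+)(?:\s+(.*))?$")


def build_record(raw):
    """Pass 2: turn raw field strings into richer structures.

    The LOCUS split below is a placeholder rule, not the real format spec.
    """
    record = dict(raw)
    if "LOCUS" in raw:
        parts = raw["LOCUS"].split()
        record["LOCUS"] = {"name": parts[0] if parts else "", "rest": parts[1:]}
    return record


def parse_records(lines, start="LOCUS"):
    """Pass 1: collect one raw string per keyword; flush on a new record or EOF."""
    raw, key = {}, None
    for line in lines:
        line = line.rstrip("\n")
        m = KEYWORD.match(line)
        if m:
            key, text = m.group(1), m.group(2) or ""
            if key == start and raw:
                # A new record begins: finish the previous one first.
                yield build_record(raw)
                raw = {}
            raw[key] = text  # a repeated keyword overwrites; fine for a sketch
        elif key is not None and line[:1].isspace() and line.strip():
            # Indented line: continuation of the current field.
            raw[key] = (raw[key] + " " + line.strip()).strip()
    if raw:
        # EOF: flush the last record.
        yield build_record(raw)
```

Because `parse_records` is a generator, you can loop over it and handle each finished record as it comes, or wrap it in `list(...)` and apply your rules to the whole set at the end. Either way the input loop never needs to know anything about subfields.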