
I thought initially I could just split on newlines, but I can't, because the sequence data runs over several lines

Split on entire records instead, a record being header + sequence. The appropriate record separator looks like a newline followed by '>' (i.e. "\n>"). Then split each record into header and sequence by treating everything up to the first newline as the header and the rest as the sequence. Note that the first record will still carry its leading '>', since no newline precedes it.
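
Here is a rough sketch of that approach in Python. It assumes the whole input fits in memory and that '>' at the start of a line always begins a new header. The function name `parse_records` and the sample data are made up for illustration, not taken from your file.

```python
def parse_records(text):
    """Split FASTA-style text into (header, sequence) pairs."""
    records = []
    # Use newline followed by '>' as the record separator.
    for chunk in text.strip().split("\n>"):
        # Only the very first record still carries its leading '>'.
        if chunk.startswith(">"):
            chunk = chunk[1:]
        if not chunk:
            continue
        # Everything up to the first newline is the header.
        header, _, body = chunk.partition("\n")
        # Join the sequence lines, dropping newlines and stray whitespace.
        sequence = "".join(body.split())
        records.append((header.strip(), sequence))
    return records


# Hypothetical sample: two records, the first with a two-line sequence.
sample = ">seq1 first example\nACGT\nTTGA\n>seq2\nGGCC\n"

for header, sequence in parse_records(sample):
    print(header, sequence)
```

For very large files you would probably want to read record by record rather than slurp everything, but the splitting logic stays the same.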