in reply to processing text from flat file

What, nobody has suggested messing with $/ (the record separator)?

Setting $/ to the |---| line lets you read the records one at a time, without having either to slurp the entire file or to buffer the lines yourself.

#!/bin/perl -w
use strict;

$/="|--------------------------------------|\n";

<DATA>; # skip the first (empty) record before the first |---| line

while( <DATA>) # read an entire record
  { chomp; # to get rid of the extra \n that would mess up the split
    my %data= (m/^([^:]*):([^\n]*)$/mg); # get field name (before the :) and text (until the \n) for each line

    # %data now holds the fields and their text, and I prove it:
    while( my( $field, $text)= each %data)
      { print "$field: $text\n"; }
    print "\n";
  }

__DATA__
|--------------------------------------|
Date: today
Request: text here text here text here
Name: Joe Bloggs
Tel: 0123 45677
email: joebloggs@bloggs.com
|--------------------------------------|
Date: Today
Request: text here text here text here
Name: John Smith
Tel: 0123 45677
email: johnsmith@smith.com
|--------------------------------------|

Update: Of course, the original way to parse the record (%data= split /[:\n]/) breaks if the text includes a ":", so I replaced it with the regexp.
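
To make the difference concrete, here is a minimal sketch comparing the two approaches on a record whose text contains a colon. The sample record and the variable names ($record, @pieces) are made up for illustration; they are not from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical record (separator already chomped): the Request text contains a ':'.
my $record = "Date: today\nRequest: call back at 10:30\nName: Joe Bloggs\n";

# Naive approach: split on every ':' or newline. The extra ':' inside the
# Request text produces an extra piece, so names and values get out of step.
my @pieces = split /[:\n]/, $record;

# Regexp approach: the field name stops at the first ':' on each line,
# and the text runs to the end of that line, colons included.
my %data = $record =~ m/^([^:]*):([^\n]*)$/mg;

printf "split produced %d pieces (an odd count cannot form a clean hash)\n",
    scalar @pieces;
print "Request => [$data{Request}]\n";
```

Note that the pattern keeps the space after the colon as part of the text; changing `:` to `:\s*` in the pattern would be one way to drop it, if that matters for your data.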

Re: Re: processing text from flat file
by blakem (Monsignor) on Oct 02, 2001 at 01:17 UTC
    Don't go mucking around with special global vars w/o localizing them first (at least, not when giving advice to others)! Even if it doesn't make much of a difference in this small script, it is a bad habit to get into and *will* come back to bite in a longer program.
    #!/bin/perl -w
    use strict;

    {
        local $/="|--------------------------------------|\n";
        while( <DATA>)
        { [SNIP]
        }
        [SNIP]
    }
    ### Here in the program, the <FH> operator will now behave normally
    ### Before you could have been burned, since any other calls
    ### to <FH> would see the specialized value of $/, even calls
    ### in subroutines. As you can imagine, this is a pain to debug
    ### It's best to avoid this trap altogether as I have done above.
    Update: Added parenthetical clarification....

    -Blake
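
For anyone who wants to see the localized version run end to end, here is a small self-contained sketch. It is only an illustration under a few assumptions: the input lives in a string read through an in-memory filehandle instead of DATA, the separator is shortened, and read_records is a hypothetical helper name, not something from the posts above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input held in a string so the sketch needs no external file.
my $sep   = "|---|\n";
my $input = "${sep}Date: today\nName: Joe Bloggs\n"
          . "${sep}Date: Today\nName: John Smith\n${sep}";

sub read_records {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";

    local $/ = $sep;       # restored automatically when the sub returns
    my $first = <$fh>;     # skip the empty record before the first separator
    my @records;
    while ( my $rec = <$fh> ) {
        chomp $rec;        # chomp strips the current $/, i.e. the separator line
        push @records, $rec;
    }
    close $fh;
    return @records;
}

my @records = read_records($input);

# Outside the sub, $/ is back to its default, so ordinary line reads still work.
open my $fh, '<', \$records[0] or die "cannot open in-memory handle: $!";
my @lines = <$fh>;
close $fh;

printf "%d records, %d lines in the first record\n",
    scalar @records, scalar @lines;
```

Putting the local inside a sub (or a bare block, as in the reply above) keeps the changed separator from leaking into any other code that reads from a filehandle.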