in reply to Simple Text Conversion

I think the OP has actually got a carriage-return-separated file, if you view source. I'm not absolutely sure of the format, but it looks like there are basically 5 lines per record. The other thing is, do you now want the spouse and phone number in the pipe-separated records?

Anyway:

use strict;

open FH, "foo" or die "Can't open: $!";
my @recs;
while (<FH>) {
    chomp;
    push @{$recs[int(($.-1) / 5)]}, $_;
}
close FH or die "Can't close: $!";

open FH, ">foo.new" or die "Can't open: $!";
for my $ref (@recs) {
    # break up the city, state, zip into 3 parts
    my($city, $state, $zip);
    if ($ref->[2] =~ /(.*?),\s(.*?)\s(.*)/) {
        ($city, $state, $zip) = ($1, $2, $3);
    }

    # join it all together into a pipe-separated
    # record, then write it out
    my $new = join "|", @{$ref}[0,1], $city, $state, $zip, @{$ref}[3,4];
    print FH $new, "\n";
}
close FH;

I haven't tested this very thoroughly, but it looks like it'll work. It's rather ugly, too. :)
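
To make the grouping arithmetic concrete, here is a small self-contained sketch. The sample data, the in-memory filehandle, and the helper name group_records are all made up for illustration; the OP's real file may well look different.

```perl
use strict;
use warnings;

# Hypothetical helper: read lines from a filehandle and group every
# $size consecutive lines into one record (an array reference).
sub group_records {
    my ($fh, $size) = @_;
    my @recs;
    while (my $line = <$fh>) {
        chomp $line;
        # $. is the current line number, starting at 1, so lines 1..5
        # land in slot 0, lines 6..10 in slot 1, and so on.
        push @{ $recs[ int(($. - 1) / $size) ] }, $line;
    }
    return @recs;
}

# Made-up sample data: two records of five lines each.
my $data = join("\n",
    'Jane Doe', '1 Main St', 'Springfield, IL 62701', 'John Doe', '555-1234',
    'Al Smith', '2 Oak Ave', 'Portland, OR 97201', 'Bea Smith', '555-9876',
) . "\n";

open my $fh, '<', \$data or die "Can't open in-memory file: $!";
my @recs = group_records($fh, 5);
close $fh;

# Split "City, ST ZIP" with the same regex as the script above.
my ($city, $state, $zip) = $recs[0][2] =~ /(.*?),\s(.*?)\s(.*)/;
my $first = join '|', @{ $recs[0] }[0, 1], $city, $state, $zip, @{ $recs[0] }[3, 4];
print "$first\n";    # prints Jane Doe|1 Main St|Springfield|IL|62701|John Doe|555-1234
```

If a file ends partway through a record, the last array reference will simply hold fewer than five lines, so real code would probably want to check for that.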

RE: Re: Simple Text Conversion
by chromatic (Archbishop) on Apr 06, 2000 at 21:38 UTC
    Deuglification, then:
    use strict;

    open FH, "foo" or die "Can't open: $!";
    my @recs;
    foreach my $line (<FH>) {
        chomp $line;
        push @recs, [ split /,?\s/, $line ];
    }
    close FH or die "Can't close: $!";

    open FH, ">foo.new" or die "Can't open: $!";
    foreach my $line_ref (@recs) {
        my $line = join '|', @$line_ref[0 .. 4];
        print FH $line, "\n";
    }
    close FH;

    Note that this is also untested. I maintain that a proper use of split is better than an apple a day.

    Interesting bits for the Original Poster:

    • We push an array reference onto @recs
    • split can take an arbitrarily complex regex, instead of just a single character. Use it liberally!
    • We use an array slice to get at only the first few fields we want.
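
    A minimal sketch of those three points on a single line of input. The sample line is invented and assumes one whole record per line, which may not match the real file.

    ```perl
    use strict;
    use warnings;

    # Made-up single-line record; the real file layout is an assumption.
    my $line = 'Doe, Jane Springfield IL 62701 555-1234';

    # An optional comma followed by one whitespace character separates fields.
    my @fields = split /,?\s/, $line;

    # Push a reference to the field list, then slice out the first five fields.
    my @recs;
    push @recs, [ @fields ];
    my $out = join '|', @{ $recs[0] }[0 .. 4];

    print "$out\n";    # prints Doe|Jane|Springfield|IL|62701
    ```

    One caveat with this pattern: it splits on every whitespace character, so a field that itself contains a space (a two-word city, say) comes back as two fields.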
      Yes, but then, mine worked. :) Just kidding.

      You're just pushing each line arbitrarily onto @recs--you need to group them in sets of 5, because that's what the original file looked like. Plus why are you using foreach in the original reading-in-the-file loop? That

      foreach my $line (<FH>)
      reads the entire file into a temp array--not a horrible thing in many cases, but still, if we can process on a line by line basis, we may as well do so. :)
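
      One way to see the difference is to watch $. inside each loop. This is only a sketch on a made-up three-line in-memory file:

      ```perl
      use strict;
      use warnings;

      my $data = "one\ntwo\nthree\n";    # made-up three-line input

      # foreach evaluates <$in> in list context, so every line is read
      # before the loop body runs even once.
      open my $in, '<', \$data or die "Can't open: $!";
      my @seen_foreach;
      foreach my $line (<$in>) {
          push @seen_foreach, $.;    # $. already points at the last line
      }
      close $in;

      # while reads one line per iteration, so $. advances as we go.
      open $in, '<', \$data or die "Can't open: $!";
      my @seen_while;
      while (my $line = <$in>) {
          push @seen_while, $.;
      }
      close $in;

      print "foreach: @seen_foreach\n";    # prints foreach: 3 3 3
      print "while:   @seen_while\n";      # prints while:   1 2 3
      ```

      That is also why the int(($.-1) / 5) grouping trick only makes sense inside the while form.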

      I like your use of split quite a lot, though.

      So, combining your ideas and mine:

      use strict;

      open FH, "foo" or die "Can't open: $!";
      my @recs;
      while (<FH>) {
          chomp;
          push @{$recs[int(($.-1) / 5)]}, split /,?\s/;
      }
      close FH or die "Can't close: $!";

      open FH, ">foo.new" or die "Can't open: $!";
      print FH map join('|', @$_) . "\n", @recs;
      close FH;

      Notice I took the array slice out--I think the op wanted everything in the array. If not, though, he/she should stick it back in, just
      @$ref[0..4]
      instead of
      @$ref
      And I'm now using map, just cause map is great.
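
      For comparison, here is a quick sketch of the map with and without the slice. The records are made up, and each is assumed to have already been split into seven fields.

      ```perl
      use strict;
      use warnings;

      # Made-up records; each has seven fields after splitting.
      my @recs = (
          [ qw(Jane Doe Springfield IL 62701 John 555-1234) ],
          [ qw(Al Smith Portland OR 97201 Bea 555-9876) ],
      );

      # Every field, one pipe-separated line per record.
      my @all = map { join('|', @$_) . "\n" } @recs;

      # Only the first five fields, using an array slice on the reference.
      my @first_five = map { join('|', @$_[0 .. 4]) . "\n" } @recs;

      print @all, @first_five;
      ```

      The block form of map is used here only because it reads a little more easily; the expression form in the script above does the same job.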
        I'm happy with the resulting script. It's far more clear and far less ugly. Regarding some of the bogosities in my postulate:
        • Obviously, the line-by-line processing depends on whether the data file has each group of records on one line, or each record on a separate line. I went for the first assumption (as it makes more real-world sense to me) and you went for the second (as you had the patience to View Source).
        • That also lets me get away from the push reference position bit. Not a big loss.
        • I thought avoiding the implicit local $_ behavior in the while loop would be more clear for the OP.
        • People don't use split and join nearly enough.
        Well done.