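#!/usr/local/bin/perl -wi.bak
use strict;
my @recs;
while (<>) {
    chomp;
    # $. is the current input line number; every 5 lines form one record
    push @{$recs[int(($.-1) / 5)]}, split /,?\s/;
    # after the 5th line of a record, print it pipe-separated
    print map join('|', @$_) . "\n", $recs[int(($.-1) / 5)]
        if !($. % 5);
}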
Run it like
% reformat.pl <file>
Saves the original in <file>.bak.
If what you have is a space-separated file (hmm, there's a comma in there too) and you want to change it to a pipe-separated file (and drop the comma), you want the tr/// operator:
$data =~ tr/ /|/; # $data holds one line of input
$data =~ tr/,//d; # get rid of the comma, if you want
This will transl(iter)?ate the space character into the pipe character and delete the comma character.
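To see the two tr/// steps on a concrete value, here is a small sketch. The sample line is made up for illustration (the original poster's data isn't shown here), and $piped is just a name I picked.

```perl
use strict;
use warnings;

# Hypothetical sample line, not taken from the original poster's file.
my $data = 'Lynnwood, WA 98036';

# Copy first so the original string is left alone.
(my $piped = $data) =~ tr/ /|/;   # every space becomes a pipe
$piped =~ tr/,//d;                # /d deletes characters that have no replacement

print "$piped\n";   # Lynnwood|WA|98036
```

Note that tr/// works character by character, so a field that itself contains a space would be split in two as well.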
If there's a possibility that any of the fields will contain spaces or pipes or commas, look at the Text::CSV module from CPAN. Writing a regex to take care of this will not only test your skills at planning for all contingencies, it will cause you to go bald.
I think the op's actually got a carriage-return separated
file, if you view source. So I'm not absolutely sure on
the format, but it looks like there are 5 lines per record,
basically. The other thing is, do you now want the spouse
and phone number in the pipe-separated records?
Anyway:
use strict;
open FH, "foo" or die "Can't open: $!";
my @recs;
while (<FH>) {
    chomp;
    push @{$recs[int(($.-1) / 5)]}, $_;
}
close FH or die "Can't close: $!";
open FH, ">foo.new" or die "Can't open: $!";
for my $ref (@recs) {
    # break up the city, state, zip into 3 parts
    my($city, $state, $zip);
    if ($ref->[2] =~ /(.*?),\s(.*?)\s(.*)/) {
        ($city, $state, $zip) = ($1, $2, $3);
    }
    # join it all together into a pipe-separated
    # record, then write it out
    my $new = join "|", @{$ref}[0,1], $city, $state,
        $zip, @{$ref}[3,4];
    print FH $new, "\n";
}
close FH;
I haven't tested this very thoroughly, but it looks like
it'll work. It's rather ugly, too. :)
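For the city/state/zip step on its own, something like the following might do. It's only a sketch: the address line is invented, and the regex is a slightly stricter variant of the one above (anchored, with \S+ for state and zip), not the exact pattern from the script.

```perl
use strict;
use warnings;

# Hypothetical address line; the thread never shows the real data.
my $addr = 'Lynnwood, WA 98036';

# In list context a failed match returns an empty list,
# so all three variables stay undef when the line is malformed.
my ($city, $state, $zip) = $addr =~ /^(.*?),\s+(\S+)\s+(\S+)$/;

if (defined $city) {
    print join('|', $city, $state, $zip), "\n";   # Lynnwood|WA|98036
} else {
    warn "Unrecognized city/state/zip line: $addr\n";
}
```

Checking defined $city before joining avoids "uninitialized value" warnings on lines that don't match.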
use strict;
open FH, "foo" or die "Can't open: $!";
my @recs;
foreach my $line (<FH>) {
    chomp $line;
    push @recs, [ split /,?\s/, $line ];
}
close FH or die "Can't close: $!";
open FH, ">foo.new" or die "Can't open: $!";
foreach my $line_ref (@recs) {
    my $line = join '|', @$line_ref[0 .. 4];
    print FH $line, "\n";
}
close FH;
Note that this is also untested. I maintain that a proper use of split is better than an apple a day.
Interesting bits for the Original Poster:
- We push an array reference onto @recs
- split can take an arbitrarily complex regex, instead of just a single character. Use it liberally!
- We use an array slice to get at only the first few fields we want.
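Here's a tiny stand-alone sketch of those three bits together, for anyone who wants to poke at them. The input line is invented, and I slice only the first two fields because the sample has just three.

```perl
use strict;
use warnings;

# Hypothetical input line, made up for illustration.
my $line = 'Lynnwood, WA 98036';

# /,?\s/ matches an optional comma followed by one whitespace character.
my @fields = split /,?\s/, $line;

# Push an anonymous array reference (a copy of @fields) onto @recs.
my @recs;
push @recs, [ @fields ];

# Array slice through the reference: only the first two fields.
my $ref       = $recs[0];
my $first_two = join '|', @$ref[0 .. 1];

print "$first_two\n";   # Lynnwood|WA
```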
foreach my $line (<FH>)
reads the entire file into a temp array--not a horrible
thing in many cases, but still, if we can process on a line
by line basis, we may as well do so. :)
I like your use of split quite a lot, though.
So, combining your ideas and mine:
use strict;
open FH, "foo" or die "Can't open: $!";
my @recs;
while (<FH>) {
    chomp;
    push @{$recs[int(($.-1) / 5)]}, split /,?\s/;
}
close FH or die "Can't close: $!";
open FH, ">foo.new" or die "Can't open: $!";
print FH map join('|', @$_) . "\n", @recs;
close FH;
Notice I took the array slice out--I think the op wanted
everything in the array. If not, though, he/she should
stick it back in, just
@$_[0..4]
instead of
@$_
And I'm now using map, just 'cause map is great.
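If the int(($.-1) / 5) grouping looks opaque, it can be checked without touching a real file. This sketch reads from an in-memory filehandle holding ten made-up lines (so two 5-line records); the data and the names $sample and $fh are my own assumptions.

```perl
use strict;
use warnings;

# Ten invented lines standing in for two 5-line records.
my $sample = join '', map { "line$_\n" } 1 .. 10;

# Opening a reference to a scalar gives an in-memory filehandle.
open my $fh, '<', \$sample or die "Can't open in-memory file: $!";

my @recs;
while (my $line = <$fh>) {
    chomp $line;
    # $. is the 1-based line number: lines 1-5 map to index 0, 6-10 to index 1.
    push @{ $recs[ int(($. - 1) / 5) ] }, $line;
}
close $fh or die "Can't close: $!";

print scalar(@recs), " records\n";          # 2 records
print join('|', @$_), "\n" for @recs;
```

Reading line by line with while keeps only the accumulated records in memory, not an extra temporary list of every line.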