in reply to Sorting A File Of Large Records
There are some subtleties, mostly covered above, that help. The first trick (suggested by dmitri) is to set $/ to '-------'.$/ so that each read returns one whole record.
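To illustrate the $/ trick, here is a minimal sketch. The sample data, the read_records helper and the in-memory filehandle are my own assumptions for demonstration; adjust the number of dashes to whatever your file really uses.

```perl
use strict;
use warnings;

# Hypothetical sample data: each record ends with a line of seven dashes.
my $data = join '', map { "Name: $_->[0]\nZip:$_->[1]\n-------\n" }
    (['Ann', '90210'], ['Bob', '10001']);

# Read every record from a filehandle, using $sep as the record separator.
sub read_records {
    my ($fh, $sep) = @_;
    local $/ = $sep;          # restored automatically when the sub returns
    my @records;
    while (my $rec = <$fh>) {
        chomp $rec;           # chomp strips the whole separator, not just "\n"
        push @records, $rec;
    }
    return @records;
}

# An in-memory filehandle stands in for the real file here.
open my $fh, '<', \$data or die "Can't open in-memory file: $!";
my @records = read_records($fh, "-------\n");
close $fh;

print scalar(@records), " records read\n";
```

Using local keeps the change to $/ from leaking into the rest of the program.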
If the file is far too big to fit into memory, the second trick is to sort on the zip code and each record's location within the file rather than on the records themselves. Call tell() before reading each record to get its offset.
Other tricks may depend on whether the file is local (that is, whether you can afford to read it multiple times), whether you want to sort on a secondary key as well, and so on.
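If you do want a secondary key, one possible approach is to keep a small list of index entries and sort those instead of the records. This is only a sketch: the surname field and the offsets are made-up values, not something taken from the original file.

```perl
use strict;
use warnings;

# Hypothetical index entries: [zip, secondary key, byte offset of the record].
my @index = (
    ['90210', 'Smith', 0],
    ['10001', 'Jones', 120],
    ['90210', 'Adams', 240],
);

# Sort on zip first, then on the secondary key, then on file position.
my @sorted = sort {
       $a->[0] cmp $b->[0]
    || $a->[1] cmp $b->[1]
    || $a->[2] <=> $b->[2]
} @index;

print join(' ', @$_), "\n" for @sorted;
```

The offsets in the sorted list can then be fed to seek() to pull the records out in order.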
Actually, thinking about it, assuming that you have sufficient memory and no secondary key you wish to sort on, my preferred solution would be a two-phase sort. Phase one reads the file and inserts each record's offset into a hash keyed on zip code. Phase two walks the zip codes in sorted order, seeks back to each stored offset and writes out the sorted version.
Disclaimer - code untested, may contain horrible bugs

    local $/ = '----------' . $/;   # separator line plus newline; adjust the dashes to match the file
    my %zips;
    open(FILE, "<filename") or die "error $!";
    my $teller = tell(FILE);        # NB need position before file read!
    while (<FILE>) {
        die "no zip code found in record at $teller\n" unless /Zip:(\d{5})/;
        push @{ $zips{$1} }, $teller;
        $teller = tell(FILE);       # FILE is optional here but a good idea!
    }
    open(SORTED, ">newfile") or die "Can't open newfile: $!";
    for my $zip (sort keys %zips) {
        for my $pos (@{ $zips{$zip} }) {
            seek FILE, $pos, 0;
            print SORTED scalar(<FILE>);   # scalar context: one record, not the rest of the file
        }
    }
    close FILE;
    close SORTED;
Dingus
Re: Re: Sorting A File Of Large Records
by vek (Prior) on Dec 10, 2002 at 23:18 UTC