orbital has asked for the wisdom of the Perl Monks concerning the following question:
I was able to figure out how I wanted to parse my data and sort it, so I have no problems there. My problem lies in the performance of the sort that I created. Let's take a look at the code:
    @sortedarray =
        map  { $_->[0] }
        sort { $a->[1] cmp $b->[1] || $a->[2] <=> $b->[2] || $a->[3] cmp $b->[3] }
        map  {
            if ( m/^(.+?)\((\d+)\)\s-\s\[(.+?)\].+?"(.*?)"\.$/ ) {
                my ($disc_file,$page,$key,$val) = ($1,$2,$3,$4);
                [$_,$disc_file,$page,$key];
            }
        } <DATA>;

    foreach (@sortedarray) {
        print "$_\n";
    }

    __END__
    CD1\01100809.pdf(1) - [Account Number] Indexed key "654546654".
    CD2\01100809.pdf(1) - [Invoice Date] Indexed key "10/08/2001".
    CD1\01100809.pdf(1) - [Customer Name] Indexed key "FOOBAR".
    CD2\01100809.pdf(1) - [Contact Name] Indexed key "Dr. FOO".
    CD4\01100809.pdf(20) - [Account Number] Indexed key "54356564".
    CD4\01100809.pdf(20) - [Invoice Date] Indexed key "10/08/2001".
    CD1\01100809.pdf(20) - [Customer Name] Indexed key "FOOBAR".
    CD1\01100809.pdf(20) - [Contact Name] Indexed key "Dr. FOO".
    CD1\01100814.pdf(33) - [Account Number] Indexed key "56357576537".
    CD3\01100814.pdf(33) - [Invoice Date] Indexed key "10/08/2001".
    CD3\01100814.pdf(33) - [Customer Name] Indexed key "FOOBAR".
    CD1\01100814.pdf(33) - [Contact Name] Indexed key "Dr. FOO".
    CD2\01100813.pdf(27) - [Account Number] Indexed key "73677576".
    CD3\01100813.pdf(27) - [Invoice Date] Indexed key "10/08/2001".
    CD1\01100813.pdf(27) - [Customer Name] Indexed key "FOOBAR".
    CD3\01100813.pdf(27) - [Contact Name] Indexed key "Dr. FOO".
This code does exactly what I want: it ignores lines that don't match, and it sorts first by CD\filename, then by page number, and then by the key being indexed.
The problem I am running into is the speed of this sort. It seems very slow (16 seconds on a 2.3 MB file on a PIII 766 MHz). Can I speed this code up at all?
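One direction that may be worth benchmarking is a packed-key sort (often called the Guttman-Rosler Transform): build one string per line whose plain string order matches the three-level comparison, then call `sort` with no comparison block. The sketch below is only an illustration, not a tested replacement. The name `sort_log_lines` is hypothetical, and it assumes that the fields contain no NUL bytes, that page numbers fit in 32 bits, and that no `use locale` is in effect.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original code): sorts the log lines
# by disc\file, then page number, then key, using packed string keys.
sub sort_log_lines {
    my @lines = @_;
    chomp @lines;

    my @keyed;
    for my $i (0 .. $#lines) {
        # Same pattern as the original code; non-matching lines are skipped.
        next unless $lines[$i] =~ m/^(.+?)\((\d+)\)\s-\s\[(.+?)\].+?"(.*?)"\.$/;
        my ($disc_file, $page, $key) = ($1, $2, $3);

        push @keyed, join '',
            $disc_file, "\0",    # NUL terminator, assumed absent from the data
            pack('N', $page),    # page as 4 big-endian bytes (assumes page < 2**32)
            $key, "\0",
            pack('N', $i);       # index of the original line, used to recover it
    }

    # Plain string sort on the packed keys, then map each key back to its line.
    return map { $lines[ unpack 'N', substr($_, -4) ] } sort @keyed;
}
```

With the data above, this would be called as `print "$_\n" for sort_log_lines(<DATA>);`. Whether it actually beats the array-reference version on a given perl is something to measure, for example with the core Benchmark module.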
My other issue is with file size: the larger the logfile, the more memory perl uses. What techniques can I use to sort a huge file without taking a lot of RAM in the process?
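For the memory side, one common technique is an external merge sort: sort the file in fixed-size chunks, write each sorted chunk to a temporary file, then merge the chunks while holding only one line per chunk in memory. The following is a rough sketch under stated assumptions rather than a finished tool. The names `make_record` and `external_sort` and the `$max_lines` parameter are hypothetical, fields are assumed to contain no NUL bytes, and page numbers are assumed to have at most 10 digits.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: returns "disc_file\0<zero-padded page><key>\0<line>\n",
# a text record whose string order matches the three-level sort, or undef
# when the line does not match the pattern.
sub make_record {
    my ($line) = @_;
    return undef unless $line =~ m/^(.+?)\((\d+)\)\s-\s\[(.+?)\].+?"(.*?)"\.$/;
    my ($disc_file, $page, $key) = ($1, $2, $3);
    return join("\0", $disc_file, sprintf('%010d', $page) . $key, $line) . "\n";
}

# Hypothetical helper: reads $in_fh, writes sorted lines to $out_fh, and keeps
# at most $max_lines records in memory at any time.
sub external_sort {
    my ($in_fh, $out_fh, $max_lines) = @_;
    my (@runs, @chunk);

    # Write the current chunk, sorted, to an anonymous temporary file.
    my $flush = sub {
        return unless @chunk;
        my $tmp = tempfile();    # scalar context: filehandle only, removed on close
        print {$tmp} sort @chunk;
        seek $tmp, 0, 0;         # rewind so the run can be read back
        push @runs, $tmp;
        @chunk = ();
    };

    while (my $line = <$in_fh>) {
        chomp $line;
        my $rec = make_record($line);
        next unless defined $rec;
        push @chunk, $rec;
        $flush->() if @chunk >= $max_lines;
    }
    $flush->();

    # Merge: repeatedly emit the smallest record among the heads of all runs.
    my @heads = map { scalar readline $_ } @runs;
    while (1) {
        my $min;
        for my $i (0 .. $#runs) {
            next unless defined $heads[$i];
            $min = $i if !defined $min || $heads[$i] lt $heads[$min];
        }
        last unless defined $min;
        # The original line is the third NUL-separated field of the record.
        print {$out_fh} (split /\0/, $heads[$min], 3)[2];
        $heads[$min] = readline $runs[$min];
    }
    return;
}
```

The linear scan over the run heads is fine for a handful of temporary files. With hundreds of runs, a heap or a multi-pass merge would probably be a better fit, and the system `sort` utility is another option when the keys can be expressed as plain text fields.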
(tye)Re: Complex Sorting Optimization?
by tye (Sage) on Nov 21, 2001 at 00:24 UTC

Re: Complex Sorting Optimization?
by dragonchild (Archbishop) on Nov 21, 2001 at 00:21 UTC

Re: Complex Sorting Optimization?
by dash2 (Hermit) on Nov 21, 2001 at 00:10 UTC

Re: Complex Sorting Optimization?
by dws (Chancellor) on Nov 21, 2001 at 11:06 UTC

Re: Complex Sorting Optimization?
by frankus (Priest) on Nov 21, 2001 at 00:25 UTC