Hi, I have a performance problem with Perl sort. My current logic builds a hash of key/value pairs and then sorts the keys on the date and time fields using a Schwartzian Transform (ST). The input file is normally around 3-4 GB. The code posted below works, but the sort takes around 3 hours on a 64-bit Windows system with 12 GB of memory. (The same script on a 32-bit Windows system with 4 GB of RAM fails with an out-of-memory error.) The actual requirement is to sort this file and then split it into a number of smaller files, because a 3 GB file cannot be opened. The file-splitting part works fine. Can the performance be improved and the out-of-memory issue resolved? Thanks in advance; any help is greatly appreciated.
The sample input file content is:

    2012/02/12 @ 14:29:26,519 @ -> java.lang.NullPointerException
    2012/02/12 @ 14:23:26,519 @ -> | WARN | RMI TCP Connection(184923)-170.80.0.9 | Error in getting the Network Adapter
    2012/02/12 @ 14:20:26,522 @ -> | WARN | RMI TCP Connection(184923)-170.80.0.9 | Error in getting the Network Adapter

and the output should look like:

    2012/02/12 @ 14:20:26,522 @ -> | WARN | RMI TCP Connection(184923)-170.80.0.9 | Error in getting the Network Adapter
    2012/02/12 @ 14:23:26,519 @ -> | WARN | RMI TCP Connection(184923)-170.80.0.9 | Error in getting the Network Adapter
    2012/02/12 @ 14:29:26,519 @ -> java.lang.NullPointerException
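To make the expected ordering concrete, here is a minimal in-memory sketch of the same decorate-sort-undecorate idea applied to the three sample lines. It assumes, as the posted code does, that the date is whitespace-separated field 0 and the time is field 2; the names @lines and @sorted are only for illustration.

```perl
use strict;
use warnings;

# Sample lines from the post (line-wrap '+' marker removed).
my @lines = (
    '2012/02/12 @ 14:29:26,519 @ -> java.lang.NullPointerException',
    '2012/02/12 @ 14:23:26,519 @ -> | WARN | RMI TCP Connection(184923)-170.80.0.9 | Error in getting the Network Adapter',
    '2012/02/12 @ 14:20:26,522 @ -> | WARN | RMI TCP Connection(184923)-170.80.0.9 | Error in getting the Network Adapter',
);

# Decorate each line with its date and time, sort on those two
# strings, then strip the decoration again.
my @sorted =
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1] or $a->[2] cmp $b->[2] }
    map  { my @f = split ' ', $_, 4; [ $_, $f[0], $f[2] ] }
    @lines;

print "$_\n" for @sorted;
```

Each line is split only once here, and only into its first few fields, which is cheaper than calling split twice per key.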
    open FH_duplicate, "$file_duplicate" or die "$!";
    open FH1_sorting, ">>$file_consolidated_sort" or die "$!";

    my %hash = ();
    my $key;
    my $val;
    while (<FH_duplicate>) {
        chomp;
        ($key, $val) = split(/,,/);
        $hash{$key} .= $val;
    }
    close FH_duplicate;
    ### hash creation

    ### sorting begins
    for $key (map  { $_->[0] }
              sort { $a->[1] cmp $b->[1] || $a->[2] cmp $b->[2] }
              map  { [$_, (split)[0], (split)[2]] }
              keys %hash) {
        print FH1_sorting "$key -> $hash{$key}";
    }
    close FH1_sorting;
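One way to attack the out-of-memory error is an external merge sort: sort the file in chunks that fit in memory, write each sorted chunk to a temporary file, then merge the chunks. The following is only a sketch under stated assumptions, not a drop-in replacement. It assumes every line starts with a fixed-width, zero-padded "YYYY/MM/DD @ HH:MM:SS,mmm" stamp (25 characters), so the prefix can be compared as a plain string. Unlike the posted code, it does not split on ",," or concatenate the values of duplicate keys. The names external_sort, by_stamp and $KEY_LEN are made up for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: the first 25 characters of each line are the timestamp.
my $KEY_LEN = 25;

sub by_stamp { substr($a, 0, $KEY_LEN) cmp substr($b, 0, $KEY_LEN) }

# Hypothetical helper: sort file $in into file $out while holding
# at most $max_lines lines in memory at any time.
sub external_sort {
    my ($in, $out, $max_lines) = @_;
    open my $src, '<', $in or die "Cannot read $in: $!";

    # Phase 1: write sorted runs to temporary files.
    my (@runs, @buf);
    my $flush = sub {
        return unless @buf;
        my $tmp = tempfile();    # anonymous read/write temp file
        print {$tmp} sort by_stamp @buf;
        seek $tmp, 0, 0 or die "seek failed: $!";
        push @runs, $tmp;
        @buf = ();
    };
    while (my $line = <$src>) {
        $line .= "\n" unless $line =~ /\n\z/;    # last line may lack a newline
        push @buf, $line;
        $flush->() if @buf >= $max_lines;
    }
    $flush->();
    close $src;

    # Phase 2: merge the runs, always emitting the smallest head line.
    open my $dst, '>', $out or die "Cannot write $out: $!";
    my @heads = map { scalar(readline($_)) } @runs;
    while (1) {
        my $min;
        for my $i (0 .. $#runs) {
            next unless defined $heads[$i];
            if (!defined($min)
                || substr($heads[$i], 0, $KEY_LEN) lt substr($heads[$min], 0, $KEY_LEN)) {
                $min = $i;
            }
        }
        last unless defined $min;    # every run is exhausted
        print {$dst} $heads[$min];
        $heads[$min] = readline($runs[$min]);
    }
    close $dst or die "Cannot close $out: $!";
}
```

A call might look like external_sort('input.log', 'sorted.log', 500_000), where both file names and the chunk size are placeholders to tune for the machine. The merge scans all run heads for each output line, which is simple and adequate for a few dozen runs; with many more runs a heap would be the usual replacement.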
In reply to perl ST sort performance issue for large file? by rkshyam