    my @cases = ();
    my @FILE  = <DATA>;
    close(DATA);
    my $num_lines = scalar(@FILE);
    $#cases = $num_lines;    # pre-extend array
    my $dot = 0;             # progress counter

    foreach my $line (@FILE) {
        if (($dot % 1000) == 0) {
            print STDERR ".";
        }
        $line =~ /^(\S*) [0-9.]* (.*)$/o;
        my ($class, $feature_vector) = ($1, $2);
        my %case;
        $case{'class'} = $class;
        foreach my $feature (split /\s+/, $feature_vector) {
            $case{'fv'}{$feature} = 1;
        }
        push @cases, \%case;
        $dot++;
    }

This is very fast for the first ~20,000 lines (out of a total of ~300,000), then suddenly slows down dramatically. Lack of memory is not the problem: at the point it slows down I still have upwards of 700 MB free. At first I thought the processing of each line into the case hash, with its attendant splits and regular expressions, was the problem, but if I alter the above code to:

    my @FILE = <DATA>;
    close(DATA);
    my $fred;
    my $dot = 0;             # progress counter

    foreach my $line (@FILE) {
        if (($dot % 1000) == 0) {
            print STDERR ".";
        }
        $line =~ /^(\S*) [0-9.]* (.*)$/o;
        my ($class, $feature_vector) = ($1, $2);
        my %case;
        $case{'class'} = $class;
        foreach my $feature (split /\s+/, $feature_vector) {
            $case{'fv'}{$feature} = 1;
        }
        $fred = \%case;      # keep only the most recent case; nothing is stored in an array
        $dot++;
    }

then the entire file is processed on the order of 100 times more quickly. I've tried using something like $cases[$dot]=\%case (sketched below), and even making cases a hash indexed by case number, but both approaches exhibit a similar slow-down. Any ideas on why this slow-down occurs? (Perl 5.6.1, running on a Windows XP system with 1 GB of RAM.)

Thanks,
Ryan Gabbard
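For reference, the indexed-assignment variant mentioned above would look roughly like this. This is only a sketch, assuming the same parsing loop and the same pre-extended @cases as in the first snippet; the exact code the poster ran is not shown:

    my @cases = ();
    my @FILE  = <DATA>;
    close(DATA);
    $#cases = scalar(@FILE);    # pre-extend array, as in the original code
    my $dot = 0;

    foreach my $line (@FILE) {
        print STDERR "." if ($dot % 1000) == 0;
        $line =~ /^(\S*) [0-9.]* (.*)$/o;
        my ($class, $feature_vector) = ($1, $2);
        my %case;
        $case{'class'} = $class;
        $case{'fv'}{$_} = 1 for split /\s+/, $feature_vector;
        $cases[$dot] = \%case;   # store by index into the pre-extended array instead of pushing
        $dot++;
    }

Storing by index this way avoids push entirely, yet the post reports that this approach (and a hash keyed by case number) exhibits the same slow-down once the container holds a few tens of thousands of hash references.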
In reply to Slowness when inserting into pre-extended array by ryangabbard