in reply to Count byte/character occurrence (quickly)

Don't read the file byte by byte, but read it in larger chunks and have an inner loop counting the characters:

open my $file, '<', shift @ARGV or die "Cannot open file: $!";
binmode($file);
my %seen;
$/ = \( 1024 * 1024 );    # Read fixed-size 1 MiB records instead of lines
while ( my $chunk = <$file> ) {
    $seen{$_}++ for split //, $chunk;    # Tally every byte in this chunk
}
print "$_ - $seen{$_}\n" for keys %seen;
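As a variation on the same chunked-read idea, here is a minimal sketch that tallies with unpack into a 256-slot array instead of split into a hash. The sub name count_bytes and the 64 KiB default chunk size are my own assumptions, not something from the original question, and whether this is actually faster on a given machine and Perl version is something you would have to measure.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each byte value occurs in a file.
# Reads fixed-size chunks with read() and tallies with unpack 'C*',
# which yields byte values as integers rather than one-character strings.
sub count_bytes {
    my ($path, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;    # assumed default; tune by measurement

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";

    my @count = (0) x 256;        # one slot per possible byte value
    while (1) {
        my $n = read($fh, my $buf, $chunk_size);
        die "Read error on $path: $!" unless defined $n;
        last if $n == 0;          # end of file
        $count[$_]++ for unpack 'C*', $buf;
    }
    close $fh;
    return \@count;
}

# Print the non-zero counts when a file name is given on the command line.
if (@ARGV) {
    my $count = count_bytes($ARGV[0]);
    for my $byte (0 .. 255) {
        printf "0x%02X - %d\n", $byte, $count->[$byte] if $count->[$byte];
    }
}
```

Indexing an array by byte value avoids hashing a one-character key for every byte, and passing a different second argument to count_bytes makes it easy to time several chunk sizes against each other.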

Re^2: Count byte/character occurrence (quickly)
by james28909 (Deacon) on Apr 01, 2016 at 06:53 UTC

    Sadly, when I set it to a higher buffer size, it slows down for some reason. I have an Intel i5 2500K and a Samsung 840 Pro with a Biostar TZ77XE4 motherboard, so it should be reading faster than heck. Is my code (or yours) somehow saving the results of the read instead of throwing them away after reading and adding to %seen? I tried to undef $buf as well, and that slowed it down even more.

    EDIT: Also, I am using Perl 5.16.3 for certain reasons.