Here's another approach that may be a little faster than my previous one (but it will be nowhere near 3 - 4 seconds for 256 MB!).
c:\@Work\Perl\monks>perl -wMstrict -le
"my $s = 'A man, a plan, a canal: Panama!';
 ;;
 my @char_counts;
 $#char_counts = 255;
 ++$char_counts[ ord(substr $s, $_, 1) ] for 0 .. length($s) - 1;
 die 'oops...' if $#char_counts != 255;
 ;;
 printf qq{'%s' (0x%02x) == $char_counts[$_] (%6.3f%%) \n},
   chr, $_, ($char_counts[$_] / length $s) * 100
   for grep defined($char_counts[$_]), 0 .. $#char_counts;
 "
' ' (0x20) == 6 (19.355%)
'!' (0x21) == 1 ( 3.226%)
',' (0x2c) == 2 ( 6.452%)
':' (0x3a) == 1 ( 3.226%)
'A' (0x41) == 1 ( 3.226%)
'P' (0x50) == 1 ( 3.226%)
'a' (0x61) == 9 (29.032%)
'c' (0x63) == 1 ( 3.226%)
'l' (0x6c) == 2 ( 6.452%)
'm' (0x6d) == 2 ( 6.452%)
'n' (0x6e) == 4 (12.903%)
'p' (0x70) == 1 ( 3.226%)
... how HxD is able to do it ...
... is by writing the code in C or some such compiled language; at least, I'd be willing to bet dollars to doughnuts that's the case. You, too, can do this with Inline::C! (Update: See also Inline::C::Cookbook.) In fact, the array-based approach in the code example above should, I think, convert very neatly to C. The learning curve for Inline::C is not too bad (assuming you know C!) and well worth the effort if you have a need for speed! (I need to brush up on Inline::C myself, so if I have some time later, I may play around with this.)
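For what it's worth, here is a rough sketch of how the array-based counting might look in plain C. It is only an illustration: the names count_bytes and print_report are made up for this example, it treats the input as raw bytes (no multi-byte characters), and it is not wrapped for Inline::C or benchmarked against anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: tally how often each byte value (0..255)
   occurs in buf[0 .. len-1]. counts must have room for 256 entries;
   it is zeroed before counting. */
void count_bytes(const unsigned char *buf, size_t len, uint64_t counts[256])
{
    assert(buf != NULL || len == 0);
    assert(counts != NULL);

    for (size_t i = 0; i < 256; ++i)
        counts[i] = 0;

    /* Same idea as ++$char_counts[ ord(...) ] in the Perl one-liner:
       the byte value itself is the array index. */
    for (size_t i = 0; i < len; ++i)
        ++counts[buf[i]];
}

/* Hypothetical helper: print one line per byte value that actually
   occurs, in roughly the same layout as the Perl printf. */
void print_report(const uint64_t counts[256], size_t len, FILE *out)
{
    for (int c = 0; c < 256; ++c) {
        if (counts[c] == 0)
            continue;   /* skip byte values that never appear */
        fprintf(out, "'%c' (0x%02x) == %llu (%6.3f%%)\n",
                c, c, (unsigned long long)counts[c],
                100.0 * (double)counts[c] / (double)len);
    }
}
```

To use it, read the file (or a chunk of it) into an unsigned char buffer, call count_bytes on the buffer and its length, then pass the resulting 256-element array to print_report. For a file too large to hold in memory you would need to accumulate counts across chunks instead of zeroing each time; that variation is left out of this sketch.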
In reply to Re^5: Computing the percentage of certain characters in a file
by AnomalousMonk
in thread Computing the percentage of certain characters in a file
by james28909