<nitpick> It's nice to avoid "use of map in void context", unnecessary copies of data, and "$counter++" where a simpler method (length()) will do. </nitpick>

    my $fil = shift;
    open (FIL, $fil) || die ("can't read $fil: $!");
    my $filstr = do { local $/; <FIL> };
    close FIL;

    my %charhash;
    $charhash{$_}++ for ( split //, $filstr );

    my $ent = entropy(\%charhash, length( $filstr ));
    printf ("file %s\ncontents entropy = %30.20f\n",$fil,$ent);

    sub entropy {
        my ($hashref, $total, $baselog) = @_;
        $baselog = 0.693147180559945 unless $baselog;  # log(2)
        return undef unless ( ref $hashref and $total > 0 );
        my $sum;
        $sum += $_ * (log($_)/$baselog) for ( map { $_/$total } values %$hashref );
        return -$sum;
    }
(updated to eliminate the unnecessary "@values" array; then updated a couple more times to fix details associated with removing @values.)
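For anyone who wants to sanity-check the numbers, here is a minimal sketch that runs the same calculation on a few short strings where the answer is easy to work out by hand. The string_entropy() name and the test strings are my own assumptions for illustration, not part of the code above. It counts characters straight into a hash and uses length() for the total, so no extra copy of the data and no counter variable are needed.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): Shannon entropy,
# in bits per character, of the characters in a string.
sub string_entropy {
    my ($str) = @_;
    my $total = length $str;            # no $counter++ needed
    return undef unless $total > 0;

    my %count;
    $count{$_}++ for split //, $str;    # "for" modifier instead of map in void context

    my $log2 = log(2);
    my $sum  = 0;
    for my $n ( values %count ) {
        my $p = $n / $total;            # relative frequency of one character
        $sum += $p * log($p) / $log2;
    }
    return -$sum;
}

# Two equally likely characters should come out at 1 bit, four at 2 bits.
printf "%-5s %.4f\n", $_, string_entropy($_) for qw(abab aab abcd);
```

The "aab" case works out by hand to -(2/3*log2(2/3) + 1/3*log2(1/3)), a little under 0.92 bits, which is a quick check that the frequencies are being divided by the right total.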
In reply to Re: file/language entropy calculator
by graff
in thread file/language entropy calculator
by wufnik