PerlMonks  

(Ovid) Re: Letter frequencies

by Ovid (Cardinal)
on Nov 24, 2000 at 03:51 UTC


in reply to Letter frequencies

I don't know of any such resource, but perhaps writing a Perl script to figure that out would be the way to go? Have it read plenty of text in your target language and calculate the frequency of each symbol.

You'd have to ensure that you're reading plaintext, though, and not markup or something like that. The following would populate %symbol with a frequency count. You'd just pass it a list of files on the command line. What you'd do with the data from there would be up to you.

while ($line = <>) { $symbol{ $_ }++ for ( split //, $line ); }
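
For anyone who wants to run the same idea under strict and warnings, here is a minimal sketch. The helper name count_symbols is hypothetical, and dropping the line ending with chomp is an assumption on my part (the one-liner above counts newlines too).

```perl
use strict;
use warnings;

# Hypothetical helper: count every character in the given lines.
# Returns a reference to a hash of symbol => count.
sub count_symbols {
    my @lines = @_;
    my %symbol;
    for my $line (@lines) {
        # Assumption: line endings are not of interest, so strip them.
        chomp(my $text = $line);
        # split // breaks the string into individual characters.
        $symbol{$_}++ for split //, $text;
    }
    return \%symbol;
}

# Fixed sample input so the sketch runs without any files.
my $counts = count_symbols("hello\n", "world\n");
print "$_: $counts->{$_}\n" for sort keys %$counts;
```

To process files named on the command line instead, something like count_symbols(<>) should work, at the cost of reading all input into memory first.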

Cheers,
Ovid

Update: mdillon had a good point. Here's a rewrite:

while ($line = <>) { for ( split //, $line ) { $symbol{ $_ }++; $total++; } }
Or, the fun method: use my first code and add the following after the loop:
$total = eval (join '+', values %symbol); # :)

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Re: (Ovid) Re: Letter frequencies
by mdillon (Priest) on Nov 24, 2000 at 03:56 UTC
    i would add a running total of all characters seen as well, to facilitate calculating the actual frequencies, not just the counts.

    update: what MJD said (infra).

      mdillon says:
      > i would add a running total of all characters seen as well
      While I agree with you that such a total would be useful, it's clearly much cheaper to compute it at the end:
      for my $symbol (keys %symbol) { $total += $symbol{$symbol}; }
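
      Once the total is known, turning counts into actual frequencies is a short step. A rough sketch, assuming a %symbol hash of counts like the one built earlier in the thread; the helper name frequencies and the sample counts are made up for illustration, and List::Util's sum stands in for the explicit loop:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: turn raw counts into relative frequencies.
# Takes a flat list of symbol => count pairs, returns a hash reference.
sub frequencies {
    my %symbol = @_;
    # sum returns undef for an empty list, so fall back to 0.
    my $total = sum(values %symbol) || 0;
    return {} unless $total;    # avoid dividing by zero
    return { map { $_ => $symbol{$_} / $total } keys %symbol };
}

my %symbol = (e => 5, t => 3, a => 2);    # made-up counts
my $freq   = frequencies(%symbol);

# Print the most frequent symbols first, ties broken alphabetically.
printf "%s %.3f\n", $_, $freq->{$_}
    for sort { $freq->{$b} <=> $freq->{$a} || $a cmp $b } keys %$freq;
```

      With the sample counts this prints e 0.500, t 0.300 and a 0.200, one per line.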
