
Re: Creating Dictionaries

by davidrw (Prior)
on Dec 16, 2005 at 13:59 UTC (#517241)

in reply to Creating Dictionaries

Try this instead ...
use strict;
use warnings;

my %hash;
while (my $line = <STDIN>) {
    foreach my $word ( split( /[^a-zA-Z]+/, $line ) ) {
        my $len = length($word);
        $hash{lc $word}++ if 2 <= $len && $len < 5;
    }
}
print $_."\n" for sort keys %hash;
  • be sure to 'use strict' and 'use warnings' (hence the need for 'my $line' and 'my $word').
  • note that lowercasing isn't done until last, when the word is used as the hash key.
  • replaced the m/^.?$/ regex with a length check (I assume it was rejecting words of fewer than two characters; length should be faster, and clearer, than a regex).
  • removed the !~m/(\w)\1\1\1\1/ and replaced it with a length check for speed and clarity (don't want 5+ letter words, right?). Update: as PerlMouse pointed out, I misread the regex: it excludes words with a letter repeated 5 or more times in a row, so the $len < 5 test above is not equivalent to it.
  • removed the s/[^a-z ]+/ /g; and replaced it implicitly with the regex in the split(). It now splits on non-letters, which all go away, and you're left with words made of just letters. Not having the substitution should help a lot speed-wise.
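Given the Update above, here is a minimal sketch of how the original repeated-letter filter could be kept alongside the split() approach. The helper names keep_word and build_dictionary and the sample lines are hypothetical (not from the original post), and the sample array merely stands in for STDIN so the snippet runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): true when a word
# should go into the dictionary.
sub keep_word {
    my ($word) = @_;
    return 0 if length($word) < 2;              # drop empty fields and single letters
    return 0 if $word =~ /([a-z])\1\1\1\1/i;    # drop words with a letter repeated 5+ times in a row
    return 1;
}

# Hypothetical helper: collect the distinct lowercased words from a list of lines.
sub build_dictionary {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        # Splitting on runs of non-letters leaves only letter-only words.
        for my $word ( split /[^a-zA-Z]+/, $line ) {
            $seen{ lc $word }++ if keep_word($word);
        }
    }
    my @words = sort keys %seen;
    return @words;
}

# Sample input (an assumption for illustration) in place of STDIN.
my @sample = ( "The quick brown fox, a Fox!", "zzzzzz aaaaab I am here" );
print "$_\n" for build_dictionary(@sample);
```

To run it against real input you could call build_dictionary(<STDIN>) instead, though that reads every line into memory first; the while loop in the code above avoids that, which may matter for large files.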
