in reply to Creating Dictionaries
Try this instead ...
    use strict;
    use warnings;

    my %hash;
    while ( my $line = <STDIN> ) {
        foreach my $word ( split( /[^a-zA-Z]+/, $line ) ) {
            my $len = length($word);
            $hash{ lc $word }++ if 2 <= $len && $len < 5;
        }
    }
    print $_ . "\n" for sort keys %hash;

comments/explanations:
- be sure to 'use strict' and 'use warnings' (hence the need for 'my $line' and 'my $word')
- note that lowercasing isn't done until last: each word is lowercased only when it is used as a hash key, not the whole line up front.
- replaced the m/^.?$/ regex with length (I assume that's what it was doing; length should be faster, and clearer, than a regex)
- removed the !~m/(\w)\1\1\1\1/ and replaced it with a length check for speed and clarity. (don't want 5+ letter words, right?) Update: as PerlMouse pointed out, I misread the regex. It excludes words with a letter repeated 5 or more times in a row, so the '$len < 5' check is not equivalent to it.
- removed the s/[^a-z ]+/ /g; and replaced it implicitly with the regex in the split(). It now splits on non-letters, which all go away, and you're left with words made of just letters. (A line that starts with a non-letter yields an empty first field, but the length check discards it.) Not having the substitution should help a lot speed-wise.
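
Given the update above, here is a minimal sketch of how the filter might look with the repeated-letter test put back in place of the '$len < 5' check. The sub name dictionary_words and the sample input are made up for illustration, and it assumes the original intent was "at least 2 letters, and no letter repeated 5 or more times in a row"; adjust it if the original post meant something else.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original thread): returns the sorted,
# distinct, lowercased words found in the given lines.
sub dictionary_words {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        # Splitting on runs of non-letters leaves only all-letter words,
        # plus a possible empty leading field that the length test rejects.
        for my $word ( split /[^a-zA-Z]+/, $line ) {
            my $lc = lc $word;
            next if length($lc) < 2;            # skip empty and 1-letter fields
            next if $lc =~ /([a-z])\1\1\1\1/;   # skip e.g. "zzzzz"
            $seen{$lc}++;
        }
    }
    my @words = sort keys %seen;
    return @words;
}

print "$_\n" for dictionary_words( "The cat -- the CAT! a zzzzzz", "Hello, hello" );
```

With that sample input it prints cat, hello and the, one per line. To read from STDIN as in the code above, pass the lines in directly: dictionary_words(<STDIN>).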