Re: promoting array to a hash

If all you are doing is printing a list of unique words from stdin, why not save a lot of wasted code and do:


print "$_\n" for sort <> =~ /\b(\S+)\b(?!.*\b\1\b)/g

That is, use a negative lookahead to check that the word doesn't appear again later in the input, so only the last occurrence of each word matches. Saves you joining, splitting, grepping, and mapping :). I have not benchmarked it, though.
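To make the lookahead idea concrete, here is a minimal sketch run against an in-memory string rather than stdin. The sample text and the helper name `unique_words_lookahead` are made up for illustration, and it uses `\w+` and the `/s` flag (so `.` can cross newlines) instead of the exact pattern above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the sorted unique words of a string.
# A word is captured only when the negative lookahead confirms that the
# same word does not occur again later, so each word is captured once,
# at its last occurrence.
sub unique_words_lookahead {
    my ($input) = @_;
    # /g in list context returns every capture; /s lets "." match newlines.
    my @last_occurrences = $input =~ /\b(\w+)\b(?!.*\b\1\b)/sg;
    return sort @last_occurrences;
}

# Hypothetical sample text.
my $text = "the cat saw the dog and the cat ran";
print "$_\n" for unique_words_lookahead($text);
```

For the sample string this prints and, cat, dog, ran, saw, the, one per line. Note that the lookahead has to scan the remainder of the string for every candidate word.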


Replies are listed 'Best First'.
Re^2: promoting array to a hash by sleepingsquirrel (Chaplain) on Jun 14, 2004 at 17:10 UTC
Benchmarking is worthwhile in this instance. The regex backtracking in the lookahead rescans the rest of the input for every word, which turns an O(N log N) problem (assuming the sort dominates) into an O(N^2) one. Here's the result of applying the two algorithms to the Net-HOWTO (which is 100 times smaller than the data set I initially used).

greg@spark:~/test$ cat sleepingsquirrel
#!/usr/bin/perl

print "$_\n" for sort keys %{{map {$_,()} grep /^[a-z]+$/, (split /\s/, join(" ",<>))}};
greg@spark:~/test$ time sleepingsquirrel Net-HOWTO >words.txt

real    0m0.178s
user    0m0.158s
sys     0m0.016s

greg@spark:~/test$ cat jasper
#!/usr/bin/perl

$/=undef;
print "$_\n" for sort <> =~ /\b([a-z]+)\b(?!.*\b\1\b)/sg
greg@spark:~/test$ time jasper Net-HOWTO >words2.txt

real    1m8.477s
user    1m8.471s
sys     0m0.003s

...only about 385x slower by wall-clock time. YMMV
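For contrast with the quadratic lookahead, here is a minimal sketch of a single-pass, hash-based version. It is not the code that was timed above: the helper name `unique_words_hash` and the sample lines are hypothetical, and it swaps the join/split/map chain for the common `%seen` idiom while keeping the same `/^[a-z]+$/` filter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the sorted unique all-lowercase words
# found in a list of lines. Each word costs one hash store, so the scan
# is linear in the input and the final sort dominates.
sub unique_words_hash {
    my (@lines) = @_;
    my %seen;
    for my $line (@lines) {
        # split ' ' splits on runs of whitespace and skips leading blanks.
        for my $word (split ' ', $line) {
            next unless $word =~ /^[a-z]+$/;   # same filter as the grep in the post
            $seen{$word} = 1;
        }
    }
    return sort keys %seen;
}

# Hypothetical sample input; "The" is dropped by the lowercase-only filter.
my @sample = ("the cat saw the dog\n", "and The cat ran\n");
print "$_\n" for unique_words_hash(@sample);
```

To run it on real input, the sample lines could be replaced with `<>`, which reads the files named on the command line (or stdin if none are given).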
