I have an idiom I use a lot to do quick searches through log files or data streams, or to do a "uniq" on a file without having to sort it.
Using a hash to keep track of what has already gone by is quite common. Even the name you used is seen everywhere: %seen. You can, however, make it a bit faster. Right now you increment $seen{$1} every time, although you don't really care whether it's 1 or 2 or 3, as long as it's not 0 (undef).
It's not great for your golf score, but it can be a lot faster!

    perl -ne'$seen{$_}++, print unless $seen{$_}'

Golfed:

    #        123456789_12345
    perl -pe'$_=""if$s{$_}++'

Or, using evil symbolic references:

    #        123456789_12
    perl -pe'$_=""if$$_++'

(Can break when the last line has no trailing linefeed and equals the name of a special (scalar) variable ;))
U28geW91IGNhbiBhbGwgcm90MTMgY
W5kIHBhY2soKS4gQnV0IGRvIHlvdS
ByZWNvZ25pc2UgQmFzZTY0IHdoZW4
geW91IHNlZSBpdD8gIC0tIEp1ZXJk
In reply to Re: Better "uniq" idiom?
by Juerd
in thread Better "uniq" idiom?
by RMGir