in reply to Better "uniq" idiom?
I have an idiom I use a lot to do quick searches through log files or data streams, or to do a "uniq" on a file without having to sort it.
Using a hash to keep track of what has already gone by is quite common. Even the name you used is seen everywhere: %seen. You can, however, make it a bit faster. You're now incrementing $seen{$1} every time, while you don't really care whether it's 1 or 2 or 3, as long as it's not 0 (undef).
The version below is not great for your golf score, but it can be a lot faster, because it only writes to the hash the first time a line is seen:

    perl -ne'$seen{$_}++, print unless $seen{$_}'

If golf is the goal, the usual form is shorter:

    # 123456789_12345
    perl -pe'$_=""if$s{$_}++'

Or, using evil symbolic references:

    # 123456789_12
    perl -pe'$_=""if$$_++'

(This can break when the last line has no trailing linefeed and equals the name of a special (scalar) variable ;))
U28geW91IGNhbiBhbGwgcm90MTMgY
W5kIHBhY2soKS4gQnV0IGRvIHlvdS
ByZWNvZ25pc2UgQmFzZTY0IHdoZW4
geW91IHNlZSBpdD8gIC0tIEp1ZXJk
Replies are listed 'Best First'.

Re: Re: Better "uniq" idiom?
by Sidhekin (Priest) on Mar 21, 2002 at 15:14 UTC
by Juerd (Abbot) on Mar 21, 2002 at 22:09 UTC

Re: Re: Better "uniq" idiom?
by RMGir (Prior) on Mar 20, 2002 at 17:21 UTC