a way to do 'sort|uniq'

jch341277 has asked for the wisdom of the Perl Monks concerning the following question:

I've been thinking for a while that there should be a fairly simple way to do the same thing as the shell idiom:

$ cat file.txt|sort|uniq

Here's what I've come up with to sort a list of phone numbers:

my $l = 0;
map     { defined $_ and print "$_\n" }
        map { ($_->[1] ne $l)? $l=$_->[1] : undef }
        sort { $a->[1] <=> $b->[1] }
        map { [s/[\D\n]//g, $_] }
        <>;

But you could use it to sort strings just as easily. You probably don't want to use this to sort any really big files.
I'd be interested in knowing if anyone has found other ways to do this...

Update: After reviewing all the other ways to do this, I have to admit that I've been playing with Schwartzian transforms lately, which is why mine took this overly complicated form.
The one-liners are great. However, even after doing a Super Search for %_, I still can't figure out why @_{@telephones}=(); initializes %_ from the array @telephones.


Replies are listed 'Best First'.
Re: a way to do 'sort|uniq' by rnahi (Curate) on Aug 02, 2005 at 15:35 UTC
`perl -e 'print sort keys %{{map{$_=>1} <>}}' unsorted_redundant_file.txt`
See: Perl Idioms Explained - keys %{{map{$_=>1}@list}} in our Tutorials.
Re^2: a way to do 'sort|uniq' by xdg (Monsignor) on Aug 02, 2005 at 18:38 UTC
Perl Idioms Explained - keys %{{map{$_=>1}@list}} does caution that this may not be appropriate for large lists. If that's a concern with the input being processed, the same thing can be done line by line:
`perl -e '$seen{$_} = 1 while <>; print sort keys %seen' input.txt`
The article has some good commentary. For example, the approach above is the slowest. Assigning `$seen{$_} = undef` is faster, the grep approach is faster still, and the slice approach is apparently even faster.
-xdg
Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.
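For readers comparing the four variants mentioned above, here is a minimal sketch of each, written as subs over an in-memory list so they can be run side by side. The sub names are hypothetical, and the sketch makes no timing claims of its own; the speed ranking is the one reported above from the linked article.

```perl
use strict;
use warnings;

# Hypothetical helper names. Each returns the sorted unique items of its input.

# 1. Store 1 for every item (the line-by-line form above, applied to a list).
sub uniq_assign_one {
    my %seen;
    $seen{$_} = 1 for @_;
    my @sorted = sort keys %seen;
    return @sorted;
}

# 2. Store undef instead of 1.
sub uniq_assign_undef {
    my %seen;
    $seen{$_} = undef for @_;
    my @sorted = sort keys %seen;
    return @sorted;
}

# 3. grep with a post-incremented counter: an item passes only the first time.
sub uniq_grep {
    my %seen;
    my @unique = grep { !$seen{$_}++ } @_;
    my @sorted = sort @unique;
    return @sorted;
}

# 4. Hash slice: create every key in a single assignment.
sub uniq_slice {
    my %seen;
    @seen{@_} = ();
    my @sorted = sort keys %seen;
    return @sorted;
}

my @lines = ("c\n", "a\n", "b\n", "c\n", "a\n");
print uniq_slice(@lines);
```

All four build a hash that holds every distinct item, so memory use grows with the number of unique lines rather than the total number of lines.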
Re: a way to do 'sort|uniq' by merlyn (Sage) on Aug 02, 2005 at 15:40 UTC
`cat file.txt|sort|uniq`
Deleting the useless use of cat, and using a far-underused sort flag, I'd type that as:
`sort -u file.txt`
-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.
Re^2: a way to do 'sort|uniq' by ikegami (Patriarch) on Aug 02, 2005 at 15:48 UTC
That search doesn't result in anything particularly useful. The first match, the one which looks the most promising and the one which refers to you by name, is a broken link. A cached version is readable. It would be nice if you provided a link to your document instead of having us wade through broken links trying to figure out what you mean.
Re^2: a way to do 'sort|uniq' by tlm (Prior) on Aug 03, 2005 at 03:48 UTC
...the useless use of cat...
What do we have here?! It sure looks like a useless use of "use of"¹. What's wrong with "the useless `cat`"? :-)
¹Or, if I'm to practice what I preach, a useless "use of".
the lowliest monk
Re^3: a way to do 'sort|uniq' by merlyn (Sage) on Aug 03, 2005 at 06:47 UTC
That's actually part of the joke, that the "award" itself is broken in a way similar to the item about which it complains.
-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.
Re^4: a way to do 'sort|uniq' by tlm (Prior) on Aug 03, 2005 at 11:12 UTC
Re^2: a way to do 'sort|uniq' by jch341277 (Sexton) on Aug 02, 2005 at 16:30 UTC
I was generalizing the construct - I usually do something like:
`$ tail +2 file.txt|tr -dc '[0-9\n]'|sort|uniq -c|sort -rnk 1`
I realize that the 'cat' would be useless in the simplified example that I gave, but the point was I wanted a way to do that same type of thing all in one step with perl.
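One way the pipeline above might look in a single Perl step is sketched below. The sub name `count_numbers` and the sample data are hypothetical. Two details are assumptions rather than exact equivalents: ties between equal counts are broken by plain string order, and the literal `[` and `]` that the `tr -dc` set would keep are dropped here.

```perl
use strict;
use warnings;

# Hypothetical helper: takes raw lines and returns "count number" strings,
# most frequent first.
sub count_numbers {
    my @lines = @_;
    shift @lines;                            # like `tail +2`: skip the first line
    my %count;
    for my $line (@lines) {
        (my $digits = $line) =~ tr/0-9//cd;  # delete everything except digits
        $count{$digits}++;                   # like `sort | uniq -c`
    }
    # like `sort -rn` on the count; ties broken by string order (an assumption)
    my @keys = sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;
    return map { "$count{$_} $_" } @keys;
}

my @sample = ("header line\n", "555-1234\n", "(555) 9876\n", "555 1234\n");
print "$_\n" for count_numbers(@sample);
```

To run it over a file instead of the sample list, the lines could be read with `<>` and passed to the sub in the same way.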
Re: a way to do 'sort|uniq' (efficient) by tye (Sage) on Aug 02, 2005 at 18:26 UTC
To be efficient, it is something that the sorting algorithm should support itself. Hence "sort -u" existing despite "sort | uniq" working (if you don't run out of resources). So this option would be a nice addition to sort.pm.
- tye
Re: a way to do 'sort|uniq' by Roy Johnson (Monsignor) on Aug 02, 2005 at 17:56 UTC
Another way:

    use warnings;
    use strict;

    my %seen;
    print sort grep !$seen{$_}++, <DATA>
    __DATA__
    c
    c
    b
    c
    b
    b
    a
    c
    a
    b

Caution: Contents may have been coded under pressure.
Re: a way to do 'sort|uniq' by sh1tn (Priest) on Aug 02, 2005 at 16:43 UTC
`@_{@telephones}=(),print+join$/,sort{ $a <=> $b }keys%_;`
Re^2: a way to do 'sort|uniq' by jch341277 (Sexton) on Aug 02, 2005 at 18:35 UTC
sh1tn - I don't understand how this works:
`@_{@telephones}=(),print+join$/,sort{ $a <=> $b }keys%_;`
I hope you're not disinclined to explain?
Re^3: a way to do 'sort|uniq' by sh1tn (Priest) on Aug 02, 2005 at 19:12 UTC

    @_{@telephones}=();         # %_ hash from @telephones
    print+                      # print the
    join $/,                    # joined with "\n" separator
    sort{ $a <=> $b }keys%_;    # list which comes from %_ keys
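The first line is a hash slice: in `@_{@telephones}` the curly braces mean the variable is the hash named `_`, and the `@` sigil means several of its elements are addressed at once. Assigning the empty list to that slice creates one key per element of `@telephones`, each with an undefined value, and duplicates collapse because a hash key can exist only once. The sketch below shows the same mechanism with a lexical hash `%seen` standing in for `%_`; the sample numbers are made up.

```perl
use strict;
use warnings;

my @telephones = (5559876, 5551234, 5559876, 5550000, 5551234);

my %seen;                  # named lexical hash standing in for %_
@seen{@telephones} = ();   # hash slice: one key per element, every value undef

# Duplicate keys collapsed during the assignment, so the keys are unique.
my @unique = sort { $a <=> $b } keys %seen;

print join("\n", @unique), "\n";
```

Here `exists $seen{5551234}` is true while `defined $seen{5551234}` is false, which is all that deduplication needs.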
Re: a way to do 'sort|uniq' by Anonymous Monk on Aug 02, 2005 at 16:45 UTC
If the Perl equivalent of a shell one-liner is 6 lines, I'd use `system` to shell out. However, your Perl version isn't equivalent: it removes anything that's not a digit. (`s/[\D\n]//g` leaves plenty of room for improvement. I'd write it as `tr/0-9//cd`, or if you insist on a substitution, `s/[\D]+//g`.) If I were to do it in Perl, I'd remove the duplicates first (using a hash), then sort. If there are a lot of duplicates, this ought to win (although in modern Perls, sorting with many duplicates is fast).
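A minimal sketch of the order of operations suggested above: strip non-digits with `tr/0-9//cd`, deduplicate through a hash, and only then sort the smaller list. The sub name `unique_sorted_numbers` is hypothetical, and skipping lines that contain no digits is an assumption, not something the reply specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the distinct digit strings in numeric order.
sub unique_sorted_numbers {
    my %seen;
    for my $line (@_) {
        (my $digits = $line) =~ tr/0-9//cd;  # delete every non-digit, newline included
        next unless length $digits;          # assumption: ignore lines without digits
        $seen{$digits} = undef;              # deduplicate before sorting
    }
    my @sorted = sort { $a <=> $b } keys %seen;
    return @sorted;
}

my @sample = ("555-1234\n", "555 0000\n", "5551234\n", "\n");
print "$_\n" for unique_sorted_numbers(@sample);
```

Because the sort only sees the distinct values, its cost depends on the number of unique numbers rather than on the total number of input lines.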

