PerlMonks  

Re (tilly) 1: == uniq

by tilly (Archbishop)
on Feb 13, 2001 at 08:27 UTC ( [id://58079]=note )


in reply to == uniq

You could modify these implementations.

A related problem I once had fun with: process a file and print each line only once (unlike uniq, with no assumption that the input is sorted). That isn't quite what you wanted, but here is the fastest version I could come up with in Perl:

perl -ne 'print if 1 == ++$s{$_}' < in > out
(Yes, details like where to place the plusses mattered.)
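
The one-liner is terse, so it may help to see the same idea spelled out. This is only a minimal sketch, not tuned for speed; the sub name first_seen and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post): returns only the first
# occurrence of each line, preserving input order.
sub first_seen {
    my @lines = @_;
    my %seen;    # line => number of times seen so far
    # Preincrement yields the new count, so the test is true only
    # the first time a given line shows up.
    return grep { 1 == ++$seen{$_} } @lines;
}

# Illustrative data; note that the duplicates are not adjacent.
my @in = ("b\n", "a\n", "b\n", "c\n", "a\n");
print first_seen(@in);    # prints b, a, c, one per line
```

Because the hash remembers every line it has seen, the input does not need to be sorted, at the cost of memory proportional to the number of distinct lines.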

(dkubb) Re: (2) Printing unique lines in a file
by dkubb (Deacon) on Feb 13, 2001 at 13:25 UTC

    Here's the fastest and shortest way I could find to process an input file, and print the unique lines inside it:

    perl -ne '$s{$_} ||= print' < in > out

    Explanation: The ||= operator has a behaviour called "short-circuit". If the left-hand side of the operation evaluates to true, the operation will short-circuit and stop - essentially the right-hand side will be ignored and never evaluated. But if the left side is false, then whatever the right-hand side evaluates to will be assigned to the left.

    In this case, we are processing each line from a file called "in". The first time we see $s{$_} it will evaluate to false, causing the right-hand side to be evaluated. That prints $_ and returns 1, which is then assigned to $s{$_}.

    The next time we see $s{$_} inside the loop, it is true, so the ||= never again runs the right-hand side operation. This results in the above code only printing a line the first time it's seen, and no more.
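
To watch the short-circuit happen, here is a small sketch that swaps print for a hypothetical sub, emit, which counts how often it is called. The names emit, $calls and @out, and the sample data, are assumptions for illustration only.

```perl
use strict;
use warnings;

my %s;            # same role as %s in the one-liner
my $calls = 0;    # how many times the right-hand side ran
my @out;          # stands in for the output file

# Hypothetical stand-in for print: records the line and returns
# a true value, as a successful print would.
sub emit {
    my ($line) = @_;
    $calls++;
    push @out, $line;
    return 1;
}

for my $line ("a\n", "b\n", "a\n", "a\n") {
    # emit() runs only while $s{$line} is still false.
    $s{$line} ||= emit($line);
}

print @out;                       # a and b, each once
print "emit ran $calls times\n";  # emit ran 2 times
```

One consequence of relying on the return value: if print ever returned false (for example on a write error), $s{$_} would stay false and the line would be tried again the next time it appeared.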

Re: Re (tilly) 1: == uniq
by MeowChow (Vicar) on Feb 13, 2001 at 08:54 UTC
    can't resist...
    perl -ne '!$s{$_}++ && print' < in > out
       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print
      That executes more slowly. Postincrement has to hand back the old value, so you waste time creating a temporary copy on every line, which preincrement avoids.
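
Speed claims like this are easy to check locally. Below is a rough sketch using the core Benchmark module to compare the three variants from this thread. The sub names, the made-up input, and the iteration count are all assumptions, and each sub counts first occurrences instead of printing so that I/O does not dominate the timing. Results will vary with the Perl version and the data, so treat them as local evidence only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up input: 10_000 lines drawn from 1_000 distinct values.
my @lines = map { ($_ % 1000) . "\n" } 1 .. 10_000;

# Preincrement variant: 1 == ++$s{$_}
sub pre_inc {
    my %s;
    my $n = 0;
    for (@lines) { $n++ if 1 == ++$s{$_} }
    return $n;
}

# Postincrement variant: !$s{$_}++
sub post_inc {
    my %s;
    my $n = 0;
    for (@lines) { $n++ if !$s{$_}++ }
    return $n;
}

# ||= variant: ++$n is always true, so it plays the part of print.
sub or_assign {
    my %s;
    my $n = 0;
    for (@lines) { $s{$_} ||= ++$n }
    return $n;
}

# Run each sub 50 times and print a comparison table.
cmpthese(50, {
    pre_inc   => \&pre_inc,
    post_inc  => \&post_inc,
    or_assign => \&or_assign,
});
```

With so few iterations Benchmark may warn that the counts are unreliable; raising the count, or passing a negative number to run for that many CPU seconds, gives steadier figures.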
