Dear fellow Monks,

The Unix uniq command only removes adjacent identical lines. I was asked about a command that removes duplicates that are not adjacent, without sorting the file.

"No problem," I said, "a 5-line Perl script and you're done." But later it occurred to me: why 5 lines? This will do the job:

cat file1 file2 | perl -ne 'print if !(defined($foo{$_})); $foo{$_} = +1;' > out_file

So, how about making it even shorter?

Note: my version is ungolfed for clarity.
Note 2: golf only the Perl code between the single quotes, not the shell wrapper.

Re: (Golf) unique
by broquaint (Abbot) on Apr 24, 2003 at 15:14 UTC
           01 23456789012345678
    perl -ne '$_{$_}++ or print' < infile > outfile
    That's 18.

    _________
    broquaint

      In the same manner even:
             01 234567890123
      perl -ne "$$_++||print" < infile > outfile

        Very nice. ++

        But it will fail if the infile looks like this: ;-)

        $ cat infile
        here
        comes
        trouble
        1
        2

        --
        John.

        Here's a version which shouldn't rely on 5.8:

               01 2345678901234567
        perl -ne '${_.$_}++||print'

        -sauoq
        "My two cents aren't worth a dime.";
        
Re: (Golf) unique
by jmcnamara (Monsignor) on Apr 24, 2003 at 16:18 UTC

    Same count as broquaint:
    perl -ne 'print if!$h{$_}++' file1 file2 ... > newfile
    Shorter (16):
    perl -pe '$_=$,if$,{$_}++' file1 file2 ... > newfile

    --
    John.

      perl -pe '$_ x=!$h{$_}++' < infile > outfile

      Jasper
Re: (Golf) unique (and the winner is...)
by sauoq (Abbot) on Jun 04, 2003 at 07:47 UTC

    A combination of both insensate's and Jasper's gems.

           01 23456789012
    perl -pe '$_ x=!$$_++'

    Update: This is, unfortunately, version-specific. It works on 5.8 but not on earlier versions. See the short discussion starting with jmcnamara's reply to insensate. Another variation, which shouldn't be version-specific and which ties Jasper's:

           01 234567890123456
    perl -pe '$_ x=!${_.$_}++'

    -sauoq
    "My two cents aren't worth a dime.";