in reply to how to remove duplicate based on the first column value

Here's one way.
use strict;
use warnings;

my %seen;
while ( <DATA> ) {
    # Print a line only the first time its first |-separated field is seen.
    print unless $seen{(split /\|/)[0]}++;
}

__DATA__
123|abc
123|cde
234|efg
456|hij

-enlil

Re: Re: how to remove duplicate based on the first column value
by EvdB (Deacon) on Jun 10, 2003 at 15:17 UTC
    This could also be done as the following one-liner:
    perl -e' while(<>) {print unless $s{(split /\|/)[0]}++;}' < infile > outfile
    where infile is a file with the values to parse and outfile is where you want the results.
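
To make the effect concrete, here is a minimal sketch that runs the same filter on the four sample lines from the parent post's __DATA__ block. The temporary directory and the `workdir` variable are assumptions made for illustration only; `infile` and `outfile` are the names used above.

```shell
# Build a small input file from the sample data in the parent post.
# (workdir is a hypothetical scratch directory, not part of the original post.)
workdir=$(mktemp -d)
printf '%s\n' '123|abc' '123|cde' '234|efg' '456|hij' > "$workdir/infile"

# Keep only the first line seen for each value of the first |-separated field.
perl -e 'while (<>) { print unless $s{(split /\|/)[0]}++ }' \
    < "$workdir/infile" > "$workdir/outfile"

cat "$workdir/outfile"
# 123|abc
# 234|efg
# 456|hij
```

The second 123 line is dropped because the post-increment on the hash entry returns a true value once that key has already been counted.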

    Indeed:

    perl -i.bak -e' while(<>) {print unless $s{(split /\|/)[0]}++;}' infile
    will edit infile in situ (putting a backup in infile.bak).

    Update: I have assumed that you are using a shell with redirection, such as bash. I have been told off for this sort of assumption before, so it is best to make it clear.

    --tidiness is the memory loss of environmental mnemonics

      Another nice Perl command-line switch is -n, which wraps your code in the while(<>) loop for you. Example:

      perl -i.bak -ne 'print unless $s{(split /\|/)[0]}++' infile

      Note: more of these can be found in perlrun.