in reply to Trying to remove duplicate rows using hashes

On a machine with a decent set of command line tools, you could do something like

sort -u file > sorted_file_without_duplicates

where file contains your data. Note that this also sorts the lines, so the original order is lost.

As kyle pointed out, this suggestion doesn't meet the OP's needs. Sorry, and please disregard.


Information about American English usage here and here. Floating point issues? Please read this before posting. — emc


Re^2: Trying to remove duplicate rows using hashes
by kyle (Abbot) on Oct 21, 2008 at 16:30 UTC

    The OP is not trying to remove identical lines, but rather lines in which two of the four fields match. In the example given, the removed lines differ in the first field, so "sort -u" would not remove them, because it only drops lines that are identical in their entirety.
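
    For anyone who still wants a command-line approach, here is a rough sketch of the difference. It assumes whitespace-separated fields and that fields 2 and 3 are the pair that defines a duplicate; the thread does not say which two fields the OP meant, so adjust to taste. The file name sample.txt, its contents, and the function name dedupe_by_fields are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data: four whitespace-separated fields per line.
# The first two lines differ only in field 1.
cat > sample.txt <<'EOF'
a1 foo bar 10
a2 foo bar 10
a3 baz qux 20
EOF

# sort -u compares whole lines, so all three lines survive here.
sort -u sample.txt

# Keep only the first line seen for each (field 2, field 3) pair.
# ASSUMPTION: fields 2 and 3 form the key; change $2 and $3 as needed.
dedupe_by_fields() {
    awk '!seen[$2, $3]++' "$1"
}

# With the sample above, this keeps the "a1" and "a3" lines and drops "a2".
dedupe_by_fields sample.txt
```

    Unlike sort, the awk version keeps the surviving lines in their original order, and it uses a hash (awk's associative array) in much the same way a Perl %seen hash would.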