in reply to Removing repeated lines from file

IMO, the best way is not to use Perl. There's an excellent utility that does exactly this: it's called uniq, it comes standard with every flavour of Unix I've encountered, and it's also present in Unix toolkits for Windows.

uniq in_file out_file

Abigail

Re: Re: Removing repeated lines from file
by zby (Vicar) on Jun 24, 2003 at 12:55 UTC
    uniq works on sorted files only. The man page suggests that it removes consecutive duplicates, but its first sentence states that it removes repeating lines from sorted files.

    Update: As Abigail-II pointed out, according to POSIX the suggestion is right. Either way, it only works for consecutive duplicates.
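
    Where the duplicates are not adjacent, a small Perl filter with a hash of seen lines handles it. A minimal sketch (the script name and invocation are only illustrative):

        #!/usr/bin/perl
        # dedup.pl - print each distinct line once, keeping the first
        # occurrence and the original order; works on unsorted input.
        use strict;
        use warnings;

        my %seen;
        while (my $line = <>) {
            print $line unless $seen{$line}++;
        }

    Run it as perl dedup.pl in_file > out_file. Note that the %seen hash keeps one copy of every distinct line in memory, so for very large files with many distinct lines, sort followed by uniq may be the cheaper route.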

      uniq works on sorted files only. The man page suggests that it removes consecutive duplicates, but its first sentence states that it removes repeating lines from sorted files.

      Then either your man page lies, or your vendor uses a uniq implementation that's not conforming to the POSIX standard.

      From the POSIX 1003.1 standard:

      DESCRIPTION
      The uniq utility shall read an input file comparing adjacent lines, and write one copy of each input line on the output. The second and succeeding copies of repeated adjacent input lines shall not be written.

      Repeated lines in the input shall not be detected if they are not adjacent.
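
      A minimal Perl sketch of that adjacent-lines behaviour (an illustration only, not the actual uniq implementation):

          #!/usr/bin/perl
          # Print one copy of each run of identical adjacent lines,
          # mirroring what POSIX specifies for uniq.
          use strict;
          use warnings;

          my $previous;
          while (my $line = <>) {
              print $line if !defined $previous || $line ne $previous;
              $previous = $line;
          }

      Lines that repeat elsewhere in the file but are not adjacent are printed again, which is exactly the limitation discussed above.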

      Abigail

      My testing shows that uniq only removes duplicate lines when they are adjacent, as they are in sorted input.