in reply to Deleting duplicate lines from file

Uncomment the appropriate print statement to get the output order you need.

    use warnings;
    use strict;

    @ARGV or die "You need to supply a file name.\n";
    open my $fh, '<', shift or die "$!\n";

    my @lines = <$fh>;
    my %unique;
    @unique{@lines} = (1) x @lines;

    # unique lines
    #print keys %unique;

    # unique sorted lines
    #print sort keys %unique;

    # unique lines in order seen in original file
    do { print if delete $unique{$_} } for @lines;

Replies are listed 'Best First'.
Re^2: Deleting duplicate lines from file
by blazar (Canon) on Feb 17, 2006 at 13:15 UTC
    @unique{@lines} = (1) x @lines;

    Also

    @unique{@lines} = ();

    since you don't use the values anyway. Whatever, if he wants them in the original order, then slurping the whole file in at once is, as is commonly the case, overkill, and I would regard the usual print if !$seen{$_}++ technique as a superior solution. Of course, if one needs or may need sorting, then the slurping must take place in some form or another, and yours is just as good as any other. You probably already knew this; I'm just pointing out some details for the benefit of the OP...
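    The one-pass %seen technique mentioned above can be sketched roughly as follows (the variable names are illustrative; it reads the files named on the command line, or STDIN, line by line, with no slurping):

    use warnings;
    use strict;

    # Print each line only the first time it is seen.
    # $seen{$line}++ is false (undef/0) on first sight, true afterwards.
    my %seen;
    while (my $line = <>) {
        print $line unless $seen{$line}++;
    }

    Because it never holds the whole file in memory, it scales to arbitrarily large inputs, at the cost of giving up the sorted-output options.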

      Also

      @unique{@lines} = ();

      since you don't use the values anyway.

      Actually, it does need true values for each key or delete will return false. So, while it's true it doesn't use them, it does need them.

        Fair enough. I hadn't noticed. And in a certain sense it does use them. Whatever, since your proposed solution

        do { print if delete $unique{$_} } for @lines;

        although it works perfectly, is not that intuitive, and I seem to remember that deleting hash elements carries a real execution cost, I'd still stick with a %saw-like solution. But I admit that, all in all, yours is an interesting and smart use of delete.