use strict;
use warnings;

use Benchmark qw(cmpthese);

our @lines = <DATA>;
chomp @lines;

our $rsHashSlice = sub {
    my %uniques;
    @uniques{ @lines } = ();
    my @sorted;
    push @sorted, $_ for sort keys %uniques;
    return @sorted;
};

our $rsSeen = sub {
    my %seen;
    my @sorted;
    foreach ( @lines ) {
        push @sorted, $_ unless $seen{$_}++;
    }
    return @sorted;
};

cmpthese(100000, { HashSlice => $rsHashSlice, Seen => $rsSeen });

__END__
black
black
black
black
black
black
black
black
black
black
black
black
black
black
black
black
blue
blue
blue
blue
blue
blue
blue
blue
blue
green
green
green
green
green
green
green
green
green
green
grey
grey
grey
grey
iolet
mauve
mauve
mauve
mauve
mauve
mauve
mauve
mauve
pink
pink
pink
pink
pink
purple
purple
purple
red
red
red
red
red
red
red
red
violet
violet
violet
violet
violet
violet
violet
violet
violet
white
white
white
white
white
white
white
yellow
yellow
yellow
yellow
yellow
yellow
yellow
produces

                Rate      Seen HashSlice
    Seen     18939/s        --      -34%
    HashSlice 28571/s      51%        --
I have returned a list in each case, as that seems closer in essence to the print in the OP than the list reference I would normally use.
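For reference, the list-reference variant mentioned above might look something like this (a minimal sketch, with the subroutine and variable names my own invention, not from the benchmark code):

```perl
# Same hash-slice de-duplication, but returning an array reference
# rather than a list, avoiding a copy of the result.
my $rsHashSliceRef = sub {
    my %uniques;
    @uniques{ @lines } = ();            # slice assignment: keys become unique lines
    return [ sort keys %uniques ];      # hand back a reference to the sorted result
};

my $raSorted = $rsHashSliceRef->();
print "$_\n" for @{ $raSorted };
```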
Cheers,
JohnGG
In reply to Re^3: removing duplicate lines
by johngg
in thread removing duplicate lines
by Anonymous Monk