ultibuzz has asked for the wisdom of the Perl Monks concerning the following question:
Hello Monks,
I need to check arrays for duplicates. The array size ranges from 5 million to 1 billion elements.
I use the following code:
    if ($file =~ $spec_text) {
        my $file_date = (split(/\./, $file))[3];
        open(IN, '<', $file) or die("open failed: $!");
        my @rows;
        while (<IN>) {
            chomp;
            my @eles = split(";", $_);
            push @rows, $eles[0] . ";" . $eles[1] . ";" . $file_date;
        }
        print scalar(@rows), "\n";
        my @non_dupe_rows = do { my %seen; grep !$seen{$_}++, @rows };
        print scalar(@non_dupe_rows), "\n";
    }
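The code above first stores every row in @rows and only then filters it through %seen, so both the full list and the hash are in memory at once. One possible variation, sketched here under assumptions, is to check %seen while reading, so duplicate rows are never stored. The helper name unique_rows, the sample data, and the date string are hypothetical and not part of the original post; an in-memory filehandle stands in for the real file.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): reads "a;b;..." lines
# from an open filehandle and returns the number of rows read plus a
# reference to the unique "a;b;date" keys in first-seen order.
sub unique_rows {
    my ($fh, $file_date) = @_;
    my (%seen, @unique);
    my $total = 0;
    while (my $line = <$fh>) {
        chomp $line;
        # A limit of 3 stops splitting once the two needed fields are found.
        my ($first, $second) = split /;/, $line, 3;
        my $key = join ';', $first, $second, $file_date;
        $total++;
        # Keep the key only the first time it is seen.
        push @unique, $key unless $seen{$key}++;
    }
    return ($total, \@unique);
}

# Usage with an in-memory filehandle standing in for a real data file.
my $sample = "a;b;x\na;b;y\nc;d;z\n";
open my $in, '<', \$sample or die "open failed: $!";
my ($total, $unique) = unique_rows($in, '20070731');
close $in;
print "$total\n", scalar(@$unique), "\n";
```

This still keeps one hash entry per distinct row, so at the upper end of the stated range (around a billion elements) the hash itself may not fit in memory. The sketch only removes the intermediate copy of the duplicates.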
Replies are listed 'Best First'.
Re: how to speed up dupe checking of arrays
by wind (Priest) on Jul 31, 2007 at 09:57 UTC
by ultibuzz (Monk) on Jul 31, 2007 at 10:25 UTC
by oha (Friar) on Jul 31, 2007 at 10:46 UTC

Re: how to speed up dupe checking of arrays
by FunkyMonk (Bishop) on Jul 31, 2007 at 10:23 UTC
by ultibuzz (Monk) on Jul 31, 2007 at 10:43 UTC

Re: how to speed up dupe checking of arrays
by mjscott2702 (Pilgrim) on Jul 31, 2007 at 12:35 UTC
by mjscott2702 (Pilgrim) on Jul 31, 2007 at 12:40 UTC
by ultibuzz (Monk) on Jul 31, 2007 at 13:21 UTC

Re: how to speed up dupe checking of arrays
by radiantmatrix (Parson) on Jul 31, 2007 at 14:45 UTC
by ultibuzz (Monk) on Jul 31, 2007 at 14:58 UTC