Use a hash, with your numbers as keys. That way, grepping through the entire array would become a simple hash lookup.
#!/usr/bin/perl

# 400,000 random numbers that act as the set to search in
my @array = map {int rand 1e6} 1..400000;

# create file with numbers to look up
open my $fh, ">", "in.txt" or die "$!";
for (1..1000000) {
    print $fh int rand 1e6, "\n";
}
close $fh;

# build the lookup table, with the numbers as keys
my %lookup_table;
$lookup_table{$_}++ for @array;

# one hash lookup per input line instead of a grep over @array
open my $in, "<", "in.txt" or die "$!";
while (<$in>) {
    my ($num) = m/^(\d+)/;
    print "$num, " if $lookup_table{$num};
}
close $in;

__END__

$ time ./757954.pl >out

real 0m4.141s
user 0m4.004s
sys 0m0.132s
(Memory requirement is approx. 100 MB, or 80 MB if you get rid of the map for the @array initialisation.)
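One way to read "get rid of the map" is to fill the hash directly, so that neither the temporary list built by map nor @array itself is ever held in memory. The following is only a minimal sketch under that assumption: build_lookup is a hypothetical helper name, and the counts simply mirror the numbers used in the script above. It makes no claim about the exact memory saved.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): fill the lookup
# table directly with $count random integers in the range 0 .. $max-1,
# without building an intermediate array first.
sub build_lookup {
    my ($count, $max) = @_;
    my %lookup_table;
    # a foreach over a range is iterated lazily, so no big list is created
    $lookup_table{ int(rand($max)) }++ for 1 .. $count;
    return \%lookup_table;
}

# same sizes as in the script above
my $lookup = build_lookup(400_000, 1e6);
printf "%d distinct keys\n", scalar keys %$lookup;
```

The lookup loop then stays the same, except that it tests $lookup->{$num} instead of $lookup_table{$num}.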
Update: with 300_000_000 rows, it takes about 15 min., which includes creating the 2 GB random data file "in.txt" plus writing a 760 MB output file. (The memory requirement is the same.)
In reply to "Re: searching through data" by almut, in thread "searching through data" by baxy77bax