Update: Demo code:
    use strict;
    use warnings;
    use DBI;
    require DBD::SQLite;

    unlink 'track.db';

    my $db = DBI->connect(
        'dbi:SQLite:dbname=track.db', '', '',
        { RaiseError => 1, AutoCommit => 1 },
    );

    $db->do(<<EOSQL);
    CREATE TABLE tracking (
        fname varchar(255),
        fext  varchar(255)
    )
    EOSQL

    my $count_query = $db->prepare(<<EOSQL);
    SELECT COUNT(*) FROM tracking WHERE fname=? AND fext=?
    EOSQL

    my $insert_query = $db->prepare(<<EOSQL);
    INSERT INTO tracking (fname, fext) VALUES(?,?)
    EOSQL

    open(my $fh, "<", "input.txt") or die "cannot open < input.txt: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($fname, $fext) = split(' ', $line);
        $count_query->execute($fname, $fext);
        my ($count) = $count_query->fetchrow_array;
        $count_query->finish;
        if (!$count) {
            print "$fname $fext\n";
            $insert_query->execute($fname, $fext);
        }
    }
    $db->disconnect;
Yeah, it's longer, but you are going to have to do work on disk if you can't hold it in memory. Feel like implementing your own Merge sort instead?
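For what it's worth, you don't have to write that merge sort yourself: the system sort utility already does an external merge sort on disk for inputs too big for memory, and its -u flag dedupes in the same pass. A minimal sketch (input.txt and the output filename are assumptions; note the output comes back sorted rather than in first-seen order, unlike the DBI version above):

```shell
# Dedupe "fname fext" pairs without holding them all in memory.
# sort(1) spills to temp files on disk for large inputs; -u keeps
# one copy of each distinct line. Output order is sorted, not
# first-seen order.
sort -u input.txt > unique.txt
```

If first-seen order matters, the SQLite approach above (or a decorate-sort-undecorate pass) is still the way to go.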
#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.
In reply to Re: Large file, multi dimensional hash - out of memory
by kennethk
in thread Large file, multi dimensional hash - out of memory
by Anonymous Monk