fortunezhang has asked for the wisdom of the Perl Monks concerning the following question:
The input file has four tab-separated columns:

    node1   node2   weight  status
    23      34      -897    1
    24      46      -10     0

It is too large to fit in RAM, so I used the following code to tie it to a hash:

    use BerkeleyDB;
    use MLDBM qw(BerkeleyDB::Hash Storable);

    my %hash;
    my @seqRelation;
    $hash{'key'} = \@seqRelation;
    my $dbFile = '/tmp/relation.db';
    tie %hash, 'MLDBM', -Filename => $dbFile, -Flags => DB_CREATE or die $!;

    # read in the file content
    my $inFile = shift;
    open(IN,"zcat $inFile | ") or die "Can not open $inFile:$!";
    while(<IN>)
    {
        chomp;
        my @fields = split "\t";
        my ($seq1,$seq2,$w,$s) = @fields;
        push @{$seqRelation[$seq1]}, join(',', $seq2, $w, $s);
        # also store it in the other direction
        push @{$seqRelation[$seq2]}, join(',', $seq1, $w, $s);
    }
    ......

Here I used @seqRelation to store the data and then assigned it to the hash under the single key 'key', because BerkeleyDB::Hash can only be tied to a hash. The code above runs without warnings. However, I am confused by the size of the tied file /tmp/relation.db. When I checked it after the program finished, it was only 48K, but the original file is >4GB. That seems unbelievable (but maybe I am wrong, because I am not familiar with the mechanism of BerkeleyDB). Is this correct or normal? I expected a much larger /tmp/relation.db, and I have no idea why it is so small. I am worried that some data was lost when tying.

By the way, I also need to change the status values in my program.

Any help or idea is appreciated. Thank you in advance!

Best regards!
Zhenguo
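One way to check whether records actually reach the disk file is to store one serialized list per node and write it back after every change. The following is only a minimal sketch under several assumptions: it uses the core modules SDBM_File and Storable as a stand-in for BerkeleyDB::Hash with MLDBM, it keys the hash by node ID instead of the single key 'key', and the helper name add_edge is hypothetical. SDBM limits each record to roughly 1 KB, so it illustrates the fetch, modify, store-back pattern rather than a solution for a >4GB file.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;                   # core DBM, stand-in for BerkeleyDB::Hash (assumption)
use Storable qw(freeze thaw);    # the serializer named in the post
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
tie my %db, 'SDBM_File', "$dir/relation", O_RDWR | O_CREAT, 0666
    or die "Cannot tie: $!";

# Hypothetical helper: append one "neighbour,weight,status" entry to a node's record.
sub add_edge {
    my ($node, $other, $w, $s) = @_;
    my $raw  = $db{$node};                         # fetch the serialized record
    my $list = defined $raw ? thaw($raw) : [];     # work on an in-memory copy
    push @$list, join(',', $other, $w, $s);        # modify the copy
    $db{$node} = freeze($list);                    # store it back so it hits the file
}

# The two sample rows from the post, tab-separated.
for my $line ("23\t34\t-897\t1", "24\t46\t-10\t0") {
    my ($seq1, $seq2, $w, $s) = split /\t/, $line;
    add_edge($seq1, $seq2, $w, $s);
    add_edge($seq2, $seq1, $w, $s);                # also store the other direction
}

my $n       = scalar keys %db;                     # number of node records on disk
my $first23 = thaw($db{23})->[0];                  # first neighbour entry of node 23
print "$n records, node 23 -> $first23\n";
```

Changing a status value would follow the same pattern: thaw the node's record, edit the entry, and assign the frozen list back to the tied hash. Because each edge is stored in both directions, both node records would need the update.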
Replies are listed 'Best First'.

Re: Large files tied by BerkeleyDB with MLDBM
by Eliya (Vicar) on May 06, 2011 at 22:16 UTC
by Anonymous Monk on May 13, 2011 at 20:29 UTC