lihao has asked for the wisdom of the Perl Monks concerning the following question:
Hello, monks:
I am quite new to BerkeleyDB and would like some guidance from experienced developers on getting better performance before I dig further into the details.
So far I've built several BerkeleyDB files (ranging from 20MB to 1.5GB, using both BerkeleyDB::Hash and BerkeleyDB::Btree), and as of this writing they are all working well.
Most of my BerkeleyDB files are imported like the following sample code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use BerkeleyDB;

    my $db_file = '/path/to/lib/myapp.db';
    unlink $db_file if -f $db_file;    # start from an empty database

    my $bdb = tie my %tree, 'BerkeleyDB::Btree',
        -Filename => $db_file,
        -Flags    => DB_CREATE
        or die "cannot create $db_file: $! $BerkeleyDB::Error";

    my $raw_file = '/path/to/raw.dat';
    open my $fin, '<', $raw_file
        or die "cannot open $raw_file for reading: $!";

    while (<$fin>) {
        chomp;    # keep the trailing newline out of the stored value
        my ($key, $value) = split /\t/;
        next if not fit_condition($key);
        $bdb->db_put($key, $value);
    }
    close $fin;

    sub fit_condition {
        #.skip.#
    }
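Before tuning anything, it may help to know whether reading, filtering, or writing dominates the run time. The following is only a minimal sketch of the import loop factored into a helper that counts lines and reports throughput. The name `load_pairs`, its callback arguments, and the reporting interval are hypothetical and not part of the original script; skipping lines that contain no tab is also an assumption. It uses only core Perl, with a plain hash standing in for the database.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper (not from the original post): read tab-separated
# key/value lines from $fh, keep those whose key passes $keep->($key),
# and hand each kept pair to $store->($key, $value).
# Lines without a tab are skipped (an assumption).
# Returns (lines read, pairs stored, elapsed seconds).
sub load_pairs {
    my ($fh, $keep, $store, $report_every) = @_;
    $report_every ||= 1_000_000;
    my ($read, $stored) = (0, 0);
    my $start = time;

    while (my $line = <$fh>) {
        chomp $line;
        $read++;
        my ($key, $value) = split /\t/, $line, 2;
        next unless defined $value && $keep->($key);
        $store->($key, $value);
        $stored++;
        if ($read % $report_every == 0) {
            # Periodic progress line on STDERR.
            my $elapsed = time - $start;
            printf STDERR "%d read, %d stored, %.0f lines/s\n",
                $read, $stored, $elapsed > 0 ? $read / $elapsed : 0;
        }
    }
    return ($read, $stored, time - $start);
}

# Demo with an in-memory file and a plain hash standing in for the database.
my $sample = "a1\tfoo\nb2\tbar\na3\tbaz\n";
open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my %store;
my ($read, $stored) = load_pairs(
    $fh,
    sub { $_[0] =~ /^a/ },             # stand-in for fit_condition()
    sub { $store{ $_[0] } = $_[1] },   # stand-in for $bdb->db_put(...)
);
close $fh;
print "read=$read stored=$stored\n";   # prints "read=3 stored=2"
```

With the real database handle, the third argument would presumably be `sub { $bdb->db_put(@_) }`. Running once with a no-op store callback and once with the real one gives a rough split between parsing time and BerkeleyDB write time.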
For an input of about 50M key-value pairs, the import script above took about 10 hours to finish building the DB file (about 17M key-value pairs kept, 1.1GB in file size). My questions are:
Other information:
Thank you for any helpful suggestions or links.
lihao
Replies are listed 'Best First'.
Re: Questions about BerkeleyDB
by perrin (Chancellor) on Jun 13, 2008 at 20:37 UTC
Re: Questions about BerkeleyDB
by TGI (Parson) on Jun 13, 2008 at 23:12 UTC
Re: Questions about BerkeleyDB
by starbolin (Hermit) on Jun 14, 2008 at 08:20 UTC