Hi. A little detective work led me to your previous posting, which supplies some context for your question here. Yes, DBD::CSV will be slow with a file as large as the one you are dealing with. The fastest way to get your data from a CSV file into a quickly-searchable form is to use a database's bulk loading mechanism (for example, LOAD DATA INFILE with MySQL); a rough DBI sketch of that approach follows the BerkeleyDB example below. Using a real database would also simplify and speed up future updates and searches. If you absolutely must use BerkeleyDB instead of a database, then you can convert from CSV to BerkeleyDB with something like this:
#!/usr/bin/perl
use warnings;
use strict;
use BerkeleyDB;

my ( $csv_file, $berk_file ) = qw( dict.txt dict );

my $db = BerkeleyDB::Hash->new(
    -Filename => $berk_file,
    -Flags    => DB_CREATE,
) or die "Cannot open file '$berk_file': $! $BerkeleyDB::Error\n";

open( my $dicte, '<', $csv_file )
    or die "Cannot open file '$csv_file': $!\n";

# read line by line rather than slurping the whole (large) file into memory
while ( my $line = <$dicte> ) {
    chomp $line;

    # first field is the key, the rest of the line is the value
    my ( $key, $value ) = split /;/, $line, 2;
    $db->db_put( $key, $value );
}

close $dicte;

# the file "dict" is now a BerkeleyDB file holding the entire
# contents of the CSV file "dict.txt"
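If you do go with a real database instead, the MySQL load mentioned above can be driven from Perl through DBI. The sketch below is only a rough outline: it assumes a database named dict_db, an existing table dict with columns word and definition, and placeholder credentials, so adjust the DSN, names, and column list to your actual schema.

#!/usr/bin/perl
use warnings;
use strict;
use DBI;

# Assumed names: database "dict_db", table "dict (word, definition)",
# placeholder user/password -- change these to match your setup.
# mysql_local_infile=1 allows the client to send a local file to the server.
my $dbh = DBI->connect(
    'dbi:mysql:database=dict_db;mysql_local_infile=1',
    'user', 'password',
    { RaiseError => 1 },
);

# one statement bulk-loads the whole semicolon-separated file
$dbh->do( q{
    LOAD DATA LOCAL INFILE 'dict.txt'
    INTO TABLE dict
    FIELDS TERMINATED BY ';'
    LINES TERMINATED BY '\n'
    (word, definition)
} );

$dbh->disconnect;

Once the rows are loaded (and the key column indexed), lookups become simple SELECTs, and later updates don't require rebuilding a whole file.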