Creating an SQLite DB is not that hard. I haven't tested this code, but this is at least the general idea.
#!/usr/bin/perl -w
use strict;
use DBI;
my $dbfile = "./DNA.sqlite";
print "Creating new DNA Database\n";
if (-e $dbfile) { unlink $dbfile or die "Delete of $dbfile failed! $!\n"; }
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","",{RaiseError => 1})
or die "Couldn't connect to database: " . DBI->errstr;
$dbh->do ("CREATE TABLE dna
( id integer PRIMARY KEY AUTOINCREMENT,
protein varchar(10) DEFAULT '',
sequence varchar(1000) DEFAULT ''
);
");
$dbh->do('PRAGMA synchronous = 0'); # Non transaction safe!!!
$dbh->do('PRAGMA cache_size = 200000'); # cache size is in pages: about 200 MB at 1 KB per page
# a bigger cache makes index creation faster
$dbh->do("BEGIN");
import_data();
$dbh->do("COMMIT");
$dbh->do("BEGIN");
$dbh->do ("CREATE INDEX iprotein_idx ON dna (protein)");
$dbh->do("COMMIT");
sub import_data
{
my $add = $dbh->prepare("INSERT INTO dna ( protein, sequence)
VALUES(?,?)");
# ...your loop to read the data goes here
# foreach protein and sequence:
#     $add->execute($protein, $sequence);
}
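One way the read loop could be filled in, as a rough sketch. It assumes the input is a text file with one record per line, protein name and sequence separated by a tab. That file format and the name load_dna_file are my assumptions for illustration only; adjust the split to whatever your data really looks like.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper (not part of DBI): read "protein<TAB>sequence" lines
# from an open filehandle and insert them into the dna table.
# Assumes $dbh was connected with RaiseError => 1 and AutoCommit on.
sub load_dna_file {
    my ($dbh, $fh) = @_;
    my $add = $dbh->prepare("INSERT INTO dna (protein, sequence) VALUES (?,?)");
    my $count = 0;
    $dbh->begin_work;                     # one transaction for the whole load
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;         # skip blank lines
        my ($protein, $sequence) = split /\t/, $line, 2;
        next unless defined $sequence;    # skip lines without a tab
        $add->execute($protein, $sequence);
        $count++;
    }
    $dbh->commit;
    return $count;                        # number of rows inserted
}
```

Called as load_dna_file($dbh, $fh) with a filehandle opened on your data file, it takes the place of the BEGIN / import_data() / COMMIT lines, since it wraps the inserts in its own transaction.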
Update:
Basically, while you are building the DB, you want to turn all the ACID stuff off.
That means: don't wait for writes to complete, don't run a separate transaction for each "add", and raise the cache size from the default (about 2 MB) to at least 200 MB for more efficient index creation. Run as "lean and mean" as you can.
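To check what those settings really are on your build, and to put the safety back once the load is finished, something along these lines should do. The helper names are mine (nothing here comes from DBI or SQLite itself) and the 200 MB figure is just the number from above. The negative cache_size form, which gives the size in KiB instead of pages, needs SQLite 3.7.10 or later.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper: loosen durability for a bulk load.
# Call it outside a transaction; the safety level cannot be changed inside one.
sub fast_load_mode {
    my ($dbh) = @_;
    $dbh->do('PRAGMA synchronous = OFF');     # don't wait for writes to reach the disk
    $dbh->do('PRAGMA cache_size = -200000');  # negative means KiB, so roughly 200 MB
    return;
}

# Hypothetical helper: go back to waiting for writes once the load is done.
sub safe_mode {
    my ($dbh) = @_;
    $dbh->do('PRAGMA synchronous = FULL');
    return;
}

# Hypothetical helper: read back the current value of a pragma.
sub pragma_value {
    my ($dbh, $name) = @_;
    my ($value) = $dbh->selectrow_array("PRAGMA $name");
    return $value;
}
```

Calling fast_load_mode($dbh) right after connecting and safe_mode($dbh) after the index is built keeps the unsafe window limited to the import itself. pragma_value($dbh, 'synchronous') reports the setting as a number (0 for OFF, 2 for FULL).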