in reply to Re: Joint Database Technology
in thread Flat File Database
Code example you have requested:
This Perl program retrieves 5 verses of King James Version Bible text
from a large Flat File (with fixed-length, "text" records) by random access lookup.
The Flat File contains 180 complete copies of the KJV Bible, with a bogus
translation number (tr) assigned to each Bible copy (tr = 1 to 180)
to make a unique key: {translation_nbr + book_nbr + chapter_nbr + verse_nbr}.
Record offsets (in bytes) are persistently stored in a binary Perl SDBM database file,
of key/value pairs, tied to a program hash table. The value is the offset.
The key is {tr + bk + chp + ver} numbers combined/concatenated.
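The sample keys used in the program below (for example "09066022021") suggest a layout of 3 digits for tr, 2 for bk, 3 for chp and 3 for ver. A minimal sketch of building and splitting such a key, assuming that layout; `make_key` and `split_key` are hypothetical helper names, not part of the original program:

```perl
use strict;
use warnings;

# Assumed layout, inferred from the sample keys: TTT BB CCC VVV (11 digits).
sub make_key {
    my ($tr, $bk, $chp, $ver) = @_;
    return sprintf "%03d%02d%03d%03d", $tr, $bk, $chp, $ver;
}

# Inverse of make_key: split an 11-digit key back into its four numbers.
sub split_key {
    my ($key) = @_;
    my ($tr, $bk, $chp, $ver) = $key =~ /^(\d{3})(\d{2})(\d{3})(\d{3})$/
        or die "Malformed key: $key";
    return map { $_ + 0 } $tr, $bk, $chp, $ver;
}

print make_key(90, 66, 22, 21), "\n";   # prints 09066022021
```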
If $offset is a negative value, seek from BOTTOM/END of file.
If $offset is a positive value, seek from BEGIN/TOP of file.
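The post does not say how the sign is chosen when the index is built. One plausible scheme, shown here purely as an assumption, is to address records in the first half of the file from the start and the rest from the end, so that every stored offset stays within a signed 32-bit range even for a file close to 4 GB. `stored_offset` and `absolute_position` are hypothetical names:

```perl
use strict;
use warnings;

use constant RECORD_LEN => 760;

# Hypothetical helper: choose the stored offset for record number $n (0-based)
# in a file of $total records. Records in the first half are addressed from
# the start of the file; the rest get a negative offset relative to the end.
sub stored_offset {
    my ($n, $total) = @_;
    my $from_start = $n * RECORD_LEN;
    my $file_size  = $total * RECORD_LEN;
    return $from_start < $file_size / 2
        ? $from_start
        : $from_start - $file_size;
}

# Map a stored offset back to an absolute byte position.
sub absolute_position {
    my ($offset, $total) = @_;
    return $offset < 0 ? $total * RECORD_LEN + $offset : $offset;
}
```

With 5,598,360 records, the last record would get the offset -760 and the first record the offset 0 under this scheme.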
Each Bible contains 31102 verses of text, with a maximum length of 528 characters each.
But with the compound index {tr + bk + chp + ver} added to the Bible text,
for the purpose of proving the random access is working, and
MIMEbase64 encoding applied (to hide the Bible text), the fixed
length records have become 760 characters each. Decoding will occur as records are read.
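A minimal sketch of how a 760-character record could be produced and read back. It assumes the plain record is padded with spaces to 570 bytes (570 / 3 * 4 = 760 Base64 characters, with no line breaks) and that the key and the verse text are joined by a single space; neither detail is stated in the post. The trailing-space strip mirrors what the program below does after decoding. `encode_record` and `decode_record` are hypothetical names:

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Assumption: 570 plain bytes per record, because 570 / 3 * 4 = 760.
use constant PLAIN_LEN => 570;

sub encode_record {
    my ($key, $text) = @_;
    my $plain = sprintf "%-*s", PLAIN_LEN, "$key $text";   # pad with spaces
    die "Record too long" if length($plain) > PLAIN_LEN;
    return encode_base64($plain, "");    # "" suppresses line breaks
}

sub decode_record {
    my ($record) = @_;
    my $plain = decode_base64($record);
    $plain =~ s/ +$//;                   # strip the space padding
    return $plain;
}
```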
The Flat File is just under 4 GB. The SDBM file is just under 1 GB.
There are over 5 million records in each of the Flat File and the SDBM file.
(180 copies of the Bible times 31102 verses per Bible)
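The figures above can be checked with a few lines of arithmetic, using only the numbers given in this post:

```perl
use strict;
use warnings;

my $records = 180 * 31102;      # copies times verses per copy
my $bytes   = $records * 760;   # fixed record length in bytes

printf "%.0f records, %.0f bytes (%.2f GiB)\n",
    $records, $bytes, $bytes / 2**30;
# prints: 5598360 records, 4254753600 bytes (3.96 GiB)
```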
Flat File, random access, record lookup, is instantaneous.
You can use Perl Portable Code: sysopen, syswrite, sysseek, sysread.
But the below example is Windows O/S specific Perl Code.
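For comparison, a portable version of the seek-and-read step might look like the sketch below. It follows the same rule for negative and positive offsets; `read_record` is a hypothetical helper name, and this is not the code that produced the results described here:

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY SEEK_SET SEEK_END);

use constant RECORD_LEN => 760;

# Read one fixed-length record at a stored offset: negative offsets are
# relative to the end of the file, non-negative ones to the start.
sub read_record {
    my ($fh, $offset) = @_;
    my $whence = $offset < 0 ? SEEK_END : SEEK_SET;
    defined(sysseek($fh, $offset, $whence)) or die "sysseek failed: $!";
    my $got = sysread($fh, my $buf, RECORD_LEN);
    die "Short read" unless defined $got && $got == RECORD_LEN;
    return $buf;
}
```

A caller would open the data file with something like `sysopen(my $fh, $path, O_RDONLY)`, call `binmode($fh)`, and then pass each offset from the tied hash to `read_record`. On a file this large, the Perl build needs large-file support for offsets beyond 2 GB.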
This example is a batch application process (no user front-end), having 5 hard-coded lookup keys.
You can build a user-interface to instead accept the lookup keys from user input: either typed in,
or selected from a GUI widget of preloaded values {tr, bk, chp, ver}.
A RANGE of values could even be selected to print Bible verses for an entire Book (ex. tr="134", bk="01" i.e. Genesis)
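One way to select a whole book, sketched under the assumption that keys follow the 11-digit layout tr(3) bk(2) chp(3) ver(3) and that chapters and verses are numbered from 1 without gaps. The hypothetical `keys_for_book` probes the index until a key is missing, so it never scans the index file sequentially:

```perl
use strict;
use warnings;

# Hypothetical helper: list every key of one book by probing the index
# hash chapter by chapter and verse by verse until a key is missing.
sub keys_for_book {
    my ($index, $tr, $bk) = @_;
    my @keys;
    for (my $chp = 1; ; $chp++) {
        my $first = sprintf "%03d%02d%03d%03d", $tr, $bk, $chp, 1;
        last unless exists $index->{$first};      # no such chapter
        for (my $ver = 1; ; $ver++) {
            my $key = sprintf "%03d%02d%03d%03d", $tr, $bk, $chp, $ver;
            last unless exists $index->{$key};    # past the last verse
            push @keys, $key;
        }
    }
    return @keys;
}
```

With the tied hash from the program below, a call such as `keys_for_book(\%BibleVersesIDX, 134, 1)` would return the keys for Genesis in translation 134, each of which can then be looked up in the usual way.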
(NOTE: Please note that we are not promoting any religion here. The KJV Bible was selected because it is in the public domain, and because it is logically segmented, making it a good test file that can easily be copied over and over to fill up as many Flat Files as desired for testing.)
Dozens of Flat Files, each with 180 complete copies of the Bible, could be preloaded, and your application program designed to access any one of them based on the distinct set of bogus translation numbers (tr) contained within each Flat File (and its associated SDBM binary file tied to a hash table). Not shown here is the initial code used to load the Flat File with records and to load the SDBM file with key/value pairs.
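Since the loading code is not shown, the following is only a guess at its general shape, not the original loader. It assumes 570 space-padded bytes per record (which Base64 turns into exactly 760 characters), a "key text" layout inside the record, and positive offsets only; `build_files` is a hypothetical name:

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;
use MIME::Base64 qw(encode_base64);

# Write fixed-length records to $dat_path and store each record's byte
# offset in an SDBM index at $idx_path. $verses is [ [key, text], ... ].
sub build_files {
    my ($dat_path, $idx_path, $verses) = @_;
    tie(my %idx, 'SDBM_File', $idx_path, O_RDWR | O_CREAT, 0644)
        or die "Cannot tie $idx_path: $!";
    open my $out, '>:raw', $dat_path or die "Cannot write $dat_path: $!";
    my $offset = 0;
    for my $verse (@$verses) {
        my ($key, $text) = @$verse;
        my $record = encode_base64(sprintf("%-570s", "$key $text"), "");
        die "Text too long for key $key" unless length($record) == 760;
        print {$out} $record;
        $idx{$key} = $offset;     # offset from the start of the file
        $offset += length $record;
    }
    close $out or die "Cannot close $dat_path: $!";
    untie %idx;
    return $offset;               # total bytes written
}
```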
To be clear: This "random access" technique requires no reading in of the Flat File records sequentially, nor does it require reading in rows/records sequentially from the binary SDBM file of key/val pairs - to load them into the Hash table. As soon as the below code is launched, the Bible verses are randomly accessed and printed to the screen in a split second.
use Win32API::File 0.08 qw( :ALL );
use Win32;
use SDBM_File;
use Fcntl;
use MIME::Base64 qw(decode_base64);

$PWD = Win32::GetCwd();

# Tie the hash to the SDBM index file (read-only); nothing is preloaded.
tie( %BibleVersesIDX, "SDBM_File", '.\BibleFlatFile_760_31102_180_IDX', O_RDONLY, 0444 );
if (tied %BibleVersesIDX) {
    print "BibleVersesIDX Hash now tied to external SDBM file\n\n";
} else {
    print "Could not tie BibleVersesIDX Hash with external SDBM file - Aborting\n\n";
    die;
}

# Open the Flat File for reading through the Win32 API.
$hFILE = createFile("$PWD\\BibleFlatFile_760_31102_180.dat", "r");

# Five hard-coded lookup keys: {tr + bk + chp + ver}.
foreach $key ("00101001001", "09066022021", "09101001001", "18001001001", "18066022021") {
    $offset = $BibleVersesIDX{$key};
    if ($offset < 0) {
        # Negative offset: seek from the end of the file.
        $pos = SetFilePointer( $hFILE, $offset, [], FILE_END );
    } else {
        # Positive offset: seek from the beginning of the file.
        $pos = SetFilePointer( $hFILE, $offset, [], FILE_BEGIN );
    }
    # Read one 760-character record, decode it, and strip the trailing padding.
    ReadFile( $hFILE, $Buf, 760, [], [] );
    $decoded_Buf = decode_base64($Buf);
    $decoded_Buf =~ s/ *$//;
    print $decoded_Buf . "\n\n";
}
exit;

END {
    CloseHandle( $hFILE );
    untie( %BibleVersesIDX );
    sleep 5;
}
Replies are listed 'Best First'.
Re^3: Joint Database Technology by karlgoethebier (Abbot) on Apr 12, 2015 at 20:04 UTC
Re^3: Joint Database Technology by AnomalousMonk (Archbishop) on Apr 12, 2015 at 20:47 UTC