modulereview
idsfa
<h4>Module Author: Jeff Zucker &lt;jeff@vpservices.com&gt;<br/>
[http://search.cpan.org/~jzucker/DBD-AnyData-0.08/AnyData.pm|Documentation]</h4>
<h3>Abstract</h3>
<p>Excellent tool for developing programs with limited "database" needs, prototyping full-on RDBMS applications, and pulling in common data interchange formats. If you don't need or want the SQL baggage, try [cpan://AnyData] instead.</p>
<h4>Pre-Requisites</h4>
<ul>
<li>[cpan://SQL::Statement]</li>
<li>[cpan://DBD::File]</li>
<li>[cpan://AnyData]</li>
<li>[cpan://DBI]</li>
<li>various others for some data formats</li>
</ul>
<h4>Overview</h4>
<p>[cpan://DBD::AnyData] is a DBI/SQL wrapper around [cpan://AnyData] which
allows the author to use many SQL constructs on traditionally non-SQL data
sources. Descended from [id://44114|DBD::RAM], DBD::AnyData also implements
that module's ability to load data from multiple formats and treat each
source as if it were an SQL table. A table can be held entirely in memory or
tied to the underlying data file. Tables can also be exported in any format
which the module supports.</p>
<h4>Review</h4>
<p>The variety and number of [http://www.wotsit.org/|file formats] in use
is staggeringly large and continues to grow. Perl hackers are often faced
with the job of being syntactic glue between applications, translating output
from one program into the necessary input for another. Abstracting the exact
format of these data allows the programmer to rise above mere hacking and
actually craft something (re)usable. Separating the logic from the
presentation improves the clarity of both.</p>
<p>DBD::AnyData attempts to provide this abstraction by presenting a DBI/SQL
interface. It layers over the required/companion AnyData module, which
presents a tied hash interface. The Perl purist will most likely prefer
to stick with AnyData, minus the DBD. The extra layer of abstraction will
be most useful if you are more comfortable with SQL or your application
design requires it. To my mind, the niftiest use of this module is the
ability to prototype your code as if you had a whole relational database,
while a few simple CSV files actually hold the data.</p>
<p>The list of supported formats is impressive, and continues to expand. CPAN
currently lists:</p>
<ul>
<li>Perl data structures and __DATA__ segments</li>
<li>Delimited text (Comma/Pipe/Tab/Colon/whatever separated)</li>
<li>Fixed length records</li>
<li>HTML Tables</li>
<li>INI Files</li>
<li>passwd Files</li>
<li>MP3 Files (specifically, their ID3 tags)</li>
<li>[http://search.cpan.org/~jzucker/AnyData-0.10/AnyData/Format/Paragraph.pm|Paragraph] Files</li>
<li>Web Server Logs</li>
<li>XML Files</li>
<li>DBI Connections (to leverage existing modules)</li>
</ul>
<p>With [http://www.vpservices.com/jeff/programs/AnyData/AnyData-API.html|more
on the way].</p>
<p>DBD::AnyData has three basic modes of operation: file access, in-memory
access and format conversion. These modes are implemented as five extension
methods over a standard DBD.</p>
<p>In file access mode, the data file is read on each request and written
on each change. The entire file is never read into memory (unless requested)
and so this mode is suitable for large data files. Be aware that these
writes are <b>not</b> atomic commits, so your database could end up in an
inconsistent state. This mode is not supported for remote files or
certain formats (DBI, XML, HTMLtable, MP3 or Perl ARRAYs).</p>
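<p>A minimal sketch of file access mode, assuming a made-up <tt>cars.csv</tt>
with a header row. The file name, table name and columns are invented for
illustration; <tt>ad_catalog</tt> is the extension method that ties a table
name to a file, but check the argument order against the documentation for
your installed version.</p>

```perl
use strict;
use warnings;
use DBI;

# Hypothetical data file for the demo: a small CSV with a header row.
my $file = 'cars.csv';
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} "id,make\n1,Honda\n2,Volvo\n";
close $fh;

my $dbh = DBI->connect('dbi:AnyData(RaiseError=>1):');

# ad_catalog ties the table name 'cars' to the file; the file is read
# when a statement needs it rather than being loaded up front.
$dbh->func('cars', 'CSV', $file, 'ad_catalog');

# The change is written back to cars.csv directly (no transactions).
$dbh->do('INSERT INTO cars (id, make) VALUES (?, ?)', undef, 3, 'Saab');

# Ordinary DBI calls work against the tied table.
my $makes = $dbh->selectcol_arrayref('SELECT make FROM cars WHERE id > 1');
print join(', ', @$makes), "\n";

$dbh->disconnect;
```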
<p>In-memory mode loads the entire data source into memory. That is
obviously a problem for huge data sets, but then you probably have those in
a relational database already. This mode is ideal for querying a remote data
source, which is fetched in the background by good old [cpan://LWP].</p>
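<p>In-memory mode might look something like this sketch, which uses
<tt>ad_import</tt> with the ARRAY format so that it needs no files or
network access. The table name and sample rows are made up, and the first
row of the array is assumed to hold the column names.</p>

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:AnyData(RaiseError=>1):');

# ad_import copies the whole source into memory. Here the source is a
# Perl array whose first row names the columns (invented sample data).
$dbh->func('pets', 'ARRAY',
    [ ['id', 'name'], [1, 'Rex'], [2, 'Tom'] ],
    'ad_import');

# Query the in-memory table with plain DBI and a placeholder.
my $sth = $dbh->prepare('SELECT name FROM pets WHERE id = ?');
$sth->execute(2);

my @names;
while (my ($name) = $sth->fetchrow_array) {
    push @names, $name;
}
print "$_\n" for @names;

$dbh->disconnect;
```

<p>For a remote source, the array would be swapped for a URL and the
matching format name; that variant is left out here because it depends on
LWP and a reachable server.</p>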
<p>Conversion mode takes data from an input (which can be local or remote,
and in any supported format) and writes it to a local file, Perl string or
Perl array. This function alone would be reason enough for the module to
exist, yet it's really more of an afterthought.</p>
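<p>A rough sketch of conversion mode, turning a comma-separated file into
a pipe-delimited one with <tt>ad_convert</tt>. The file names and data are
hypothetical, and the argument order (source format, source file, target
format, target file) is as I understand the module's documentation, so
verify it before relying on it.</p>

```perl
use strict;
use warnings;
use DBI;

# Hypothetical input file for the demo.
my ($in, $out) = ('pets.csv', 'pets.pipe');
open my $fh, '>', $in or die "Cannot write $in: $!";
print {$fh} "id,name\n1,Rex\n2,Tom\n";
close $fh;

my $dbh = DBI->connect('dbi:AnyData(RaiseError=>1):');

# Read $in as CSV and write the same records to $out, pipe-delimited.
# No SQL is involved; the driver handle is only a vehicle for the call.
$dbh->func('CSV', $in, 'Pipe', $out, 'ad_convert');
$dbh->disconnect;

# Show the converted file.
open my $result, '<', $out or die "Cannot read $out: $!";
print while <$result>;
close $result;
```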
<h4>Caveats</h4>
<ul>
<li>Again, if you don't need SQL, use AnyData instead</li>
<li><strike>Currently, DBD::AnyData will not allow SQL against multiple
tables in the same SQL statement (no JOINs)</strike> <b>Updated:</b> per [id://395695|jZed] this feature is now available</li>
<li>It isn't a real RDBMS. Don't expect atomicity, journals, etc.</li>
<li>Not all formats are fully featured, and most require more modules</li>
</ul>
<h4>Summary</h4>
<p>DBD::AnyData is one of those fun modules that lets you shove the crud
work off on someone else (the author of the AnyData::Format:: module) and
get on with crafting good code. I've found it especially helpful when
putting together tiny web apps that might end up getting huge (and thus
require a move to a true database). Anything that lets me stop writing
file format converters is worth checking out in my book.</p>
<p>The DBD::AnyData module provides a DBI/SQL interface to data in many formats and from many sources.</p>