dshahin has asked for the wisdom of the Perl Monks concerning the following question:

If I want to store persistent data from my program in a database, am I better off storing the data in tables that I construct into objects via a constructor method, or simply dumping the entire object in serialized form, via Data::Dumper or Storable, and storing it in the DB?

I suppose the key question is, is there a big performance boost in grabbing the serialized object, rather than constructing a new instance out of the stored data?

If not, then it seems like you give up the ability to search the stored fields if you just serialize. I'd be very interested in hearing any real world* experiences with storing serialized objects in perl, along with the caveats and gotchas it creates and/or avoids.

* as opposed to ones in alternate universes, I guess...

  • Comment on storing serialized object in a database

Replies are listed 'Best First'.
Re: storing serialized object in a database
by btrott (Parson) on Jan 15, 2001 at 09:07 UTC
    You wrote:
    > I suppose the key question is, is there a big > performance boost in grabbing the serialized object, > rather than constructing a new instance out of the > stored data?
    Don't know off the top of my head; you could definitely test it using Benchmark. You could test something like Class::DBI against a hand-rolled solution using Storable and DBI.

    Hate to say it, but it really depends on your application. If you have an application where you need to search on the fields in your object/database, you shouldn't be serializing your data.

    On the other hand, one thing you gain by storing a serialized object is that you don't have to worry about the columns in your table. If you add a new attribute to your object, you don't need to worry about database, etc.

    You also said:

    > If not, then it seems like you give up the ability to search > the stored fields if you just serialize.
    Yes, you do. And in some cases that's a real drawback. You don't want to have to load every object in the database then search through them w/ Perl, just to get back the objects you want. That's ugly and a memory hog and is just the sort of thing you want to avoid.

    In my experience I very often find situations where I do want searchability, but YMMV.

    The situations where serializing works well is where the only thing you'll ever want to search on is some sort of primary key ID. Session data, for example: if you store sessions in a DB and give out IDs for each session, your session layer really doesn't care what's in a session; and your app layer usually doesn't need to search on anything other than the session ID. So it makes a lot of sense to bundle up your data into a serialized stream and store it in the database, then thaw it in your session layer when you pull it back out. Object goes in, object comes out.

    An example of needing searchability is when you have lots of different columns that you may need to search on. For example, you might have a "document record" stored in a DB, and you may need to search by author, title, body, last modified, etc. Here you *really* don't want to serialize your data, because you're just going to end up with a mess of Perl code sorting through the objects, where you really should let your database do that work.

    Pointers to modules. If you want to store unserialized data in a DB--column maps to attribute, row maps to object, etc.--take a look at Class::DBI. I've not used it because I have a DB-object layer of my own that I use, but it's reputable, I believe. :) Update: Also check out Tangram.

    For serializing, you could take a look at Apache::Session, which is a good way of managing session data through various interfaces (file, DB, etc.).

Re: storing serialized object in a database
by Fastolfe (Vicar) on Jan 15, 2001 at 08:58 UTC
    If you can break the data down to its essentials, like a numerical value or some such, I'd highly recommend storing that in the database over a serialized object. Databases are specialized in storing and allowing you to select that data back. Storing a Storable object, for example, as you indicate, prevents you from being able to search against it, or pull certain variables while ignoring others without selecting everything and doing your searching in Perl. If that's OK to you, why use a database at all? Just dump the data into a binary file (or one large array and just use Storable to store that) or a DBM file of some kind.

    Plus, if you come along later and make changes to your object/data structures, you won't have to perform an expensive data migration, since the database would only have the essentials to begin with.

Re: storing serialized object in a database
by gildir (Pilgrim) on Jan 15, 2001 at 15:16 UTC
    I have spend a lot of time meditating and trying out implementations of Perl object persistence in RDMB. Here are some of my findings:

    Do not use serialialization for everything. You will downgrade RDBMS to som 'simple' hash-table file like GDBM. You will not be able to index, search and sort objects, which is the primary reason for using RDBMS. Just use serialization if use want to store some subtree of class hierarchy in one table. Select primary atributes of superclass, store them is some VARCHAR2 or NUMBER columns, make appropriate indexes and constraints. Then store a serialized object in one BLOB column. This will give you the power of RDBMS when searching and sorting, but does not limit the content of subclasses.
    I use this principle with objects, which strcucture changes frequently. Good examples are Session objects (on multi-application sites) or Message objects in enterprise messaging systems.

    Design your objects to have simple attributes that can be stored in RDBMS columns. Then write a select(), update(), delete(), ... methods, that will store these simple attributes in RDBMS columns. Object like this are restorored very quick, just make one select, stuff results into hash and bless that hash. You could even search for value of every attribute, sort it. If you will pay enough attention to design of such objects, all attributes will be stored in RDBMS in thier 'natural' format, for example, dates will be stored as dates, not as a string that contain number wich representns number of seconds since 1/1/1970. And that is good, because you should run queries like 'gimme all data for last week'.
    Disadvantage is, that you must create special table for every subclass. And there is another disadvantage - this kind of persistent object tends to be very complex, implementation very 'obfuscated' and they are extremnly difficult to maintain consistent unless you desing them properly.
    I use this kind of persistent objects for most of 'core', business-logic objects. I'm still looking for some way how to read an UML model and generate object's attribute description directly from datamodel. But have no solution on this yet.

    For now, I use data definitions that look like this:

    %attrs = ( id => { storage => 'String', input => 'Text', syntax => 'identifier' }, realName => { storage => 'String', input => 'Text', syntax => 'text', }, password => { storage => 'String', input => 'Password', syntax => 'password', }, );
    I use these definition to deternime correct RDBMS storage types (e.g. VARCHAR2 for 'String'-s), to validate user-provided value with regeexp prior to RDBMS update (e.g. 'identifier' means /^[\w\d_]+$/), and to select s correct type of HTML inputbox.

    And the best way is to represent all of your objects in XML and store them to some XML-database. But XML-databases are rare in these early days of XML development and representing your data in XML, storing them as XML, retrieving them as XML and rebuilding real perl object from XML could be very resourcers-expensive task. OTOH you will not deal with details about storing attribtes in database, XML-DB will care about that. Other application (non-perl) could perfectly read and understand your data. You could store complex data structures (lists, trees) in database as a single entry. Etc, etc ...
    But this is kind of sci-fi to me. I have no time to try it out now, and I'm affraid of XML performance penaulty.

    First two schemas are implemented in Jewels Application Framework, which is not released yet as it is still under heavy contruction. But if you want to see a (undocumented) examples, there is some old, ugly and undocumented code in download area of Jewels web page. Use at your own risk :-)
    Hope this helps.

      If you don't use (overuse?) inheritance too much, adding a new table for each subclass isn't too bad. Essentially, what you could do is require that each object has a unique identifier, then use that as a key on which to join tables.

      Assuming each class has a deserializing method (or something that lists which db tables contain its data), it's fairly easy to create a join statement to restore data. The serializer would have to take this into account as well, so you're probably better off with a simple method that returns a list of tables containing data to use.

      The Everything Engine uses a similar technique, though it's made complicated by other things. Storing metadata in a relational database is a big win, though, and it's a fairly good solution from an implementation and performance standpoint.

      The Jewels Application Framework is intriguing. I've toyed around with code to create business logic that uses collection and item classes to handle object/attribute validation, attribute setting and getting, etc, through the definitions of records.

      I would like to take a look at some of the sample code. The Jewels site link for downloading code is broken. Where can I send requests for sample code?

      Do you know of any other modules that attempt to handle object persistance using databases in this way?

Re: storing serialized object in a database
by extremely (Priest) on Jan 15, 2001 at 09:40 UTC
    DB, DB, DB; Don't convert structures to strings and back and suffer all the extra load when you can get a DB to do the work it was designed to do best for you. Also, please close the open <I> tag you left in your text... =)

    --
    $you = new YOU;
    honk() if $you->love(perl)

      >>Also, please close the open tag you left in your text
      how do I do that? I've been trying, but I'm not sure what I'm doing wrong. :(
        Click on the post you made, like this link: storing serialized object in a database and there should be a nice little text box at the bottom of the page where you can update your post and make changes.

        After the last line, you need to put one last </i> to fix it.

        --
        $you = new YOU;
        honk() if $you->love(perl)

Re: storing serialized object in a database
by kschwab (Vicar) on Jan 15, 2001 at 09:14 UTC
Re: storing serialized object in a database
by lachoy (Parson) on Feb 14, 2001 at 02:53 UTC

    Another CPAN module you might want to check out is SPOPS: Simple Perl Object Persistence with Security. Creating objects from existing database information is extremely easy -- just create a configuration file! (I'm a little biased, but it really is easy.) (See more on CPAN: SPOPS)

    Chris

    M-x auto-bs-mode