TedYoung has asked for the wisdom of the Perl Monks concerning the following question:
I have a bunch of Perl classes (packages) that extend a base class that knows how to load itself from and save itself to a database (general idea found in Class::DBI).
Each object is automatically given an ID during the save process. This ID is unique only within its class (each class has its own table). I have a separate table mapping each class to its next ID. When I need a new ID, I grab the current value for the given class and then increment it. This is done in a single transaction for atomicity. (1)
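A minimal sketch of the counter-table strategy described above, written in Python with the standard-library `sqlite3` module only so that it runs without extra dependencies; the same statements could be issued through DBI. The table name `id_counter`, its columns, and the helper names `make_db` and `next_id` are assumptions for illustration, not the actual schema.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding the (hypothetical) counter table."""
    # isolation_level=None: we issue BEGIN/COMMIT/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE id_counter ("
        " class_name TEXT PRIMARY KEY,"
        " next_id INTEGER NOT NULL)"
    )
    return conn


def next_id(conn, class_name):
    """Reserve and return the next ID for class_name in a single transaction."""
    # BEGIN IMMEDIATE takes the write lock before the read, so two callers
    # cannot both read the same current value.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT next_id FROM id_counter WHERE class_name = ?",
            (class_name,),
        ).fetchone()
        if row is None:
            # First ID for this class: hand out 1 and store 2 as the next value.
            current = 1
            conn.execute(
                "INSERT INTO id_counter (class_name, next_id) VALUES (?, ?)",
                (class_name, current + 1),
            )
        else:
            current = row[0]
            conn.execute(
                "UPDATE id_counter SET next_id = ? WHERE class_name = ?",
                (current + 1, class_name),
            )
        conn.execute("COMMIT")
        return current
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Because the ID is reserved before the object's own row is written, the caller knows the ID prior to insertion, at the cost of every caller for the same class waiting on the same counter row.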
I want to support polymorphic storage (i.e., you can store instances of subclasses of class A in references or collections declared to contain A). To support this, I would like to have IDs that are unique across the whole database (not just within each table, as I have now).
I can think of two ways to generate IDs that are unique across the db. The most obvious would be to have the ID table keep track of a single db-level next ID, instead of a next ID for each table. The problem with this is that all inserts (across the db) have to be handled serially by the db engine, because each insert requires a lock on the single row in the ID table. Looking at it again, even now inserts into a single table have to be handled serially, because there is only one "record lock" per table.
A better way might be to use an external generator, like a UUID generator. Then no locking of an ID table has to go on at all. Inserts should be a lot faster and highly concurrent. The downside is that these ID columns go from being 64-bit integers to 128-bit strings. Since these ID columns are all indexed and used for all table relationships (joins), I am concerned about performance on this end.
I have looked to see what JPOX (a reference implementation of JDO) does. Unfortunately, since JPOX runs as a persistent service (and not as a library like mine), they can get away with caching options that I don't have. They let you choose which strategy you want, and they do mention that using a single table with one ID row can cause scalability issues.
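To make the size trade-off concrete, here is a small sketch using Python's standard `uuid` module (the function name `new_object_id` is hypothetical). It shows the two usual representations of a random UUID: 36 characters as hyphenated text, or 16 raw bytes (128 bits), which is twice the 8 bytes of a 64-bit integer key. Whether a given database can index the compact binary form efficiently is an assumption to check per engine.

```python
import uuid


def new_object_id():
    """Generate an ID locally, with no database round trip or row lock."""
    return uuid.uuid4()  # random (version 4) UUID


def as_text(object_id):
    """Hyphenated hexadecimal form, 36 characters long."""
    return str(object_id)


def as_binary(object_id):
    """Raw 16-byte (128-bit) form."""
    return object_id.bytes


def from_binary(raw):
    """Rebuild the UUID from its 16-byte form, e.g. after a fetch."""
    return uuid.UUID(bytes=raw)
```

The ID is available before the insert, just as with the counter table, but without any shared row to contend for.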
You might ask, how much performance do you need? Not a whole lot, but I want to do this correctly now and avoid having to change it later.
Has anyone had any experience along these lines that they could share? Off the top of my head, I would guess that 50% of my queries are selects (with joins), and only 25% are inserts (the remaining 25% would be updates and deletes). Is it worth using 128-bit strings in my PK indexes and through all joins/relationships just to avoid the contention of having a single-row (db-level) ID table?
(1) I use this strategy rather than autoincrement columns, identity columns, or sequences to maintain db independence, and so that I can easily fetch the ID of the record prior to insertion.
Thanks,
Ted Young
($$<<$$=>$$<=>$$<=$$>>$$) always returns 1. :-)
Re: [OT] Persistent Object IDs
by kyle (Abbot) on Apr 23, 2007 at 18:45 UTC