SQL Joins across Subclasses

djantzen has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
•Re: SQL Joins across Subclasses by merlyn (Sage) on Jan 11, 2003 at 16:44 UTC
If you use PostgreSQL, you can declare the derived table types as truly derived. Then a query against the parent type by default does an implicit join over all subtypes (unless you add an ONLY keyword). It's pretty slick. If you have a choice of databases, choose PostgreSQL. Particularly if you are using something like MySQL only out of inertia. -- Randal L. Schwartz, Perl hacker Be sure to read my standard disclaimer if this is a reply.	[reply]
Re: SQL Joins across Subclasses by pg (Canon) on Jan 11, 2003 at 16:40 UTC
There are different ways to look at this. From a database design view: If there is no strong reason, I would say always normalize your tables. Otherwise, you may not see a problem now, but big problems in the future. Believe me. The maintainance guys will blame you for the bad design. However if what you are working on is some sort of data warehouse thing, and those tables have millions of rows. Don't hesitate to denormalize your database. In this case, nobody can blame you, you are not a black sheep. Denormalization is one of the conventional ways used in data warehouse to improve performance. This is on paper. From a OO-design view. You better have three classes defined. This is not related to the background database design, even if you decide to denormalize your tables. For example you denormalize those three tables in your post, and only have one asset table, you still should have three classes: asset, media, and document asset. There is no reason to break the OO rules. Also you will use lots more memory space, if you combine three objects into one, when you design your classes. There would be space occupied by those NULLs, BLANKs, ZEROes etc. As a summary, don't denormalize your tables, unless you have a very good reason. If you have a very good reason, denormalization is definitely not a crime, and there are design methodologies to support you. For your classes, always follow the rules. Just want to add one more thing: For database performance, join is usually not a big performance issue, although it will take some extra time. Because most of the time, join is an indexed access, not a full table scan, as long as your database is reasonably well designed, and your sql statement is carefully tuned.	[reply]
Re: Re: SQL Joins across Subclasses by djantzen (Priest) on Jan 12, 2003 at 03:41 UTC
Blazing performance in this case is not the primary objective, rather it is maintainability and extensibility. So given that, I'm inclined to take your advice to normalize and use joins. As far as the object model goes, you're preaching to choir my friend :^) Even if I were to combine the classes into a single Asset, the differences would be minor enough to incur almost no performance degradation over the model with inheritance. However, as a matter of good OO design the latter clearly wins out. Update: In my object model I've decided to solve this dilemma by separating out the database interactivity according to which attributes belong to each class. In other words, `Asset` is responsible for saving only its attributes in its own table, and `DocumentAsset`'s database methods rely on SUPER to inherit the necessary behavior. Thus, `DocumentAsset::save` calls `$this->SUPER::save()` , and then proceeds to save only its distinct attributes in its own table, namely, `docid` and `label`. So far so good :^) thanks pg.	[reply] [d/l] [select]
Re: SQL Joins across Subclasses by tall_man (Parson) on Jan 11, 2003 at 15:17 UTC
This sounds like a case of "denormalization for performance," which is considered a bad idea by relational database experts; so are NULL fields. I think they are right, especially in a case like yours where your tables are so small and simple. Joining them is not a big deal, and data integrity will be easier to maintain. Here is a site that may help: Database Debunkings	[reply]
Re: Re: SQL Joins across Subclasses by autarch (Hermit) on Jan 11, 2003 at 19:12 UTC
Actually, it sounds like a case of what CJ Date calls the "First Great Blunder", which is equating object classes with tables (as opposed to domains/data types). Date would probably suggest using three tables (common attributes in one, unique attributes in two others) and views to solve the problem that was originally posted. Unfortunately, no RDBMS today supports proper (always updateable) views, AFAIK.	[reply]
Re: Re: Re: SQL Joins across Subclasses by djantzen (Priest) on Jan 12, 2003 at 02:53 UTC
What do you mean by "equate", and why is that a blunder? Our design for this object-oriented API is generally to allocate a separate table to each class that requires persistence. More specifically, if a class has a one-to-one relation with a particular attribute, that attribute gets a place in the table for the class. If there is a one-to-many relation then we have a separate table for the attributes, keyed to and constrained by a unique id in the class's primary (parent) table. If this is what you're suggesting is a blunder, I'm wondering what terrible fate should have befallen us in the previous four years of development :^) As to the rest of your comment, I'm leaning toward the three table layout, although since the subclasses in my object model will require all of the contents in Asset and all of the contents in either DocumentAsset or MediaAsset, a join I think is more appropriate than a view. I think of views as more for taking a subset of data from several tables, but this is really a combination of everything keyed to a unique id.	[reply]
Re: Re: Re: Re: SQL Joins across Subclasses by herveus (Prior) on Jan 13, 2003 at 14:15 UTC
Re: Re: Re: Re: SQL Joins across Subclasses by autarch (Hermit) on Jan 16, 2003 at 05:57 UTC
Re: SQL Joins across Subclasses by Anonymous Monk on Jan 11, 2003 at 15:06 UTC
If you suspect that further attributes might get added to the subclasses, you should break them out. Otherwise I'd keep them in one table, with a 'type' field that specifies the subclass name. When you fetch the row you can bless it into the appropriate subclass. Then you'll only have accessors to the relevant columns, as defined by the subclasses. All in MVHO.	[reply]
Re: SQL Joins across Subclasses by Nitrox (Chaplain) on Jan 11, 2003 at 14:58 UTC
I've always been partial to allowing the DB to do the grunt of the work, joins would be fine in my opinion. -Nitrox	[reply]
Re^2: SQL Joins across Subclasses by diotalevi (Canon) on Jan 11, 2003 at 17:01 UTC
If you are using PostgreSQL then there's an explicit syntax to enable this sort of thing. PostgreSQL will then allow you to query on the `assets` table and see all the "objects" that are Assets. If you want to see just the DocumentAsset objects then use `document_asset` table. So far this schema by default won't create inconsistant Asset objects &em; the asset_id is controlled by a single sequence so if that's enough assurance you can proceed and know that asset_id is unique without explicitly constraining it. `CREATE SEQUENCE asset_seq; CREATE TABLE assets ( asset_id integer not null default nextval('asset_seq'), desk text not null, status text not null, priority text not null, deadline date not null ); CREATE TABLE documentasset ( doc_id integer not null unique, label text not null ) INHERITS (assets); CREATE TABLE mediaasset ( media_id integer not null unique ) INHERITS (assets);` [download] If you really have to ensure that asset_id is unique then you have to step outside of SQL. PostgreSQL has these nice inheritance features but doesn't yet allow you to constrain them declaratively. If you add on this PL/pgSQL code then it will constrain the inheritance for you. All of thsi is obviously just a start but it should get you thinking of the possibilities. `/* From your command prompt type createlang plpgsql <database name> It generates the following code automatically so it isn't nessessary to remember this next bit */ CREATE FUNCTION "plpgsql_call_handler" () RETURNS opaque AS '/usr/local/lib/plpgsql.so', 'plpgsql_call_handler' LANGUAGE 'C'; CREATE TRUSTED PROCEDURAL LANGUAGE 'plpgsql' HANDLER "plpgsql_call_handler" LANCOMPILER 'PL/pgSQL';` [download] And now to create a PL/pgSQL function and hook it into the right place. Please be aware I just wrote this into a window and didn't try to run it first or even verify that it's syntactically correct. You'll want to read up on PostgreSQL's triggers and look into either implementing this sort of function in PL/pgSQL or maybe it's TCL or Perl implementations. CREATE FUNCTION asset_id_ins_upd () RETURNS OPAQUE AS ' DECLARE found INTEGER; BEGIN IF TG_OP = ''UPDATE'' THEN -- Return OK for unchanging asset_id on -- update operations IF NEW.asset_id = OLD.asset_id THEN RETURN NEW; END IF; END IF; SELECT 1 INTO found FROM assets WHERE assets.asset_id = NEW.asset_id; IF FOUND THEN RAISE EXCEPTION END IF; END; ' LANGUAGE 'plpgsql' WITH (isstrict); CREATE TRIGGER assets_asset_id_ins_upd BEFORE INSERT OR UPDATE ON ass +ets FOR EACH ROW EXECUTE PROCEDURE asset_id_ins_upd(); [download] Fun Fun Fun in the Fluffy Chair	[reply] [d/l] [select]
Re: Re^2: SQL Joins across Subclasses by djantzen (Priest) on Jan 12, 2003 at 01:57 UTC
`INHERITS`, yes! But unfortunately, I'm not using Postgres. I've got Oracle 8i, and after searching their docs for "inherit", "derive", "subclass", "under" and "child" I can't find an equivalent keyword. All in all that's not a huge loss though, as we've tried to stay DB agnostic in our SQL to as great a degree as possible. What are your thoughts on the principle of the matter as far as using joins here though? pg points out below that denormalization for performance is an accepted practice, but this application isn't really going to be hit that hard. Rather I'm concerned more with ease of maintenance and extensibility, working on the assumption that the child tables will expand down the road. Which, upon reflection, favors joins strongly, views a little less so IMO. Hmmm. /me dreams of `INHERITS`...	[reply] [d/l] [select]
Re: SQL Joins across Subclasses by hardburn (Abbot) on Jan 13, 2003 at 16:23 UTC
In general, I think object oreinted databases are more about buzzword compliance than really useful. However, I have found problems that seem to lend themselves more naturally to an OO data structure than a traditional relational structure. This seems to be one such case.	[reply]


Perl Monk, Perl Meditation
	PerlMonks