Ace128 has asked for the wisdom of the Perl Monks concerning the following question:

Hey Monks!

Sorry for this not really beeing direcly Perl again, but I know you are so bright here (been here a while now), and this "problem" I have now, I really need feedback on.

Now, Im almost done with my rather complex but nice application I've made in Perl soon to be released for everybody! Basicly, I have this "core" script, that calls others as "plugins" as wanted by user. Now, this has to do with files, and I use a database. In the "core" I save the full path to a file as a key (which can change later, but anyway). Now, the "problem" I have is the communication between the core and the scripts (when it comes to the database talk that is). As it is now, every plugin also saves the full path to the file (just so it was easier and faster developing and testing), but I dont really like the redundancy this is creating, and I really wanna optimize this! Since the "core" also has an Autoincremet column, I was thinking making this something like a foreign key, so there is only one "full path" in the core, and the plugins gets the uniqe increment value, and uses that to lookup in their own table. This really feels like it should be faster than doing a core.fullpath = plugin.fullpath check, and more the way of how it should be done.

However, this means that then I have to do more queries, but I suppose its waaay better doing more queries than having a bad table structure.

Ok, to illustrate:

{Way #1} (as I have it atm)
core table:
id, fullpath, modtime, size, attributes....

plugin table:
fullpath, title, album... (if its a mp3 plugin).

If I now wanna find files with title "Perlmonks dancing", its pretty easy and fast.
SELECT fullpath FROM plugin WHERE title LIKE ("%Perkmonks dancing%");

If I wanna find files with the title "Perlmonks dancing" AND size < 5000 bytes, its abit more (and more to the point, not really great to speed!):
SELECT fullpath FROM core,plugin WHERE title LIKE ("%Perkmonks dancing%") AND size < 5000 AND core.fullpath = plugin.fullpath;

This core.fullpath = plugin.fullpath can't be good whatsoever!

So, this is why I wanted to restructure (+ I dont like the redundancy, really) But, the question is, how to do this the best way?

{Way #2}
core table:
id, fullpath, modtime, size, attributes....

plugin table:
id_ref, title, album... , where id_ref points to core.id.

Now, here its worse with the first example, since I dont have the fullpath stored in the plugin table, but should be faster anyway?
SELECT fullpath FROM core,plugin WHERE title LIKE ("%Perkmonks dancing%") AND plugin.id_ref = core.id;

Second example gives:
SELECT fullpath FROM core,plugin WHERE title LIKE ("%Perkmonks dancing%") AND plugin.id_ref = core.id AND size < 5000; (so not much of a change)

Would really like comments on this, since this is quite essential. I mean, I wanna do things best as soon as possible! :) Specially, since the idea is that one should only have to care about doing plugins, and not the boring work the core does! Problem also arrises when one run my core script, but uses a plugin. If I for some reason need to know the fullpath to a file, I have to ask the plugin about it (so I get the full path to the file in question). Using Way #1, plugin can just send me the info (since it has the column fullpath itself). Using Way#2, plugin has to do little more.

Thanks,
Ace

Edited by planetscape - added readmore tags

Replies are listed 'Best First'.
Re: Relation between tables
by McDarren (Abbot) on Jan 11, 2006 at 06:45 UTC
    hmm, that seems like an awful lot of words to ask a very straight-forward question ;)

    The answer is basically yes. Get rid of the redundancy and use table joins to construct your queries. I'd suggest naming the foriegn key field in the plugin table "core_id" rather than "id_ref", so that's it's absolutely clear what it refers to.

    Also, don't forget to use placeholders when you construct your queries in Perl.

    Cheers,
    Darren :)

      :)
      But I wasnt sure Way #2 was the BEST solution. And I wanted some feedback and thougts on this by experts! :)
      Oh, and I forgot to say... wanna keep the queries as simple as possible, so it works in mySQL, SQLite and postgre! :)

      Joins? Seems to be working without the JOIN sql statement...
        Just because you don't implicitly use JOIN in your query, doesn't mean you aren't joining tables.

        "plugin.id_ref = core.id" is a table join.

Re: Relation between tables
by pajout (Curate) on Jan 11, 2006 at 12:09 UTC
    Imho, the 2nd way is more normalized - you don't keep the same information (fullpath) twice. If you are afraid of join-effectivity, use some tool which imagines how is that query realized, in postgres it is
    explain analyze select fullpath...
    Something about database normal form:
    http://www.databasejournal.com/sqletc/article.php/26861_1428511_4
      Doesn't seem to be so usefull in mysql...