in reply to Re: Querying Meta Data
in thread Querying Meta Data

As a bonus, most SQL databases will let you join against the same table multiple times, allowing you to query for nodes with some combination of characteristics:
SELECT n.*
  FROM nodes n, meta_data md1, meta_data md2
 WHERE md1.name = ? AND md1.value = ? AND n.id = md1.node_id
   AND md2.name = ? AND md2.value = ? AND n.id = md2.node_id
 ORDER BY 1

(Check the documentation for your database software to confirm that this is supported and identify any idiosyncratic syntax that you may have to comply with.)
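If it helps, here is a minimal DBI sketch of running that query with bound values. The connection details and the criteria values are made up purely for illustration; only the table and column names come from the query above.

use DBI;

# Hypothetical connection; substitute your own DSN and credentials.
my $dbh = DBI->connect( 'dbi:mysql:mydb', 'user', 'password', { RaiseError => 1 } );

# The two-criteria query from above, with placeholders for the values.
my $sql = q{
    SELECT n.*
      FROM nodes n, meta_data md1, meta_data md2
     WHERE md1.name = ? AND md1.value = ? AND n.id = md1.node_id
       AND md2.name = ? AND md2.value = ? AND n.id = md2.node_id
     ORDER BY 1
};

my $sth = $dbh->prepare($sql);
$sth->execute( 'color', 'blue', 'shape', 'round' );    # example name/value pairs only

while ( my $node = $sth->fetchrow_hashref ) {
    print "found node $node->{id}\n";
}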

For what it's worth, I use a similar structure to store clinical records data for a hospital's psychiatry department, and after a bit of time spent examining MySQL's query plans and adjusting the indexes, have found the performance of this type of query to be acceptable, even with tens of thousands of entries.
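The right indexes will depend on your own data and queries, but as a rough sketch of the kind of index this sort of tuning tends to settle on (the index name here is invented; check your own EXPLAIN output rather than taking this as given):

-- One possible compound index to support the meta_data lookups;
-- confirm against EXPLAIN on your own queries before relying on it.
CREATE INDEX meta_data_name_value ON meta_data (name, value, node_id);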

Re: Re: Re: Querying Meta Data
by eric256 (Parson) on Jul 24, 2003 at 20:07 UTC

    Thanks dragonchild, I did mean to mention that I had the SQL for the first link and not the others.

    And thanks simonm, that's exactly what I had in mind, but I had no clue that you could join a table more than once. Off I go looking through the MySQL documentation with a new point of view. How many joins do you do in your case? How many is too many? Just ideas; I know that it will vary from case to case and depend on the size of the table, etc.

    Here is the code I came up with so far. It's a bit rough, but I thought others might find it useful!

    use Data::Dumper;

    print GetNodes( { name => "hello", type => "god", color => "green" } );

    sub GetNodes {
        my $meta = shift;    # expects a hashref of { name => value }
        print Dumper($meta);

        my $sql = "SELECT md1.node_id FROM ";
        my @From;     # list of table aliases
        my @Where;    # list of name/value conditions
        my @Join;     # conditions tying each alias back to md1
        my $i = 0;

        foreach my $name ( keys %$meta ) {
            $i++;
            push @From,  "`meta_data` as md$i";
            push @Where, "md$i.name = \"$name\"";
            push @Where, "md$i.value = \"$meta->{$name}\"";
            push @Join,  "md$i.node_id = md1.node_id" if $i > 1;
        }

        $sql .= join( ",", @From ) . " WHERE ";
        $sql .= join( " AND ", @Where );
        $sql .= " AND " . join( " AND ", @Join ) if @Join;
        $sql .= ";";

        return $sql;
    }
    outputs
    SELECT md1.node_id FROM `meta_data` as md1,`meta_data` as md2,`meta_data` as md3 WHERE md1.name = "type" AND md1.value = "god" AND md2.name = "color" AND md2.value = "green" AND md3.name = "name" AND md3.value = "hello" AND md2.node_id = md1.node_id AND md3.node_id = md1.node_id;
    Thanks!
    Update: Added above code :-)
    Eric Hodges

      How many joins do you do in your case?

      Like your example code, I use as many joins as I have distinct criteria. We typically only have a couple of sets of criteria, but sometimes it's five or six, and the performance on MySQL has been fine.

      Here is the code I came up with so far.

      I'd like to reiterate the importance of using placeholders, which allow for efficient query plan caching.

      You could easily modify your sample code to return a parameterized query based simply on how many criteria pairs you have. You can then pass your pairs of meta-data criteria to the DBI as execute() parameters. This eliminates any worries about quoting and escaping the values, and may provide a performance benefit by simplifying the database's query parsing and planning.

      sub GetNodes {
          my $meta = shift;
          my $sql  = buildMetaQuery( scalar keys %$meta );    # one placeholder pair per criterion
          my $sth  = $dbh->prepare_cached($sql);
          $sth->execute(%$meta);    # the hash flattens to name/value pairs in the same key order
          return $sth->fetchrow_array;
      }

      This assumes that buildMetaQuery( 3 ) returns something like:

      SELECT md1.node_id FROM `meta_data` as md1,`meta_data` as md2,`meta_data` as md3 WHERE md1.name = ? AND md1.value = ? AND md2.name = ? AND md2.value = ? AND md3.name = ? AND md3.value = ? AND md2.node_id = md1.node_id AND md3.node_id = md1.node_id;
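      buildMetaQuery itself isn't spelled out here, but as a sketch of one way it might be written (only the name and the placeholder layout come from the snippet above; the rest is an assumption):

      sub buildMetaQuery {
          my $count = shift;    # number of name/value criteria pairs

          # One meta_data alias per criterion, each tied back to md1's node.
          my @from  = map { "`meta_data` as md$_" } 1 .. $count;
          my @where = map { "md$_.name = ? AND md$_.value = ?" } 1 .. $count;
          my @join  = map { "md$_.node_id = md1.node_id" } 2 .. $count;

          return "SELECT md1.node_id FROM " . join( ",", @from )
               . " WHERE " . join( " AND ", @where, @join ) . ";";
      }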

        Thanks! I was trying to come up with a way to create the query without calling it, like your buildMetaQuery, but I didn't remember I could use a hash in list context like that. I will certainly revamp my code to that model, as I completely understand the need for placeholders. That was just my baby-steps approach: produce the right query first, then figure out how to produce it better. Thanks again!

        ___________
        Eric Hodges