dragonchild has asked for the wisdom of the Perl Monks concerning the following question:

(I'm not sure what to call this node, so someone please retitle this, as necessary.)

I'm writing Yet Another SQL Generator (YASG) and the basic concept is as follows:

  1. There are some number of basic queries (reports)
  2. Each report can be run for any given type, aka the "runfor"
  3. Each report can be grouped by any other given type, aka the "groupby"
  4. For each type, I know the SQL needed to do that limiting action
  5. I know how to link any table X to its neighbor tables
  6. The schema is in Third Normal Form.

The runfor and groupby could be for any tables in the schema, as can the report query. I have figured out the tables needed for the runfor and the groupby. (I already know the tables from the report query.)

What module(s) would be useful to figure out what path I need to take to the schema in order to join the runfor and groupby tables to the report tables. What I'm looking for is something like "To connect tables A and D, you need to go this path: A->B->C->D".

I'm going to be caching the generated SQL when mod_perl starts up, so runtime isn't a major concern. Ideally, I'd like the shortest path, determined by number of elements in the path. (I'm not worried about table sizes or anything like that. Optimization comes later, if necessary.)

------
We are the carpenters and bricklayers of the Information Age.

Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Replies are listed 'Best First'.
Re: Path computation for SQL generation
by hardburn (Abbot) on Jan 26, 2004 at 16:52 UTC

    Quick brainstorm: use AI::Pathfinding::AStar. This algorithm is usually used for pathfinding on a grid (such as a strategy game), but I think you could give it input that would work on a database schema. Each "node" represents a table, and the nodes next to the given node are any other tables with a foreign key relation. Each node would be given a movement cost heuristic of 1.

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    : () { :|:& };:

    Note: All code is untested, unless otherwise stated

Re: Path computation for SQL generation
by biosysadmin (Deacon) on Jan 26, 2004 at 17:51 UTC
    Another option would be to use Bio::Coordinate::Graph, it's part of the BioPerl package of Perl Modules. It uses Djikstra's algorithm to solve the shortest path problem, and you just need to provide a hash of hashes for it to compute the paths, like this:
    # this hash represents a directed graph with vertices {A,B,C,D,E} and # these edges: # A -> B (weight = 3) # B -> D (weight = 1) # B -> C (weight = 1) # D -> E (weight = 1) my $hash= { 'A' => { 'B' => 3 }, 'B' => { 'D' => 1, 'C' => 1 }, 'C' => undef, 'D' => { 'E' => 1 }, 'E' => undef }; my $graph = Bio::Coordinate::Graph->new(-graph => $hash); my @node_names = shortest_path( 'A', 'E' ); # node_names now contains a list of the vertex names that make up the # shortest path (A,B,D,E)
    So, in the context of your question you could build a hash of hashes as above, where the node names are SQL table names and there is a path from A->B if there is a foreign key relationship between tables A and B.

    Hope this helps. :)
Re: Path computation for SQL generation
by thor (Priest) on Jan 26, 2004 at 19:21 UTC
    This sounds like a graph problem (as in edges-and-nodes graphs, not x-vs-y graphs), so I'd imagine that some of the Graph modules would help. There are some that are core modules, so that's a bonus.

    thor

      Graph::BFS (breadth-first search) looks like exactly what you need. Tables are vertices, foreign keys are edges (unfortunately, at a quick glance it does not look like you can name edges). Look at its superclass Graph::Traversal for the documentation, which is sparse even there. You'll need to create the graph using Graph::Undirected (again, look at the documentation in the superclass Graph::Base). Here's the Graph package.
        I'm reading through the source (not that I understand much) for Graph. How on earth do I ask the following question:

        Is there a path from X to Y and, if there is, give me the elements I traverse.

        Update: After playing around a bunch, I found the following:

        my $SSSP = $graph->SSSP_Dijkstra($start); my %attrs = $graph->get_attributes($end); my $path = $attrs{path}; die "No path found\n" if !defined $path; print "Path: @$path\n";

        ------
        We are the carpenters and bricklayers of the Information Age.

        Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.