Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

Re^3: Efficiently Walking Large Data Structures Stored in Multiple Tables

by adrianh (Chancellor)
on Feb 28, 2004 at 17:28 UTC ( [id://332489]=note: print w/replies, xml ) Need Help??


in reply to Re: Re: Efficiently Walking Large Data Structures Stored in Multiple Tables
in thread Efficiently Walking Large Data Structures Stored Across Multiple Tables

If you're dealing with a hierarchy of information (OT?) Recursive sql queries? might be of interest.

  • Comment on Re^3: Efficiently Walking Large Data Structures Stored in Multiple Tables

Replies are listed 'Best First'.
Re: Re^3: Efficiently Walking Large Data Structures Stored in Multiple Tables
by aquarium (Curate) on Feb 29, 2004 at 21:51 UTC
    whilst recursive queries are fun, as you can do a whole lot of lookups and join them together in one sql statement; they do take a long time to run, significantly degrading performance. The flatter you can make the structure of the database, especially the often used tables, the faster the retrieval -- unfortunatelly this also means you end up with extra effort in maintaining the tables, as they start having redundant data (not fully normalised).
      whilst recursive queries are fun, as you can do a whole lot of lookups and join them together in one sql statement; they do take a long time to run, significantly degrading performance.

      Take another look at the bits of the thread on using a nested set representation for hierarchical data. Flat structure, normalised data and fast queries. Some performance degradation on insertion, but since people usually read and update a lot more than they insert it's rarely an issue.

      As for recursive queries - the old chestnut about premature optimization comes to mind. That and the fact that several databases have explicit support for hierarchical structures and appropriate optimisations to make access to those structures efficient - even if they set C.J. Date spinning in his grave ;-)

Re: Re^3: Efficiently Walking Large Data Structures Stored in Multiple Tables
by dragonchild (Archbishop) on Feb 28, 2004 at 18:51 UTC
    If you're dealing with a database that can handle tree structures natively, maybe. But, I read the articles on doing tree structures as sets and I was ... underwhelmed. If all you need is the next page and there is only ever one next page, then deal with it that way. If you need a true parent-child structure and a way of asking for all parents without wanting to explicitly code it into your SQL and have it handled correctly by the optimizer ... good luck without Oracle or PostgreSQL.

    Now, you could build the queries using some module (???), but the queries are very very very slow and very very very difficult to get right. Again, good luck. I wouldn't want to try ...

    ------
    We are the carpenters and bricklayers of the Information Age.

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      Why underwhelmed?

      I have not had occasion to use the nested set representation, but I have multiple friends who have. And it certainly worked as advertised for them.

      Unless you have some concrete experience of the nested set representation not working as advertised, I'm going to have to heavily discount what you have to say. Even then I'd be interested to see the implementation since I know full well how easy it is to accidentally turn an efficient query into an inefficient one.

        I'm not saying that it works or doesn't work. I have no experience in that regard. What I am saying is that the nested set representation looked like a lot more work than CONNECT BY in Oracle. It also looked like something would be very easy to mess up and that I would want to put in a bunch of PL/SQL routines (or, at least, provide at the DBIx:: level).

        ------
        We are the carpenters and bricklayers of the Information Age.

        Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      If all you need is the next page and there is only ever one next page, then deal with it that way.

      I did have the words "If" and "might" in my post y'know ;-) If all you need is the next page then it's obviously overkill. If you're doing more complex stuff a more generic solution is worth looking at.

      Now, you could build the queries using some module (???), but the queries are very very very slow and very very very difficult to get right.

      Erm. No they're not. The queries are damn quick. You have a representation that leverages the things that RDBMS are good at. .

      As for difficult to get right - not particularly in my opinion. Takes a little time to get your head around the representation - but the operations to access nodes and subtrees are pretty simple. Insertion is the only vaguely complicated operation.

      Hmm... I feel a meditation coming on...

        I can see how selection isn't very difficult. Nested set trees look to be optimized for retrieval. My issue is modification. Updates to where you stand in the hierarchy (promotion, for example) aren't incredibly hard, unless your entire department moves with you to someone that never had a department under them before. Deletion doesn't look to be a big deal - you simply have to deal with the same problem you have in CONNECT BY, which is what to do with the entities under you.

        I guess my biggest problem with the whole notion is the fact that I affect record A. In Oracle's tree representation (and I'm assuming PG's, as well), that's the end of it. With nested set trees, I might have to re-number the entire hierarchy's intervals. Granted, that's the pathological case, but it's still something you have to code for. And, that has to be checked in every modification. (Maybe not DELETE, but definitely for INSERT and UPDATE.)

        Another thing that concerns me is how this might potentially affect triggers written against that table. Most developers work with tables that aren't nested sets. (I understand that's how RDBMS's think about things, but most developers of my acquaintance aren't as ... flexible.) The data structure, as a whole, looks very fragile to me. There's a lot of bookkeeping involved (unless I'm missing something obvious).

        ------
        We are the carpenters and bricklayers of the Information Age.

        Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://332489]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (7)
As of 2024-04-19 10:01 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found