in reply to Re (tilly) 4: millions of records in a Hash
in thread millions of records in a Hash

tilly:
"As for writing your own storage method, I would strongly discourage people from doing that unless they already know, for instance, the internals of how a dbm works. And if someone comes back and asks me for that, my response will be that if they have to ask, the odds are that I can't teach them enough about the subject to do any better than they can do just by using the already existing wheel. And this is definitely true if they think they can build their wheel in Perl."


Given precise knowledge of the data's signature (the seeker said something about 12-byte keys, for example), one can build very fine wheels using optimized algorithms, with Perl or without.
Naturally a starting point would be to look at a DBM implementation, but I wonder why, in a Seekers of Perl Wisdom section, one would recommend against going the hard way and learning a lot of stuff.

And if you can't teach him/her, there might either be others who can or the seeker might just go his own way and find out himself.
Re (tilly) 6: millions of records in a Hash
by tilly (Archbishop) on Feb 26, 2002 at 03:37 UTC
    Given precise knowledge of the data's signature (the seeker said something about 12-byte keys, for example), one can build very fine wheels using optimized algorithms, with Perl or without.
    The odds are very, very high that the overhead of working in Perl would wipe out any possible win you would be able to get from knowing that the keys are 12 bytes. While the problem may sound like a fun challenge, this is an example of a common optimization mistake. If you are always looking for ways to rewrite and speed up bits of code, your overall program is almost guaranteed to wind up slower than it would have been if you used good development practices. Why? Simply because you lose sight of the forest for the trees. You spend so long making your code unmaintainable that you are unable to spot the "low-hanging fruit" that inevitably provides the biggest improvements. See the sample section from Code Complete for more on this. (I recommend the whole book, but that is another story.) If you want more Perl specific optimization advice, try Re (tilly) 1: Optimizations and Efficiency.

    Naturally a starting point would be to look at a DBM implementation, but I wonder why, in a Seekers of Perl Wisdom section, one would recommend against going the hard way and learning a lot of stuff.
    Perhaps because the section is named Perl Wisdom and not Perl Masochists?

    Reinventing excellent wheels that you can get for free may be good stuff for an algorithms and data structures class. Understanding this stuff may be wonderful for your evolution as a programmer. But deciding to launch into that when you just need to get something done is stupid. And it is an important lesson to learn not to bother doing that, but to instead learn to reuse existing work when and where that is appropriate.

    See Modules Vs. Manual Coding for further discussion.

    And if you can't teach him/her, there might either be others who can or the seeker might just go his own way and find out himself.
    It would be nice if you were able to see that quote from my perspective and decide to apologize for the intended insult.

    FYI the quote that you were responding to was not an admission of ignorance on my part. Rather it was a comment on how great the gap is between asking the question, "How do dbms work?" and having a chance at writing one that outperforms a good one.

    If you want to disbelieve me, go ahead. In which case for someone with a Perl background and no CS background I would suggest starting at Bricolage: B-Trees and seeing how far you get. That will at least give you a key algorithm. But, for instance, that won't go into the details of how to really do it far enough to understand what any of the key parameters are that people want to tune in real dbms, let alone why they matter...

      tilly: "Perhaps because the section is named Perl Wisdom and not Perl Masochists?"

      hehe

      tilly: "It would be nice if you were able to see that quote from my perspective and decide to apologize for the intended insult."

      Sorry. There was actually no insult intended, but since you pointed out twice that you cannot help, I just wanted to point out that there might be others who are able or willing to.

      tilly: "If you want to disbelieve me, go ahead. In which case for someone with a Perl background and no CS background I would suggest starting at Bricolage: B-Trees and seeing how far you get. That will at least give you a key algorithm. But, for instance, that won't go into the details of how to really do it far enough to understand what any of the key parameters are that people want to tune in real dbms, let alone why they matter..."

      Oh, maybe I would just use Progress DB and be happy, or maybe I would just use a plain file, who knows.
      I think you just assume too much. Your wild guessing about backgrounds and understanding at least indicates this.
Re: Re: Re (tilly) 4: millions of records in a Hash
by dragonchild (Archbishop) on Feb 25, 2002 at 20:35 UTC
    Naturally a starting point would be to look at a DBM implementation, but I wonder why, in a Seekers of Perl Wisdom section, one would recommend against going the hard way and learning a lot of stuff.

    Two reasons, actually.

    1. People generally come to SOPW to get an answer to a problem so they can go and finish what they're doing with a minimum of fuss.
    2. While I have the background and training to learn how to "go the hard way and learn a lot of stuff", I have absolutely no inclination to do so. My time is worth more than attempting to solve a question that has already been solved. I'd much rather spend that time solving an unanswered question and put that solution out to CPAN. That would be a contribution to the community.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

      I see, but then I don't understand why it is called "Seekers of Wisdom" and not just "Questions".
      Maybe because there is already a "Q&A".

      I assume such things have already been discussed.
      If somebody remembers where and can point me to it, I'd be thankful.
        Seekers of Perl Wisdom. The idea is to find out how to do something in Perl. The way most professionals interact with DBMs and RDBMSs in Perl is through modules like DB_File and DBI. We tend to not roll our own when a perfectly good solution exists.

        That said, we also like to improve existing modules and release those improvements. Why? Because we can save other people time by telling them how we fixed a problem we ran into.

        This isn't to say that learning how something is done is a bad thing. In fact, it's a good thing. But, telling someone to roll their own when all they need is to have this thing work in three hours ... that's not very wise, nor is it Lazy.
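For anyone wondering what "use the existing wheel" looks like in practice, here is a minimal sketch of a disk-backed hash via tie. It is only an illustration under some assumptions: it uses SDBM_File because that ships with core Perl, whereas the module named above is DB_File, which offers a similar tie interface with fewer size limits. The file name, the zero-padded 12-character keys, and the payload strings are all made up for the example.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # bundled with core Perl
use File::Temp qw(tempdir);

# Hypothetical scratch location so the example cleans up after itself.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/records";

# Tie a hash to an on-disk dbm file; the data lives on disk, not in RAM.
tie(my %records, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666)
    or die "Cannot tie $file: $!";

# Store some records under 12-byte keys (made-up data).
for my $n (1 .. 1000) {
    my $key = sprintf "%012d", $n;
    $records{$key} = "payload for record $n";
}

# Look a record up exactly as with an ordinary hash.
my $lookup = sprintf "%012d", 42;
my $found  = $records{$lookup};
my $count  = scalar keys %records;

print "$lookup => $found\n";
print "stored $count records\n";

untie %records;
```

Note that SDBM limits the size of each key/value pair, so for millions of larger records DB_File or a real database through DBI would be the more likely choice.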
