in reply to DBM Deep Hash of Hash of Hash of array

I would have to see how you are using DBM::Deep in order to give you some direction as to how to improve things. I see you've fixed or bypassed your FileHandle::Fmode problems; that's good.

Without seeing anything, I would suspect that you're doing a lot of iterating over all the keys of some hash. That's documented in DBM::Deep as being slow at the moment.
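By way of illustration, here's a minimal sketch of the kind of pattern I mean (the file name and hash layout are guesses, since I haven't seen your code):

    use strict;
    use warnings;
    use DBM::Deep;

    # Assumed file name and layout, for illustration only.
    my $db = DBM::Deep->new( "errors.db" );

    # Walking every key like this is the documented slow path: each key
    # and value is fetched from disk individually.
    foreach my $branch ( keys %{ $db } ) {
        foreach my $category ( keys %{ $db->{$branch} } ) {
            print "$branch / $category: $db->{$branch}{$category}{count}\n";
        }
    }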


My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

Re^2: DBM Deep Hash of Hash of Hash of array
by npai (Initiate) on Apr 16, 2008 at 17:08 UTC

    Hi,

    Good to see that someone remembers I've been through other problems before :-)

    I have bypassed Fmode by removing the reference. So far that seems to be OK, because the file is not being used by any other program. The other problem I had earlier was that I could not access the DBM::Deep object in a method, either to push data into it or to read from it.

    By chance I found the weirdest way to fix it. My method/function has several input variables. If my DBM::Deep object is the last variable passed, the method fails to recognize it, but the moment I made it the first of the seven input variables, I could access it just fine.
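    To illustrate (the sub and parameter names here are made up, not my real code), this is the shape that works for me, with the DBM::Deep object as the first argument:

        # Made-up sub and parameter names, just to show the argument order:
        # the DBM::Deep object goes first.
        sub record_error {
            my ( $errordb, $branch, $category, $url ) = @_;

            push @{ $errordb->{$branch}->{$category}->{url} }, $url;
            $errordb->{$branch}->{$category}->{count} += 1;
        }

        record_error( $errordb, $_BranchName, 'MissingTitle', $_URL );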

    As for the code:

    # Making an entry in the errordb
    push @{ $errordb->{$_BranchName}->{MissingTitle}->{url} }, $_URL;

    # Increasing the counter
    $errordb->{$_BranchName}->{MissingTitle}->{count} += 1;

    This adds the URL into the hash.

    Later on I access this in another method to output the values on an HTML page:

    foreach my $testurl ( @{ $errordbref->{$_BranchName}->{$ThingsToPrint}->{url} } ) {
        print $testurl;
    }

    The code is fairly straightforward. I think the problem is the size of the object.

    A database is not available, hence all these attempts. Basically the program is a spider for my website (over 40,000 pages) that finds errors and displays them on an HTML page.

      Try the following to replace your foreach:
      my $size = $#{ $errordbref->{$_BranchName}->{$ThingsToPrint}->{url} };
      foreach my $idx ( 0 .. $size ) {
          my $testurl = $errordbref->{$_BranchName}->{$ThingsToPrint}->{url}->[$idx];
          print $testurl;
      }
      That may reduce a lot of the RAM and disk usage you're seeing.

      A database is not available, hence all these attempts. Basically the program is a spider for my website (over 40,000 pages) that finds errors and displays them on an HTML page.
      Databases are easy to set up and don't require any sort of administrator privileges. They are also built to handle large data sets. mysql, for instance, has no problem handling data sets with millions of records. Moreover, using a relational database makes your persistent data much more transparent, and I think you'll find it'll be easier to debug, maintain and extend your code because of that. Anyway, just something to consider...
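      To make that concrete, here's a minimal sketch (table and column names are invented, and DBD::SQLite is used only because it needs no server; the same SQL works against mysql via DBD::mysql):

          use DBI;

          # Hypothetical schema for the spider's error records.
          my $dbh = DBI->connect( "dbi:SQLite:dbname=errors.sqlite", "", "",
                                  { RaiseError => 1 } );

          $dbh->do( "CREATE TABLE IF NOT EXISTS errors (
                         branch   TEXT,
                         category TEXT,
                         url      TEXT
                     )" );

          my $insert = $dbh->prepare(
              "INSERT INTO errors (branch, category, url) VALUES (?, ?, ?)" );
          $insert->execute( $_BranchName, 'MissingTitle', $_URL );

          # The per-category count becomes a query instead of a hand-kept counter.
          my ($count) = $dbh->selectrow_array(
              "SELECT COUNT(*) FROM errors WHERE branch = ? AND category = ?",
              undef, $_BranchName, 'MissingTitle' );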
        Databases are not the panacea that some people think they are. As a DBA, I can tell you they are a lot harder to work with than some people give them credit for. DBM::Deep is designed to make Perl data structures persist to disk very easily, precisely so you can work with structures too large for RAM. This is the right tool for the job.
