

in reply to Re: Re: iteration through tied hash eat my memory
in thread iteration through tied hash eat my memory

Excuse me!? BerkeleyDB handles databases up to 4 terabytes, with transactions, concurrent access, and cursors. Perhaps your operating system can't handle files that large, but BerkeleyDB is perfectly fine with them. PostgreSQL avoids that OS limit by splitting its database files at one gigabyte. Which operating system are you using?

If you want to continue using PostgreSQL then you'll need to start accessing it more intelligently (cursors or asynchronous queries) or at least keep the wasted memory to a minimum. In general a tied PostgreSQL interface really isn't the right solution for this (again, unless you use cursors or asynchronous queries). Really, do this The Right Way.
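For what it's worth, the cursor approach looks roughly like the following sketch with DBD::Pg. This is only an illustration, not your code: the connection string, table, and column names are made-up placeholders.

use DBI;

# Cursors only exist inside a transaction, so turn AutoCommit off.
my $dbh = DBI->connect("dbi:Pg:dbname=test", "", "",
                       { AutoCommit => 0, RaiseError => 1 });

$dbh->do("DECLARE c1 CURSOR FOR SELECT foo FROM bar");

# Pull the rows through in small batches instead of slurping
# the entire result set into memory at once.
while (1) {
    my $rows = $dbh->selectall_arrayref("FETCH FORWARD 100 FROM c1");
    last unless @$rows;
    for my $row (@$rows) {
        # ... do something with $row->[0] ...
    }
}

$dbh->do("CLOSE c1");
$dbh->commit;
$dbh->disconnect;

The point is that only one batch of rows is ever resident in the client at a time, which is what keeps the memory usage flat.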


Update: It also occurs to me that if your dataset is that large then doing things the smart way is already mandatory. Cute gimmicks are nice, but you need to be intentional about how you approach your disk access and memory usage. You really can't afford not to be.

Update again: If you haven't already, you really need to read the BerkeleyDB documentation from SleepyCat. The POD documentation in the CPAN module is really just a gloss on how to translate BerkeleyDB idioms into Perl code. You have to read the actual library documentation to get the code right. For instance, you only get concurrent access if you initialize that subsystem. The POD barely mentions it; it's fully covered in the library docs. So go read those. They're online at http://www.sleepycat.com/docs/index.html. You probably want to read the section on the C API, since that's what the CPAN module links against.
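To make that subsystem point concrete, here's a rough sketch of what the initialization looks like through the BerkeleyDB CPAN module. The home directory and filename are placeholders, and you should check the library docs for the flag combination that actually fits your setup.

use BerkeleyDB;

# Open an environment with the Concurrent Data Store subsystem.
# Without an environment initialized like this you do not get
# concurrent access, no matter what the tie looks like.
my $env = BerkeleyDB::Env->new(
    -Home  => "/path/to/dbhome",
    -Flags => DB_CREATE | DB_INIT_MPOOL | DB_INIT_CDB,
) or die "Cannot open environment: $BerkeleyDB::Error";

# Tie the hash through that environment rather than opening the
# file directly, so the database actually uses the subsystem.
tie my %h, 'BerkeleyDB::Btree',
    -Filename => "data.db",
    -Env      => $env,
    -Flags    => DB_CREATE
    or die "Cannot open database: $BerkeleyDB::Error";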

__SIG__ use B; printf "You are here %08x\n", unpack "L!", unpack "P4", pack "L!", B::svref_2object(sub{})->OUTSIDE;

Re^4: iteration through tied hash eat my memory
by diotalevi (Canon) on Dec 09, 2002 at 19:20 UTC

    I did more checking, this time on doing cursors in Perl. The key here is to go to the PostgreSQL web site and search the archives for 'cursors perl'. That turns up this example (from http://archives.postgresql.org/pgsql-general/2001-01/msg01569.php):

    use Pg;
    my $conn = Pg::connectdb("dbname = test");
    my $result = $conn->exec("begin work");
    $result = $conn->exec("declare c1 cursor for select foo from bar");
    $result = $conn->exec("fetch forward 1 in c1;");
    print "Hurray, I fetched a row: ", $result->fetchrow, $/;
    $result = $conn->exec("end work;");
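    If you need more than one row, the same idea extends to fetching in batches until the cursor is exhausted. A rough sketch, still using the Pg module and the same made-up table and cursor:

    # Keep fetching until no rows come back; ntuples() is zero
    # once the cursor has nothing left to return.
    while (1) {
        my $result = $conn->exec("fetch forward 100 in c1");
        my $n = $result->ntuples;
        last unless $n;
        for my $i (0 .. $n - 1) {
            print "row: ", $result->getvalue($i, 0), "\n";
        }
    }
    $conn->exec("close c1");
    $conn->exec("end work");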
    __SIG__ use B; printf "You are here %08x\n", unpack "L!", unpack "P4", pack "L!", B::svref_2object(sub{})->OUTSIDE;
Re^4: iteration through tied hash eat my memory
by ph0enix (Friar) on Dec 09, 2002 at 16:40 UTC

    I'm using SuSE Linux 7.3 (Intel 32-bit), which ships the Berkeley DB database library in version 3.1.17. When the database file grew to 2GB I got a message like 'File size limit exceeded (SIGXFSZ)' (not exact - translated from a localized message). I'm able to create larger files on the filesystem (tested up to 6GB). Does that mean the db package in my distribution was miscompiled?

      I did a bit of checking for you and it looks like Linux support for large files was added in version 3.2.9 (see the change log at http://www.sleepycat.com/update/3.2.9/if.3.2.9.html). Your signal is the file-size-limit signal (SIGXFSZ), and some other checking brought up http://oss.sgi.com/projects/xfs/faq.html#largefilesupport, which indicates that your large file support may also be conditional on your glibc library. My recommendation is to get the current version of BerkeleyDB and install it into /usr/local. Be very careful not to disturb your existing library, since various parts of your OS probably depend on 3.1.17 staying 3.1.17.

      Google is your friend: search for suse xfs 2gb. Obviously, just read the change log on SleepyCat's web site (http://www.sleepycat.com) for the scoop on BerkeleyDB.

      __SIG__ use B; printf "You are here %08x\n", unpack "L!", unpack "P4", pack "L!", B::svref_2object(sub{})->OUTSIDE;