in reply to Design Approach (Processing Huge Hash)

1. The Logfile has around 5-6 million entries
That's a sure sign that you're probably a lot better off with a real database.

May I suggest starting off with DBD::SQLite, and then working your way up to PostgreSQL if that is insufficient?

Then, it'll simply be a matter of writing a half-dozen lines of SQL, and your results will be quick and painless.
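
For illustration, a minimal sketch of that approach with DBD::SQLite; the log format (whitespace-separated host and byte count) and the table layout are assumptions, not part of the original post:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Sketch only: load log entries into an on-disk SQLite file, then
    # summarize with a single GROUP BY. Table and column names are
    # placeholders for whatever the log actually contains.
    my $dbh = DBI->connect("dbi:SQLite:dbname=log.db", "", "",
        { RaiseError => 1 });

    $dbh->do("CREATE TABLE IF NOT EXISTS hits (host TEXT, bytes INTEGER)");

    my $ins = $dbh->prepare("INSERT INTO hits (host, bytes) VALUES (?, ?)");
    $dbh->begin_work;                      # one transaction around the bulk load
    while (<>) {
        my ($host, $bytes) = split;        # assumed whitespace-separated fields
        $ins->execute($host, $bytes);
    }
    $dbh->commit;

    # The summary is then one SQL statement.
    my $rows = $dbh->selectall_arrayref(
        "SELECT host, COUNT(*), SUM(bytes) FROM hits GROUP BY host"
    );
    printf "%s  %d hits  %d bytes\n", @$_ for @$rows;
    $dbh->disconnect;

Wrapping the millions of inserts in a single transaction (begin_work ... commit) is what keeps the load step tolerable in SQLite.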

-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.

Re^2: Design Approach (Processing Huge Hash)
by mkirank (Chaplain) on Aug 27, 2004 at 05:07 UTC
    We do plan to use a database, but the problem is that the script has to support different databases (Postgres, Oracle, MS SQL).
    If we use a database, inserting all the records will take much more time (we cannot do a bulk insert, as we have to deal with different databases), so we came up with the idea of summarizing, since that reduces the number of inserts into the database.
    As per your suggestion, what I can probably do is insert all the records into SQLite (as this insertion will be faster than into the other databases), summarize the data through SQLite, and then insert that summarized data into Postgres or the other databases (see the sketch below).
    Thanks for your comments.
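
    A rough sketch of that staging idea, assuming hypothetical table names (hits, hit_summary) and a placeholder Postgres DSN and credentials:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use DBI;

        # Sketch: the raw rows live only in the local SQLite file; only the
        # GROUP BY summary is pushed to the target database. DSN, credentials,
        # and table/column names are placeholders.
        my $lite   = DBI->connect("dbi:SQLite:dbname=log.db", "", "",
            { RaiseError => 1 });
        my $target = DBI->connect("dbi:Pg:dbname=reports", "user", "pass",
            { RaiseError => 1 });

        my $sum = $lite->prepare(
            "SELECT host, COUNT(*), SUM(bytes) FROM hits GROUP BY host");
        my $ins = $target->prepare(
            "INSERT INTO hit_summary (host, hits, bytes) VALUES (?, ?, ?)");

        $sum->execute;
        $target->begin_work;
        while (my @row = $sum->fetchrow_array) {
            $ins->execute(@row);           # far fewer rows than the raw log
        }
        $target->commit;

    Since only the DSN changes, the same summary-insert code should work against Oracle or MS SQL through the corresponding DBD driver.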