Hi,

I wrote an HL7 browser using Perl/Tk that is regularly asked to slurp in 100+ MB files. Not only that, but portions of that data are parsed and loaded into HList widgets. Obviously it consumes 200, 300 or more MB of system memory, but the point is that if your loop is not surviving past line 4300, then there is another issue. The 100 MB files I deal with have hundreds of thousands of lines in them.

I'd definitely agree that if you can think of a better way to handle your situation then you should do so, but I would hate to see you go through a bunch of conversion work and then find that it wasn't the real problem. It might be helpful if we could take a gander at more of your code here. Another thought is that perhaps you have a file with an untimely EOF marker in it.
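If you want to check the EOF-marker theory, here is a rough sketch. It assumes the marker in question is a Ctrl-Z byte (0x1A), which some platforms (notably Windows when reading in text mode) can treat as end of file. The sub name find_eof_markers is just something I made up for illustration; it reads the file in binmode and reports which lines contain that byte.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original poster's code): returns the
# line numbers of every line that contains a Ctrl-Z (0x1A) byte.
sub find_eof_markers {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # read raw bytes so no text-mode layer stops at 0x1A
    my @hits;
    while (my $line = <$fh>) {
        # $. holds the current line number of the handle just read
        push @hits, $. if index($line, "\x1A") >= 0;
    }
    close $fh;
    return @hits;
}

# Usage: perl find_eof.pl yourfile.dat
if (@ARGV) {
    my @hits = find_eof_markers($ARGV[0]);
    print @hits
        ? "Ctrl-Z found on line(s): @hits\n"
        : "No Ctrl-Z bytes found\n";
}
```

If this reports a hit at or near line 4300, adding binmode to your own filehandle (or cleaning the data) is probably a much smaller fix than converting everything.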

As a more permanent solution I like maverick's idea of using MySQL or PostgreSQL, but another method that might work and is far simpler to implement is a GDBM database. This is another type of file that I have seen work well even as it grows to tens of MB in size. (It works fine at the 150+ MB size, but a reorganization can take hours.) Do a search for GDBM_File for more information. Yet another way to do it might be to use the filesystem to break your information up into directories, so there is less to parse at a time.
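To give an idea of how little code the GDBM route takes, here is a minimal sketch using tie with GDBM_File. The file name and the keys are placeholders I invented, and it assumes your Perl was built with GDBM support, which is not always the case.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use GDBM_File;

# Placeholder file name, chosen only for this example.
my $db_file = 'records.gdbm';

# Tie a hash to the GDBM file; GDBM_WRCREAT opens it read/write
# and creates the file if it does not exist yet.
tie my %db, 'GDBM_File', $db_file, &GDBM_WRCREAT, 0640
    or die "Cannot open $db_file: $!";

# Store and fetch records just as you would with an ordinary hash.
$db{'rec0001'} = 'first record';
$db{'rec0002'} = 'second record';

print "rec0002 => $db{'rec0002'}\n" if exists $db{'rec0002'};

# Untie to flush and close the database file.
untie %db;
```

The nice part is that lookups go by key, so you never have to seek through the flat file to find a record.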

Good luck, however you decide to proceed,
{NULE}
--
http://www.nule.org


In reply to Re: FIle Seeking by {NULE}
in thread FIle Seeking by Baz
