I don't have any experience in this field either, but I recommend trying a few approaches to see how they fare. Measurement is key here.
Start with 'one huge file to rule them all'. First benchmark how long it takes to read the whole file with a bare while (<>) { } loop, then see how much running a regex on each line slows that down.
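Something along these lines would give you the two numbers. This is only a rough sketch: the generated test data, the status=error pattern and the time_pass helper are all made up for illustration, so swap in your real file and your real regex. Both passes go through the same callback, so the difference between them should be mostly the regex.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);
use File::Temp qw(tempfile);

# Hypothetical test data: 200_000 short log-style lines in a temp file.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "line $_ status=", ($_ % 7 ? "ok" : "error"), "\n" for 1 .. 200_000;
close $fh or die "close: $!";

# Read $file line by line, call $per_line on each line, and return
# the elapsed wall-clock time plus the sum of the callback's results.
sub time_pass {
    my ($file, $per_line) = @_;
    open my $in, '<', $file or die "open $file: $!";
    my $start = time();
    my $count = 0;
    while (my $line = <$in>) {
        $count += $per_line->($line);
    }
    my $elapsed = time() - $start;
    close $in;
    return ($elapsed, $count);
}

# Baseline: just read every line.
my ($t_read, $lines) = time_pass($path, sub { 1 });

# Same loop, but with a regex match on every line.
my ($t_regex, $matches) = time_pass($path, sub { $_[0] =~ /status=error$/ ? 1 : 0 });

printf "read only:  %.3fs (%d lines)\n",   $t_read,  $lines;
printf "with regex: %.3fs (%d matches)\n", $t_regex, $matches;
```

Run it a few times before trusting the numbers, since the first pass over a real file may include disk I/O that later passes get from the OS cache.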
Most SQL databases are pretty good at efficiently storing gobs of data, even if you're only accessing it sequentially. Try something similar to the above approach but just use a SQL table to back it.
Finally, are you sure it's the open/close overhead that would kill the naive approach? I'm with you on this, but the point is that neither of us can tell without measuring. You should have a pretty good baseline of how long a while (<>) { } takes on the raw data from the first approach, so compare to that.
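To put a number on the open/close question, you could write the same data once as one big file and once as many small files, then time reading each. Again, just a sketch under assumptions: the file counts, file names and the count_lines helper are invented for the example, and a temp directory on a warm cache will not behave exactly like your real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);
use File::Temp qw(tempdir);

# Hypothetical sizes: 2_000 small files of 50 lines each.
my $dir = tempdir(CLEANUP => 1);
my ($n_files, $lines_per_file) = (2_000, 50);

# Write the same lines twice: into many small files and into one big file.
open my $big, '>', "$dir/all.txt" or die "open: $!";
for my $i (1 .. $n_files) {
    open my $small, '>', "$dir/part$i.txt" or die "open: $!";
    for my $j (1 .. $lines_per_file) {
        my $line = "file $i line $j\n";
        print {$small} $line;
        print {$big}   $line;
    }
    close $small or die "close: $!";
}
close $big or die "close: $!";

# Open, read and close each file in turn; return elapsed time and line count.
sub count_lines {
    my @files = @_;
    my $count = 0;
    my $start = time();
    for my $file (@files) {
        open my $in, '<', $file or die "open $file: $!";
        while (my $line = <$in>) {
            $count++;
        }
        close $in;
    }
    return (time() - $start, $count);
}

my ($t_big,   $n_big)   = count_lines("$dir/all.txt");
my ($t_small, $n_small) = count_lines(map { "$dir/part$_.txt" } 1 .. $n_files);

printf "one big file:     %.3fs (%d lines)\n", $t_big,   $n_big;
printf "many small files: %.3fs (%d lines)\n", $t_small, $n_small;
```

Both runs read identical data, so whatever gap shows up between them is roughly the cost of the extra opens and closes.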