Alex,

The amount of text is quite significant ... easily contain 15-30 mb. of text..

You really have 2 problems that you are trying to solve with one script. First, you need to get the data on a separate server, and second, you need to process the data.

For the first part of the problem, I would use  use IO::Compress::Gzip and then send the data to the second machine. Your mileage may vary, but I would expect your 15-30MByte file to compress to 1-3MBype. Fast and secure, and core code. Then use  use IO::Uncompress::Gunzip on the second machine to get back to the original data.

For the second problem, IMO, use the power of Unix. Use multiply scripts to process the data in parallel. The data should be time-stamped, so the data going into the database will be correct, which is more important than the fastest script. I would use cron to check on the status of the running scripts. Save your pids ($$) in a common place and use a small simple perl script to check on their running. It's quite simple to send a "text message" to multiple admins if you discover problems with the scripts! And use 'sleep' or 'usleep' between script passes, you'll get a lot more work done in the long run.

Good Luck!

"Well done is better than well said." - Benjamin Franklin


In reply to Re^3: Unpacking and converting by flexvault
in thread Unpacking and converting by dwalin

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.