Good remarks so far, but I don't think anyone has yet pointed out that you're slurping your input file into an array and then iterating over it. For a large multi-megabyte file that causes a good bit of overhead in and of itself, to say nothing of inflating your process's size, which may hurt performance further if it then winds up causing extra paging by the OS. Unless you need the full file for context (which, from my admittedly perfunctory skim of the code, you don't appear to), there's no reason not to read that input file line by line.
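To illustrate the line-by-line approach, here's a minimal sketch. I haven't reproduced your actual processing; `process_file` and the callback are hypothetical names standing in for whatever your loop body does.

```perl
use strict;
use warnings;

# Hypothetical helper: walk a file one line at a time instead of
# slurping it into an array first. Only the current line is held in
# memory, no matter how big the file is.
sub process_file {
    my ($path, $callback) = @_;

    open my $fh, '<', $path or die "Cannot open $path: $!";

    my $count = 0;
    while (my $line = <$fh>) {
        chomp $line;
        $callback->($line);    # your per-line work goes here
        $count++;
    }
    close $fh;

    return $count;             # number of lines processed
}
```

Called as `process_file('big_input.txt', sub { my ($line) = @_; ... })` (the filename is made up), this replaces the `my @lines = <$fh>; for (@lines) { ... }` pattern.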

Presumably your mapping files are going to be the smaller inputs, so (as was mentioned) I'd suggest restructuring things to read those into data structures once, then work through the meat of the main input line by line. If you can alter things to work from a hash lookup instead of the multiple substitutions, then even if the mappings are "large" you can use something like GDBM_File or DB_File to keep them out of memory and look them up from disk instead.
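A rough sketch of the load-once, look-up-per-line idea follows. It assumes the mapping file holds tab-separated "from, to" pairs, which is a guess on my part, and all three sub names are hypothetical. It builds one regex from the hash keys so each line gets a single substitution pass rather than one pass per mapping.

```perl
use strict;
use warnings;

# Assumption: each mapping line is "from<TAB>to". Adjust the split to
# match your real mapping format.
sub load_mapping {
    my ($path) = @_;
    my %map;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($from, $to) = split /\t/, $line, 2;
        next unless defined $from && length $from && defined $to;
        $map{$from} = $to;
    }
    close $fh;
    return \%map;
}

# Build a single alternation from the keys, longest first so that a
# longer key wins over a shorter key that is its prefix.
sub build_matcher {
    my ($map) = @_;
    my @keys = sort { length($b) <=> length($a) || $a cmp $b } keys %$map;
    return undef unless @keys;
    my $alt = join '|', map { quotemeta } @keys;
    return qr/($alt)/;
}

# One substitution pass per line; the replacement is a hash lookup.
sub translate_line {
    my ($line, $map, $re) = @_;
    return $line unless defined $re;
    $line =~ s/$re/$map->{$1}/g;
    return $line;
}
```

If the mapping is too big to hold in memory, the same lookups work against a tied hash, e.g. `tie my %map, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $DB_HASH` (with `use DB_File; use Fcntl;`). In that case you'd want to look up individual fields by key rather than build one giant alternation from every key.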

Edit: s/it the/it then/ ; me no tipe gud tewday. Also, if you're really looking to speed things up, you could use MCE::Loop to split the reading of the large file across multiple consumers. But fix the structural problems first; then it'll be easier, because you'll have a cleaner line-by-line processing loop to shove into the MCE bits.
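For the MCE part, something along these lines is roughly what I have in mind. Treat it as a sketch: `parallel_transform` is a hypothetical wrapper, the worker count and chunk size are arbitrary, and it assumes MCE is installed from CPAN. Each worker gets a chunk of lines, and results are gathered by chunk id so the output can be put back in input order.

```perl
use strict;
use warnings;
use MCE::Loop;

# Hypothetical wrapper: apply $transform to every line of $path using
# several workers, returning the transformed lines in input order.
sub parallel_transform {
    my ($path, $transform, $workers) = @_;

    MCE::Loop->init(
        max_workers => $workers || 4,   # arbitrary default
        chunk_size  => 1000,            # lines handed to a worker at a time
    );

    # mce_loop_f reads the file in chunks; gathered pairs come back as
    # a chunk_id => arrayref hash.
    my %by_chunk = mce_loop_f {
        my ($mce, $chunk_ref, $chunk_id) = @_;
        my @out = map { $transform->($_) } @$chunk_ref;
        MCE->gather($chunk_id, \@out);
    } $path;

    MCE::Loop->finish;

    # Chunks can finish out of order, so reassemble by chunk id.
    return map { @{ $by_chunk{$_} } } sort { $a <=> $b } keys %by_chunk;
}
```

Note the lines arrive with their newlines still attached, and the per-line work needs to be expensive enough to outweigh the cost of shipping chunks between processes, otherwise the plain line-by-line loop will be just as fast.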

The cake is a lie.
The cake is a lie.
The cake is a lie.


In reply to Re: Optimization tips by Fletch
in thread Optimization tips by sroux
