You are slurping the entire file into memory before processing it. This uses more memory than necessary. You only need to read one line at a time from the file. Something like:

    while (<DB>) {
        chomp;
        RECORD: while (m/\G(.*?(?<!\\)(\\\\)*)\Q$dataseperator\E/gs) {
            ...
        }
    }

It also looks as though you are jumping through all sorts of hoops because you want to deal with multiline fields (the last-but-one field?).

If that's the case, you should structure the file differently, to take advantage of Perl's strengths. For instance, you could end each record with a special token, like %% (taking care to escape any %% that appears in the fields as data, e.g. as \%\%).

Once you have the datafile in that format, you can set the input record separator ($/) to '%%', so that each read returns one whole record, embedded newlines and all. That will simplify your code considerably.
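Something along these lines is what I have in mind for reading %%-terminated records. Treat it as a rough sketch: read_records is a name I made up, the '|' field separator and the sample data are assumptions for the demo, and the unescaping is the simplest thing that could work.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original code): read records that are
# terminated by '%%' from an already opened filehandle.
sub read_records {
    my ($fh) = @_;
    local $/ = '%%';                  # input record separator, restored on return
    my @records;
    while (my $record = <$fh>) {
        chomp $record;                # chomp strips the current $/, i.e. the trailing '%%'
        next unless $record =~ /\S/;  # skip the whitespace-only chunk after the last '%%'
        $record =~ s/^\n//;           # drop the newline that followed the previous '%%'
        $record =~ s/\\%\\%/%%/g;     # undo the \%\% escaping of literal '%%' data
        push @records, $record;
    }
    return @records;
}

# Demo on an in-memory filehandle; '|' as field separator is an assumption.
my $data = "first|line one\nline two|x%%\nsecond|has \\%\\% inside|y%%\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @records = read_records($fh);
close $fh;

print scalar(@records), " records read\n";
```

The multiline field in the first record comes back in one piece, with no \G gymnastics required. Using local on $/ keeps the change from leaking into any other code that reads files.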

The routine also appears to be relying on a number of external variables: $dataseperator, $templatetail, $datafile and the like. Basic code hygiene would suggest that you pass these variables in as parameters.

Finally, the routine appears to be doing too much. Not only is it performing a search, it is also printing out stuff, and (horrors!) calling exit to end the program.

A better architecture would have the search routine only performing a search. The printing should be hoisted up a level into the calling code (even if that itself is another routine) and the exit call should be placed at the highest level of the code tree. (Either that, or rename the routine print_header_search_results_and_footer_then_exit).

At least then people will have fair warning of what happens when they call the routine. So will the maintenance programmer reading the code a few years later. In fact, especially the maintenance programmer reading the code a few years later.
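To make the split of responsibilities concrete, here is a minimal sketch. The names (search_records, the parameters, the sample data) are mine, not yours, and the matching is deliberately naive: it splits on the separator and ignores the backslash escaping your regex handles.

```perl
use strict;
use warnings;

# Hypothetical search routine: it receives everything it needs as
# parameters, returns the matching records, and neither prints nor exits.
sub search_records {
    my ($fh, $separator, $term) = @_;
    my @matches;
    while (my $line = <$fh>) {
        chomp $line;
        # A limit of -1 keeps trailing empty fields.
        my @fields = split /\Q$separator\E/, $line, -1;
        # Keep the record if any field contains the search term.
        push @matches, \@fields if grep { index($_, $term) >= 0 } @fields;
    }
    return @matches;
}

# The calling code owns the output; any exit belongs up here as well,
# at the top level, not inside the search routine.
my $data = "apple|red|fruit\ncarrot|orange|vegetable\ncherry|red|fruit\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @found = search_records($fh, '|', 'red');
close $fh;

print "Search results\n";                   # header
print join(', ', @$_), "\n" for @found;     # one line per matching record
print scalar(@found), " match(es)\n";       # footer
```

Because the routine just returns a list, you can reuse it from a CGI script, a command-line tool or a test without dragging the header and footer along.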

- another intruder with the mooring of the heart of the Perl


In reply to Re: Searching a text file (avoid slurping if you can) by grinder
in thread Searching a text file by monoxide
