Dear Perl monks,

I have several instances of a script running concurrently, and each of them periodically updates data in a large (~20 MB) text file. My goal is to optimize how the file is read and updated.

I have written new code (VERSION#2) that should replace the VERSION#1 code in order to improve the script's performance.

As far as I know, Perl's flock should prevent other scripts from writing to a file while another script holds a lock on it. However, I'm not sure how it will behave when I use rename() the way it's implemented in VERSION#2.
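To make the rename() concern concrete, here is a minimal sketch that is not part of either version. It assumes a POSIX-style filesystem, and the scratch directory and file names are made up for illustration. As far as I understand, flock attaches to the open file rather than to its name, so the sketch compares the inode behind a locked handle with the inode the name points to after rename():

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical scratch directory and file names, for illustration only.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/test.txt";
my $tmp  = "$dir/test.txt.tmp";

# Create the original file.
open(my $out, '>', $file) or die "create $file: $!";
print {$out} "old contents\n";
close($out) or die "close $file: $!";

# A reader opens the file and takes a shared lock, as VERSION#2 does.
open(my $log, '<', $file) or die "open $file: $!";
flock($log, LOCK_SH) or die "flock $file: $!";
my $locked_inode = (stat($log))[1];

# Write a replacement and rename it over the original name.
open(my $new, '>', $tmp) or die "create $tmp: $!";
print {$new} "new contents\n";
close($new) or die "close $tmp: $!";
rename($tmp, $file) or die "rename: $!";

# Compare the inode that is locked with the inode the name refers to now.
my $current_inode = (stat($file))[1];
my $same = $locked_inode == $current_inode ? 1 : 0;
print "locked handle and current file share an inode: ",
      ($same ? "yes" : "no"), "\n";

close($log);
```

If the two inodes differ, a lock taken on the old handle says nothing about the file that other instances will open by name afterwards.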

I'd like to ask your advice. Are there any flaws in the second version of the code? I am mostly concerned about the integrity of the processed text file. Compared with the first version, is the file more likely to be corrupted under heavy simultaneous access by multiple script instances?

My Environment: Linux, Perl 5.8.8

VERSION#1

# reading file into array and then process it
# and write out to the same file:
my $file = "./test.txt";
open (LOG, "+<$file");
flock LOG, 2;
my @logfile=<LOG>;
seek (LOG, 0, 0);
truncate (LOG,0);
foreach my $line(@logfile){
    # do some manipulations with $line...
    print LOG $line;
}
close LOG;

VERSION#2

# trying to optimize performance and memory usage
# by replacing foreach() with while()
my $file = "./test.txt";
my $tempfile = "./tempfile".(rand()*9999);
open (LOG, $file);
flock LOG, 1;
open (TMP, ">$tempfile");
flock TMP, 2;
while (my $line = <LOG>){
    # here we do some manipulations with $line
    print TMP $line;
}
close TMP;
rename($tempfile,"$file");
close LOG;
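For comparison, one pattern that is often suggested for this kind of read, rewrite and rename cycle is to hold an exclusive lock on a separate lock file that is never renamed. The sketch below is only an illustration under that assumption. The helper name update_file, the ".lock" suffix and the $transform callback are hypothetical and do not come from either version above.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# update_file() is a hypothetical helper, not code from either version.
# $transform receives one line and returns the line to write out.
sub update_file {
    my ($file, $transform) = @_;

    # Lock a separate file that is never renamed, so that every
    # instance contends for the lock on the same inode.
    open(my $lock, '>>', "$file.lock") or die "open $file.lock: $!";
    flock($lock, LOCK_EX) or die "flock $file.lock: $!";

    open(my $in, '<', $file) or die "open $file: $!";

    # Create the temp file in the same directory so that rename()
    # stays within one filesystem.
    my ($out, $tmpname) = tempfile('updateXXXXXX', DIR => dirname($file));

    # Process line by line, as VERSION#2 does, to keep memory use low.
    while (my $line = <$in>) {
        print {$out} $transform->($line) or die "write $tmpname: $!";
    }
    close($out) or die "close $tmpname: $!";
    close($in);

    rename($tmpname, $file) or die "rename $tmpname -> $file: $!";

    close($lock);    # closing the handle releases the lock
    return;
}
```

A call might look like update_file('./test.txt', sub { my ($line) = @_; return $line; }). Note that File::Temp creates the temporary file with restrictive permissions, so the replaced file may end up with a different mode than the original, and the scheme only works if every instance takes the same lock.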


In reply to Trying to optimize reading/writing of large text files. by nikkimouse
