Dear Perl monks,
I have multiple running instances of a script that periodically update data in a large (~20 MB) text file. My goal is to optimize how the file is read and updated.
I have written new code (VERSION#2) that should replace the VERSION#1 code in order to improve the script's performance.
As far as I know, Perl's flock should prevent other scripts from writing to a file that another script already has open and locked. However, I'm not sure how it will behave when I use rename() the way it is done in VERSION#2.
I'd like to ask your advice. Are there any flaws in the second version of the code? I am mostly concerned about the integrity of the processed text file. Is the file more likely to be corrupted than with the first version when many script instances access it simultaneously?
My Environment: Linux, Perl 5.8.8
VERSION#1
# reading file into array and then process it
# and write out to the same file:
my $file = "./test.txt";
open (LOG, "+<$file");
flock LOG, 2;
my @logfile=<LOG>;
seek (LOG, 0, 0);
truncate (LOG,0);
foreach my $line(@logfile){
    # do some manipulations with $line...
    print LOG $line;
}
close LOG;
VERSION#2
# trying to optimize performance and memory usage
# by replacing foreach() with while()
my $file = "./test.txt";
my $tempfile = "./tempfile".(rand()*9999);
open (LOG, $file);
flock LOG, 1;
open (TMP, ">$tempfile");
flock TMP, 2;
while (my $line = <LOG>){
    # here we do some manipulations with $line
    print TMP $line;
}
close TMP;
rename($tempfile,"$file");
close LOG;
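For comparison, here is a minimal sketch of the lock-file-plus-rename pattern that is often suggested for this kind of update. It is not part of either version above. The reasoning behind it: flock is advisory and is tied to the open file itself, so once rename() puts a new file under the old name, a lock held on the old handle no longer blocks instances that open the name afterwards. In addition, the shared lock (1) in VERSION#2 lets two instances read at the same time, and whichever renames last silently discards the other's changes. The names update_file and "$file.lock" are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# update_file() is a hypothetical helper, not code from the original script.
# $transform is a callback that receives one line and returns the line to write.
sub update_file {
    my ($file, $transform) = @_;

    # Lock a separate sidecar file. It is never renamed, so every instance
    # contends for the same lock even after $file itself has been replaced.
    open(my $lock, '>>', "$file.lock") or die "Cannot open $file.lock: $!";
    flock($lock, LOCK_EX) or die "Cannot lock $file.lock: $!";

    open(my $in, '<', $file) or die "Cannot open $file: $!";

    # Create the temp file in the same directory so that rename()
    # stays within one filesystem.
    my ($out, $tempname) = tempfile('updateXXXXXX', DIR => dirname($file));

    # Process line by line to keep memory usage low.
    while (my $line = <$in>) {
        print {$out} $transform->($line) or die "Write to $tempname failed: $!";
    }
    close($out) or die "Cannot close $tempname: $!";
    close($in);

    rename($tempname, $file) or die "Cannot rename $tempname to $file: $!";

    close($lock);    # closing the handle releases the lock
    return;
}
```

A call might look like update_file("./test.txt", sub { my $line = shift; return $line; }), with the manipulations done inside the callback. Two caveats under these assumptions: every script that touches the file has to take the same lock, and File::Temp creates the temporary file with restrictive permissions (0600), so the replaced file may need a chmod afterwards.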