What I want to achieve is editing a file in-place (via a temporary file) and optionally creating a backup file as well if a backup filename is given.
My idea was therefore to use the backup file as the temporary file, where a backup filename is given, or else use File::Temp to create one for me. I then copy to the temporary file, read from it and write back over the original, and then leave File::Temp to clean up the temporary file if one was created. (If a specified backup file was used instead, then it gets left afterwards, of course.)
I could start by renaming the original to the backup/temporary name instead, as you suggest, but where do I get the temporary name from? File::Temp returns an open filehandle - no good for renaming my original file to, hence I was looking to copy to it instead.
Actually, having read "Re-runnably editing a file in place", I'm now thinking something along those lines would be better:
I could get a temporary filehandle, read from the original file, process the data and write to the temporary filehandle. Then I'd want to rename the temporary file to the original filename, but I don't know the temporary filename unless I ignore File::Temp's advice and pick up both the handle and the name. Maybe that's safe enough since I wouldn't be doing anything with the temporary filename except renaming it (and I therefore wouldn't want File::Temp to try to delete the temporary file either). (I'd have to create the backup file separately, rather than using it as the temporary file, in this scheme, of course.)
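A minimal sketch of that scheme, under some assumptions: `edit_in_place` and its `$process` callback are hypothetical names, and the temporary file is created in the same directory as the original (via File::Temp's `DIR` option) so that the rename stays on one filesystem. `UNLINK => 0` tells File::Temp not to delete the file afterwards.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: rewrite $file by passing each line through $process.
sub edit_in_place {
    my ($file, $process) = @_;

    # List context gives both the handle and the name. The file lives next
    # to the original and is not scheduled for deletion.
    my ($tmpfh, $tmpfile) = tempfile(DIR => dirname($file), UNLINK => 0);
    binmode $tmpfh;

    open my $fh, '<', $file or die "Cannot open $file: $!";
    binmode $fh;
    while (my $line = <$fh>) {
        print {$tmpfh} $process->($line) or die "Cannot write $tmpfile: $!";
    }
    close $fh;
    close $tmpfh or die "Cannot close $tmpfile: $!";

    # The name is used for nothing except this rename.
    rename $tmpfile, $file or die "Cannot rename $tmpfile to $file: $!";
    return;
}
```

Called as `edit_in_place('test.txt', sub { uc $_[0] })`, this would upper-case every line. Note that the sketch leaves the temporary file behind if it dies part-way through, and that the replaced file ends up with the temporary file's permissions rather than the original's.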
- Steve
my $file = ...;
my $n=0;
if( -e $file ) { ## Stop endless loop if $file doesn't exist
$n++ until rename $file, "$file.bak$n";
}
else {
die "$file doesn't exist";
}
my $backup = "$file.bak$n";
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
That doesn't look like a great way to choose a backup filename - the rename will succeed even for candidate backup filenames that exist (permissions permitting), so the backup has potentially just clobbered another file! (Or have I misunderstood you?)
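For comparison, here is one possible way to pick a numbered backup name without clobbering anything. It is only a sketch: `backup_without_clobbering` is a hypothetical name, and it copies rather than renames. The idea is that `sysopen` with `O_CREAT | O_EXCL` fails if the name already exists, so an existing file is never overwritten.

```perl
use strict;
use warnings;
use Errno;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Copy qw(copy);

# Hypothetical helper: copy $file to the first "$file.bakN" not yet in use.
sub backup_without_clobbering {
    my ($file) = @_;
    die "$file doesn't exist" unless -f $file;

    for (my $n = 0; ; $n++) {
        my $backup = "$file.bak$n";
        # O_EXCL makes the open fail if the name is already taken.
        if (sysopen my $out, $backup, O_WRONLY | O_CREAT | O_EXCL) {
            binmode $out;
            copy($file, $out) or die "Cannot copy $file to $backup: $!";
            close $out or die "Cannot close $backup: $!";
            return $backup;
        }
        # Any failure other than "already exists" is a real error.
        die "Cannot create $backup: $!" unless $!{EEXIST};
    }
}
```

The first call on `test.txt` would produce `test.txt.bak0`, the next `test.txt.bak1`, and so on.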
Anyway, as I said, the backup filename is supplied by the caller of this code if a backup file is required. My real concern is what the best way to achieve the in-place edit via a temporary file is, possibly taking advantage of the given backup filename if one is given.
I like the idea of writing the processed data to a temporary file and then moving that back (either (1) by a rename or (2) by copying the contents), rather than my original idea of moving/copying the file to be edited and then writing the processed data back to it, so that the process can be easily re-run if it failed the first time.
However, both options (1) and (2) above have problems:
Option (1) goes something like this (return values obviously need checking, and there are some chmod games that can be played too, but this is the bare bones of it):
use File::Temp qw(tempfile);
my $file = 'test.txt';
my($tmpfh, $tmpfile) = tempfile();
open my $fh, '<', $file;
binmode $fh;
while (<$fh>) {
# Process $_ here
print $tmpfh $_;
}
close $fh;
close $tmpfh;
rename $tmpfile, $file;
I can see two problems with that. Firstly, tempfile() was not called in scalar context, so the temporary file will not be cleaned up if the program is interrupted or killed. (A $SIG{INT} handler could arrange for it to be cleaned up if the program is interrupted, but not if it is killed.) Secondly, while the rename itself is (normally) atomic, there is a race condition between the close and the rename: somebody else could potentially modify the file in between.
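A possible variation on option (1) that narrows the cleanup problem, sketched with File::Temp's object interface. `edit_via_rename` and `$process` are hypothetical names. The object removes its file when it is destroyed, which covers die() and normal exit. As far as I can tell it does not help against an uncaught signal or an outright kill, and it does nothing about the close/rename race.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp ();

# Hypothetical helper: option (1), with the temporary file tied to an object.
sub edit_via_rename {
    my ($file, $process) = @_;

    # Same directory as the original, so rename() stays on one filesystem.
    my $tmp = File::Temp->new(DIR => dirname($file), UNLINK => 1);
    binmode $tmp;

    open my $fh, '<', $file or die "Cannot open $file: $!";
    binmode $fh;
    while (my $line = <$fh>) {
        print {$tmp} $process->($line) or die "Cannot write temporary file: $!";
    }
    close $fh;
    close $tmp or die "Cannot close temporary file: $!";

    # One of the "chmod games": give the new file the original's permissions.
    my $mode = (stat $file)[2] & 07777;
    chmod $mode, $tmp->filename or die "Cannot chmod temporary file: $!";

    rename($tmp->filename, $file) or die "Cannot rename over $file: $!";

    # The file now lives under the original name; nothing left to delete.
    $tmp->unlink_on_destroy(0);
    return;
}
```

If `$process` dies half-way through, `$tmp` goes out of scope, the temporary file is unlinked, and the original is left untouched.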
Option (2) looks like this (with the same caveats as before):
use Fcntl qw(:seek);
use File::Temp qw(tempfile);
my $file = 'test.txt';
my $tmpfh = tempfile();
open my $fh, '<', $file;
binmode $fh;
while (<$fh>) {
# Process $_ here
print $tmpfh $_;
}
close $fh;
seek $tmpfh, 0, SEEK_SET;
open my $fh2, '>', $file;
binmode $fh2;
print $fh2 $_ while <$tmpfh>;
close $fh2;
close $tmpfh;
This time, the temporary file's contents are written back to the original file without the temporary file having been closed, so there is no close/rename race condition. Also, tempfile() was called in scalar context so the temporary file will be cleaned up even if the program is killed (on Win32, at least, via the O_TEMPORARY flag that is used when opening the file). However, the process of copying the temporary file's contents back to the original file is no longer atomic, so if the program is interrupted during the final while loop then the original file will be left partially written.
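One way the caller-supplied backup filename might soften that last problem is to take the backup copy immediately before the non-atomic write-back, so that a partially written original can be restored by hand. This is a sketch only: `edit_via_copy_back`, `$process` and the optional `$backup` argument are hypothetical names.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use File::Copy qw(copy);
use File::Temp qw(tempfile);

# Hypothetical helper: option (2), with an optional backup taken first.
sub edit_via_copy_back {
    my ($file, $process, $backup) = @_;

    my $tmpfh = tempfile();    # scalar context: handle only, self-cleaning
    binmode $tmpfh;

    open my $in, '<', $file or die "Cannot open $file: $!";
    binmode $in;
    while (my $line = <$in>) {
        print {$tmpfh} $process->($line) or die "Cannot write temporary file: $!";
    }
    close $in;

    # Safety copy before the step that is not atomic.
    if (defined $backup) {
        copy($file, $backup) or die "Cannot copy $file to $backup: $!";
    }

    seek $tmpfh, 0, SEEK_SET or die "Cannot rewind temporary file: $!";
    open my $out, '>', $file or die "Cannot open $file for writing: $!";
    binmode $out;
    while (my $line = <$tmpfh>) {
        print {$out} $line or die "Cannot write $file: $!";
    }
    close $out or die "Cannot close $file: $!";
    close $tmpfh;
    return;
}
```

If the program is interrupted during the final loop, the backup still holds the unmodified data. When no backup filename is given, the sketch behaves like the bare-bones version above and has the same weakness.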
So neither option is perfect. Which approach is the lesser of the two evils? Is there another approach with none of these pitfalls?