If all you want to do is change a single byte in a file - rather than copy the file - take a look at seek.
You can use it to move to an arbitrary point in a file. You can then just overwrite the byte you want to change rather than copying the whole file.
Doh - Of course!! In my job I'm so used to reading text data in one form and writing it out to a new file in a usually entirely different form that it didn't even occur to me to just alter the existing file. So I should have opened the file with the "+<" mode for read/write, done a seek to position 148 and written the byte there. (I would still need to copy the file first as I need to keep the original.) Talk about closed thinking - I can't believe I overlooked that. Thanks adrianh and ++ (once I get some more votes!). Cheers!
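A minimal sketch of that copy-then-overwrite approach, for illustration only. The helper name patch_byte and its arguments are made up here; the calls themselves (File::Copy's copy, open with "+<", seek, print) are standard Perl.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use Fcntl qw(SEEK_SET);

# Hypothetical helper: copy $src to $dst (so the original is kept),
# then overwrite the byte at $offset in the copy with $byte.
sub patch_byte {
    my ($src, $dst, $offset, $byte) = @_;

    copy($src, $dst) or die "copy $src -> $dst failed: $!";

    # "+<" opens for read/write WITHOUT truncating the file.
    open(my $fh, '+<', $dst) or die "open $dst: $!";
    binmode($fh);    # treat the data as raw bytes

    seek($fh, $offset, SEEK_SET) or die "seek: $!";
    print {$fh} $byte or die "write: $!";
    close($fh) or die "close: $!";
    return;
}

# Example call (file names are placeholders):
# patch_byte('original.dat', 'patched.dat', 148, "\x41");
```

Note that this overwrites in place, so the file length stays the same as long as the write does not run past the end of the file.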
Now that I've had time to get over my embarrassment, I'm still a bit curious as to how to do this if I were inserting into the file rather than overwriting. If the file is very large, what would be the best/fastest/most memory-efficient way to copy big sections (including up to the end of the file) of an input file to an output file? A character at a time sounds inefficient, so I'm assuming larger chunks would be better, but then you need to handle the end of the file carefully etc.
Cheers!
Keep reading through the file, a block at a time, until you get to the block with your character. Print out the block up to your character, then whatever you need to insert, then the rest of the block. Then just loop until EOF, again a block at a time, for the rest of the file.
Something like this:
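#!/usr/bin/perl -w
use strict;

# Usage: script.pl POSITION STRING < infile > outfile
my ($replacepos, $replacestr) = @ARGV;
my $BLOCKSIZE = 10;
my $buf;
my $pos = 0;    # offset of the current block within the input

binmode(STDIN);
binmode(STDOUT);

while (read(STDIN, $buf, $BLOCKSIZE))
{
    # Does this block contain our byte?
    if (($replacepos >= $pos) && ($replacepos < ($pos + length($buf))))
    {
        # Everything before the byte, the new string, everything after the byte.
        # Drop the "+1" to insert before the byte instead of replacing it.
        print substr($buf, 0, $replacepos - $pos),
              $replacestr,
              substr($buf, $replacepos - $pos + 1);
    }
    else
    {
        print $buf;
    }
    $pos += length($buf);
}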
10 is a good blocksize for demonstration, because it's easy to verify the edge cases. In real life, on most systems 4096 (or a multiple of it) is a sensible block size, since it usually matches the size the system really reads from the disk.
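For the pure insertion case asked about above, here is a rough sketch of the same block-at-a-time idea wrapped in a sub that works on any pair of filehandles. The name copy_with_insert, its argument order, and the choice to append when the offset is at or past EOF are my assumptions, not something from the posts above; the offset is assumed to be non-negative.

```perl
use strict;
use warnings;

# Hypothetical helper: copy everything from $in to $out, inserting $insert
# just before byte offset $inspos. Reads $blocksize bytes at a time.
sub copy_with_insert {
    my ($in, $out, $inspos, $insert, $blocksize) = @_;
    $blocksize ||= 4096;
    binmode($in);
    binmode($out);

    my $pos  = 0;    # offset of the current block within the input
    my $done = 0;    # set once the insertion has been written
    my $buf;

    while (1) {
        my $n = read($in, $buf, $blocksize);
        die "read failed: $!" unless defined $n;
        last if $n == 0;    # EOF

        if (!$done && $inspos < $pos + $n) {
            # The insertion point falls inside this block: split it there.
            my $split = $inspos - $pos;
            print {$out} substr($buf, 0, $split), $insert, substr($buf, $split);
            $done = 1;
        }
        else {
            print {$out} $buf;
        }
        $pos += $n;
    }

    # Assumption: an offset at or beyond EOF means "append".
    print {$out} $insert unless $done;
    return;
}

# Example call (file names are placeholders):
# open(my $in,  '<', 'big.dat')     or die $!;
# open(my $out, '>', 'big.new.dat') or die $!;
# copy_with_insert($in, $out, 148, "inserted text");
# close($out) or die $!;
```

Because read returns the number of bytes actually read, the last (short) block needs no special handling, which takes care of the end-of-file worry.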
If it's a genuinely tremendous file and there are actually performance problems, using Mmap might be more efficient (it is in C; I haven't used it in Perl).