in reply to Re^2: In place editing without reading further
in thread In place editing without reading further

If the use case demands it, then so be it.

However, it sounds like you don't actually have any hard numbers. Give it a try both ways and see how long it really takes! Benchmarks are far better than random numbers pulled out of ... the air.

You can also consider a two-pass system: do the in-place edit first, and if $valueLength turns out to be too short for the new value, rewrite that file after you have finished all the easy files.
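A rough sketch of that two-pass idea, in case it helps. Everything here is an assumption on my part: the helper name edit_in_place, the 'END HEADER' marker as the place to stop scanning, and above all that the file format tolerates trailing spaces as padding when the new value is shorter than the old one. Only $valueLength comes from your description.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: overwrite one header line in place.
# Returns 1 on success, 0 if the line was not found or the new value
# does not fit (the caller then queues the file for a full rewrite).
sub edit_in_place {
    my ($path, $old, $new) = @_;
    open(my $fh, '+<', $path) or die "Cannot open $path: $!";
    binmode $fh;
    my $result = 0;
    while (1) {
        my $lineStart = tell($fh);          # byte offset of this line
        my $line = <$fh>;
        last unless defined $line;          # end of file
        chomp(my $value = $line);
        last if $value eq 'END HEADER';     # never touch the body
        next unless $value eq $old;
        my $valueLength = length $value;
        last if length($new) > $valueLength;    # too long: leave file untouched
        # ASSUMPTION: trailing spaces are harmless in this format.
        # Padding keeps the line at exactly the same byte length.
        my $padded = $new . (' ' x ($valueLength - length $new));
        seek($fh, $lineStart, 0) or die "seek failed: $!";
        print {$fh} $padded or die "write failed: $!";
        $result = 1;
        last;
    }
    close($fh) or die "close failed: $!";
    return $result;
}

# Pass 1: try the cheap in-place edit on every file.
my @needRewrite;
for my $file (@ARGV) {
    push @needRewrite, $file unless edit_in_place($file, 'a5', 'new');
}

# Pass 2: only these files need the slow copy-and-rewrite treatment.
print "Needs full rewrite: $_\n" for @needRewrite;
```

The point of the design is that the expensive copy only runs for the files where the value did not fit, so the common case stays cheap.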

Either way, benchmark it! Tell us how long it actually takes.
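If you want something more repeatable than a stopwatch, here is a minimal timing sketch using Time::HiRes. The function name timed_copy and the choice of 32k, 64k and 128k blocks are only illustrative assumptions; plug in whatever your real script does. Keep in mind that repeated runs over the same file may be served from the OS cache, so the first run and later runs are not always comparable.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Copy $src to $dst in chunks of $blockSize bytes; return elapsed seconds.
sub timed_copy {
    my ($src, $dst, $blockSize) = @_;
    my $start = time();
    open(my $in,  '<', $src) or die "Cannot read $src: $!";
    open(my $out, '>', $dst) or die "Cannot write $dst: $!";
    binmode $in;
    binmode $out;
    while (1) {
        my $bytesRead = sysread($in, my $buffer, $blockSize);
        die "read failed: $!" unless defined $bytesRead;
        last if $bytesRead == 0;            # end of file
        # syswrite may write less than requested, so loop until done
        my $offset = 0;
        while ($offset < $bytesRead) {
            my $written = syswrite($out, $buffer, $bytesRead - $offset, $offset);
            die "write failed: $!" unless defined $written;
            $offset += $written;
        }
    }
    close($out) or die "close failed: $!";
    close($in);
    return time() - $start;
}

# Usage: perl bench.pl SOURCE DESTINATION
if (@ARGV == 2) {
    my ($src, $dst) = @ARGV;
    for my $kb (32, 64, 128) {
        printf "%4dk: %.3fs\n", $kb, timed_copy($src, $dst, $kb << 10);
    }
}
```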

Re^4: In place editing without reading further
by trippledubs (Deacon) on Jan 29, 2015 at 15:24 UTC
    Test box at home - Ubuntu ext3 fs, I think about.. 2 years old.
    Size     Block  Run 1      Run 2      Run 3
    5 GB     32k    2m24.716s  2m25.723s  2m24.012s
    5 GB     64k    2m23.235s  2m25.939s  -
    5 GB     128k   2m18.724s  -          -
    11 GB    32k    5m48.613s  5m50.557s  5m55.207s
    11 GB    128k   5m38.264s  5m29.513s  5m38.922s
    15.5 GB  128k   9m31.711s  7m45.154s  9m32.641s

    Beefy server with SAN storage

    Size     Block  Run 1      Run 2      Run 3
    14 GB    64k    2m16.941s  2m40.087s  2m30.454s
    14 GB    128k   2m14.720s  2m22.201s  2m26.875s

    Discarding the home server results, we roughly judge the penalty for a failed in-place edit at about 40 minutes. The copying script costs about two and a half minutes, and suppose its payoff is a failure rate of 0%. I loosely interpret this to mean that if the in-place script fails more than once in every sixteen runs (40 / 2.5 = 16), it is not worth running. If it fails less often than that, it is worth the risk of damaging the file and having to redo everything.
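Spelled out, the one-in-sixteen figure is a simple expected-cost comparison. The symbols are mine, not from any measurement: p is the probability that an in-place edit fails, C_fail is the roughly 40 minute recovery cost, and C_copy is the roughly 2.5 minute cost of the copying script. It also assumes the in-place edit itself takes negligible time.

```latex
p \cdot C_{\text{fail}} < C_{\text{copy}}
\quad\Longrightarrow\quad
p < \frac{C_{\text{copy}}}{C_{\text{fail}}}
  = \frac{2.5\ \text{min}}{40\ \text{min}}
  = \frac{1}{16}
  = 6.25\%
```

So the in-place edit only pays off on average if it fails in fewer than about 6% of runs.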

    14 GB is a good estimate of how large these files will be. For smaller files the in-place edit looks very risky: the time saved shrinks with the file size, while the penalty for a failure does not shrink proportionally.

Re^4: In place editing without reading further
by trippledubs (Deacon) on Jan 28, 2015 at 20:21 UTC
    Okay. Here is the code I came up with while attempting to implement your suggestion and anon's. I will adapt it for the real files shortly and post the results. The test file, message.txt, has one entry per line (run together here):
    HEADER a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 END HEADER b1 b2 c1 c2 .. z1 z2
    I chose a5 as the line to change, and I want to change only that line; everything else should stay exactly the same.
    #!/usr/bin/env perl
    use strict;
    use warnings;

    open(my $fh, '<', 'message.txt') or die "Cannot open message.txt: $!";
    binmode $fh;

    # Read line by line until the end-of-header marker.
    my $foundEnd = 0;
    LINE: while (<$fh>) {
        if (/^END HEADER\r?$/) { $foundEnd = 1; last LINE; }
    }
    die "No END HEADER line found\n" unless $foundEnd;
    my $headerEndingPositionInBytes = tell($fh);
    print "Found header ending at $headerEndingPositionInBytes\n";

    seek($fh, 0, 0) or die "seek failed: $!";   # Rewind to beginning of file
    my $header;
    my $bytesRead = read($fh, $header, $headerEndingPositionInBytes);
    print "Read $bytesRead bytes into header variable\n";

    # A limit of -1 keeps the trailing empty field, so the join below
    # puts back the newline that ends the header.
    my @lines = split /\n/, $header, -1;
    s/^a5$/new magic/ for @lines;
    $header = join "\n", @lines;

    open(my $newFile, '>', 'message-fixed.txt') or die "Cannot open message-fixed.txt: $!";
    binmode $newFile;
    print {$newFile} $header;   # Write the header

    # Copy the rest of the file unchanged, one block at a time.
    my $blockSize = 32 << 10;   # 32k
    my $window;
    while ($bytesRead = read($fh, $window, $blockSize)) {
        print {$newFile} $window;
    }
    close($newFile) or die "close failed: $!";