walkingthecow has asked for the wisdom of the Perl Monks concerning the following question:

Hey all, I have a pretty simple question. I am trying to replace a line in a file that is being processed. See the code below for a better understanding.

#!/usr/bin/perl
while (my $line=<>) {
    my ($blah,$blah2,$bla,$time,$blah3)=split(/:/,$line);
    if ($blah eq "test") {
        $time="UNDEF";
        # Replace line in same file.
    }
}

I know, a very simple example, but I think it gets the idea across. Rather than opening and writing to a new file, I just want to replace the line in the file that we are working with. I want to update the $time, let's say. See below for a BEFORE/AFTER example.

BEFORE: "test:bob:jones:04/20/09:john",
AFTER: "test:bob:jones:UNDEF:john"

UPDATE: I am not looking to do this for a file at the command line. I am looking to do this within my code for any file given to the script. I can do it with a system call to sed (i.e., system("sed \"s/$line/$blah:$blah2:$bla:$time:$blah3/\" FILE > /tmp/file; mv /tmp/file FILE"); ). However, that is not something that I wish to do, and I know that there is a far easier way to do this without system calls.

Replies are listed 'Best First'.
Re: Replace current line in while loop
by tilly (Archbishop) on May 13, 2009 at 18:17 UTC
    I really, really, really recommend opening and writing a new file. But if you want you could read the documentation for open to learn about the +< mode, then tell and seek to learn how to move around within a filehandle, then use print to overwrite it.

    If you do that you'll learn that a file is just a stream of bytes. So you can't just replace a line unless the replacement is the same length as the original. Otherwise you have to rewrite the rest of the file to replace the line. And once you go there then you have issues with possibly having the bit you're replacing overrun the bit you haven't read yet, with fairly bad results.
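    To make the same-length restriction concrete, here is a minimal sketch of the +< / tell / seek approach. It is not from the thread: the sub name blank_time_in_place is made up, it assumes single-byte data, and it pads "UNDEF" with spaces to the width of the old time field so that no bytes need to shift.

```perl
use strict;
use warnings;

# Hypothetical helper: overwrite field 4 (the time) on lines whose first
# field is "test". Safe only because the new value is padded to exactly
# the same byte length as the old one.
sub blank_time_in_place {
    my ($path) = @_;
    open my $fh, '+<', $path or die "Cannot open $path: $!";
    my $line_start = tell $fh;             # offset of the line about to be read
    while (my $line = <$fh>) {
        my $next_line = tell $fh;          # offset of the following line
        my @field = split /:/, $line, -1;
        if (@field >= 5 && $field[0] eq 'test') {
            my $width = length $field[3];
            die "Replacement longer than field in $path\n" if length('UNDEF') > $width;
            # Field 4 starts after fields 1..3 and their three colons.
            my $offset = length(join ':', @field[0 .. 2]) + 1;
            my $new    = sprintf '%-*s', $width, 'UNDEF';   # left-justify, pad with spaces
            seek($fh, $line_start + $offset, 0) or die "seek failed: $!";
            print {$fh} $new;
            # A seek is also required when switching from writing back to reading.
            seek($fh, $next_line, 0) or die "seek failed: $!";
        }
        $line_start = $next_line;
    }
    close $fh or die "Cannot close $path: $!";
    return;
}
```

    With the example from the question, "04/20/09" becomes "UNDEF" followed by three spaces. If the padding is unacceptable, the rest of the file has to be rewritten, which is exactly the case this reply warns about.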

    Should you go down this path, you'll learn why I really, really, really suggest opening a second file, writing the file with modifications, then doing an unlink then rename to finish.

    Update: mr_mischief pointed out that I said pos where I meant tell. Oops...

Re: Replace current line in while loop
by roboticus (Chancellor) on May 13, 2009 at 18:17 UTC
    moocow:

    You can do it with basically the same form as the sed command you gave using standard perl stuff, like so:

    open INF, '<', $IFName or die $!;
    open OUF, '>', $OFName or die $!;
    while (my $line = <INF>) {
        ... do stuff ...
        print OUF $new_line;
    }
    close INF or die $!;
    close OUF or die $!;
    rename($IFName, $IFName . ".bak") or die $!;
    rename($OFName, $IFName) or die $!;

    Above is off the top of my head, and is therefore untested...

    ...roboticus
Re: Replace current line in while loop
by mr_mischief (Monsignor) on May 13, 2009 at 18:45 UTC
    In addition to both of tilly's excellent pieces of advice, you could consider the $^I variable (which you can read about in perlvar). You could also consider something like Tie::File.
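    For reference, a rough sketch of what using $^I inside a script can look like. The sub name undef_time_inplace and the ".bak" suffix are illustrative choices, not something from the thread. Localising $^I and @ARGV keeps the in-place magic of <> confined to the sub.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite $path "in place" via $^I, keeping a
# backup copy named "$path.bak".
sub undef_time_inplace {
    my ($path) = @_;
    local ($^I, @ARGV) = ('.bak', $path);
    while (my $line = <>) {
        my @field = split /:/, $line, -1;   # last field keeps its newline
        $field[3] = 'UNDEF' if @field >= 5 && $field[0] eq 'test';
        print join ':', @field;             # goes to the rewritten file, not the terminal
    }
    return;
}
```

    Behind the scenes Perl still writes a new file and swaps it in, so this is the "write a second file" approach with the bookkeeping hidden.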

    Tie::File would be a better way than messing with $^I within a multi-line program if only for sake of maintainability.

    Really, though, a database is made to update data in place, so that work is already done. You could use a simple key/value database or a full-blown relational DB like PostgreSQL.

Re: Replace current line in while loop
by kennethk (Abbot) on May 13, 2009 at 17:51 UTC
    The command line switch -i does exactly this, but is usually used for command-line short cuts.

    Update: For example, you could do exactly what you specify in your post with the one liner:

    perl -pi -e 's/^((?:[^:]*:){3})[^:]*/${1}UNDEF/ if /^test:/' file.txt

    However, if this script is intended to do more than just a simple search-and-replace, you'll likely be better off writing a proper script where you explicitly output a new file.

      Actually, what the script does is attempt to connect to a server. If the attempt succeeds, the timestamp should be updated to reflect today's date. However, if the attempt fails, the timestamp (line) should be left alone. Unfortunately, I do not want to create a new file; I'd just like to update the timestamp on success and leave the line alone on failure.
        Given this information I'd suggest storing data in some kind of database. Even if it is just a dbm file.
        As others have pointed out, there are problematic details with modifying a file in place. You could consider using Tie::File,
        http://search.cpan.org/~mjd/Tie-File-0.96/lib/Tie/File.pm.

        This allows you to view the file as an array and handles the details for you. The file isn't automatically all read into memory, so it can work on large files. There may or may not be performance issues in your app.
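        A small sketch of how that might look for the data in the question. The sub name undef_time_tied is made up for illustration; Tie::File itself ships with Perl.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical helper: set field 4 to "UNDEF" on every line whose first
# field is "test", using Tie::File to treat the file as an array of lines.
sub undef_time_tied {
    my ($path) = @_;
    tie my @lines, 'Tie::File', $path or die "Cannot tie $path: $!";
    for my $line (@lines) {                 # elements have no trailing newline
        my @field = split /:/, $line, -1;
        next unless @field >= 5 && $field[0] eq 'test';
        $field[3] = 'UNDEF';
        $line = join ':', @field;           # assigning to the element updates the file
    }
    untie @lines;
    return;
}
```

        When the new line is a different length, Tie::File shifts the rest of the file for you, so the cost of an update still depends on how much data follows the changed line.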

        Update: As another thought, you could reconsider your approach. A common way is to just log the data by appending to a file. Then have some other program that analyzes the log data to make reports. Of course this depends upon how many lines this log file will have, etc...

        Another update: If performance is a consideration, appending to an existing file is pretty much the fastest way to write something to the disk, and this is essentially independent of the file size. Appending data to a 1K file costs the same as appending data to a 1GB file. If your program spews out, say, 1,000 lines per hour, that's only 24,000 lines per day. That is not much data in the scheme of things, and even 10x that is not much. It sounds to me like you just need daily stats. Even on Windows there is the concept of a "cron" job, and you can also start tasks at different priorities (from a .bat file, use the "start" command instead of just typing in the executable name). But I don't think you need that. I mean, crunching through, say, a 250K-line log file to make a report is easy.
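        As a rough illustration of the append-then-analyze idea, assuming a made-up "host:status:timestamp" record format (the format and the sub names log_attempt and last_success are hypothetical, not from your script):

```perl
use strict;
use warnings;

# Append one record per connection attempt; nothing is ever rewritten.
sub log_attempt {
    my ($log, $host, $ok, $when) = @_;
    open my $fh, '>>', $log or die "Cannot append to $log: $!";
    print {$fh} join(':', $host, ($ok ? 'ok' : 'fail'), $when), "\n";
    close $fh or die "Cannot close $log: $!";
    return;
}

# Separate reporting step: host => timestamp of its latest successful attempt.
sub last_success {
    my ($log) = @_;
    my %last;
    open my $fh, '<', $log or die "Cannot read $log: $!";
    while (my $line = <$fh>) {
        chomp $line;
        # Limit of 3 so the timestamp itself may contain colons.
        my ($host, $status, $when) = split /:/, $line, 3;
        next unless defined $when && $status eq 'ok';
        $last{$host} = $when;               # later records replace earlier ones
    }
    close $fh;
    return %last;
}
```

        A failed attempt simply never becomes the "last success", which gives the leave-it-alone-on-failure behaviour without touching any existing line.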

        My advice is: do something simple that functionally does what you want. If that doesn't "work" for performance reasons, then get more complex.

        Given the new information you have provided, the most reasonable approach is following tilly's advice. There is no good reason to attempt in-place editing when output-and-move is sooo much easier and less bug-prone.