amarceluk has asked for the wisdom of the Perl Monks concerning the following question:

Me and my very basic file writing questions again. I hope I'm not posting too many basic questions. But everyone's really been helpful and I greatly appreciate it. In response to my question "remove blank lines with regex", Biker gave me this advice:
Consider this:
# Open both IN_FILE and OUT_FILE here!
while(<IN_FILE>) {
    chomp;
    next unless length;
    print OUT_FILE "$_\n";
}
I always come back to the following:

Do not slurp a file into an array. Someday that input file will be huge. And that will happen when used for production data. And you're on vacation. And you'll have to come in to the office during that sunny day on the beach.

Read the input file line by line and act upon each line (here by potentially writing it to the output file.)
This is obviously good advice and I want to take advantage of it. However, I'm not quite sure what the code should look like, specifically for finding and replacing all occurrences of a string. Is it something like this? (Please forgive my probably-incorrect code.)
open (INFILE, "input.txt");
open (OUTFILE, ">output.txt");
while (INFILE) {
    $_ =~ s/foo/bar/gms;
    print OUTFILE "$_\n";
}
Will something like this work? If not, what will?

Thank you, again!

Replies are listed 'Best First'.
Re: reading/writing line by line
by davorg (Chancellor) on May 22, 2002 at 15:12 UTC
    open (INFILE, "input.txt") or die "Can't open input.txt: $!\n";
    open (OUTFILE, ">output.txt") or die "Can't open output.txt: $!\n";
    while (<INFILE>) {
        s/foo/bar/g;
        print OUTFILE;
    }

    I removed the /m and /s options from your s/// operator as they have no effect on data that doesn't contain newlines.

    --
    <http://www.dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Re: reading/writing line by line
by Joost (Canon) on May 22, 2002 at 15:23 UTC
    Just a few code corrections:

    open (INFILE, "input.txt") or die "cannot open input: $!";
    open (OUTFILE, ">output.txt") or die "cannot open output: $!";
    while (<INFILE>) {      # note the < and > brackets
        s/foo/bar/sg;       # no =~ needed for $_
                            # no m needed
        print OUTFILE;      # no "$_\n" needed: $_ is automatic
                            # and the \n is still in $_ (you
                            # don't cho(m)p it...
    }
    close INFILE;
    close OUTFILE;

    If you need to replace a multi-line string, you might try setting $/ to undef (slurp the whole file) or "" (read in paragraphs), and maybe adding a /s modifier to the replacement regex.
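As a rough illustration of the paragraph-mode suggestion, here is a minimal sketch. The sample text, the pattern, and the variable names are made up for the example, and it reads from an in-memory filehandle instead of input.txt so it runs on its own. Setting $/ to undef instead would read the whole input in one go.

```perl
use strict;
use warnings;

# Hypothetical sample data: two paragraphs separated by a blank line.
my $text = "foo starts here\nand ends\nhere\n\nsecond paragraph\nfoo ends\nhere too\n";

# Read from an in-memory filehandle so the sketch needs no files on disk.
open(my $in, '<', \$text) or die "cannot open in-memory input: $!";

my $result = '';
{
    local $/ = "";    # paragraph mode: each read returns one paragraph
    while (my $para = <$in>) {
        # /s lets . match a newline, so the pattern can span two lines
        $para =~ s/ends.here/ENDS HERE/sg;
        $result .= $para;
    }
}
close $in;

print $result;
```

Using local inside a block keeps the change to $/ from leaking into the rest of the script.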

    I'd recommend reading:

    perldoc -f open
    perldoc perlvar (look for the $/ entry)
    perldoc perlre (for info about regular expressions)

    Update: added or die, and close parts, trying not to set a bad example :-)

    -- Joost
    downtime n. The period during which a system is error-free and immune from user input.
Re: reading/writing line by line
by Biker (Priest) on May 22, 2002 at 15:14 UTC

    I understood that your need is to read a file, line by line, and if the line is not blank, write the line to the output file. Did I understand that correctly?

    My code snippet, as you have quoted above, will:

    • Read one line from the input file
    • Remove a potential newline from the end of the line
    • Look if the line is now empty (zero length)
    • If empty, read next line
    • If not empty, write the line and a newline to the output file and then read next line from the input file
    • Repeat until no more lines in input file
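The steps above can be sketched as a self-contained script. This is only an illustration: the sample input and the variable names are invented, and in-memory filehandles stand in for the real input and output files.

```perl
use strict;
use warnings;

# Hypothetical sample input containing blank lines.
my $input  = "first\n\nsecond\n\n\nthird\n";
my $output = '';

open(my $in,  '<', \$input)  or die "cannot open in-memory input: $!";
open(my $out, '>', \$output) or die "cannot open in-memory output: $!";

while (my $line = <$in>) {     # read one line at a time
    chomp $line;               # remove a trailing newline, if any
    next unless length $line;  # skip lines that are now empty
    print {$out} "$line\n";    # write the line plus a newline
}

close $in;
close $out;

print $output;
```

Only one line is held in memory at a time, which is the point of reading line by line rather than slurping.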

    What part is not working as expected?

    Update:
    I admit, I missed the words:
    "...specifically for finding and replacing all occurrences of a string..."

    Of course, you're right. ;-) That's what I get for answering a question while on the run to go home. ;-)
    while(<IN_FILE>) {
        s/from/to/g;
        print OUT_FILE;
    }

    should do it.

    Everything went worng, just as foreseen.

      Your suggestion did work; thank you! Now I'm trying to apply the same principle to finding and replacing text.
Re: reading/writing line by line
by Juerd (Abbot) on May 22, 2002 at 17:33 UTC

    open (INFILE, "input.txt");
    open (OUTFILE, ">output.txt");
    while (INFILE) {
        $_ =~ s/foo/bar/gms;
        print OUTFILE "$_\n";
    }

    /m is multi line: not useful when dealing with a single line at a time. /s makes . match newline: not useful if you don't use a dot (.) in the pattern.
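A small sketch of when the two modifiers do and do not matter. The sample strings and variable names are made up for the example.

```perl
use strict;
use warnings;

my $line = "foo and foo\n";   # hypothetical single line, as a line read would return it

(my $plain = $line) =~ s/foo/bar/g;
(my $flags = $line) =~ s/foo/bar/gms;

# The pattern has no ^, $ or ., so /m and /s change nothing here.
print $plain eq $flags ? "same\n" : "different\n";

my $two = "a\nb";             # a string that really spans two lines

# /s matters only once the pattern contains a dot:
my $without_s = ($two =~ /a.b/)  ? 1 : 0;   # . does not match "\n"
my $with_s    = ($two =~ /a.b/s) ? 1 : 0;   # . matches "\n"

# /m matters only once the pattern contains ^ or $:
my $without_m = ($two =~ /^b/)  ? 1 : 0;    # ^ matches only at string start
my $with_m    = ($two =~ /^b/m) ? 1 : 0;    # ^ also matches after "\n"

print "without /s: $without_s, with /s: $with_s\n";
print "without /m: $without_m, with /m: $with_m\n";
```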

    One liner:

    perl -i.backup -pe's/foo/bar/g' filename

    Same thing, inside a larger script that you don't want to have using -i all the time:

    {
        local @ARGV = ('filename');
        local $^I = '.backup';
        local $_;
        while (<>) {
            s/y/yah/g;
            print;
        }
    }

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

      I fear this is an especially dumb question, but here goes...

      I've seen a lot of people recommending the "perl -i" solution for regexes, like your
      perl -i.backup -pe 's/foo/bar/g' filename
      Can that be used for multiple regexes in a file? Like, if you wanted to do s/foo/bar/g and s/y/yah/g on the same file?
        I rescind the question. Yes, it's dumb, and yes, I figured it out by doing a search for "perl -i". Sorry.

        Can that be used for multiple regexes in a file? Like, if you wanted to do s/foo/bar/g and s/y/yah/g on the same file?

        Yes. See what happens if you use just perl -pe'undef':

        LINE: while (defined($_ = <ARGV>)) {
            undef;
        }
        continue {
            die "-p destination: $!\n" unless print $_;
        }
        So you can just add whatever should be inside the block:
        perl -i -pe's/foo/bar/g; s/y/yah/g;' filename
