You are getting the funny output because you are using buffered I/O on the file and you are neither flushing the buffers nor setting the file position when you switch from reading to writing.

When you read the first line from the file, perl actually reads a buffer full of data. Your file is very small so it fits in the buffer and is read in a single read from the system. Perl then returns the first line from the buffer to your program, leaving the file position at the next character: the start of the second line. The buffer contains the entire file:

    Dit is een voorbeeldtekst
    The quick brown fox jumps over the lazy dog.
    SOURCEREPOSITORYNAME
    Roses are red, violets are blue
    And Osama Is coming To Kill you

This line does not match your RE, so you then write this line back to the same file handle. This writes the data back to the buffer (but not yet to disk). Since the current file position is the start of the second line, the string (a copy of the first line of the file) overwrites the second line in the buffer.

The number of characters overwritten is two more than the number of characters on the line. This is because, in addition to the characters you see on the line, you have line termination. On Windows (I deduce you are on Windows from the path of your file) line termination is two characters: carriage return and line feed.

The first line has 25 visible characters, so 27 characters are written. Count 27 characters into the second line and you will see that you have overwritten it up to and including the 'o' of "over", introducing a new line termination there. This leaves the current file position at the 'v' of "over" on what was the second line. The buffer now contains:

    Dit is een voorbeeldtekst
    Dit is een voorbeeldtekst
    ver the lazy dog.
    SOURCEREPOSITORYNAME
    Roses are red, violets are blue
    And Osama Is coming To Kill you
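The arithmetic for this first overwrite can be checked with a few lines of Perl. This is only a sketch on plain strings, not on a file handle, and the variable names are mine. It assumes the two-character CR/LF line termination described above.

```perl
use strict;
use warnings;

my $first  = 'Dit is een voorbeeldtekst';
my $second = 'The quick brown fox jumps over the lazy dog.';

# Visible characters of the first line plus CR and LF.
my $written = length($first) + 2;

# The part of the second line that is overwritten, and the part that survives.
my $clobbered = substr($second, 0, $written);
my $remainder = substr($second, $written);

print "characters written: $written\n";       # 27
print "overwritten: '$clobbered'\n";          # 'The quick brown fox jumps o'
print "left over:   '$remainder'\n";          # 'ver the lazy dog.'
```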

Next you read another line from the file. This reads from the modified buffer, starting at the current file position, which is at the 'v' of "over". You get all the text up to the next end of line: thus you get only the remainder of what was the second line of the file. Now the current file position is at the beginning of what was the third line. I say "what was" because new line terminations are being introduced, changing the total number of lines. The buffer content hasn't changed as a result of this read but the current position has.

Again, this "line" of text does not match your RE, so you write it back to the same file handle. You are now overwriting the third line with the text of a portion of the second line. The write is 19 characters long (17 visible characters plus line termination), one fewer than the 20 visible characters of the third line, so you overwrite everything up to and including the 'M' near the end of the third line. This leaves the current file position at the final 'E' of the third line (remember the line termination: the last printing character is not quite the end of the line). The buffer now contains:

    Dit is een voorbeeldtekst
    Dit is een voorbeeldtekst
    ver the lazy dog.
    ver the lazy dog.
    E
    Roses are red, violets are blue
    And Osama Is coming To Kill you

Next you read the remainder of what was the third line: the single 'E' and its line termination. Again, this doesn't match your RE and you write it out. Those three characters overwrite the first three characters of what was your fourth line. Now your buffer contains:

    Dit is een voorbeeldtekst
    Dit is een voorbeeldtekst
    ver the lazy dog.
    ver the lazy dog.
    E
    E
    es are red, violets are blue
    And Osama Is coming To Kill you

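The same thing happens with the remainder of the fourth line, which overwrites all but the final 'u' of the fifth line, and then with that leftover 'u', which is written at the end of the buffer. After this the buffer contains: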

    Dit is een voorbeeldtekst
    Dit is een voorbeeldtekst
    ver the lazy dog.
    ver the lazy dog.
    E
    E
    es are red, violets are blue
    es are red, violets are blue
    u
    u
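The whole sequence of reads and writes described above can be replayed on an in-memory string. This is a simulation of the explanation, not of perl's actual I/O layer: the sub name replay_read_write is hypothetical, the buffer is an ordinary scalar, and CR/LF is written out explicitly so the sketch behaves the same on any platform. It also assumes that no line matches the RE, so every line read is written straight back.

```perl
use strict;
use warnings;

# Replay "read a line, write it back at the current position" on a string
# that stands in for the I/O buffer. Returns the final buffer content.
sub replay_read_write {
    my @lines  = @_;
    my $buffer = join '', map { "$_\r\n" } @lines;
    my $pos    = 0;    # the simulated current file position

    while ($pos < length $buffer) {
        # "Read": everything from $pos up to and including the next CR/LF.
        my $eol = index($buffer, "\r\n", $pos);
        last if $eol < 0;
        my $line = substr($buffer, $pos, $eol + 2 - $pos);
        $pos = $eol + 2;

        # "Write": overwrite what follows $pos; past the end, the buffer grows.
        my $room = length($buffer) - $pos;
        my $n    = length($line) < $room ? length($line) : $room;
        substr($buffer, $pos, $n, $line);
        $pos += length $line;
    }
    return $buffer;
}

my @original = (
    'Dit is een voorbeeldtekst',
    'The quick brown fox jumps over the lazy dog.',
    'SOURCEREPOSITORYNAME',
    'Roses are red, violets are blue',
    'And Osama Is coming To Kill you',
);

my $final = replay_read_write(@original);
(my $shown = $final) =~ s/\r\n/\n/g;    # strip CRs for display only
print $shown;
```

Run against the five original lines, this prints the same ten lines as the final buffer shown above.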

Since the entire file still fits within the buffer, no I/O has been done to disk. All your I/O, since the initial read of the file from disk, has been contained within perl and has been updating perl's file position without changing the system's idea of current file position at all.

When you close the file handle your buffer is flushed to disk. But where in the file is it written?

At the system level, remember, perl read the entire file in a single read, filling its buffer. This left the system with a file open for read/write and positioned at the end of the file. Now perl comes along and writes its buffer full of data. Since you haven't done a seek to change the current file position, the buffer is written just past the end of the initial content, effectively appending the mixed-up buffer to the end of the file.

Now you print your file and see the odd content that you posted.

There are other issues with alternating between read and write. You can read about some of them in open, seek and Mixing Reads and Writes. It is sometimes, but not often, the right thing to do.
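For the rare case where updating in place is the right thing to do, here is a minimal sketch of setting the file position every time you switch between reading and writing. The sub name overwrite_same_length is hypothetical, and the sketch only makes sense when the replacement is exactly as long as the line it replaces; anything else clobbers the following line as described above. I open the handle with the :raw layer as my own choice so that tell and seek count bytes exactly.

```perl
use strict;
use warnings;

# Overwrite each line matching $re with $replacement (same length required).
sub overwrite_same_length {
    my ($path, $re, $replacement) = @_;

    open my $fh, '+<:raw', $path or die "Cannot open $path: $!";
    while (1) {
        my $start = tell $fh;          # where the next line begins
        my $line  = <$fh>;
        last unless defined $line;
        next unless $line =~ $re;

        die "Replacement must be the same length as the line\n"
            unless length($replacement) == length($line);

        seek $fh, $start, 0 or die "seek failed: $!";   # read -> write
        print {$fh} $replacement;
        seek $fh, 0, 1 or die "seek failed: $!";        # write -> read
    }
    close $fh or die "Cannot close $path: $!";
    return;
}
```

The seek($fh, 0, 1) call does not move the position; it is there only to mark the switch from writing back to reading.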

Most of the time it is better to write a new file and then, after closing both the original and the new file, replace the original with the new one. This is what the '-i' option does.
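A minimal sketch of the write-a-new-file approach follows. I don't know what your program does with lines that match your RE, so this assumes they are simply dropped. The sub name filter_file and the ".tmp" suffix are hypothetical as well.

```perl
use strict;
use warnings;

# Copy $path to a temporary file, leaving out lines that match $re,
# then replace the original with the temporary file.
sub filter_file {
    my ($path, $re) = @_;
    my $tmp = "$path.tmp";    # assumed not to exist already

    open my $in,  '<', $path or die "Cannot read $path: $!";
    open my $out, '>', $tmp  or die "Cannot write $tmp: $!";

    while (my $line = <$in>) {
        next if $line =~ $re;      # assumption: matching lines are dropped
        print {$out} $line;
    }

    # Close both files before touching the original.
    close $in  or die "Cannot close $path: $!";
    close $out or die "Cannot close $tmp: $!";

    rename $tmp, $path or die "Cannot replace $path: $!";
    return;
}
```

If rename cannot replace an existing file on your system, unlink the original first. With the '-i' option the same thing fits in a one-liner along the lines of perl -i.bak -ne "print unless /PATTERN/" file.txt, where PATTERN stands in for your RE and the original is kept as file.txt.bak.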


In reply to Re: In place editing of text files by ig
in thread In place editing of text files by jevaly
