Harman has asked for the wisdom of the Perl Monks concerning the following question:

Scenario: I tie the file 'myFile' to the array '@tieArray' and modify some of its records. Before untying the file, I write each record of '@tieArray' to a new file, 'tempFile'. Finally, I untie '@tieArray'.

At that point the contents of 'myFile' and 'tempFile' should be the same, but they differ: some of the modifications I made to '@tieArray' are not reflected in 'tempFile'.

The problem goes away when I increase the 'memory' parameter from its default of 2 MiB to roughly 4 MB, i.e.:

tie @tieArray, 'Tie::File', "myFile", memory => 4000000;

Can anybody suggest what the problem is in the original scenario, and how increasing the buffer size to roughly 4 MB fixed it?

Note:

1) I get this problem only when I tie huge files, i.e. 40-50 MB. For small files I don't see it.

2) The modifications I make to @tieArray look like this:

$tieArray[100] .= "\n\nMy new inserted lines";

Replies are listed 'Best First'.
Re: Query regarding 'Tie' file
by Corion (Patriarch) on Sep 09, 2008 at 11:42 UTC

    Your description of the symptoms is confusing me. You talk about Tie::File making modifications but your duplicating/checking code not picking up these modifications. This seems to me to be a case of your duplicating/checking code being faulty.

    Tie::File does not support inserting new records by storing strings that contain \n, as per its documentation. So use splice to insert the new lines. Most likely, one of your problems comes from Tie::File flushing certain lines to disk and, upon rereading them, getting its count out of whack: the old (cached) line numbers no longer match the offsets, because you sneaked additional lines into the array instead of using the documented approach.
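
    A minimal sketch of the splice approach, assuming a small temporary file in place of 'myFile'. The file contents and the record index 2 are made up for illustration; only tie, splice and untie come from the discussion above.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical sample file standing in for 'myFile': five records.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "line $_\n" for 0 .. 4;
close $fh or die "close: $!";

tie my @tieArray, 'Tie::File', $path or die "tie: $!";

# Instead of:  $tieArray[2] .= "\n\nMy new inserted lines";
# insert two real records (an empty one and the new text) after record 2.
splice @tieArray, 3, 0, '', 'My new inserted lines';

my $count = scalar @tieArray;    # record count as seen through the tie
untie @tieArray;

# Re-read the file independently of Tie::File to see what is on disk.
open my $in, '<', $path or die "open: $!";
chomp(my @on_disk = <$in>);
close $in;
```

    This way every element of the array stays a single record, so the array indexes and the lines in the file keep agreeing with each other.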

      Regarding 'Tie::File flushing certain lines to disk and upon rereading them': I am using deferred writing for the tie. So any modification I make to the tied array should be written to disk only when I untie the file, and the line numbers should not change while I am reading the file's contents. Reading from the tied array should also give the modified contents, not the original contents.

      I also suspect the problem comes from the deferred-write buffer being flushed, as I am writing a lot of data to @tieArray. But what confuses me is that, in that case, the contents of 'myFile' after the untie should also be wrong.
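
      For reference, a small sketch of deferred writing on a hypothetical temporary file (the file contents and the helper read_file are illustrative, not taken from the original code). It uses the defer and flush methods of the tied object. It does not reproduce the large-file case, where the Tie::File documentation says the deferred-write buffer is flushed automatically once it fills up.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Illustrative helper: return the whole file as one string.
sub read_file {
    my ($p) = @_;
    open my $in, '<', $p or die "open: $!";
    local $/;                      # slurp mode
    my $content = <$in>;
    close $in;
    return $content;
}

# Hypothetical sample file standing in for 'myFile'.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "line $_\n" for 0 .. 4;
close $fh or die "close: $!";

my $obj = tie my @tieArray, 'Tie::File', $path or die "tie: $!";

$obj->defer;                       # start deferring writes
$tieArray[1] = 'changed';          # plain assignment, held in memory

my $seen_through_tie = $tieArray[1];       # read back through the tie
my $on_disk_before   = read_file($path);   # file before flushing

$obj->flush;                       # write the deferred records out
my $on_disk_after = read_file($path);

undef $obj;                        # drop the extra reference before untie
untie @tieArray;
```

      With a buffer this small nothing is flushed early, so the tied array and the file disagree only between the assignment and the flush.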

        Not necessarily, because you don't make any more changes to the tied array, so no wrong lines need to be either written or read. But all of this is moot speculation. Change your code to use splice and see if that changes anything.

Re: Query regarding 'Tie' file
by Anonymous Monk on Sep 09, 2008 at 11:38 UTC
    Can anybody suggest what the problem is in the original scenario, and how increasing the buffer size to roughly 4 MB fixed it?
    You haven't proven that to be the case (show code that demonstrates the problem). Anyway, \n\n doesn't make sense, since \n is the default line separator, and that might be causing you problems.