patair has asked for the wisdom of the Perl Monks concerning the following question:

hello Monks,

I am trying to update a value in a file using find and replace: $string1 =~ s/XXXGASPOINTIDXXX/656563/gi; The result is written to a second file, which the second part of the script then reads in order to update another value, again using find and replace.

It is a bit basic, but it seems to be doing what I want. However, it does not seem to have time to write to the second file, even though I have sleep()s at various points in the script to allow each file to be written before it is used in the next part of the script.

thanks in advance.

Pat

details below

#!c:/perl/bin/perl.exe -w
use strict;

my $next_value;
my @string1;
my $string1;
my @string2;
my $string2;

open (IN, "+<G203.xml");
open (OUT,">_G203.xml") || die ("Cannot open file");
@string1 = <IN>;
seek IN,0,0;
sleep (1);
foreach $string1(@string1){
    $string1 =~ s/XXXGASPOINTIDXXX/656563/gi;
    print OUT "$string1";
}
close IN;
sleep (3);
print "\n sleeping ....\n";

open (IN2, "+<_G203.xml");
open (OUT2,">2_G203.xml") || die ("Cannot open file");
@string2 = <IN2>;
seek IN2,0,0;
sleep (1);
foreach $string2(@string2){
    $string2 =~ s/XXXMMREFXXX/62394732472856563/gi;
    print "$string2";
    print OUT2 "$string2";
}
close IN2;

Replies are listed 'Best First'.
Re: Find and Replace multiple values in a file.
by scorpio17 (Canon) on Feb 08, 2011 at 14:13 UTC

    I'd write it like this:

    use strict;

    # 'or' binds more loosely than '||', so die fires when open itself fails
    open my $in,  '<', "G203.xml"  or die "Cannot open file: $!\n";
    open my $out, '>', "_G203.xml" or die "Cannot open file: $!\n";

    while (my $string1 = <$in>) {
        $string1 =~ s/XXXGASPOINTIDXXX/656563/gi;
        $string1 =~ s/XXXMMREFXXX/62394732472856563/gi;
        print $out $string1;
    }

    close $in;
    close $out;

    Notes:

    • I use the 3 argument form of 'open'.
    • I read from the file one line at a time, using 'while', process each line, then write it out. No need to hold all the lines in memory (in an array).
    • There's no need to sleep.
    • There's no need to loop over the entire file twice: you can apply both regexes inside a single loop. (Two passes might be necessary if the patterns somehow overlapped, but these don't seem to.)
      Nice concise code and nice concise explanation.
Re: Find and Replace multiple values in a file.
by moritz (Cardinal) on Feb 08, 2011 at 13:09 UTC
    however it does not have time to write to second file

    What do you mean? What do you observe, and why do you think whatever you observe is connected to how much time it has? How is the time limited? Is the script killed after a certain period of time?

Re: Find and Replace multiple values in a file.
by elef (Friar) on Feb 08, 2011 at 13:28 UTC
    I'm no expert, but I'm pretty sure the commands in a normal perl script are executed sequentially. I.e. it's not possible for a command to start executing before the previous one completed, not without threading.
    Try closing the OUT filehandle before you open the same file with a different handle. In my experience, perl will often only write (some of) the changes to disk when the filehandle is closed.
    So, remove all those sleeps and go
    # ...
    close IN;
    close OUT;
    open (IN2, "+<_G203.xml");
    # etc.
Re: Find and Replace multiple values in a file.
by jethro (Monsignor) on Feb 08, 2011 at 13:49 UTC
    To expand on elef's explanation: writing to files is usually buffered, which means that data is written to disk only after a certain amount of data has been collected, or when the file is closed, because then it is obvious that no further data will arrive.
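
For anyone who wants to see the buffering for themselves, here is a minimal sketch. The scratch directory and file name are made up for the demo and are not part of the original script. It writes one short line and compares the file size on disk before and after close.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch file; the name is only for illustration.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/buffer_demo.txt";

open my $out, '>', $path or die "Cannot open $path: $!";
print {$out} "one short line\n";

# -s yields the size on disk; a false value means the file is still empty.
my $size_before_close = ( -s $path ) || 0;

# close flushes whatever is still sitting in the buffer.
close $out or die "Cannot close $path: $!";
my $size_after_close = ( -s $path ) || 0;

print "before close: $size_before_close bytes\n";
print "after close:  $size_after_close bytes\n";
```

With default buffering the first number is typically 0, because such a short line never fills the buffer. That is the same situation as opening _G203.xml for reading while OUT is still open.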
Re: Find and Replace multiple values in a file.
by mvaline (Friar) on Feb 08, 2011 at 14:05 UTC

    If the problem is buffering and closing the filehandle doesn't solve your problem, try making the filehandle hot. A frequently referenced article on the subject is Suffering from Buffering?

    select OUT; $| = 1; print OUT '';
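
A variation on the same idea, offered as a sketch rather than a drop-in fix: with a lexical filehandle, the autoflush method from the core IO::Handle module makes just that one handle hot and leaves the default output handle alone. The scratch file name below is an assumption for the demo.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempdir);

# Hypothetical scratch file; the name is only for illustration.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/hot_demo.txt";

open my $out, '>', $path or die "Cannot open $path: $!";
$out->autoflush(1);    # flush after every print on this handle only

print {$out} "flushed right away\n";

# The handle is still open, yet the data has already reached the file.
my $size_while_open = ( -s $path ) || 0;
print "size while still open: $size_while_open bytes\n";

close $out or die "Cannot close $path: $!";
```

Closing the handle before reopening the file is still the simpler fix here; autoflush is mostly useful when the handle has to stay open.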

Re: Find and Replace multiple values in a file.
by CountZero (Bishop) on Feb 08, 2011 at 15:14 UTC
    Unless your file is really huge, it can be done as simply as this:

    use Modern::Perl;
    use File::Slurp;

    my $text = read_file( '_G203.xml' );
    $text =~ s/XXXGASPOINTIDXXX/656563/gi;
    $text =~ s/XXXMMREFXXX/62394732472856563/gi;
    write_file( '_G203.xml', $text );
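
If installing modules is not an option, roughly the same thing can be sketched with core Perl only. The helper names slurp and spew and the sample XML below are made up for this illustration; they are not from File::Slurp or from the original file.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Read a whole file into one string (core Perl only).
sub slurp {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    local $/;    # undef input record separator: read everything at once
    my $text = <$in>;
    close $in;
    return $text;
}

# Overwrite a file with the given string.
sub spew {
    my ( $path, $text ) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $text;
    close $out or die "Cannot close $path: $!";
    return;
}

# Hypothetical sample input standing in for the poster's XML file.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/sample.xml";
spew( $file, "<id>XXXGASPOINTIDXXX</id>\n<ref>XXXMMREFXXX</ref>\n" );

my $text = slurp($file);
$text =~ s/XXXGASPOINTIDXXX/656563/gi;
$text =~ s/XXXMMREFXXX/62394732472856563/gi;
spew( $file, $text );

my $result = slurp($file);
print $result;
```

Each helper closes its handle before returning, so the file is complete on disk before anything reads it again.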

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Find and Replace multiple values in a file.
by locked_user sundialsvc4 (Abbot) on Feb 08, 2011 at 15:23 UTC

    As others have said, it is never necessary to sleep() to cause file operations to work consistently.   You should write to the file, using an exclusive mode (because you do not want readers to see incomplete results).   Then, you should close the file and reopen it for reading.   This will assure consistency, both to your own program and to any other one(s).   If both of these processes are being done in the same program, the same file-handle can simply continue to be used.

    If you are attempting to read from a file that is concurrently open for writing, unpredictable results (called dirty reads) may result.
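
One way to read "exclusive mode" in Perl terms is flock. That reading is an assumption on my part, and flock is only advisory: it protects against other programs that also call flock, nothing more. A minimal sketch with a made-up scratch file:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical scratch file; the name is only for illustration.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/locked_demo.txt";

# Writer: hold an exclusive lock while the file is being written.
open my $out, '>', $path or die "Cannot open $path: $!";
flock( $out, LOCK_EX ) or die "Cannot lock $path: $!";
print {$out} "complete record\n";
close $out or die "Cannot close $path: $!";    # flushes, then releases the lock

# Reader: reopen the file and take a shared lock before reading.
open my $in, '<', $path or die "Cannot open $path: $!";
flock( $in, LOCK_SH ) or die "Cannot lock $path: $!";
my @lines = <$in>;
close $in;

print "read ", scalar(@lines), " line(s)\n";
```

Note that opening with '>' truncates the file before the lock is taken, so this is enough for a demo but not a complete recipe for files shared between processes.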

    On another note ... “call me an old COBOL hound, but” ... I do come from a day when main-memory was quite small in relation to the quantity of data that you need to process, so I always do such things using intermediate “spill files,” not in-memory stashes.   For one thing, I know that “memory” is really a disk-file, and for another, I know that many (esp. desktop-sized even-if rack-mounted) systems really don’t handle memory-pressure very well at all:   they start to “thrash” at the drop of a hat.   (Especially when dealing with programs that are using in-memory stashes, which are inherently thrash prone.)   But their operating systems handle file-buffering quite sensibly:   they’ll give “unused” memory over to file buffers without being told to do so, and they’ll handle reductions in the amount of memory that is available for that purpose much more graciously than they will handle any application that has suddenly become a million-pound elephant.   So, I’ll read the data from one file and copy it into another.   When asked to sort information, I’ll do on-disk sorts instead of in-memory ones.   I have plenty of experience to tell me that such programs really don’t run slower in the small case (thanks to the buffering), and that they degrade much more gracefully (linearly ...) as the data load grows large.   I basically get to have my cake and eat it too.

Re: Find and Replace multiple values in a file.
by CountZero (Bishop) on Feb 08, 2011 at 15:29 UTC
    These lines are the root of your problem:
    open (IN, "+<G203.xml"); open (OUT,">_G203.xml") || die ("Cannot open file"); @string1 = <IN>;
    You open the file for writing (the second line above) which totally clears your file before you even read what was in it (third line).

    So you never ever get anything inside @string1. Hence you replace nothing and you write out nothing!

    Update: Oops. Sorry. My wrong. I did not notice the underscore in the second file name. Please disregard the above.

    CountZero
