twdgt has asked for the wisdom of the Perl Monks concerning the following question:

I am attempting to do the following: use XML::Parser to extract specific information into a text file and then, by selective regex, place the extracted information into another formatted text document. Where I am having an issue is reading the first text file after it has been created and closed. For some reason I cannot read from the reopened INTFILEin or write to FMTFILE within the "while" loop (relevant code below). I did try the same steps without the XML parser (printing character strings to INTFILEout) and had no problem writing to the file.

# open intermediate file for pass of XML file
open (INTFILEout, '>', $intfile) or die ("Cannot open file $intfile") unless -f $intfile;

# parse thru XML file into intermediate file
my $parser = new XML::Parser;
$parser->setHandlers(
    Start   => \&startElement,
    End     => \&endElement,
    Char    => \&characterData,
    Default => \&default);
$parser->parsefile($xmlfile);

# close intermediate file from first pass of XML file
close INTFILEout;

# reopen intermediate file for load into formatted output file
open (INTFILEin, '<', $intfile) or die ("Cannot open file $intfile") unless -f $intfile;
open (FMTFILE, '>', $frmtfile) or die ("Cannot open file $frmtfile") unless -f $frmtfile;

# processing intermediate file into formatted file
while ($linein = <INTFILEin>){
    chomp $linein;
    print FMTFILE "$linein\n";
}

# close files used to create formatted file
close INTFILEin;
close FMTFILE;

Any assistance or guidance is appreciated. Thanks...

Replies are listed 'Best First'.
Re: Reopen a closed file for read
by Marshall (Canon) on Apr 30, 2012 at 20:35 UTC
    I haven't used XML::Parser, so please excuse me if this is an ignorant question, but I don't see anywhere here where the output of $parser->parsefile($xmlfile); is being saved to INTFILEout. Something appears to be missing here. I guess that happens in the Start, End and Char routines that aren't shown.

    This unless -f $intfile stuff is bizarre. There are plenty of reasons why an open could fail on an existing file. I also would take that out.

    Update: On second thought, it could be that the "unless" clause is preventing the open() from even happening! I.e. the unless applies to the whole "open() or die" statement and not just to the "or die" part. The kind of open() that you have will create a new file if none exists; if the file already exists, it is essentially "zeroed" out and opened as a new blank file of the same name. In any event, I don't see the need for or the intent of this "unless -f $intfile" part. See the additional update below with some test code.

    # open intermediate file for pass of XML file
    open (INTFILEout, '>', $intfile) or die ("Cannot open file $intfile") unless -f $intfile;

    # parse thru XML file into intermediate file
    my $parser = new XML::Parser;
    $parser->setHandlers(
        Start   => \&startElement,
        End     => \&endElement,
        Char    => \&characterData,
        Default => \&default);
    $parser->parsefile($xmlfile);

    # close intermediate file from first pass of XML file
    close INTFILEout;

    ### is anything in INTFILEout at this point? If so, how
    ### would the parser know how to put it there?
    ### stop the code here and cat or type the contents of INTFILEout

    The close and re-open part looks ok to me. The "unless" stuff is weird.
    Closing a temporary file for write and re-opening it for read is a very normal thing to do and I think you have that part right.

    What are you trying to accomplish here:

    while ($linein = <INTFILEin>){
        chomp $linein;
        print FMTFILE "$linein\n";
    }

    Doesn't look like it does much. Something like this is useful to translate between Unix and Windows line endings (when fiddling with a file moved between systems), but that doesn't look like what is happening here. I guess this is dummy code standing in for some other function, but it's not clear to me why you can't get the parsed XML output into the desired format in the first place, without having to re-parse some intermediate file.

    Another UPDATE: yep this unless is causing trouble:

    #!/usr/bin/perl -w
    use strict;
    my $x;
    my $y=1;
    sub setx {$x =1;}
    ####
    setx() or die "can't set X" unless $y;
    print "$x";
    #Use of uninitialized value $x in string at C:\TEMP\testunless.pl line 8.

    If $y=1, the setx() routine (the analog of the open()) does not run. In other words, the code will not open the file for output if it already exists. I would presume that if you had strict and warnings in force, some kind of "attempted write to an unopened file handle" warning would have resulted?
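
    A minimal sketch of the same write, close, reopen, read sequence with the unless modifiers removed, so that each open() always runs. The temporary paths, the lexical handle names, and the three sample lines standing in for the parser output are all assumptions made for illustration; $! is added to the die messages so that the actual reason for a failure gets reported.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical paths standing in for $intfile and $frmtfile.
my $dir      = tempdir(CLEANUP => 1);
my $intfile  = "$dir/intermediate.txt";
my $frmtfile = "$dir/formatted.txt";

# First pass: write the intermediate file. There is no "unless -f"
# modifier, so the open always runs (and truncates an existing file).
open(my $int_out, '>', $intfile) or die "Cannot open $intfile for write: $!";

# Stand-in for whatever the XML::Parser handlers would print.
print {$int_out} "$_\n" for qw(alpha beta gamma);
close($int_out) or die "Cannot close $intfile: $!";

# Second pass: reopen for read and copy into the formatted file.
open(my $int_in, '<', $intfile)  or die "Cannot open $intfile for read: $!";
open(my $fmt,    '>', $frmtfile) or die "Cannot open $frmtfile for write: $!";

my $lines_copied = 0;
while (my $linein = <$int_in>) {
    chomp $linein;
    print {$fmt} "$linein\n";
    $lines_copied++;
}

close($int_in);
close($fmt) or die "Cannot close $frmtfile: $!";
```

    Lexical handles are used here only so that the sketch runs cleanly under strict; the bareword handles from the original post would behave the same way once the unless is gone.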
Re: Reopen a closed file for read
by toolic (Bishop) on Apr 30, 2012 at 20:06 UTC
    open (INTFILEin, '<', $intfile) or die ("Cannot open file $intfile") unless -f $intfile;
    Having the 'unless' clause there seems a little unusual. Can you try deleting that?

    Also, make sure you use warnings.
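
    As a sketch of what warnings would have reported: when the unless modifier skips the open, the later print goes to a handle that was never opened, and with warnings enabled Perl complains about a print() on an unopened filehandle. The handle name NEVER_OPENED, the file name, and the __WARN__ hook that collects the messages are assumptions made for this demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect warnings instead of letting them go to STDERR,
# so they can be inspected afterwards.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $file_exists = 1;    # plays the role of "-f $intfile" being true

# Same shape as the original statement: the modifier guards the whole
# "open ... or die ..." expression, so nothing is opened when it is true.
open(NEVER_OPENED, '>', 'never_created.txt') or die "Cannot open: $!"
    unless $file_exists;

# The handle was never opened, so print fails and a warning is raised.
my $ok = print NEVER_OPENED "some parsed text\n";

print "print succeeded: ", ($ok ? "yes" : "no"), "\n";
print "warning: $_" for @warnings;
```

    Checking the return value of print (or of close) is another way to notice this kind of failure even when warnings are off.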