Discipulus has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I'm heading down the shaggy road of XML processing, and I have chosen my tool: XML::Twig (thanks mirod!).

I'm experiencing a problem with the parsefile_inplace method as described in the XML::Twig documentation. I have a simple program:
    #!/bin/perl -w
    use strict;
    use XML::Twig;

    my $file = 'orders.xml';
    my $ext  = '.'.time;
    my $twig = XML::Twig->new(pretty_print => 'indented');
    #$twig->parsefile( $file) or die "could not parse!";
    $twig->parsefile_inplace( $file, $ext);
    $twig->print();
..and a simple XML file:
    <?xml version="1.0"?>
    <Order>
      <Date>2003/07/04</Date>
      <CustomerId>123</CustomerId>
      <CustomerName>Acme Alpha</CustomerName>
      <Item>
        <ItemId> 987</ItemId>
        <ItemName>Coupler</ItemName>
        <Quantity>5</Quantity>
      </Item>
      <Item>
        <ItemId>654</ItemId>
        <ItemName>Connector</ItemName>
        <Quantity unit="12">3</Quantity>
      </Item>
      <Item>
        <ItemId>579</ItemId>
        <ItemName>Clasp</ItemName>
        <Quantity>1</Quantity>
      </Item>
    </Order>

But when I run this, the XML is printed to STDOUT (is the filehandle of the output file not selected?), and I end up with an empty orders.xml file and a correctly sized orders.xml.1374219815 file.

I use Strawberry Perl 5.16 on 64-bit Windows 7, and perl -e "use File::Temp" runs without error.

Many thanks for the attention

L*
there are no rules, there are no thumbs..

Re: XML::Twig parsefile_inplace misunderstanding
by Corion (Patriarch) on Jul 19, 2013 at 08:25 UTC

    I think the replacement already happens (or should happen) in ->parsefile_inplace. The output you're seeing comes from your extra call to ->print. At least that's my understanding from looking at the implementation of ->parsefile_inplace.

    I think you should see the file's creation time change. If you see no changes to the content, that may be because you haven't defined (or at least, shown) the replacement rules...

Re: XML::Twig parsefile_inplace misunderstanding
by mirod (Canon) on Jul 19, 2013 at 09:26 UTC

    It looks like a bug to me. I am not sure whether it's a bug in the code or in the docs.

    The file is replaced when the parse method returns, so any print after this is not sent to the proper filehandle (the temp file that will then replace the original file). I think the replacement could instead be done when the twig is destroyed, when it goes out of scope, or when the program exits, but I am a bit worried about causing problems in existing code if I do this.

    I usually use parsefile_inplace with flush; this way the data is flushed at the end of the parse. This requires flush to be called during the parse, though, so if you don't have handlers, your code should look like this (ugly!):

        my $twig = XML::Twig->new(
            pretty_print  => 'indented',
            twig_handlers => { 'level(1)' => sub { $_->flush } },
        );
        $twig->parsefile_inplace( $file, $ext);
        # no ->print after this

    Alternatively, if you don't need handlers, you could also use perl -i to do this, using parsefile and print.

      Thanks mirod

      It would be a good thing to update the docs to specify the scope of parsefile_inplace.

      Your useful module has a lot of methods available. Maybe you could add a parsefile_inplace_global one that keeps the filehandle redirection in place for as long as the twig exists.

      thanks again
      L*

        lol!

        At the very least I'll update the docs, explaining the exact scope of parsefile_inplace and how to use perl -i. That's a good suggestion, though. I'll see if it can be done. Thanks.

Re: XML::Twig parsefile_inplace misunderstanding (xml_pp twig_handlers _all_ flush)
by Anonymous Monk on Jul 19, 2013 at 09:02 UTC

    At first glance it looks like a bug in XML::Twig, but if you look at the test suite you'll see the intended use case, in XML-Twig-3.44-new/t/test_3_26.t:

        XML::Twig->new( twig_handlers => { bar => sub { $_->set_tag( 'toto')->flush; } })
                 ->parsefile_inplace( $file, '.bak');

    The call to flush does the printing (as well as eating the part of the tree parsed so far), and your code has no such call.

    If you add twig_handlers => { _all_ => sub { $_[0]->flush } } to the constructor, you'll get what you're after.

      thanks Corion and Anonymous,

      so the inplace options are valid only during the scope of the parse_inplace call? After this, are the normal behaviour and the original handle restored?

      So it can be used only with the on-the-fly processing style and not with the tree-processing one? It should not be difficult to replicate the inplace behaviour but i prefer to use the XML::Twig method for shortness and laziness.

      thanks

      L*

        so the inplace options are valid only during the scope of the parse_inplace call?

        UTSL :) It's:

        select temp handle; parsefile; restore original handle; rename/backup;
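
        To make those four steps concrete, here is a minimal sketch in core Perl. It is not XML::Twig's actual code: rewrite_inplace is a hypothetical helper invented for illustration, and the writer callback stands in for the parse. It shows why only output produced while the temp handle is selected ends up in the file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper, not part of XML::Twig: run $writer->($file) with a
# temp file selected as the default output handle, then swap the temp file
# in for $file, keeping a backup when $backup_ext is non-empty.
sub rewrite_inplace {
    my ($file, $backup_ext, $writer) = @_;

    # Temp file in the same directory, so the final rename stays on one filesystem.
    my ($tmp_fh, $tmp_name) = tempfile(DIR => dirname($file));

    my $old_fh = select $tmp_fh;    # bare print() now goes to the temp file
    my $ok     = eval { $writer->($file); 1 };
    my $err    = $@;
    select $old_fh;                 # restored: later print() calls go to the old handle
    close $tmp_fh or die "cannot close $tmp_name: $!";

    if (!$ok) {
        unlink $tmp_name;           # do not leave a half-written temp file behind
        die $err;
    }

    if (defined $backup_ext && length $backup_ext) {
        rename $file, "$file$backup_ext"
            or die "cannot back up $file: $!";
    }
    rename $tmp_name, $file or die "cannot replace $file: $!";
    return 1;
}
```

        With XML::Twig, the writer would presumably be something like sub { my $t = XML::Twig->new(pretty_print => 'indented'); $t->parsefile($_[0]); $t->print }, so that the print happens while the temp file is still selected. Anything printed after rewrite_inplace returns goes to the previously selected handle, usually STDOUT, which matches what the original post observed.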

        So it can be used only with the on-the-fly processing style and not with the tree-processing one?

        That is tree mode. If you change the flush to a print, you'll see lots and lots of duplicated output.

        but i prefer to use the XML::Twig method for shortness and laziness.

        short/lazy things are supposed to be easy/obvious to use. I think this one kind of got away from mirod, but it's not like I have better ideas, I mean come on :)

            $twig->infile('file.xml')
                 ->backup('.bak')
                 ->tempout
                 ->parseit
                 ->replaceoriginalwithtemp