Jazz-jj has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone, I'm using XML::Simple to parse some XML. One of the items has a URL in it:

<results_link>https://myserver.com/app/search?q=%7Cloadjob%20scheduler_amJ1bGxvdWdoQGRldi1hbWVyaWNhZmlyc3QuY29t_at_1496880600_682%20%7C%20head%2020%20%7C%20tail%201&amp;earliest=0&amp;latest=now</results_link>

I take all of the XML results I have and pass them through the XML parser:

$parsedPayload = XMLin($Payload);
$ReportURL = $parsedPayload->{results_link};

Whether I use $ReportURL or $parsedPayload->{results_link}, I see that &amp; is now just &. I'm taking XML from one system and passing it as XML to another system. As a result, the unescaped & characters are causing me an issue. Bug? By design? Is there another module I should use? Thanks all!

Replies are listed 'Best First'.
Re: &amp; and XML::Simple
by Discipulus (Canon) on Jun 08, 2017 at 06:58 UTC
    Hello Jazz-jj and welcome to the monastery and to the wonderful world of Perl,

    > I'm using XML::Simple to parse some xml..

    This is the first error: see "XML::Simple needs to go!" and the very beginning of the module's own documentation to know why.

    Several valid options for safely parsing XML are on CPAN: I chose XML::Twig and I'm happy with it (but still unhappy with XML itself!)

    XML::Twig has a lot of features and tutorials on the author website.

    Next time, give us a short sample of your data to make it easier to help you.

    I'd go with something like

    use strict;
    use warnings;
    use XML::Twig;

    my $t = XML::Twig->new(
        pretty_print  => 'indented',
        twig_handlers => {
            # $_[1] is the element
            'results_link' => sub { $_[1]->print; },
        },
    );

    my $data = <<EOXML;
    <?xml version="1.0"?>
    <!DOCTYPE stats SYSTEM "stats.dtd">
    <results_link>https://myserver.com/app/search?q=%7Cloadjob%20scheduler_amJ1bGxvdWdoQGRldi1hbWVyaWNhZmlyc3QuY29t_at_1496880600_682%20%7C%20head%2020%20%7C%20tail%201&amp;earliest=0&amp;latest=now</results_link>
    EOXML

    $t->parse($data);
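
A related sketch, assuming default XML::Twig settings: the same element can be read either as decoded text or as XML-escaped text, which is the distinction behind the question. The URL here is a shortened, hypothetical stand-in for the real one.

```perl
use strict;
use warnings;
use XML::Twig;

# Shortened, hypothetical stand-in for the real results_link element.
my $xml = '<results_link>https://myserver.com/app/search?q=x&amp;earliest=0&amp;latest=now</results_link>';

my $twig = XML::Twig->new;
$twig->parse($xml);
my $elt = $twig->root;

my $decoded = $elt->text;      # plain text, entities decoded: ...&earliest=0...
my $escaped = $elt->xml_text;  # text re-escaped for XML:      ...&amp;earliest=0...
my $whole   = $elt->sprint;    # the whole element, tags included, escaped

print "decoded: $decoded\n";
print "escaped: $escaped\n";
print "element: $whole\n";
```

Use the decoded form when the value goes into something that is not XML (a browser, a log line) and the escaped form when it goes back into an XML document.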

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
      Cool. Thanks guys. This is version 1 of this integration script, so, if other modules work better for moving XML from one system to another, I'll definitely give them a try. Thanks! Jazz

      And for completeness, XML::LibXML is probably the other best choice (IMHO) alongside Twig.   This package uses the same libxml2 binary library that is very much an industry standard for handling XML.   You often wind up processing another program’s XML file outputs using the same underlying library that it used to build them.   (Whether or not they used Perl to drive it.)
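
For moving a value from one XML document into another with XML::LibXML, a minimal sketch might look like the following. The payload, the `result` and `report` element names, and the shortened URL are all assumptions for illustration; only `results_link` comes from the question.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical, shortened incoming payload.
my $payload = '<result><results_link>https://myserver.com/app/search?q=x&amp;earliest=0&amp;latest=now</results_link></result>';

# Parsing decodes the entity: $url holds a plain "&".
my $in  = XML::LibXML->load_xml(string => $payload);
my $url = $in->findvalue('/result/results_link');

# Build the outgoing document through the DOM instead of string pasting.
my $out  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $out->createElement('report');   # 'report' is an assumed name
$out->setDocumentElement($root);

my $link = $out->createElement('results_link');
$link->appendText($url);                    # text node; escaped on output
$root->appendChild($link);

# Serializing re-escapes "&" as "&amp;" automatically.
my $xml_out = $root->toString;
print "$url\n$xml_out\n";
```

Because the library escapes text nodes when it serializes, there is no manual substitution step to forget.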

Re: &amp; and XML::Simple
by Anonymous Monk on Jun 08, 2017 at 00:38 UTC

    It's by design, as "&" is just the XML way to write "&".

    The way you're using XML::Simple, it is not giving you "raw" XML, and that's a good thing.

    The problem is how you pass on what XML::Simple gives you: the other end expects XML, but you're not giving it XML. You forgot to escape/encode/xmlify the data you're sending.

      I forgot code tags :) "&amp;" is the XML way to write "&"; it's how XML encodes/escapes "&".
        OK. So I just added this bit to "XML-ify" what I'm sending:
        $URLFix = '&amp;';
        $parsedPayload->{results_link} =~ s/&/$URLFix/g;
        And that would be "the correct way to fix it"?
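
The substitution above covers `&` only. If the outgoing XML is assembled by hand, a slightly more general sketch escapes all five characters that XML predefines entities for. `xml_escape` is a hypothetical helper name, and the URL is a shortened stand-in for the real one.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the five characters with predefined XML entities.
# A single-pass substitution means an "&" produced by one replacement is
# never escaped a second time within the same call.
sub xml_escape {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&apos;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Decoded value, as a parser would hand it back (shortened, hypothetical URL).
my $url     = 'https://myserver.com/app/search?q=x&earliest=0&latest=now';
my $escaped = xml_escape($url);

print "<results_link>$escaped</results_link>\n";
```

Apply it exactly once, to decoded text: running it over a string that is already escaped turns `&amp;` into `&amp;amp;`.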