in reply to XML::Compile template initialized from XML document

I have the feeling that your process is too complicated. With XML::Compile, the process can be:

  1. Read XML using XML::Compile::Translate::Reader into Perl structure
  2. Modify Perl structure
  3. Write Perl structure to schema-compliant XML using XML::Compile::Translate::Writer

There is no need to involve XML::LibXML::Element.
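As a rough illustration of those three steps, here is a minimal sketch. The namespace, the schema and the `plan` element are invented for the example (they are not the SA-REST schema), and it assumes the schema has no mixed content:

```perl
use strict;
use warnings;
use XML::Compile::Schema;
use XML::Compile::Util qw(pack_type);
use XML::LibXML;

# Hypothetical namespace and schema, made up for this sketch.
my $ns  = 'http://example.com/demo';
my $xsd = <<'XSD';
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/demo"
           elementFormDefault="qualified">
  <xs:element name="plan">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="step" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

my $schema = XML::Compile::Schema->new($xsd);
my $type   = pack_type($ns, 'plan');   # "{http://example.com/demo}plan"

# Compile each handler once; the code references can be reused.
my $reader = $schema->compile(READER => $type);
my $writer = $schema->compile(WRITER => $type);

# 1. Read XML into a plain Perl structure.
my $data = $reader->(
    '<plan xmlns="http://example.com/demo">'
  . '<name>nightly</name><step>stop</step><step>patch</step>'
  . '</plan>'
);

# 2. Modify the Perl structure (repeated elements arrive as an ARRAY).
$data->{name} = 'weekly';
push @{ $data->{step} }, 'start';

# 3. Write the structure back out as schema-compliant XML.
my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $node = $writer->($doc, $data);
$doc->setDocumentElement($node);
print $doc->toString(1);
```

The writer validates the structure against the schema while it builds the node, so a misspelled key or a missing required element should raise an error at write time rather than produce invalid XML.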

Re^2: XML::Compile template initialized from XML document
by tdane (Acolyte) on May 11, 2016 at 21:47 UTC

    I accept the possibility. I'm new to both library families (XML::Compile and XML::LibXML), and they are large and complicated. Let me show you the code I use to compile the schema and parse the XML. It produces a structure I find difficult to use, and maybe you can help me get to this easier model.

    I'm trying to post the minimal helpful code. The getSASchema subroutine successfully downloads the XSD and compiles it. The saQuery subroutine successfully pulls the XML I want and runs the reader I get from the schema object's "compile" method. As you see the subroutine returns the reference that comes back from calling the reader with the XML text.

        ...
        ...
        sub getSASchema {
            my ($config, $lwp) = @_;
            my $saSchemaUrl = "https://" . $config->{saserver} . ":" . $config->{saport}
                . "/serverautomation/SA-REST.xsd";
            my $sareq = HTTP::Request->new( GET => $saSchemaUrl );
            $sareq->authorization_basic($config->{besuser}, $config->{bespassword});
            my $xsd = $lwp->request($sareq);
            my $schema = XML::Compile::Schema->new($xsd->{_content});
            return $schema;
        }

        ## Handle querying the Server Automation API.
        sub saQuery {
            my ($config, $lwp, $schema) = @_;
            my $saPlanUrl = "https://" . $config->{saserver} . ":" . $config->{saport}
                . "/serverautomation" . $config->{saplanurl};
            my $sareq = HTTP::Request->new( GET => $saPlanUrl );
            $sareq->authorization_basic($config->{besuser}, $config->{bespassword});
            my $xml = $lwp->request($sareq);
            my $planreader = $schema->compile( READER => "{http://iemfsa.tivoli.ibm.com/REST}sa-rest");
            my $xmltxt = $xml->{_content};
            my $tree = $planreader->($xmltxt);
            return $tree;
        }
        ...
        ...
        # Down in my main. $config is a hash containing config
        # data parsed from a file, and $lwp is an instance of
        # LWP::UserAgent
        my $saSchema = getSASchema($config, $lwp);

        ## Fetch the "raw" automation plan from the BigFix SA server
        my $doc = saQuery($config, $lwp, $saSchema);
        my $basenode = $doc->{_};
        print Dumper($doc);
        print Dumper($basenode);

    As you see in my main code, I then use Dumper to see what I get from that process. Here's what I get:

        $VAR1 = {
                  '_MIXED_ELEMENT_MODE' => 'ATTRIBUTES',
                  '_' => bless( do{\(my $o = 85686224)}, 'XML::LibXML::Element' )
                };
        $VAR1 = bless( do{\(my $o = 85686224)}, 'XML::LibXML::Element' );

    That is not a hash of hashes. It is XML::LibXML objects. This is how the XML::Compile documentation tells me to make a reader, but I'm not sure this reader is the same reader you refer to as XML::Compile::Translate::Reader. Maybe you can show me a fragment to replace my reader in saQuery with one that you think will produce what I want? And please remember that after modifying my structure I need to be able to use my XML::Compile::Schema object to construct compliant output XML...

    I really appreciate your help. I'm learning a lot.

      Be sure that you create the ::Schema (better the ::Cache) only once in your program, and reuse it. The same for your compiled handlers: compilation is expensive, (re)use is cheap.
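One way to make "compile once, reuse often" hard to get wrong is to keep the compiled handlers in a small cache. A minimal sketch; the helper name `reader_for`, the namespace and the schema are all made up for illustration:

```perl
use strict;
use warnings;
use XML::Compile::Schema;

# Hypothetical schema, invented for this sketch.
my $schema = XML::Compile::Schema->new(<<'XSD');
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/demo"
           elementFormDefault="qualified">
  <xs:element name="note" type="xs:string"/>
</xs:schema>
XSD

my %readers;   # element type => compiled reader

# Hypothetical helper: compile a reader on first use, then hand back
# the same code reference on every later call.
# (One cache per schema; the key is only the element type.)
sub reader_for {
    my ($schema, $type) = @_;
    return $readers{$type} //= $schema->compile(READER => $type);
}

my $type = '{http://example.com/demo}note';
for my $text (qw(first second third)) {
    my $xml   = qq{<note xmlns="http://example.com/demo">$text</note>};
    my $value = reader_for($schema, $type)->($xml);   # compiled only once
    print "$value\n";
}
```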

      You probably want $xmltxt = $xml->decoded_content rather than reaching into $xml->{_content}, which is a private field of the response object.
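For illustration, a small sketch of the accessor on an HTTP::Response. The response is built by hand here as a stand-in for what the LWP::UserAgent request would return, and its body is invented:

```perl
use strict;
use warnings;
use HTTP::Response;

# Stand-in for the object returned by $lwp->request($sareq);
# the headers and body are made up for this sketch.
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/xml; charset=UTF-8' ],
    '<sa-rest/>',
);

# Fail early instead of feeding an error page to the XML reader.
die 'request failed: ' . $res->status_line unless $res->is_success;

# decoded_content undoes any Content-Encoding (gzip etc.) and the charset.
my $xmltxt = $res->decoded_content;
print "$xmltxt\n";
```

Checking is_success first also means an authentication failure shows up as an HTTP error rather than as a confusing parse error further down.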

      Apparently, your schema uses mixed="true". When that is not a mistake (which it often is), then you have to do things partially manually. XHTML is an example of mixed XML: it does not translate into a predictable DOM tree. Tune the handling of mixed elements to suit your needs via the mixed_elements parameter.
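A sketch of what that tuning can look like on the reader side, assuming the documents never really contain mixed-in text. The schema and namespace are invented; 'ATTRIBUTES' is the default mode seen in the Dumper output earlier in the thread, and 'STRUCTURAL' is one of the predefined modes listed in the XML::Compile::Translate::Reader documentation:

```perl
use strict;
use warnings;
use Data::Dumper;
use XML::Compile::Schema;

# Hypothetical schema with mixed="true", invented for this sketch.
my $schema = XML::Compile::Schema->new(<<'XSD');
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/demo"
           elementFormDefault="qualified">
  <xs:element name="plan">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="step" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

my $type = '{http://example.com/demo}plan';
my $xml  = '<plan xmlns="http://example.com/demo">'
         . '<name>nightly</name><step>stop</step><step>patch</step>'
         . '</plan>';

# Default mode: the mixed element is handed over as an XML::LibXML node.
my $as_node = $schema->compile(READER => $type)->($xml);

# STRUCTURAL: decode the element as if mixed had not been set.
my $as_hash = $schema->compile(
    READER         => $type,
    mixed_elements => 'STRUCTURAL',
)->($xml);

print Dumper($as_node, $as_hash);
```

The writer accepts a mixed_elements option as well; it is worth checking the XML::Compile::Translate::Writer documentation for the modes it supports before relying on a round trip.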

        I do create the Schema only once. I have only the one Schema, I use it repeatedly while looping over the contents of a file, and then it is gone, so I saw no advantage to using Cache. This is a "utility" script that consumes a web service, not a web service itself.

        The schema comes from a commercial application. It looks to me like it is an oversight that "mixed" is "true" on their complex types. Again, I'm only passingly familiar with the full XML Schema definition. As I understand it, "mixed" should only be true when an element can contain both character content (text/data) AND child elements (tags) interleaved. Is that right? The actual XML documents I have seen from this interface do not have that. Right now I actually pull the schema from the web service. Would I be better off grabbing a copy and setting "mixed" to "false" for my work?

        Following up. Are you suggesting that the reader is producing the tree of objects instead of a hash of hashes because the schema is mixed? In other words, the library knows it can't render a mixed schema as a hash of hashes and that is why it doesn't?

        I'm not going to wait for your answer. Because I know the schema doesn't have to have mixed elements, I'm going to make a local file copy of the schema, change it so mixed is "false" and see what I get. But I would welcome anyone's input on this while I do that work. Thanks!
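For anyone trying the same experiment, the schema copy does not have to be edited by hand. A rough sketch that flips every mixed="true" with XML::LibXML before handing the schema to XML::Compile; the schema, namespace and element names are invented, and in practice the XSD text would come from the saved copy or the downloaded response:

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::Compile::Schema;

# Hypothetical schema text standing in for the vendor's XSD.
my $xsd = <<'XSD';
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/demo"
           elementFormDefault="qualified">
  <xs:element name="plan">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

# Parse the XSD and switch off mixed content wherever it is declared.
my $xsd_doc = XML::LibXML->load_xml(string => $xsd);
for my $el ($xsd_doc->findnodes('//*[@mixed="true"]')) {
    $el->setAttribute(mixed => 'false');
}

# Compile from the patched text instead of the original.
my $schema = XML::Compile::Schema->new($xsd_doc->toString);
my $reader = $schema->compile(READER => '{http://example.com/demo}plan');

my $data = $reader->(
    '<plan xmlns="http://example.com/demo"><name>nightly</name></plan>'
);
print "plan name: $data->{name}\n";
```

This only makes sense if the real documents never carry text between the child elements; if they do, the patched schema will reject them.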