in reply to Pulling out sections of an XMI file with XML::Twig

Is not $section what you want as a starting point?

sub uml_class {
    my ( $twig, $section ) = @_;

    $section->print ();
}

Prints:

<UML:Class name="EARootClass" xmi.id="EAID_..."/><UML:Class name="DataType" xmi.id="EAID..."><UML:Classifier.feature xmi.id="EAID..."> <!-- I want to pull out the following UML:Attribute blocks --> <UML:Attribute name="dataTypeId"></UML:Attribute><UML:Attribute name="name"></UML:Attribute></UML:Classifier.feature></UML:Class>

Update:

Note the "Processing just parts of an XML document" section in the XML::Twig documentation that describes twig_roots and in particular the line my( $t, $elt)= @_; in the sample code in that section.

Update:

and to reparse the sub-element:

sub uml_class {
    my ( $twig, $section ) = @_;
    my $xml = $section->sprint ();
    my $subTwig = XML::Twig->new ( twig_roots => { 'uml:attribute' => \&uml_attr } );

    $subTwig->parse ($xml);
}

sub uml_attr {
    my ($twig, $elt) = @_;

    $elt->print ();
    print "\n";
}

Prints:

<uml:attribute name="datatypeid"></uml:attribute>
<uml:attribute name="name"></uml:attribute>

DWIM is Perl's answer to Gödel

Re^2: Pulling out sections of an XMI file with XML::Twig
by bobf (Monsignor) on Sep 27, 2006 at 23:04 UTC

    *click* ...a lightbulb turns on...

    Thank you for the example code - that made all the difference! I didn't realize that $section was actually an XML::Twig::Elt object. Using your example as a starting point, I was able to get what I needed.

    At first I thought I needed to parse the XML chunks into a data structure (such as that returned by XML::Simple), which I accomplished using

    my $struct = $section->simplify( forcearray => 1 );
    but I soon realized that was overkill for what I really needed - the value of the attributes for the class (etc) tags. The $elt->att( $attribute ) method did the trick. An example of the class handler from my working code is below.
    sub uml_class {
        my ( $twig, $section ) = @_;

        print "data for class:\n";
        print " name = ", $section->att( 'name' ), "\n";
        print " xmi.id = ", $section->att( 'xmi.id' ), "\n";

        my $subTwig = XML::Twig->new(
            twig_roots => { 'UML:Attribute' => \&uml_attr } );

        # $subTwig->parse( $xml );   # original code (typo)
        $subTwig->parse( $section->sprint() );
    }

    Thanks again! I knew I was making it too hard. :-)

    Update: corrected typo in the example code

      I must admit that I don't quite understand what you are doing, or even trying to do, but it looks like you are parsing things several times (there is a call to parse in uml_class, but I fail to see what exactly is in $xml). That should not be necessary: you can set handlers at different levels of the tree (not twig_roots, but regular twig_handlers).
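
      A minimal sketch of what handlers at two levels might look like, assuming the simplified structure described in this thread. The sample XML, the xmlns:UML value, the ids, and the @pending and %attrs_of variables are all invented for illustration. It relies on the UML:Attribute handler firing before the handler of the enclosing UML:Class, and it assumes classes are not nested inside each other.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data: namespace URI and ids are placeholders.
my $xml = <<'XML';
<XMI xmlns:UML="urn:example:uml">
  <UML:Class name="DataType" xmi.id="C1">
    <UML:Classifier.feature>
      <UML:Attribute name="dataTypeId"/>
      <UML:Attribute name="name"/>
    </UML:Classifier.feature>
  </UML:Class>
</XMI>
XML

my @pending;     # attribute names seen since the last class was closed
my %attrs_of;    # class name => [ attribute names ]

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called when each UML:Attribute element has been fully parsed,
        # which happens before the enclosing UML:Class is complete.
        'UML:Attribute' => sub {
            my ( $t, $elt ) = @_;
            push @pending, $elt->att('name');
        },
        # Called when the UML:Class is complete; everything collected
        # so far belongs to this class (assuming no nested classes).
        'UML:Class' => sub {
            my ( $t, $elt ) = @_;
            $attrs_of{ $elt->att('name') } = [@pending];
            @pending = ();
            $t->purge;    # free the part of the tree already handled
        },
    },
);
$twig->parse($xml);

for my $class ( sort keys %attrs_of ) {
    print "$class: @{ $attrs_of{$class} }\n";
}
```

      The document is parsed only once here; nothing is serialized with sprint and fed back to a second twig.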

      If you could repost a complete example I might be able to help.

        Thank you for trying to help - I am still very new to XML::Twig and I am probably not doing things as efficiently as I could be. (Sorry for the typo with $xml in the above code - I fixed it.)

        I'm trying to extract certain attributes from a small set of tags in the XML document referenced above. Specifically, the UML:Class tags contain UML:Attribute tags, and I want to extract the name and xmi.id attributes from them. The general structure of this portion of the document looks like this:

        <UML:Class>
          <UML:Attribute></UML:Attribute>
          <UML:Attribute></UML:Attribute>
          <UML:Attribute></UML:Attribute>
        </UML:Class>
        A complete example is shown under the readmore.

        If I understand how XML::Twig works, a section of the tree is sent to a handler when the closing tag is reached. Therefore, for the simplified example above, each of the UML:Attribute sections will get parsed before the corresponding UML:Class section is parsed. I would like to parse the Class first (so I don't have to jump through hoops to associate the Attribute data with the Class data later), which is why the handler for UML:Attribute is located in the handler for UML:Class.
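
        The ordering described above can be checked with a small sketch. The sample XML and the @order array are invented for illustration. It also registers a start_tag_handlers entry, which XML::Twig calls when the opening tag is seen: at that point the element has its attributes but no content yet, so this may be one way to see the Class data before its Attributes.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data: namespace URI and ids are placeholders.
my $xml = <<'XML';
<XMI xmlns:UML="urn:example:uml">
  <UML:Class name="DataType" xmi.id="C1">
    <UML:Attribute name="dataTypeId"/>
    <UML:Attribute name="name"/>
  </UML:Class>
</XMI>
XML

my @order;    # records the sequence in which handlers are called

my $twig = XML::Twig->new(
    # Called on the opening tag: attributes are set, children are not.
    start_tag_handlers => {
        'UML:Class' => sub {
            my ( $t, $elt ) = @_;
            push @order, 'start Class ' . $elt->att('name');
        },
    },
    # Called once the element, including its children, is complete.
    twig_handlers => {
        'UML:Attribute' => sub {
            my ( $t, $elt ) = @_;
            push @order, 'Attribute ' . $elt->att('name');
        },
        'UML:Class' => sub {
            my ( $t, $elt ) = @_;
            push @order, 'end Class ' . $elt->att('name');
        },
    },
);
$twig->parse($xml);

print "$_\n" for @order;
```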

        I admit that it seems inefficient to call sprint and then parse. I took that snippet from GrandFather's example. Is there a better way to do it?
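
        One possible alternative, sketched under the assumption that the document looks like the simplified structure above: by the time the UML:Class handler is called, the element passed to it already holds its whole subtree, so the UML:Attribute children can be reached with the descendants method of XML::Twig::Elt instead of serializing and reparsing. The sample XML, the ids, and the @report array are invented for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data: namespace URI and ids are placeholders.
my $xml = <<'XML';
<XMI xmlns:UML="urn:example:uml">
  <UML:Class name="EARootClass" xmi.id="C0"/>
  <UML:Class name="DataType" xmi.id="C1">
    <UML:Classifier.feature>
      <UML:Attribute name="dataTypeId" xmi.id="A1"/>
      <UML:Attribute name="name" xmi.id="A2"/>
    </UML:Classifier.feature>
  </UML:Class>
</XMI>
XML

my @report;

sub uml_class {
    my ( $twig, $section ) = @_;

    # Class data first, taken from the element handed to the handler.
    push @report, sprintf 'class %s (%s)',
        $section->att('name'), $section->att('xmi.id');

    # Then walk the already-built subtree; no second parse is needed.
    for my $attr ( $section->descendants('UML:Attribute') ) {
        push @report, sprintf '  attribute %s (%s)',
            $attr->att('name'), $attr->att('xmi.id');
    }

    $twig->purge;    # release the class that has just been reported
}

my $twig = XML::Twig->new( twig_roots => { 'UML:Class' => \&uml_class } );
$twig->parse($xml);

print "$_\n" for @report;
```

        Because the class line is written before the loop, the output lists each class ahead of its attributes even though the attribute elements were completed first during parsing.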

        I hope that clarifies what I'm trying to do. I'd appreciate any suggestions that you might have. Thanks.