Re^2: Pulling out sections of an XMI file with XML::Twig

*click* ...a lightbulb turns on...

Thank you for the example code - that made all the difference! I didn't realize that $section was actually an XML::Twig::Elt object. Using your example as a starting point, I was able to get what I needed.

At first I thought I needed to parse the XML chunks into a data structure (such as that returned by XML::Simple), which I accomplished using

my $struct = $section->simplify( forcearray => 1 );

but I soon realized that was overkill for what I really needed - the value of the attributes for the class (etc) tags. The $elt->att( $attribute ) method did the trick. An example of the class handler from my working code is below.

sub uml_class
{
    my ( $twig, $section ) = @_;

    print "data for class:\n";
    print "   name = ", $section->att( 'name' ), "\n";
    print "   xmi.id = ", $section->att( 'xmi.id' ), "\n";

    my $subTwig = XML::Twig->new( twig_roots => {
                                      'UML:Attribute' => \&uml_attr } );
    # $subTwig->parse( $xml ); # original code (typo)
    $subTwig->parse( $section->sprint() );
}

Thanks again! I knew I was making it too hard. :-)

Update: corrected typo in the example code

Re^3: Pulling out sections of an XMI file with XML::Twig by mirod (Canon) on Sep 28, 2006 at 09:42 UTC
I must admit that I don't quite understand what you are doing, or even trying to do, but it looks like you are parsing things several times (the call to `parse` in `uml_class`, although I fail to see exactly what's in `$xml`). That should not be necessary: you can set handlers at different levels of the tree (not `twig_roots`, but regular `twig_handlers`). If you could repost a complete example I might be able to help.
Re^4: Pulling out sections of an XMI file with XML::Twig by bobf (Monsignor) on Sep 28, 2006 at 16:00 UTC
Thank you for trying to help - I am still very new to XML::Twig and I am probably not doing things as efficiently as I could be. (Sorry for the typo with `$xml` in the above code - I fixed it.)

I'm trying to extract certain attributes from a small set of tags in the XML document referenced above. Specifically, the UML:Class tags contain UML:Attribute tags, and I want to extract the name and xmi.id attributes from them. The general structure of this portion of the document looks like this:

<UML:Class>
    <UML:Attribute></UML:Attribute>
    <UML:Attribute></UML:Attribute>
    <UML:Attribute></UML:Attribute>
</UML:Class>

A complete example is shown under the readmore.

If I understand how XML::Twig works, a section of the tree is sent to a handler when the closing tag is reached. Therefore, for the simplified example above, each of the UML:Attribute sections will get parsed before the corresponding UML:Class section is parsed. I would like to parse the Class first (so I don't have to jump through hoops to associate the Attribute data with the Class data later), which is why the handler for UML:Attribute is located in the handler for UML:Class.

I admit that it seems inefficient to call `sprint` and then `parse`. I took that snippet from GrandFather's example. Is there a better way to do it?

I hope that clarifies what I'm trying to do. I'd appreciate any suggestions that you might have. Thanks.
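
The handler ordering described above can be seen with a minimal sketch like the following. The inline document, the class and attribute names, and the `@order` array are made up for illustration and are not taken from the real XMI file; only regular `twig_handlers` are used.

```perl
use strict;
use warnings;
use XML::Twig;

# Made-up sample data (an assumption, not the real XMI file).
my $xml = <<'XML';
<XMI xmlns:UML="org.omg.xmi.namespace.UML">
  <UML:Class name="Account" xmi.id="c1">
    <UML:Attribute name="owner" xmi.id="a1"/>
    <UML:Attribute name="balance" xmi.id="a2"/>
  </UML:Class>
</XMI>
XML

my @order;    # records the order in which the handlers fire

my $twig = XML::Twig->new(
    twig_handlers => {
        # each handler receives the twig and the element that just closed
        'UML:Class'     => sub { push @order, 'Class ' . $_[1]->att('name') },
        'UML:Attribute' => sub { push @order, 'Attribute ' . $_[1]->att('name') },
    },
);
$twig->parse($xml);

print "$_\n" for @order;
```

With this sample, both Attribute lines are printed before the Class line, because the Class handler only fires once its closing tag has been reached.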
Re^5: Pulling out sections of an XMI file with XML::Twig by mirod (Canon) on Sep 28, 2006 at 16:52 UTC
Two things can help you here: you can use `start_tag_handlers`, which are called after the start tag of the element has been parsed (and the element object has been created, empty at that point). This also means that the UML:Attribute element will get completely parsed before the UML:Class, but when you're in its handler the opening tag of UML:Class has already been parsed. If you don't use `twig_roots` but regular `twig_handlers`, the element exists, it's an ancestor of the UML:Attribute element, and its attributes are already available. If space is a problem, you can sprinkle `purge` calls to taste.

So in your case I would write something like (untested):

use strict;
use warnings;
use Data::Dumper;
use XML::Twig;

my $twig = XML::Twig->new(
    start_tag_handlers => {
        'UML:Class' => \&uml_class,
    },
    twig_handlers => {
        'UML:Class'     => sub { $_[0]->purge },  # purge at the end of each section
        'UML:Attribute' => \&uml_attr,
    },
);
$twig->parsefile( 'testfile.xmi' );

sub uml_class
{
    my ( $twig, $section ) = @_;

    print "data for class:\n";
    print "   name = ", $section->att( 'name' ), "\n";
    print "   xmi.id = ", $section->att( 'xmi.id' ), "\n";
}

sub uml_attr
{
    my ( $twig, $attr ) = @_;

    # if you need the class id, it's in
    # $attr->parent( 'UML:Class' )->att( 'xmi.id' )
    $attr->print;

    # parse the block and extract the data elements
    my $struct = $attr->simplify( forcearray => 1 );
    print Dumper( $struct );

    $twig->purge;
}

Does it make sense? It probably doesn't matter that much if your files are small, but it feels better to parse each section only once.
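
For anyone who wants to try the idea without the original file, here is a small self-contained sketch of the same approach. The inline document and the `@seen` array are assumptions made up for illustration; the real XMI file has more levels between UML:Class and UML:Attribute, which is why the lookup uses `parent` with a condition rather than the bare parent.

```perl
use strict;
use warnings;
use XML::Twig;

# Made-up sample data (an assumption, not the real XMI file).
my $xml = <<'XML';
<XMI xmlns:UML="org.omg.xmi.namespace.UML">
  <UML:Class name="Account" xmi.id="c1">
    <UML:Attribute name="owner" xmi.id="a1"/>
    <UML:Attribute name="balance" xmi.id="a2"/>
  </UML:Class>
</XMI>
XML

my @seen;    # what each handler saw, in call order

my $twig = XML::Twig->new(
    start_tag_handlers => {
        # called as soon as the opening tag has been parsed
        'UML:Class' => sub {
            my ( $t, $class ) = @_;
            push @seen, 'class ' . $class->att('xmi.id');
        },
    },
    twig_handlers => {
        # called at the closing tag; the enclosing class is still in the tree
        'UML:Attribute' => sub {
            my ( $t, $attr ) = @_;
            my $class_id = $attr->parent('UML:Class')->att('xmi.id');
            push @seen, 'attr ' . $attr->att('xmi.id') . " in $class_id";
        },
        # free the memory once the whole class has been handled
        'UML:Class' => sub { $_[0]->purge },
    },
);
$twig->parse($xml);

print "$_\n" for @seen;
```

The class line comes first here because it is recorded from the start tag handler, and each attribute can reach its class id through the ancestor lookup, so the document is parsed only once.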
Re^6: Pulling out sections of an XMI file with XML::Twig by bobf (Monsignor) on Sep 28, 2006 at 17:32 UTC