If I understand correctly, you want to look for the <userID> tag within the <contents> tag, right? In that case, change
print {$parser->{parameters}[( exists $attr->{':userID'} ? 0 : 1 )]} $parser->ToXML( $tag, $attr);
to
print {$parser->{parameters}[( (exists $attr->{':contents'} and exists $attr->{':contents'}{':userID'}) ? 0 : 1 )]} $parser->ToXML( $tag, $attr);
If you wanted to check for the <userID> tag anywhere below the <start_element> tag, it would have to be written differently. Something like
use strict;
use warnings;
no warnings 'uninitialized';
use XML::Rules;
my $parser = XML::Rules->new(
    rules => {
        _default => 'raw',
        '^start_element' => sub {
            my ($tag,$attr,$context,$parents,$parser) = @_;
            $parser->{pad}{found_userID} = 0;
            return 1
        },
        userID => sub {
            my ($tag,$attr,$context,$parents,$parser) = @_;
            $parser->{pad}{found_userID} = 1;
            return [$tag => $attr]
        },
        start_element => sub {
            my ($tag,$attr,$context,$parents,$parser) = @_;
            print { $parser->{parameters}[ $parser->{pad}{found_userID} ] } $parser->ToXML( $tag, $attr), "\n";
        }
    }
);
open my $FH1, '>', 'c:\temp\test1.xml' or die "Cannot open c:\\temp\\test1.xml: $!";
open my $FH2, '>', 'c:\temp\test2.xml' or die "Cannot open c:\\temp\\test2.xml: $!";
print $FH1 "<root>\n";
print $FH2 "<root>\n";
$parser->parse( \*DATA, [$FH1, $FH2]);
print $FH1 "</root>\n";
print $FH2 "</root>\n";
__DATA__
<root>
<!-- First Element -->
<start_element>
<header>
<element_num>1</element_num>
</header>
<contents>
<child>MyChild</child>
</contents>
</start_element>
<!-- Second Element -->
<start_element>
<header>
<element_num>2</element_num>
</header>
<contents>
<child>MyChild</child>
<userID>MyUser</userID>
</contents>
</start_element>
<!-- Third Element -->
<start_element>
<header>
<element_num>3</element_num>
</header>
<contents>
<child>MyChild</child>
</contents>
</start_element>
</root>
Let me try to explain. XML::Rules lets you specify what to do with the data for a tag once the start tag is parsed (the "^tagname" rules; only the attributes are available) or once the end tag is parsed (the "tagname" rules; the attributes, the textual content and whatever the "handlers" for the child tags returned are available). The handler may decide to ignore the data, process it somehow or just pass it to the handler of the parent tag.
The way the handler returns the data affects how it is made available to the handler of the parent tag. It may be added to the hash of attributes, joined with the textual content, push()ed onto the end of the parent's content, combined with an already existing attribute, or any combination of those possibilities.
There are quite a few built-in rules specifying what gets passed and how. The 'raw' rule, used in the new script, puts all the data for a tag into the parent's content in a way that ensures that the ->ToXML() call later will write exactly what was parsed, including whitespace. The 'raw extended' rule does the same thing, but also adds the tag's data to the parent tag's attribute hash under the ':'.$tagname key. This makes it easier to check whether that child tag was present.
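To make the 'raw extended' lookup concrete, here is a minimal sketch. It assumes XML::Rules is installed; the sample XML, the @report array and the $has_user variable are made up for the illustration and are not part of the original scripts.

```perl
use strict;
use warnings;
use XML::Rules;

# Hypothetical sample data: the second <start_element> contains a <userID>.
my $xml = <<'XML';
<root>
<start_element><contents><child>A</child></contents></start_element>
<start_element><contents><child>B</child><userID>MyUser</userID></contents></start_element>
</root>
XML

my @report;    # collects one line per <start_element>, in document order

my $parser = XML::Rules->new(
    rules => {
        # Every child is kept in the parent's content AND stored under ":tagname".
        _default      => 'raw extended',
        start_element => sub {
            my ($tag, $attr) = @_;
            # Check ':contents' first so the nested lookup does not autovivify it.
            my $has_user = exists $attr->{':contents'}
                && exists $attr->{':contents'}{':userID'};
            push @report, $has_user ? 'has userID' : 'no userID';
            return;    # pass nothing on to <root>
        },
    },
);

$parser->parse($xml);
print "$_\n" for @report;
```

The same exists test is what picks the filehandle in the one-line version; here it only labels each element.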
The handlers may also be subroutine references or anonymous subroutines. The one in the older script checks whether there was a child tag named 'userID' (the _default handler would put it into the start_element's content for output and into its attribute hash for fast lookup) and, based on that, chooses the filehandle to which the tag and its data, converted back to XML, are printed. The scary-looking line could have been written like this:
my $FH;
if (exists $attr->{':userID'}) {
    $FH = $parser->{parameters}[0];
} else {
    $FH = $parser->{parameters}[1];
}
print $FH $parser->ToXML( $tag, $attr);
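The compact form relies on the fact that print accepts a block that evaluates to a filehandle. The following sketch shows that in isolation, using core Perl only. The in-memory filehandles and the @records data are assumptions made for the demo; they stand in for the two output files and the parsed tags.

```perl
use strict;
use warnings;

# In-memory filehandles stand in for the two output files (demo assumption).
my ($with, $without) = ('', '');
open my $FH1, '>', \$with    or die "Cannot open in-memory handle: $!";
open my $FH2, '>', \$without or die "Cannot open in-memory handle: $!";
my @handles = ($FH1, $FH2);

# Hypothetical records; only the first one has a ':userID' key.
my @records = (
    { name => 'a', ':userID' => 'MyUser' },
    { name => 'b' },
);

for my $record (@records) {
    # The block after print is evaluated and its result is used as the filehandle.
    print { $handles[ exists $record->{':userID'} ? 0 : 1 ] } "$record->{name}\n";
}

close $_ for @handles;

print "with userID: $with";
print "without userID: $without";
```

Record "a" ends up in the first handle and record "b" in the second, which is the same decision the if/else version makes.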
The other script works differently. In the '^start_element' handler it resets the flag (stored in $parser->{pad}, which is an attribute of the parser intended specifically "to put anything you want to and access it in any handler"). Then, if the <userID> tag is encountered, the flag is set. Finally, the 'start_element' handler selects one of the filehandles passed to $parser->parse() and prints the tag and its data there.
The rest is simple: an object is created, files are opened, text is printed, the parse() method is called (which reads the XML and calls the handlers as it goes through the XML) and the closing tag is printed.
Jenda