If I understand things right you want to look for the <userID> tag within the <contents> tag, right? In that case change

print {$parser->{parameters}[( exists $attr->{':userID'} ? 0 : 1 )]} $parser->ToXML( $tag, $attr);

to

print {$parser->{parameters}[( (exists $attr->{':contents'} and exists $attr->{':contents'}{':userID'}) ? 0 : 1 )]} $parser->ToXML( $tag, $attr);

If you wanted to check for the <userID> tag anywhere below the <start_element> tag, it would have to be written differently. Something like

use strict;
use warnings;
no warnings 'uninitialized';

use XML::Rules;

my $parser = XML::Rules->new(
    rules => {
        _default => 'raw',
        '^start_element' => sub {
            my ($tag,$attr,$context,$parents,$parser) = @_;
            $parser->{pad}{found_userID} = 0;
            return 1
        },
        userID => sub {
            my ($tag,$attr,$context,$parents,$parser) = @_;
            $parser->{pad}{found_userID} = 1;
            return [$tag => $attr]
        },
        start_element => sub {
            my ($tag,$attr,$context,$parents,$parser) = @_;

            print { $parser->{parameters}[ $parser->{pad}{found_userID} ] } $parser->ToXML( $tag, $attr), "\n";
        }
    }
);

open my $FH1, '>', 'c:\temp\test1.xml' or die "Cannot open c:\\temp\\test1.xml: $!";
open my $FH2, '>', 'c:\temp\test2.xml' or die "Cannot open c:\\temp\\test2.xml: $!";
print $FH1 "<root>\n";
print $FH2 "<root>\n";
$parser->parse( \*DATA, [$FH1, $FH2]);
print $FH1 "</root>\n";
print $FH2 "</root>\n";


__DATA__
<root>

<!-- First Element -->
<start_element>

<header>
<element_num>1</element_num>
</header>

<contents>
<child>MyChild</child>
</contents>

</start_element>

<!-- Second Element -->
<start_element>

<header>
<element_num>2</element_num>
</header>

<contents>
<child>MyChild</child>
<userID>MyUser</userID>
</contents>

</start_element>

<!-- Third Element -->
<start_element>

<header>
<element_num>3</element_num>
</header>

<contents>
<child>MyChild</child>
</contents>

</start_element>

</root>

Let me try to explain. XML::Rules lets you specify what to do with the data for a tag once the start tag is parsed (the "^tagname" rules; only the attributes are available) or once the end tag is parsed (the "tagname" rules; the attributes, the textual content and whatever the handlers for the child tags returned are available). The handler may decide to ignore the data, process it somehow or just pass it to the handler of the parent tag.
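To make the order of the calls concrete, here is a minimal sketch that only logs when each handler fires. The tag names <outer> and <inner>, the id attribute and the @trace array are made up for the illustration; they are not part of the scripts above.

```perl
use strict;
use warnings;
use XML::Rules;

my @trace;    # hypothetical log of handler calls, in the order they happen

my $parser = XML::Rules->new(
    rules => {
        # Start-tag rules: only the attributes are known at this point.
        # A true return value tells the parser to go on processing the tag.
        '^outer' => sub { push @trace, 'start outer'; return 1 },
        '^inner' => sub {
            my ($tag, $attr) = @_;
            push @trace, "start inner id=$attr->{id}";
            return 1;
        },
        # End-tag rules: the textual content is available as well.
        inner => sub {
            my ($tag, $attr) = @_;
            push @trace, "end inner text=$attr->{_content}";
            return;    # pass nothing to the parent
        },
        outer => sub { push @trace, 'end outer'; return },
    },
);

$parser->parse('<outer><inner id="7">hello</inner></outer>');
print "$_\n" for @trace;
```

The start rules fire on the way down and the end rules on the way back up, so the handler of a parent tag always runs after the handlers of all its children.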

The way the handler returns the data affects how it is made available to the handler of the parent tag. It may be added to the hash of attributes, joined with the textual content, push()ed at the end of the parent's contents, combined with an already existing attribute, or any combination of those possibilities.
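A small sketch of three of those return styles, as I understand them from the XML::Rules documentation. The <person>, <name>, <nick> and <note> tags and the %person hash are invented for the example.

```perl
use strict;
use warnings;
use XML::Rules;

my %person;    # hypothetical copy of what the <person> handler receives

my $parser = XML::Rules->new(
    rules => {
        # A key => value pair is added to the parent's attribute hash.
        name => sub { my ($tag, $attr) = @_; return $tag => $attr->{_content} },
        # A single string is joined with the parent's textual content.
        nick => sub { my ($tag, $attr) = @_; return "[$attr->{_content}]" },
        # An empty list means the data is ignored.
        note => sub { return },
        # The parent handler sees whatever its children handed up.
        person => sub { my ($tag, $attr) = @_; %person = %$attr; return },
    },
);

$parser->parse('<person><name>Ann</name><nick>Annie</nick><note>skip me</note></person>');
print "$_ => $person{$_}\n" for sort keys %person;
```

The point is that the child handlers, not the parent, decide where their data ends up in the parent's $attr.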

There are quite a few built-in rules specifying what gets passed and how. The 'raw' rule used in the new script puts all the data for a tag into the parent's content in a way that ensures that a later ->ToXML() call writes exactly what was parsed, including whitespace. The 'raw extended' rule does the same thing, but also adds the tag's data to the parent tag's attribute hash under the name ':'.$tagname. This makes it easier to check whether that child tag was present.
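As a rough illustration of the ':'.$tagname lookup, the sketch below uses 'raw extended' as the default rule and checks for a direct <userID> child in the <contents> handler. The @report array and the shortened XML are assumptions made for the example.

```perl
use strict;
use warnings;
use XML::Rules;

my @report;    # hypothetical list of findings, one per <contents> tag

my $parser = XML::Rules->new(
    rules => {
        # Children are kept for output and also stored under ':tagname'.
        _default => 'raw extended',
        contents => sub {
            my ($tag, $attr) = @_;
            # A direct <userID> child shows up as the ':userID' key.
            push @report, exists $attr->{':userID'} ? 'has userID' : 'no userID';
            return;
        },
    },
);

$parser->parse(
    '<root>'
  . '<contents><child>A</child></contents>'
  . '<contents><child>B</child><userID>U</userID></contents>'
  . '</root>'
);
print "$_\n" for @report;
```

Note that this only sees direct children of <contents>; finding the tag at any depth needs something like the flag used in the second script.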

The handlers may also be subroutine references or anonymous subroutines. The one in the older script checks whether there was a child tag named 'userID' (the _default handler puts it into the start_element's content for output and into its attribute hash for fast lookup) and based on that chooses the filehandle to which the tag and its data, converted back to XML, get printed. The scary-looking line could have been written like this:

my $FH;
if (exists $attr->{':userID'}) {
  $FH = $parser->{parameters}[0];
} else {
  $FH = $parser->{parameters}[1];
}
print $FH $parser->ToXML( $tag, $attr);
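The short form works because print accepts a block that evaluates to a filehandle. A self-contained sketch of just that part, with in-memory filehandles standing in for the real files and plain hashes standing in for the parsed tags (both are assumptions for the illustration):

```perl
use strict;
use warnings;

# Two in-memory filehandles stand in for the real output files.
open my $with_fh,    '>', \my $with_buf    or die "Cannot open in-memory handle: $!";
open my $without_fh, '>', \my $without_buf or die "Cannot open in-memory handle: $!";
my @handles = ($with_fh, $without_fh);

for my $record ({ name => 'first' }, { name => 'second', userID => 'MyUser' }) {
    # Index 0 if the key exists, 1 otherwise, as in the one-liner.
    my $index = exists $record->{userID} ? 0 : 1;
    # The block after print is evaluated to get the filehandle.
    print { $handles[$index] } "$record->{name}\n";
}
close $_ for @handles;

print "with userID: $with_buf";
print "without userID: $without_buf";
```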

The other script works differently. In the '^start_element' handler it resets the flag (stored in $parser->{pad}, which is an attribute of the parser meant specifically "to put anything you want to and access it in any handler"). If the <userID> tag is encountered, the flag is set. Finally the 'start_element' handler uses the flag to select one of the filehandles passed to $parser->parse() and prints the tag and its data there.

The rest is simple: an object is created, files are opened, text is printed, the parse() method is called (which reads the XML and calls the handlers as it goes through the XML) and the closing tag is printed.

Jenda
Enoch was right!
Enjoy the last years of Rome.

In reply to Re^3: Need to process a XML file through Perl by Jenda
in thread Need to process a XML file through Perl by paragkalra
