
For this particular task, I'd look at the XML::XPath module. XPath lets you search through an XML tree and pick out just the nodes you need. The way I understand it, XPath is to XML trees what regular expressions are to strings. The syntax is not very hard to figure out: a path such as /html/body/p reads much like a file system path, and the W3C XPath specification covers the rest.

Here is an example from the XML::XPath docs:

    use XML::XPath;
    use XML::XPath::XMLParser;
    
    my $xp = XML::XPath->new(filename => 'test.xhtml');
    
    my $nodeset = $xp->find('/html/body/p'); # find all paragraphs
    
    foreach my $node ($nodeset->get_nodelist) {
        print "FOUND\n\n", 
            XML::XPath::XMLParser::as_string($node),
            "\n\n";
    }
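To give a feel for a few more XPath expressions, here is a minimal sketch that parses an in-memory string instead of a file. The sample document, its element names, and the variable names are all made up for illustration; only the XML::XPath calls (new with the xml option, findnodes, findvalue, string_value) come from the module itself.

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical sample document, invented for this example.
my $xml = <<'END_XML';
<library>
  <book id="b1"><title>First Book</title></book>
  <book id="b2"><title>Second Book</title></book>
</library>
END_XML

my $xp = XML::XPath->new(xml => $xml);

# '//title' matches every <title> element anywhere in the tree.
# In list context, findnodes returns the matching nodes.
my @titles = map { $_->string_value } $xp->findnodes('//title');

# A predicate in square brackets filters on an attribute value.
# Quoting the result forces it to a plain string.
my $second = "" . $xp->findvalue('/library/book[@id="b2"]/title');

# XPath also has functions; count() returns the number of matches.
my $count = "" . $xp->findvalue('count(/library/book)');

print "Titles: ", join(', ', @titles), "\n";
print "Book b2: $second\n";
print "Number of books: $count\n";
```

The same three ideas (descendant search with //, attribute predicates, and functions such as count) should carry over to whatever XML you are dissecting; only the paths change.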

I think XPath is really interesting stuff, and if you post some of the XML you are dissecting, I'll try to help you out as best I can.

You might also want to give the perl-xml mailing list a quick search.

Get Strong Together!!

In reply to Re: Expat by aardvark
in thread Using Expat: how to extranct and manipulate elements? by Laila
