in reply to HTML::TreeBuilder::XPath and regular expressions

Shame on me! It turns out the simplest, most Perlish version actually works!

It may not be official w3.org syntax, and it doesn't work in XPather, but it does work in HTML::TreeBuilder::XPath.

So in case other people search for this: to find attribute values matching a regular expression using HTML::TreeBuilder::XPath:

  $tree->findnodes( '//element[ @attribute =~ /regex/ ]' );
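
For anyone who wants something to paste and try, here is a minimal sketch of the same idea as a complete script. The HTML snippet, the div elements, and the id values are made up for illustration; only the =~ predicate is the part discussed in this thread, and I have only tried it with a simple regex that contains no slashes or flags.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample document, used only for this illustration.
my $html = <<'HTML';
<html><body>
  <div id="foo-1">first</div>
  <div id="bar-2">second</div>
  <div id="foo-3">third</div>
</body></html>
HTML

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content($html);

# Non-standard extension: a Perl regex match inside the XPath predicate.
# This selects the div elements whose id attribute starts with "foo".
my @matches = $tree->findnodes('//div[ @id =~ /^foo/ ]');

# Collect the id of each matching element and print one per line.
my @ids = map { $_->attr('id') } @matches;
print "$_\n" for @ids;

# Free the tree explicitly, as is usual with HTML::TreeBuilder.
$tree->delete;
```

Keep in mind that this will not carry over to other XPath engines, since the =~ operator is an extension and not part of the XPath specification.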

Re^2: HTML::TreeBuilder::XPath and regular expressions
by mirod (Canon) on Mar 29, 2010 at 12:43 UTC

    Yes, I cheated. In XML::XPathEngine, which provides the XPath engine for HTML::TreeBuilder::XPath, it was easy and seemed like the Perlish thing to do to integrate regexps in the XPath syntax itself.

    From looking at the code (which I inherited from XML::XPath), the official XPath way, using fn:matches(subject, pattern, flags), doesn't seem to be supported. Patches welcome.

      Well, if we can use the much simpler and better Perl way, who cares about the convoluted "official" way... :-)

      As it is, it's great. Thanks a lot. The only missing thing seems to be a sentence and/or an example in the module's documentation. In the meantime, hopefully this thread will show up for the kind of searches I made without success today.

        And of course, it actually IS documented. Not in HTML::TreeBuilder::XPath, but in XML-XPathEngine.

        Must be a bad day. I keep finding stuff only after posting...

        Yes, it makes sense to document it in HTML::TreeBuilder::XPath, since that's the module people are actually using. XML::XPathEngine is just a lower layer; users don't need to know about it or look for documentation there. I will add it for the next version. Thanks.