PerlMonks  

Re: XML::LibXML - How to extract an element including the elements within?

by tangent (Parson)
on May 09, 2019 at 20:15 UTC ( [id://1233530]=note )


in reply to XML::LibXML - How to extract an element including the elements within?

You need to find the node itself and then print out the literal value of that node using toString(). That will also print out the enclosing tags but simple regular expressions can strip them out:
    my ($node) = $dom->findnodes('/doc/text');
    my $string = $node->toString;
    print "toString:\n$string\n";
    # remove enclosing tags
    $string =~ s/^<[^>]+>//;
    $string =~ s/<[^>]+>$//;
    print "toString:\n$string\n";
Output:
    toString:
    <text>From mobile, <ph1 i="1" type="33" x="1"/>dial<ph2 i="1"/> this number:</text>
    toString:
    From mobile, <ph1 i="1" type="33" x="1"/>dial<ph2 i="1"/> this number:

Replies are listed 'Best First'.
Re^2: XML::LibXML - How to extract an element including the elements within?
by haukex (Archbishop) on May 09, 2019 at 20:46 UTC
    $string =~ s/^<[^>]+>//; $string =~ s/<[^>]+>$//;

    No, please don't use regular expressions to parse XML...

    An alternative to what poj showed with XML::LibXML::DocumentFragment:

    use warnings;
    use strict;
    use XML::LibXML;

    my $doc = XML::LibXML->load_xml(string =>
        q{<doc><text>From mobile, <ph1 i="1" type="33" x="1"/>dial<ph2 i="1"/> this number:</text></doc>});

    for my $node ($doc->findnodes('/doc/text')) {
        my $frag = $doc->createDocumentFragment;
        $frag->appendChild($_->cloneNode(1)) for $node->childNodes;
        print $frag, "\n";
    }
    __END__
    From mobile, <ph1 i="1" type="33" x="1"/>dial<ph2 i="1"/> this number:
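
A shorter variant on the same idea, offered only as a sketch: instead of cloning the children into a fragment, serialize each child node with toString and join the results. This is not necessarily what poj posted, and the helper name inner_xml is made up for illustration. It assumes the same sample document as above.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (the name is an assumption, not part of XML::LibXML):
# serialize every child of $node and concatenate the pieces, which gives
# the element's content without its own start and end tags.
sub inner_xml {
    my ($node) = @_;
    return join '', map { $_->toString } $node->childNodes;
}

my $doc = XML::LibXML->load_xml(string =>
    q{<doc><text>From mobile, <ph1 i="1" type="33" x="1"/>dial<ph2 i="1"/> this number:</text></doc>});

for my $node ($doc->findnodes('/doc/text')) {
    print inner_xml($node), "\n";
}
```

Like the fragment approach, this leaves all serialization to the parser, so no regular expression ever touches the markup.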

      Hello haukex

      I agree with you about the use of regular expressions in this case. I was trying to avoid them, and I got two good working solutions. I preferred the one from poj because it's simpler, but I'm sure that if I explore your solution further and understand it better, I will find ways to solve other issues that I haven't run into yet.

      Have a great day too!

      TA

Re^2: XML::LibXML - How to extract an element including the elements within?
by TravelAddict (Acolyte) on May 10, 2019 at 19:37 UTC

    Thanks tangent

    I was actually thinking about doing what you suggest, except that I had this in mind: $string =~ s{^(.*?)<text>(.*?)</text>(.*?)$}{$2}sm;

    I finally decided to go with the solution that poj suggested; it looks quite clean.

    TA
