Hi

I'm doing some XML again after a long time and trying out XML::Twig.

Below is example code running on the XML of a node by haukex.

I was looking for a more generic way than writing a handler for each tag and found the ->simplify method, which looks good enough for that task. (Yeah, I know XML::Simple is evil, but so, it seems, is the monastery's output ;-p )

use strict;
use warnings;

use Data::Dump qw/pp dd/;

# slurp the sample XML from the __DATA__ section
my $data= join "", <DATA>;

use XML::Twig;

# append a newline to every print
$\="\n";

print "=== HANDLER:\n";

# variant 1: one handler per element of interest
my $twig=XML::Twig->new(
    twig_handlers => {
        'field[@name="doctext"]' => sub { print $_->gi,"Post: ",$_->child_text(0) },
        'author' => sub {
            print "ID: ", $_->att("id");
            print "Name: ", $_->child_trimmed_text(0);
        },
    },
);

$twig->parse($data);

print "=== SIMPLIFIED:\n";

# variant 2: generic dump of the whole tree
$twig=XML::Twig->new();
print pp $twig->parse( $data)->simplify();

__DATA__
<?xml version="1.0" encoding="Windows-1252"?>
<node id="11100665" title="Re^5: What does $_ = qq~&quot;$_&quot;~ do?" created="2019-05-28 16:28:57" updated="2019-05-28 16:28:57">
<type id="11">
note</type>
<author id="830549">
haukex</author>
<data>
<field name="doctext">
&lt;p&gt;More fun facts! I once wrote a script to search a word list for words that make valid regexen which convert one valid word into another.&lt;/p&gt;
&lt;c&gt;
$ perl -le 'print bangs =~s engender'
bands
$ perl -le 'print halved =~s avatar'
halted
$ perl -le 'print stove =~s evener'
stone
&lt;/c&gt;
</field>
<field name="root_node">
11100593</field>
<field name="parent_node">
11100640</field>
<field name="reputation">
21</field>
</data>
</node>

What I don't like are the leading newlines in many content fields, as in content => "\nhaukex".

=== HANDLER:
ID: 830549
Name: haukex
fieldPost:
<p>More fun facts! I once wrote a script to search a word list for words that make valid regexen which convert one valid word into another.</p>
<c>
$ perl -le 'print bangs =~s engender'
bands
$ perl -le 'print halved =~s avatar'
halted
$ perl -le 'print stove =~s evener'
stone
</c>
=== SIMPLIFIED:
{
  author => { 830549 => { content => "\nhaukex" } },
  created => "2019-05-28 16:28:57",
  data => {
    field => {
      doctext => {
        content => "\n<p>More fun facts! I once wrote a script to search a word list for words that make valid regexen which convert one valid word into another.</p>\n<c>\n\$ perl -le 'print bangs =~s engender'\nbands\n\$ perl -le 'print halved =~s avatar'\nhalted\n\$ perl -le 'print stove =~s evener'\nstone\n</c>\n",
      },
      parent_node => { content => "\n11100640" },
      reputation => { content => "\n21" },
      root_node => { content => "\n11100593" },
    },
  },
  title => "Re^5: What does \$_ = qq~\"\$_\"~ do?",
  type => { 11 => { content => "\nnote" } },
  updated => "2019-05-28 16:28:57",
}

I couldn't find an option for ->simplify(%options) to trim the content.

I had to use child_trimmed_text(0) when writing handlers....
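
One possible workaround, sketched here as an assumption rather than anything XML::Twig provides: post-process the structure returned by ->simplify and trim every string in it. The helper trim_values below is hypothetical, uses only core Perl, and is shown on a hand-written excerpt of the dump above, so it runs without XML::Twig installed.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of XML::Twig): recursively strips leading
# and trailing whitespace from every plain string in a nested structure
# of hashes and arrays. The structure is modified in place.
sub trim_values {
    my ($node) = @_;
    if ( ref $node eq 'HASH' ) {
        # values() yields aliases, so assigning to $_ updates the hash
        $_ = trim_values($_) for values %$node;
    }
    elsif ( ref $node eq 'ARRAY' ) {
        $_ = trim_values($_) for @$node;
    }
    elsif ( defined $node && !ref $node ) {
        $node =~ s/\A\s+|\s+\z//g;
    }
    return $node;
}

# Hand-written excerpt of the simplified structure shown above
my $simplified = {
    author => { 830549 => { content => "\nhaukex" } },
    type   => { 11     => { content => "\nnote" } },
    data   => { field  => { reputation => { content => "\n21" } } },
};

trim_values($simplified);

print $simplified->{author}{830549}{content}, "\n";    # prints "haukex"
```

In the script above this would presumably be called as pp trim_values( $twig->parse($data)->simplify ). Note that it also trims the outer whitespace of the doctext field, which may or may not be wanted.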

Question:

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery FootballPerl is like chess, only without the dice


In reply to XML::Twig and the monasteries XML by LanX
