Dear omniscient monks,

I have some HTML/TEI-like data and want to convert it to XML. This works pretty well for some files, but not for all of them. Here is my code:
    # pragma
    use strict;
    use warnings;

    # modules
    use XML::Simple;
    use XML::Tidy;
    use Data::Dumper;
    use Data::Diver qw( Dive DiveRef DiveError );
    use HTML::TreeBuilder;
    use XML::Tidy::Tiny;

    # little helper
    use constant false => 0;
    use constant true  => 1;

    ...

    # get instance of treebuilder
    my $root = HTML::TreeBuilder->new();

    # configure treebuilder: keep unknown (e.g. TEI) elements in the tree
    $root->ignore_unknown( false );

    # dump data to the treebuilder
    $root->parse( $fileData );
    $root->eof();    # tell the parser that the input is complete

    # get name for target file
    my $target = $file;
    $target =~ s/$fileExtension$/xml/;

    # open output filehandle
    open( $FH, '>', $target ) or die "Cannot open $target: $!";

    # configure output
    binmode $FH, ":utf8";

    # ERROR HERE (script.pl line 208):
    my $data = $root->guts()->as_XML();
    print $FH xml_tidy( $data );
    close $FH;

    ...
For some files it dies with:

    caption has an invalid attribute name 'n' at script.pl line 208

I substituted every 'n' in the file but still got the same error, so the 'n' does not seem to be the cause. I don't know what is going on here. $root->guts() is okay; the problem is all in ->as_XML() :-((
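To narrow this down, it may help to list the attribute names on every caption start tag in the raw input, since the message names both the element (caption) and the attribute (n). The following is only a diagnostic sketch using core Perl: attr_names_for_tag is a hypothetical helper, not part of HTML::TreeBuilder, and the sample markup is invented for illustration. One unconfirmed guess is that the data carries a TEI-style attribute literally named n (as in n="3"), and that some HTML::Element releases reject one-character attribute names in as_XML(); please verify that against the installed version.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not a library function):
# returns one array ref of attribute names per start tag of $tag in $html.
# It is a rough scan, so a '>' inside an attribute value will confuse it.
sub attr_names_for_tag {
    my ( $html, $tag ) = @_;
    my @found;

    # match each start tag of the requested element, case-insensitively
    while ( $html =~ /<\Q$tag\E(?=[\s\/>])([^>]*)>/gi ) {
        my $attrs = $1;
        my @names;

        # walk the attribute list: name, optionally followed by =value
        while ( $attrs =~ /\G\s*([^\s=\/"']+)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"']+))?/gc ) {
            push @names, $1;
        }
        push @found, [@names];
    }
    return @found;
}

# invented sample input, standing in for $fileData
my $sample = '<table><caption n="3" rend="it">Fig</caption></table>';

for my $names ( attr_names_for_tag( $sample, 'caption' ) ) {
    print "caption attributes: @$names\n";    # prints "caption attributes: n rend"
}
```

If the listing shows an attribute called n, then replacing the letter n in the text would not help; the attribute itself would have to be renamed or removed before calling as_XML().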

kindly, perlig

$perlig =~ s/pec/cep/g if 'errors expected';

In reply to HTML::TreeBuilder, HTML::Element, as_XML() by AlexTape
