in reply to Re: Re: Text to XML
in thread Text to XML
First a complaint: if you really want help, you have to help me too. Please check your data before posting. I nearly gave up on writing the code because your input and your output are full of typos: the numbers in the list items don't match, the tag names are inconsistent, and so is the indenting. This made it really difficult for me to get the test code to work. So please, be more considerate next time.
With this out of my system ;--( here is the code:
#!/usr/bin/perl -w

use strict;

use XML::Twig;
use Test::More tests => 1;

my( $input, $expected);
{
    local $/= "\n\n";                  # the two DATA records are separated by a blank line
    $input= <DATA>;
    ($expected= <DATA>)=~ s{^\s*}{};
}

my $t= XML::Twig->new(
           twig_handlers => {
               nl => sub { process_list( numbered   => @_); },
               pl => sub { process_list( plain      => @_); },
               ul => sub { process_list( unnumbered => @_); },
           },
           pretty_print => 'indented',
       )
       ->parse( $input);

$t->set_indent( ' ' x 4); # if you really want 4 space indents

my $result= $t->sprint;

is( $result, $expected, "test lists");

sub process_list {
    my( $type, $t, $list)= @_;

    $list->set_tag( 'list')
         ->set_att( type => $type);

    foreach my $child ( $list->children) {
        if( $child->is_text) {
            $child->mark( qr/^(.+?)\s*$/m, 'listitem');
        }
        else {
            $child->wrap_in( 'listitem');
        }
    }

    # you need this for the pretty printing to work, or the
    # empty text elements left by mark will mess up XML::Twig
    # this is a bug, I will see how best to fix it in the next version
    foreach my $child ( $list->children) {
        $child->delete if( $child->text=~ m{^\s*$});
    }
}

__DATA__
<nl>number list 1
number list 2
<ul>unnumbered list1
unnumbered list 2
<pl>plain list 1
plain list 2
<nl>numbered list 1
numbered list 2</nl>
</pl>
</ul>
</nl>

<list type="numbered">
    <listitem>number list 1</listitem>
    <listitem>number list 2</listitem>
    <listitem>
        <list type="unnumbered">
            <listitem>unnumbered list1</listitem>
            <listitem>unnumbered list 2</listitem>
            <listitem>
                <list type="plain">
                    <listitem>plain list 1</listitem>
                    <listitem>plain list 2</listitem>
                    <listitem>
                        <list type="numbered">
                            <listitem>numbered list 1</listitem>
                            <listitem>numbered list 2</listitem>
                        </list>
                    </listitem>
                </list>
            </listitem>
        </list>
    </listitem>
</list>
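If the mark step is the confusing part, here is a stripped-down sketch of just that bit on a single flat list, with no handlers and no nesting. The helper name split_lines_into_items and the sample input are made up for illustration; the XML::Twig calls (mark, is_text, children, delete, sprint) are the same ones used in the code above. I have left pretty printing off, so the exact whitespace of the output may differ from the test above.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not part of XML::Twig): turns each line of text
# directly under the root element into a <listitem> element.
sub split_lines_into_items {
    my( $xml)= @_;
    my $t= XML::Twig->new->parse( $xml);
    my $list= $t->root;

    # mark wraps whatever the capturing group matches in a new element;
    # with /m the group captures one line at a time
    foreach my $child ( $list->children) {
        $child->mark( qr/^(.+?)\s*$/m, 'listitem') if( $child->is_text);
    }

    # drop the whitespace-only text elements that mark leaves behind
    foreach my $child ( $list->children) {
        $child->delete if( $child->text=~ m{^\s*$});
    }

    return $t->sprint;
}

my $result= split_lines_into_items( "<ul>first item\nsecond item</ul>");
print "$result\n";
```

Once this makes sense, the full version is the same thing called from a handler for each list tag, plus wrap_in for the children that are already elements (the nested lists).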