in reply to XML::Simple problem, or How to convert HTML to Perl and then back again.
There are many modules in the HTML:: hierarchy on CPAN; HTML::TokeParser and HTML::TreeBuilder come to mind. Each handles the HTML document differently, so the right choice depends on how you want to access it. TokeParser, as the name implies, tokenizes the HTML into tags and text, which lets you make changes and print the document back out one token at a time. TreeBuilder converts your document into a tree whose nodes represent the nested elements.
HTH
Addendum: I was able to scrounge up a script I wrote that uses HTML::TokeParser::Simple to search a given HTML document for table/td/tr tags and remove their width attribute. It's not exactly what you are looking for, but it should give you a head start:
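#!/usr/bin/perl -w
use strict;
use HTML::TokeParser::Simple;

# The HTML file to process is the first command-line argument.
my $p = HTML::TokeParser::Simple->new(shift);

while ( my $token = $p->get_token ) {
    # Anchored so that only table, td and tr match (and not, say, "strong").
    $token->delete_attr('width') if $token->is_start_tag(qr/^t(?:able|d|r)$/);
    # as_is prints the token's (possibly modified) source text.
    print $token->as_is;
}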
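For comparison, here is a rough sketch of the same width-stripping job done with the tree approach, using HTML::TreeBuilder. The strip_widths helper and the sample markup are made up for illustration, not part of my original script, so treat it as a starting point rather than tested production code:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: parse an HTML string, remove the width attribute
# from every table/tr/td element, and return the re-serialized HTML.
sub strip_widths {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # look_down accepts a regex as the value to match against _tag.
    for my $el ( $tree->look_down( _tag => qr/^(?:table|tr|td)$/ ) ) {
        $el->attr( 'width', undef );    # an undef value deletes the attribute
    }

    # The empty hash ref asks as_HTML to emit every closing tag.
    my $out = $tree->as_HTML( undef, '  ', {} );
    $tree->delete;
    return $out;
}

print strip_widths('<table width="100"><tr><td width="50">x</td></tr></table>');
```

One difference worth knowing: TreeBuilder normalizes the document (it wraps fragments in html/head/body and re-indents), so the output will not be byte-for-byte the input minus the attribute, whereas the token-based loop prints everything else exactly as it found it.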