in reply to Switching out characters inside links
This is total overkill for what you want, but it's a nice approach for converting HTML to XHTML, which is the subtext of your original question.
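
    use strict;
    use warnings;
    use XML::LibXML;

    # Slurp the whole __DATA__ section into one string.
    my $raw = do { local $/; <DATA> };

    # Recover from sloppy HTML without printing parser warnings.
    my $parser = XML::LibXML->new;
    $parser->recover_silently(1);

    # Wrap the fragment in a <div> so it can be found again after parsing.
    my $doc = $parser->parse_html_string("<div>$raw</div>");
    my $wrapper = [ $doc->findnodes("//body/div") ]->[0];

    # Serialize only the wrapper's children, not the wrapper itself.
    print $_->serialize(1) for $wrapper->childNodes;

    exit 0;

    __DATA__
    All content is in a variable like this...<br>
    <br>
    <a href="http://www.somedomain.com/index.cgi?page=home&var=1&no=2&so=forth&so=on">
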
You end up with:

    All content is in a variable like this...<br/><br/><a href="http://www.somedomain.com/index.cgi?page=home&var=1&no=2&so=forth&so=on">
Re^2: Switching out characters inside links
by ikegami (Patriarch) on Nov 27, 2009 at 02:52 UTC
by Your Mother (Archbishop) on Nov 27, 2009 at 03:58 UTC