"I've managed to get this working..."
That, richill, I'd consider a result! Make a backup and lock it in the safe. :-)
"I'm worried that creating a new XML::Simple() xml object for every line is a bit wasteful..."
Perhaps consider parsing the XML once and storing the data in a hash, keyed on the old link?
    $xml_hash{$LinkToPage} = $New_location;
Then look at every HTML file, check whether any of its links are in your lookup table, and make the change if necessary.
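A minimal sketch of the "parse once, keep a hash" idea. The XML layout here (a `links` root with `link` elements holding `old` and `new` children) is purely an assumption for illustration, since the thread does not show the real file; adjust the element names to match yours.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical XML layout: the real file will differ.
my $xml = <<'END_XML';
<links>
  <link><old>link1.html</old><new>linka.html</new></link>
  <link><old>link2.html</old><new>linkb.html</new></link>
</links>
END_XML

# Parse the XML once. ForceArray keeps 'link' an array ref even when
# there is only one entry; KeyAttr => [] turns off array folding.
my $data = XMLin($xml, ForceArray => ['link'], KeyAttr => []);

# Build the lookup table: old link => new link.
my %xml_hash = map { $_->{old} => $_->{new} } @{ $data->{link} };

print "$_ => $xml_hash{$_}\n" for sort keys %xml_hash;
```

Once `%xml_hash` is built, every HTML file can be checked against it without touching the XML again.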
For changing the HTML I would consider a parser. There are many and the one I frequently use is HTML::TokeParser::Simple. Have a look and get back to us if you need a hand.
update: added example of using a parser.
    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use HTML::TokeParser::Simple;

    # old link => new link (in practice, filled from the XML)
    my %xml_hash = (
        'link1.html' => 'linka.html',
        'link2.html' => 'linkb.html',
    );

    my $html_file = 'links.html';
    my $p = HTML::TokeParser::Simple->new($html_file)
        or die "couldn't parse $html_file";

    my $new_html = '';
    while (my $t = $p->get_token){
        if ($t->is_start_tag('a')){
            my $href = $t->get_attr('href');
            # skip anchors that have no href attribute
            if (defined $href and exists $xml_hash{$href}){
                $t->set_attr('href', $xml_hash{$href});
            }
        }
        # every token, changed or not, goes back out as it came in
        $new_html .= $t->as_is;
    }
    print "$new_html\n";

input html:
    <html>
    <head>
    <title>links</title>
    </head>
    <body>
    <p>links</p>
    <a href="link1.html">link1</a>
    <a href="link2.html">link2</a>
    </body>
    </html>

output:
    <html>
    <head>
    <title>links</title>
    </head>
    <body>
    <p>links</p>
    <a href="linka.html">link1</a>
    <a href="linkb.html">link2</a>
    </body>
    </html>
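To run the same rewrite over many files, the loop could be wrapped in a sub that takes the HTML as a string plus a reference to the lookup hash. This is only a sketch: `rewrite_links` is a made-up name, and the sample HTML and hash are stand-ins for your real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper: returns the HTML with any mapped hrefs rewritten.
sub rewrite_links {
    my ($html, $map) = @_;
    my $p = HTML::TokeParser::Simple->new(string => $html)
        or die "couldn't parse HTML string";
    my $out = '';
    while (my $t = $p->get_token) {
        if ($t->is_start_tag('a')) {
            my $href = $t->get_attr('href');
            # only touch anchors whose href is in the lookup table
            $t->set_attr('href', $map->{$href})
                if defined $href && exists $map->{$href};
        }
        $out .= $t->as_is;
    }
    return $out;
}

# Stand-in data for illustration.
my %xml_hash = ('link1.html' => 'linka.html');
my $sample   = '<a href="link1.html">link1</a> <a href="other.html">other</a>';

my $result = rewrite_links($sample, \%xml_hash);
print "$result\n";
```

For real files you would read each one into a string, pass it through the sub, and write the result to a new file, keeping the originals until you have checked the output.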
In reply to Re: using xml and perl to perform a search and replace on html files
by wfsp
in thread using xml and perl to perform a search and replace on html files
by richill