in reply to Search and replace in html

Anonymous Monk,
There is a plethora of modules on CPAN that could help you; I would suggest looking at the following search. Roll-your-own solutions with unknown data sources are likely to fail. With that said - let's assume your HTML is perfectly formatted and you want everything between the start and end HTML tags, including any other HTML tags.
#!/usr/bin/perl -w
use strict;

open (INPUT, "file") or die "Unable to open input : $!";
open (OUTPUT, ">output") or die "Unable to open output : $!";
select OUTPUT;   # print now goes to OUTPUT by default
$\ = "\n";       # append a newline to every print

my $foundstart;
while (<INPUT>) {
    chomp;
    # Skip everything before the opening <html> tag
    next unless ($foundstart || /<html *>/i);
    if (/<html *>/i && ! $foundstart) {
        # Keep only what follows the opening tag on this line
        $_ =~ s/^.*?<html *>(.*)$/$1/i;
        $foundstart++;
        next unless($_);
    }
    if ($_ =~ m|</html *>|i) {
        # Keep only what precedes the closing tag, then stop
        $_ =~ s|^(.*?)</html *>.*$|$1|i;
        print if($_);
        last;
    }
    print;
}
close INPUT;
close OUTPUT;

Cheers - L~R

Re: Re: Search and replace in html
by Anonymous Monk on May 09, 2003 at 01:16 UTC
    Thanks guys, that has started to get me on my way. I have now found that the files contain multiple html references. Would it simply be a matter of changing the if to a while?
      Anonymous Monk,
      No - you should not try to roll your own unless you are 100% sure of your data. That is what I was trying to point out. Follow tall_man's advice or find a module you like using the search I provided.
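
For the case where you really are 100% sure of your data, here is a minimal sketch of one way to pick up several <html> ... </html> sections from the same input. It is still a roll-your-own regex, so the same caveat applies. The helper name extract_html_sections and the sample string are made up for illustration, and the sketch assumes the whole file has already been read into one string and that the tags carry no attributes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the text found
# between each <html> ... </html> pair in $text, inner tags included.
# Call it in list context to get every section.
sub extract_html_sections {
    my ($text) = @_;
    # /g finds every pair, /i ignores case, /s lets . match newlines;
    # .*? is non-greedy, so each match stops at the nearest closing tag.
    return $text =~ m{<html *>(.*?)</html *>}gis;
}

# Made-up sample input with two sections and some surrounding junk
my $sample = "junk<html><p>one</p></html>junk\n<HTML>\n<p>two</p>\n</HTML>";
print "[$_]\n" for extract_html_sections($sample);
```

Anything the pattern does not anticipate (attributes on the tag, a missing closing tag, nested documents) will quietly give wrong results, which is why a parsing module is the safer route.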

      Cheers - L~R