eq does not work with regular expressions; it only tests for an exact string match:
    my $a = "foo";
    my $b = "bar";
    my $c = ".*";

    print "eq" if ($a eq $a);     # prints "eq"
    print "ne" if ($a ne $b);     # prints "ne"
    print "ne" if ($a ne $a);     # prints nothing
    print "RE" if $a =~ /$c/;     # prints "RE"
    print "RE" if $a =~ /f.*/;    # prints "RE"
What you probably wanted was something along these lines (tested :) ):
    #!/usr/bin/perl -w
    use strict;

    my $filename = $ARGV[0] || "temp.html";
    my $open;
    undef $/;    # undefine the input record separator
    open( FILE, $filename ) or die "Couldn't open $filename : $!\n";
    $open = <FILE>;    # This slurps the whole file into one scalar (instead of an array)
    close FILE;

    # I'll take a simplistic approach that assumes that
    # the only place where a ">" occurs is at the end of
    # a tag. This does fail when you have for example :
    #   <IMG src="less.png" alt="a > b">
    # which is valid HTML from what I know.
    # I also ignore scripts and comment handling.
    while ($open) {
        # Match leading text into $1 and (if a tag follows) the tag into $2:
        $open =~ s/^([^<]+)?(<[^>]+>)?//;
        # Nothing was removed (e.g. a "<" that is never closed):
        # stop instead of looping forever
        last unless defined $1 or defined $2;
        print "Text : $1\n" if $1;
        print "HTML: $2\n" if $2;
    };

    # the real meat of the code is the "s///;" line
    # it works as follows :
    # The two parenthesized parts capture stuff,
    # the first parentheses capture non-tagged text
    # the second parentheses capture text that is
    # within "<" and ">"
    # one or both of the parentheses are allowed to be empty
    # Everything that is found is deleted from the start of
    # the string.
    # repeat as long as there is stuff in the slurped line
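Since the question was about changing the text while leaving the tags alone, the same tokenizing idea could be wrapped into a small function, roughly like this. It is only a sketch under the same simplistic assumption (a ">" only ever ends a tag, no script or comment handling), and it walks the string with \G and /gc instead of deleting from the front. The name replace_in_text and the sample input are made up for illustration.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper (not from the original post): applies $callback to
# every run of text outside of tags and leaves the tags untouched.
# Same simplistic assumption as above: a ">" only ever ends a tag.
sub replace_in_text {
    my ( $html, $callback ) = @_;
    my $result = "";

    # Alternate between "a complete tag" ($1) and "a run of non-tag text" ($2).
    # A stray "<" that is never closed is handed over as plain text,
    # so every iteration consumes at least one character.
    while ( $html =~ /\G(?:(<[^>]+>)|([^<]+|<))/gc ) {
        if ( defined $1 ) {
            $result .= $1;                 # tag: copy verbatim
        }
        else {
            $result .= $callback->($2);    # text: let the callback change it
        }
    }
    return $result;
}

my $html = '<p class="foo">foo and <b>more foo</b></p>';
my $out  = replace_in_text( $html, sub { my $t = shift; $t =~ s/foo/bar/g; $t } );
print "$out\n";    # prints: <p class="foo">bar and <b>more bar</b></p>
```

Note that the "foo" inside the class attribute survives, because the callback never sees the tags. A text run containing a stray "<" reaches the callback in several pieces, so a replacement that has to span such a character will not match.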
Of course, everything above could probably be done more correctly with one of the HTML modules, like HTML::Parser, so you may want to take a look at them. takshaka has mentioned a previous discussion of this topic in which he posted a working example using HTML::Parser - a direct link is here.
For more information about regular expressions, read the perlre manpage.
In reply to Re: Search and replace everything except html tags
by Corion
in thread Search and replace everything except html tags
by thatguy