Yep, and it illustrates nicely why handling HTML with regexes is not recommended. Anyway, I think the following works:
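sub is_all_in_bold {
    my $title = shift;
    # Grab whatever sits between <title> and </title>.
    my ($title_content) = $title =~ m!<title>(.*?)</title>!s;
    return 0 unless defined $title_content;    # no <title> element found
    # Repeatedly delete innermost <b>...</b> pairs (no <b> or </b> inside).
    1 while $title_content =~ s#<b>(?:[^<>]+|(?!</?b>).)*</b>##s;
    print "$title_content\n";    # debug: whatever was left outside bold
    # Compare with '' so that a leftover "0" is not mistaken for empty.
    return $title_content eq '';
}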
while (<DATA>) {
    print;
    chomp;
    next unless /\S/;
    print is_all_in_bold($_) ? "It's all in bold\n" : "It's not all in bold\n";
}
__END__
<title><b>Meta‐Ethics and The Problem of Creeping Minimalism</b></title>
<title>Meta‐Ethics and <b>The Problem of Creeping</b> <b>Minimalism</b></title>
<title><b>foo<b>bar</b></b></title>
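
To see why the substitution has to run in a loop, here is a rough sketch that applies the same regex one match at a time and records the string after each pass. The helper name bold_stripping_passes is made up for illustration and is not part of the code above. It assumes the input is already the text between the title tags.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): apply the same
# substitution one match at a time and keep the string after each pass.
sub bold_stripping_passes {
    my ($content) = @_;
    my @passes = ($content);    # pass 0 is the untouched input
    # The negative lookahead (?!</?b>) stops a match from spanning another
    # <b> or </b>, so each pass can only remove an innermost pair.
    while ( $content =~ s#<b>(?:[^<>]+|(?!</?b>).)*</b>##s ) {
        push @passes, $content;
    }
    return @passes;
}

# Trace the nested example from the test data.
print "[$_]\n" for bold_stripping_passes('<b>foo<b>bar</b></b>');
```

For the nested case the first pass can only remove the inner <b>bar</b>, which leaves <b>foo</b>; the second pass removes that and leaves an empty string. A single s### without the loop would stop after the first pass and wrongly report leftover content.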
---
demerphq
First they ignore you, then they laugh at you, then they fight you, then you win.
-- Gandhi