I see. So you delete everything from the <script> tag (inclusive) up to the first non-<script> tag (exclusive) that follows a </script> tag. Clever. But I really don't know whether this helps or not.
It will strip any text that follows the </script> (which may not matter if they only have <script> in the <head>), but cases like these probably do matter:
<html>
<head>
<script language="Javascript">
document.write("Don't forget your </script> tag! It's important!");
document.write("Even the <body> tag is important!");
</script>
</head>
<body>
This is just some text.
</body>
</html>
or
<html>
<head>
<script language="Javascript">
document.write("Don't forget your </script> tag! It's important!");
if (x<y) { alert("y > x") }
</script>
</head>
<body>
This is just some text.
</body>
</html>
You'd have to parse the JavaScript (at least to some extent) to be able to say whether a given </script> is meant to close the script or not.
Actually, I guess you'd only have to distinguish three states inside the JavaScript: "inside a single-quoted string", "inside a double-quoted string" and "elsewhere".
And you'd only treat a </script> as the closing tag when you are in the "elsewhere" state.
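Here is a rough sketch of that three-state idea in Python, just to show the shape of it. The function names (find_script_end, strip_scripts) are made up for illustration, and it makes simplifying assumptions: it handles backslash escapes inside strings, but it ignores JavaScript comments and regex literals, and it is not meant to mirror how a browser tokenizes the page.

```python
import re

# Matches an opening <script ...> tag, case-insensitively (simplified).
SCRIPT_OPEN = re.compile(r"<script\b[^>]*>", re.IGNORECASE)
CLOSE_TAG = "</script>"


def find_script_end(text, start):
    """Return the index of the first </script> outside a quoted string, or -1."""
    state = "elsewhere"  # the other two states are "single" and "double"
    i = start
    while i < len(text):
        ch = text[i]
        if state == "elsewhere":
            if ch == "'":
                state = "single"
            elif ch == '"':
                state = "double"
            elif text[i:i + len(CLOSE_TAG)].lower() == CLOSE_TAG:
                return i
        elif ch == "\\":
            i += 1  # skip the escaped character inside a string
        elif (state == "single" and ch == "'") or (state == "double" and ch == '"'):
            state = "elsewhere"
        i += 1
    return -1


def strip_scripts(html):
    """Remove each <script>...</script> block, trusting only unquoted closers."""
    out = []
    pos = 0
    while True:
        match = SCRIPT_OPEN.search(html, pos)
        if match is None:
            out.append(html[pos:])  # no more scripts: keep the rest
            break
        out.append(html[pos:match.start()])
        close_at = find_script_end(html, match.end())
        if close_at == -1:
            break  # unterminated script: drop everything after it
        pos = close_at + len(CLOSE_TAG)
    return "".join(out)
```

Fed the first example above, strip_scripts() should drop both document.write() lines together with the real closing tag, and leave the <body> and its text alone.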
Jenda