in reply to Extracting Text Using Regular Expressions Problem
I'm also more than a bit suspicious of your data: Is the webpage from which you have "taken" the data a wiki or somesuch?
Or, assuming your data is accurately represented above, using...
=~ /(?:===Comments===(.*?)=Section \d=)|(?:===Comments===(.*?)=Another Section=)/sg
might work for multiple comment sections. Sorry, I hate to post untested code, but this is untested, due to press of time.
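For illustration, here is a minimal sketch of that idea, not a drop-in solution. The sample text in $data is invented for the example, so the real page may well differ. It folds the two alternatives into a single capture group, because in list context the two-group form returns an undef for whichever alternative did not match.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real page layout may differ.
my $data = <<'END';
===Comments===
first comment
=Section 1=
body text
===Comments===
second comment
=Another Section=
END

# One capture group, with the alternation moved to the terminator,
# so every match yields exactly one captured value.
# /s lets . match newlines; /g collects all matches in list context.
my @comments = $data =~ /===Comments===(.*?)=(?:Section \d|Another Section)=/sg;

# Trim leading and trailing whitespace from each captured block.
s/^\s+|\s+$//g for @comments;

print "$_\n" for @comments;
```

If the two-group pattern is kept as written, something like grep { defined } over the returned list would be needed to drop the empty slots.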
As to your problem re the end of the page: the last thing the server knows about is </html>, so the last character will be >, although that will leave you with some markup tags to remove (possibly "</p></body></html>").
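One possible way to trim those leftover tags, sketched under the assumption that the capture ends in nothing but a run of simple closing tags. The string in $tail is made up for the example, and anything more complicated probably calls for a real HTML parser rather than a regex.

```perl
use strict;
use warnings;

# Hypothetical tail of a captured comment, ending at the end of the page.
my $tail = "last comment text</p></body></html>";

# Copy $tail into $clean, then strip any run of closing tags
# (with optional whitespace between them) anchored at the very end.
(my $clean = $tail) =~ s{(?:\s*</\w+>)+\s*\z}{};

print "$clean\n";
```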