in reply to Extracting Text Using Regular Expressions Problem
> What I can't seem to do now is extract the next block of text below the second comments box between "===Comments===" and "=Another Section=", as the start tag is already found earlier in the article.

First of all, your sample text doesn't contain "=Another Section=". It does contain "=Another section=", but for the (non-folding) regexp engine, 's' is as different from 'S' as '!' is.
Second, I'm not sure I see what your problem is here. Could you elaborate?
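To illustrate the case point, here is a minimal sketch. The `$text` string is a made-up stand-in for the sample page, not the original data; only the capitalisation of the heading matters.

```perl
use strict;
use warnings;

# Hypothetical sample text; only the heading's capitalisation matters here.
my $text = "===Comments===\nsome comment text\n=Another section=\nmore text\n";

# Case-sensitive match: 'S' in the pattern does not match 's' in the text.
my $exact = $text =~ /=Another Section=/ ? 1 : 0;

# With the /i modifier the engine folds case, so the heading is found.
my $folded = $text =~ /=Another Section=/i ? 1 : 0;

print "case-sensitive: $exact, with /i: $folded\n";
```

Either spell the heading in the pattern exactly as it appears in the text, or add /i if the capitalisation can vary.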
> As a secondary point. I also need to extract all the text after the "=Aditional Notes=" section. The problem is I do not know what the end tag for this will be, as it will be the last word used here (i.e. the last character in the webpage).

Is there an end tag? That is, do you need to match everything up to, but not including, the terminating character? /(?s:.)/ matches any character, so you could use /(?s:.)$/ as your "end tag" (use \z instead of $ if the text may end in a newline, since $ also matches just before a final newline).
Of course, if you just want to match everything after "=Aditional Notes=", then just use /=Aditional Notes=(?s:.*)/.
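Both variants can be sketched as follows. The `$page` string is invented for the example, the capture groups are added so the matched text can be inspected, and \z is used in place of $ so that a trailing newline would not change the result.

```perl
use strict;
use warnings;

# Hypothetical page content; the heading spelling follows the original question.
my $page = "=Intro=\nfirst part\n=Aditional Notes=\nline one\nline two";

# (?s:.*) lets '.' match newlines too, so the capture runs to the end of the string.
my ($notes) = $page =~ /=Aditional Notes=((?s:.*))/;

# To stop before the terminating character, leave one character
# for (?s:.)\z to consume after the capture.
my ($trimmed) = $page =~ /=Aditional Notes=((?s:.*))(?s:.)\z/;

print "[$notes]\n[$trimmed]\n";
```

Here `$notes` holds everything after the heading, and `$trimmed` holds the same text minus its final character.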