What's left is the classic "longest common substring" problem: finding the longest substrings shared by two pieces of text. There was a discussion of that recently - let me see if I can find the thread...
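
In the meantime, here is a minimal sketch of the textbook dynamic-programming approach to that problem. It is not taken from the thread mentioned above, and the function name `longest_common_substring` is just an illustrative choice. It returns only one longest match and runs in time proportional to the product of the two lengths, so for whole web pages a suffix-tree or suffix-automaton approach would probably scale better.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest substring that occurs in both a and b.

    Illustrative sketch only: classic O(len(a) * len(b)) dynamic programming,
    keeping just two rows of the table to save memory.
    """
    best_len = 0  # length of the longest match found so far
    best_end = 0  # index in a just past the end of that match
    prev = [0] * (len(b) + 1)  # match lengths for the previous character of a

    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr

    return a[best_end - best_len:best_end]
```

To get the next-longest shared chunks as well, one option would be to cut the match out of both texts and run the function again on the pieces, repeating until the matches become too short to be interesting.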
In reply to Re: How would you extract *content* from websites?
by TedPride
in thread How would you extract *content* from websites?
by BUU