Unreliably {grin}. I realize you are probably extracting from a consistent source, but web pages don't necessarily have \n delimiters in logical places, and some sites may have \r in them as well. I'd suggest using something like an HTML tokenizer. It will break the page up into tokens, which consist of start tags, end tags, and text (and a few other things you probably don't care about). You can easily grab and discard tokens until you get to the one that matches your criterion, then start processing from there.
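
To make the grab-and-discard idea concrete, here is a minimal sketch. It uses Python's standard-library html.parser purely as a stand-in tokenizer, which is my assumption and not necessarily the module meant above; the TokenCollector class, the text_after helper, and the sample page are all hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class TokenCollector(HTMLParser):
    """Record the page as a flat list of (kind, value) tokens."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("start", tag))

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))

    def handle_data(self, data):
        # Whitespace-only runs (stray \n or \r between tags) carry no content.
        if data.strip():
            self.tokens.append(("text", data.strip()))


def text_after(page, marker):
    """Return the text tokens that follow the first token equal to marker."""
    collector = TokenCollector()
    collector.feed(page)
    collector.close()
    tokens = iter(collector.tokens)
    # Grab and discard tokens until the one that matches the criterion.
    for token in tokens:
        if token == marker:
            break
    # Whatever is left in the iterator comes after the marker.
    return [value for kind, value in tokens if kind == "text"]


# Hypothetical page: line breaks fall in arbitrary places, including \r.
page = (
    "<html><body><p>skip\r\nme</p>"
    "<table><tr><td>a</td>\r<td>b\n</td></tr></table></body></html>"
)
print(text_after(page, ("start", "table")))
```

Because the matching is done on tokens rather than lines, it does not matter where the \n or \r characters happen to fall in the page. If the marker never appears, this sketch simply returns an empty list.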