in reply to lookbehind

One thing chromatic might do is strip the HTML down to a manageable chunk with judicious use of split. Assuming you can pull the table out (easy enough to do, since "These nodes all have stuff by user" is a good phrase to anchor on), you can split on the <TR bgcolor= bit, resulting in an array of rows to parse.
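
Here's a rough sketch of that split step, written in Python rather than Perl just to keep it short. The sample page, the `MARKER` constant and the `table_rows` helper are all made up for illustration, and I'm assuming the table ends at a literal `</TABLE>`; the real markup may well differ.

```python
# Hypothetical sample page; the real markup may differ.
page = (
    "<html><body><p>intro</p>"
    "These nodes all have stuff by user"
    "<TABLE>"
    '<TR bgcolor="#eeeeee"><TD><A HREF="/?node_id=1">Re: first node</A></TD></TR>'
    '<TR bgcolor="#ffffff"><TD><A HREF="/?node_id=2">Second node</A></TD></TR>'
    "</TABLE></body></html>"
)

# The phrase used to locate the table (taken from the page text).
MARKER = "These nodes all have stuff by user"


def table_rows(html):
    """Return the chunks that follow each '<TR bgcolor=' inside the table."""
    # Keep only what follows the marker phrase...
    _, _, rest = html.partition(MARKER)
    # ...and cut it off at the end of the table (assumed to be '</TABLE>').
    table, _, _ = rest.partition("</TABLE>")
    # Split on the row opener; the first piece is whatever came before
    # the first row, so drop it.
    return table.split("<TR bgcolor=")[1:]


rows = table_rows(page)
```

Note that the split is case sensitive, so a page that writes `<tr bgcolor=` would need a case-insensitive split instead. If the marker phrase is missing, the helper just returns an empty list.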

Pull off the HREF bit, everything up to the closing angle bracket, and you'll have the node title at the start of what's left. The regex there checking for re (case insensitive) is exactly what I'd use.
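
A minimal sketch of that step, again in Python for brevity. The sample rows, the `node_title` helper and the `IS_REPLY` pattern are my own guesses for illustration; in particular, `IS_REPLY` is only an assumed form of the "re" check, not the regex from the original post.

```python
import re

# Hypothetical rows, as they might look after splitting on '<TR bgcolor='.
rows = [
    '"#eeeeee"><TD><A HREF="/?node_id=1">Re: first node</A></TD></TR>',
    '"#ffffff"><TD><A HREF="/?node_id=2">Second node</A></TD></TR>',
    '"#eeeeee"><TD>no link here</TD></TR>',
]

# Matches from 'href=' through the '>' that closes the anchor tag.
HREF_BIT = re.compile(r"href=[^>]*>", re.IGNORECASE)

# Assumed form of the check: title starts with "re", any case.
IS_REPLY = re.compile(r"^re\b", re.IGNORECASE)


def node_title(row):
    """Return the text right after the HREF tag, or None if there is no link."""
    match = HREF_BIT.search(row)
    if match is None:
        return None
    # Everything after the closing angle bracket starts with the title;
    # cut it off at the next tag.
    rest = row[match.end():]
    return rest.split("<", 1)[0].strip()


titles = [t for t in (node_title(r) for r in rows) if t is not None]
replies = [t for t in titles if IS_REPLY.search(t)]
```

This will trip over an HREF value that itself contains a `>`, which seems unlikely for node links but is worth knowing about.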

This does require you to pull in the whole page at once, but it won't be big enough to eat up a lot of memory.