in reply to Re^3: simple regex help
in thread simple regex help

__DATA__
<li><span class="title">Title</span><ul><li>one</ul> MATCH HERE </li> this shouldn't match

outputs

<ul><li>one</ul> MATCH HERE </li> this shouldn't match

instead of the expected

<ul><li>one</ul> MATCH HERE

Re^5: simple regex help
by Fletch (Bishop) on Apr 18, 2007 at 17:59 UTC

    And that, class, is why all sane people use a properly tested HTML parser and don't try to roll their own with regexen . . .

    Update: Oh, he is. Never mind me . . . %) Perhaps this is why sane people avoid parsing HTML at all if they can help it. :)

      According to Wikipedia:

      "In computer science and linguistics, parsing (more formally syntax analysis) is the process of analyzing a sequence of tokens to determine its grammatical structure with respect to a given formal grammar."

      While using a tokenizer is a step in the right direction, he did roll his own parser (the while loop).
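
For what it's worth, here is a minimal sketch of the "tokenizer plus your own loop" idea applied to the sample data above. It is not the code from this thread: it is written in Python with the standard-library html.parser as the tokenizer, and the names ItemBodyExtractor and extract_item_body are made up for illustration. It assumes the title is a single un-nested span and that only the first top-level <li> is wanted. The point it tries to show is that an end tag has to close everything still open above it (the inner <li> has no </li> of its own), which a plain depth counter on <li> misses.

```python
from html.parser import HTMLParser


class ItemBodyExtractor(HTMLParser):
    """Collect the markup that follows the title span inside the first <li>.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []         # elements that are currently open
        self.in_title = False   # inside <span class="title">
        self.capturing = False  # past the title, still inside the outer <li>
        self.done = False       # outer <li> has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.stack.append(tag)
        if self.capturing:
            self.parts.append(self.get_starttag_text())
        elif tag == "span" and ("class", "title") in attrs:
            self.in_title = True

    def handle_endtag(self, tag):
        if self.done or tag not in self.stack:
            return  # ignore stray end tags
        # Pop back to the matching open tag. Anything above it, such as
        # an <li> that never got its </li>, is treated as implicitly closed.
        while self.stack.pop() != tag:
            pass
        if not self.stack:
            self.done = True            # the outer <li> just closed
        elif self.capturing:
            self.parts.append(f"</{tag}>")
        elif self.in_title and tag == "span":
            self.in_title = False
            self.capturing = True       # start collecting after the title

    def handle_data(self, data):
        if self.capturing and not self.done:
            self.parts.append(data)


def extract_item_body(html):
    """Return the body of the first <li>, minus its title span."""
    parser = ItemBodyExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


sample = ('<li><span class="title">Title</span><ul><li>one</ul>'
          " MATCH HERE </li> this shouldn't match")
print(extract_item_body(sample))  # <ul><li>one</ul> MATCH HERE
```

Note that the stack and the "pop until it matches" loop are exactly the hand-rolled parser part; the library only supplies the tokens. Real-world HTML has more implicit-close rules than this handles, which is rather the argument for a tested tree-building parser.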