Re^3: simple regex help

I've used a stack to keep track of opening/closing li tags.

#!/usr/bin/perl

use strict;
use warnings;
use HTML::TokeParser::Simple;

my $html = do{local $/; <DATA>};

my $p = HTML::TokeParser::Simple->new(\$html)
 or die "can't parse string: $!\n";

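# Skip everything up to and including the title's closing </span>.
while (my $t = $p->get_token){
  last if $t->is_end_tag('span');
}

my $match = '';
my @li_stack;   # one entry per nested <li> that is still open

while (my $t = $p->get_token){
  if ($t->is_start_tag('li')){
    push @li_stack, 'li';
  }
  if ($t->is_end_tag('li')){
    if (@li_stack){
      # closes a nested <li>
      pop @li_stack;
    }
    else{
      # nothing nested is open, so this </li> closes the outer item
      last;
    }
  }
  $match .= $t->as_is;
}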

print "$match\n";

__DATA__
<li><span class="title">Title</span><ul><li>one</li><li>two</li></ul> MATCH HERE </li>

output:

<ul><li>one</li><li>two</li></ul> MATCH HERE

update:
Added output.

update 2:
See ikegami's reply below.

Re^4: simple regex help by ikegami (Patriarch) on Apr 18, 2007 at 17:52 UTC
__DATA__
<li><span class="title">Title</span><ul><li>one</ul> MATCH HERE </li> this shouldn't match

outputs

<ul><li>one</ul> MATCH HERE </li> this shouldn't match

instead of the expected

<ul><li>one</ul> MATCH HERE
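
The failure above comes from the inner <li> never being closed explicitly: its end tag is implied by </ul>, so the stack is never popped. One possible way to cope, sketched here and not taken from the thread, is to track open ul elements alongside li and let </ul> (or a sibling <li>) close whatever is still open. The helper name content_after_title and the @open stack are hypothetical, and the sketch assumes only ul/li nesting (ol and other elements with optional end tags are not handled).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper (not from the original post): returns the markup
# between the title's </span> and the end of the outer <li>.
sub content_after_title {
    my ($html) = @_;
    my $p = HTML::TokeParser::Simple->new(\$html)
        or die "can't parse string: $!\n";

    # Skip everything up to and including the title's closing </span>.
    while (my $t = $p->get_token) {
        last if $t->is_end_tag('span');
    }

    my $match = '';
    my @open;    # open 'ul' and 'li' elements, innermost last

    while (my $t = $p->get_token) {
        if ($t->is_start_tag('ul')) {
            push @open, 'ul';
        }
        elsif ($t->is_start_tag('li')) {
            # With nothing nested open, a new <li> is a sibling of the
            # outer item and implicitly ends it.
            last unless @open;
            # Otherwise it implicitly closes a sibling <li> left open.
            pop @open if $open[-1] eq 'li';
            push @open, 'li';
        }
        elsif ($t->is_end_tag('ul')) {
            # </ul> implicitly closes any <li> still open inside the list.
            while (@open) {
                last if pop(@open) eq 'ul';
            }
        }
        elsif ($t->is_end_tag('li')) {
            last unless @open;                 # closes the outer <li>
            pop @open if $open[-1] eq 'li';    # closes a nested <li>
        }
        $match .= $t->as_is;
    }
    return $match;
}

print content_after_title(
    q{<li><span class="title">Title</span><ul><li>one</ul> MATCH HERE </li> this shouldn't match}
), "\n";
```

With the input above this should stop at the outer </li> and leave out the trailing text. It is still a hand-rolled approximation of the HTML nesting rules, so it only narrows the problem.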
Re^5: simple regex help by Fletch (Bishop) on Apr 18, 2007 at 17:59 UTC
~~And that, class, is why all sane people use a properly tested HTML parser and don't try to roll their own with regexen . . .~~ Update: Oh, he is. Never mind me . . . %) Perhaps this is why sane people avoid parsing HTML if they can. :)
Re^6: simple regex help by ikegami (Patriarch) on Apr 18, 2007 at 19:48 UTC
According to Wikipedia, "In computer science and linguistics, parsing (more formally syntax analysis) is the process of analyzing a sequence of tokens to determine its grammatical structure with respect to a given formal grammar."

While using a tokenizer is a step in the right direction, he did roll his own parser (the `while` loop).