in reply to Re: A nice text processing question
in thread A nice text processing question

Very interesting solution. However, it doesn't solve the problem as stated. You are able to strip out all the tags, but that wasn't what was asked for. The task was to split the breadcrumbs apart, then repair any tags that were supposed to be paired but ended up separated by the split.

I haven't really used HTML::Parser very much, but I don't think it's as much a friend as you might think. The problem should've been stated as such:

  1. I need to split on some number of dashes
  2. foreach string created this way, I need to fill in the tags that might have been orphaned

    while (<DATA>) {
        my @tokens = split /-+/, $_;
        foreach my $token (@tokens) {
            $token = do_html_balancing($token);
            do_something_else_with_token($token);
        }
    }
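
For what it's worth, here is a rough sketch of what do_html_balancing might look like. The name comes from the loop above, but the body is purely my assumption, not anything from the original question. It uses a plain regex instead of HTML::Parser, treats every tag as a simple paired tag, and cannot recover the attributes of an opening tag that landed in a different token. It also does nothing about dashes inside attribute values, which split /-+/ would happily break apart.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original thread):
# close tags left open in this token, and re-open tags that are
# closed here without having been opened here.
sub do_html_balancing {
    my ($token) = @_;
    my @open;      # tags opened in this token and not yet closed
    my @missing;   # closing tags seen with no matching opener in this token

    while ($token =~ m{<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>}g) {
        my ($is_close, $name, $self_closing) = ($1, lc $2, $3);
        next if $self_closing;            # <br/> and friends need no partner

        if (!$is_close) {
            push @open, $name;
        }
        elsif (@open && $open[-1] eq $name) {
            pop @open;                    # properly paired inside this token
        }
        else {
            push @missing, $name;         # orphaned closing tag
        }
    }

    # Orphaned closers were met innermost-first, so re-open in reverse order.
    my $prefix = join '', map { "<$_>" }  reverse @missing;
    # Unclosed openers must be closed innermost-first as well.
    my $suffix = join '', map { "</$_>" } reverse @open;

    return $prefix . $token . $suffix;
}

# Small demonstration on a made-up breadcrumb line.
my $line = "<b>Home - Products</b> - <i>Widgets</i>";
print do_html_balancing($_), "\n" for split /-+/, $line;
```

Each token is balanced on its own, with no state carried between tokens. That keeps the function compatible with the one-argument call in the loop, at the cost of losing attribute information for tags whose opener sits in an earlier token.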

------
We are the carpenters and bricklayers of the Information Age.

Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.