does anyone know how to make this not combine the words?
Are you sure *it* is combining the words? I think your code is doing that. If your sub gets called multiple times, that is because there were tags in between. You do nothing with those tags, but it is very likely that they were meant to render as some sort of white space.
For formatting HTML as plain text, have a look at HTML::FormatText, or consider using w3m -dump, links -dump or lynx -dump.
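As a rough sketch of the HTML::FormatText route: the snippet below assumes HTML::TreeBuilder and HTML::FormatText are installed from CPAN, and the sample markup and margin values are made up for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

# Made-up sample input; two adjacent paragraphs with no whitespace between them.
my $html = '<p>Hello</p><p>big <b>wide</b> world</p>';

# Build a parse tree, then let the formatter decide how each element renders.
my $tree      = HTML::TreeBuilder->new_from_content($html);
my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 60);
my $plain     = $formatter->format($tree);

# Free the tree explicitly, as the HTML::TreeBuilder docs recommend.
$tree->delete;

print $plain;
```

Because the formatter knows that a paragraph is a block element, the two paragraphs should come out as separate blocks of text instead of running together.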
A quick and ugly fix for your problem would probably be to add start and end handlers that each append a single space to the string, plus a substitution at eof that collapses duplicate whitespace.
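Something along these lines, as an untested-on-your-data sketch of that quick fix. `html_to_text` is a hypothetical helper name, the sample markup is invented, and your own text handler probably does more than append to a string.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: returns the text of an HTML string, with a space
# inserted wherever a tag was.
sub html_to_text {
    my ($html) = @_;
    my $text = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Text between tags: append it, with entities decoded.
        text_h  => [ sub { $text .= shift }, 'dtext' ],
        # Every start and end tag contributes a single space.
        start_h => [ sub { $text .= ' ' }, 'tagname' ],
        end_h   => [ sub { $text .= ' ' }, 'tagname' ],
    );

    $parser->parse($html);
    $parser->eof;

    # Once parsing is done, collapse whitespace runs and trim the ends.
    $text =~ s/\s+/ /g;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

print html_to_text('<p>Hello</p><p>big <b>wide</b> world</p>'), "\n";
```

The ugly part is that inline tags get a space too, so `wor<b>ld</b>` comes out as "wor ld". Telling block tags from inline ones would fix that, but at that point HTML::FormatText is the easier road.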
Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }
In reply to Re: HTML::Parser question by Juerd, in thread HTML::Parser question by mkurtis