in reply to getting text from HTML

If you have raw HTML and need to get anything at all out of it, HTML::Parser is the best way to go. It is event-driven: it will process real-world HTML content of arbitrary size and, in this case, call a text event handler each time it finds a chunk of text between tags.
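
As a minimal sketch of that approach (not a definitive implementation), the snippet below registers a text handler and collects whatever it receives. The sample `$html` string and the variable names are made up for illustration; skipping `script` and `style` content and collapsing whitespace are assumptions about what "getting text" means here.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical sample input; substitute your own HTML string.
my $html = <<'HTML';
<html><head><title>Demo</title><style>p { color: red }</style></head>
<body><h1>Hello</h1><p>Fish &amp; chips</p><script>var x = 1;</script></body></html>
HTML

my @chunks;
my $parser = HTML::Parser->new(
    api_version => 3,
    # 'dtext' hands the handler text with entities (e.g. &amp;) decoded.
    text_h => [ sub { push @chunks, shift }, 'dtext' ],
);
$parser->unbroken_text(1);                    # deliver each text run in one piece
$parser->ignore_elements(qw(script style));   # skip non-visible content
$parser->parse($html);
$parser->eof;                                 # flush anything still buffered

# Drop whitespace-only chunks, then normalise the remaining whitespace.
my $text = join ' ', grep { /\S/ } @chunks;
$text =~ s/\s+/ /g;
$text =~ s/^\s+|\s+$//g;
print "$text\n";
```

For input that lives in a file, `$parser->parse_file($path)` can stand in for the `parse`/`eof` pair.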

Re^2: getting text from HTML
by Your Mother (Archbishop) on May 12, 2020 at 17:21 UTC

    HTML::Parser is pretty far down on the list of things you should recommend, especially to a newish Perl hacker. The OP was already trying HTML::TokeParser::Simple, which has a better, higher-level interface and does the same things. I’m also going to critique answers that come without code and lean on language like “raw HTML” and “real-world” and “arbitrary size.” At the very best, they’re unhelpful. Taken at face value, they’re detrimental to wisdom seekers.
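
    For comparison, a rough sketch of the same job with HTML::TokeParser::Simple, which pulls tokens in a loop instead of registering callbacks. The sample `$html` string is hypothetical, and note that `as_is` returns the text exactly as it appears in the source, so entities are not decoded here.

    ```perl
    use strict;
    use warnings;
    use HTML::TokeParser::Simple;

    # Hypothetical sample input; substitute your own HTML string.
    my $html = '<h1>Hello</h1><p>Some <b>bold</b> text</p>';

    my $p = HTML::TokeParser::Simple->new( string => $html );

    my $text = '';
    while ( my $token = $p->get_token ) {
        next unless $token->is_text;   # skip tags, comments, declarations
        $text .= $token->as_is;        # raw text, entities left undecoded
    }
    print "$text\n";
    ```

    The loop form makes it easy to add conditions later, such as only collecting text after a particular start tag has been seen.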

Re^2: getting text from HTML
by Anonymous Monk on May 12, 2020 at 10:13 UTC
    Wrong. HTML::Parser is too low-level.