in reply to How to extract text between two tags?

You should use an XML parser; however, since the page does not even have a </body> tag, here is a one-liner I often use:
 cat foo.html |perl -ne 'print if /Paper ID/ .. /\/SELECT/'


You also need to unescape &lt; back to text; see Unescape characters from XML::Twig.

curl -s http://forum.vingrad.ru/act-Print/client/printer/f-5/t-326992.html | perl -pe 's{<br />}{\n}g' | perl -ne 'print if /Paper ID/ .. /\/SELECT/'

I am surprised the browser can handle and display that webpage...

Re^2: How to extract text between two tags?
by Anonymous Monk on May 28, 2015 at 22:31 UTC

    Many will complain that you should use an xml parser, however

    You don't need an XML parser to parse HTML; HTML::TreeBuilder will do just fine.

      Not only will HTML::TreeBuilder do fine, but if it's an HTML file, an XML parser is likely to die quickly on it. XML parsers are required to fail on invalid XML, while HTML parsers are allowed to be more forgiving (e.g., HTML::TreeBuilder defaults to inserting implicit end tags, the absence of which would cause an XML parser to quit).