in reply to Parsing/Extracting Data from HTML.

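You could also try sending the HTML through Lynx, a text-based Web browser which is installed on most Unix systems. With the -dump option it prints the rendered page as plain text instead of starting an interactive session. You can give it a URL directly, or, if you already have the HTML in hand, write it to a file first and have lynx open that file. For a URL, e.g.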
$plaintext = `/path/to/lynx -dump $url`;
I think this will work. Make sure to check $url against a regex first if it can be entered by an unknown user, because backticks hand the whole string to the shell. Lynx does a really nice conversion from HTML to plain text.
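Here is a rough sketch of how the URL check and the lynx call might fit together. The sub names is_safe_url and lynx_dump are made up for illustration, the allowed-character pattern is just one conservative guess at what a "safe" URL looks like, and /path/to/lynx is the same placeholder as above. Instead of backticks it uses the list form of a piped open, so the URL is passed to lynx as a single argument and never goes through the shell.

```perl
use strict;
use warnings;

# Hypothetical helper: accept only plain http(s) URLs made of a
# conservative set of characters (no whitespace, quotes, ';', '|', etc.).
sub is_safe_url {
    my ($url) = @_;
    return defined $url
        && $url =~ m{\A https?:// [A-Za-z0-9.-]+ (?: : \d+ )? (?: / [A-Za-z0-9._~/%?&=+-]* )? \z}x
        ? 1 : 0;
}

# Hypothetical wrapper: run "lynx -dump URL" and return its output.
sub lynx_dump {
    my ($url, $lynx) = @_;
    $lynx //= '/path/to/lynx';    # placeholder path, adjust for your system

    die "refusing suspicious URL\n" unless is_safe_url($url);

    # List-form piped open: the arguments go straight to lynx, no shell involved.
    open(my $fh, '-|', $lynx, '-dump', $url)
        or die "cannot run $lynx: $!\n";
    local $/;                     # slurp all output at once
    my $text = <$fh>;
    close($fh) or die "lynx exited abnormally\n";
    return $text;
}
```

You would then call it as my $plaintext = lynx_dump($url); and wrap that in eval if you want to handle a rejected URL or a failed lynx run yourself. The regex is deliberately strict, so loosen it only as far as the URLs you actually expect.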