in reply to Strip HTML, while preserving layout, with core(-ish) modules
I can't think of anything appropriate that's truly core, so some "core-ish" modules may have to do. If you're on Win32 and using ActiveState Perl, try:
ppm search HTML
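HTML-Content-Extractor (write the names with hyphens for ppm, not "::"), HTML-TagReader, and YAPE-HTML are just a few of the hits that may be relevant, but I suspect you'll have to code your own semantic conversions, i.e. decide for yourself what each tag should turn into in the plain-text layout.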
If you're on a *nix-ish OS, search CPAN for "HTML" in the same way.
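To give an idea of what "coding your own semantic conversions" might look like with nothing but core Perl, here is a rough sketch. The sub name html_to_text and the tag-to-layout choices (blank line after a block, "* " for list items) are my own assumptions, not anything taken from the modules above, and regexes are no substitute for a real parser on messy HTML.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a few layout-bearing tags into plain-text
# layout, then drop whatever markup is left. Sketch only, not a parser.
sub html_to_text {
    my ($html) = @_;
    my $text = $html;

    $text =~ s/<!--.*?-->//gs;                            # drop comments
    $text =~ s/<(script|style)\b[^>]*>.*?<\/\1\s*>//gis;  # drop non-content blocks
    $text =~ s/\s+/ /g;                                   # source whitespace is not layout

    $text =~ s/<br\b[^>]*>/\n/gi;                         # <br> becomes a line break
    $text =~ s/<li\b[^>]*>/\n* /gi;                       # <li> becomes a bullet line
    $text =~ s/<\/(?:p|div|h[1-6]|ul|ol|tr|table)\s*>/\n\n/gi;  # block end becomes a blank line
    $text =~ s/<[^>]+>//g;                                # strip every remaining tag

    # Decode a handful of common entities; &amp; goes last so that
    # "&amp;lt;" ends up as the literal text "&lt;".
    my %entity = (lt => '<', gt => '>', quot => '"', nbsp => ' ');
    $text =~ s/&(lt|gt|quot|nbsp);/$entity{$1}/g;
    $text =~ s/&amp;/&/g;

    $text =~ s/[ \t]+\n/\n/g;     # trailing blanks
    $text =~ s/\n[ \t]+/\n/g;     # leading blanks
    $text =~ s/\n{3,}/\n\n/g;     # at most one blank line in a row
    $text =~ s/^\s+|\s+$//g;      # trim both ends
    return $text;
}

print html_to_text(
    "<h1>Title</h1>\n<p>First &amp; second<br>line</p>"
  . "<ul><li>one</li><li>two</li></ul>"
), "\n";
```

Extending the substitution list (tables, <pre>, nested lists) is where the real work is, which is why one of the modules above may still be worth a look.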