ellinguista has asked for the wisdom of the Perl Monks concerning the following question:
I'm trying to scrape forums from the general press and from blogs by locating the tags that mark up comments. I frequently get parse errors raised from Parser.pm, for example:
mismatched tag at line 24, column 2, byte 2358 at C:/Perl/lib/XML/Parser.pm line 187
Apparently the markup of all those sites is not strictly valid XML.
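For what it's worth, here is a minimal sketch that seems to reproduce the same kind of error. It assumes XML::Twig is installed; the markup strings and the helper name `parse_error` are made up for illustration and are not taken from any of the real sites. The first string leaves `<br>` unclosed, which browsers accept but a strict XML parser does not.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: returns '' if the markup parses as XML,
# otherwise the error message raised by the underlying parser.
sub parse_error {
    my ($markup) = @_;
    # parse() dies on malformed XML, so trap the exception with eval.
    my $ok = eval { XML::Twig->new->parse($markup); 1 };
    return $ok ? '' : $@;
}

# Made-up tag soup: <br> is never closed, so </p> does not match it.
my $soup = '<div class="comment"><p>First line<br>second line</p></div>';

# The same fragment written as well-formed XML.
my $clean = '<div class="comment"><p>First line<br/>second line</p></div>';

for my $markup ( $soup, $clean ) {
    my $err = parse_error($markup);
    print $err eq '' ? "parsed OK\n" : "parse failed: $err\n";
}
```

The first fragment should fail with a "mismatched tag" message and the second should parse, which matches what I see on the real pages.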
What should be done?
Thanks
Comment on Can Twig process most press and blog sites?