in reply to HTML::Strip Problem
As for why the second snippet only produces about 3/4 of the expected text output, that might be a matter of a "syntax error" in the Yahoo HTML source. (But how could Yahoo make a mistake like that?? I'm shocked! Shocked!!) Anyway, it appears that HTML::Strip does not do syntax checking (so it probably won't raise parsing errors that you can trap), and there may be some stray angle brackets or malformed entities in the source text (perhaps 3/4 of the way into the file) that are causing trouble. To find out whether that's the problem, you would need to probe the text: e.g. run a validating parser on it, or try some simple one-liners that isolate angle brackets and/or ampersands, along with the text adjacent to them...
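To make the "probe the text" idea concrete, here is a rough sketch of that kind of check. It is written in Python only to keep it self-contained; the same three regexes should drop into a Perl one-liner with little change. The function name `probe`, the `width` parameter and the three patterns are my own illustration, not anything HTML::Strip provides, and they are heuristics rather than a real validator: they flag spots worth a look, nothing more.

```python
import re

# An ampersand that does not start a well-formed entity such as &amp; or &#160;
BAD_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")
# A '<' that does not open a tag, closing tag, comment/doctype or processing instruction
BAD_LT = re.compile(r"<(?![A-Za-z/!?])")
# A tag opener that runs into another '<' before reaching its closing '>'
UNCLOSED = re.compile(r"<[A-Za-z/][^<>]*(?=<)")


def probe(html, width=20):
    """Return (offset, kind, context) for each suspicious spot, sorted by offset.

    Hypothetical helper: 'width' is how many characters of surrounding
    text to keep on each side of the match.
    """
    hits = []
    for kind, pattern in (("amp", BAD_AMP), ("lt", BAD_LT), ("unclosed", UNCLOSED)):
        for m in pattern.finditer(html):
            start = max(0, m.start() - width)
            hits.append((m.start(), kind, html[start:m.end() + width]))
    return sorted(hits)
```

Feed it the saved page source (say, the contents of a hypothetical `page.html`) and print each tuple. If the hits cluster around the point where the stripped output goes wrong, that is a fair hint the markup is to blame rather than your code.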
Re: Re: HTML::Strip Problem
by mkurtis (Scribe) on Mar 29, 2004 at 14:56 UTC