Re^2: How to extract untouched content of html tag with HTML::Parser

Replies are listed 'Best First'.
Re^3: How to extract untouched content of html tag with HTML::Parser by Anonymous Monk on Nov 28, 2010 at 17:26 UTC
It is that easy. You have a logic error. Your start handler, which you call start_handler, does no printing. Your text handler does print, but as documented, the text handler handles text, not start tags. Your end handler does no printing either.
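To make the point above concrete, here is a minimal sketch of handlers that all emit the original source text, so tags inside the target element survive untouched. The helper name `extract_inner` and the sample HTML are made up for illustration and are not taken from the original post. It only assumes the documented `api_version => 3` handler interface with the `tagname` and `text` argspecs, and it ignores comments and declarations.

```perl
use strict;
use warnings;
use HTML::Parser;

# Return the raw, unmodified markup found inside <$tag>...</$tag>.
# (extract_inner is a hypothetical helper, not part of HTML::Parser.)
sub extract_inner {
    my ($html, $tag) = @_;
    my $depth = 0;     # how many $tag elements we are currently inside
    my $out   = '';

    my $p = HTML::Parser->new(
        api_version => 3,
        # Start tags: copy the original tag text if we are already inside.
        start_h => [ sub {
            my ($tagname, $text) = @_;
            $out .= $text if $depth;
            $depth++ if $tagname eq $tag;
        }, 'tagname, text' ],
        # End tags: leave the element first, so its own end tag is not copied.
        end_h => [ sub {
            my ($tagname, $text) = @_;
            $depth-- if $tagname eq $tag && $depth;
            $out .= $text if $depth;
        }, 'tagname, text' ],
        # Text: 'text' is the original source text, entities left as written.
        text_h => [ sub { $out .= $_[0] if $depth }, 'text' ],
    );
    $p->parse($html);
    $p->eof;
    return $out;
}

print extract_inner('<p>skip</p><div>Tom &amp; <b>Jerry</b></div>', 'div'), "\n";
```

With only a text handler installed, the `<b>` and `</b>` in this sample would never reach the output, which is the behaviour described above.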
Re^4: How to extract untouched content of html tag with HTML::Parser by Lana (Beadle) on Nov 28, 2010 at 17:33 UTC
OMG!!! I can't believe I was that blind! Thank you very much! :))
Re^5: How to extract untouched content of html tag with HTML::Parser by Anonymous Monk on Nov 28, 2010 at 17:36 UTC
I can believe it. It happens to me every day, usually in between naps and coffee breaks.
Re^3: How to extract untouched content of html tag with HTML::Parser by roboticus (Chancellor) on Nov 28, 2010 at 16:40 UTC
OK, then, did you look at the `hstrip` example in the distribution? The documentation (at the end of the EXAMPLES section) indicates that you can modify it to do what you want:

"More examples are found in the eg/ directory of the HTML-Parser distribution: the program hrefsub shows how you can edit all links found in a document; the program htextsub shows how to edit the text only; the program hstrip shows how you can strip out certain tags/elements and/or attributes; and the program htext show how to obtain the plain text, but not any script/style content."

...roboticus
Re^4: How to extract untouched content of html tag with HTML::Parser by Lana (Beadle) on Nov 28, 2010 at 17:22 UTC
Yes, I did examine all the examples and played with them a lot, but I still can't get what I need. I can't understand why using 'text' instead of 'dtext' produces the same result: plain text, rather than the untouched content of that HTML tag...
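For reference, a small sketch of what the two argspecs actually change, assuming only the documented `text` and `dtext` argspecs of HTML::Parser. The helper `text_events` and the sample markup are hypothetical. Both variants see the same text events and never the tags; the only difference is whether entities are decoded.

```perl
use strict;
use warnings;
use HTML::Parser;

# Collect everything the text handler receives for a given argspec.
# (text_events is a hypothetical helper used only for this comparison.)
sub text_events {
    my ($html, $argspec) = @_;
    my $seen = '';
    my $p = HTML::Parser->new(
        api_version => 3,
        text_h      => [ sub { $seen .= $_[0] }, $argspec ],
    );
    $p->parse($html);
    $p->eof;
    return $seen;
}

my $html = '<b>Tom &amp; Jerry</b>';
print text_events($html, 'text'),  "\n";   # entities kept as written
print text_events($html, 'dtext'), "\n";   # entities decoded
```

Neither call returns the `<b>` tags, because tags arrive as start and end events, not as text events.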