in reply to Re: Re^2: Cleanning HTML - New/better module for that - test please! ;-P
in thread Cleanning HTML - New/better module for that - test please! ;-P
Your rank, or mine, has nothing to do with it.
I'm not saying anything about any of the OP's points either - yes, he would probably be better off using HTML::Parser. (There are sometimes reasons against this too. It depends on too many factors to discuss here; I'll just assume you know what I mean.)
What I was pointing out is that you saw pattern matching and assumed he was 'using regexes' as in common parlance. But pattern matching can (and pretty much has to) be used for a proper parser too, so before you throw out blanket statements like "don't use regexes for parsing HTML" please have a look at what he's actually doing.
(His parser is defective - there are really three modes in *ML: text, tags, and attribute values. You have to parse the value assigned to an attribute separately from the tag and attribute names, mainly because a right angle bracket appearing inside a quoted attribute value doesn't terminate the tag. gmpassos' code doesn't take this into account.)
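To make the three-mode point concrete, here is a rough sketch of what I mean, written in Python rather than Perl purely for illustration. It is not gmpassos' code and not how HTML::Parser works internally; the function name `tokenize` and the token shapes are my own invention, and it ignores comments, doctypes, entities and script sections. It only shows that a tokenizer can be built out of pattern matches, as long as each pattern is applied in the mode it belongs to.

```python
import re

# One small pattern per mode. Each is only ever applied at a known position,
# in the mode where it makes sense, never across the whole document at once.
TEXT = re.compile(r'[^<]+')                          # text mode
TAG_OPEN = re.compile(r'<(/?)([A-Za-z][A-Za-z0-9]*)')  # start of a tag
TAG_CLOSE = re.compile(r'\s*/?>')                    # end of a tag
ATTR_NAME = re.compile(r'\s*([^\s=<>/"\']+)')        # tag mode: attribute name
# attribute value mode: quoted values may contain '>' without ending the tag
ATTR_VALUE = re.compile(r'\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s>]+))')


def tokenize(html):
    """Return a list of ('text', s), ('open', name, attrs) and ('close', name)."""
    tokens = []
    pos = 0
    while pos < len(html):
        m = TAG_OPEN.match(html, pos)
        if not m:
            # Text mode: take everything up to the next '<'.
            t = TEXT.match(html, pos)
            if t:
                tokens.append(('text', t.group()))
                pos = t.end()
            else:
                # A stray '<' that does not start a tag is kept as text.
                tokens.append(('text', html[pos]))
                pos += 1
            continue

        closing = m.group(1) == '/'
        name = m.group(2).lower()
        pos = m.end()
        attrs = {}

        # Tag mode: read attributes until the tag is really closed.
        while pos < len(html):
            end = TAG_CLOSE.match(html, pos)
            if end:
                pos = end.end()
                break
            a = ATTR_NAME.match(html, pos)
            if not a:
                pos += 1  # skip a character we cannot interpret
                continue
            key = a.group(1).lower()
            pos = a.end()
            # Attribute value mode: parsed separately from the names.
            v = ATTR_VALUE.match(html, pos)
            if v:
                attrs[key] = next(g for g in v.groups() if g is not None)
                pos = v.end()
            else:
                attrs[key] = None  # attribute without a value

        tokens.append(('close', name) if closing else ('open', name, attrs))
    return tokens


if __name__ == "__main__":
    sample = '<a href="x>y" title=ok>link</a>'
    print(tokenize(sample))
    # The one-regex approach, for comparison: it cuts the tag at the first '>'.
    print(re.findall(r'<[^>]*>', sample))
```

On the sample input the tokenizer keeps `x>y` intact as the value of `href`, whereas the single `<[^>]*>` pattern ends the tag at the `>` inside the quotes. Both use regexes; only one of them respects the modes.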
Makeshifts last the longest.
Re5: Cleanning HTML - New/better module for that - test please! ;-P
by thpfft (Chaplain) on Apr 27, 2003 at 23:36 UTC