in reply to Dealing with Malformed XML

No offense, but you're approaching it the wrong way. If you get bad data into a system, it's almost always preferable to go to the source of the data and correct the error there. If 'A' produces garbage and 'B' has to correct for that garbage, someone is going to come behind you eventually and have to maintain 'B'. If 'A' continuously puts out more garbage through human error, bad data into 'A', or whatever, then your method would be to continuously hack 'B' when 'B' is not the source of the problem.

This is a Bad Thing. Fix the problems where they occur, not later down the road. Who knows? Maybe 'A' will eventually pass data to 'C' as well. Then you have garbage being spread to multiple places and garbage filters will have to be maintained independently of one another (unless some pointy-haired boss decides on a central garbage management system rather than cleaning up the mess). Code reuse then becomes impeded because the situation wasn't resolved properly the first time. But isn't that part of what XML was designed to avoid?

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Re: (Ovid) Re: Dealing with Malformed XML
by Coyote (Deacon) on Jan 09, 2001 at 07:50 UTC
    I agree completely. The responsibility for making sure that the data is correct and well-formed falls upon the people generating the data. I've already addressed this issue with user training and a filter to encode entities before the data entry people mark up the articles.
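    Something along these lines is what I mean by the entity filter (a minimal sketch rather than the actual code; it assumes plain article text on STDIN and hand-rolls the escaping, though a module like HTML::Entities would do just as well):

        #!/usr/bin/perl -w
        use strict;

        # Escape the characters XML reserves before the data entry
        # staff add their markup. Plain text in on STDIN, escaped
        # text out on STDOUT.
        my %escape = (
            '&' => '&amp;',
            '<' => '&lt;',
            '>' => '&gt;',
            '"' => '&quot;',
            "'" => '&apos;',
        );

        while (my $line = <STDIN>) {
            $line =~ s/([&<>"'])/$escape{$1}/g;
            print $line;
        }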

    Unfortunately, I inherited this project after about 400 articles had already been scanned and marked up.

    ---- Coyote