in reply to Perl spell checker

What you propose to do sounds straightforward, but it won't be that simple to get right, though you should be able to come up with something that makes the task much easier. I expect that whatever method or combination of methods you end up using, it is certain to miss things at least now and then, so a review by a human editor is still likely to be necessary to ensure both that
  1. You remove all identifying information. If you are taking steps to remove identification, chances are that failing to do so in some cases could lead to anything from embarrassment to legal action. I don't know what the source of your text is, but it could contain such things as
    • addresses (knowing where someone lives could be almost as identifying as their name, and this could be tricky, as "Mr. XXXXX of 123 Main St., Wawa ON" would be bad but "Mr. XXXXX of YYY YYYY St., Wawa ON" should be ok)
    • phone numbers
    • employee numbers
    • e-mail addresses
  2. You don't remove things that are necessary for the text to be useful, such as:
    • names of corporations, or other organizations (e.g. if the text were consumer complaints, how useful would it be to end up with "Mr. XXX XXXXX was killed by a faulty corkscrew made by YYYYYY Corp.")
    • names of public figures (e.g. the text might be, again depending on the source, something like "Mary Smith says Senator Jones is a baboon because..." in which case you may want to hide Mary Smith but not the senator, because (s)he is the subject of the text)
    • probably lots of other stuff
Perhaps your text is from just one source you know to have a strict content standard, and that may make things much easier, but it would be worthwhile to consider what could go wrong.
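
To make the two concerns above concrete, here is a rough sketch of a redaction pass that hides listed names, e-mail addresses and phone numbers while protecting an allow-list of phrases that must survive. Everything in it is an assumption for illustration: the sub name redact, the [EMAIL] and [PHONE] markers, the sample text, and especially the two regexes, which only cover the simplest North American phone and e-mail shapes and will miss plenty of real-world variants. It does nothing about street addresses or employee numbers, so treat it as a first filter ahead of the human review, not a replacement for it.

```perl
use strict;
use warnings;

# Deliberately simple patterns (assumptions); real data will have variants these miss.
my $email_re = qr/[\w.+-]+\@[\w-]+(?:\.[\w-]+)+/;
my $phone_re = qr/\b\d{3}[-. ]\d{3}[-. ]\d{4}\b/;

# redact($text, \@hide, \@keep)
#   @hide: names to mask with X's
#   @keep: phrases (organizations, public figures) that must be left alone
sub redact {
    my ($text, $hide, $keep) = @_;

    # 1. Swap each protected phrase for a placeholder so later steps can't touch it.
    for my $i (0 .. $#$keep) {
        my $phrase = $keep->[$i];
        my $token  = "\x00$i\x00";
        $text =~ s/\Q$phrase\E/$token/g;
    }

    # 2. Mask e-mail addresses and phone numbers.
    $text =~ s/$email_re/[EMAIL]/g;
    $text =~ s/$phone_re/[PHONE]/g;

    # 3. Mask each listed name, keeping spaces so word shapes stay readable.
    for my $name (@$hide) {
        (my $mask = $name) =~ s/\S/X/g;
        $text =~ s/\b\Q$name\E\b/$mask/g;
    }

    # 4. Put the protected phrases back.
    $text =~ s/\x00(\d+)\x00/$keep->[$1]/ge;

    return $text;
}

my $sample = 'Mary Smith (mary@example.com, 705-555-0199) says Senator Jones is a baboon.';
print redact($sample, ['Mary Smith', 'Jones'], ['Senator Jones']), "\n";
# prints: XXXX XXXXX ([EMAIL], [PHONE]) says Senator Jones is a baboon.
```

Protecting the allow-list first is what lets "Jones" sit in the hide list without clobbering "Senator Jones". The obvious weakness is that both lists have to come from somewhere, and anything not on either list goes through untouched, which is exactly where the human editor comes in.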

--
I'd like to be able to assign to an luser

Re: Re: Perl spell checker
by one4k4 (Hermit) on Oct 19, 2001 at 21:34 UTC