in reply to Re: Character class for French chars with accents in regex?
in thread Character class for French chars with accents in regex?

There are at least two downsides to that method worth mentioning.

First, it allows similar-looking characters to be used. For example, there's a Cyrillic letter (U+0430) that looks almost identical to the Latin 'a'. If the regexp is used to limit valid user names, it wouldn't stop one user from impersonating another by creating a similar-looking user name.
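
To make the impersonation risk concrete, here is a minimal sketch in Python rather than Perl. The pattern `USERNAME_RE` and the sample names are hypothetical stand-ins, not the regexp from the parent post; the sketch only assumes a check that accepts any Unicode word character.

```python
import re
import unicodedata

# Hypothetical user-name check: one or more Unicode "word" characters.
# This is an assumption standing in for a permissive "any letter" pattern.
USERNAME_RE = re.compile(r"^\w+$")

latin = "paul"       # every letter is Latin
spoof = "p\u0430ul"  # second letter is CYRILLIC SMALL LETTER A (U+0430)

def describe(s):
    """Return the Unicode name of every character in s."""
    return [unicodedata.name(ch) for ch in s]

for name in (latin, spoof):
    # Both names pass the check, yet the second character differs.
    print(bool(USERNAME_RE.match(name)), describe(name)[1])

# The strings look alike on screen but do not compare equal.
print(latin == spoof)
```

Both names pass the permissive check even though they are different strings, so a filter like this cannot tell them apart by itself. Printing the character names, as `describe` does, is one way to see which script each letter belongs to.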

Second, it may allow characters that users have no easy way of entering into forms, as well as characters that some or many users are unable to render.

The severity of these downsides depends on the purpose of the regexp.

Update: Here are some similar-looking strings, each of which is different:


Re^3: Character class for French chars with accents in regex?
by clinton (Priest) on Aug 09, 2007 at 19:03 UTC
    Fair points, both, and well mentioned. Depending on the application for this filter, these downsides may count for less than making your customers irate because they can't enter their names.

    Clint

      <grumble>Apparently it doesn't matter if customers' names are screwed up. My beautiful code here at work which will cope with big-endian, little-endian, even middle-endian names and people with only one name - it just got vetoed by $boss in favour of first name and surname. Thankfully it won't be me who has to explain to Chow Yun Fat why the software calls him Mr. Yun.</grumble>