A quick search turned up the thread "Diacritic-Insensitive and Case-Insensitive Sorting", which may help you.
I would lump accented characters in with their non-accented versions (my reply in that thread shows one way to
convert them) and keep a separate category for anything whose converted form doesn't match
/^[A-Za-z]/.
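
In case it helps, here is a rough sketch of that idea in Python. It is not the code from the linked thread; it assumes Unicode NFD decomposition is good enough for stripping accents, and the names strip_diacritics, category, group_words and the "#" catch-all bucket are all made up for illustration. Note that letters with no decomposition (Ø, for example) stay as they are and so fall into the catch-all bucket.

```python
import re
import unicodedata
from collections import defaultdict


def strip_diacritics(text):
    """Decompose to NFD and drop combining marks (assumed conversion method)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def category(word):
    """Return the A-Z bucket for a word, or '#' (hypothetical label) otherwise."""
    base = strip_diacritics(word)
    if re.match(r"^[A-Za-z]", base):
        return base[0].upper()
    return "#"


def group_words(words):
    """Group words by bucket, sorting each bucket accent- and case-insensitively."""
    groups = defaultdict(list)
    for word in words:
        groups[category(word)].append(word)
    for bucket in groups.values():
        bucket.sort(key=lambda w: strip_diacritics(w).casefold())
    return dict(groups)
```

Calling group_words(["zebra", "Éclair", "apple", "elan", "42nd"]) puts "Éclair" and "elan" together under "E" and "42nd" under "#".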