For Unicode, a 64K bitmap is insufficient, since there are more than 64K characters at this point (code points run up to U+10FFFF). Instead you can use a two- or three-level tree and compress away entire unused branches. There's some discussion of this in the Unicode spec, which is available online from The Unicode Consortium.
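
As a rough illustration of the two-level idea, here is a minimal sketch in Python. The names (`CodePointSet`, `BLOCK_BITS`, `allocated_blocks`) and the 256-code-point block size are my own assumptions for the example, not anything taken from the Unicode spec. The top level holds one slot per block, and a slot stays `None` until some character in that block is added, which is how unused branches get compressed away.

```python
BLOCK_BITS = 8                      # assumed split: 256 code points per leaf block
BLOCK_SIZE = 1 << BLOCK_BITS
MAX_CODE_POINT = 0x10FFFF           # upper limit of the Unicode code space


class CodePointSet:
    """Two-level bitmap: a top-level index of blocks, each block a small bitmap."""

    def __init__(self, code_points=()):
        # One slot per block; None marks an entirely unused branch.
        self._blocks = [None] * ((MAX_CODE_POINT >> BLOCK_BITS) + 1)
        for cp in code_points:
            self.add(cp)

    def add(self, cp):
        if not 0 <= cp <= MAX_CODE_POINT:
            raise ValueError("not a Unicode code point: %r" % (cp,))
        hi = cp >> BLOCK_BITS           # which block
        lo = cp & (BLOCK_SIZE - 1)      # bit position inside the block
        block = self._blocks[hi]
        if block is None:
            # Allocate the leaf bitmap only when the block is first used.
            block = self._blocks[hi] = bytearray(BLOCK_SIZE // 8)
        block[lo >> 3] |= 1 << (lo & 7)

    def __contains__(self, cp):
        if not 0 <= cp <= MAX_CODE_POINT:
            return False
        block = self._blocks[cp >> BLOCK_BITS]
        if block is None:               # unused branch: nothing stored here
            return False
        lo = cp & (BLOCK_SIZE - 1)
        return bool(block[lo >> 3] & (1 << (lo & 7)))

    def allocated_blocks(self):
        # Number of leaf bitmaps actually allocated.
        return sum(block is not None for block in self._blocks)


letters = CodePointSet([0x41, 0x3B1, 0x1F600])   # 'A', Greek alpha, an emoji
print(0x41 in letters, 0x42 in letters, letters.allocated_blocks())
```

With three characters in three different blocks, only three 32-byte leaf bitmaps are allocated, rather than one flat bitmap covering the whole code space. A three-level version would split the top-level index the same way, so that large empty ranges cost a single empty slot higher up.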