in reply to Re^7: Mixed Unicode and ANSI string comparisons?
in thread Mixed Unicode and ANSI string comparisons?
Oh. Yes, it's probably better to decline that offer...
Hm. I think I may have given a wrong impression here.
Think of the description lines in FASTA files. They can contain anything useful to the researcher, and often contain stuff that only makes sense to the originator; thus they were often written in a local code page. Each individual string makes sense in the context of its file and origin.
Now take a bunch of legacy FASTA files that originate from all over the world, bring them together into a central DB, and index them by their descriptions. Then try to combine that index of legacy descriptions with more modern ones whose descriptions are in Unicode, and sort them all together to provide a single index.
That's pretty close to the problem.
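To make the shape of that concrete, here is a minimal sketch (in Python, purely for illustration) of sorting legacy and Unicode descriptions into one index. It assumes the code page of each originating file is known and stored alongside the raw bytes, which may well not hold for real legacy data. The `records` list, the sample descriptions, and the `sort_key` helper are all made up for the example. The ordering is by plain code point after case folding, not a proper locale-aware collation.

```python
import unicodedata

# Hypothetical records: (raw description bytes, code page of the originating file).
# Modern entries are assumed to be stored as UTF-8.
records = [
    ("Übersicht Protein A".encode("cp1252"), "cp1252"),
    ("белок B".encode("cp1251"), "cp1251"),
    ("protein C".encode("utf-8"), "utf-8"),
    ("Zebra fish sample".encode("utf-8"), "utf-8"),
]

def sort_key(record):
    """Build a comparable Unicode key from a (bytes, code page) pair."""
    raw, codepage = record
    # errors="replace" keeps an undecodable byte from aborting the index build.
    text = raw.decode(codepage, errors="replace")
    # NFC plus casefold gives a consistent code-point ordering,
    # which is not the same thing as a language-aware collation.
    return unicodedata.normalize("NFC", text).casefold()

# The stored bytes are left untouched; only the sort key is Unicode.
index = sorted(records, key=sort_key)
decoded = [raw.decode(codepage) for raw, codepage in index]
print(decoded)
```

The point of keeping the raw bytes and only decoding for the key is that each string stays exactly as its originator wrote it, while the index still has a single ordering.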
Ideally, the descriptions would all be converted into Unicode; but that requires a huge effort entailing a bunch of translators working in many different languages to translate technical terms, abbreviations, and anything else the originating researchers felt important to put there in their own language. Basically an impossible task.