
Now sort them together to provide a single index.
Ok, at this point it's not clear to me what 'sorting' even means here :) Sort in alphabetical order? According to what alphabet? In codepoint order (which only makes sense for a couple of languages)? How about some examples :)
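As one illustration of why the question matters, here is a minimal Python sketch comparing plain codepoint order with a crude accent- and case-folded order. The word list and the `fold_key` helper are hypothetical, made up for this example. The folded key is only a rough stand-in for real collation, which depends on the language: Swedish, for instance, sorts ä after z.

```python
import unicodedata

# Hypothetical sample words (precomposed characters), not data from the thread.
words = ["zebra", "Äpfel", "apple", "Zürich", "éclair"]

def fold_key(s):
    """Crude sort key: strip combining marks, then case-fold.

    This only approximates an 'alphabetical' order for some Latin-script
    languages; it is not a substitute for locale-aware collation.
    """
    decomposed = unicodedata.normalize("NFKD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

# Default string comparison in Python is by codepoint.
by_codepoint = sorted(words)

# Same words, ordered by the folded key instead.
by_folded = sorted(words, key=fold_key)

print(by_codepoint)
print(by_folded)
```

In codepoint order every ASCII capital sorts before every lowercase letter and every accented letter sorts after plain "z", so the two lists differ even on five words from two or three languages.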

Also, I don't see what decoding has to do with translating from one language to another.


Re^10: Mixed Unicode and ANSI string comparisons?
by BrowserUk (Patriarch) on Dec 15, 2015 at 10:53 UTC
    I don't see what decoding has to do with translating from one language to another.

    The data is. They are free-form descriptions produced by researchers from many countries. Parts of most of them will be in Latin (the language, not the encoding); parts will be in the researchers' own languages.

    It's not a case of "translating from one language to another"; it is a case of having someone who understands what is in the file, so that you can decide how to decode it. The files go back decades; researchers move on. The data continues to exist.
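A minimal Python sketch of why a human reader is needed: the same bytes decode without error under several ISO-8859-x encodings, and only someone who can read the result can say which decoding is the intended one. The byte values, the candidate list, and the `decode_candidates` helper are assumptions chosen for illustration, not anything taken from the actual files.

```python
# Three arbitrary high-bit bytes standing in for text of unknown encoding.
raw = bytes([0xE4, 0xE5, 0xE6])

def decode_candidates(data, encodings=("iso-8859-1", "iso-8859-5", "iso-8859-7")):
    """Return {encoding: decoded text} for each candidate that decodes cleanly."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # This candidate has no mapping for some byte; skip it.
            pass
    return results

candidates = decode_candidates(raw)
for name, text in candidates.items():
    print(name, text)
```

All three candidates succeed, giving Latin, Cyrillic, and Greek letters respectively, so a successful decode by itself says nothing about which one is right.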


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.

        The first module covers 10 European languages; the small sample I saw contained Cyrillic, Arabic, Urdu, and what I think (but can't swear to) were Korean and Japanese.

        The second appears to be completely undocumented, but given its author, I'm guessing is designed to try and determine which of the multitude of Unicrap encodings a file contains, rather than anything to do with ISO-8859-x stuff.

