tchrist,

Thank you for your explanation/demonstration of how the UCA sort works.

I have already answered your previous post, and have apologized for mis-quoting the article.

What matters to the database engine is not whether a character prints, but how it compares: it is the 'lt, eq, gt' that counts. Keys must be ordered so that every key before a given key is less than it, and every key after it is greater. So looking at your example, it seems that only chr(0) to chr(31) would be a problem.
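To make the ordering requirement concrete, here is a minimal sketch using only Perl's built-in string comparison (no Unicode::Collate). The helper name keys_strictly_ordered and the sample keys are made up for illustration and are not taken from any real engine.

```perl
use strict;
use warnings;

# Hypothetical helper: true when every key is 'lt' the key that follows it,
# i.e. the list is strictly ordered under Perl's code point comparison.
sub keys_strictly_ordered {
    my @keys = @_;
    for my $i ( 1 .. $#keys ) {
        return 0 unless $keys[ $i - 1 ] lt $keys[$i];
    }
    return 1;
}

# Sample keys, including the empty key and control characters chr(0) and chr(31).
my @unsorted = ( "b", "\0", "a", "", "\x1F", "a\0" );

# The default sort uses the same comparison as 'lt', 'eq', 'gt'.
my @sorted = sort @unsorted;

print "ordered\n" if keys_strictly_ordered(@sorted);
```

If the sort used a different comparison than the engine's own 'lt', the check could fail even though the list looks sorted on screen, which is the concern with the control characters.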

I have written three database engines in my life: in the '70s in assembler, in the '80s in C, and recently in Perl. Unfortunately, staring at a lot of hex dumps is required (even in Perl). The one thing all of these had in common is that all data passed from the user must be inserted into the database. So when a database is created, the start key is the "" value (length of 0). This is because the user could put in:

$key="\0"; $data = "\0";
which are valid characters. Now, that could be fixed by documenting this behavior. But chr(0) to chr(31) are used for many internal things in the DB engine, and changing their order in the sort would be a show stopper.
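As a rough illustration of why the zero-length start key works, here is a small in-memory sketch. It is not the code of any of the engines mentioned above; the @index layout and the insert_pair function are assumptions made up for this example.

```perl
use strict;
use warnings;

# Hypothetical index: a sorted list of [key, data] pairs. It starts with the
# zero-length key, which is 'lt' every non-empty string, including "\0".
my @index = ( [ "", undef ] );

# Hypothetical insert: binary search for the first slot whose key is not
# 'lt' the new key, then replace an equal key or splice in a new pair.
sub insert_pair {
    my ( $key, $data ) = @_;
    my ( $lo, $hi ) = ( 0, scalar @index );
    while ( $lo < $hi ) {
        my $mid = int( ( $lo + $hi ) / 2 );
        if ( $index[$mid][0] lt $key ) { $lo = $mid + 1 }
        else                           { $hi = $mid }
    }
    if ( $lo < @index && $index[$lo][0] eq $key ) {
        $index[$lo][1] = $data;    # same key: overwrite the data
    }
    else {
        splice @index, $lo, 0, [ $key, $data ];
    }
    return $lo;    # position of the key in the sorted list
}

my $key  = "\0";
my $data = "\0";
my $pos  = insert_pair( $key, $data );    # lands right after the "" start key

insert_pair( "a",    "printable" );
insert_pair( "\x1F", "control" );         # chr(31) sorts between "\0" and "a"
```

The point of the sketch is only that the search depends entirely on 'lt' and 'eq'. If a collation ordered chr(0) to chr(31) differently, the same keys would land in different slots.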

Thank you

"Well done is better than well said." - Benjamin Franklin


In reply to Re^6: RFC: Is this the correct use of Unicode::Collate? by flexvault
in thread RFC: Is this the correct use of Unicode::Collate? by flexvault
