in reply to Questions: how to exclude substring having Evil meanings

Perhaps you should use codes that don't contain letters? With Unicode, that still gives you thousands of possibilities for each position. That way, you satisfy both conditions at once: the codes will be short (shorter than ASCII-only codes), and they're unlikely to offend anyone.

Re^2: Questions: how to exclude substring having Evil meanings
by ikegami (Patriarch) on Dec 02, 2009 at 23:40 UTC

    There are limitations and drawbacks.

    • It's hard to name most characters. "I have a problem with invoice latin-small-letter-a-with-dot-above-and-macron;devanagari-letter-vocalic-r;left-right-white-arrow." (ǡऋ⬄)
    • There are font problems ("I have a problem with invoice box-box-box.")
    • Encoding problems are still common too.

    What symbols are you suggesting?

    • You'd need a set of 22 chars to maintain a record num of 5 chars. (22^5 = 5,153,632).
    • You'd need a set of 48 chars to bring the record num down to 4 chars. (48^4 = 5,308,416).
    • You'd need a set of 171 chars to bring the record num down to 3 chars. (171^3 = 5,000,211).
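The set sizes above can be checked with a short search. This is only a sketch: the target of 5,000,000 records is an assumption inferred from the figures (171^3 just clears it while 170^3 does not), and `min_alphabet_size` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Smallest alphabet size whose fixed-length codes cover $records values.
# Hypothetical helper; $records = 5_000_000 is an assumed target.
sub min_alphabet_size {
    my ($length, $records) = @_;
    my $size = 1;
    $size++ while $size ** $length < $records;
    return $size;
}

my $records = 5_000_000;
for my $length (5, 4, 3) {
    my $size = min_alphabet_size($length, $records);
    printf "%d chars: %3d symbols (%d^%d = %d)\n",
        $length, $size, $size, $length, $size ** $length;
}
```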

    I suppose you could use the horizontal dominoes. Each domino can be read as two digits from 0 to 6. For example, this is node 🀷🁜🁑🀵 (06:61:44:04 in base 7).

    Using dominoes would reduce the record num to 4 chars ((7^2)^4 = 5,764,801) assuming you didn't want the sequence to be a legal domino sequence. Both the UTF-8 and the UTF-16 encoding of 4 dominoes would take 16 bytes. (UTF-32 too, for what it's worth.)
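A rough sketch of the domino encoding, assuming the tiles are taken from the Unicode Domino Tiles block, where U+1F031 is DOMINO TILE HORIZONTAL-00-00 and the tile for pips (a, b) sits at U+1F031 + 7*a + b. `to_dominoes` is a hypothetical helper name, and 810709 is simply the decimal value of 06:61:44:04 read in base 7.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Encode a non-negative integer as $count horizontal domino tiles.
# Each tile is one base-49 digit (two base-7 digits).
sub to_dominoes {
    my ($n, $count) = @_;
    my @tiles;
    for (1 .. $count) {
        unshift @tiles, chr(0x1F031 + $n % 49);
        $n = int($n / 49);
    }
    die "number does not fit in $count dominoes\n" if $n;
    return join '', @tiles;
}

my $code = to_dominoes(810709, 4);
binmode STDOUT, ':encoding(UTF-8)';
print "$code\n";

# Every tile is outside the BMP, so all three encodings need 4 bytes per tile.
for my $enc (qw(UTF-8 UTF-16BE UTF-32BE)) {
    printf "%-8s %d bytes\n", $enc, length(encode($enc, $code));
}
```

The BE variants are used so that no byte-order mark is added to the byte count.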

    Update: Added everything after the question.

      [...] assuming you didn't want the sequence to be a legal domino sequence.
      What if it had to be legal sequences?


      holli

      You can lead your users to water, but alas, you cannot drown them.

        Then each domino except the first only counts for one base 7 digit. 16614404 (base 7) would go from 16:61:44:04 to 16:66:61:14:44:40:04.

        That means that 6 or more decimal digits would then be more efficient than the same number of dominoes.

        10^n >= 7^(n+1)
        ln(10^n) >= ln(7^(n+1))
        n*ln(10) >= (n+1)*ln(7)
        n*ln(10) >= n*ln(7) + ln(7)
        n*ln(10) - n*ln(7) >= ln(7)
        n*( ln(10) - ln(7) ) >= ln(7)
        n >= ln(7)/( ln(10) - ln(7) )
        n >= 5.455696235812878344
        n >= 6
        $ perl -e'
           printf "chars: %2d digits: %13.f %s dominoes: %11.f\n",
              $_, 10**$_, qw( < = > )[( 10**$_ <=> 7**($_+1) )+1], 7**($_+1),
              for 1..12
        '
        chars:  1 digits:            10 < dominoes:          49
        chars:  2 digits:           100 < dominoes:         343
        chars:  3 digits:          1000 < dominoes:        2401
        chars:  4 digits:         10000 < dominoes:       16807
        chars:  5 digits:        100000 < dominoes:      117649
        chars:  6 digits:       1000000 > dominoes:      823543
        chars:  7 digits:      10000000 > dominoes:     5764801
        chars:  8 digits:     100000000 > dominoes:    40353607
        chars:  9 digits:    1000000000 > dominoes:   282475249
        chars: 10 digits:   10000000000 > dominoes:  1977326743
        chars: 11 digits:  100000000000 > dominoes: 13841287201
        chars: 12 digits: 1000000000000 > dominoes: 96889010407
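
For what it's worth, the legal-sequence expansion described above (each domino after the first contributing one base 7 digit) might be sketched like this. `legal_chain` is a hypothetical helper name; it works on a string of base 7 digits and returns the tiles as two-digit strings rather than Unicode characters.

```perl
use strict;
use warnings;

# Turn a string of base-7 digits into a legal chain: each pair of
# consecutive digits forms one tile, so n+1 digits need n dominoes.
sub legal_chain {
    my ($digits) = @_;
    die "need at least two base-7 digits\n"
        unless $digits =~ /\A[0-6]{2,}\z/;
    my @d = split //, $digits;
    return map { "$d[$_]$d[$_ + 1]" } 0 .. $#d - 1;
}

print join(':', legal_chain('16614404')), "\n";   # 16:66:61:14:44:40:04
```

Because neighbouring tiles share a digit, n dominoes carry only n+1 base 7 digits, which is where the 7^(n+1) in the comparison comes from.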
      I suppose you could use the horizontal dominoes. Each domino can be read as two digits from 0 to 6.
      This is brilliant.
      $,=qq.\n.;print q.\/\/____\/.,q./\ \ / / \\.,q.    /_/__.,q..