PerlMonks  

Re: ROT8000 implementation?

by LanX (Saint)
on Oct 15, 2021 at 21:45 UTC


in reply to ROT8000 implementation?

Supposing that ROT8000 means character rotation by 0x8000:

This would only be self-inverting for the old and obsolete UCS-2, which has exactly 0x10000 = 2^16 code points.°

The implementation should be straightforward:

For each character:

  • decode to codepoint
  • toggle high bit
  • encode
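The three steps above can be sketched in Perl as follows. This is only a minimal illustration of the naive "toggle bit 15" reading; the name rot_0x8000 is made up here, and passing code points >= 0x10000 through unchanged is an assumption. Note that this naive version maps some characters onto surrogates (0x5800..0x5FFF become 0xD800..0xDFFF) and onto control characters.

```perl
use strict;
use warnings;

# Naive reading of "rotate by 0x8000" on a 2^16 code space:
# adding 0x8000 modulo 0x10000 is the same as toggling bit 15.
# Assumption: anything outside the BMP is left untouched.
sub rot_0x8000 {
    my ($text) = @_;
    return join '', map {
        my $cp = ord($_);                          # decode to code point
        $cp < 0x10000 ? chr($cp ^ 0x8000) : $_;    # toggle high bit, encode
    } split //, $text;
}

my $once  = rot_0x8000("Hello");
my $twice = rot_0x8000($once);    # applying it twice restores the input
```

Because XOR with a fixed mask is its own inverse, the same routine encodes and decodes.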
Is this a joke, or what am I missing?

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

°) Because rot is self-inverting by definition: rot(rot(x)) = x, i.e. rot^2 = id

Re^2: ROT8000 implementation?
by atcroft (Abbot) on Oct 16, 2021 at 01:13 UTC

    Actually, my understanding has always been that ROT-N is just a notation for a Caesar cipher with a specified N. ROT13 is the Caesar cipher rotated by 13d characters, which, for the 26d (2d*13d) character Latin alphabet, means that rot(rot(x, 13d), 13d)=x. For any other N, deciphering means rotating by (26d-N), i.e. rot(rot(x, N), (26d-N))=x. Extending this further to a ROT-N of an M-character alphabet, this becomes rot(rot(x, N), (M-N))=x. If N is larger than M, the encoding can be simplified to (N % M) and the decoding to (M - (N % M)) (thus if M=26d, ROT-53d simplifies to ROT-1d, decoded by ROT-25d).
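As a rough illustration of the ROT-N rule described above, here is a small Perl sketch. The function name rot_n and its argument order are invented for this example, and only lowercase a..z is used as the alphabet to keep it short.

```perl
use strict;
use warnings;

# Rotate every character of $text that occurs in @$alphabet by $n
# positions; characters outside the alphabet pass through unchanged.
sub rot_n {
    my ($text, $n, $alphabet) = @_;
    my $m = @$alphabet;                  # alphabet size M
    my %index;
    @index{@$alphabet} = 0 .. $m - 1;    # character => position
    return join '', map {
        exists $index{$_} ? $alphabet->[ ($index{$_} + $n) % $m ] : $_
    } split //, $text;
}

my @lower = ('a' .. 'z');
my $enc = rot_n('hello', 53, \@lower);           # 53 % 26 == 1, so ROT-1
my $dec = rot_n($enc, 26 - 53 % 26, \@lower);    # ROT-25 undoes it
```

The modulo inside the index calculation is what makes ROT-53 and ROT-1 interchangeable on a 26-character alphabet.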

    I have never heard of ROT-N notation being in anything but decimal (but that may also be my lack of exposure). As for the most common encodings (UTF-8, UTF-16, and UTF-32), all can represent the 1_112_064d valid Unicode code points (0x11_0000 possible values minus the 2_048d surrogates). Thus an N value of 556_032d (hex: 0x8_7C00) should result in behavior equivalent, for those code points, to ROT-13d with the 26d-character Latin alphabet (i.e., a self-decoding function).

    Below are the encoding and decoding rotations for 26d-, 256d-, and 1_112_064d-character "alphabets" for various N. It should be noted that using 0x8000 (32_768d) rotations on a 256-character alphabet is the equivalent of "double ROT-13d encoding" on a 26-character alphabet (i.e., no change at all), and that using the total number of code points (1_112_064d) has that same effect on both a 256-character and a 1_112_064-character alphabet.

    (If you find an error in my logic or values, please advise, so I can correct my understanding and/or data, as appropriate.)

    Rotations               26d-char encoding  26d-char decoding  256d-char encoding  256d-char decoding  1_112_064d-char encoding  1_112_064d-char decoding
    13d (0x0D)              13d (0x0D)         13d (0x0D)         13d (0x0D)          243d (0xF3)         13d (0x0D)                1_112_051d (0x10_F7F3)
    26d (0x1A)              0d (0x00)          0d (0x00)          26d (0x1A)          230d (0xE6)         26d (0x1A)                1_112_038d (0x10_F7E6)
    128d (0x80)             24d (0x18)         2d (0x02)          128d (0x80)         128d (0x80)         128d (0x80)               1_111_936d (0x10_F780)
    256d (0x100)            22d (0x016)        4d (0x004)         0d (0x000)          0d (0x000)          256d (0x100)              1_111_808d (0x10_F700)
    8000d (0x1F40)          18d (0x12)         8d (0x08)          64d (0x40)          192d (0xC0)         8000d (0x1F40)            1_104_064d (0x10_D8C0)
    32_768d (0x8000)        8d (0x08)          18d (0x12)         0d (0x00)           0d (0x00)           32_768d (0x8000)          1_079_296d (0x10_7800)
    556_032d (0x8_7C00)     22d (0x0_0016)     4d (0x0_0004)      0d (0x0_0000)       0d (0x0_0000)       556_032d (0x8_7C00)       556_032d (0x8_7C00)
    1_112_064d (0x10_F800)  18d (0x00_0012)    8d (0x00_0008)     0d (0x00_0000)      0d (0x00_0000)      0d (0x00_0000)            0d (0x00_0000)
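For anyone who wants to re-check the table, the values can be recomputed with a few lines of Perl. This is just a sketch; the helper name offsets is hypothetical, and the decoding offset is taken modulo M so that a full rotation shows up as 0 rather than M.

```perl
use strict;
use warnings;

# Effective encoding and decoding offsets for ROT-N on an
# M-character alphabet.
sub offsets {
    my ($n, $m) = @_;
    my $enc = $n % $m;
    my $dec = ($m - $enc) % $m;    # the outer % maps M back to 0
    return ($enc, $dec);
}

# One line per N, with an enc/dec pair for each alphabet size.
for my $n (13, 26, 128, 256, 8000, 0x8000, 556_032, 1_112_064) {
    printf "%9d:", $n;
    printf "  %d/%d", offsets($n, $_) for 26, 256, 1_112_064;
    print "\n";
}
```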

    Hope that helps.

      As I said, it's a misnomer.

      There is no formula with fixed N here because it operates with a 2^16 lookup table to avoid non-printable characters in both directions.

      Edit

      So the actual offset N(X) for a mapping

      Y = R(X)

      N(X) = N(Y) = |Y - X|

      will vary around 2^15 - 20 ± 8 (?).*

      And it ignores anything >= 2^16 like emojis, similar to ROT13 ignoring any ASCII outside the alphabet.

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

      *) Actually, the author's description is outdated and doesn't match his code.

      He's excluding 2100 characters.

Re^2: ROT8000 implementation?
by LanX (Saint) on Oct 15, 2021 at 23:47 UTC
    Hnngngng ... it's a misnomer

    AFAIS

    • it's only operating on UCS-2 and ignoring the planes above
    • it's ignoring whitespace, by its own definition of whitespace
    • it's avoiding some control characters

    So not just a simple rotate by 0x8000 = 2^15!

    One needs

    • an ordered list of all allowed characters
    • to split that list into two halves
    • to map the halves onto each other in a lookup hash with 2^16 entries
    • the ignored ones to map to themselves
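The recipe above could look roughly like this in Perl. Treat it as a sketch only: the exclusion rule in is_ignored (whitespace, control characters, surrogates) is an assumption made for illustration and is NOT the list used by rot8000.js, and the names build_table and rot_table are made up. Unlike the 2^16-entry hash described above, this version stores only the swapped characters and falls back to the character itself for everything else.

```perl
use strict;
use warnings;
use feature 'unicode_strings';

# Hypothetical exclusion rule (an assumption, not the rot8000.js list):
# skip surrogates, whitespace and control characters.
sub is_ignored {
    my ($cp) = @_;
    return 1 if $cp >= 0xD800 && $cp <= 0xDFFF;
    return chr($cp) =~ /[\s\p{Cc}]/ ? 1 : 0;
}

# Pair the i-th allowed character with the (i + half)-th one, in both
# directions, so the mapping is its own inverse.
sub build_table {
    my @allowed = grep { !is_ignored($_) } 0 .. 0xFFFF;
    my $half    = int(@allowed / 2);
    my %map;
    for my $i (0 .. $half - 1) {
        my ($lo, $hi) = map { chr($_) } @allowed[ $i, $i + $half ];
        @map{ $lo, $hi } = ($hi, $lo);
    }
    return \%map;    # an odd leftover character, if any, stays unmapped
}

# Characters without an entry (ignored ones, anything >= 2^16)
# map to themselves.
sub rot_table {
    my ($table, $text) = @_;
    return join '', map { $table->{$_} // $_ } split //, $text;
}
```

Usage would be along the lines of `my $t = build_table(); my $out = rot_table($t, $in);`, with the same call decoding again.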

    This JS implementation should be a good start for a Perl port

    https://github.com/rottytooth/rot8000/blob/main/rot8000.js

    You can use ord and chr to convert a character to its code point and back.

    HTH!

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
