
ROT8000 implementation?

by jwkrahn (Monsignor)
on Oct 15, 2021 at 21:32 UTC (#11137610)

jwkrahn has asked for the wisdom of the Perl Monks concerning the following question:

I was just reading a CRYPTO-GRAM article about rot8000 and was wondering if anyone was working on a Perl implementation?

FWIW, Google was no help.


Schneier on Security

rot8000 translator

Re: ROT8000 implementation?
by LanX (Sage) on Oct 15, 2021 at 21:45 UTC
    Supposing that ROT8000 means character rotation by 0x8000:

    This would only work for the old and obsolete UCS-2, which has only 0x10000 = 2^16 code points.°

    The implementation should be straightforward:

    For each character:

    • decode to codepoint
    • toggle high bit
    • encode
    Is this a joke or what am I missing?

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

    °) Because rot is self-inverting by definition: rot(rot(x)) = x, i.e. rot^2 = id
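The naive reading described above (decode, toggle the high bit, encode) can be sketched in a few lines of Perl. This is only an illustration of that literal "rotate by 0x8000" interpretation, not the algorithm used by the linked translator. The sub name `rot_0x8000_naive` is made up for this example, and leaving characters above U+FFFF untouched is an assumption.

```perl
use strict;
use warnings;

# Naive "ROT 0x8000": flip bit 15 of every code point in the BMP.
# Code points above U+FFFF pass through unchanged (an assumption).
sub rot_0x8000_naive {
    my ($text) = @_;
    return join '', map {
        my $cp = ord($_);
        $cp < 0x10000 ? chr( $cp ^ 0x8000 ) : $_;
    } split //, $text;
}

my $flipped = rot_0x8000_naive("Perl");
printf "U+%04X ", ord($_) for split //, $flipped;   # code points after the flip
print "\n";
print rot_0x8000_naive($flipped), "\n";             # applying it twice restores the input
```

Because XOR with 0x8000 is its own inverse, the same sub both encodes and decodes. It does, however, happily map printable characters onto surrogates and other unprintable code points.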

      Actually, my understanding has always been that ROT-N is just a notation for a Caesar cipher with a specified N. ROT13 is the case of a Caesar cipher rotated by 13d characters, which when using the 26d (2d*13d) character Latin alphabet means that rot(rot(x, 13d), 13d)=x. For any other N, deciphering would be the case of using (26d-N), or rot(rot(x, N), (26d-N))=x. Extending this further, given a ROT-N of an M-character alphabet, this becomes rot(rot(x, N), (M-N))=x. If N is larger than M, the encoding can be simplified to (N % M) and decoding to (M - (N % M)) (thus if M=26d, ROT-53d simplifies to ROT-1d, decoded by ROT-25d).
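A minimal sketch of the ROT-N idea from the paragraph above, restricted to the 26-letter Latin alphabet. The sub name `rot_n` is hypothetical, and passing non-letters through unchanged is an assumption that mirrors the usual ROT13 behaviour.

```perl
use strict;
use warnings;

# Rotate A-Z and a-z by $n positions; all other characters pass through.
sub rot_n {
    my ($text, $n) = @_;
    my $m = 26;          # alphabet size M
    $n %= $m;            # ROT-53 behaves like ROT-1
    $text =~ s{([A-Za-z])}{
        my $base = $1 le 'Z' ? ord('A') : ord('a');
        chr( $base + ( ord($1) - $base + $n ) % $m );
    }ge;
    return $text;
}

print rot_n("Hello, World!", 13), "\n";                # ROT13
print rot_n( rot_n("Hello, World!", 5), 26 - 5 ), "\n"; # decode ROT-5 with ROT-(M-N)
```

Decoding with `M - N` is the general case. ROT13 only appears self-inverting because 13 is exactly half of 26.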

      I have never heard of ROT-N notation being in anything but decimal (but that may also be my lack of exposure). As far as the most common encodings (UTF-8, UTF-16, and UTF-32), all support the 1_112_064d Unicode code points currently defined. Thus an N value of 556_032d (hex: 0x8_7C00) should result in the equivalent behavior for the existing defined code points to the ROT-13d with the 26d-character Latin alphabet (i.e., a self-decoding function).

      Below are the encoding and decoding rotations for 26d-, 256d-, and 1_112_064d-character "alphabets" for various N. Note that using 0x8000 (32_768d) rotations on a 256-character alphabet is the equivalent of "double ROT-13d encoding" on a 26-character alphabet, and that using the current number of code points (1_112_064d) has the same effect on both a 256-character and a 1_112_064-character alphabet.

      (If you find an error in my logic or values, please advise, so I can correct my understanding and/or data, as appropriate.)

      Rotations              | 26d-char encoding | 26d-char decoding | 256d-char encoding | 256d-char decoding | 1_112_064d-char encoding | 1_112_064d-char decoding
      -----------------------+-------------------+-------------------+--------------------+--------------------+--------------------------+-------------------------
      13d (0x0D)             | 13d (0x0D)        | 13d (0x0D)        | 13d (0x0D)         | 243d (0xF3)        | 13d (0x0D)               | 1_112_051d (0x10_F7F3)
      26d (0x1A)             | 0d (0x00)         | 0d (0x00)         | 26d (0x1A)         | 230d (0xE6)        | 26d (0x1A)               | 1_112_038d (0x10_F7E6)
      128d (0x80)            | 24d (0x18)        | 2d (0x02)         | 128d (0x80)        | 128d (0x80)        | 128d (0x80)              | 1_111_936d (0x10_F780)
      256d (0x100)           | 22d (0x016)       | 4d (0x004)        | 0d (0x000)         | 0d (0x000)         | 256d (0x100)             | 1_111_808d (0x10_F700)
      8000d (0x1F40)         | 18d (0x12)        | 8d (0x08)         | 64d (0x40)         | 192d (0xC0)        | 8000d (0x1F40)           | 1_104_064d (0x10_D8C0)
      32_768d (0x8000)       | 8d (0x08)         | 18d (0x12)        | 0d (0x00)          | 0d (0x00)          | 32_768d (0x8000)         | 1_079_296d (0x10_7800)
      556_032d (0x8_7C00)    | 22d (0x0_0016)    | 4d (0x0_0004)     | 0d (0x0_0000)      | 0d (0x0_0000)      | 556_032d (0x8_7C00)      | 556_032d (0x8_7C00)
      1_112_064d (0x10_F800) | 18d (0x00_0012)   | 8d (0x00_0008)    | 0d (0x00_0000)     | 0d (0x00_0000)     | 0d (0x00_0000)           | 0d (0x00_0000)
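The table entries follow from encode = N % M and decode = (M - N % M) % M, so they can be recomputed with a short script. This is only a sketch for checking the arithmetic. The helper name `effective_shifts` is invented here, and the extra `% M` on the decode side is added so that a no-op rotation reports 0 rather than M.

```perl
use strict;
use warnings;

# Effective encoding and decoding shifts for ROT-N over an M-symbol alphabet.
sub effective_shifts {
    my ($n, $m) = @_;
    my $enc = $n % $m;
    my $dec = ( $m - $enc ) % $m;   # 0 when the rotation is a no-op
    return ($enc, $dec);
}

for my $n (13, 26, 128, 256, 8000, 0x8000, 556_032, 1_112_064) {
    for my $m (26, 256, 1_112_064) {
        my ($enc, $dec) = effective_shifts($n, $m);
        printf "N=%-8d M=%-8d encode=%-8d decode=%d\n", $n, $m, $enc, $dec;
    }
}
```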

      Hope that helps.

        As I said, it's a misnomer.

        There is no formula with fixed N here because it operates with a 2^16 lookup table to avoid non-printable characters in both directions.


        So the actual N(X) for a mapping


        N(X)= N(Y)= Y-X

        will vary around approximately 2^15 - 20 ± 8 (?).*

        And it ignores anything >= 2^16 like emojis, similar to ROT13 ignoring any ASCII outside the alphabet.

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

        *) Actually the description of the author was outdated and doesn't fit his code.

        He's excluding 2100 characters.

      Hnngngng ... it's a misnomer


      • it's only operating on UCS-2 and ignoring the planes above
      • it's ignoring whitespace, by its own definition of whitespace
      • it's avoiding some control characters

      So not just a simple rotate by 0x8000 = 2^15!

      One needs

      • an ordered list of all allowed characters.
      • split that list into two halves
      • map the two halves onto each other in a lookup hash with 2^16 entries
      • the ignored ones map to themselves

      This JS implementation should be a good start for a Perl port

      You can use ord and chr to convert to codepoint and back


      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

Re: ROT8000 implementation?
by Takeshi Kovacs (Beadle) on Oct 16, 2021 at 16:30 UTC
    taking the example from as testcase

    FYI: # $chinese = "籝籱籮 籫籾籽籵籮类 籭籲籭 籲籽簪"

    use strict;
    use warnings;
    use utf8;
    use open ":std", ":encoding(UTF-8)";
    use Test::More;

    # rot8000 v1.005;

    my %rot;
    init_rot();

    my $english = "The butler did it!";
    my $chinese = "&#31837;&#31857;&#31854; &#31851;&#31870;&#31869;&#31861;&#31854;&#31867; &#31853;&#31858;&#31853; &#31858;&#31869;&#31786;";

    # fix PerlMonks' Uni<code>Mess
    $chinese =~ s/&#(\d+);/chr($1)/ge;

    is( rot8000($chinese), $english, "chin2engl -> $english" );
    is( rot8000($english), $chinese, "engl2chin -> $chinese" );

    my $random = join "", map { chr int rand 2**16-1 } 1..30;
    is( rot8000(rot8000($random)), $random, "Identity" );

    done_testing;

    # Translate each character through the lookup table;
    # characters without an entry map to themselves.
    sub rot8000 {
        my ($in) = @_;
        my $out;
        for my $char (split //, $in) {
            $out .= $rot{$char} // $char;
        }
        return $out;
    }

    # Build the lookup table: drop the excluded ranges from the BMP,
    # then map the lower half of what remains onto the upper half and back.
    sub init_rot {
        # (start, stop) pairs of excluded ranges, reversed so the
        # highest range is spliced out first and indices stay valid
        my @toggles = reverse (0, 33, 127, 161, 5760, 5761, 8192, 8203, 8232, 8234,
                               8239, 8240, 8287, 8288, 12288, 12289, 55296, 57344);
        my @allowed = map chr, 0 .. 2**16-1;
        while ( my ($stop, $start) = splice @toggles, 0, 2 ) {
            #say "$start-$stop";
            splice @allowed, $start, $stop-$start;
        }
        my @allowed_low = splice @allowed, 0, (@allowed/2);
        @rot{@allowed_low} = @allowed;
        @rot{@allowed}     = @allowed_low;
    }
Re: ROT8000 implementation?
by karlgoethebier (Abbot) on Oct 16, 2021 at 13:53 UTC

    If you really want to do this, why don’t you port the code from GitHub that you linked to Perl?


    «The Crux of the Biscuit is the Apostrophe»

      Obviously he asked before reinventing the wheel.

      Your post adds nothing new to the thread, or do you want to show us your implementation?

        «…show us your implementation?»

        Why should I? The OP himself linked to an implementation. Whether this implementation is good or valid or whatever you like is another question. And it is good and common practice to port an algorithm from one language to another. See Rosetta Code for some examples.

        «The Crux of the Biscuit is the Apostrophe»
