
Under normal mouse mode, positions outside (160,94) result in byte pairs which can be interpreted as a single UTF-8 character;

For there to be an issue, a sequence of UTF-8 characters has to be interpreted as an escape sequence, not the other way around.

From higher up in that linked document comes this:

The xterm program recognizes both 8-bit and 7-bit control characters. It generates 7-bit controls (by default) or 8-bit if S8C1T is enabled.

It proceeds to say 0x9B and ESC [ are equivalent, for example.

More relevant, it says 0x90 and ESC P are equivalent. U+05D0 is 0xD7 0x90 in UTF-8.
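To make that overlap concrete, here is a minimal sketch that encodes a character as UTF-8 and picks out any bytes in the 0x80-0x9F range, which a terminal honouring 8-bit controls could mistake for control bytes if it isn't decoding UTF-8 first. The helper names utf8_bytes and c1_bytes are made up for this illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Return the UTF-8 bytes of a character as a list of integers.
# (Hypothetical helper, named for this example only.)
sub utf8_bytes {
    my ($char) = @_;
    return unpack 'C*', encode('UTF-8', $char);
}

# Return those bytes that fall in the 8-bit control range 0x80-0x9F.
# (Hypothetical helper, named for this example only.)
sub c1_bytes {
    return grep { $_ >= 0x80 && $_ <= 0x9F } @_;
}

my @bytes = utf8_bytes("\x{05D0}");
printf "U+05D0 as UTF-8: %s\n",
    join ' ', map { sprintf '0x%02X', $_ } @bytes;
printf "Bytes in the 8-bit control range: %s\n",
    join ' ', map { sprintf '0x%02X', $_ } c1_bytes(@bytes);
```

For U+05D0 this reports 0xD7 0x90, with 0x90 as the only byte in the control range.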

Are these two commands equivalent for you?

perl -e'print "\x1B[31m", "foo", "\x1B[0m", "bar", "\n";'
perl -e'print "\x9B31m", "foo", "\x9B0m", "bar", "\n";'

Perhaps you can tell xterm to stop recognising the "8-bit" codes.

Re^6: Printing the first letter of the Hebrew alphabet (U05D0) kills script?
by ikegami (Patriarch) on Mar 08, 2011 at 22:27 UTC

    From "man xterm":

    Modes for setting keyboard style:

    8-Bit Controls (8-bit-control)

    Enabled for VT220 emulation, this controls whether xterm will send 8-bit control sequences rather than using 7-bit (ASCII) controls, e.g., sending a byte in the range 128-159 rather than the escape character followed by a second byte. Xterm always interprets both 8-bit and 7-bit control sequences (see the document Xterm Control Sequences). This corresponds to the eightBitControl resource.

    (Xterm Control Sequences is the document to which you linked earlier.)

    But for some reason, the 8-bit control sequence I posted above isn't recognised by my xterm.

    Ideally, it would recognise the sequences only between characters, but maybe it's detecting the sequences in the middle of characters too.

      While a raw 0x9B doesn't work on my (non-xterm) console, the UTF-8 encoding of 0x9B works!

      # Doesn't work
      perl -we'print "\x9B31m", "foo", "\x9B0m", "bar", "\n";'

      # Works
      perl -CS -we'print "\x9B31m", "foo", "\x9B0m", "bar", "\n";'

      Neither works in an xterm for me.
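The difference between those two commands is in the bytes that reach the terminal. As a rough sketch (the hex_bytes helper is made up for this example, and encode() is used here to stand in for the UTF-8 output layer that -CS puts on STDOUT), this prints what each variant sends for the "\x9B" character. That a console decoding UTF-8 would accept the two-byte form but reject the lone byte is my assumption about why only one of them works, not something I have confirmed.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hex dump of a byte string, e.g. "c2 9b".
# (Hypothetical helper, named for this example only.)
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02x', $_ } unpack 'C*', $bytes;
}

my $csi = "\x9B";    # U+009B, the 8-bit form of ESC [

# Without -CS, print writes the character as the single byte 0x9B.
print "without -CS: ", hex_bytes($csi), "\n";

# With -CS, STDOUT encodes to UTF-8, so U+009B goes out as two bytes.
print "with -CS:    ", hex_bytes(encode('UTF-8', $csi)), "\n";
```

The first line shows 9b and the second c2 9b.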