in reply to Re: Displaying NUL in a TK::Text widget
in thread Displaying NUL in a TK::Text widget

It's not that the NUL is screwing up my other output...I want the NUL. I'm trying to create an editor for some binary data...NULs are likely in that data. I noticed that for the other "non-printing" characters, it put in an escape for the char (e.g. \x{1} for ASCII-1). I wonder by what mechanism that's happening and if it can be extended to include NUL as well...

thor

Feel the white light, the light within
Be your own disciple, fan the sparks of will
For all of us waiting, your kingdom will come

Re^3: Displaying NUL in a TK::Text widget
by graff (Chancellor) on Nov 07, 2004 at 09:30 UTC
    I'm curious what Tk would be displaying for characters in the range [\x7f-\xff] (that is, the ASCII "DEL" code and byte values with the 8th bit set). I'd expect these to be likely in binary data as well.

    If the plan is for a person to use a GUI to edit binary data, I think it would be better for the widget to be displaying some consistent projection of the data into visible characters, rather than pumping the raw binary data directly to the widget.

    For example, you could display the byte stream as a space-separated sequence of two-digit hex numbers (or three-digit octal or even decimal); or a combination of visible ASCII characters plus "escape" or "control" strings like "\n" or "^J"; or maybe use a font that combines "normal" ASCII characters with those nifty two-letter abbreviations for control codes -- Zaxo showed a cool trick using Unicode for this: Printing the Unprintable (but as indicated in that thread, there might not be displayable glyphs for all possible byte values).
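    To make the first option concrete, here is a minimal sketch of a hex projection and its inverse. The sub names bytes_to_hex and hex_to_bytes are made up for illustration, and the sketch assumes the data is a plain byte string (no characters above 0xFF).

```perl
use strict;
use warnings;

# Project a byte string into space-separated two-digit hex numbers.
sub bytes_to_hex {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# Inverse: turn the (possibly edited) hex text back into bytes.
# Dies on any token that is not exactly two hex digits.
sub hex_to_bytes {
    my ($text) = @_;
    my @tokens = split ' ', $text;
    for my $t (@tokens) {
        die "bad hex byte '$t'\n" unless $t =~ /\A[0-9A-Fa-f]{2}\z/;
    }
    return join '', map { chr(hex($_)) } @tokens;
}

my $data  = "\000japh\n\377";
my $shown = bytes_to_hex($data);
print "$shown\n";    # 00 6A 61 70 68 0A FF
```

    Every byte, NUL included, takes the same width on screen, and the text in the widget can be converted back with hex_to_bytes when saving.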

    I'm also curious what sort of technique you provide for keyboarding arbitrary binary values (when the user needs to add or change a byte value). Obviously, if you display just space-separated numerics (hex, octal or decimal), the user could just type in digits; or if you're showing things like "^J", accept strings like that for input. You could even offer the users a choice of display/keyboarding methods.

    The main point, though, is that you should have some sort of transform between the binary data in a file and the displayable/editable data in the GUI -- not only to make sure that everything can be seen and typed in, but also to eliminate any possible ambiguities in the display (e.g. space vs. tab vs. LF vs CRLF).
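    As a rough illustration of such a transform, the sketch below uses the \x{hh} notation already seen in this thread. The names to_display and from_display are hypothetical, and escaping the backslash itself is my own assumption: it is what keeps a literal backslash in the data from being confused with the start of an escape.

```perl
use strict;
use warnings;

# Binary -> display: keep printable ASCII except the backslash,
# and show every other byte as \x{hh}.
sub to_display {
    my ($bytes) = @_;
    (my $text = $bytes) =~ s/([^\x20-\x5B\x5D-\x7E])/sprintf('\\x{%02x}', ord($1))/ge;
    return $text;
}

# Display -> binary: undo every \x{hh} escape.
sub from_display {
    my ($text) = @_;
    (my $bytes = $text) =~ s/\\x\{([0-9a-fA-F]{2})\}/chr(hex($1))/ge;
    return $bytes;
}

my $raw = "a\tb\r\n\000";
print to_display($raw), "\n";    # a\x{09}b\x{0d}\x{0a}\x{00}
```

    Tab, CR, LF and NUL each get a distinct visible form, and from_display(to_display($raw)) gives back the original bytes.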

Re^3: Displaying NUL in a TK::Text widget
by tachyon (Chancellor) on Nov 07, 2004 at 12:01 UTC

    As graff notes, you can't just pump raw binary in and expect it to display in any logical fashion. But given that you like the \x{1} notation, why not do something like:

    $_ = "\000japh\njareh\000";
    # escape non-printable bytes as \x{hh}
    s/([^\040-\176])/sprintf "\\x{%02x}", ord($1)/eg;
    print;

    That just hands the characters to Tk the way you seem to want them displayed. Personally, I would suggest a hex editor format like this typical output, which has three columns (offset, hex, ASCII):

    File: jargon-4.4.7.tar.gz size = 9061260 bytes 0% [H] Press 'h' for help
    00000000: 1F 8B 08 00 3E 74 F0 3F 00 03 EC FD 07 3C 5C DF ....>t.?.....<\.
    00000010: DF 2F 8A 8F 16 BD D7 28 51 93 E8 8C 3E 44 1B 7D ./.....(Q...>D.}

    With your solution, printable ASCII takes one character width but non-printables take either 5 or 6. With hex, every byte is always 2 characters wide (3 for decimal or octal), and you can display the printable ASCII as a separate column.
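    A minimal sketch of a formatter for that three-column layout follows. The sub name hexdump_lines and the fixed 16 bytes per row are assumptions chosen for illustration; the input is assumed to be a plain byte string.

```perl
use strict;
use warnings;

# Format a byte string as "offset: hex bytes ascii" rows, 16 bytes per row.
sub hexdump_lines {
    my ($bytes) = @_;
    my @lines;
    for (my $off = 0; $off < length($bytes); $off += 16) {
        my $chunk = substr($bytes, $off, 16);
        my $hex   = join ' ', map { sprintf '%02X', ord($_) } split //, $chunk;
        # Anything outside printable ASCII is shown as a dot.
        (my $ascii = $chunk) =~ tr/\x20-\x7E/./c;
        # 47 = 16 two-digit bytes plus 15 separating spaces,
        # so a short final row still lines up.
        push @lines, sprintf('%08X: %-47s %s', $off, $hex, $ascii);
    }
    return @lines;
}

print "$_\n" for hexdump_lines("\000japh\njareh\000 and some more bytes");
```

    Because every byte occupies a fixed slot, the cursor position in the hex column maps directly to a byte offset, which is harder to get with variable-width escapes.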

    cheers

    tachyon

      This method gets the display right, however, as long as I'm asking for the world...:)

      When Tk handles the conversion of non-printable characters and I use the arrow keys to navigate in the text pane, it treats the non-printable as one character. That is to say that if I position the cursor right before the "\" and hit right arrow once, the cursor ends up after the "}". With this solution, that is not the case as I'm replacing one character with several. The translation is happening in such a way that it displays perl's notion of hex characters. I just don't know where that translation is happening.

      Alternatively, if I can tell Tk to treat that series of characters (\x{10}) as only one character, that'd be acceptable too.

      Finally, I'd like to thank everyone for their help thus far.

      thor


        I wish I had a better solution to recommend, but all I can think of is to:
        • put a tag around the text
        • when there is input, see if the "insert" mark moved past the start of a tag of type mytag
        • if so, move the "insert" mark past the end of the tag.
        A lot of work, but worth it IMO if you're going to use this thing a lot.
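        The cursor logic in those steps can be sketched without Tk at all. This is only a model under my own assumptions: the sub names escape_spans and snap_cursor are hypothetical, positions are plain character offsets into the displayed string, and the spans are found with a regex. In a real widget they would presumably come from the ranges of the tag instead.

```perl
use strict;
use warnings;

# Return [start, end) offsets of every \x{hh} escape in the display text.
sub escape_spans {
    my ($text) = @_;
    my @spans;
    while ($text =~ /\\x\{[0-9a-fA-F]+\}/g) {
        push @spans, [ $-[0], $+[0] ];
    }
    return @spans;
}

# If $pos falls strictly inside an escape, move it to the far side:
# the end when moving right ($dir > 0), the start when moving left.
sub snap_cursor {
    my ($pos, $dir, @spans) = @_;
    for my $span (@spans) {
        my ($start, $end) = @$span;
        return $dir > 0 ? $end : $start if $pos > $start && $pos < $end;
    }
    return $pos;
}

my @spans = escape_spans('ab\x{00}cd');
# Cursor sat just before the backslash (offset 2); right arrow moved it to 3.
print snap_cursor(3, 1, @spans), "\n";    # 8, just after the closing brace
```

        Running a check like this after each cursor movement would make the whole escape behave as one character for the arrow keys.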

        You also might want to consider subclassing Tk::Text to support all these changes. Maybe overriding insert to make the transformation before handing it off to SUPER::insert...


        --
        Snazzy tagline here