in reply to Re^3: Tk:Photo -data with binary data?
in thread Tk:Photo -data with binary data?

Here's a patch against Tk804.027 which adds binary data support, but for GIF only. The patch includes additional tests in the t/photo.t file.

I found that the internal Tk image handling code is already binary-ready, at least for the GIF format. Unfortunately Perl/Tk converts every string into a utf8-flagged string, which means that the data has again to be converted back to bytes before handing to the image handling code, which consumes space and time. It seems to be non-trivial to avoid these unnecessary conversions.

--- t/photo.t (revision 1365) +++ t/photo.t (revision 1367) @@ -14,7 +14,7 @@ $numFormats++ unless $@; my $mw = MainWindow->new(); $mw->geometry('+100+100'); -plan tests => (2*(7 * $numFormats) + 2); +plan tests => (2*(7 * $numFormats) + 2 + 2); my @files = (); @@ -81,6 +81,38 @@ foreach my $leaf('Tk.xbm','Xcamel.gif') } $row += 3; } + +# Extra tests +my $col = 0; +$mw->Label(-text => "Extra tests")->grid(-row => $row++, -column => $ +col); +my $file = Tk->findINC('Xcamel.gif'); +my $data = do { open my $fh, $file or die $!; local $/; <$fh> }; + +if ($Tk::VERSION <= 804.027) + { + skip("Binary GIF data not supported",1,1); + } +else + { + my $image = $mw->Photo(-data => $data); + ok(defined $image, 1, "Read binary GIF data"); + $mw->Label(-background => 'white', -image => $image)->grid(-row => +$row, -column => $col); + $mw->update; + } +$col++; + +if (!eval { require MIME::Base64; 1 }) + { + skip("Need MIME::Base64 module",1,1); + } +else + { + my $image = $mw->Photo(-data => MIME::Base64::encode_base64($data)) +; + ok(defined $image, 1, "Read base64 encoded GIF data"); + $mw->Label(-background => 'white', -image => $image)->grid(-row => +$row, -column => $col); + $mw->update; + } +$col++; $mw->after(2500,[destroy => $mw]); MainLoop; --- objGlue.c (revision 1365) +++ objGlue.c (revision 1367) @@ -584,7 +584,17 @@ Tcl_GetByteArrayFromObj(Tcl_Obj * objPtr /* FIXME: presumably should downgrade from UTF-8, what frees it ? */ - return (unsigned char *) Tcl_GetStringFromObj (objPtr, lengthPtr); + /* SRT: Is this correct? */ + sv_utf8_downgrade(objPtr, 0); + if (lengthPtr) + { + return (unsigned char *) SvPV(objPtr, *lengthPtr); + } + else + { + return (unsigned char *) SvPV(objPtr, PL_na); + } +/* return (unsigned char *) Tcl_GetStringFromObj (objPtr, lengthPtr); + */ }

Replies are listed 'Best First'.
Re^5: Tk:Photo -data with binary data?
by BrowserUk (Patriarch) on Dec 14, 2006 at 05:13 UTC

    Ike. That's horrible.

    Taking a few brief moments to try and unwind what sv_utf8_downgrade() does, it appears to involve multiple passes (2 or 3) of the data, potentially with one of those passes calling a function on every 'char' in the data! I'll admit, my attempt to unwind the layers was cursory and I may have missed some subtlety in the macros/nested function call definitions.

    You'd have to benchmark to be sure, but given that base64 encoding and decoding is a single pass process either way, it's possible that incurring a bytes->utf & utf->bytes cycle in order to avoid the binary->base64 & base64->binary could well be a deoptimisation? Of course, I'm assuming that Perl/Tk doesn't also convert the base64 to utf?

    Without even having begun to have considered any other implications, it seems to me that the idea of passing a pointer to the binary image data, directly into the image handling code makes even more sense?

    There's also that passing (say) a 1024x768x24 image as binary data in GD's gd(1) format involves 3 MB of uncompressed data that these conversions would need to process (along with all the memory alloc/deallocing that would involve), whereas passing (say) png, that same data could be only a few Kb in it's compressed form. It's possible that the extra processing required to do/undo the utf conversions on the uncompressed data would counterbalance the savings in not compressing/decompressing through an intermediat format?

    Geez. Unicode certainly has a lot to answer for. Perl (and everything else) really needs a new variant of the SvPV that has a flag (or combination of existing flags) that says 'This is binary data of CUR bytes. It should *never* be implicitely converted, inspected for conversion, nor up or downgraded in any way'. And all the macros/functions that do any of those things just see that flag and do nothing.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Still, using binary data seems to be faster:
      use Tk; use Benchmark qw(cmpthese); use MIME::Base64 qw(encode_base64); use LWP::Simple qw(get); my $mw = tkinit; my $file = Tk->findINC("Xcamel.gif"); my $data = get "file://$file"; # poor man's File::Slurp cmpthese(-1, { base64 => sub { my $p = $mw->Photo(-data => encode_base64($data)); $p->delete; }, binary => sub { my $p = $mw->Photo(-data => $data); $p->delete; } } );
      The result is:
      $ perl5.9.5d -Mblib=/home/e/eserte/work/svk-checkouts/Tk-debug ~/trash +/imgbench.pl Rate base64 binary base64 574/s -- -38% binary 926/s 61% --
      I agree that the performance gain would be larger if no (unnecessary) conversion to utf-8 would occur at all. Maybe it's somehow possible to prevent this, for example by using a special reference or magics for the value of -data, which wouldn't be upgraded by Tcl_GetString calls.

      But I think I have first to learn how the Perl/Tk internals work at all...

        Still, using binary data seems to be faster:

        Definitely worth having. I look forward to it filtering it's way through the chain into a an AS distribution.

        Maybe it's somehow possible to prevent this, for example by using a special reference or magics for the value of -data, which wouldn't be upgraded by Tcl_GetString calls.

        I'm not sure what you mean by a "special reference"?

        To me, it doesn't make sense to pass such large volumes of data (as images tend to be) by value. Especially if the receiver is going to have to make another copy when it gets it. If the call was designed to take a reference, then it could read the data directly from the callers space as it does whatever (the first) conversion is needed.

        That needn't even change the current interface. It just needs to test if it's been given a reference and do the current thing if not.

        But I think I have first to learn how the Perl/Tk internals work at all...

        Out of interest, how big are the sources for Perl/Tk? Anything beyond a few 10s of MB, and it gets silly trying to get an up to date copy via a 40kb/s connection. It changes faster than I can download it.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
      Of course, I'm assuming that Perl/Tk doesn't also convert the base64 to utf?
      Everything's upgraded to utf-8. In case of base64 data this means that the data is left unchanged, other with an added utf-8 flag. But Perl still has to loop over the base64 data to check if there are some high-bit characters.