in reply to extracting binary file from XML

First off, whoever made the file format should be beaten with a sack of rusty doorknobs. :)

As for how to handle the abomination, it looks like base64-encoded binary data, so the approach will probably be to use MIME::Base64 to convert it back to the original octets, then toss those through something like Compress::Zlib or IO::Uncompress::Gunzip.
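
A rough sketch of that two-step approach, assuming the element content is base64 text wrapping a gzip stream (which may not be true of your file). The extract_payload helper and the sample data are made up for illustration; only the MIME::Base64 and IO::Compress / IO::Uncompress calls are real.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64 encode_base64);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper: turn the text of one XML element back into the
# original bytes. Assumes base64 text, possibly wrapping a gzip stream.
sub extract_payload {
    my ($b64_text) = @_;

    # decode_base64 skips whitespace and other non-base64 characters,
    # so line breaks inside the element are harmless.
    my $octets = decode_base64($b64_text);

    # Gzip streams begin with the magic bytes 0x1f 0x8b; anything else is
    # returned as-is on the assumption that it was never compressed.
    return $octets unless substr($octets, 0, 2) eq "\x1f\x8b";

    my $plain;
    gunzip(\$octets => \$plain)
        or die "gunzip failed: $GunzipError\n";
    return $plain;
}

# Round trip with made-up data standing in for the real element content.
my $original = "binary\x00payload\x01\x02" x 10;
my $zipped;
gzip(\$original => \$zipped) or die "gzip failed: $GzipError\n";
my $element_text = encode_base64($zipped);

my $recovered = extract_payload($element_text);
print $recovered eq $original ? "round trip ok\n" : "mismatch\n";
```

If the decoded octets are already the file you want, the gunzip step simply never runs.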

The cake is a lie.

Re^2: extracting binary file from XML
by ultraman (Novice) on Mar 09, 2009 at 18:48 UTC

    it works!!!!

    the MIME::Base64 managed to "reconstruct" the octet streams/files...

    i'll polish up the code and post it later.

    my surviving brain cells salute you!!! :-)

Re^2: extracting binary file from XML
by Jenda (Abbot) on Mar 10, 2009 at 15:37 UTC

    First on, what would you suggest? XML will not let you include arbitrary bytes in the document, not even if you escape them as numeric character references, so you do have to use something like base64 to ensure the stuff you put into the XML document contains only safe characters. Preferably those that are the same in Latin1 and UTF-8 (and Latin2 and ...). There is no way to escape certain bytes (the non-printable characters in Latin1, if you will) so that the result is valid XML and the parser returns those bytes. Sad but true.
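
    To make the point concrete, here is a small sketch (the variable names are mine, nothing from the original file) that encodes every possible byte value and checks that the result uses only the base64 alphabet, all of it plain ASCII, and still decodes back to the same bytes:

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Every possible byte value, including the control characters that
# XML 1.0 rejects even when written as character references.
my $bytes = join '', map { chr } 0 .. 255;

# The second argument is the line terminator; '' yields one unbroken line.
my $encoded = encode_base64($bytes, '');

# The base64 alphabet is A-Z, a-z, 0-9, '+', '/', plus '=' padding.
my $is_safe    = $encoded =~ m{\A[A-Za-z0-9+/]+={0,2}\z} ? 1 : 0;
my $round_trip = decode_base64($encoded) eq $bytes       ? 1 : 0;

print "safe for XML text: $is_safe\n";
print "round trip intact: $round_trip\n";
```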

      Right, which is precisely why XML makes a poor substrate for wrapping up arbitrary binary data. My point was primarily that it just seems a . . . strange choice. There are better solutions for this kind of thing: bundling up files with metadata (with XML for just said metadata being a better fit).

      Update: Added parenthetical about XML for metadata.

Re^2: extracting binary file from XML
by Your Mother (Archbishop) on Mar 09, 2009 at 23:30 UTC
    First off, whoever made the file format should be beaten with a sack of rusty doorknobs. :)

    Damn you! I almost spit coffee all over my keyboard.