Essentially what you are saying is that Puppet is sending you non-standard (therefore non-YAML) data, and Perl's YAML modules don't handle it. Understandable :)

Decoding that long string manually shouldn't be a problem, though the format you describe means your 4MB jpg will come across as roughly 20MB of text.

If it were encoded as asciified hex bytes (technical term :), then decoding it would be simple, if horribly slow. Something like:

my $jpg = pack 'C*', map hex( '0' . $_ ), $content =~ m[\\(x[0-9a-fA-F]+)]g;

But your example, \x123\x321, shows values greater than \xff, which suggests that they are encoding Unicode characters rather than bytes. So you'd need something like:

my $jpg = pack 'U*', map hex( '0' . $_ ), $content =~ m[\\(x[0-9a-fA-F]+)]g;
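
To make the difference between the two templates concrete, here is a minimal sketch run against two made-up sample strings (both are assumptions for illustration, not your Puppet data). It captures only the hex digits rather than the leading x, which is a small variation on the lines above.

```perl
use strict;
use warnings;

# Hypothetical sample inputs; single quotes keep the backslashes literal.
my $byte_text = 'junk \x4a\x50\x47 junk';   # every value fits in one byte
my $wide_text = '\x123\x321';               # values above \xff

# In list context m//g returns every capture; map then sees each one in $_.
my $bytes = pack 'C*', map { hex($_) } $byte_text =~ m[\\x([0-9a-fA-F]+)]g;
my $chars = pack 'U*', map { hex($_) } $wide_text =~ m[\\x([0-9a-fA-F]+)]g;

print "$bytes\n";                                   # prints "JPG"
printf "%d characters: %s\n", length($chars),
    join ' ', map { sprintf '0x%x', ord($_) } split //, $chars;
# prints "2 characters: 0x123 0x321"
```

With 'C*' each value has to fit in a byte; with 'U*' each value becomes one character, which is why the second string ends up two characters long rather than a run of bytes.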

But whether you could then print that to a binary file without getting a bunch of "Wide character in print" warnings, or having the content messed with by IO layers, I have no idea.
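
One way to find out before writing anything is to test the decoded string for code points above 0xFF and open the output with the :raw layer. The sketch below is only that, a sketch: has_wide_chars and write_binary are hypothetical helper names, and the sample values are made up.

```perl
use strict;
use warnings;

# True if the string holds any character that cannot be a single raw byte.
sub has_wide_chars {
    my ($data) = @_;
    return $data =~ /[^\x00-\xFF]/ ? 1 : 0;
}

# Write $data byte-for-byte, refusing anything that is not pure bytes.
sub write_binary {
    my ( $path, $data ) = @_;
    die "data contains code points above 0xFF\n" if has_wide_chars($data);
    open my $fh, '>:raw', $path or die "Cannot open $path: $!";
    print {$fh} $data;
    close $fh or die "Cannot close $path: $!";
    return;
}

my $bytes = pack 'C*', 0xFF, 0xD8, 0xFF;   # byte values only
my $wide  = pack 'U*', 0x123, 0x321;       # the values from the example

print has_wide_chars($bytes) ? "wide\n" : "bytes only\n";
print has_wide_chars($wide)  ? "wide\n" : "bytes only\n";
```

If the check fires, the data is not a plain byte stream and printing it through a :raw handle is what produces the warning; what the sender meant by those characters would still have to be worked out.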

Also, be aware that not only will you have the original 20MB string and the 4MB result in memory, but also two very large lists of scalars: one passed to the map and one passed to the pack. How large will depend upon how the Unicode encoding splits the binary up into 'characters', but each list will be at least 1 million scalars and up to 4 million long.
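
If those intermediate lists turn out to be the problem, a scalar-context m//g loop sidesteps them by appending one decoded character at a time. A minimal sketch, assuming the same \xHEX format; decode_escapes is a hypothetical name, and the input is passed by reference so the big string is not copied:

```perl
use strict;
use warnings;

# Decode \xHEX escapes one at a time, without building any large lists.
sub decode_escapes {
    my ($content_ref) = @_;
    my $out = '';
    # In scalar context each m//g call resumes where the previous match ended.
    while ( $$content_ref =~ m[\\x([0-9a-fA-F]+)]g ) {
        $out .= chr hex $1;    # chr copes with values above 0xFF, like pack 'U'
    }
    return $out;
}

my $content = 'junk \x4a\x50\x47 junk';    # made-up sample input
print decode_escapes( \$content ), "\n";   # prints "JPG"
```

This trades the two big lists for a few million trips around a loop, so it is unlikely to be any faster; it just keeps peak memory close to the input plus the output.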

Seems like a really silly (slow, clumsy & laborious) way to transfer a file, given that LWP::Simple will transfer a 4MB binary file locally in less than a second.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^3: Load large(ish) files with YAML? by BrowserUk
in thread Load large(ish) files with YAML? by rgcosma
