I'm using LWP to download large files. To avoid holding an entire file in memory, I read it in smaller chunks (:read_size_hint => 4096) and handle each chunk in a callback (:content_cb => \&mysub). What I want to do is remove a small header that is only present in the first chunk. So as I download that first chunk, I'd like to uncompress it, perform a substitution on the text (removing the small header), recompress the chunk, and write it out, followed by the subsequent chunks unchanged. This way I don't have to uncompress the whole file after download and recompress it again.

I've tried to do this with IO::Uncompress::Gunzip and IO::Compress::Gzip (since Compress::Zlib recommends using them). See the test case I've included, which takes a gzip file and emulates this behaviour by splitting it into two chunks, modifying the first chunk, and gluing the two back together.

This doesn't work, as the second chunk is considered trailing junk. Is it possible to do what I'm trying?

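use strict;
use warnings;
use IO::Uncompress::Gunzip;
use IO::Compress::Gzip;

my $gzfile = shift;
open my $fh, '<', $gzfile or die "$gzfile: $!\n";
binmode $fh;
my $buf = do { local $/; <$fh> };

# Emulate two download chunks: the first 4096 bytes and the rest.
my ($p1, $p2) = unpack("a4096a*", $buf);

# Uncompress the first chunk only.
my $ugz = IO::Uncompress::Gunzip->new(\$p1);
$ugz->read(my $gbuf);
$ugz->close;

# Remove the header: everything before the first line that is
# neither empty nor starts with '%'.
$gbuf =~ s/.*?(?=^[^%\n])//ms;

# Recompress the modified first chunk as its own gzip stream.
my $cgz = IO::Compress::Gzip->new(\ my $z, -Level => 9);
$cgz->syswrite($gbuf);
$cgz->close;

# Write the recompressed first chunk, then the untouched remainder.
syswrite STDOUT, $z;
syswrite STDOUT, $p2;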
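For comparison, here is a minimal sketch of a streaming alternative. It rests on one assumption about why the test case fails: the recompressed first chunk is closed as a complete gzip stream with its own trailer, so the raw remainder that follows it is the middle of a different deflate stream rather than a valid continuation. The sketch therefore does not splice compressed data. It inflates every chunk as it arrives and deflates the text again, so only one chunk is in memory at a time. It uses Compress::Raw::Zlib for its push-style interface. The name make_header_stripper and the two closures it returns are hypothetical, not part of any module.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib;

# Hypothetical helper: returns ($feed, $finish).
#   $feed->($chunk)  takes the next piece of the compressed download
#   $finish->()      flushes the compressor once the download is done
# The re-compressed stream is printed to $out, which should be in binmode.
sub make_header_stripper {
    my ($out) = @_;

    my ($inf, $st) = Compress::Raw::Zlib::Inflate->new(-WindowBits => WANT_GZIP);
    die "inflate init failed: $st" unless $st == Z_OK;

    (my $def, $st) = Compress::Raw::Zlib::Deflate->new(
        -WindowBits => WANT_GZIP,
        -Level      => Z_BEST_COMPRESSION,
    );
    die "deflate init failed: $st" unless $st == Z_OK;

    my $pending     = '';   # uncompressed text seen before the header ended
    my $header_done = 0;

    # Compress a piece of text and write whatever the compressor emits.
    my $emit = sub {
        my ($text) = @_;
        return unless length $text;
        my $z = '';
        my $s = $def->deflate($text, $z);
        die "deflate failed: $s" unless $s == Z_OK;
        print {$out} $z if length $z;
    };

    my $feed = sub {
        my ($chunk) = @_;            # copy, because inflate() consumes its input
        return unless length $chunk;
        my $text = '';
        my $s = $inf->inflate($chunk, $text);
        die "inflate failed: $s" unless $s == Z_OK || $s == Z_STREAM_END;

        if (!$header_done) {
            # Same substitution as the test case, retried until enough
            # text has arrived for the end of the header to be visible.
            $pending .= $text;
            return unless $pending =~ s/.*?(?=^[^%\n])//ms;
            $header_done = 1;
            ($text, $pending) = ($pending, '');
        }
        $emit->($text);
    };

    my $finish = sub {
        # If the header never ended, pass the buffered text through unchanged.
        $emit->($pending) if length $pending;
        my $z = '';
        my $s = $def->flush($z);     # writes the final block and gzip trailer
        die "flush failed: $s" unless $s == Z_OK;
        print {$out} $z if length $z;
    };

    return ($feed, $finish);
}
```

With LWP this would be wired up roughly as $ua->get($url, ':read_size_hint' => 4096, ':content_cb' => sub { $feed->($_[0]) }), followed by $finish->() after the request returns. The cost is that the whole file is recompressed, but it happens during the download and never needs the full file in memory.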

In reply to Can I modify a single chunk of a gzip stream? by Anonymous Monk
