I'm using LWP to download large files. To avoid holding the entire file in memory, I read it in smaller chunks (:read_size_hint => 4096) with a callback (:content_cb => \&mysub). What I want to do is remove a small header that will only be present in the first chunk. As I download that first chunk, I'd like to uncompress it, perform a substitution on the text (removing the small header), recompress the chunk, and write it out, followed by the subsequent chunks unchanged. That way I don't have to uncompress the whole file after the download and recompress it again.
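For concreteness, here is roughly what my download side looks like (the URL, output filename, and the placeholder comment are stand-ins; the real header rewrite would go where the comment is):

    use strict;
    use warnings;
    use LWP::UserAgent;

    my $url = 'http://example.com/large-file.gz';   # placeholder
    open my $out, '>', 'large-file.gz' or die "large-file.gz: $!\n";
    binmode $out;

    my $first = 1;   # true until the first chunk has been handled

    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get(
        $url,
        ':read_size_hint' => 4096,        # ask for ~4 KiB chunks
        ':content_cb'     => sub {
            my ($chunk, $response) = @_;
            if ($first) {
                # First chunk: this is where I want to uncompress,
                # strip the header, and recompress before writing.
                $first = 0;
            }
            print {$out} $chunk;
        },
    );
    die $res->status_line unless $res->is_success;
    close $out;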
I've tried to accomplish this with IO::Uncompress::Gunzip and IO::Compress::Gzip (since the Compress::Zlib documentation recommends them). See the test case below, which takes a gzip file and emulates this behaviour by splitting it into two chunks, modifying the first chunk, and gluing them back together.
This doesn't work: the second chunk is treated as trailing junk. Is it possible to do what I'm trying to do?
    use strict;
    use warnings;
    use IO::Uncompress::Gunzip qw($GunzipError);
    use IO::Compress::Gzip     qw($GzipError);

    my $gzfile = shift or die "usage: $0 file.gz\n";
    open my $fh, '<', $gzfile or die "$gzfile: $!\n";
    binmode $fh;
    my $buf = do { local $/; <$fh> };

    # Split the compressed file into a 4096-byte "first chunk" and the
    # rest, emulating what the :content_cb callback would see.
    my ($p1, $p2) = unpack 'a4096 a*', $buf;

    # Uncompress as much of the (truncated) first chunk as possible.
    my $ugz = IO::Uncompress::Gunzip->new(\$p1, Append => 1)
        or die "gunzip failed: $GunzipError\n";
    my $gbuf = '';
    1 while $ugz->read($gbuf) > 0;
    $ugz->close;

    # Strip the header: everything up to the first line that does not
    # start with '%'.
    $gbuf =~ s/.*?(?=^[^%\n])//ms;

    # Recompress the modified first chunk.
    my $cgz = IO::Compress::Gzip->new(\my $z, -Level => 9)
        or die "gzip failed: $GzipError\n";
    $cgz->syswrite($gbuf);
    $cgz->close;

    # Glue the recompressed first chunk and the untouched rest together.
    binmode STDOUT;
    syswrite STDOUT, $z;
    syswrite STDOUT, $p2;
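For what it's worth, if I run this as

    perl chunktest.pl some.gz > out.gz

(the script name is just what I call it locally) and then test the output with gzip -t, I get something like "gzip: out.gz: decompression OK, trailing garbage ignored", which is the trailing-junk behaviour I mean: the recompressed first chunk is a complete gzip member on its own, and the raw tail of the original stream appended after it is no longer valid gzip data.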
In reply to Can I modify a single chunk of a gzip stream? by Anonymous Monk