Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot
 
PerlMonks  

cksum contents of a tarball

by diotalevi (Canon)
on Sep 14, 2006 at 22:17 UTC ( [id://573018]=sourcecode: print w/replies, xml ) Need Help??
Category: Utility Scripts
Author/Contact Info
Description: Produces cksum info on the contents of a tarball while leaving the minimum files extracted at any given moment.
#!/usr/bin/perl
## no critic pod
use strict;
use warnings;

=head1 NAME

cksum-tgz - Checksum the contents of a tarball

=head1 SYNOPSIS

  cksum-tgz tarball.tgz > tarball.cksum

  Options:
    --help
    --man

=head1 DESCRIPTION

This is kind of equivalent to unpacking a tarball, running cksum on
everything in it, then deleting the stuff that was unpacked. It
attempts to keep a minimum of files around on the disk while
operating.

=head1 OPTIONS

=over

=item C<--help>

Shows a short help message.

=item C<--man>

Even more help.

=back

=cut

use Getopt::Long 'GetOptions';
use autouse 'Carp'       => 'croak';
use autouse 'Pod::Usage' => 'pod2usage';

GetOptions(
    man  => sub { pod2usage( -verbose => 2 ) },
    help => sub { pod2usage( -verbose => 1 ) },
    )
    or pod2usage( -verbose => 0 );
@ARGV == 1 or pod2usage( -verbose => 0 );
my ($tgz) = shift @ARGV;

## no critic noisy
open my $tgz_fh, '-|', 'tar', 'xvzf', $tgz
    or croak "Can't tar xvzf $tgz: $!";

# tar will print a file before it is done so I have this reader to
# wait for the next file to get mentioned.
my $file_reader = do {
    my @files;
    sub {

        # Add to @files if necessary and possible.
        while ( @files < 2 and $tgz_fh ) {
            chomp( my $file = <$tgz_fh> );
            if ( not defined $file ) {
                close $tgz_fh;
                undef $tgz_fh;
            }
            else {
                push @files, $file;
            }
        }

        if (@files) {
            return shift @files;
        }
        else {
            return;
        }
    };
};

# For each file, cksum it.
# For each directory, plan to remove it in LIFO order.
my @directories;
while ( my $file = $file_reader->() ) {
    if ( -f $file ) {
        0 == system 'cksum', $file
            or croak "Can't exec cksum $file: $?";
        unlink $file
            or croak "Can't unlink $file: $!";
    }
    elsif ( -d _ ) {
        unshift @directories, $file;
    }
}

# Remove my directories in LIFO order.
while (@directories) {
    my $size_before = @directories;
    @directories = grep { not rmdir } @directories;
    last if @directories == $size_before;
}
Replies are listed 'Best First'.
Re: cksum contents of a tarball
by dcd (Scribe) on Sep 15, 2006 at 21:58 UTC
    Not really a comment on the code, but the ideas it triggered. I know perl has a way to create tar files (even find2perl can create tar files), and Archive::Tar can read as well as write files.
    Did you need to create temporary files because cksum couldn't take input from stdin?
    It would be nice if we could treat tar files as file system - but that would require special modules loaded into linux, although maybe Perl could virtualize it as a file system.

      Archive::Tar requires that you load the entire tarball into memory. My tarballs are around two gigs.

      ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: sourcecode [id://573018]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others surveying the Monastery: (3)
As of 2024-04-19 23:42 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found