jujiro_eb has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

Please help me out here. I am struggling with creating a tarball containing the traversed tree of files, keeping their relative pathnames intact.

It was a piece of cake with Archive::Zip using code like the following:

my $obj = Archive::Zip->new();    # new instance
$obj->addTree( "$build_home/.", 'myapp' );
$zip_file = "$build_dest/ama_" . $myapp_version . "_win32.zip";
unlink $zip_file;
$obj->writeToFileNamed($zip_file);
With Archive::Tar I am not able to preserve the paths at all. Please do pardon my ignorance, but Archive::TarGzip is really not very well documented. My only other option for creating a tar.gz file would be the tar utility on the command line, and being on the Windows platform makes that impossible.

Can you please give me some pointers? I would greatly appreciate the help.

Ash

Replies are listed 'Best First'.
Re: Question on Archive::Tar vs Archive::TarGzip
by graff (Chancellor) on May 08, 2009 at 03:19 UTC
    Maybe you should show us the code you tried with Archive::Tar.

    Did your attempt look anything like this?

    perl -MFile::Find -MArchive::Tar -e "$t=Archive::Tar->new(); find(sub{-f && $t->add_files($_)},$ARGV[0]); $t->write(join('.',$ARGV[0],'tar'))" some_path
    Note that the total size of the directory being tarred is an important factor; if it's really big (i.e., >= your machine's RAM), Archive::Tar will cause trouble, because all the data needs to be memory-resident (with substantial overhead) before the tar file can be written. "Trouble" here could mean anything from "just slows everything down a lot" (because of massive virtual-memory swapping) to "blue screen of death" (due to some critical state caused by over-consumption of resources).

    If there are ways to break up a large data set into pieces of reasonable size (and if it's not absurd to make a separate tar file for each piece), then that will be the way to go.
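    For illustration, a minimal sketch of that splitting idea (not from the original reply; the sub name and layout are invented): write one tar per top-level subdirectory, so only one subtree is held in memory at a time.

```perl
use strict;
use warnings;
use File::Find;
use Archive::Tar;

# Hedged sketch: one tar file per top-level subdirectory of $root,
# so no single Archive::Tar object has to hold the whole tree in RAM.
sub tar_per_subdir {
    my ($root) = @_;
    opendir my $dh, $root or die "cannot open $root: $!";
    for my $entry ( grep { !/^\.\.?$/ } readdir $dh ) {
        next unless -d "$root/$entry";
        my $tar = Archive::Tar->new;
        # no_chdir => 1 keeps $_ as the relative path, so the
        # directory structure survives inside each archive.
        find(
            { wanted => sub { $tar->add_files($_) if -f }, no_chdir => 1 },
            "$root/$entry",
        );
        $tar->write("$entry.tar");
    }
    closedir $dh;
}
```

    Usage would be something like tar_per_subdir('some_path'), producing one .tar per subdirectory in the current directory.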

    I just tried a command line like the one above (but with different quotes, because I prefer a bash shell -- I'm not sure if the snippet as posted would actually work in cmd.exe). It did the job -- even with a directory tree totaling 2.3 GB of data; I happened to have an 8 GB swap file. (But I wasn't using windows, so YMMV.)

    (updated last paragraph, trying to clarify about the quoting issue -- and regarding memory size, I have 2G of RAM, but the total process size for that command went well over 3G)

    I'll add one more comment: Being on windows is not a good excuse for not having a proper compiled GNU tar tool. If you don't have admin privilege on the box, the question becomes: why can't your admin person provide this tool for you? If you have perl on the box, you should be able to get at least some amount of cygwin or uwin or some other flavor of unix-for-windows tools (including a bash shell, while you're at it).

    Last additional comment: Why is a zip file not good enough, given that you were already able to create that easily? Virtually every OS on every kind of hardware has a tool available for reading zip files, and this has been true for a long time. Worst case: make a zip file, send it to a friend who is better equipped than you are, and have them repackage it in tar format for you.

Re: Question on Archive::Tar vs Archive::TarGzip
by graff (Chancellor) on May 08, 2009 at 04:01 UTC
    I got tired of updating my previous reply, but I wanted to point out that the snippet shown there does not meet the spec: the directory structure is not preserved, all files end up at the top level of the tar archive, and if there are name collisions, data will be lost.

    Getting it right is a matter of understanding File::Find (which has a reputation of being inscrutable). In this case, the first parameter needs to be a hash ref, with at least two keys: { wanted => sub{...}, no_chdir => 1 }.

    That's the only thing that needs to change in the snippet. Here it is, fixed to preserve directory structure in the tar file, and this time with bash-style quotes (which I know for sure are correct and do work in a bash shell):

    perl -MFile::Find -MArchive::Tar -e '$t=Archive::Tar->new(); find( { wanted=>sub{-f && $t->add_files($_)}, no_chdir=>1 }, $ARGV[0] ); $t->write( "$ARGV[0].tar" )' some_path
    (still updating... :P this time, decided to ditch the "join(...)" for the output file name -- quoting is just so much easier with bash.)
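    Since the OP actually wants a tar.gz, the same approach also works with Archive::Tar's built-in gzip support. A hedged sketch as a small script (not from the thread; the sub and file names are invented for illustration) -- COMPRESS_GZIP is a constant Archive::Tar exports by default:

```perl
use strict;
use warnings;
use File::Find;
use Archive::Tar;   # exports COMPRESS_GZIP by default

# Hedged sketch: tar a directory tree, keeping relative path names,
# and gzip-compress the result in a single write() call.
sub make_tarball {
    my ( $src_dir, $tar_file ) = @_;
    my $tar = Archive::Tar->new;

    # With no_chdir => 1, $_ holds the path as seen from the current
    # directory (e.g. "myapp/lib/Foo.pm"), so directory structure is
    # preserved inside the archive.
    find(
        { wanted => sub { $tar->add_files($_) if -f }, no_chdir => 1 },
        $src_dir,
    );

    # A true second argument to write() gzip-compresses the stream;
    # COMPRESS_GZIP requires IO::Zlib (in core since Perl 5.9.3).
    $tar->write( $tar_file, COMPRESS_GZIP );
}

# Example: make_tarball( 'myapp', 'myapp.tar.gz' );
```

    The same memory caveat from the earlier reply applies: the whole tree is held in RAM before the archive is written.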
Re: Question on Archive::Tar vs Archive::TarGzip
by Anonymous Monk on May 08, 2009 at 02:07 UTC
Re: Question on Archive::Tar vs Archive::TarGzip
by Marshall (Canon) on May 08, 2009 at 16:30 UTC
    For Windows, I suggest this freeware, 7-zip:
    http://www.7-zip.org/
    It can do 7z, zip, gzip, bzip2, Z and tar formats and more.

    Sometimes I use this to make Windows-compatible .zip files. The difference is that 7-zip compresses a lot more and is faster!

    Update: it can make tar files, gzipped tar files, and .zip archives. The UI is a bit quirky, but it works well in my experience. There is a command-line interface that can do this too!
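    For illustration, a sketch of the two-step .tar.gz route with the 7-Zip CLI (names are placeholders; the demo tree and the 7z availability check are only there so the sketch runs standalone):

```shell
# Skip gracefully when the 7-Zip CLI is not on PATH.
command -v 7z >/dev/null 2>&1 || { echo "7z not installed"; exit 0; }

# Demo tree (placeholder names).
mkdir -p myapp/lib
echo 'hello' > myapp/lib/file.txt

# 1) Pack the tree into a plain tar -- relative paths are preserved.
7z a -ttar myapp.tar myapp

# 2) Gzip-compress that tar into myapp.tar.gz.
7z a -tgzip myapp.tar.gz myapp.tar
```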