atcroft has asked for the wisdom of the Perl Monks concerning the following question:

Can anyone recommend a module that would provide an IO::*-like interface for accessing tar files?

In a recent project at $work, I used IO::File and IO::Uncompress::AnyUncompress to simplify processing of compressed (bzip2 or gzip) or uncompressed files (for example, reading lines via $foo->getline(), or getting the current line number via $foo->input_line_number()). Now I have been asked to add processing for tar archives. While I have used Archive::Tar for processing tar archives before, I was hoping that there might be a module that could provide an IO::*-like interface so I could minimize the changes needed to add the support.
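For context, here is a minimal sketch of the kind of code described above. It is not the actual code from $work; the helper names open_log and collect_lines are made up for illustration. It assumes IO::Uncompress::AnyUncompress is left in its default transparent mode, so a file that is not compressed is read as-is.

```perl
use strict;
use warnings;
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# Hypothetical helper: return an IO::*-style handle for a plain, gzip,
# or bzip2 file. AnyUncompress detects the format itself and, by default,
# passes data through unchanged when the input is not compressed.
sub open_log {
    my ($path) = @_;
    my $fh = IO::Uncompress::AnyUncompress->new($path)
        or die "Cannot open $path: $AnyUncompressError\n";
    return $fh;
}

# Hypothetical consumer: return a list of [line_number, text] pairs,
# using only getline(), input_line_number(), and close().
sub collect_lines {
    my ($path) = @_;
    my $fh = open_log($path);
    my @seen;
    while (defined(my $line = $fh->getline())) {
        chomp $line;
        push @seen, [ $fh->input_line_number(), $line ];
    }
    $fh->close();
    return @seen;
}
```

The goal is to keep everything outside open_log unchanged when tar support is added.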

Thank you for your time and attention, and any direction you may provide.

Re: Is there an IO::*-like interface for accessing tar files?
by MidLifeXis (Monsignor) on Feb 25, 2015 at 18:30 UTC

    Were you thinking of something like the IO::Dir interface for walking the contents of the tar file, or something to handle each of the files within? How would you handle tar archives where there are multiple copies of the same file in the archive?

    The difference (which you may already know, even if those asking you do not) between the other two file types and a tar archive is that the first two are single files, while a tar archive is more like a directory, with the potential for multiple instances of a file within.

    Update: Does nextStream from IO::Uncompress::AnyUncompress do what you are thinking of for getting individual files from the archive? Is that the interface you are considering?

    --MidLifeXis

      Thank you for the feedback.

      I was looking for something to read the contents of the file (a compressed archive of log files, for the curious). If possible, I was trying to retain the use of the close(), getline(), and input_line_number calls from the IO::* modules, so I could localize my dependent logic to one place.

      I looked at the nextStream method you suggested, but it did not appear to do what I needed.

        Since a tar file is a sequence of member files, each preceded by its own header (there is no central index), how would the getline() method behave? Would you anticipate being able to iterate through each file within the tar file and then use an IO:: object on each virtual file within that archive (where getline() would apply)?

        Just trying to clarify your expectations for my understanding.
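
        To make the question concrete, the "IO:: object per virtual file" model could look roughly like the sketch below. It is only an illustration: each_tar_member is a made-up name, and it assumes Archive::Tar is acceptable even though Archive::Tar->new reads the whole archive into memory. Each member's content is wrapped in an in-memory filehandle, so getline() and input_line_number() restart for every member. Members are visited in archive order, so a file stored more than once would be seen once per copy.

```perl
use strict;
use warnings;
use Archive::Tar;
use IO::Handle;    # method-style getline()/input_line_number() on plain handles

# Hypothetical helper: call $callback->($name, $fh) once for each regular
# file in the archive, where $fh is a read handle on that member's content.
sub each_tar_member {
    my ($path, $callback) = @_;
    my $tar = Archive::Tar->new($path)
        or die "Cannot read $path: " . Archive::Tar->error . "\n";
    for my $member ($tar->get_files) {
        next unless $member->is_file;          # skip directories, links, etc.
        my $content = $member->get_content;
        $content = '' unless defined $content; # treat empty members as empty
        open my $fh, '<', \$content
            or die "Cannot open in-memory handle: $!\n";
        $callback->($member->full_path, $fh);
        $fh->close();
    }
    return;
}
```

        A caller would pass a code reference that loops on $fh->getline() just as it would for a handle from IO::File.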

        --MidLifeXis

Re: Is there an IO::*-like interface for accessing tar files?
by BrowserUk (Patriarch) on Feb 25, 2015 at 20:20 UTC

    Does Archive::Tar::Stream come close(r) to your requirements?



      Thank you for the feedback.

      Archive::Tar::Stream was interesting, but it did not seem geared toward the type of problem I had. Also, I was trying to retain the use of the close(), getline(), and input_line_number methods from the IO::* modules, so I could localize my dependent logic to one place.

      I do have the advantage (for this case) of not caring about the internal files themselves, just the data within them. For now, I will just "cheat" and use IO::Pipe and a call to tar with the appropriate compression switch ('j' or 'z' for Bzip2 or Gzip, respectively) and the 'O' option to send the output into the pipe (although I am willing to look at any suggestions, and appreciate any and all of them).
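
      For anyone curious, the "cheat" looks roughly like the sketch below. It is a sketch, not the final code: open_tar_stream is a made-up name, and it assumes a tar binary on the PATH that supports the 'O' (extract to stdout) option together with 'j' and 'z'. Because IO::Pipe::End is an IO::Handle, close(), getline(), and input_line_number() all keep working. Note that the line numbers run across the whole concatenated stream rather than restarting for each file in the archive, which is fine for this case.

```perl
use strict;
use warnings;
use IO::Pipe;

# Hypothetical helper: return a read handle on the concatenated contents of
# every file in a tar archive. $compression is 'j' (bzip2), 'z' (gzip), or
# '' (uncompressed).
sub open_tar_stream {
    my ($path, $compression) = @_;
    $compression = '' unless defined $compression;
    die "Unsupported compression switch '$compression'\n"
        unless $compression =~ /\A[jz]?\z/;
    my $pipe = IO::Pipe->new();
    # reader() forks and execs the command in list form (no shell involved);
    # 'x' extracts, 'O' sends file contents to stdout, 'f' names the archive.
    $pipe->reader('tar', "-xO${compression}f", $path);
    return $pipe;
}
```

      The returned handle can then be passed to the same code that already consumes the IO::File and IO::Uncompress::AnyUncompress objects.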