I think I'll address your post a little backwards. "Stupidity"? Communication is a two-way street, and a failure is rarely one-sided. In this case, I think I have to shoulder more of the blame for your misunderstanding than you do ;-)
I had in my head the example of Archive::Tar. I'm not entirely sure why (was there a recent thread on this module or something that could be sticking in my head?). So when I say "$obj->get_file", I'm thinking that we're telling the "$obj"ect to get a file from the archive (or, more generally, from whatever source it is encapsulating, e.g., a cache, an FTP server, a web server, a zip file, a jar file, whatever). In this scenario, in all likelihood well over 80% of the time, you want to get the file from the encapsulated location and put it somewhere locally. This is where having a simple function that does I/O makes sense. For the other < 20% of the time, I agree that the filehandle interface makes the most sense. (Some encapsulated data types may make more sense with the callback, e.g., sockets such as FTP, where you still want the object to handle the communication with the remote server, and the overhead of creating a tie'd handle may be too much to bother with, since the end of the file doesn't actually correspond with the end of data through the socket.)
Perhaps a more elaborate set of object methods would be useful:
sub extract_file # takes a destination directory or filename
sub extract_file_via_filehandle # takes a filehandle to write to
sub insert_file # takes a source directory and filename
sub insert_file_via_filehandle # takes a filename (for inside the archive) and a filehandle (for the source)
All of these have no return value (as I was thinking for the get_file/get_filehandle above). Yes, with perl, you can often overload these into the same function, which dynamically figures out the difference between a file name and a filehandle. Sometimes that's not feasible, other times ... well, I'm separating them out here purely to show the dual interface of which I'm supportive.
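For what it's worth, by "dynamically figure out the difference" I mean something along these lines - a rough sketch only, with a made-up My::Archive package and assuming the *_via_filehandle variant does the real work:

    package My::Archive;              # hypothetical package, purely to illustrate
    use strict;
    use warnings;
    use Scalar::Util 'openhandle';

    # One front end that dispatches on its argument: a filehandle goes straight
    # to the *_via_filehandle variant; anything else is treated as a destination
    # path that we open ourselves.
    sub extract_file {
        my ($self, $name, $dest) = @_;
        if (openhandle($dest)) {
            return $self->extract_file_via_filehandle($name, $dest);
        }
        open my $fh, '>', $dest or die "Can't write '$dest': $!";
        $self->extract_file_via_filehandle($name, $fh);
        close $fh or die "Error closing '$dest': $!";
        return;
    }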
You do bring up an interesting question: a way to get a filehandle to which one can write. Unfortunately, that may mean some sort of tied interface, to allow further action behind the scenes (imagine FTP, where the connection needs to be maintained, so closing the filehandle should be somehow prevented; or a dynamic archive where one day you could be writing to a filesystem, another to an FTP server, and another to an RDBMS - some of these would need something tied, such as saving to a blob in a database in chunks). I may have to pursue that myself, actually. It may be a cleaner interface. Thanks ;-)
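To make that a little more concrete, here is the sort of tie'd class I'm imagining - all names invented, and assuming some $sink object that knows how to accept chunks and finish up:

    package My::Archive::WriteHandle;   # invented name - sketch only
    use strict;
    use warnings;

    # Wrap some $sink object (an FTP connection, a database blob handle,
    # whatever the archive happens to be backed by).
    sub TIEHANDLE { my ($class, $sink) = @_; bless { sink => $sink }, $class }

    # Every print() on the tied handle becomes a chunk handed to the sink.
    sub PRINT { my ($self, @chunks) = @_; $self->{sink}->write_chunk($_) for @chunks; 1 }

    # close() doesn't tear down the underlying connection; it just tells the
    # sink that this particular file is finished.
    sub CLOSE { my ($self) = @_; $self->{sink}->finish }

    # Usage would look something like:
    #   tie *FH, 'My::Archive::WriteHandle', $sink;
    #   print FH $data;
    #   close FH;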
In this scenario, in all likelihood well over 80% of the time, you want to get the file from the encapsulated location and put it somewhere locally.
And that goes to the crux of this subthread.
If all a user needs to do is extract files individually or collectively from a container and place them into the filesystem--which may well be the requirement for 80% of the uses of the containers--he doesn't need to use the API! Use the command line interface for this. You don't need to write any boilerplate code. You don't need to write any code at all!
However, if the container is being accessed via the API, then the very last place you want the results is in a file in the filesystem.
Using APIs to duplicate the work of standalone utilities is redundant, risk-prone and wasted effort. Once you arrive at that conclusion, providing APIs that only allow that type of operation is also redundant and totally devoid of value add.
The API only becomes useful over the standalone utility if you want to further process the files (or filenames) that are being added or extracted. To that end, providing an API that allows that further (or pre-) processing is not just a "nice to have", but the only API that makes any sense.
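For example (archive and member names invented), this sort of in-memory processing is simple through the API but awkward through the standalone utility:

    use strict;
    use warnings;
    use Archive::Tar;

    # Pull one member's contents straight into memory and process it there --
    # no file ever touches the filesystem.
    my $tar     = Archive::Tar->new('logs.tar.gz') or die Archive::Tar->error;
    my $content = $tar->get_content('logs/error.log');
    my $fatal   = () = $content =~ /^FATAL/mg;
    print "Fatal errors: $fatal\n";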
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco.
Rule 1 has a caveat! -- Who broke the cabal?
Who said anything about the user? What about the programmer? And who said that typing "tar xf $tarball $filename" at the shell prompt isn't shell scripting (and thus code)?
I think we're simply going to have to agree to disagree on your statement that "the very last place you want the results is in a file in the filesystem." That simply has not been my experience in coding with perl - it is very rare that I want my results anywhere but the filesystem. The rare time that I have used the filehandle form of File::Copy, for example, has been to concatenate multiple files (specifically, pksfx.exe and a zip file). Well over 95% of my calls to File::Copy are "copy($srcfile, $dstfile)". Which, in a perl script, is way simpler than "if ($^O =~ /win/i) { system "copy $srcfile $dstfile" } else { system "cp", $srcfile, $dstfile }" (all my scripts have to run on Windows, various flavours of unix and various flavours of Linux), and also way simpler than "open my $srch, '<', $srcfile or die $!; open my $dsth, '>', $dstfile or die $!; copy($srch, $dsth);", which is at least still portable (to more than just windows and unix).
Similarly, 100% of my usage of Archive::Tar has been to extract files to the file system, or to insert files from the file system. That's not to say 100% of all usage of Archive::Tar is this, only 100% of my usage. And the reason I chose perl wasn't to do post-processing of files, but post-processing of the archive: querying the archive to see what the root directory name was. Doing this in shell is possible, but quite ugly; doing it in perl was easy. The command-line interface to tar simply doesn't have the option to look at files in an archive like this (specifically, I only look at the first file, since I know that they will all have the same top-level directory). And then I just told Archive::Tar to extract normally, and could do something with that directory that was just created (i.e., create a symlink of "latest" to it). Filehandles and looping through all the files would have been a waste of my programming time, since so many people do that - it's better situated in the module. If I had to do that, then I would have done this in shell script - it would have been a bit slower, but I would have spent less time programming it, which is always a consideration.
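The gist of it looked something like this (archive name made up, error handling trimmed):

    use strict;
    use warnings;
    use Archive::Tar;

    my $tarball = 'release.tar.gz';                    # hypothetical name

    # Peek at the first member to learn the top-level directory name.
    my ($first) = Archive::Tar->list_archive($tarball);
    (my $topdir = $first) =~ s{/.*}{};

    # Extract normally, then point a "latest" symlink at the new directory.
    my $tar = Archive::Tar->new($tarball) or die Archive::Tar->error;
    $tar->extract;

    unlink 'latest' if -l 'latest';
    symlink $topdir, 'latest' or die "symlink: $!";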
In fact, in all that I've done, I have very, very rarely wanted a filehandle that wasn't going directly from or to a filesystem... and when I did, I used IPC::Open3 for that. I am quite thankful that these module writers don't take the mathematician's view that the solution I want can be "trivially derived" from a single, flexible interface, but the engineer's view of "here's the derivation - just use it, and don't reinvent the wheel".
If all a user needs to do is extract files individually or collectively from a container and place them into the filesystem--which may well be the requirement for 80% of the uses of the containers--he doesn't need to use the API!
I'm not so sure I agree with that. What if the user wants to extract files based on complex criteria? What if the user wants to automatically extract files received via web upload forms? What if the user wants to extract the files from many archives at the same time?
Certainly, some of these things can be solved by shell scripts or otherwise, but doesn't Perl sound like an attractive solution too? Having an API that makes these things simple, while still having all the power of Perl around it, seems like a benefit to me.
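For example (pattern and file names invented), pulling just the members that match some criterion out of a whole pile of archives is only a few lines with the API:

    use strict;
    use warnings;
    use Archive::Tar;

    # Extract only the .conf members from every tarball in the directory.
    for my $tarball (glob '*.tar.gz') {
        my $tar = Archive::Tar->new($tarball) or next;
        my @wanted = grep { /\.conf$/ } $tar->list_files;
        $tar->extract(@wanted) if @wanted;
    }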
Added a few minutes later: sorry about jumping in so deep into the thread. I thought it would be useful to add to the conversation. But now I wonder if anyone will even see it. =^)