in reply to File::Find considered hard?

Well... here's another way to put it: compare File::Find in perl to find in shell. Now also think of this: "good tools make the easy things easy and the hard things possible."

What is the "easy thing" to do with find (or File::Find)? It is to produce a list of every file/directory in a directory tree and/or iterate over that list.

So, look at how find (as a shell command) interfaces to shell scripting to perform the simple end of things (bear in mind that it often isn't even necessary to use find for a great deal of simple things in shell commands, because most shell commands that have anything to gain by it implement their own directory recursion, via a "-r" or "-R" switch... but that's a whole other argument altogether):
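As a rough sketch of that simple end of things (not necessarily the commands originally shown here), the two easy cases are "give me the list" and "loop over the list". The throwaway tree and the variable names `tree`, `listing`, and `count` are assumptions made up for illustration:

```shell
# Build a small throwaway tree so the example is self-contained.
tree=$(mktemp -d)
mkdir -p "$tree/sub"
touch "$tree/a.txt" "$tree/sub/b.txt"

# Easy thing 1: produce a list of everything under the tree.
listing=$(cd "$tree" && find . | LC_ALL=C sort)
printf '%s\n' "$listing"

# Easy thing 2: iterate over that list.
# (Naive on purpose: word splitting breaks on names containing whitespace.)
count=0
for f in $(cd "$tree" && find . -type f); do
    count=$((count + 1))
done
echo "$count regular files"
```

The point is how little ceremony is involved: find emits paths, and the rest of the shell consumes them like any other list.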

Also, please, let's put aside the fact that you really should be doing those more like:

All of that extra garble is just a result of the limitations of shell scripting. It's not part of the inherent concept of what's going on. It's just an artifact.
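For reference, one common whitespace-safe form is a NUL-delimited pipeline. This is an assumption about the kind of "extra garble" meant, not a reconstruction of the original example. The file names and the `naive`/`safe` counters are hypothetical:

```shell
tree=$(mktemp -d)
touch "$tree/plain.txt" "$tree/with space.txt"

# Naive loop: word splitting turns "./with space.txt" into two words.
naive=0
for f in $(cd "$tree" && find . -type f); do
    naive=$((naive + 1))
done

# NUL-delimited pipeline: each path arrives as exactly one argument.
# The trailing "sh" fills $0, so "$#" counts only the paths.
safe=$(cd "$tree" && find . -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

echo "naive saw $naive items, safe saw $safe items"
```

The extra switches exist only to get file names through a text pipe intact, which is a shell problem rather than a find problem.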

Anyway, now think of how File::Find interfaces to other perl code, and compare this to how other perl code interfaces. I'm not gonna write out how it does work, but let's look at how it should work:

I think the reason you see so many people complain about how File::Find works is that they expect it to work, basically, like above. It's a rare (and wonderful) thing, when building a module, to encounter a pre-existing interface to build to... even if that interface exists only in the mind of every would-be user of the module.

Where the creators of File::Find went wrong was when they decided to model the interface to File::Find off of the -exec command to find (rather than the -print or -print0 command). The thing is: -exec is only necessary in the find command because of these two things:

Of course, in perl, the first is still sort of true, but that's irrelevant because the second is completely wrong. In fact, the opposite is the case. Perl programmers would rather deal with a list or a loop than with a callback.

In this sense, File::Find's interface problem is in many ways a microcosm of a common issue with technology: new technology comes along to replace old technology, but carries along artifacts of its predecessor that don't apply any more. The thing that is so tragic about how it happened with File::Find is that the artifact which should have been abandoned has actually been taken as the central feature. Instead of taking the case that should have been the focus (-print0), and realizing that the issues which gave rise to the need for the ugly artifact were not an issue in perl, the implementers focused on the artifact and dropped the central case.

Would people have accepted unix find so well if just plain old find didn't work... if you had to write find -exec echo \{\} \; instead? Who would have used that? Only people who really, really needed to. And they would have cursed it all the way.

------------ :Wq Not an editor command: Wq

Re^2: File::Find considered hard?
by Aristotle (Chancellor) on Mar 15, 2004 at 07:18 UTC

    Just an aside:

    You can't do everything you'd want to do with find -print0 | xargs -0. For example, to rename all files to $file.bak, you'd have to write a shell "for" or "while" loop.

    At least with GNU xargs, that's not true. There's an option -i which lets you put a placeholder in the command line passed to xargs (the placeholder string can be specified, but defaults to {}), so an xargs solution for the example above would be

    find "$FOO" -print0 | xargs -0i mv {} {}.bak

    Of course this loses the main advantage of xargs: you are back to spawning one mv process per file, so you might as well just use the portable -exec interface.
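    To make the one-process-per-file cost concrete, here is a small sketch that counts invocations in both modes. It uses -I {} rather than -i, since current GNU xargs documentation marks -i as deprecated in favour of -I. The temporary tree and the `batched`/`per_file` names are assumptions for illustration only:

```shell
tree=$(mktemp -d)
touch "$tree/a.txt" "$tree/b.txt" "$tree/c.txt"

# Batched mode: xargs packs every path into a single command line,
# so the child shell runs once and prints one line.
batched=$(cd "$tree" && find . -name '*.txt' -print0 |
    xargs -0 sh -c 'echo "got $# paths"' sh | wc -l | tr -d ' ')

# Placeholder mode: -I {} substitutes one path per command,
# so the child shell (and mv) runs once per file.
per_file=$(cd "$tree" && find . -name '*.txt' -print0 |
    xargs -0 -I {} sh -c 'mv "$1" "$1.bak" && echo moved' sh {} | wc -l | tr -d ' ')

echo "batched runs: $batched, per-file runs: $per_file"
```

    With three files, the batched pipeline should run its command once and the placeholder pipeline three times, which is the same process count -exec ... \; would give.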

    Makeshifts last the longest.

      Perfectly valid, but I don't think it does anything to damage the point I was trying to make.

      I mean... even if xargs didn't have that option, you could do something like:

      find -print0 | xargs -0 -l sh -c 'mv "$0" "$0".bak'
      But that's just getting silly, and even further from the point. =D
      ------------ :Wq Not an editor command: Wq
        That's why I said it was an aside.. :)

        Makeshifts last the longest.