Unix shell ls vs readdir

jffry has asked for the wisdom of the Perl Monks concerning the following question:

Task: to create a list of every symbolic link pointing to a certain file (in this case, "tomcat").

At first, I used the Unix shell ls command:

#!/usr/bin/perl -w

use strict;
use warnings;

my @lines = grep {/->\s+tomcat\s+/} qx{ls -l /etc/rc.d/init.d};
my @inits = map {(split)[-3]} @lines;

print join("\n", @inits);

But I wanted to keep it all Perl, so I used readdir and readlink:

#!/usr/bin/perl -w

use strict;
use warnings;

my $dir = '/etc/rc.d/init.d';
opendir(my $dirh, $dir) or die "Cannot open $dir: $!";
# Keep only the symlinks whose target is exactly "tomcat"
my @inits = grep {-l "$dir/$_" && (readlink("$dir/$_") =~ /^tomcat$/)}
    readdir $dirh;
closedir $dirh;

print join("\n", @inits);

I like how I did not have to use 2 arrays in my "readdir" option, but I think, if I tried, I could eventually crunch down the "ls" option to only using 1 array. I'm not certain I could preserve readability if I did that, though.

I'm somewhat torn between the ease of using the "ls" option compared to the "readdir" option. Maybe the "ls" option seemed easier because I'm not as comfortable using readdir as I am shell commands, and this will go away with more experience?

Aside from being all Perl, is there any other reason to use the "readdir" option over the "ls" option?

EDIT: Just realized that there is a slim chance in the "ls" option of getting a bad element on the list. Because I'm only parsing ls output, a funny file name could mess up that parsing. Whereas with the "readdir" option, I'm certain of what I'm getting. That's actually a very good reason to stick with the all Perl "readdir" option.


Replies are listed 'Best First'.
Re: Unix shell ls vs readdir by graff (Chancellor) on May 10, 2010 at 23:02 UTC
jffry wrote: Aside from being all Perl, is there any other reason to use the "readdir" option over the "ls" option? EDIT: Just realized that there is a slim chance in the "ls" option of getting a bad element on the list. Because I'm only parsing ls output, a funny file name could mess up that parsing. Whereas with the "readdir" option, I'm certain of what I'm getting. That's actually a very good reason to stick with the all Perl "readdir" option.

Right. Apart from the fact that "ls" might behave with minor but annoying differences on different systems, it's generally trickier and less reliable to parse its text output than to pull file names via direct access to directory entries using readdir (and link target names via direct access to symlinks using readlink). I have seen file names on Unix systems with non-ASCII characters and ASCII control characters (including line-feed, carriage-return, etc), all of which can be very disorienting when viewed via "ls".
Re^2: Unix shell ls vs readdir by Anonymous Monk on Feb 06, 2012 at 12:06 UTC
So what do you think? In terms of reading the 2Lac of files, which one is better in performance: ls or readdir?
Re^3: Unix shell ls vs readdir by graff (Chancellor) on Feb 07, 2012 at 02:10 UTC
I don't know what "the 2Lac of files" is supposed to mean, and "performance" is either too dependent on unknown factors, or else simply irrelevant. Enough practical reasons have been cited to favor the readdir/readlink approach (ease and reliability of file name handling, vs. somewhat more difficult and trouble-prone string parsing), and in some circumstances, running a subshell to run "ls" could be slower than using readdir/readlink. If a timing difference between the two methods really matters (which is rarely true), then doing a benchmark "in context" (i.e. under the same conditions as production use) would be prudent.
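A rough sketch of such a benchmark, using the core Benchmark module. The temporary directory, the 200 links, and the iteration count are arbitrary stand-ins rather than anything from the thread, so the numbers it prints say little about a real init.d directory; to benchmark "in context", point `$dir` at the production directory instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempdir);

# Stand-in directory with 200 symlinks (assumption: not the real init.d)
my $dir = tempdir(CLEANUP => 1);
for my $n (1 .. 200) {
    symlink($n % 2 ? 'tomcat' : 'other', "$dir/link$n")
        or die "symlink failed: $!";
}

# The "ls" approach from the original post
sub via_ls {
    my @lines = grep { /->\s+tomcat\s+/ } qx{ls -l $dir};
    return map { (split)[-3] } @lines;
}

# The readdir/readlink approach from the original post
sub via_readdir {
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @inits = grep { -l "$dir/$_" && readlink("$dir/$_") =~ /^tomcat$/ }
        readdir $dh;
    closedir $dh;
    return @inits;
}

# Run each sub 200 times and print a comparison table
cmpthese(200, { ls => \&via_ls, readdir => \&via_readdir });
```

With so few iterations Benchmark may warn that the count is too low to be reliable; raising the count is the usual remedy.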
Re^4: Unix shell ls vs readdir by GrandFather (Saint) on Feb 07, 2012 at 21:22 UTC
Re: Unix shell ls vs readdir by toolic (Bishop) on May 10, 2010 at 19:01 UTC
Your ls solution will produce different results from your readdir solution if you have "hidden" links, such as: `.foo -> tomcat`. If that is a concern for you, use `ls -la`.
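The difference can be seen from the readdir side as well. The following is only a sketch: `links_to` and its `$skip_hidden` flag are made-up names, and a temporary directory stands in for /etc/rc.d/init.d. It shows readdir returning dot-files by default, and one way to filter them out to mimic plain `ls`.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return the names of symlinks in $dir whose target is exactly $target.
# With $skip_hidden true, dot-files are ignored, as plain "ls" would do.
sub links_to {
    my ($dir, $target, $skip_hidden) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = grep {
        my $dest = -l "$dir/$_" ? readlink("$dir/$_") : undef;
        defined $dest && $dest eq $target && !($skip_hidden && /^\./)
    } readdir $dh;
    closedir $dh;
    return sort @names;
}

# Throwaway directory so the sketch does not touch /etc/rc.d/init.d
my $dir = tempdir(CLEANUP => 1);
symlink('tomcat', "$dir/$_") or die "symlink failed: $!" for qw(.foo S80tomcat);

print join("\n", links_to($dir, 'tomcat')), "\n";      # includes .foo, like "ls -la"
print join("\n", links_to($dir, 'tomcat', 1)), "\n";   # visible links only, like "ls -l"
```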
Re: Unix shell ls vs readdir by happy.barney (Friar) on May 10, 2010 at 18:35 UTC
Take a look at the File::Find module; it takes quite a different approach.
Re^2: Unix shell ls vs readdir by jffry (Hermit) on May 11, 2010 at 04:21 UTC
Do you mean use it like this?

#!/usr/bin/perl -w

use strict;
use warnings;
use File::Find;

my @inits;

sub wanted {
    if (-l $_ && (readlink("$_") =~ /tomcat/)) {
        push @inits, $_;
    }
}

find(\&wanted, '/etc/rc.d/init.d');

print join("\n", @inits);

I'm not really seeing what I'm gaining (aside from exposure to a very useful module). It seems like overkill, and I can't determine how to prevent it from recursively going into any subdirectories. The `$options{'bydepth'}` doesn't seem to do that from what I can understand of the docs.
Re^3: Unix shell ls vs readdir by Anonymous Monk on May 11, 2010 at 14:42 UTC
If you're looking for files within a single known directory, File::Find (or the recurse method of Path::Class::Dir) will be of little value to you. Their purpose is to call a subroutine for every file under a certain point. Any filtering must be done inside your subroutine. Options controlling depth-first or breadth-first processing of the directory tree will only affect order. No filtering would be implied.
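For completeness, File::Find does document a way to stop the descent from inside the subroutine: setting `$File::Find::prune` to a true value for a directory. A minimal sketch, assuming a temporary directory in place of /etc/rc.d/init.d and a made-up helper name `links_in_top_level`:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Return symlinks directly inside $dir that point at $target, without
# descending into subdirectories.
sub links_in_top_level {
    my ($dir, $target) = @_;
    my @found;
    find(sub {
        return if $_ eq '.';            # the starting directory itself
        if (-l $_) {
            my $dest = readlink $_;
            push @found, $_ if defined $dest && $dest eq $target;
        }
        elsif (-d _) {                  # reuses the lstat done by -l above
            $File::Find::prune = 1;     # do not descend into this subdirectory
        }
    }, $dir);
    return sort @found;
}

# Throwaway tree: one link at the top level, one inside a subdirectory
my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/sub" or die "mkdir failed: $!";
symlink('tomcat', "$dir/S80tomcat")  or die "symlink failed: $!";
symlink('tomcat', "$dir/sub/nested") or die "symlink failed: $!";

print join("\n", links_in_top_level($dir, 'tomcat')), "\n";
```

Pruning has no effect when the `bydepth` option is in use, which fits the point above that the ordering options do no filtering. For a single directory, plain readdir remains the shorter route.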
Re^2: Unix shell ls vs readdir by Anonymous Monk on May 10, 2010 at 19:23 UTC
... or the `recurse` method of Class::Path::Dir...
Re^3: Unix shell ls vs readdir by Anonymous Monk on May 10, 2010 at 19:26 UTC
Err, make that Path::Class::Dir. Dyslexic moment. Sorry.
Re: Unix shell ls vs readdir by jwkrahn (Abbot) on May 10, 2010 at 18:54 UTC
my @lines = grep {/->\s+tomcat\s+/} qx{ls -l /etc/rc.d/init.d};
my @inits = map {(split)[-3]} @lines;

If `/etc/rc.d/init.d` contains any subdirectories then ls will also display the files from them. Is that what you want? If any of the file names contains spaces or tabs or newlines (or other whitespace characters) then `(split)[-3]` will not return the correct file name.
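The whitespace problem is easy to reproduce without touching the file system. The two lines below are hand-written in the usual `ls -l` layout rather than captured output, and the link name "old tomcat" is invented for the illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two hand-written lines in the usual "ls -l" layout; the second link
# name contains a space.
my @lines = (
    "lrwxrwxrwx 1 root root 6 May 10 18:00 S80tomcat -> tomcat\n",
    "lrwxrwxrwx 1 root root 6 May 10 18:00 old tomcat -> tomcat\n",
);

# Same extraction as in the original post
my @inits = map { (split)[-3] } grep { /->\s+tomcat\s+/ } @lines;

# The second name comes back as "tomcat" instead of "old tomcat"
print "$_\n" for @inits;
```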
Re^2: Unix shell ls vs readdir by jffry (Hermit) on May 11, 2010 at 03:36 UTC
Actually, `ls -l /a_dir` will not list the contents of a_dir's subdirectories on any Unix flavor that I've used. Maybe you are thinking of `ls -l /a_dir/*`, which will do exactly what you described because the shell will expand the glob "/a_dir/*" and then hand that list of arguments to ls, and, of course, when ls is handed a directory name as an argument it lists the contents of that dir. But yes, a total forehead slap on the situation with spaces in file names messing up my array ordering. Yet another solid reason to keep it all Perl.
Re: Unix shell ls vs readdir by stefbv (Priest) on May 10, 2010 at 20:37 UTC
On some systems the output from `"ls -l"` may contain ANSI escape sequences, so it might be safer to use `"\ls -l"` instead.
Re^2: Unix shell ls vs readdir by almut (Canon) on May 11, 2010 at 00:37 UTC
stefbv wrote: it might be safer to use `"\ls -l"` instead

Under almost all circumstances, the backslash would not be needed. On the interactive command line, the backslash prevents alias expansion (such as `"ls"` -> `"ls --color=auto"`, which then produces the ANSI escape sequences), because alias lookup happens before backslash escapes are processed, and there is no alias for `"\ls"`. However, alias expansion is only done for interactive shells, and not for `sh -c ...` (i.e. `qx{...}`), unless explicitly requested otherwise. Besides, alias expansion would be done by the shell, but unless there are any shell metacharacters in the command, no shell is involved anyway, as Perl will run `ls` directly.
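When the command does contain characters a shell would interpret, the list form of a pipe open is one way to make sure no shell gets involved at all. This is a sketch under assumptions: the directory name with a space and a `$` is contrived, and a temporary directory stands in for the real one.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A directory name with a space and a "$", which a shell would treat specially
my $base = tempdir(CLEANUP => 1);
my $dir  = "$base/init \$d";
mkdir $dir or die "mkdir failed: $!";
symlink('tomcat', "$dir/S80tomcat") or die "symlink failed: $!";

# List-form pipe open: the arguments go to ls as-is, with no shell in between
open(my $ls, '-|', 'ls', '-l', $dir) or die "Cannot run ls: $!";
my @lines = <$ls>;
close $ls or die "ls failed: status $?";

print grep { /->\s+tomcat\s+/ } @lines;
```

Since no shell runs, there is no alias expansion and no quoting to get wrong, although the output still has to be parsed as text.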
Re^3: Unix shell ls vs readdir by stefbv (Priest) on May 11, 2010 at 08:07 UTC
True. I tend to forget that the shell is not involved.
Re: Unix shell ls vs readdir by JavaFan (Canon) on May 10, 2010 at 18:13 UTC
Uhm, if you're just interested in the file names, why use "ls -l", then throw away everything the "-l" adds? Why not just a plain "ls"?
Re^2: Unix shell ls vs readdir by almut (Canon) on May 10, 2010 at 18:59 UTC
JavaFan wrote: Why not just a plain "ls"?

Plain `ls` doesn't show the link target the OP is grepping for...
Re^2: Unix shell ls vs readdir by Argel (Prior) on May 10, 2010 at 19:03 UTC
It looks like he is using the '->' from the ls -l output to figure out which files are symbolic links pointing to tomcat.
Elda Taluta; Sarks Sark; Ark Arks