Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
Looking for code review/suggestions. As mentioned in threads bellow, this is work is for educational purposes.
First of all I would like to stress that I have understood your aim and point of view. However I contend that "for educational purposes" goes in the same direction of "using no modules at all". More precisely I would say that these are mostly orthogonal concepts.

For example I know next to nothing myself about network programming. Now it happened that I had to perform a simple task of communicating to a device (a webcam) to make it do some actions. I tried with IO::Socket::INET but had a hard time with it and lately resorted to Net::Telnet which basically made my program a one liner, with all of its intelligent defaults...

I'd still be curious to try my hand at IO::Socket::INET on that task, for educational purposes. But the situation is quite different from that suggested by your approach.

Don't forget that: "CPAN is my programming language of choice; the rest is just syntax." (From Perl 6 is Here Today!)

This is especially true if you're having to do with modules in the core distribution, e.g. File::Find. Now, you may want to try your hand at coding a replacement for say File::Find or any other module you like, or a specialized version of it, adapted to the application under consideration...

Now, on to the code...

use strict; use warnings;
Excellent!
sub read_dir { my $dir = shift; if ( opendir( DIR, $dir ) ) {
And what if opendir fails?

My fault, as duly pointed out at Re^2: Find images regardless of filetype extension.: I hadn't noticed it was inside an if clause.

However the use of either short circuiting logical operators or of statement modifiers is so widespread in Perl and so typical of it that IMHO a full

if ( whatever ) { # ... } else { # ... }
structure is rarely necessary and I doubt that it would contribute significantly to readability. In any case I would like to stress that I'm expressing here a stylistic preference of mine, certainly not an absolute truth!. (My point here being that the the  else { return } is far enough from the test not to be noticed at a glance and put in relation with the former.)
my @tmp = readdir(DIR); my @files = map { "$dir/$_" } grep { !/^\.{1,2}$/ && -f "$dir/ +$_" } @tmp; my @dirs = map { "$dir/$_" } grep { !/^\.{1,2}$/ && -d "$dir/ +$_" } @tmp;
Here you're doing twice as many stats as necessary. You're also doing twice as many pattern matchings. Perhaps if you really wanted to follow this approach,
my @tmp = grep !/^\.{1,2}$/, readdir(DIR);
may have been slightly better. Also either a pre-mapping to "$dir/$_" or a chdir into $dir may have been advantageous. Incidentally, as far as the former option is concerned, you know that to do it in a really portable way you should have used File::Spec, don't you? (This is just FYI, I use stuff like "$dir/$_" myself all the time.)

All in all, and maintaining an approach similar to yours, I may have done:

my (@files,@dirs); for (readdir DIR) { next if $_ eq '.' or $_ eq '..'; $_="$dir/$_"; push @files, $_ if -f; push @dirs, $_ if -d _; }
I used string comparisons because they should be faster and they are done on every cycle.
closedir( DIR );
Now most file and directory opening functions support lexical handles. Well, that's not strictly true (from the technical point of view), but that's a good approximation to the truth. And with a lexical dirhanle you wouldn't need an explicit close.
sub glob_files { my ($dir, $recurse) = @_;
Well, I can't comment on your whole logic. But rather than having $recurse flag I would try to arrange things in other to have at some point a recursion for @dirs (this hypothetical @dirs not being yours), so that when @dirs is empty simply no recursion happens...
if (@$files) { push @files, @$files; }
Why not
push @files, @$files;
instead? (I don't know which is faster, but I suspect they execute at much the same speed, and the latter is more clear, IMHO.)
recursion: { last unless $recurse;
Incidentally, why did you put a label there, if you didn't use it? If is't for documenting reasons, then I get your point and I may agree with you.
while (@$dirs) { my $d = shift (@$dirs);
No problem with this, but why not
for (@$dirs) {
instead?
my @f = glob_files("$d", 1);
No need to quote all your variables!
}else{ return; } }
Ditto as above!
foreach $file (@files) { open FH, $file or die "Error opening [$file]: $!\n";
Ditto as above wrt using lexical filehandles. Also, some disagree, but I always recommend using the three args form of open.
next unless ( -s $file > 9 );
You may have done this check in your file finding routine on _ so as to avoid a third stat.
read(FH, $data, 10) or die "Error reading from [$file]: $!\n";
You may want to just warn and go on to the next one... just an idea!
if ( $data =~ /^BM/ ) { $type = 'BMP'; }elsif ( $data =~ /^GIF8[79]a/) { $type = 'GIF'; }elsif ( $data =~ /^\xFF\xD8/ ) { $type = 'JPG'; }else { $type = undef; }
Ouch! It doesn't mach my PNG images, not to say a few other tenths of popular formats, not to mention more exotic ones...

Seriously this may be a good reason for using a dedicated module...

return @images if ( scalar (@images) ); return;
No need for scalar. No need for the outer parentheses. No need for the inner ones either. No need for the if modifier, and no need for the second statement:
return @images;
is just as fine!
my (@files, @images); my $recurse = 1;
Huh?!? Isn't $recurse a lexical variable in your file finding sub?
@files = glob_files($dir, $recurse); @images = find_images(@files);
Here you're returning a possibly huge list of files to scan them subsequently for images. I would check them on the fly, that is what I would do in the File::Find 'wanted' subroutine.

In reply to Re: Find images regardless of filetype extension. by blazar
in thread Find images regardless of filetype extension. by zzspectrez

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (11)
As of 2024-04-19 16:37 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found