Best programming practice

DarthFredd has asked for the wisdom of the Perl Monks concerning the following question:

I built a script which looks for files which share the same name but different extensions, and deletes dupes. I would like some advice on best programming practices.

If foo.csc, foo.txt, and bar.txt are in the same directory, and $ext is set to csc, foo.txt will be deleted (it has a csc copy), but bar.txt will be left alone. My script:

use File::Find;    
my $ext = "dat";
my $nom = 0;

find ( \&ripper, @ARGV) ;

sub ripper { ## First implementation
unless(s/\.$ext$//i) {return;}
my $file = $_;  #using $_  screws up grep
$nom += unlink(grep(/\Q$wang\E\.(?!$ext)/, glob("*")),"\n"); 
}

sub ripper2 { ## Second implementation
my $file = $_;
return unless(s/\.(?!$ext)$//i); 
my $priority = $_ . $ext;
if(-e $priority && !-d $priority) { $nom += unlink $file; }
}
[download]

I wrote the second implementation because I thought it would be cleaner. Is it? Given the choice, which one would you use? Every opinion shall be gratefully recieved.

Comment on Best programming practice Download Code

Replies are listed 'Best First'.
Re: Best programming practice by Sidhekin (Priest) on Oct 28, 2007 at 22:15 UTC
The second feels the cleaner approach to me as well, but it is not correctly implemented. So I suppose one could argue it is too clever. Alternatively, one could fix it: `sub ripper2b { ## Second implementation, take b my $file = $_; return unless s/\.(?!$ext)\w+$//i; my $priority = "$_.$ext"; if (-e $priority && !-d $priority) { $nom += unlink $file; } }` [download] My two fixes: `(?!$ext)` is zero-width, so `(?!$ext)$` is just `$`. We need to remove the extension as well, whatever it is, so add `\w+` to the regex. The "." was removed by the substitution, so we need to add it again along with `$ext`. If that does not make sense to you, the second implementation is just too clever. Go with the first. `print "Just another Perl ${\(trickster and hacker)},"` The Sidhekin proves Sidhe did it!	[reply] [d/l] [select]
Re^2: Best programming practice by KurtSchwind (Chaplain) on Oct 30, 2007 at 14:19 UTC
I think you went the right way. The 2nd one is much closer to working than the first. I'm not even entirely sure what $wang was supposed to hold. Also, I would have missed the \w+. -- I used to drive a Heisenbergmobile, but everyone I looked at the speedometer, I got lost.	[reply]
Re: Best programming practice by jdporter (Paladin) on Oct 29, 2007 at 13:06 UTC
This is how I'd approach it: `use Getopt::Long; use File::Glob; use strict; use warnings; my $ext = 'csc'; GetOptions( 'ext=s' => \$ext, ); my $unl; for my $dir ( @ARGV ) { my %f; /(.)\.(.)/ and $f{$1}{$2} = $_ for File::Glob::bsd_glob( "$dir/" ); for my $g ( grep { $f{$_}{$ext} and keys %{$f{$_}} > 1 } keys %f ) { delete $f{$g}{$ext}; $unl += unlink $f{$g}{$_} for sort keys %{ $f{$g} }; } } print "\n$unl files unlinked.\n";` [download] A word spoken in Mind will reach its own level, in the objective world, by its own weight*	[reply] [d/l]
Re: Best programming practice by tuxz0r (Pilgrim) on Oct 30, 2007 at 03:59 UTC
Just curious, but I couldn't get either of your subroutines working. But, giving it some thought since I saw the post originally, I put together one which works with the semantics you mentioned, leaving unique filenames alone (ones with no dups) and removing duplicates of files with the same prefix (minus extension), saving the one prefix you pass in. Is this along the lines of what you're trying to do? my $working_dir = $ARGV[0]; # starting directory my $extension = $ARGV[1]; # extension to save dedup($working_dir); exit 0; sub dedup { my $path = shift; my @files = glob("$path/"); print "Checking [$path] ...\n"; foreach (@files) { dedup("$_") if (-d $_ && -x $_); # recurse into subdire +ctories my ($base, $ext) = m/(.)\.([^\.]+)$/; my @matches = glob("$base"); print "\tremoving: $_\n" if scalar @matches > 1 && $ext ne $ex +tension; # unlink $_ if scalar @matches > 1 && $ext ne $exte +nsion; } } [download] I'm not using File::Find here, but it does work recursively through accessible subdirectories. It only looks at dups within the same directory level, so if you have foo. in multiple directories they won't be affected. This is a quick and dirty script, so it could be much better with some argument checking, adding an appropriate way to handle "." dot files and the like; but, it seems to do a fairly good job. If you can tell me what I might have missed running your code, let me know, I'd love to give it a run and compare the output. --- echo S 1 [ Y V U \| perl -ane 'print reverse map { $_ = chr(ord($_)-1) } @F;'	[reply] [d/l]