First, thank you all for your suggestions. The problem has been one of algorithm. I am iterating over the @select files inside an @allfile loop and testing for equality conditions.

As the actual code is over 300 lines, I have included an edited snippet. This code is derived from an earlier script where I needed to parse for regexes in files, not necessarily exact matches, and not necessarily at the beginning. Parsing @selectexpr against @allfiles made sense there.

Hashes are an excellent idea. With them I can parse foo-bar-baz.doc as a hash directly against all foo keys, with proper splitting and filtering, of course. This would allow me to scale up more efficiently.
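
As a rough sketch of the hash approach, assuming duplicates are files whose names differ only by extension and letter case: the helper name `normalize_stem` and the sample file names are made up for illustration and do not come from the actual script.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the extension, trim whitespace, lowercase.
sub normalize_stem {
    my ($name) = @_;
    $name =~ s/\.[^.]+$//;      # drop the last extension, whatever it is
    $name =~ s/^\s+|\s+$//g;    # trim leading/trailing whitespace
    return lc $name;
}

# Illustrative file names only.
my @files = ( 'Foo-Bar-Baz.mobi', 'foo-bar-baz.epub', 'foo-other.mobi' );

# One pass: bucket every file under its normalized stem.
my %by_stem;
push @{ $by_stem{ normalize_stem($_) } }, $_ for @files;

# Any bucket holding more than one file is a candidate duplicate set.
my @dup_stems = grep { @{ $by_stem{$_} } > 1 } sort keys %by_stem;

print "DUPLICATE FOUND: @{ $by_stem{$_} }\n" for @dup_stems;
```

Each file is touched once, so the work grows with the number of files rather than with the product of the two lists, which is where the nested loops get expensive.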

I would, however, prefer, if possible, to keep the matching as a regexp rather than an equality operator.

Please ignore syntax errors in the code below; it has been abbreviated.

if ($mymobi =~ m/($myepub)/) {
    print "DUPLICATE FOUND !\n";
    &movetodir($myfilt, $dupdir);
}
# Does NOT work

if ($mymobi eq $myepub) {
    print "DUPLICATE FOUND !\n";
    &movetodir($myfilt, $dupdir);
}
# Works
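
One possible explanation for the difference, offered as a guess since the actual file names are not shown: a variable interpolated into a pattern is treated as regex syntax, so a name containing characters such as `(`, `[`, `+` or `.` may no longer match itself literally, and an unbalanced bracket is a pattern compile error. Wrapping the variable in `\Q...\E` keeps the regex operator while matching the text literally. The sample names below are invented.

```perl
use strict;
use warnings;

# Invented sample names; real names may contain other metacharacters.
my $mymobi = 'some author - a title (2nd ed.) [retail]';
my $myepub = 'a title (2nd ed.)';

# Unescaped: "(2nd ed.)" is read as a capture group, so the literal
# parentheses in $mymobi are never matched.
my $raw_match = ( $mymobi =~ m/($myepub)/ ) ? 1 : 0;

# Escaped: \Q...\E quotes every metacharacter in $myepub.
my $quoted_match = ( $mymobi =~ m/\Q$myepub\E/ ) ? 1 : 0;

print "raw: $raw_match, quoted: $quoted_match\n";
```

Without anchors this is a substring match. Anchoring it as `m/^\Q$myepub\E\z/` would make it behave like `eq`.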

For an author-title pair, the matching would be done in the title (value) portion rather than the key, which would be expected to be identical (though there might be exceptions).
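
A minimal sketch of that author-title idea, assuming names follow an `author-title.ext` pattern split at the first hyphen. The real naming scheme is not shown here, so the split rule, the helper `split_name`, and the sample data are all assumptions.

```perl
use strict;
use warnings;

# Hypothetical splitter: "Author-Some Title.ext" -> ("author", "some title").
sub split_name {
    my ($name) = @_;
    $name =~ s/\.[^.]+$//;                   # strip extension
    my ( $author, $title ) = split /-/, lc $name, 2;
    return unless defined $title;            # no hyphen: skip this name
    s/^\s+|\s+$//g for $author, $title;      # trim both parts
    return ( $author, $title );
}

# Invented sample data.
my @epubs = ( 'Foo-Bar Baz.epub', 'Qux-Something Else.epub' );
my @mobis = ( 'foo-bar baz (retail).mobi', 'foo-unrelated.mobi' );

# Index epub titles by author (exact key lookup).
my %titles_by_author;
for my $file (@epubs) {
    my ( $author, $title ) = split_name($file) or next;
    push @{ $titles_by_author{$author} }, $title;
}

# For each mobi, regex-match only against titles with the same author key.
my @matches;
for my $file (@mobis) {
    my ( $author, $title ) = split_name($file) or next;
    for my $known ( @{ $titles_by_author{$author} || [] } ) {
        push @matches, $file if $title =~ m/\Q$known\E/;
    }
}

print "DUPLICATE FOUND: $_\n" for @matches;
```

The key lookup is exact, while the title comparison stays a regexp, so a title with extra text around it can still be caught.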

I need to hit the books on hashes here, as I haven't really dealt much with them outside of a 20,000+ listing database with about 2 dozen hash fields.

opendir(DIR, $dir2) or die $!;
while ( $file = readdir(DIR) ) {
    if (-f $file) {    # read only files
        chomp($file);
        $file =~ s/^\s+|\s+$//g;
        $filenam = "";
        push( @srcarray, $file );
        if ( $file =~ m/\.mobi$/ig ) {
            &typefiles( $file, "mobifile" );
        }

        if ( $file =~ m/\.azw3$/ig ) {
            &typefiles( $file, "azw3file" );
        }


sub typefiles( $tfile , $filetype ) {
    ($tfile, $filetype ) = @_ ;
    if ($filetype eq "mobifile" ) {
        push ( @mobiarray, $file) ;
    } # End mobifiles

# Main body - parsing directory listing and performing actions
foreach $authf (@srcarray) {

    if ($authf =~ m/\.pl$/) {
        next;
    }

    if ($authf =~ m/\.epub/ig) {
        our $authf2 = $authf;

        foreach my $myfilt (@mobiarray) {
            my $mymobi = $myfilt;
            my $myepub = $authf2;

            $mymobi = &extfilter($mymobi);
            $myepub = &extfilter($myepub);


sub extfilter($line) {
    ($line) = @_;
    $line =~ s/\.mobi//ig;
    $line =~ s/\.epub//ig;
    $line =~ s/^\s+|\s+$//g;
    $line = lc $line;
    return $line;
}
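
One thing that may be worth checking in the directory loop: `readdir` returns bare file names, so `-f $file` is evaluated relative to the current working directory rather than `$dir2`. Unless the script runs from inside `$dir2`, the test may silently skip everything. A hedged sketch of the same scan with the full path tested; the helper name `list_by_type` and the use of `'.'` as the directory are placeholders, not part of the original script.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return plain-file names in $dir, bucketed by extension.
sub list_by_type {
    my ($dir) = @_;
    my %by_type;
    opendir( my $dh, $dir ) or die "Cannot open $dir: $!";
    while ( defined( my $file = readdir $dh ) ) {
        # Test the full path, not the bare name returned by readdir.
        next unless -f File::Spec->catfile( $dir, $file );
        if ( $file =~ m/\.(mobi|azw3|epub)$/i ) {
            push @{ $by_type{ lc $1 } }, $file;
        }
    }
    closedir $dh;
    return \%by_type;
}

my $found = list_by_type('.');    # '.' is only a stand-in for $dir2
printf "%s: %d file(s)\n", $_, scalar @{ $found->{$_} } for sort keys %$found;
```

Returning one hash of lists keyed by extension also replaces the separate per-type arrays, which fits the move toward hashes discussed above.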

In reply to Re^2: Duplicates in Directories by kel
in thread Duplicates in Directories by kel
