jwesley has asked for the wisdom of the Perl Monks concerning the following question:
Hello, here is my problem. I have written a script that looks for duplicate strings in 3 different files and outputs which strings are duplicated and in which files. This worked great until I was tasked with modifying the script to search an unknown number of files and report on those as well. Below is the current script:
    use File::Copy;
    use File::Find;

    ### Parse numerical characters and trailing white-space from rmtdb.lrl for matching.
    copy("rmtdb.lrl" , "rmtdb.tmp") or die "rmtdb.lrl file cannot be copied: $!\n";
    system "cat rmtdb.tmp | cut -d ' ' -f1 > rmtdb.tmp1";

    ### Open File Handles.
    open comdb, "< dblist.comdbg" or die "Cannot open connection to dblist.comdb: $!\n";
    open varldb, "< dblist.varldb" or die "Cannot open connection to dblist.varldb: $!\n";
    open rmtdb, "< rmtdb.tmp1" or die "Cannot open connection to rmtdb.lrl: $!\n";

    ### Create Lists
    @comdb = <comdb>;
    @varldb = <varldb>;
    @rmtdb = <rmtdb>;

    ### Close File Handles.
    close comdb;
    close varldb;
    close rmtdb;

    ### Case-shift rmtdb to lowercase.
    foreach (@rmtdb) {s/$_/\L$_/gi;}

    ### Begin matching.
    foreach $db (@comdb) # comdb against varldb.
    {
        @result = grep /^\Q$db\E$/i , @varldb;
        push(@com2var , @result);
    }
    foreach $db (@comdb) # comdb against rmtdb.
    {
        @result = grep /^\Q$db\E$/i , @rmtdb;
        push(@com2rmt , @result);
    }
    foreach $db (@varldb) # varldb against rmtdb.
    {
        @result = grep /^\Q$db\E$/i , @rmtdb;
        push(@var2rmt , @result);
    }

    ### Sort matches for final output.
    foreach (@com2var)
    {
        chomp($_);
        $hash1{$_}="dblist.comdbg dblist.varldb";
    }
    foreach (@com2rmt)
    {
        chomp($_);
        if (exists $hash1{$_})
        {
            $hash1{$_}="dblist.comdbg dblist.varldb rmtdb.lrl";
        }
        else
        {
            $hash1{$_}="dblist.comdbg rmtdb.lrl";
        }
    }
    foreach (@var2rmt)
    {
        chomp($_);
        if (! exists $hash1{$_})
        {
            $hash1{$_}="dblist.varldb rmtdb.lrl";
        }
    }

    ### Final Output.
    print "\n";
    foreach (keys %hash1)
    {
        print "$_ is duplicated in: $hash1{$_}\n";
    }
    print "\n";

    ### Cleanup
    unlink "rmtdb.tmp", "rmtdb.tmp1";
    exit 0;
Now there can be up to 20 rmt(*)db.lrl files in a given directory. I've figured out how to find the files, but I'm having trouble with the matching afterwards.
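One way the matching could be generalized to any number of files is to drop the pairwise `grep` passes and record, for each string, the set of files it was seen in. The sketch below is only an illustration under stated assumptions, not the script above and not taken from any reply: `find_duplicates` is a hypothetical helper name, the string of interest is assumed to be the first whitespace-delimited field on each line of every file (the original applies `cut` only to the rmtdb file), and comparison is assumed to be case-insensitive, as in the original matching.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): given a list of
# file names, return a hash ref mapping each duplicated string to a sorted
# list of the files it appears in.
sub find_duplicates {
    my @files = @_;
    my %seen;    # lowercased string => { file name => 1 }

    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!\n";
        while (my $line = <$fh>) {
            # Assumption: the string to compare is the first
            # whitespace-delimited field of the line.
            my ($name) = split ' ', $line;
            next unless defined $name;    # skip blank lines
            $seen{ lc $name }{$file} = 1; # case-insensitive comparison
        }
        close $fh;
    }

    # Keep only the strings that were found in more than one file.
    my %dups;
    for my $name (keys %seen) {
        my @in = sort keys %{ $seen{$name} };
        $dups{$name} = \@in if @in > 1;
    }
    return \%dups;
}

# Report only when file names are given on the command line.
if (@ARGV) {
    my $dups = find_duplicates(@ARGV);
    print "$_ is duplicated in: @{ $dups->{$_} }\n" for sort keys %$dups;
}
```

If this were saved as, say, `dups.pl` (a made-up name), it could be run as `perl dups.pl dblist.comdbg dblist.varldb rmt*db.lrl`, letting the shell expand the wildcard to however many rmt files exist. Because each string is keyed once in the hash, adding more files needs no additional matching loops and no temporary copies.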
Replies are listed 'Best First'.
Re: Sting matching
by wind (Priest) on Apr 21, 2011 at 08:38 UTC
by jwesley (Initiate) on Apr 22, 2011 at 00:17 UTC

Re: Sting matching
by John M. Dlugosz (Monsignor) on Apr 21, 2011 at 08:29 UTC

Re: String matching
by toolic (Bishop) on Apr 22, 2011 at 01:47 UTC
by jwesley (Initiate) on Apr 25, 2011 at 03:25 UTC