in reply to Compare 2 arrays

The following works on Windows (it will not work on *nix).

It reads all entries in the SDF file into an array, after converting the backslashes to forward slashes and lowercasing the whole shebang.

Then after we've grabbed up all *.sdf files in the specified directory, we lowercase these as well. On Windows, case is irrelevant on the file system, and this will avoid breaking exact matching if case differs.

If the file is not in the SDF file, we delete (unlink) it.

use warnings;
use strict;

use File::Find::Rule;

my $path = 'c:/sdf'; # path to look in
my $sdf_file = 'c:/sdf_file.txt';

open my $fh, '<', $sdf_file
  or die "can't open the flippin' flackin' file!: $!";

my @sdf_files;

while (<$fh>){
    chomp;
    if (my ($file) = /fullpath="(.*)"$/){
        # replace backslash to fwd slash, and lowercase
        $file =~ s|\\|/|g;
        $file = lc $file;
        push @sdf_files, $file;
    }
}

my @files = File::Find::Rule->file()
                            ->name('*.sdf')
                            ->in($path);

for my $file (@files){
    $file = lc $file;
    if (! grep {$file eq $_} @sdf_files){
        print "deleting $file\n";
        unlink $file or die $!;
    }
}
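
As a side note, the grep inside the loop rescans @sdf_files for every file found. A minimal sketch of the same comparison using a hash for the lookup, assuming the paths have already been normalized as described above. The sub name files_to_delete and the sample paths are made up for illustration, and nothing is unlinked here:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the entries of @$found that are not
# listed in @$listed. Both sides are lowercased before comparing.
sub files_to_delete {
    my ($listed, $found) = @_;

    # Build the lookup table once, instead of grepping the list per file
    my %keep = map { ( lc($_) => 1 ) } @$listed;

    return grep { !$keep{ lc($_) } } @$found;
}

# Made-up sample data standing in for the SDF entries and the directory scan
my @sdf_files = ('c:/sdf/one.sdf', 'c:/sdf/two.sdf');
my @files     = ('C:/sdf/One.sdf', 'c:/sdf/three.sdf');

print "would delete $_\n" for files_to_delete(\@sdf_files, \@files);
```

With the sample data, only c:/sdf/three.sdf is reported, since One.sdf matches its lowercased entry in the list.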

Re^2: Compare 2 arrays
by niceguy (Initiate) on Jun 28, 2016 at 17:19 UTC

    Hi stevieb,

    Thank you for your response. I want to apologize for not being clear about my environment. The directory contains only "*.nfo" files. Some are inactive, which is why I want to delete them. To identify which ones are inactive, I want to check whether each one is listed inside the "Test.sdf" file.

    I hope this will clarify the environment. Please let me know if you have any other suggestions.

      Please re-read my post at Re^2: Compare 2 arrays. This does take into account the *.nfo files in the SDF file. The code from stevieb can also be adjusted to do this. The Monks expect that you spend some time analyzing and understanding the code that is being written for you. You have a couple of approaches and both will work.

      Test in small increments. For example to parse the SDF file, you could break out my code into a short test program like this:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use Data::Dumper;

      my %keepList;
      while (my $line = <DATA>)
      {
          my $sdf_file;
          next unless ($sdf_file) = $line =~ /(\w+\.nfo)/;
          $keepList{$sdf_file} = 1;
          print "keeping $sdf_file\n";  #update for debugging #######
      }

      =example printout
      keeping filename1.nfo
      keeping filename2.nfo
      =cut

      __DATA__
      fullpath="C:\directory\filename1.nfo"
      id="1a"
      fullpath="C:\directory\filename2.nfo"

      As a note: If you are using Windows file names with a space in them, then the regex would be different. I only use filenames that are compatible with both Unix and Windows and that is probably the case here, but it may not be. One reason to run a simple test on the actual file!

      update: I should clarify: when you have a choice, use only [a-zA-Z0-9_] in file names. Basically, anything that meets the rules of a valid identifier in Perl or C is fine (what Perl calls \w characters). Forgo spaces and dashes in the names if you can, and your life will be easier.
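
      For names that do contain spaces or dashes, one possible adjustment is to capture everything between the last slash or backslash and the closing quote, instead of relying on \w. A rough sketch, assuming each entry sits on one line in the form fullpath="...\name.nfo". The sub name nfo_name and the sample lines are invented for illustration:

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: returns the bare .nfo file name from an SDF line,
      # or undef if the line has no fullpath="..." entry ending in .nfo.
      sub nfo_name {
          my ($line) = @_;

          # [^"]* skips the directory part, [\\\/] matches the last separator,
          # and ([^\\\/"]+\.nfo) captures the name, spaces and dashes included
          my ($name) = $line =~ /fullpath="[^"]*[\\\/]([^\\\/"]+\.nfo)"/i;
          return $name;
      }

      # Made-up sample lines
      my @lines = (
          'fullpath="C:\directory\my file-1.nfo"',
          'id="1a"',
          'fullpath="C:/directory/filename2.nfo"',
      );

      for my $line (@lines) {
          my $name = nfo_name($line);
          print "keeping $name\n" if defined $name;
      }
      ```

      The captured name keeps its original case, so lowercase both sides if the comparison should be case-insensitive.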

        Hi Marshall,

        Thanks! That does work! With the help of my co-worker, we refined it a little more by including white spaces in search.

        Thank you all who have helped me with my problem.

        Below is what we came up with.

        #!\perl\bin\perl
        use strict;
        use warnings;

        my $files = "C:/Directory";
        my $list = "C:/Test.sdf";

        open my $name, '<', $list or die "Failed to open file: $!\n";

        my @files = $files;
        opendir(OUTPUT, $files);
        @files = grep {$_ ne '.' and $_ ne '..'} readdir(OUTPUT);
        closedir(OUTPUT);

        my %wanted_files;
        while (<$name>) {
            if ($_ =~ /^\s*fullpath=.*[\\\/](\w+\.nfo)"/) {
                $wanted_files{$1} = 1;
            }
        }

        foreach my $target_file (@files) {
            if (not $wanted_files{$target_file}) {
                unlink($files . "/" . $target_file) or die $!;
            }
        }
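
        One caveat with the script above: it unlinks every directory entry that is not in the list, whatever its extension, and opendir is not checked for failure. A more cautious sketch, assuming only plain .nfo files should ever be removed. The sub name unwanted_nfo_files, the $dry_run flag, and the temporary directory with its file names are all hypothetical and only there to make the example runnable:

        ```perl
        use strict;
        use warnings;
        use File::Temp qw(tempdir);

        # Hypothetical helper: returns the plain .nfo files in $dir that are
        # not keys of %$wanted. Nothing is deleted here.
        sub unwanted_nfo_files {
            my ($dir, $wanted) = @_;

            opendir(my $dh, $dir) or die "Failed to open directory $dir: $!\n";
            my @names = grep { /\.nfo$/i && -f "$dir/$_" } readdir($dh);
            closedir($dh);

            return grep { !$wanted->{$_} } @names;
        }

        # Throwaway directory with made-up files, standing in for C:/Directory
        my $dir = tempdir(CLEANUP => 1);
        for my $name ('keep.nfo', 'stale.nfo', 'notes.txt') {
            open my $out, '>', "$dir/$name" or die "Cannot create $dir/$name: $!\n";
            close $out;
        }

        my %wanted_files = ('keep.nfo' => 1);
        my $dry_run      = 1;    # set to 0 to really delete

        for my $name (unwanted_nfo_files($dir, \%wanted_files)) {
            if ($dry_run) {
                print "would delete $dir/$name\n";
                next;
            }
            unlink("$dir/$name") or die "Cannot delete $dir/$name: $!\n";
        }
        ```

        Running it with $dry_run set first makes it easy to check the list of candidates before anything is actually removed.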