jbush82 has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on writing an MD5 hash scanner (searches for files on a system, if it finds a file that matches a filename of a known bad, does and MD5 on it and compares to known bad files MD5), and I've got it working up to the point of identifying the files on a system that match in my list of bad files (I think), but when I try to take the match data and put it into a multi-dimensional array, it seems to not be working.

If you'd like to give it a go up to this point, you need psexec (http://www.microsoft.com/technet/sysinternals/security/psexec.mspx) and also replace the IP address 1.2.3.4 in the script to the IP of the system you are currently on (need admin rights to do the scan). Any advice would be appreciated.
#!C:\perl\bin\perl.ex#!C:\perl\bin\perl.exe use strict; my @fileinfo; my @sysfiles; my $i = 0; my $md5sum; my $filename; my $md5data; my @md5info; my @md5filename; open(FILE, "knownbad.txt") or die("Unable to open file"); my @knownbad = <FILE>; close(FILE); my $k = 0; foreach $md5data (@knownbad) { chop($md5data); ($filename, $md5sum) = split(/\,/, $md5data); $md5info[$k] = [$filename, $md5sum]; $md5filename[$k] = $filename; $k++; } open FILES, "psexec.exe -n 2 \\\\1.2.3.4 cmd.exe \/C dir C\:\\ \/S \/B + |" or die; while ( <FILES> ) { my $dir; my $file; ( $dir, $file ) = m/(.*)[\\\/](.+)/ ? ( $1, $2 ) : ( undef, $_ ); # print "$file is in the directory $dir\n"; $fileinfo[$i] = [$file, $dir]; $sysfiles[$i] = $file; $i++; } close FILES; sleep 5; my $filecount = (scalar (@fileinfo)) - 1; my $hashcount = (scalar (@md5info)) - 1; my $refarray1 = $fileinfo[$filecount]; print "File: $refarray1}[0] --- Directory: ${$refarray1}[1]]\n"; my $refarray2 = $md5info[$hashcount]; my @isect_list; my %isect = (); map { $isect{$_} = 1 } @md5filename; @isect_list = grep { $isect{$_} } @sysfiles; my @uniq = sort keys %{ { map { $_, 1 } @isect_list } }; my @matchedfiles; my $intersect; foreach $intersect (@uniq) { my $c=0; while ($c <= $filecount){ $refarray1 = $fileinfo[$c]; if (${$refarray1}[0] eq $intersect) { print "File: ${$refarray1}[0] in directory ${$refarray1}[1]\n"; push(@{$matchedfiles[$c]}, [${$refarray1}[0], ${$refarray1}[1]]) +; } $c++; } } print "\n\n"; my $refarray3 = $matchedfiles[5]; print "it's not defined!\n" unless defined(${$refarray3}[0]); print "it's not defined!\n" unless defined(${$refarray3}[1]); print "File: ${$refarray3}[0] in directory ${$refarray3}[1]\n";

Replies are listed 'Best First'.
Re: Uninitialized value in concatenation (.) or string?
by pc88mxer (Vicar) on May 09, 2008 at 15:38 UTC
    Is this where (if you uncommented the print statement) you are getting the "Uninitialized value in concatenation..." error:
    ( $dir, $file ) = m/(.*)[\\\/](.+)/ ? ( $1, $2 ) : ( undef, $_ ); # print "$file is in the directory $dir\n"; $fileinfo[$i] = [$file, $dir];
    If so, it's probably because $dir is getting set to undef, and then you are using it in string interpolation in the print statement.

    Also, later that undef value could be causing problems in this code fragment:

    $refarray1 = $fileinfo[$c]; if (${$refarray1}[0] eq $intersect) { print "File: ${$refarray1}[0] in directory ${$refarray1}[1]\n"; $matchedfiles[$i] = [${$refarray1}[0], ${$refarray1}[1]]; }
    The expression ${$refarray1}[1] refers to the dir component of your fileinfo structure, so if it is undef, the print statement in this fragment will also generate the same error message.

    Also, don't you want to increment $i in this loop?

    $matchedfiles[$i++] = [${$refarray1}[0], ${$refarray1}[1]];
    In general it's easier to use push: push(@matchedfiles, [ ... ]);

    Update: Here are some style pointers that might make things a little cleaner.

    1) Read the md5 info into a hash:

    my %md5_of_known_bad; foreach $md5data (@knownbad) { ... $md5_known_bad{$filename} = $md5hash; }
    This serves two purposes: it stores the md5hash of a known bad file and it also can be used to determine if a file is a bad file by using exists:
    if (exists $md5_of_known_bad{$filename}) { # $filename is a known bad }

    2) Your intersection logic can then be simplified to:

    my @uniq; @uniq = grep { exists $md5_of_known_bad{$_} } @sysfiles;
      I just commented out the print statement to speed up the scan of the remote system (outputting slows it down a lot).

      I get the error on line 71

      Use of uninitialized value in concatenation (.) or string at nfse.pl line 71.

      69 print "\n\n"; 70 my $refarray3 = $matchedfiles[5]; 71 print "File: ${$refarray3}[0] in directory ${$refarray3}[1]\n";


      Also, don't you want to increment $i in this loop? $matchedfiles[$i++] = [${$refarray1}[0], ${$refarray1}[1]];

      Actually, I replaced $i with $c and just missed that spot when I did it. Ran the script again and I am of course having the same problem.

      In general it's easier to use push: push(@matchedfiles, ... );

      I attempted this using push($matchedfiles[$c], [${$refarray1}[0], ${$refarray1}[1]]);, but I received the message:
      D:\Documents\network file scanner\enhanced>perl -w nfse.pl Type of arg 1 to push must be array (not array element) at nfse.pl lin +e 63, near "])" Execution of nfse.pl aborted due to compilation errors.



      Thanks for your response. It is going to take me some time to figure out some of your recommendations. I'll post back once I've gone through it and hopefully got it going.
        Check the value of ${$refarray3}[1] for being undef:
        print "it's not defined!\n" unless defined(${$refarray3}[1]);
        If it's undef, it's because of your assignment of undef to $dir back in the while (<FILES>) {...} loop. Perhaps you can use the empty string there instead.

        Also, if you want $matchedfiles[$c] to be list of files, you need to use this syntax:

        push(@{$matchedfiles[$c]}, [...]);
        This makes $matchedfiles[$c] an array ref (i.e. @matchedfiles is an array of array refs.)
Re: Uninitialized value in concatenation (.) or string?
by starbolin (Hermit) on May 10, 2008 at 06:29 UTC

    This code:

    push(@{$matchedfiles[$c]}, [ ... ]); Is incorrect, it's usually a bad idea to use the index from one array to index into another array. $c holds the position in your fileinfo array but $matchedfiles[$c] is not defined yet. The error is compounded when you use $matchedfiles[$c] as an array reference as perl will automatically cause that element to spring to life ( and everything below it ) even if that element was not previously defined. This is called 'autovivification' and can be useful; but in this case is merely a troublesome side-effect of using an undefined reference.


    s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}
      starbolin writes:
      This code: push(@{$matchedfiles[$c]}, [ ... ]); Is incorrect. ...
      But I don't see a problem with auto-vivification in this case. Consider this test code:
      use Data::Dumper; my @m; my $c; $c = 3; push(@{$m[$c]}, [ 'foo' ]); $c = 2; push(@{$m[$c]}, [ 'bar' ]); print Dumper(\@m), "\n"; for (0..3) { print "length of m[$_] = ", scalar(@{$m[$_]}), "\n"; }
      The first push not only creates $m[3] as an array ref, but also sets $m[0], $m[1] and $m[2] to undef. The second push changes $m[2] from undef to a new array ref.

      The last for loop shows that undef is basically indistinguishable from [], i.e. a reference to an empty array.

        I didn't say auto-vivification was the problem. I said that using undefined references was his problem. Auto-vivification is saving his butt. He should be using a hash here. Trying to emulate a hash by using a sparse matrix is difficult, error prone and wrong.

        In general the code is very poor; the OP uses map when foreach would do, he uses foreach when map or grep is called for, he iterates over an array to create a hash when he could have hashed the data to begin with, he has multiple arrays holding the same data, he uses push when a hash is called for and he uses indexing where a push would serve. Arguing over auto-vivification is akin to rearranging the deck chairs while the ship is sinking.


        s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}