PerlMonks  

cleanzip

by sflitman (Hermit)
on Oct 20, 2008 at 05:33 UTC ( [id://718138]=sourcecode )
Category: Utility Scripts
Author/Contact Info sflitman - Stephen Flitman - sflitman >!< xenoscience.com
Description: This is a quickie script to clean infected zip files using clamscan. It is meant for Unix/Linux platforms and expects clamscan 0.94, which, with the switches used here, prints each infected file on a line by itself along with the virus it found. Note that this script will also delete zipped emails or mailboxes that clamscan identifies as containing phishing and the like, since it does not distinguish between the kinds of unwanted byte sequences clamscan reports. The script works around a deficiency in ClamAV noted by many: it does not identify the actual bad actor(s) in an archive, only that the archive as a whole is infected (and not whether it is multiply infected, which is of course possible).
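To make the expected report format concrete, here is a minimal sketch of splitting one clamscan report line into a path and a signature name. It assumes the "PATH: SIGNATURE FOUND" line shape described above; the helper name parse_clamscan_line and the sample lines are made up for illustration and are not part of the script or of ClamAV.

```perl
use strict;
use warnings;

# Hypothetical helper: split one report line into (path, signature).
# Assumes lines shaped like "PATH: SIGNATURE FOUND"; returns an empty
# list for anything else (for example "PATH: OK").
sub parse_clamscan_line {
    my ($line) = @_;
    # The greedy .+ splits at the last ": " that still leaves a
    # "SIGNATURE FOUND" tail, so a colon inside the path is tolerated.
    return $line =~ /^(.+): (\S.*?) FOUND$/ ? ($1, $2) : ();
}

# Made-up sample lines, not real clamscan output.
my @report = (
    '/tmp/mail.zip.dir/inbox/msg1.eml: Example.Phishing.Signature FOUND',
    '/tmp/mail.zip.dir/readme.txt: OK',
);

for my $line (@report) {
    my ($path, $sig) = parse_clamscan_line($line);
    print defined $path ? "infected: $path ($sig)\n" : "skipped: $line\n";
}
```

The script itself strips everything from the first colon onward, which is simpler but would truncate a path that happens to contain a colon.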
#!/usr/bin/perl
# Written by Stephen S. Flitman, MD
# Copyright (C) 2008 Xenoscience, Inc.
# Released under GPL v3
# 101908 clean up infected zip files

use strict;
use warnings;

my $CLAMSCAN=`which clamscan`;
chomp $CLAMSCAN;
die "Where's clamscan?\n" unless $CLAMSCAN;
my $ZIP=shift @ARGV;
die "Usage: $0 file.zip\n" unless defined $ZIP && -f $ZIP;
my $TMP=$ZIP;
$TMP=~s!/!_!g;
$TMP="/tmp/$TMP.dir";
system "unzip $ZIP -d $TMP";
my $BADFILES=`$CLAMSCAN --recursive --infected --no-summary $TMP`;
unless ($BADFILES) {
   print "No viruses found in $ZIP\n";
   exit;
}

my (@BADFILES,$BADFILE,$RESULT);
@BADFILES=split(/\n/,$BADFILES);

for $BADFILE (@BADFILES) {
   if ($BADFILE=~s/:.*FOUND$//) {
      $BADFILE=substr($BADFILE,length($TMP)+1);
      print "File to delete is '$BADFILE'\n";
      $RESULT=`zip -d $ZIP "$BADFILE"`;
      print $RESULT;
   } else {
      print "Nothing to do for $BADFILE\n";
   }
}
system "rm -r $TMP";
exit;
Re: cleanzip
by Anonymous Monk on Oct 20, 2008 at 12:56 UTC
Re: cleanzip
by graff (Chancellor) on Oct 21, 2008 at 02:11 UTC
    I think you'll want to be careful about strings that get passed to sub-shells, whether via system() calls or via back-ticks. At best, the job can just fail because of a file name that happens to contain a space or other shell meta-character (and at worst, you might be unpacking zip files whose contents include file names like foo; rm -rf /*).

    You can easily change your one system call to the list style:

    system( 'unzip', $ZIP, '-d', $TMP ); # instead of 'system "unzip $ZIP -d $TMP"'
    For the back-ticks, where you want the command's stdout to be assigned to a variable, it might suffice to put a backslash in front of any shell-magic characters:
    $ZIP =~ s/([^\w.-])/\\$1/g;
    for $BADFILE ( @BADFILES ) {
        if ($BADFILE=~s/:.*FOUND$//) {
            $BADFILE=substr($BADFILE,length($TMP)+1);
            $BADFILE =~ s/([^\w.-])/\\$1/g;
            $RESULT = `zip -d $ZIP $BADFILE`;
            print $RESULT;
        }
        ...
    (not tested)

    Bear in mind that in the OP code, the placement of double-quotes around "$BADFILE" in the back-tick command (running "zip -d ...") will do no good when the file name happens to contain one or more double-quote characters.
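    As an alternative to escaping, the back-tick call can be replaced by a list-form pipe open, which hands each argument to the program directly with no shell in between. The following is only a sketch: the helper name capture_noshell is hypothetical, and the running perl ($^X) stands in for zip so the example runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command without a shell and return its stdout.
sub capture_noshell {
    my ($cmd, @args) = @_;
    # List-form pipe open: every list element reaches the program as one
    # argument, so spaces, quotes and semicolons are never interpreted.
    open(my $fh, '-|', $cmd, @args) or die "cannot run $cmd: $!";
    local $/;                 # slurp the whole output in one read
    my $out = <$fh>;
    close $fh;                # the command's exit status is left in $?
    return defined $out ? $out : '';
}

# Demonstration: a file name full of shell-magic characters arrives intact.
my $name = 'foo "bar"; echo oops';
print capture_noshell($^X, '-e', 'print "[$ARGV[0]]\n"', $name);
```

    In the OP's loop this would presumably be called as capture_noshell('zip', '-d', $ZIP, $BADFILE) (not tested against zip). It removes the shell from the picture only; whatever zip itself does with the name is a separate matter.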

      Agree fully. I've also posted execute() which uses an exec-like mechanism, bypassing shell issues for arguments. Here it is again:
      sub execute {   # execute a command without shell but with timeout
         my ($cmd,@args)=@_;
         my $timeout=15;   # seconds
         my ($result,$pid,$i,$time);
         if ($args[$#args]=~/^--?timeout=(\d+)$/i) {   # or pass as last arg
            $timeout=$1;
            pop @args;
         }
         die "execute($cmd,".join(',',@args)."): null arguments"   # SSF 080708 eliminate null arguments, they are always erroneous
            if grep { length($_)==0 } @args;
         $i=index($cmd,' ');   # args appended to command?
         if ($i>-1) {
            unshift @args,split(/\s+/,substr($cmd,$i+1));   # could be more than one
            $cmd=substr($cmd,0,$i);
         }
         eval {
            local $SIG{ALRM} = sub { die "alarm\n" };   # NB: \n required
            local($/)=undef;
            alarm $timeout;
            $time=time if debug_has(EXEC);
            $pid=open(CMD,'-|',$cmd,@args);   # run without shell overhead
            if ($pid) {
               $result=<CMD>;
               close CMD;
               debug sprintf("execute(%s%s%s):\n%s%s [%.3f s]",$cmd,@args?',':'',join(',',@args),substr($result,0,500),length($result)>500?'...':'',time-$time)
                  if debug_has(EXEC);
            } else {
               alarm 0;
               die "execute($cmd ".join(' ',@args)."): $!" unless $pid;
            }
            alarm 0;
         };
         if ($@) {
            die $@ unless $@ eq "alarm\n";   # propagate unexpected errors
         }
         $result;
      }
      SSF
