
Re: [NTF] to check md5 of loaded modules

by Discipulus (Canon)
on Sep 07, 2023 at 11:28 UTC

in reply to [NTF] Nice Perl ideas I have no time for

The first [NTF], born from a chatboard idea discussed among bliako, Corion and me, about..


..checking the md5 checksum of every module loaded.

A possible solution and tools for

Probably checking the md5 of every path in @INC would be easier and faster (using tar you can also check permissions), but checking every module is more granular, allows a white/black list to be specified, and is more fun :)

Basically, as explained in the require documentation, prior to 5.37.7 we can use a hook by putting a sub inside @INC, as in: unshift @INC, sub { my ($coderef, $filename) = @_; ... } (and this is the easier case ;) to have something done just before paths are searched for the required module. It is a trick I had forgotten.
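To make the @INC trick above concrete, here is a minimal sketch. The @seen array is only an illustrative name of mine, and List::Util is just a convenient module to request; the hook only records file names and then returns an empty list so that require carries on with the remaining @INC entries.

```perl
use strict;
use warnings;

my @seen;    # hypothetical log of the file names require asked for

# A code ref in @INC is called with itself and the requested file name.
unshift @INC, sub {
    my ( $self, $filename ) = @_;
    push @seen, $filename;
    return;    # empty list: require goes on with the remaining @INC entries
};

# Loaded at run time, after the hook is installed.
require List::Util;

print "Requested: @seen\n";
```

The hook also fires for whatever List::Util itself requires, so the list can hold more than one entry.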

In perldelta for v5.38.0 there is the new %{^HOOK} API, and require__before and require__after are actually available: nice and fun!
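A minimal sketch of the require__before hook, assuming Perl 5.38 or later (on older perls the guard simply skips the installation). The @considered array is a name I made up for the illustration; the hook receives the file name as its only argument.

```perl
use strict;
use warnings;

my @considered;    # hypothetical log filled by the hook

if ( $] >= 5.038 ) {
    # Called by require with the file name, before %INC and @INC are consulted.
    ${^HOOK}{require__before} = sub {
        my ($filename) = @_;
        push @considered, $filename;
        return;    # no "after" callback wanted
    };
}

require List::Util;

print "Perl $^V considered: @considered\n";
```

Unlike an @INC hook, this one is also called for files that are already in %INC, which is why the module code has to skip them explicitly.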


How to store the file containing the md5 checksums to check against? It should be protected to be useful. My idea (unimplemented) is that the module asks for a password as the first thing, to decipher a protected md5-checksums.txt file or a SQLite db.

The module itself needs some modules to run, so these are checked AFTER they are loaded, and this can be a security hole in a paranoid world.

The @INC array can be maliciously modified by other modules, so it should be saved early and used to scan for the file to load (not implemented, only my @original_INC = @INC; in my code). Probably the module should check the existence of the file using @original_INC and do it (returning its value) and then populate %INC ..just to be very paranoid :) My traverse_INC sub is almost empty.
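One possible way to fill in traverse_INC, as a sketch only: it walks the saved copy of @INC and returns the first existing file. Skipping references (hooks) and undefined entries is my assumption, not something the post specifies.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical completion of traverse_INC: find $filename using only
# the directories saved before anyone could tamper with @INC.
sub traverse_INC {
    my ( $filename, @original_INC ) = @_;
    for my $dir (@original_INC) {
        next if !defined $dir or ref $dir;    # ignore hooks and undef entries
        my $candidate = File::Spec->catfile( $dir, $filename );
        return $candidate if -f $candidate;
    }
    return;    # not found in the saved paths
}

my @original_INC = @INC;
my $path = traverse_INC( 'strict.pm', @original_INC );
print defined $path ? "strict.pm found at $path\n" : "strict.pm not found\n";
```

The returned path could then be checksummed before the file is loaded and %INC is populated.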

The fun starts with the hooks, and this part is totally unimplemented.

The code has a bunch of subs and a big BEGIN block. After the initial check I brutally test $^V ge '5.38.0' to decide which hook to use.

use strict;
use warnings;

package Paranoic;

print " here..\n";

sub calculate_module_md5{
    my $path = shift;
    open my $file, '<', $path or die "Impossible to open [$path]: $!";
    binmode $file;    # digest the raw bytes
    my $ctx = Digest::MD5->new;
    $ctx->addfile ($file);
    my $md5 = $ctx->hexdigest;
    close $file;
    return $md5;
}

sub load_md5_from_file{
    my %paranoic_INC;
    open my $file, '<', 'md5-check.txt' or die "Unable to load [md5-check.txt]: $!";
    while (<$file>){
        chomp $_;
        next if /^#/;
        next unless $_;
        # three whitespace separated columns: name, path, md5 (or ALLOW / DENY)
        my ($name,$path,$md5) = split /\s+/,$_;
        $paranoic_INC{ $name } = {
            path => $path,
            expected_md5 => $md5,
        };
    }
    return \%paranoic_INC;
}

sub traverse_INC{
    my $filename = shift;
    my @original_INC = @_;
}

BEGIN {
    my @original_INC = @INC;
    use Digest::MD5 ;
    use File::Spec;
    #print "BEFORE: ",map {"$_ $INC{$_}\n"} keys %INC;
    print "BEFORE any hook I will check md5 of already loaded module:\n";
    my $paranoic_inc = load_md5_from_file();
    foreach my $module ( keys %INC ){
        # SKIP itself while developing it
        # next if $module eq '';
        #test
        #if ($module eq ''){$$paranoic_inc{$module}{expected_md5}.='XXXXX'}
        my $md5 = calculate_module_md5( $INC{$module} );
        # NOT FOUND
        if (! exists $$paranoic_inc{$module} ){
            # this is a die
            print "Cannot find a stored md5 for [$module]";
            print "\n-->DEBUG: $module\t$INC{$module}\t$md5\n";
        }
        # WHITELIST
        elsif ( 'ALLOW' eq $$paranoic_inc{$module}{expected_md5} ){
            print " WHITELIST for $module at $$paranoic_inc{$module}{path} [$md5]\n";
        }
        # BLACKLIST
        elsif ( 'DENY' eq $$paranoic_inc{$module}{expected_md5} ){
            # this is a die
            print " DENY for $module at $$paranoic_inc{$module}{path} [$md5]\n";
        }
        # EXPECTED MD5
        elsif ( $md5 eq $$paranoic_inc{$module}{expected_md5} ){
            print " OK $module at $$paranoic_inc{$module}{path} has the expected md5: ".
                  "$$paranoic_inc{$module}{expected_md5}\n";
        }
        # WRONG MD5
        elsif ( $md5 ne $$paranoic_inc{$module}{expected_md5} ){
            # this is a die..
            print "ERROR: $module at $$paranoic_inc{$module}{path} has [$md5] ".
                  "instead of [$$paranoic_inc{$module}{expected_md5}]";
        }
        # UNKNOWN RESULT
        else{
            die "UNKNOWN error for $module at $$paranoic_inc{$module}{path} with md5 [$md5]"
        }
    }
    print "\nAFTER I will use some hook to check md5 of modules loaded by the calling program\n";
    if ( $^V ge '5.38.0'){
        print "====> Perl $^V using \$^HOOK\n";
        ${^HOOK}{require__before} = sub {
            my $filename = shift;
            if ( exists $INC{$filename} ){
                print " SKIP [$filename] already processed\n";
                return;
            }
            print "Paranoically considering [$filename]\n";
            return;    # do not leak the return value of print
        };
    }
    else{
        print "====> Perl $^V using \@INC\n";
        unshift @INC, sub {
            my ($self,$filename) = @_;
            print "Paranoically considering [$filename]\n";
            return;    # empty list: require continues with the rest of @INC
        };
    }
}
1;

The md5-check.txt is a simple file:

/usr/local/lib/perl5/5.36.1/ 7167a8489aafb9faddbbe48c6480f47c
/usr/local/lib/perl5/5.36.1/ 31b6105d6dc1cde54154291b86c8b285
Digest/ /usr/local/lib/perl5/5.36.1/x86_64-linux/Digest/MD5.pm d75a3d708ce93ad8d99fcbdefa2c8429
Digest/ /usr/local/lib/perl5/5.36.1/Digest/ b5de2696c583dfec247af39b45288735
/usr/local/lib/perl5/5.36.1/ 74a2550b5b0731996c0c825930003013
/usr/local/lib/perl5/5.36.1/ 9ac6b836ee45f6e08e5c8a84cee5e619
# ALLOW # /usr/local/lib/perl5/5.36.1/x86_64-linux/ 8f620379a0649ad32f14f1ce50b88bc0
File/ /usr/local/lib/perl5/5.36.1/x86_64-linux/File/ 7be482dda6bd364dd65e286b24cd8691
warnings/ /usr/local/lib/perl5/5.36.1/warnings/register.pm 2d8f6ce093a2176b982c0e12c0194b3b
File/Spec/ /usr/local/lib/perl5/5.36.1/x86_64-linux/File/Spec/ bf252d457a243d20eabbd91292fcf3f4
/usr/local/lib/perl5/5.36.1/ 56cde6eba0f667ab56196613df3933c1
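The post does not say how this file was produced. One possible way, sketched here under my own assumptions, is to dump %INC in the same three-column layout (name, path, md5) on a trusted installation. The sub name checksum_lines is hypothetical, and the sketch assumes paths without whitespace, since the loader splits on /\s+/.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: one "name path md5" line per loaded module file.
sub checksum_lines {
    my %loaded = @_;
    my @lines;
    for my $name ( sort keys %loaded ) {
        my $path = $loaded{$name};
        # skip entries set by hooks or pointing to nothing on disk
        next unless defined $path && !ref $path && -f $path;
        open my $fh, '<', $path or die "Cannot open [$path]: $!";
        binmode $fh;
        my $md5 = Digest::MD5->new->addfile($fh)->hexdigest;
        close $fh;
        push @lines, join ' ', $name, $path, $md5;
    }
    return @lines;
}

print "$_\n" for checksum_lines(%INC);
```

Redirecting the output to md5-check.txt would give a starting point to edit by hand, for instance to put ALLOW or DENY in the third column.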

..and the script is as simple as:

use strict;
use warnings;
use List::Util;

..finally the command invocation is: perl -I. -MParanoic to be paranoid as soon as possible.


For a month the demo will be available at the nice PerlBanjo website. The checksums are correct only for version 5.36.1, so you'll see errors (they should be a die in the code) for 5.38.

Here the output for future reference:

BEFORE any hook I will check md5 of already loaded module:
 OK at /usr/local/lib/perl5/5.36.1/ has the expected md5: 9ac6b836ee45f6e08e5c8a84cee5e619
 OK at /usr/local/lib/perl5/5.36.1/ has the expected md5: 7167a8489aafb9faddbbe48c6480f47c
 WHITELIST for at [f929845aba01aa4bf162a15cc254c123]
 OK at /usr/local/lib/perl5/5.36.1/ has the expected md5: 31b6105d6dc1cde54154291b86c8b285
 OK at /usr/local/lib/perl5/5.36.1/x86_64-linux/ has the expected md5: 8f620379a0649ad32f14f1ce50b88bc0
 OK Digest/ at /usr/local/lib/perl5/5.36.1/Digest/ has the expected md5: b5de2696c583dfec247af39b45288735
 OK at /usr/local/lib/perl5/5.36.1/ has the expected md5: 56cde6eba0f667ab56196613df3933c1
 OK File/Spec/ at /usr/local/lib/perl5/5.36.1/x86_64-linux/File/Spec/ has the expected md5: bf252d457a243d20eabbd91292fcf3f4
 OK Digest/ at /usr/local/lib/perl5/5.36.1/x86_64-linux/Digest/ has the expected md5: d75a3d708ce93ad8d99fcbdefa2c8429
 OK File/ at /usr/local/lib/perl5/5.36.1/x86_64-linux/File/Spe has the expected md5: 7be482dda6bd364dd65e286b24cd8691
 OK warnings/ at /usr/local/lib/perl5/5.36.1/warnings/regi has the expected md5: 2d8f6ce093a2176b982c0e12c0194b3b
 OK at /usr/local/lib/perl5/5.36.1/ has the expected md5: 74a2550b5b0731996c0c825930003013

AFTER I will use some hook to check md5 of modules loaded by the calling program
====> Perl v5.36.1 using @INC
 here..
Paranoically considering [List/]

The 5.38.0 output is different in the final part:

AFTER I will use some hook to check md5 of modules loaded by the calling program
====> Perl v5.38.0 using $^HOOK
 here..
 SKIP [] already processed
 SKIP [] already processed
Paranoically considering [List/]
 SKIP [] already processed
 SKIP [] already processed
 SKIP [] already processed
 SKIP [] already processed
 SKIP [] already processed


Have fun developing this Perl idea and share your progress!


There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

Replies are listed 'Best First'.
Re^2: [NTF] to check md5 of loaded modules
by tobyink (Canon) on Sep 07, 2023 at 13:40 UTC

    MD5 is a pretty old hash format and hasn't been considered especially secure for about a decade.

    Module::Signature switched to SHA256 about five years ago, so switching to that too might be a good idea. Especially as this means that any recent CPAN distributions packaged with Module::Signature in mind will include a SIGNATURE file (an example!) GPG-signed by the author, listing the SHA256 hashes for every file in the distribution including all modules.
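As a rough illustration of that switch, here is a sketch of a SHA-256 counterpart to calculate_module_md5 using the core Digest::SHA module. The sub name calculate_module_sha256 is hypothetical, and hashing Digest/SHA.pm itself is only a convenient demo target.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical drop-in counterpart of calculate_module_md5.
sub calculate_module_sha256 {
    my $path = shift;
    my $sha  = Digest::SHA->new(256);
    $sha->addfile( $path, 'b' );    # 'b' reads the file in binary mode
    return $sha->hexdigest;
}

my $digest = calculate_module_sha256( $INC{'Digest/SHA.pm'} );
print "Digest/SHA.pm $digest\n";
```

The stored checksum file would then need 64-character hex digests instead of the 32-character MD5 ones.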

Re^2: [NTF] to check md5 of loaded modules
by SankoR (Prior) on Sep 07, 2023 at 12:55 UTC
    Intercepting DynaLoader::dl_load_file(...) to verify the binary bits of XS modules would make this a lot more robust. I'd be more worried about a virus or something being written to inject code in a lib/dll than into a pure Perl module anyway.
    BEGIN {
        require DynaLoader;
        #
        no strict 'refs';
        no warnings 'redefine';
        my $keep = \&DynaLoader::dl_load_file;
        *DynaLoader::dl_load_file = sub {
            my ( $path, $flags ) = @_;
            warn "We should check '$path' here";
            &$keep(@_);
        };
    }

    # Random XS based core modules
    use Cwd;
    use Fcntl;
    use Digest::MD5;
    I guess you'll need to think about FFI loaded libraries eventually.