cutlass2006 has asked for the wisdom of the Perl Monks concerning the following question:

Hello kind monks,

After some soul searching I find myself unable to find any perl modules/tools assisting one to perform low level forensics on hard drives (or perhaps more appropriately hard drive disk images).

I am seeking wisdom on directly reading disk sectors from actual hard drives as well as hard drive images ... any pointers, tips, or links much appreciated.

Replies are listed 'Best First'.
Re: disk image forensics
by zentara (Cardinal) on Aug 13, 2008 at 18:53 UTC
    First of all, only test this stuff on spare, unimportant partitions. I warned you!!!

    Perl's not really suited to low-level stuff, except maybe for the regexing of the binary data you pump out. I know an old trick to read a BIOS is to use dd (on Linux):

    dd if=/dev/mem bs=32k skip=31 count=1 | strings -n 10 | grep -i bios
    You probably can use this same technique on raw disks, like
    dd if=/dev/hdb0 | strings -n 10 | grep -i secretkey
    To put it in Perl, you probably can run it thru a piped open, and regex the output
    my $pid = open(FH, " dd if=/dev/hdb0 | ") or die "$!\n";
    while( my $rrv = sysread( FH, my $buf, 1012 ) ){
        # regex your $buf here for whatever
        # of course you will have to worry about missing full strings
        # on your chunk boundaries, so you may need to save a few
        # hundred bytes of each $buf to add to the next one
    }
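
    The chunk-boundary problem mentioned in those comments can be handled by carrying the tail of each chunk into the next read. What follows is only a sketch under assumptions: scan_chunks, the 16-byte overlap, and the throwaway temp file are made up for illustration, and it assumes a fixed-string pattern no longer than the overlap plus one byte. A real run would pass a handle opened on the dd pipe or on an image file instead.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use File::Temp qw(tempfile);

# Hypothetical helper: return the absolute byte offsets at which $re
# matches in the stream read from $fh, reading $chunk_size bytes at a time.
sub scan_chunks {
    my ($fh, $re, $chunk_size, $overlap) = @_;
    my @found;
    my $tail = '';   # bytes carried over from the previous chunk
    my $base = 0;    # absolute offset of the first byte of $tail
    while (1) {
        my $n = sysread($fh, my $buf, $chunk_size);
        die "read failed: $!\n" unless defined $n;
        last if $n == 0;                    # end of input
        my $window = $tail . $buf;
        my $seen   = length $tail;
        while ($window =~ /$re/g) {
            next if $+[0] <= $seen;         # wholly inside the old tail: seen before
            push @found, $base + $-[0];
        }
        # Keep the last $overlap bytes so a match split across reads is found.
        my $keep = $overlap < length($window) ? $overlap : length($window);
        $base += length($window) - $keep;
        $tail  = $keep ? substr($window, -$keep) : '';
    }
    return @found;
}

# Demo on a throwaway file: the first marker straddles the 1012-byte boundary.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} 'x' x 1008, 'secretkey', 'y' x 500, 'secretkey';
close $tmp or die "close: $!\n";

sysopen(my $in, $path, O_RDONLY) or die "open $path: $!\n";
my @hits = scan_chunks($in, qr/secretkey/, 1012, 16);
close $in;
print "matches at byte offsets: @hits\n";   # prints 1008 1517
```

    The overlap has to be at least one byte shorter than the longest string you expect, or a match split across two reads can still slip through.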

    I'm not really a human, but I play one on earth Remember How Lucky You Are
Re: disk image forensics
by dwm042 (Priest) on Aug 13, 2008 at 19:52 UTC
    For what it is worth, Autopsy is written in Perl and is a nice package for doing forensic analysis of disks and servers. It is commonly included in Linux forensic distributions, such as the Belgian Police's FCCU.

Re: disk image forensics
by mr_mischief (Monsignor) on Aug 13, 2008 at 19:17 UTC
    The problem with a disk image is that it's binary data in a format that depends on many factors.

    What's the filesystem? That determines many other things. How are attributes, ownership, and timestamps stored? Is there journaling involved? If so, what part of the data is the journal?

    What's the block size? Many filesystems allow different block sizes to be configured.

    What's the byte order of the stored data? Is it determined by the filesystem spec, or does the filesystem use whatever is native to the platform?

    The best reference for a filesystem is often the implementation of the FS driver for the platform on which the image was created. IOW, it's usually easiest to mount the file system (possibly read-only) and look at the files and directories that way.

    Perl could certainly be used to read and manipulate data according to a filesystem specification, but you're looking at reinventing many very intricate wheels.

Re: disk image forensics
by Your Mother (Archbishop) on Aug 13, 2008 at 20:20 UTC

    It's been years since I cracked it open but IIRC "Perl for System Administration" ISBN 9781565926097 opens with an example of scanning a dropped laptop's hard drive block by block to rescue everything that wasn't damaged. I'm sure there would be some good info in there related to whatever you're doing.

Re: disk image forensics
by gone2015 (Deacon) on Aug 13, 2008 at 19:27 UTC

    Many moons ago I ended up writing a mound of code in order to recover data from a busted ext3 file system.

    You can use sysopen and sysread directly on /dev/whatever -- you end up reading blocks of bytes, and picking over those either with substr or unpack. I implemented a modest cached buffering system, for obvious reasons. (Of course you can open the device read-only.)
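
    A minimal sketch of that approach, not the code described above: read_sector is a hypothetical helper, 512-byte sectors are assumed, and a temp file stands in for /dev/whatever so the example is safe to run. Pointing sysopen at a real device (read-only) should work the same way, permissions allowing.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper: return the raw bytes of sector $lba from an
# already-open handle (device node or image file).
sub read_sector {
    my ($fh, $lba, $size) = @_;
    $size ||= 512;                          # assumed sector size
    defined sysseek($fh, $lba * $size, SEEK_SET)
        or die "seek to sector $lba failed: $!\n";
    my $got = sysread($fh, my $buf, $size);
    die "read of sector $lba failed: $!\n" unless defined $got;
    die "short read at sector $lba ($got bytes)\n" unless $got == $size;
    return $buf;
}

# Build a three-sector stand-in image; sector 2 starts with a
# little-endian 32-bit value.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\0" x 1024, pack('V', 0x12345678), "\0" x 508;
close $tmp or die "close: $!\n";

# O_RDONLY means nothing here can write to the image or device.
sysopen(my $img, $path, O_RDONLY) or die "open $path: $!\n";
my $sector  = read_sector($img, 2);
my ($value) = unpack 'V', substr($sector, 0, 4);   # pick fields out with substr/unpack
close $img;
printf "sector 2 starts with 0x%08x\n", $value;
```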

    Mind you, after a while I began to wonder if Perl really was the best language to do this in!

Re: disk image forensics
by ohcamacj (Beadle) on Aug 14, 2008 at 07:58 UTC
    It is easy to mount a cdrom or hard drive PARTITION image under linux.
    mount -o loop,ro,noatime cdrom_image.iso mountpoint/
    mount -o loop,ro,noatime partition_image.img mountpoint/
    However, directly trying to mount a hard drive image fails, because the start of a disk is not the start of the first partition.
    The first sector of a typical hard drive looks something like
    ( Description of MSDOS-style partition table and master boot record as
      gleaned from the source of /parted/ and /grub/. )
    ----------------------------------------------------------------
    0 - 63
    boot code boot code boot code boot code boot code boot code
    ----------------------------------------------------------------
    64 - 127
    boot code boot code boot code boot code boot code boot code
    ----------------------------------------------------------------
    128 - 191
    boot code boot code boot code boot code boot code boot code
    ----------------------------------------------------------------
    192 - 255
    boot code boot code boot code boot code boot code boot code
    ----------------------------------------------------------------
    256 - 319
    boot code boot code boot code boot code boot code boot code
    ----------------------------------------------------------------
    320 - 383
    boot code boot code boot code boot code boot code boot code
    ----------------------------------------------------------------
    384 - 447
    boot code boot code boot code boot code (to 440)         AAAABB(
    ----------------------------------------------------------------
    448 - 511
    partition one )(partition two )(partition three)(partition four )CC

    AAAA = mbr_sig
    BB = unknown
    CC = magic

    Each 16 byte partition entry is
    ----------------
    0 - 15
    ABBBCDDDEEEEFFFF
    C = type
    BBB,DDD =
    EEEE = start sector from 0
    FFFF = length in sectors
    The sector counts are little-endian integers, easily parsable with unpack("V"). Using the offset of the partition, it is possible to mount a partition within a hard drive image file.
    mount -o ro,noatime,loop,offset=<sector offset * 512> hard_drive_image.img mountpoint/
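
    To get that offset from Perl, something along these lines might do. It is a sketch only: parse_mbr is a made-up name, 512-byte sectors are assumed, and a fabricated first sector stands in for one read from a real image. Treating the first byte of each entry as the boot flag and the two 3-byte fields as CHS addresses is an assumption taken from the conventional MBR layout, not from the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: decode the four primary partition entries from
# the first 512 bytes of a disk image.
sub parse_mbr {
    my ($sector) = @_;
    die "need a full 512-byte sector\n" unless length($sector) >= 512;
    my $magic = unpack 'v', substr($sector, 510, 2);   # bytes 55 AA on disk
    die sprintf("bad MBR magic 0x%04x\n", $magic) unless $magic == 0xAA55;

    my @parts;
    for my $i (0 .. 3) {
        # Layout ABBBCDDDEEEEFFFF: flag, 3 bytes, type, 3 bytes,
        # start sector (V), length in sectors (V).
        my ($flag, undef, $type, undef, $start, $count) =
            unpack 'C a3 C a3 V V', substr($sector, 446 + 16 * $i, 16);
        next if $type == 0;                            # unused slot
        push @parts, {
            index    => $i + 1,
            bootable => $flag == 0x80 ? 1 : 0,
            type     => $type,
            start    => $start,
            sectors  => $count,
            offset   => $start * 512,                  # value for mount -o offset=
        };
    }
    return @parts;
}

# Fabricated first sector: one Linux (0x83) partition starting at sector 63.
my $mbr = "\0" x 446
        . pack('C a3 C a3 V V', 0x80, "\0\0\0", 0x83, "\0\0\0", 63, 1_000_000)
        . "\0" x 48
        . "\x55\xAA";

my @parts = parse_mbr($mbr);
printf "partition %d: type 0x%02x, start sector %d, offset=%d\n",
    @{$_}{qw(index type start offset)} for @parts;
```

    For the fabricated sector this reports an offset of 32256 (63 * 512), which is the number that would go after offset= in the mount command.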
    On some older systems, mount only accepts offsets up to 2 GB. A simple way to check if this is a problem is to run
    losetup -o 5100200300 /dev/loop5 small_file
    followed by
    losetup /dev/loop5
    A system that limits offsets to 2 GB will print
    /dev/loop5: [XXX]:XXXXXX (small_file) offset 2147483647, no encryption
    A system that supports large offsets will print
    /dev/loop5: [XXX]:XXXXXX (small_file) offset 5100200300, no encryption