oliverm has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I am going to implement a file integrity checker in Perl, as I am sick and tired of Tripwire and AIDE. Which type of data structure should I use to store many values for each index?

Replies are listed 'Best First'.
Re: multidimensional array?
by davido (Cardinal) on Aug 20, 2004 at 01:11 UTC

    I'm afraid you'll need to provide more information. An array of arrays, an array of hashes, a hash of arrays, a hash of hashes, or any arbitrarily complex data structure can all be created using references. But the data structure should probably be a natural extension of the data you're working with, and I'm afraid you haven't told us what exactly that is going to be.

    But we can give you some good starting points. Start with perlreftut (tutorial on Perl references) and perlref (in-depth explanation of Perl references). Then move on to perldsc (the Perl Data Structures Cookbook) and perllol (Perl lists of lists discussion). Those documents will prepare you to make an informed decision about which data structures to use and how to manipulate them.
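To make the idea of reference-based structures a little more concrete, here is a minimal sketch of a hash of hashes keyed by file path. The attribute names (size, mode, mtime) and the sample paths and values are assumptions invented for the example; the original question does not say what will be stored.

```perl
use strict;
use warnings;

# Hypothetical record layout: one inner hash of attributes per file path.
my %files = (
    '/etc/passwd' => { size => 1024, mode => 0644, mtime => 1092960000 },
    '/etc/shadow' => { size => 512,  mode => 0600, mtime => 1092960500 },
);

# Add another record; the braces build an anonymous hash and yield a reference to it.
$files{'/etc/hosts'} = { size => 256, mode => 0644, mtime => 1092961000 };

# Look up a single value through the reference.
my $shadow_size = $files{'/etc/shadow'}{size};

# Walk the whole structure.
for my $path (sort keys %files) {
    my $attr = $files{$path};    # $attr is a hash reference
    printf "%-12s size=%d mode=%04o\n", $path, $attr->{size}, $attr->{mode};
}
```

The same pattern extends to a hash of arrays or an array of hashes by swapping the braces for square brackets at the appropriate level.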


    Dave

Re: multidimensional array?
by kvale (Monsignor) on Aug 20, 2004 at 01:06 UTC
    If your indices are numerical and dense (0,1,2,3,4,...) then a multi-dimensional array would be a good match. If your indices are alphanumeric, or are numeric and sparse (21, 53, 678, 112232, ...) use a multi-dimensional hash instead.
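A small sketch of that distinction, with made-up values: dense numeric indices go into an array of arrays, while sparse keys go into a hash of hashes. The variable names and the `size` field are assumptions for illustration only.

```perl
use strict;
use warnings;

# Dense numeric indices: an array of arrays addressed by row and column.
my @grid;
for my $row (0 .. 2) {
    for my $col (0 .. 3) {
        $grid[$row][$col] = $row * 10 + $col;    # inner arrays autovivify
    }
}
my $cell = $grid[2][3];

# Sparse numeric keys: assigning to index 112232 of an array would extend
# the array to 112233 elements, so a hash of hashes is the better fit.
my %sparse;
$sparse{21}{size}     = 100;
$sparse{112232}{size} = 200;
my $key_count = keys %sparse;    # number of keys actually stored

print "cell=$cell keys=$key_count\n";
```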

    -Mark

Re: multidimensional array?
by Tuppence (Pilgrim) on Aug 20, 2004 at 01:56 UTC

    I would use objects instead of raw hash indexes for this kind of thing. In which case, you may wish to look into Class::MethodMaker and its object_list support, as well as writing a test suite that expresses what you want your code to do.
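For a rough idea of what the object approach could look like, here is a sketch using a plain blessed hash rather than Class::MethodMaker (whose generated accessors are not shown here). The class name `FileRecord`, its fields, and the `differs_from` method are all hypothetical.

```perl
use strict;
use warnings;

package FileRecord;    # hypothetical class name

sub new {
    my ($class, %args) = @_;
    my $self = {
        path  => $args{path},
        size  => $args{size},
        mtime => $args{mtime},
    };
    return bless $self, $class;
}

# Simple read-only accessors.
sub path  { $_[0]{path} }
sub size  { $_[0]{size} }
sub mtime { $_[0]{mtime} }

# True if size or mtime differs from another record for the same file.
sub differs_from {
    my ($self, $other) = @_;
    return $self->size != $other->size || $self->mtime != $other->mtime;
}

package main;

my $old = FileRecord->new(path => '/etc/hosts', size => 256, mtime => 1092961000);
my $new = FileRecord->new(path => '/etc/hosts', size => 300, mtime => 1092999000);
print $old->path, " changed\n" if $new->differs_from($old);
```

Hiding the hash behind accessors means the stored fields can change later without touching the comparison code that calls them.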

    As an aside, are you sure it's worth the effort to write from scratch such a complicated tool that still will not really get you what you want?

    I say this mostly because, as a user-space solution, any sort of file integrity checker will be vulnerable to kernel modules that hide the attacker's files from user-level view.

    Of course, if you're using this for simple file tracking, then more power to you.

Re: multidimensional array?
by zentara (Cardinal) on Aug 20, 2004 at 14:10 UTC
    For a file-integrity checker, I think you will want to store the data in a database of some sort, so you can compare it to current values after a reboot. Here is an older, but still nice, script called Claymore. It uses ASCII flat files, but you could probably speed things up by using hashes and Storable.
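The hashes-plus-Storable idea might be sketched as follows. This is not how Claymore works; the helper names `snapshot` and `changed_paths`, the file names, and the choice to track only size and mtime are assumptions for the example. A real checker would likely also record a content digest.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Build a hash of hashes: path => { size, mtime }.
# In the list returned by stat, element 7 is the size and element 9 the mtime.
sub snapshot {
    my @paths = @_;
    my %snap;
    for my $path (@paths) {
        my @st = stat($path);
        next unless @st;    # skip files that cannot be examined
        $snap{$path} = { size => $st[7], mtime => $st[9] };
    }
    return \%snap;
}

# Return the paths in the old snapshot that are missing or altered in the new one.
sub changed_paths {
    my ($old, $new) = @_;
    my @changed;
    for my $path (sort keys %$old) {
        my $now = $new->{$path};
        push @changed, $path
            if !$now
            || $now->{size}  != $old->{$path}{size}
            || $now->{mtime} != $old->{$path}{mtime};
    }
    return @changed;
}

# Demonstration in a throwaway directory.
my $dir      = tempdir(CLEANUP => 1);
my $target   = "$dir/watched.txt";
my $baseline = "$dir/baseline.sto";

open my $fh, '>', $target or die "cannot write $target: $!";
print {$fh} "original\n";
close $fh;

store(snapshot($target), $baseline);    # save the baseline to disk

open $fh, '>>', $target or die "cannot append to $target: $!";
print {$fh} "tampered\n";
close $fh;

my $saved   = retrieve($baseline);      # e.g. on a later run, after a reboot
my @changed = changed_paths($saved, snapshot($target));
print "changed: $_\n" for @changed;
```

Because the baseline lives in a file, it should be kept somewhere an intruder cannot rewrite it, such as read-only media.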

    I'm not really a human, but I play one on earth. flash japh