natebailey has asked for the wisdom of the Perl Monks concerning the following question:

I'd like to analyse a file system to explore:

Are there any scripts or modules that do this in an integrated/analytical way? Ideally they'd provide some graphs of usage (e.g. 10% of files by size are .doc, 20% .xls, 40% .jpg; or 30% of files were last accessed more than 12 months ago; or the top 10 non-common words appearing in your .docs are ..., etc.)

thanks!
Nathan

Replies are listed 'Best First'.
Re: Analyse file system - files, sizes, contents, access, etc.
by graff (Chancellor) on Oct 02, 2013 at 05:01 UTC
    Here's something I posted a while back. It doesn't do everything you have in mind, but it covers a decent subset, and maybe you could use it as a starting point for doing what you really want...

    Get useful info about a directory tree

      Awesome graff! I always prefer to start with some tried and tested code :-)
Re: Analyse file system - files, sizes, contents, access, etc.
by wjw (Priest) on Oct 02, 2013 at 04:24 UTC
    The only thing I have heard of or run into is Wefis, a web-based file manager written in Perl. I don't really know its status, and it is not exactly what you are describing, but it seems to have some components of what you want.

    Perhaps you might find something useful to you, or at least get some ideas from it.

    Out of curiosity...what is the interest in this? Considering the plethora of already available tools on most systems...? Not my business, but like I said, just curious...

    Good luck...


    • ...the majority is always wrong, and always the last to know about it...
    • ..by my will, and by will alone.. I set my mind in motion

      Thanks - Wefis may have what I'm looking for but I can't find the stats/reporting side? For a user, it appears to be largely a file management system.

      What I'm envisaging is something similar to a web log analyser.

      I'd be more than happy to be pointed at an equivalent solution, Mac-specific or cross-platform capable. I didn't dig very hard because I suspect I want to extend and customise it.

      The problem I'm trying to solve is like NetWallah's, except that his approach tells you only about disk usage; I'm trying to take it up a level, to information usage, i.e. what information is stored in what kind of documents, how frequently they are used or modified, and how their content correlates with their filenames.

      I'm envisaging a tool that will help individuals and organisations to improve their information management practice with some simple reports that guide improved behaviour.

      If this has already been done or even half done, I'll happily use an existing solution and save myself time :-)

Re: Analyse file system - files, sizes, contents, access, etc.
by NetWallah (Canon) on Oct 02, 2013 at 04:52 UTC
    A decade or so ago, I had written code to analyze user files on shared storage.

    I used a Perl script to collect file extensions and metadata, and stored them in an Access database.

    SQL queries could then be run to give summaries such as what percentage of disk was eaten by mp3 and video files, photos, etc., and which users were hogging disk, with a summary of the top 20 users and what volumes of "non-corporate" stuff they had.
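
    A rough sketch of that collect-then-query idea, for anyone wanting a starting point. This is not NetWallah's original code: it uses Python with SQLite standing in for Perl and Access, and the table name (files) and function names (scan_tree, build_db, share_by_extension, stale_percent) are made up for illustration.

```python
import os
import sqlite3


def scan_tree(root):
    """Yield (extension, size_bytes, atime, mtime) for each file found under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            # Lower-case the extension so .TXT and .txt are counted together.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            yield ext, st.st_size, st.st_atime, st.st_mtime


def build_db(root, db_path=":memory:"):
    """Walk root and load one row per file into a hypothetical 'files' table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(ext TEXT, size INTEGER, atime REAL, mtime REAL)"
    )
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", scan_tree(root))
    conn.commit()
    return conn


def share_by_extension(conn):
    """Return [(ext, total_bytes, percent_of_all_bytes)], largest first."""
    return conn.execute(
        """
        SELECT ext,
               SUM(size) AS total,
               ROUND(100.0 * SUM(size) / (SELECT SUM(size) FROM files), 1) AS pct
        FROM files
        GROUP BY ext
        ORDER BY total DESC, ext
        """
    ).fetchall()


def stale_percent(conn, now, days=365):
    """Percentage of files whose last access is more than `days` before `now`."""
    cutoff = now - days * 86400
    row = conn.execute(
        "SELECT 100.0 * SUM(atime < ?) / COUNT(*) FROM files", (cutoff,)
    ).fetchone()
    return row[0]
```

    Calling build_db("/some/share") and then share_by_extension(conn) gives the "what percent of disk is .jpg" style of report; stale_percent(conn, time.time()) gives the share of files not accessed in the last year. Access times are only as reliable as the file system's atime settings, and a per-user breakdown would need an extra owner column (for example from st_uid), which this sketch leaves out.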

    An interesting side-effect was the discovery of a treasure-trove of ... well ... let's call it unsavory material.

    I imagine there are probably many such, far more professionally written packages available for things like this, but I'm willing to try to dig up the old code, on request.

                 My goal ... to kill off the slow brain cells that are holding me back from synergizing my knowledge of vertically integrated mobile platforms in local cloud-based content management system datafication.

      Thanks NetWallah - if your code goes further than graff's below, I'd be grateful for a copy!