neptuna has asked for the wisdom of the Perl Monks concerning the following question:

i am writing a script that does a number of usage and space checking routines. having trouble with one of them. the plan here is to recurse thru a directory, record each unique user, add up the size of the directories per user, and then spit out a total like this:
User   Total usage in /some/dir
foo   450MB
bar   550MB
here is some code:
use File::Find;

find (\&wanted_user, "$dir");

sub wanted_user {
    %sum = ();
    next unless (-d $_);
    ($user,$size) = (stat($_))[4,7] or die "can't stat: $!\n";
    push ( @{$sum{$user}}, $size );
}

foreach $user (sort keys %sum) {
    print "$user: @{$sum{$user}}\n";
}
----------------
questions:
1. it prints nothing. it should at least print something like 11847: 44645, 45466....
2. after question 1 is solved, what is the best way to add the sizes into a total?
something like: push ( @{$sum{$user}}, ++$size ); ?
3. also i may have to get rid of the stat() way of getting the user and size, because some dirs contain dirs and files that i do not have access to, and those will generate errors instead of stat info. i know i can parse `ls -ld`, but is there a better way?
Thanks for any help
Jim

Replies are listed 'Best First'.
Re: help with FS usage script using a data struct
by webfiend (Vicar) on Jun 04, 2002 at 19:07 UTC

    I'm no Saint, but I'll give what help I can. First, your questions:

    1. It prints nothing for a couple of reasons.

    • $dir is not defined in this snippet, so find is just staring into empty space for a few moments and returning.
    • wanted_user() is resetting %sum every time it is called, so you end up displaying an empty hash. It would probably be best to explicitly declare %sum outside of the function.
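
A minimal sketch of the second point, using a stand-in callback instead of File::Find so that it runs anywhere. The sub name `tally` and the sample UIDs and sizes are made up for illustration; the only thing being shown is that the hash is declared once, outside the sub, and never emptied inside it.

```perl
use strict;
use warnings;

my %sum;    # declared once, outside the callback, so entries survive between calls

# Hypothetical stand-in for wanted_user(): records one size for one user.
sub tally {
    my ($user, $size) = @_;
    push @{ $sum{$user} }, $size;    # note: no "%sum = ();" in here
}

tally(11847, 44645);
tally(11847, 45466);
tally(500,   4096);

# UIDs are numbers, so sort them numerically.
foreach my $user (sort { $a <=> $b } keys %sum) {
    print "$user: @{ $sum{$user} }\n";
}
```

With `%sum = ();` as the first statement of the sub, each call would throw away what the previous calls collected.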

    It may seem like a bit of a dead horse on Perlmonks, but I'm going to reiterate some common advice. You should always use warnings or call perl with the -w switch unless you know exactly why you aren't. It gives you loads of helpful messages about things that could go wrong with your script. use strict is a good idea, too, but I'll admit I don't follow that one every time.

    2. If all you are interested in is the size, then you can use a simple total for each user in %sum.

    Side note here: On my computer, applying stat to a directory name does not get the size of the directory contents. In fact, "/my/dir/" is always a fixed size of 4096, even if it contains the latest download of Mozilla. You're not really getting much information about usage if you count the size of directories that way.

    3. I don't know much about File::Find, but I'm just trying to get you started with what I do know. The code below does the job you describe, and hopefully it'll make the rest of your task easier. I did a little fiddling with the result output - that's just the kind of guy I am.

    Note: Associating names with user IDs is left as an exercise (i.e., I'm feeling lazy).

    #!/usr/bin/perl
    # fs.pl
    # Find filespace consumption per user for a given directory tree
    #
    # USAGE
    #   perl fs.pl dir1 [dir2 ... dirN]

    use warnings;
    use strict;
    use File::Find;

    my %sum;

    # Add the size of a file to a tally for the file owner.
    sub wanted_user {
        my ($user, $size) = (stat($_))[4,7]
            or die "can't stat '$File::Find::name': $!\n";
        $sum{$user} += $size;
    }

    # MAIN EXECUTION
    foreach my $dir (@ARGV) {
        %sum = ();    # start a fresh tally for each directory tree
        find(\&wanted_user, $dir);

        print "Usage Report for '$dir'\n";
        print "USER\t USAGE\n";
        # UIDs are numeric, so sort them numerically
        foreach my $user (sort { $a <=> $b } keys %sum) {
            printf "%s:\t%.2fMB\n", $user, $sum{$user}/1024/1024;
        }
    }
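
For the exercise left open above (turning UIDs into names), one possible approach is Perl's built-in getpwuid, which returns the login name when called in scalar context. This is only a sketch and assumes a Unix-like system; the helper name `user_name` is made up for illustration.

```perl
use strict;
use warnings;

# Map a numeric UID to a login name. Fall back to the number itself when
# the UID has no passwd entry (e.g. files left behind by a deleted account).
sub user_name {
    my ($uid) = @_;
    my $name = getpwuid($uid);    # scalar context: just the login name
    return defined $name ? $name : $uid;
}

print user_name($<), "\n";    # $< is the real UID of the current process
```

In the report loop, `user_name($user)` could then be printed in place of the bare `$user`.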
Re: help with FS usage script using a data struct
by Abigail-II (Bishop) on Jun 05, 2002 at 09:41 UTC
    Besides the problems mentioned by the others, the line
    next unless -d $_;
    is suspect as well. $_ refers to the filename in the directory. This means that $_ isn't absolute, and hence the -d is done relative to the current directory. Unless the current directory is the same as where File::Find happens to be searching, it's not going to work.
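
Two details may help here. By default File::Find chdir()s into each directory before it calls the callback, so a test on $_ resolves there; with the no_chdir option set, $_ holds the full path instead. Either way, $File::Find::name carries the full path. Also, the callback is a sub and not a loop body, so `return` is the way to skip an entry. The sketch below builds a throwaway tree with File::Temp purely for demonstration; the file and directory names in it are made up.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Throwaway tree for the demonstration: one subdirectory holding one file.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die "mkdir: $!";
open my $fh, '>', "$root/sub/file.txt" or die "open: $!";
print {$fh} "hello\n";
close $fh;

my @dirs;
find(sub {
    return unless -d $_;              # "return", not "next": this is a sub
    push @dirs, $File::Find::name;    # full path, independent of the cwd
}, $root);

print "$_\n" for sort @dirs;
```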

    Also be aware your approach is flawed. A stat() of a directory just gives you the size of the directory node in the file system. This has no relation to the total size of the files found in the directory. It's only related to the number of files (and perhaps the length of the filenames), or rather the maximum number of files the directory has had over its lifetime. (On some filesystems, directories may shrink in size, but on many filesystems, they do not).

    You should be adding file (and directory) sizes. But you have to be careful. A 1 MB file with 3 links should not be counted three times - the data is there only once.
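
One way to avoid double counting is to remember each multiply linked file by its device and inode number and to add its size only the first time it turns up. The following is a rough sketch under that assumption, not a drop-in replacement; the `%seen` hash and the demonstration tree (a 1000-byte file named `a` with a hard link `b`) are made up for illustration.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

my %sum;     # uid => total bytes
my %seen;    # "dev:inode" => number of times a multiply linked file was met

sub wanted_user {
    my ($dev, $ino, $nlink, $uid, $size) = (lstat $_)[0, 1, 3, 4, 7];
    unless (defined $dev) {
        warn "can't lstat '$File::Find::name': $!\n";
        return;
    }
    # Skip a multiply linked non-directory every time after the first.
    return if !-d _ && $nlink > 1 && $seen{"$dev:$ino"}++;
    $sum{$uid} += $size;
}

# Demonstration tree: one 1000-byte file with a second hard link to it.
my $root = tempdir(CLEANUP => 1);
open my $fh, '>', "$root/a" or die "open: $!";
print {$fh} 'x' x 1000;
close $fh;
link "$root/a", "$root/b" or die "link: $!";

find(\&wanted_user, $root);
printf "%s\t%d bytes\n", $_, $sum{$_} for sort { $a <=> $b } keys %sum;
```

The sketch uses lstat rather than stat so that symbolic links are counted as links and not as their targets, and it warns instead of dying when an entry cannot be examined.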

    Some OSes have commands that will report disk usage by UID on a filesystem. You might want to look into them.

    Abigail

Re: help with FS usage script using a data struct
by Aristotle (Chancellor) on Jun 05, 2002 at 08:19 UTC
    1. With use strict; you wouldn't have run into your problem in the first place. Yes, it's more work, but that's because writing a clean script is always more work than writing a ball of mud.
    2. $sum{$user} += $size; You should look into what these operators do - your use of ++$size indicates you have not at all understood what ++ and friends do.
    3. I don't know why stat is giving you problems because for me it works; I can stat things I have no permissions of any kind to. It makes sense too because it's reading the inode to get that info, so all you need is permission on the directory the object is in to be able to resolve the name to an inode.
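
On point 2, a small illustration of the difference between the two operators, with made-up numbers: `+=` adds the right-hand value to a running total, while `++` only adds one to the variable it is applied to.

```perl
use strict;
use warnings;

my $size  = 4096;
my $total = 0;

$total += $size;    # adds the value of $size to $total
$total += $size;    # $total grows again; $size is untouched

my $bumped = ++$size;    # pre-increment: $size itself goes up by one

print "total=$total size=$size bumped=$bumped\n";
```

So pushing `++$size` onto the array would store each size plus one and would not produce a sum.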

    (Disclaimer: Criticism is no personal judgement, all of us are still learning.)

    Makeshifts last the longest.

Re: help with FS usage script using a data struct
by neptuna (Acolyte) on Aug 30, 2002 at 17:03 UTC
    hmm.. totally forgot about this question as i managed to fix it myself. first time i logged into monks since then. sorry for the lapse. thanks all for your suggestions.