Greetings fellow monks,

Over the course of several years, I've collected a large assortment of programs, code, graphics, MP3s, and other such things that I save off onto CDs from time to time. My "extremely eloquent" naming convention for these is CD1, CD2, and so forth. Naturally, when I'm searching for a specific file I saved off a year or two ago, the quest becomes quite a process.

To reduce the time it takes to locate what I'm searching for, I wanted to make/use a "directory+depth" catalog or indexing script. Essentially, it would look at every directory and file in a volume or directory, descending through the complete depth, and spit out a report/index of the files with size and file-count values for each directory. I'd print off these reports or save them as a group on the latest CD.

I've been searching around for a module or previously posted chunk of code that does this, but I've been unsuccessful. So here's my first attempt at a solution. It's certainly messy, and could use some help with its size and structure, but here it is:

#!/usr/bin/perl

my $root_dir    = '/var/home/gryphon';
my $file_icon   = '- ';
my $dir_icon    = '# ';
my $vol_icon    = '* ';
my $indent_icon = '  ';
my $output_file = 'list-of-stuff.txt';

use strict;
use File::Find;
use File::stat;

my (%files, %dirs);
find(\&learn_files, $root_dir);

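# Bail out with the OS error if the report file cannot be created.
open(OUT, '>', $output_file) or die "Can't open $output_file: $!";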
foreach (sort keys %dirs) {
    my @indent = split(/\//, substr($_, length($root_dir) + 1));
    print OUT $indent_icon x ($#indent + 1);

    if ($#indent > -1) {
        print OUT $dir_icon, $indent[$#indent];
    } else {
        print OUT $vol_icon, $_;
    }

    print OUT ' (', fix_bytes($dirs{$_}{size} + 0);
    print OUT ', ', comma($dirs{$_}{files} + 0), ' files';
    print OUT ', ', comma($dirs{$_}{subdirs} + 0), ' folders)', "\n";

    foreach my $file (sort keys %{$files{$_}}) {
        print OUT $indent_icon x ($#indent + 2);
        print OUT $file_icon, $file, ' (', fix_bytes($files{$_}{$file}), ")\n";
    }
}
close(OUT);

sub learn_files {
    if (-d) {
        if ($_ ne '.') {
            $dirs{$File::Find::dir}{subdirs}++;
            add_up($File::Find::dir);
        }
    } else {
        # Skip entries that cannot be stat'ed (e.g. broken symlinks).
        my $file_info = stat($File::Find::name) or return;

        $dirs{$File::Find::dir}{files}++;
        $dirs{$File::Find::dir}{size} += $file_info->size;
        $files{$File::Find::dir}{$_} = $file_info->size;

        add_up($File::Find::dir, $file_info->size)
            if ($File::Find::dir ne $root_dir);
    }
}

sub add_up {
    my $dir = substr($_[0], length($root_dir) + 1);
    my $curr_dir = $root_dir;
    foreach (split(/\//, $dir)) {
        if ($_[1] eq '') {
            $dirs{$curr_dir}{subdirs}++;
        } else {
            $dirs{$curr_dir}{files}++;
            $dirs{$curr_dir}{size} += $_[1];
        }
        $curr_dir .= "/$_";
    }
}

sub fix_bytes {
    return comma(int($_[0] / 10737418.24) / 100) . ' GB' if ($_[0] > 1073741824);
    return comma(int($_[0] / 10485.76) / 100) . ' MB' if ($_[0] > 1048576);
    return comma(int($_[0] / 10.24) / 100) . ' KB' if ($_[0] > 1024);
    return comma($_[0]) . ' bytes';
}

sub comma {
    my $text = reverse $_[0];
    $text =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g;
    return scalar reverse $text;
}

This seems to work properly in both Linux and Win32 environments. I'm going to start jazzing up the output so it'll spit out a nice HTML doc with icons and the like. But the above version is the meat of the matter.
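
To make the roll-up bookkeeping concrete, here is a minimal, self-contained sketch of the idea: each file's size is credited to its own directory and to every ancestor up to the root. It is only an illustration under stated assumptions, not the script above. The sample paths and sizes are made up, and the %totals hash and credit sub are names invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: full path => size in bytes.
my $root   = '/cd1';
my %sample = (
    '/cd1/readme.txt'      => 100,
    '/cd1/src/main.pl'     => 2_000,
    '/cd1/src/lib/Util.pm' => 500,
);

my %totals;    # directory => { size => bytes, files => count }

# Credit one file to its own directory and every ancestor up to $root.
sub credit {
    my ($path, $size) = @_;
    (my $dir = $path) =~ s{/[^/]+$}{};       # strip the file name
    while (1) {
        $totals{$dir}{size}  += $size;
        $totals{$dir}{files} += 1;
        last if $dir eq $root;               # reached the top of the tree
        last unless $dir =~ s{/[^/]+$}{};    # step up one level, or stop
    }
}

credit($_, $sample{$_}) for sort keys %sample;

# Print one summary line per directory, parents before children.
for my $dir (sort keys %totals) {
    printf "%-14s %5d bytes, %d files\n",
        $dir, $totals{$dir}{size}, $totals{$dir}{files};
}
```

With this sample data the root ends up with 2,600 bytes across 3 files, /cd1/src with 2,500 bytes across 2 files, and /cd1/src/lib with 500 bytes in 1 file. Note that every file walks all of its ancestors, so the work grows with the depth of the tree.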

Does anyone see any glaring problems with it? Is anything in there inefficient? While the code seems to work fine in general, it seems to take forever to complete on larger directory trees. Any suggestions?

-gryphon
code('Perl') || die;

In reply to Complete Directory+Depth Listing by gryphon
