cspctec has asked for the wisdom of the Perl Monks concerning the following question:

I have created a script that traverses a file system (starting at your current working directory) and lists all files and directories it finds. I need a way to show when each node (file or directory) was processed.

My script:

#!/usr/bin/perl -w
use strict;
use Cwd;

# Perl script to traverse a given directory and print when each directory or file is discovered.
# This starts from the user's current working directory.

my $dir = cwd();
my $discovered = 0;
my $processed = 0;
chomp (my $current = `pwd`);

sub process {
    $dir = shift;
    foreach (<$dir/*>) {
        my $file = $_;
        $file =~ s/$current\///g;
        next if (-l $_);
        if (-f $_) {
            print "Filename: $file\nDiscovered: $discovered\nProcessed: $processed\n\n";
        }
        if (-d $_) {
            print "Directory name: $file\nDiscovered: $discovered\nProcessed: $processed\n\n";
            process($_);
        }
    }
}

process($dir);

If I were to run this against the directory layout

example_dir {
    file_one.txt
    file_two.txt
    next_dir {
        file_three.txt
    }
}

(file_one.txt and file_two.txt are inside example_dir, and next_dir is inside example_dir... file_three.txt is inside next_dir, sorry if it's not clear)

I need the output to look like:

Filename: file_one.txt
Discovered: 1
Processed: 2

Filename: file_two.txt
Discovered: 3
Processed: 4

Directory name: next_dir
Discovered: 5
Processed: 8

Filename: file_three.txt
Discovered: 6
Processed: 7

I'm trying to print when each node is discovered and processed in the stack. Can anyone help?

Re: Help with node discovering and processing
by AppleFritter (Vicar) on Jul 05, 2014 at 09:31 UTC

    A global counter may or may not be the way to go here, depending on the context in which you want to do this. If this is for a small script that you need for system administration or so, I'd say yes, use a global counter. If it's part of a larger project, use a cleaner solution with hooks/callback functions.

    Take a look at Dominus's book Higher-Order Perl, too; walking a directory tree is one of the examples he uses in the first chapter (pp. 16-25). You could probably adapt what he's doing there. (All the code snippets from his book are online, e.g. dir-walk-cb-def and so on.)

    Other than that, to actually print the right numbers -- for a file it's straightforward; for a directory, "processed" equals "discovered" plus twice the number of entries below that directory (anything inside nested subdirectories counts too), plus one, so just print that (without increasing the counter, of course).
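
A minimal sketch of the global-counter approach, under a few assumptions: the names `walk`, `report`, `$clock` and `@nodes` are made up for illustration, dotfiles are skipped to mimic the `<$dir/*>` glob in the question, and each directory's "processed" stamp is filled in after the recursion returns rather than computed from the formula. Records are kept in discovery order and printed at the end, so a directory is listed before its contents.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $clock = 0;   # global logical clock: ticks once per discover/process event
my @nodes;       # one record per node, in discovery order

# Walk $dir recursively, stamping each node when it is first seen and
# again when it (and, for a directory, everything below it) is finished.
sub walk {
    my ($dir) = @_;
    opendir(my $dh, $dir) or do { warn "Cannot open $dir: $!\n"; return };
    my @names = sort grep { !/^\./ } readdir($dh);   # sorted, no dotfiles
    closedir $dh;

    for my $name (@names) {
        my $path = "$dir/$name";
        next if -l $path;                    # skip symlinks, as the question's script does
        next unless -f $path || -d $path;    # only plain files and directories
        my $node = {
            name       => $name,
            is_dir     => (-d $path ? 1 : 0),
            discovered => ++$clock,
        };
        push @nodes, $node;
        walk($path) if $node->{is_dir};      # children get stamps in between
        $node->{processed} = ++$clock;
    }
}

# Print every record in the layout asked for in the question.
sub report {
    for my $n (@nodes) {
        printf "%s: %s\nDiscovered: %d\nProcessed: %d\n\n",
            ($n->{is_dir} ? 'Directory name' : 'Filename'),
            $n->{name}, $n->{discovered}, $n->{processed};
    }
}

walk($ARGV[0] // '.');   # start directory from the command line, else cwd
report();
```

Run from inside example_dir, this should give 1/2 and 3/4 for the two files, 5/8 for next_dir and 6/7 for file_three.txt.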

Re: Help with node discovering and processing
by Laurent_R (Canon) on Jul 05, 2014 at 07:25 UTC
    Am I right to understand that you only need a "logical clock", i.e. some form of counter telling you in which order the various elements were accessed, but that you don't care about absolute time stamps?

    If so, I would just declare a global counter to be incremented whenever you need. I think that this is a case where a global variable makes sense. If you really don't want a global variable, then you could use a closure that keeps the counter private and either does the print job or returns the value of the (incremented) counter on demand. But that sounds like a bit of overkill in this case.

    Edit: Thinking about it, using a static variable (the relatively new state keyword), for example within a callback function, might be another way to go.
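
The two alternatives to a global counter could look roughly like this. It is only a sketch: `make_clock` and `tick` are hypothetical names, and `state` needs Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'state';

# Closure version: each call to make_clock() returns an independent
# counter whose value is visible only to the returned sub.
sub make_clock {
    my $ticks = 0;
    return sub { return ++$ticks };
}

# state version: $ticks is initialised once and keeps its value
# between calls, but is not visible outside tick().
sub tick {
    state $ticks = 0;
    return ++$ticks;
}

my $clock  = make_clock();
my $first  = $clock->();
my $second = $clock->();
print "closure: $first $second\n";   # closure: 1 2

my $third  = tick();
my $fourth = tick();
print "state: $third $fourth\n";     # state: 1 2
```

The closure has the advantage that a fresh walk can simply ask for a fresh clock; the `state` counter lives for the rest of the program and cannot be reset from outside `tick`.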

Re: Help with node discovering and processing
by 2teez (Vicar) on Jul 05, 2014 at 16:41 UTC

    Hi,

    In addition to all that has been said, you might also take a look at the module File::Find and the like for traversing your directories. Half of the job is done...
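
One way this might look with File::Find, using its documented `preprocess`, `wanted` and `postprocess` hooks. This is a sketch, not a drop-in answer: `stamp_tree` and the record layout are invented for the example, and dotfiles are dropped to mimic the glob in the question. Note one assumption about ordering: File::Find appears to call `wanted` for the plain files of a directory before it descends into any subdirectory, so siblings may not be visited in strictly sorted order, although the discover/process stamps still nest correctly.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Walk $top with File::Find and return one record per node,
# in the order the nodes were discovered.
sub stamp_tree {
    my ($top) = @_;
    my $clock = 0;              # logical clock, private to this walk
    my (@nodes, %by_path);      # %by_path: directory path => its record

    find({
        # Receives the names in the current directory; return them
        # sorted and without dotfiles.
        preprocess => sub {
            my @keep = grep { !/^\./ } @_;
            return sort @keep;
        },

        # Called once per entry; a directory is seen before its contents.
        wanted => sub {
            return if $_ eq '.';              # the starting directory itself
            return if -l $_;                  # skip symlinks
            return unless -f $_ || -d $_;
            my $node = {
                name       => $_,
                is_dir     => (-d $_ ? 1 : 0),
                discovered => ++$clock,
            };
            push @nodes, $node;
            if ($node->{is_dir}) { $by_path{$File::Find::name} = $node }
            else                 { $node->{processed} = ++$clock }
        },

        # Called just before File::Find leaves a directory.
        postprocess => sub {
            my $node = $by_path{$File::Find::dir} or return;   # start dir has no record
            $node->{processed} = ++$clock;
        },
    }, $top);

    return @nodes;
}

for my $n (stamp_tree($ARGV[0] // '.')) {
    printf "%s: %s\nDiscovered: %d\nProcessed: %d\n\n",
        ($n->{is_dir} ? 'Directory name' : 'Filename'),
        $n->{name}, $n->{discovered}, $n->{processed};
}
```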

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me