boardryder has asked for the wisdom of the Perl Monks concerning the following question:
Re: Manage Directory Structure
by graff (Chancellor) on Jul 04, 2009 at 02:01 UTC
I can't figure out what you actually want... Do you want to create one file per directory, such that this one file contains the concatenation of all data found in all the other data files in that directory? It sounds like maybe you want to create tar files? (If not, then what? And why?)

Update: what is it about your directory that's "too big"? File count? Total byte count? Both? How big is "too big" (what sort of numbers are you looking at)?
by boardryder (Novice) on Jul 04, 2009 at 03:04 UTC
My solution below seems to be a good start, but my full directory listing creates 1 GB flat files that cannot be read into a hash (hash creation causes the program to die from running out of memory).

"Do you want to create one file per directory, such that this one file contains the concatenation of all data found in all other data files in that directory?"

I want each directory's file to hold a listing of that directory's contents. Any other suggestions on my methods toward the end goal are welcome :) Thank you!
by GrandFather (Saint) on Jul 04, 2009 at 04:36 UTC
When $progname is interpolated directly ahead of an underscore, the variable name needs to be delimited with braces, as ${progname}, or $progname_ will be seen as the variable name rather than $progname. You should also be using strictures (use strict; use warnings;). Strictures should force you to think a little more about the lifetime of variables and how you pass information around to different parts of your program.
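A minimal sketch of the interpolation fix; the original snippet isn't quoted above, so the $progname and $date values here are invented for illustration:

```perl
use strict;
use warnings;

my $progname = 'dirtrack';     # hypothetical program name
my $date     = '2009-07-04';   # hypothetical date stamp

# Wrong: Perl parses the variable as $progname_ here, and under
# strictures this even fails to compile - a useful early catch:
# my $file = "$progname_$date.log";

# Right: the braces mark exactly where the variable name ends.
my $file = "${progname}_$date.log";
print "$file\n";               # dirtrack_2009-07-04.log
```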
Passing an array into the find call would be even better than calling find once per directory (a sketch follows):
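Something along these lines, with invented directory names:

```perl
use strict;
use warnings;
use File::Find;

my @dirs = ( '/data/current', '/data/archive' );    # hypothetical roots

# find() accepts a whole list of start directories, so a single
# call walks them all - no need for one call per directory.
find(
    sub {
        print "$File::Find::name\n" if -f;    # report plain files only
    },
    @dirs
);
```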
That also avoids the interesting mismatch between the calls to call_dir and its implementation (3 calls, 2 valid return values).

However, I'd be inclined to read and parse today's and yesterday's files in parallel to detect differences. That avoids having any more than a few lines of data in memory at any given time, but may make the parsing a little more interesting.

If instead you read the files in parallel as suggested above, but load a complete directory of files at a time into two arrays of lines (one for today's files and one for yesterday's), you could then use Algorithm::Diff to do the heavy lifting in differencing the two file sets. That limits the data in memory to one directory's worth of files (x 2 - one image for each day), but probably simplifies the diff parsing substantially.

True laziness is hard work
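A sketch of the Algorithm::Diff approach; the listing file names are invented, and each file is assumed to hold one entry per line:

```perl
use strict;
use warnings;
use Algorithm::Diff qw(diff);

my @yesterday = read_lines('listing.yesterday');    # hypothetical names
my @today     = read_lines('listing.today');

# diff() returns hunks of changes; each change is [ sign, position, line ],
# where sign is '-' (only in yesterday) or '+' (only in today).
for my $hunk ( diff( \@yesterday, \@today ) ) {
    for my $change (@$hunk) {
        my ( $sign, $pos, $line ) = @$change;
        print $sign eq '+' ? "added:   $line\n" : "removed: $line\n";
    }
}

sub read_lines {
    my ($file) = @_;
    open my $fh, '<', $file or die "Can't open $file: $!";
    chomp( my @lines = <$fh> );
    return @lines;
}
```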
by graff (Chancellor) on Jul 04, 2009 at 14:32 UTC
How about breaking the problem down into three separate procedures:

1. each day, build a "dirlist" file naming every directory in the tree;
2. for each of those directories, build a listing of the files it contains;
3. run "diff" between each day's output and the previous day's.

File::Find will be good for the first step, though you might want to consider just using the available unix/linux tools (something like find /top/dir -type d | sort > dirlist).
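If you stay in Perl for step 1, a sketch might look like this (the top directory and output file name are assumptions):

```perl
use strict;
use warnings;
use File::Find;

my $top = '/top/dir';    # hypothetical tree root

# Collect every directory under $top, then write them out sorted so
# that consecutive days' dirlist files diff cleanly.
my @dirs;
find( sub { push @dirs, $File::Find::name if -d }, $top );

open my $out, '>', 'dirlist.today' or die "Can't write dirlist: $!";
print {$out} "$_\n" for sort @dirs;
close $out;
```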
Using "diff" on two consecutive "dirlist" files will reveal the addition or removal of directories. For step 2, I would do something like: With that done, running the basic "diff" command on two consecutive file listings for a given directory (assuming that the directory existed on both days) will tell you which files changed, which were added, and which were removed. Just figure out what you want to do with the output from "diff". | [reply] [d/l] [select] |
by boardryder (Novice) on Jul 05, 2009 at 04:34 UTC
by graff (Chancellor) on Jul 05, 2009 at 14:38 UTC