LittleGreyCat has asked for the wisdom of the Perl Monks concerning the following question:
Let me first try to describe the problem I am trying to solve:
I have a large file with many complex entries, each relating to a file in a Unix filestore tree.
I wish to extract a subset of these entries to match part of the current filestore tree; I can produce a list of the current filestore tree using the 'find' command.
So I have two files:
The true filestore list:
/fred/myfile
/fred/myfile2
/bert/myfile
The large complex file:
user ALLFILES /fred/myfile=/archive/dingbat/fred/myfile 3 6 9 thegoosedrankwine
user ALLFILES /fred/myfile2=/archive/dingbat/fred/myfile 3 6 9 thegoosedrankwine
user ALLFILES /fred/myfile3=/archive/dingbat/fred/myfile 3 6 9 thegoosedrankwine
user ALLFILES /bert/myfile=/archive/dingbat/fred/myfile 3 6 9 thegoosedrankwine
user ALLFILES /bert/myfile2=/archive/dingbat/fred/myfile 3 6 9 thegoosedrankwine
You will note that the third field (up to the '=') in the complex file is the filename in the real filestore tree.
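To make the field layout concrete, here is a minimal sketch of pulling that filename out of one line. The helper name `filestore_name` is made up for illustration, and the sketch assumes that filenames contain neither whitespace nor '='.

```perl
use strict;
use warnings;

# Hypothetical helper: return the filestore path from one line of the
# complex file, i.e. the third whitespace-separated field up to the first '='.
# Assumes filenames contain no whitespace and no '='.
sub filestore_name {
    my ($line) = @_;
    # Limit of 4 keeps the rest of the line in one piece; we only need field 3.
    my (undef, undef, $field) = split ' ', $line, 4;
    return unless defined $field;
    my ($name) = split /=/, $field, 2;
    return $name;
}

print filestore_name(
    'user ALLFILES /fred/myfile2=/archive/dingbat/fred/myfile 3 6 9 thegoosedrankwine'
), "\n";
```

For the sample line this prints `/fred/myfile2`.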
My tentative plan is to load the first file into a hash, keyed by the whole contents of each line, and then read serially through the second file, splitting out the filename component and looking it up in the hash.
If I get a hit, I then overwrite the matching entry in the hash with the current line from my complex input file.
At the end I should have copied all the matching entries out of the complex file, and these should now be in the other file.
Any lines without a match will be unchanged.
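The plan above could be sketched roughly as follows, working purely in memory. This is only an illustration under assumptions: the sub name `merge_entries` is hypothetical, the sample data is copied from the two listings above rather than read from real files, and filenames are assumed to contain no whitespace or '='.

```perl
use strict;
use warnings;

# Hypothetical sketch of the plan: key a hash on the filestore lines,
# overwrite each value with the matching line from the complex file,
# and return the results in the original filestore order.
sub merge_entries {
    my ($filestore, $complex) = @_;

    # Each value starts out as the filestore line itself.
    my %entry = map { $_ => $_ } @$filestore;

    for my $line (@$complex) {
        # Third whitespace-separated field, up to the first '='.
        my ($name) = $line =~ /^\S+\s+\S+\s+([^=\s]+)=/ or next;
        $entry{$name} = $line if exists $entry{$name};
    }

    # A hash has no order, so walk the original list to build the result.
    return map { $entry{$_} } @$filestore;
}

# Sample data from the question; real code would read both from files.
my @filestore = qw(/fred/myfile /fred/myfile2 /bert/myfile);
my @complex = map { "user ALLFILES $_=/archive/dingbat/fred/myfile 3 6 9 thegoosedrankwine" }
    qw(/fred/myfile /fred/myfile2 /fred/myfile3 /bert/myfile /bert/myfile2);

print "$_\n" for merge_entries(\@filestore, \@complex);
```

With this sample data every filestore entry finds a match, so three complex lines are printed and the entries for `/fred/myfile3` and `/bert/myfile2` are skipped. Only the filestore list has to fit in memory here; the complex file could just as well be read one line at a time from a filehandle.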
So, the question:
Can I use 'Tie::File' to generate the hash (which would make this scalable to large files and small memory), should I work in memory, or is there some other Perl feature which will make this so easy that I will be embarrassed that I asked the question?
TIA
LGC
Re: Tie::File to create a Hash?
by Fletch (Bishop) on May 31, 2007 at 13:54 UTC
Re: Tie::File to create a Hash?
by citromatik (Curate) on May 31, 2007 at 14:20 UTC
by LittleGreyCat (Scribe) on May 31, 2007 at 15:26 UTC
Re: Tie::File to create a Hash?
by blazar (Canon) on May 31, 2007 at 13:57 UTC