bgi has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm currently stuck on a problem and am looking for somebody who has solved something similar:

I am trying to parse a filetree which is really big, so big that the pathnames will be longer than 256 characters... which is the natural limit of the filesystem (I believe it's CIFS).

But I think this is still manageable... from another project I know that the list of files will be approx. 4 GB, just the list!!

Is there anybody who has had to solve a similar problem?

regards

Replies are listed 'Best First'.
Re: parse -huge- filetree interative
by moritz (Cardinal) on Jul 24, 2009 at 10:33 UTC
    I don't understand what your problem is.

    You write that you want to "parse a filetree". What do you mean exactly by that? "To parse", in my understanding, means to read a text file and turn it into an internal data structure. If that's what you mean, where does the file size limit come into play? Perl doesn't care about CIFS's limits; its strings can be arbitrarily long (as long as they fit into memory).

    Or do you mean "recursively read a file tree"? If that's what you mean, the file tree already exists, so the path length can't be greater than what the file system supports, so I don't see a problem with that either.

    Or is your problem something else? Please try to be more exact in your description.

Re: parse -huge- filetree interative
by jethro (Monsignor) on Jul 24, 2009 at 10:44 UTC

    Perl won't have a pathname restriction of 256 characters, but if the filetree is big you should try to avoid keeping it all in memory. If, for example, you want a hash to store all the filenames, you could use DBM::Deep, which gives you a disk-based hash transparently. Or use a real database (with some DBI module or similar) if you want to do lots of data mining with it.
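A minimal sketch of the disk-based hash idea, assuming DBM::Deep is installed from CPAN (it is not a core module). The helper names `open_index` and `remember`, and the choice to store the file size as the value, are only illustrative:

```perl
use strict;
use warnings;
use DBM::Deep;

# Open (or create) a disk-backed hash. The file name is whatever the
# caller passes in; nothing here is specific to the original problem.
sub open_index {
    my ($db_file) = @_;
    return DBM::Deep->new($db_file);
}

# Record one path together with its size in bytes. The entry is written
# to the DBM::Deep file instead of being held in an in-memory hash.
sub remember {
    my ($db, $path) = @_;
    $db->{$path} = -s $path;
    return;
}
```

The object returned by `open_index` can be used like an ordinary hash reference, so code that already works with a hash of filenames should need few changes.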

    There are many modules that can help you with your task, for example File::Find. Just search for "File" on CPAN and choose what fits your needs.
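As a rough illustration of the File::Find suggestion, the sketch below walks a tree and writes each file path straight to a list file, so the full list never has to sit in memory. The function name `list_tree` and the one-path-per-line output format are assumptions made for this example:

```perl
use strict;
use warnings;
use File::Find;

# Walk $root and write one path per line to $list_file without
# collecting the paths in memory. Returns the number of files written.
sub list_tree {
    my ($root, $list_file) = @_;
    open my $out, '>', $list_file
        or die "Cannot write $list_file: $!";
    my $count = 0;
    find(
        {
            no_chdir => 1,    # $_ holds the full path, no chdir per directory
            wanted   => sub {
                return unless -f $_;    # plain files only
                print {$out} "$_\n";
                $count++;
            },
        },
        $root,
    );
    close $out or die "Cannot close $list_file: $!";
    return $count;
}
```

A call such as `list_tree('/some/share', 'files.txt')` (both names are placeholders) would leave the listing in `files.txt`, which can then be processed line by line.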

Re: parse -huge- filetree interative
by Anonymous Monk on Jul 24, 2009 at 10:31 UTC
    I don't understand, is this a job posting?