Re: Read all the file path having text document
by ikegami (Patriarch) on Nov 29, 2008 at 21:55 UTC
use File::Find::Rule;

# $path is the directory to search
my @files = File::Find::Rule->new()
    ->file()
    ->name('*.txt')
    ->in($path);
If you have a problem with links (symbolic or hard) that point to a parent directory, I don't know if modules such as File::Find::Rule will help. What they will do is make your code clearer and save you from reinventing the wheel.
Re: Read all the file path having text document
by almut (Canon) on Nov 29, 2008 at 21:56 UTC
Unless this is primarily for educational purposes, I'd recommend
using File::Find or File::Find::Rule instead :)
As to dealing with symlink cycles, see the follow/follow_fast/follow_skip options that File::Find provides.
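A minimal sketch of those options, assuming a platform where File::Find can follow symlinks. The sub name find_txt_files is hypothetical, not something from the original code. With follow set, File::Find keeps track of what it has visited, and follow_skip => 2 tells it to quietly ignore duplicate files and directories instead of dying when it meets one again:

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return every *.txt file under $top, following
# symlinks but ignoring anything File::Find has already visited.
sub find_txt_files {
    my ($top) = @_;
    my @found;
    find(
        {
            wanted => sub {
                # $_ is the current file name; $File::Find::name is its path.
                push @found, $File::Find::name
                    if -f $_ && $_ =~ /\.txt\z/;
            },
            follow      => 1,   # follow symbolic links
            follow_skip => 2,   # ignore duplicates rather than die
        },
        $top,
    );
    return @found;
}

# Only search when a directory is given on the command line.
if (@ARGV) {
    print "$_\n" for find_txt_files($ARGV[0]);
}
```

follow_fast is the cheaper variant of follow; it does less checking, so some files may be reported more than once.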
BTW,
sub recurse($);
in Perl you usually wouldn't use prototypes, unless you have a very
good reason. Also, you don't need to predeclare the subroutine here.
I agree that prototyping is unnecessary and potentially very confusing in the OPed code (and in general) and should be avoided, but in the specific code given in the OP, isn't predeclaration of the subroutine necessary – at least to avoid a warning?
>perl -wMstrict -le
"sub S ($);
S('foo');
sub S ($) { print $_[0], '-totyped subroutine' };
"
foo-totyped subroutine
>perl -wMstrict -le
"
S('foo');
sub S ($) { print $_[0], '-totyped subroutine' };
"
main::S() called too early to check prototype at -e line 1.
foo-totyped subroutine
"in Perl you usually wouldn't use prototypes, unless you have a very good reason."
Why? What is the downside? Is there a guidance document somewhere? I can see that for 'short' scripts (all on a single page), but what about 4000 lines of Perl spread across 10-12 packages?
To me, prototyping keeps me from making subtle design changes (just 1 extra arg, in this special case) that are not documented and will probably cause maintenance problems later.
I even learned to like C++'s mangled namespace, which uses argument types as overloading specifiers.
I think it is inappropriate to pooh-pooh someone for using good software engineering technique. Are 'strict', 'warnings', and maybe 'taint' really not needed either?
"To me, prototyping keeps me from making subtle design changes (just 1 extra arg, in this special case) that are not documented and will probably cause maintenance problems later."
If you're doing that, you don't understand what prototypes do (and this is common, and one of the reasons for the general rule that you shouldn't use them unless you've got a good reason).
Prototypes in Perl convert arguments and allow shortcuts in the calling code. The "checking" of the arguments only happens as a side effect, and you'll only get warnings or errors if the arguments cannot be converted. This probably means you'll have subtler bugs, not fewer.
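A small illustration of that conversion, with two hypothetical subs that differ only in the ($) prototype. The prototype does not reject the array argument; it silently evaluates it in scalar context:

```perl
use strict;
use warnings;

# Hypothetical subs for illustration only.
sub first_arg_proto ($) { return $_[0] }   # ($) imposes scalar context on the argument
sub first_arg_plain     { return $_[0] }   # no prototype: the array is flattened

my @items = ('a', 'b', 'c');

print first_arg_proto(@items), "\n";   # prints 3, the element count
print first_arg_plain(@items), "\n";   # prints a, the first element
```

Neither call produces a warning, which is why a prototype is a poor substitute for argument checking.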
See also "Are prototypes evil?", for example.
Re: Read all the file path having text document
by AnomalousMonk (Archbishop) on Nov 29, 2008 at 23:10 UTC
While endorsing the recommendations of others to use one of the fine File::Find group of modules, I would also point out that in your regex m/.txt/ the '.' (dot) is a regex metacharacter that matches any character except a newline (or any character at all if the /s regex switch is used). Try m{ \. txt }xms for greater specificity.
See perlre, particularly the Regular Expressions and /s switch sub-sections.
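To make the difference concrete, here is a quick sketch with made-up file names. The \z-anchored variant and the sub names are my own additions (not from the original code), on the assumption that only names ending in .txt are wanted:

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the three patterns under discussion.
sub matches_loose    { return $_[0] =~ m/.txt/           ? 1 : 0 }  # unescaped dot
sub matches_escaped  { return $_[0] =~ m{ \. txt }xms    ? 1 : 0 }  # literal dot
sub matches_anchored { return $_[0] =~ m{ \. txt \z }xms ? 1 : 0 }  # literal dot at end of name

# Made-up file names, chosen to show where the patterns disagree.
for my $name ('notes.txt', 'mytxtnotes.doc', 'report.txt.bak') {
    printf "%-16s loose=%d escaped=%d anchored=%d\n",
        $name,
        matches_loose($name),
        matches_escaped($name),
        matches_anchored($name);
}
```

The unescaped pattern accepts mytxtnotes.doc because the dot matches the 'y' before 'txt'; the escaped pattern rejects it but still accepts report.txt.bak, which only the anchored form rules out.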
Hi Monks,
Thanks for the help. I completely changed my code and it's working, though it takes 4 minutes to get the paths of all the text documents. I have a 120 GB hard disk, and only 40 GB of it contains text documents. Can I make it faster?
Navzit
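use strict;
use warnings;
use File::Find::Rule;

my $rule = File::Find::Rule->new;
$rule->file;
$rule->name('*.txt');
my @files = $rule->in('/');

# Three-argument open with a lexical filehandle; report why open failed.
open(my $out, '>', 'index.txt') or die "can't open index.txt file: $!";
foreach my $files_name (@files) {
    print {$out} "$files_name\n";
}
close($out) or die "can't close index.txt file: $!";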
Re: Read all the file path having text document
by ccn (Vicar) on Nov 29, 2008 at 21:50 UTC
You may have trouble with recursive symlinks. Use a %seen hash to record the directories you have already visited.
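A rough sketch of the %seen idea for a hand-rolled recursion. The sub name collect_txt and its arguments are hypothetical, and keying on device plus inode assumes a POSIX-style filesystem where stat returns meaningful inode numbers (on Windows it may not):

```perl
use strict;
use warnings;

# Hypothetical helper: walk $dir recursively and push *.txt paths onto @$found.
# %$seen records each directory by device and inode, so a directory reached
# again through a symlink cycle is not entered a second time.
sub collect_txt {
    my ($dir, $seen, $found) = @_;

    my ($dev, $ino) = stat($dir) or return;
    return if $seen->{"$dev:$ino"}++;

    opendir(my $dh, $dir) or return;
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    for my $entry (@entries) {
        my $path = "$dir/$entry";
        if (-d $path) {
            collect_txt($path, $seen, $found);
        }
        elsif (-f $path && $entry =~ /\.txt\z/) {
            push @$found, $path;
        }
    }
    return;
}

# Only search when a directory is given on the command line.
my (%seen, @found);
collect_txt($ARGV[0], \%seen, \@found) if @ARGV;
print "$_\n" for @found;
```

Because stat follows symlinks, a link back to a parent directory resolves to the same device and inode pair as the parent and is skipped.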