Re: read the whole folder files
by kennethk (Abbot) on Apr 09, 2012 at 18:50 UTC
So close. opendir/readdir is the equivalent of ls or dir; it just lists the directory contents. Assuming you want to read the contents of the files, you need to open them, too, a la:
foreach my $file (@files) {
    next unless -f "$directory/$file";
    open my $fh, '<', "$directory/$file" or die "Open failed on $file: $!";
    while (<$fh>) {
        next if /^\s*$/; # skip blank lines
        # process the line in $_ here
    }
}
Also note the use of -f to check that you're dealing with an ordinary file before opening it, and the "or die" that tests whether the open actually worked.
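For anyone who wants something runnable end to end, here is a minimal sketch of the same loop wrapped in a sub. The name read_nonblank_lines and the choice to return the lines as a list are assumptions made for illustration; they are not part of the OP's script.

```perl
use strict;
use warnings;

# Hypothetical helper: return every non-blank line from every
# ordinary file directly inside $directory (no recursion).
sub read_nonblank_lines {
    my ($directory) = @_;

    opendir my $dh, $directory or die "Cannot open directory $directory: $!";
    my @files = readdir $dh;
    closedir $dh;
    @files = sort @files;    # sorted only to make the order predictable

    my @lines;
    foreach my $file (@files) {
        my $path = "$directory/$file";
        next unless -f $path;    # skips '.', '..' and subdirectories
        open my $fh, '<', $path or die "Open failed on $file: $!";
        while ( my $line = <$fh> ) {
            next if $line =~ /^\s*$/;    # skip blank lines
            chomp $line;
            push @lines, $line;
        }
        close $fh;
    }
    return @lines;
}
```

Calling read_nonblank_lines('.') would then return the non-blank lines of every ordinary file in the current directory.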
#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.
Re: read the whole folder files
by halfcountplus (Hermit) on Apr 09, 2012 at 20:23 UTC
in the grep, you have a regex, with a rather strange looking pattern.
Ditto. Maybe you've confused shell globbing with regular expressions? Let's break this down:
_*_.txt
"_*_" would mean zero or more instances of "_", followed by a "_". Then "." matches any single character.
If you want to match _one_.txt, _two_.txt, etc, you should use:
_.+?_\.txt
Which means: a "_" followed by one or more of any character, matched non-greedily (because + is followed by ?; note that ? has other meanings in regexps depending on context). Non-greedy matching is important when you are looking for any number of anything followed by something in particular. Then comes "_" and a literal "." (notice the . is escaped with \ because . alone has a special meaning, see above), followed by "txt".
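To see the difference between the two patterns, here is a small sketch that runs both against a handful of made-up file names (the names are invented for illustration and are not from the OP's directory):

```perl
use strict;
use warnings;

# Made-up file names, purely for illustration.
my @names = ( '_one_.txt', '_two_.txt', 'notes.txt', 'a_btxt', '_one_.txt.bak' );

# The original pattern: "_*" is "zero or more underscores" and "." is unescaped.
my @loose = grep { /_*_.txt/ } @names;

# The suggested pattern, with the "." escaped.
my @strict = grep { /_.+?_\.txt/ } @names;

print "loose:  @loose\n";
print "strict: @strict\n";
```

The loose pattern accepts 'a_btxt' because the unescaped dot matches the "b". Both patterns still accept '_one_.txt.bak' because neither is anchored; adding $ at the end of the pattern rules that out.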
If you haven't yet: perlretut.
|
|
I like your explanation.
However it should be noted that there are limits to "greediness". The regex will be as "greedy" as it can be while still allowing the rest of the regex to match. In this case, anchoring the regex to the "end of string" makes a difference.
/_.*_\.txt$/ : the characters "gobbled up" by the .* won't include the "_.txt" at the end of the string.
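A quick way to watch this happen is to capture what the wildcard matched. This is only a sketch; the file name is made up, and the capturing parentheses are added purely so the matched text can be inspected:

```perl
use strict;
use warnings;

# Made-up name containing "_.txt" twice, to make the difference visible.
my $name = '_a_.txt_b_.txt';

# Unanchored: greedy runs to the last "_.txt", non-greedy stops at the first.
my ($greedy_free) = $name =~ /_(.*)_\.txt/;
my ($lazy_free)   = $name =~ /_(.+?)_\.txt/;

# Anchored with $: both are forced to end at the final "_.txt".
my ($greedy_end) = $name =~ /_(.*)_\.txt$/;
my ($lazy_end)   = $name =~ /_(.+?)_\.txt$/;

print "unanchored: greedy='$greedy_free' lazy='$lazy_free'\n";
print "anchored:   greedy='$greedy_end' lazy='$lazy_end'\n";
```

In every case the capture stops short of the "_.txt" that the rest of the pattern needs, and with the anchor the greedy and non-greedy forms capture the same text.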
Having said that, I am confused by the OP's updated comment, because what we were talking about was how to get the file names, not how to parse the file contents, which is a different question!
Re: read the whole folder files
by Marshall (Canon) on Apr 09, 2012 at 19:38 UTC
in the grep, you have a regex, with a rather strange looking pattern.
my @files = grep {/_*_.txt/} readdir DIR;
perhaps:
my @files = grep {/\.txt$/} readdir DIR;
is all that you need? The $ anchors the regex to the end of the string. An underscore "_" would be unusual before a .txt ending. Without escaping the "." like "\.", the dot means any character, which doesn't look like what you want.
When debugging, print @files to make sure that you are getting the files that you want.
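As a rough sketch of that check, using a lexical directory handle instead of the bareword DIR (the sub name txt_files_in is an assumption for illustration, not something from the OP's code):

```perl
use strict;
use warnings;

# Hypothetical helper: names of entries in $directory ending in ".txt".
sub txt_files_in {
    my ($directory) = @_;
    opendir my $dh, $directory or die "Cannot open $directory: $!";
    my @files = grep { /\.txt$/ } readdir $dh;
    closedir $dh;
    return @files;
}

# Debugging aid: print one name per line to confirm the filter
# picks up the files you expect.
print "$_\n" for txt_files_in('.');
```

The names come back without the directory prefix, so they still need "$directory/" prepended before they are opened.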
Re: read the whole folder files
by perllearner007 (Acolyte) on Apr 09, 2012 at 20:38 UTC
Hello Perl monks, Thank you so much...both your suggestions resolved the issue. However, the output file is in the format as shown below:
meter_read energy_consumption n 00_1 34 n 00_2 53 n 00_3 121 n ...
...
meter_read energy_consumption n 00_146 33 n ....
...
Isn't it a bit strange because while reading one file they get sorted into columns like below:
meter_read energy_consumption
00_146 33
00_1 34
And here they don't? Plus the "n", i.e. the \n newline character, isn't supposed to show up, is it?
|
|
One way (of many) to parse the data would be like this:
#!/usr/bin/perl -w
use strict;
use Data::Dumper; # this is a core module
# no "installation" is required
my $line = "meter_read energy_consumption n 00_1 34 n 00_2 53 n 00_3 121\n";
#
# this extracts number pairs and puts them into a hash
#
my %hash = $line =~ m/([\d_]+)\s+(\d+)/g;
print Dumper \%hash;
__END__
$VAR1 = {
'00_2' => '53',
'00_3' => '121',
'00_1' => '34'
};
Re: read the whole folder files
by perllearner007 (Acolyte) on Apr 09, 2012 at 20:43 UTC
Never mind! I fixed it! Thank you so much!
|
|
use File::Find::Rule;
## GLOB
my @files = find( file => maxdepth => 1, name => '_*_.txt', in => $directory );
## REGEX
my @files = find( file => maxdepth => 1, name => qr/_.+?_\.txt$/, in => $directory );
## verbose
my @files = File::Find::Rule->file()
->maxdepth(1) # do not recurse
->name( '_*_.txt', )
->in( $directory );
...
Re: read the whole folder files
by perllearner007 (Acolyte) on Apr 10, 2012 at 16:24 UTC
It was running fine yesterday, but today I keep getting the warning "Use of uninitialized value ... in pattern match". Can anyone spot the mistake below?
if ($energy_consumption =~ /^[^0-1-.]/ or ( $energy_consumption < 60 or $energy_consumption > 120))
|
|
Well, the warning says there is an uninitialized value in a pattern match, so clearly $energy_consumption is undefined when you get there. That value is assigned in your split, so the split isn't returning at least 8 values. I suspect that your input file is not formatted as you expect. A quick way to find the offending lines would be to add the block
if (not defined $energy_consumption) {
warn "Split missed, $file: $_";
next;
}
after the split, and see what comes out. My guess is that you are working with tab-delimited files, and there are some empty values.
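One way empty values can lead to an undefined variable, offered as a guess about the mechanism rather than a diagnosis of the OP's file: by default split discards trailing empty fields, so a line that ends in empty columns yields a shorter list. A minimal sketch with a made-up record:

```perl
use strict;
use warnings;

# Made-up tab-delimited record: three fields, the last two empty.
my $line = "61283067\t\t";

my @default = split /\t/, $line;        # trailing empty fields are dropped
my @all     = split /\t/, $line, -1;    # a negative limit keeps them

printf "default: %d field(s), with -1: %d field(s)\n",
    scalar @default, scalar @all;

# Assigning from the shorter list leaves the later variables undefined.
my ( $first, $second, $third ) = @default;
print "third is ", ( defined $third ? "defined" : "undefined" ), "\n";
```

With the -1 limit the missing columns come back as empty strings instead of undef, which are defined but still not numbers, so they need their own check before any numeric comparison.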
|
|
Hello,
Thank you for your time. I added the block after the split, but it keeps warning me.
I checked the output of the tab-delimited file as well. There doesn't seem to be a problem there.
it just outputs something like the following
split missed, myresultfile.txt: 61283067 61283865 798 0.57412