Re: How do I display only matches
by AnomalousMonk (Archbishop) on Sep 24, 2019 at 02:04 UTC
Your regex /.{54}\\[a-zA-Z]\s[\r\n]/g (please use <code> ... </code> tags for all code, data and input/output; please see Writeup Formatting Tips) requires a [\r\n] to match, but chomp $row; will remove a newline from the end of each line (if present). Are you sure that a match is possible? Are you depending on a match against \r (carriage-return)? (Also, the /g modifier on the m// has no effect, although it does no harm.)
Give a man a fish: <%-{-{-{-<
use strict;
use warnings;
my $filename = 'dirtest.txt';
open(my $fh, '<:encoding(UTF-8)', $filename)
    or die "Could not open file '$filename' $!";
while (my $row = <$fh>) {
    chomp $row;
    next if $row =~ /.{20}\\[a-zA-Z]\s[\r\n]/;
    print "$row\n";
}
The following is the file content:
Directory of D:\ \Q\X
09/20/2019 07:57 PM <DIR> .
09/20/2019 07:57 PM <DIR> ..
The regex works for this content in the online regex tester at https://regex101.com/r/LFrvLp/11
When I run the script, every row is displayed. Only the first row should be displayed, as it is the only row with a backslash in position 21.
I think you may be overcomplicating things with a regex that is both more complicated and harder to understand than necessary. It looks like the file is a Windows dir listing. I would suggest:
while (my $line = <$fh>)
{
    print "$line" if $line =~ /^\s+Directory of/;
}
No need to chomp if you are just going to add the line ending back in. Forcing at least one space at the beginning of the line narrows things down a lot. Putting in "Directory of" makes it very easy to understand what line of this file you are actually looking for. Please correct me if your dataset is more complicated than you've shown.
I would also add that in my work, keying a regex to a particular column number is usually a bad idea, because counting the columns can be error-prone and there can be some variance if the file could have been generated with "cut-n-paste". YMMV
No change is made to the default value of $/ (the input record separator; see perlvar), so readline (the <$fh> expression) is reading newline-terminated lines. Then chomp removes the $/ sequence (the newline) from each line. The /.{20}\\[a-zA-Z]\s[\r\n]/ regex requires a [\r\n] (carriage-return or newline) character to match, but chomp has removed the newline, and I doubt there is a \r present with which to match in text that seems to come from a Windows directory listing (update: see this for a more thorough discussion of this point).
c:\@Work\Perl\monks>perl -wMstrict -le
"print 'match with \r' if qq{ Directory of D:\\ \\Q\\X \r} =~ /.{20}\\[a-zA-Z]\s[\r\n]/;
print 'match sans \r' if qq{ Directory of D:\\ \\Q\\X } =~ /.{20}\\[a-zA-Z]\s[\r\n]/;
"
match with \r
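The effect of chomp on this regex can also be seen in a small standalone script. This is only a sketch: the sample line (including its single leading and trailing space) is an assumption modelled on the one-liner above, not the OP's real data.

```perl
use strict;
use warnings;

# Hypothetical sample line, newline-terminated as readline would return it.
my $row = " Directory of D:\\ \\Q\\X \n";

# The regex from the original post.
my $re = qr/.{20}\\[a-zA-Z]\s[\r\n]/;

my $before = $row =~ $re ? 1 : 0;   # the newline is still present here
chomp $row;                          # strips the trailing "\n" (the value of $/)
my $after  = $row =~ $re ? 1 : 0;   # nothing is left for [\r\n] to match

print "before chomp: ", ($before ? "match" : "no match"), "\n";
print "after chomp:  ", ($after  ? "match" : "no match"), "\n";
```

With this sample line the regex should match before the chomp and fail after it, which is consistent with the script never seeing a match.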
Update: Beyond that, the
next if $row =~ /REGEX/;
statement will skip printing if there is a match. You seem to want to print the line if there is a match, so something like
next unless $row =~ /REGEX/;
still seems the way to go (once you get the regex right :) (Update: See also Marshall's reply. It seems like really good advice, although I see no need to stringize "$line" when $line is already a string read from a file.)
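Putting both points together, here is a minimal sketch of the loop with the test inverted and the [\r\n] requirement dropped. The helper name matching_lines, the in-memory sample listing and the replacement regex (anchored with ^ and ending in \s*$) are all assumptions for illustration; adjust the pattern to the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the chomped lines of $text that match $re.
sub matching_lines {
    my ($text, $re) = @_;
    # Read from an in-memory filehandle so the sketch needs no file on disk.
    open my $fh, '<', \$text or die "Could not open in-memory file: $!";
    my @hits;
    while (my $row = <$fh>) {
        chomp $row;
        next unless $row =~ $re;   # skip rows that do NOT match
        push @hits, $row;
    }
    close $fh;
    return @hits;
}

# Assumed sample data: a backslash in column 21 of the first line only.
my $listing = join "\n",
    ' Directory of D:\ \Q\X',
    '09/20/2019 07:57 PM <DIR> .',
    '09/20/2019 07:57 PM <DIR> ..',
    '';

# Assumed pattern: 20 characters, a backslash, a letter, optional trailing
# whitespace. No [\r\n] is required, since chomp has removed the newline.
my $re = qr/^.{20}\\[a-zA-Z]\s*$/;

print "$_\n" for matching_lines($listing, $re);
```

Only the "Directory of" line should be printed for this sample data.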
Give a man a fish: <%-{-{-{-<
Re: How do I display only matches
by LanX (Saint) on Sep 24, 2019 at 01:47 UTC
put this between chomp and print
next if $row =~ /REGEX/;
Update
Of course this must be inverted
next unless $row =~ /REGEX/;
Thanks, AnomalousMonk
Re: How do I display only matches
by Anonymous Monk on Sep 24, 2019 at 03:55 UTC
autodie handles your errors with less effort. Also try the open pragma with an encoding, especially if you are opening more than one file, for cleaner code:
use strict;
use warnings;
use autodie;
use open ':encoding(UTF-8)';
my $filename = 'dirtest.txt';
open my $fh, '<', $filename;
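To see what autodie does when open fails, the call can be wrapped in eval. This is just a sketch, and the file name below is a made-up one that is assumed not to exist.

```perl
use strict;
use warnings;
use autodie;

# Hypothetical file name that is assumed not to exist.
my $missing = 'no_such_file_for_autodie_demo.txt';

my $ok = eval {
    open my $fh, '<', $missing;   # no "or die" needed; autodie throws on failure
    1;
};
my $err = $@;   # save the exception before anything else can reset $@

if (!$ok) {
    # autodie throws an exception object rather than a plain string.
    print "open failed, caught: ", (ref $err || 'plain string'), "\n";
}
```

The caught error is an autodie::exception object. It stringifies to a message that includes the file name and the system error, so an uncaught failure still produces a useful message.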