Re: Matching a string in a parenthesized block (regex help)
by LanX (Saint) on Mar 05, 2021 at 22:40 UTC
The best answer depends on things you haven't told us, such as:
- are blocks never nested?
- does every block end with a } on a line of its own?
If both are true, use the flip-flop operator .. to match the start and end of a block.
Use a normal regex to match the contents.
edit
if ( /block-start/ .. /block-end/ ) {   # true from the start line through the end line
    $block .= $_;                       # append the current line to the block
    $hit = 1 if /match-plz/;
} else {
    print $block if $hit;
    $block = $hit = undef;              # reset
}
I must admit my knowledge is not advanced enough to understand the significance of the .= operator here. What is the reason behind adding strings to a string here? Not sure how to implement this solution.
I already linked to the docs for the flip-flop operator.
Here is an implementation.
Please note how ...
- it avoids slurping the whole (potentially huge) file into RAM
- it's self-documenting (well, better than one big regex)
- you can easily add more complicated tests later when maintaining it
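use strict;
use warnings;

my $section = "";
my $hit     = 0;

while (<DATA>) {
    my $start = /^ASDF \{\s*$/;    # (2)
    my $end   = /^\}\s*$/;

    if ($start .. $end) {          # true from the start line through the end line
        $section .= $_;            # collect the lines of the current block
        $hit = 1 if /foo_match/;
    }

    if ($end) {
        print $section if $hit;
        $section = "";             # reset (1), also after a block without a hit,
        $hit     = 0;              # so it does not leak into the next printed block
    }
}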
__DATA__
ASDF {
tmp
foo_match
tmp
}
string2 {
tmp
}
ASDF {
tmp
bar_match
tmp
}
NB:
- (1) if you only need the first hit, you can also exit after printing instead of resetting
- (2) allowing optional "invisible" whitespace (\s*) at the end of the line makes the match more robust
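On the earlier question about .=: it appends to a string, so $section .= $_ is shorthand for $section = $section . $_. The loop uses it to collect the block one line at a time, so that the whole block can be printed once a hit is known. A minimal sketch, where collect_lines is a hypothetical helper that is not part of the code above:

```perl
use strict;
use warnings;

# Hypothetical helper: build one string from a list of lines,
# the same way `$section .= $_` does inside the while loop.
sub collect_lines {
    my @lines   = @_;
    my $section = "";
    for my $line (@lines) {
        $section .= $line;    # same as: $section = $section . $line
    }
    return $section;
}

# The three lines come back as a single three-line string.
print collect_lines("ASDF {\n", "foo_match\n", "}\n");
```

Without the accumulation, only the current line would be available at the moment the closing } is seen.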
Re: Matching a string in a parenthesized block (regex help)
by jwkrahn (Abbot) on Mar 06, 2021 at 00:21 UTC
$ echo "ASDF {
tmp
plz_match
tmp
}
string2 {
tmp
}
string3 {
tmp
}
" | perl -e'
    # each read returns one block, up to and including its closing brace
    local $/ = "}\n";
    while ( <> ) {
        if ( /^ASDF/ && /plz_match/ ) {
            print "Matched: $_";
            ++$match;
        }
    }
    print "No match\n" unless $match;
'
Matched: ASDF {
tmp
plz_match
tmp
}
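What the one-liner relies on, as far as I can tell: $/ is Perl's input record separator, so with $/ = "}\n" each read returns a whole block rather than a single line, and /^ASDF/ is then anchored at the start of that block. A small sketch of the same idea as a sub reading from an in-memory filehandle; matching_records and its parameters are made-up names for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return every "}\n"-terminated record in $text
# that starts with $name and contains $word somewhere inside.
sub matching_records {
    my ($text, $name, $word) = @_;
    local $/ = "}\n";    # one record per block instead of per line
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    my @found;
    while ( my $record = <$fh> ) {
        push @found, $record
            if $record =~ /^\Q$name\E/ && $record =~ /\Q$word\E/;
    }
    close $fh;
    return @found;
}

my $sample = "ASDF {\ntmp\nplz_match\ntmp\n}\nstring2 {\ntmp\n}\n";
print "Matched: $_" for matching_records($sample, 'ASDF', 'plz_match');
```

This assumes that blocks are not nested and that every closing } is directly followed by a newline.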
use warnings;
use strict;
my $file = "/path/to/file.txt";
sub has_word {
    my $arg = $_[0];
    local $/;
    open FILE, '<', $file;
    while ( <FILE> ) {
        if ( /^ASDF_$arg/ && /magic/ ) {
            close FILE;
            return 1;
        } else {
            close FILE;
            return 0;
        }
    }
}
sub main {
    if (has_word("ONE")) {
        print "ONE already has the word.\n";
    } else {
        print "ONE does not have the word.\n";
    }
    if (has_word("TWO")) {
        print "TWO already has the word.\n";
    } else {
        print "TWO does not have the word.\n";
    }
}
main;
Content of file in this particular case:
ASDF_ONE {
magic
tmp
tmp
}
ASDF_TWO {
tmp
magic
tmp
}
string3 {
tmp
tmp
magic
}
The output is not what I expect:
ONE already has the word.
TWO does not have the word.
In fact, all the sections in this case contain the word.
$ cat file.txt
ASDF_ONE {
magic
tmp
tmp
}
ASDF_TWO {
tmp
magic
tmp
}
string3 {
tmp
tmp
magic
}
#!/usr/bin/perl
use warnings;
use strict;
use feature 'state';
my $file_name = 'file.txt';
sub get_file_data {
    state $data;
    unless ( length $data ) {
        open my $FH, '<', $file_name or die "Cannot open '$file_name' because: $!";
        my $read = read $FH, $data, -s $FH;
        $read == -s _ or die "Error reading '$file_name'";
    }
    return $data;
}
sub has_word {
    my $query = shift;
    my $file  = get_file_data();
    local $/ = "\n}\n";
    open my $FH, '<', \$file;
    while ( <$FH> ) {
        if ( /^ASDF_\Q$query/ && /magic/ ) {
            return 1;
        }
    }
    return;
}
if ( has_word( 'ONE' ) ) {
    print "ONE already has the word.\n";
}
else {
    print "ONE does not have the word.\n";
}
if ( has_word( 'TWO' ) ) {
    print "TWO already has the word.\n";
}
else {
    print "TWO does not have the word.\n";
}
And it produces this output:
$ perl 11129184.pl
ONE already has the word.
TWO already has the word.
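For the record, a likely explanation of the earlier wrong result: with local $/; the whole file arrives as one string, and without the /m modifier ^ only matches at the very start of that string. So /^ASDF_ONE/ succeeds, /^ASDF_TWO/ fails, and /magic/ can match anywhere in the file. Reading one block per record, as in the script above, sidesteps this. A small sketch of the anchoring behaviour (the helper names and the shortened sample data are hypothetical):

```perl
use strict;
use warnings;

# Whole "file" in one string, as it would be after slurping with `local $/;`.
my $content = "ASDF_ONE {\nmagic\n}\nASDF_TWO {\nmagic\n}\n";

# Without /m, ^ matches only at the very start of the string.
sub header_at_string_start {
    my ($text, $name) = @_;
    return $text =~ /^ASDF_\Q$name\E/ ? 1 : 0;
}

# With /m, ^ also matches right after every newline.
sub header_at_line_start {
    my ($text, $name) = @_;
    return $text =~ /^ASDF_\Q$name\E/m ? 1 : 0;
}

for my $name (qw(ONE TWO)) {
    printf "%s: string start %d, line start %d\n",
        $name,
        header_at_string_start($content, $name),
        header_at_line_start($content, $name);
}
```

Note that adding /m alone would not repair the original has_word: /magic/ would still be tested against the whole file rather than against the one block.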
Re: Matching a string in a parenthesized block (regex help)
by hippo (Archbishop) on Mar 06, 2021 at 11:01 UTC
use strict;
use warnings;
use Test::More tests => 2;
my $in = <<EOT;
ASDF {
tmp
foo_match
tmp
}
string2 {
tmp
}
string3 {
tmp
bar_match
tmp
}
EOT
my $re = '^ASDF \{[^{]*foo_match[^}]*}';   # literal { escaped; newer perls may warn about an unescaped one
like $in, qr/$re/m, 'foo_match found in ASDF';
$re =~ s/foo/bar/;
unlike $in, qr/$re/m, 'bar_match not found in ASDF although present in string3';
Re: Matching a string in a parenthesized block (regex help)
by LanX (Saint) on Mar 05, 2021 at 23:59 UTC
I played around with your regex. What exactly is wrong, apart from the fact that you didn't slurp the whole file into $file?
#!/usr/bin/perl
use warnings;
use strict;
my $file = "/path/to/file.txt";
local $/; # added after post
my $content = <DATA>;
if ( $content =~ m/(ASDF \{)(.*?)plz_match(.*?)(\})/s ) {
    print "Matched: <<< $& >>>\n";
} else {
    print "No match: |$content|\n";
}
__DATA__
ASDF {
tmp
plz_match
tmp
}
string2 {
tmp
}
string3 {
tmp
}
Matched: <<< ASDF {
tmp
plz_match
tmp
} >>>
Now that you mention it, I think the slurp mode fixed my little program.
*edit*
Actually no, it seems to keep matching all the way down towards the end of the file... I will check the other responses. For instance, it looks outside the ASDF block and will match one of the other blocks if that block contains plz_match.
Look at my regex: I made both .* non-greedy (.*?).
jwkrahn's solution isn't bad either, if your records are that consistent.
Edit: though [^}]*? is certainly better for more complex input.
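To illustrate the difference: a non-greedy .*? still crosses a closing brace when the word only occurs in a later block, which is presumably what the poster above ran into, while [^}]*? cannot leave the block it started in. A minimal sketch with made-up helper names and sample data:

```perl
use strict;
use warnings;

# Made-up sample: plz_match occurs only in the string2 block, not in ASDF.
my $text = "ASDF {\ntmp\n}\nstring2 {\nplz_match\n}\n";

# Lazy dot with /s: may run past the } that closes the ASDF block.
sub lazy_dot_match {
    my ($input) = @_;
    return $input =~ /ASDF \{.*?plz_match.*?\}/s ? 1 : 0;
}

# Negated class: can never consume a }, so it stays inside one block.
sub negated_class_match {
    my ($input) = @_;
    return $input =~ /ASDF \{[^}]*?plz_match[^}]*?\}/ ? 1 : 0;
}

print "lazy dot:      ", lazy_dot_match($text),      "\n";    # 1 (false positive)
print "negated class: ", negated_class_match($text), "\n";    # 0
```

Both variants still assume that blocks are not nested.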
Re: Matching a string in a parenthesized block (regex help)
by haukex (Archbishop) on Mar 06, 2021 at 03:58 UTC
Re: Matching a string in a parenthesized block (regex help)
by maxamillionk (Acolyte) on Mar 05, 2021 at 22:33 UTC
Ah darn it I forgot
local $/;
That's one mistake...