I have a regex question. It is not a real problem, I'm just curious if it is possible at all:
It's about (non)greediness, left to right and regex replacements.
Before I post the code, I need to let you know that
#!/usr/bin/perl
use strict;
use warnings;

my $string =<<'EOF';
<tr>
<td>aaa</td>
<td>aaa</td>
</tr>
<tr>
<td>NOTWANTED</td>
</tr>
<tr>
<td>bbb</td>
<td>bbb</td>
</tr>
EOF

# try to remove all table rows with NOTWANTED
$string =~ s/<tr.+?NOTWANTED.+?<\/tr>//gsm;
print $string;
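prints only the third table row. As far as I understand, the problem here is that the regex engine starts at the leftmost "<tr" (the first one), finds the smallest match from that starting point that contains NOTWANTED, and removes it. Non-greediness only shortens the match at its right end; it does not move the starting point, so the first row is removed along with the second.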
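To show what I mean, here is a minimal sketch that captures the first match instead of deleting it. It assumes the same data as above; the variables $match, $start and $rows are only there for illustration, and I dropped the /g and /m modifiers because only the first match matters here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same three rows as above, joined with newlines (shorthand for this sketch)
my $string = join "\n",
    '<tr>', '<td>aaa</td>', '<td>aaa</td>', '</tr>',
    '<tr>', '<td>NOTWANTED</td>', '</tr>',
    '<tr>', '<td>bbb</td>', '<td>bbb</td>', '</tr>', '';

# Capture the first match instead of replacing it
my ($match) = $string =~ /(<tr.+?NOTWANTED.+?<\/tr>)/s;

# Offset where that match begins (must be read before the next match resets @-)
my $start = $-[0];

# Count how many <tr> tags the match swallowed
my $rows = () = $match =~ /<tr>/g;

print "first match starts at offset $start and contains $rows rows\n";
# prints: first match starts at offset 0 and contains 2 rows
```

So the match begins at the very first "<tr" and runs through the end of the NOTWANTED row, which is why only the third row survives the replacement.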
While this problem can easily be solved with

my @tr = split(/(?=<tr)/, $string); # split at <tr, but do not remove <tr
@tr = grep { ! /NOTWANTED/ } @tr;   # remove the elements with NOTWANTED
print join('', @tr);

I'm still curious if it can be done with a regex replacement.
In reply to greedy/nongreedy regex replacement by svenXY