The point about the possibility of single-word "text chunks" is well made... so long as the words "pretty specific regex" are not intended to deprecate specificity.
IMO, specificity is *GOOD* in a regex unless ambiguity (or at least some deliberate generalization) is required, because a less-than-specific regex can lead to hard-to-find problems when the source data includes unexpected content.
Consider,
H2O 60%
or
Grand Canyon3 70%
or
Teller-Bose condensate 50%
Does one want the "water" entry or the footnoted "Grand Canyon" in the output?
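
To make the trade-off concrete, here is a minimal sketch in Python's `re` module (the patterns themselves should carry over to other regex flavors with little change). Both `LOOSE` and `STRICT` are hypothetical patterns invented for this illustration, not the regex discussed earlier in the thread, and the "letters, spaces and hyphens only" rule is an assumption about what a clean name looks like.

```python
import re

# Hypothetical patterns for illustration only; neither is the thread's regex.
# LOOSE: anything at all, then whitespace, then a percentage.
LOOSE = re.compile(r"^(.+)\s+(\d+)%$")
# STRICT: letter-only words joined by single spaces or hyphens, then a percentage.
STRICT = re.compile(r"^([A-Za-z]+(?:[ -][A-Za-z]+)*)\s+(\d+)%$")

SAMPLES = ["H2O 60%", "Grand Canyon3 70%", "Teller-Bose condensate 50%"]


def classify(lines, pattern):
    """Split lines into (matched, rejected) under the given pattern."""
    matched, rejected = [], []
    for line in lines:
        m = pattern.match(line)
        if m:
            # Keep the captured name and the percentage as an integer.
            matched.append((m.group(1), int(m.group(2))))
        else:
            rejected.append(line)
    return matched, rejected


loose_hits, loose_misses = classify(SAMPLES, LOOSE)
strict_hits, strict_misses = classify(SAMPLES, STRICT)

print("loose  matched:", loose_hits)
print("strict matched:", strict_hits)
print("strict rejected:", strict_misses)
```

Under these assumptions the loose pattern accepts all three lines, footnote digit included, while the strict one accepts only the condensate line and hands the other two back for a decision. Whether "H2O" should then be allowed is exactly the kind of specific generalization one would add on purpose rather than by accident.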
In reply to Re^3: Capture groups by ww, in thread Capture groups by legend