ExReg has asked for the wisdom of the Perl Monks concerning the following question:
I have run into a little problem extracting information from a bunch of files. As I loop through the files, I slurp each one in and then look for information matching certain criteria. For simplicity's sake, let's say that the file contents are
    $fc = 'abcdfoofrobnicatebardefforspambazghi';
I then search for certain patterns in the file and capture them using a regex, say:
    $re1 = qr/fo.*?ba./;
I capture them into an array for later analysis thus:
    @excerpts = $fc =~ /$re1/g;
The above yields two entries in @excerpts:
    foofrobnicatebar
    forspambaz
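Put together as a runnable script, the first pass looks roughly like this. It is only a minimal sketch of the steps above: the `use strict`/`use warnings` lines, the `my` declarations, and the `print` are additions for illustration, and the string stands in for a slurped file.

```perl
use strict;
use warnings;

# Stand-in for the slurped contents of one file.
my $fc = 'abcdfoofrobnicatebardefforspambazghi';

# No capture groups, so list-context /g returns one element per whole match.
my $re1 = qr/fo.*?ba./;
my @excerpts = $fc =~ /$re1/g;

print scalar(@excerpts), " excerpts: @excerpts\n";
# prints "2 excerpts: foofrobnicatebar forspambaz"
```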
Later, when I get to analyzing the excerpts, I want to search each one for patterns within each match using almost the same regex with capture groups:
    $re2 = qr/(fo.)(.*?)(ba.)/;
This gives me
    foo frobnicate bar
    for spam baz
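The second pass can be sketched the same way. The excerpts are hard-coded here so the snippet runs on its own, and the `@rows` array is a hypothetical holder added purely for illustration.

```perl
use strict;
use warnings;

# The two excerpts produced by the first pass.
my @excerpts = ('foofrobnicatebar', 'forspambaz');
my $re2 = qr/(fo.)(.*?)(ba.)/;

my @rows;    # hypothetical: one array ref of captures per excerpt
for my $excerpt (@excerpts) {
    # Without /g, a list-context match returns the capture groups ($1, $2, $3).
    my @parts = $excerpt =~ /$re2/;
    push @rows, [@parts];
    print "@parts\n";
}
# prints "foo frobnicate bar" and then "for spam baz"
```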
My question is this: The expressions I am really using are much more complicated in the two regexes, $re1 and $re2, and there are orders of magnitude more captures than the two shown here. If I make a change to one regex, I have to change the other. This can be a hassle in big ugly expressions, and I have already failed once to notice they were not in sync. It would be nice to have only one regex, but if I were to use the second expression in the first capture, I would get
    @excerpts = $fc =~ /$re2/g;
    foo frobnicate bar for spam baz
Six entries in @excerpts.
Having capture groups in a /g capture makes each capture group add an element to the array instead of the whole regex adding one. Is there an easier way than having two almost identical regexes? Can I have one regex with capture groups, but keep it from making each capture group create an array element under /g?
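The flattening described above is easy to reproduce. The following is a minimal sketch using the same sample string; the variable name `@pieces` is made up for the demonstration.

```perl
use strict;
use warnings;

my $fc  = 'abcdfoofrobnicatebardefforspambazghi';
my $re2 = qr/(fo.)(.*?)(ba.)/;    # three capture groups

# List-context /g with capture groups: every group of every match
# becomes its own element, so two matches yield 2 x 3 = 6 entries.
my @pieces = $fc =~ /$re2/g;

printf "%d entries: %s\n", scalar @pieces, join(' ', @pieces);
# prints "6 entries: foo frobnicate bar for spam baz"
```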