⭐ in reply to How do I extract all text between two keywords like start and end?

    if ($text =~ /\bstart\b(.*?)\bend\b/) {
        my $result = $1;
        # do something with $result
    }

Note that the . character matches any character but a newline (see the /s modifier under m// if you want to span lines), the * means match zero or more times, and the ? forces * to match as few times as possible, so it will pick up the first end instead of the last one. The \b is in there to prevent mismatches on words like 'starting' and 'backend'.
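
To make the greedy versus non-greedy difference concrete, here is a small sketch. The sample string and variable names are made up for illustration, and the /g list-context match is just one possible way to collect every start/end pair rather than only the first:

```perl
use strict;
use warnings;

# Hypothetical sample text containing two start/end pairs.
my $text = "start one end and start two end";

# Non-greedy (.*?): stops at the first 'end' after 'start'.
my ($lazy) = $text =~ /\bstart\b(.*?)\bend\b/;

# Greedy (.*): runs on to the last 'end' in the string.
my ($greedy) = $text =~ /\bstart\b(.*)\bend\b/;

# With /g in list context, every non-greedy capture is returned.
my @all = $text =~ /\bstart\b(.*?)\bend\b/g;

# \b keeps 'starting' and 'backend' from counting as keywords.
my $false_hit = ("starting the backend" =~ /\bstart\b(.*?)\bend\b/) ? 1 : 0;

print "lazy:   '$lazy'\n";
print "greedy: '$greedy'\n";
print "all:    '$_'\n" for @all;
print "false hit: $false_hit\n";
```

The lazy match yields ' one ', the greedy match yields ' one end and start two ', and the /g version returns both ' one ' and ' two '.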
It has the limitation of not catching nested starts and ends, in which case you might go the recursion route, and write this as a function:

    sub between {
        my $text = shift;
        # Greedy .* takes the outermost start/end pair, so each call
        # peels off one level of nesting. (A non-greedy .*? would stop
        # at the first end and never leave anything to recurse into.)
        if ($text =~ /start(.*)end/) {
            return between($1);
        }
        return $text;
    }

That can become prohibitively expensive, depending on your data set. I suspect there's a more hideous solution involving split and join, but that's likely to be counterproductive at this point. It also depends on having balanced tags. If you don't have them, don't do this!
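
As a rough sketch of the nested case, here is a loop-based variant of the recursive idea above. The name innermost and the sample string are invented for illustration. It assumes a single, balanced chain of nested start/end pairs, and it uses a greedy match so that each pass strips the outermost pair:

```perl
use strict;
use warnings;

# Hypothetical helper: repeatedly narrow the text to whatever sits
# inside the outermost start/end pair until no pair is left.
sub innermost {
    my ($text) = @_;
    # Greedy .* pairs the first 'start' with the last 'end'.
    while ($text =~ /\bstart\b(.*)\bend\b/) {
        $text = $1;
    }
    return $text;
}

my $nested = "start a start b end c end";
print "'", innermost($nested), "'\n";   # prints ' b '
```

With unbalanced input such as "start a end b end", the loop stops after one pass and returns ' a end b ', which is why the balanced-tags caveat matters.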