
Surely there's a simpler way?

Just capture what you want. Let's change the task to set aside the elephant in the room: parsing HTML with a regex, which you now know you shouldn't do. Suppose instead that you want to extract everything between 'foo' and 'bar' and ignore the rest. Here's the simple approach:

use strict;
use warnings;
use Test::More tests => 1;

my $in   = 'abcfooHellobarxyz';
my $want = 'Hello';
my ($have) = ($in =~ /foo(.*)bar/);
is $have, $want, "Extracted $want";

The only real caveat is to remember the /s modifier if the text you are extracting might contain \n, because without /s the dot does not match a newline.
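
To make that caveat concrete, here is a minimal sketch that runs the same pattern with and without /s. The input string with an embedded newline and the variable names $without and $with are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: the text between 'foo' and 'bar' spans two lines.
my $in = "abcfooHello\nWorldbarxyz";

# Without /s, '.' stops at "\n", so 'bar' is never reached and the match fails.
my ($without) = ($in =~ /foo(.*)bar/);

# With /s, '.' also matches "\n", so the capture spans both lines.
my ($with) = ($in =~ /foo(.*)bar/s);

print defined $without ? "without /s: matched\n" : "without /s: no match\n";
print defined $with    ? "with /s: matched\n"    : "with /s: no match\n";
```

A failed match in list context returns an empty list, which is why $without ends up undefined rather than empty.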

Re^4: How to extract a pattern in Perl regex?
by SergioQ (Scribe) on May 01, 2020 at 22:24 UTC
    Thank you! Yes, this was the main part of what I was looking for. I remember going through a rather large Perl handbook, and it ended the Regex chapter (or started it) by saying that "there is so much to Regex that whole books are written on it." I really see why now.
Re^4: How to extract a pattern in Perl regex?
by AnomalousMonk (Archbishop) on May 01, 2020 at 10:33 UTC
    ... caveat ... is to remember to use the /s modifier if the text you are extracting might contain \n.

    Simpler still is to always use /s (along with /x and /m in a consistent /xms modifier tail) on every qr//, m// and s/// you write. Then the rule is simply "Dot matches all." Period.
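
As a rough sketch of what a consistent /xms tail might look like on the foo/bar extraction from earlier in the thread (the multi-line input and the names $rx and $have are assumptions for illustration, not the poster's own code):

```perl
use strict;
use warnings;

# Hypothetical input with a newline inside the wanted text.
my $in = "abcfooHello\nWorldbarxyz";

# /x permits whitespace and comments inside the pattern,
# /s lets '.' match newlines, /m makes ^ and $ match at line boundaries
# (unused here, but kept so every pattern carries the same tail).
my $rx = qr{
    foo     # literal start marker
    (.*)    # capture everything in between, newlines included
    bar     # literal end marker
}xms;

my ($have) = ($in =~ $rx);
print "captured: [$have]\n";
```

Note that under /x a literal space in the pattern has to be written explicitly, for example as \s or [ ], which is part of why this style is a matter of taste.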


    Give a man a fish:  <%-{-{-{-<

      We've had this conversation before. Let's agree to disagree. :-)

        As I was finishing my reply, it began to dawn on me that we had, in fact, had this discussion before. Anyway, it seemed that it might still be useful to a novice monk to be exposed to the disagreement.

