guha has asked for the wisdom of the Perl Monks concerning the following question:

I have been playing around with a RE that is supposed to extract the value of a tag.

The program below already does the right thing, but I have an undefined feeling that it could be done with some kind of alternation aka |. But I can't get it to do what I want. I have been reading the docs but either the info is too thin or my head is too thick.

So I was wondering if someone can show me how alternation can/should be used.

#!perl -w
use strict;

while ( <DATA> ) {
    chomp;
    my ( $data ) = m/TAG:\"([^\"]+)\"/ ? $1
                 : m/TAG:([^\"\s]+)/   ? $1
                 :                       'not_def';
    print "$_ => $data\n";
}
__DATA__
TAG:"test of data"
TAG:test_of_data
TAG:test of data
TAG: test
TAG:test of data"
TAG:"test of data
TAG:test-2_3.we

TIA

Replies are listed 'Best First'.
Re: Alternation in pattern matches
by ambrus (Abbot) on Feb 18, 2004 at 20:56 UTC

    Good question. The answer is to use the magic $+ variable:

    while ( <DATA> ) {
        chomp;
        my ( $data ) = m/TAG:(\"([^\"]+)\"|([^\"\s]+))/ ? $+ : 'not_def';
        print "$_ => $data\n";
    }
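A minimal sketch of why $+ works here, with the needless backslashes dropped. Per perlvar, $+ holds the text of the highest-numbered capture group that actually took part in the last successful match, so it picks up group 2 for the quoted form and group 3 for the bare form. The helper name tag_value and the three sample lines are only illustrative, not part of the reply above.

```perl
#!perl -w
use strict;

# Hypothetical helper (not from the thread): returns the tag value,
# or 'not_def' when the line holds no usable tag.
sub tag_value {
    my ($line) = @_;
    # Group 1 wraps the whole alternation, group 2 is the quoted form,
    # group 3 the bare form. Only one of groups 2 and 3 can match, and
    # $+ returns whichever one did.
    return $line =~ /TAG:("([^"]+)"|([^"\s]+))/ ? $+ : 'not_def';
}

for my $line ( 'TAG:"test of data"', 'TAG:test_of_data', 'TAG: test' ) {
    print "$line => ", tag_value($line), "\n";
}
```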
Re: Alternation in pattern matches
by dws (Chancellor) on Feb 18, 2004 at 21:49 UTC
    Your example isn't 100% clear about which of the test cases have valid data and which don't. One way to make things clearer is to allow for (and ignore) comment lines:
    while ( <DATA> ) {
        next if /^(?:#.*|\s*)$/;
        ...
    }
    __DATA__
    # These should pass
    TAG:"test of data"
    ...
    # These should fail
    TAG: test
    ...
    That saves us from having to guess, and removes the risk of our guessing wrong.
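A runnable sketch of the comment-skipping idea, combined with the two patterns from the original question. Two things here are assumptions rather than anything stated in the thread: the grouping of the cases under the two comment lines, and the use of a here-document read through an in-memory filehandle in place of a real __DATA__ section (done only so the snippet can be pasted into a larger script).

```perl
#!perl -w
use strict;

# Stand-in for the __DATA__ section. The grouping below is a guess.
my $input = <<'END';
# These should yield a value
TAG:"test of data"
TAG:test_of_data

# These should yield not_def
TAG: test
TAG:"test of data
END

my @seen;    # "line => value" strings, kept so they can be inspected
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";
while ( my $line = <$fh> ) {
    chomp $line;
    next if $line =~ /^(?:#.*|\s*)$/;    # skip comment and blank lines
    my $data = $line =~ /TAG:"([^"]+)"/ ? $1
             : $line =~ /TAG:([^"\s]+)/ ? $1
             :                            'not_def';
    push @seen, "$line => $data";
}
close $fh;
print "$_\n" for @seen;
```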

      Yes, you are correct, it's not 100% clear, but fairly close, I would say, given the sentence:

      The program below already does the right thing, but I have ....

      While composing the node, the idea of including tests à la Test::More occurred to me, but laziness got the better of me.

      Anyway your idea with comments in the DATA section is nice and I will try to remember that.

      Thanks for your input, dws, as always mature and experienced in a particular way.

Re: Alternation in pattern matches
by Abigail-II (Bishop) on Feb 18, 2004 at 21:39 UTC
    If you do include code, and its output is important to understand the question, then please be so kind as to include the output!

    The question arises, are you a backslashophile? You escape things where it isn't necessary, making the code harder to read.

    Anyway, a way of doing it in one regexp (no alternation, that would be hard unless you want to switch afterwards):

    my (undef, $data) = /TAG:("?)((??{ $1 ? '[^"]+' : '[^"\s]+' }))\1/;
    $data //= 'not_def';

    Abigail
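For comparison, a sketch of what "alternation, then switch afterwards" could look like: one pattern with two alternatives, followed by a check of which capture group is defined. The second helper uses the branch-reset group (?|...), which was added in Perl 5.10 and so postdates this thread; it renumbers the groups in each alternative so the value always lands in $1 and no switch is needed. Both helper names are hypothetical.

```perl
#!perl -w
use strict;
use 5.010;    # only needed for the branch-reset variant below

# Plain alternation: the quoted form fills $1, the bare form fills $2,
# so the caller has to pick whichever one is defined.
sub tag_value_switch {
    my ($line) = @_;
    return 'not_def' unless $line =~ /TAG:(?:"([^"]+)"|([^"\s]+))/;
    return defined $1 ? $1 : $2;
}

# Branch reset: every alternative inside (?|...) numbers its groups
# from the same starting point, so the value is always in $1.
sub tag_value_reset {
    my ($line) = @_;
    return $line =~ /TAG:(?|"([^"]+)"|([^"\s]+))/ ? $1 : 'not_def';
}

for my $line ( 'TAG:"test of data"', 'TAG:test-2_3.we', 'TAG:"test of data' ) {
    print "$line => ", tag_value_switch($line), " / ", tag_value_reset($line), "\n";
}
```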

      You don't like (")?(?(1)[^"]+|[^"\s]+) ?

        It just won't catch a missing closing quote, as in TAG:"test of data. However, you can correct that like this:

        my ( $data ) = m/TAG:(")?((?(1)[^"]+|[^"\s]+))(?(1)")/ ? $2 : 'not_def';
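A small side-by-side sketch of the two conditional patterns, to show the difference on an unclosed quote. The helper names loose and strict_quote are made up for illustration, and the first pattern has an extra pair of capturing parentheses around the conditional (as in the corrected version) so that the value can be read from $2.

```perl
#!perl -w
use strict;

# Conditional without the closing-quote check.
sub loose {
    my ($line) = @_;
    return $line =~ /TAG:(")?((?(1)[^"]+|[^"\s]+))/ ? $2 : 'not_def';
}

# Adds (?(1)") so that an opening quote requires its closing partner.
sub strict_quote {
    my ($line) = @_;
    return $line =~ /TAG:(")?((?(1)[^"]+|[^"\s]+))(?(1)")/ ? $2 : 'not_def';
}

for my $line ( 'TAG:"test of data"', 'TAG:"test of data', 'TAG:test_of_data' ) {
    printf "%-20s loose=%s strict=%s\n", $line, loose($line), strict_quote($line);
}
```

With the unclosed input the loose pattern still returns the text after the quote, while the stricter one falls through to 'not_def'.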