karden has asked for the wisdom of the Perl Monks concerning the following question:

Hello all,

For a correctly matching pattern, this line of code:
$word =~ /(\d)\s[t=(\S+)]*/;
returns the decimal in $1 properly but does not return the string in $2 (returns null always).

How can I capture a string inside such a character class? Anything special with it?

Re: string selection from a character class
by Limbic~Region (Chancellor) on Jul 12, 2007 at 15:04 UTC
    karden,
    I believe you misunderstand how character classes work. A character class represents a single character and to capture, you put the parens around the class:
    /foo="([^"]+)"/
    That will capture all the non-quote characters in between two quote characters.
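A minimal sketch of that pattern in use. The helper name `quoted_value` and the sample string are made up for illustration; only the regex comes from the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text between the quotes after foo=,
# or undef when the pattern does not match.
sub quoted_value {
    my ($text) = @_;
    # The parens sit AROUND the class, so the whole run of
    # non-quote characters lands in $1.
    return $text =~ /foo="([^"]+)"/ ? $1 : undef;
}

print quoted_value('foo="bar baz"'), "\n";
```

The class `[^"]` still matches only one character at a time; it is the `+` outside the brackets that repeats it, and the parens outside that which capture the result.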

    I think what you are looking for is assertions. See perlre and the tutorial Using Look-ahead and Look-behind.

    Cheers - L~R

Re: string selection from a character class
by FunkyMonk (Bishop) on Jul 12, 2007 at 15:09 UTC
    How can I capture a string inside such a character class? Anything special with it?
    I don't know what you mean by this. Can you explain it further? Or, better still, give some sample data and your expected matches.

    The rules change inside a character class. Outside a character class, \S+ matches one or more non-space characters. Inside, it matches a single non-space character OR a plus.
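To see the difference, the sketch below runs the pattern from the original question next to a version with `t=` and the capture group moved out of the brackets. The helper names are hypothetical. In list context a match returns its capture groups, so counting them shows that the bracketed version has only one group: the `(` and `)` inside `[...]` are literal characters, not grouping.

```perl
use strict;
use warnings;

# Hypothetical helper: captures returned by the pattern from the question.
# Inside [...] the characters t = ( + ) and \S are just members of a set.
sub broken_captures {
    my ($text) = @_;
    my @captures = $text =~ /(\d)\s[t=(\S+)]*/;
    return @captures;
}

# Hypothetical helper: same idea with the second group outside any class.
sub fixed_captures {
    my ($text) = @_;
    my @captures = $text =~ /(\d)\st=(\S+)/;
    return @captures;
}

my @broken = broken_captures('1 t=something');
my @fixed  = fixed_captures('1 t=something');
print scalar(@broken), " group(s) vs ", scalar(@fixed), " group(s)\n";
```

With only one capture group in the pattern, `$2` is never set, which is consistent with the empty result reported in the question.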

Re: string selection from a character class
by karden (Novice) on Jul 12, 2007 at 15:37 UTC
    Let me clarify. I expect either of the following as $word in the previous code:
    0
    or
    1 t=something
    And I want to capture the word "something".

    I guess I cannot capture due to Perl matching greedy?

      #!/usr/bin/env perl
      use warnings;
      use strict;

      while (<DATA>) {
          print $2 . "\n" if /(\d)\st=(\w+)/;
      }

      __DATA__
      0
      1 t=something
      2 t=3foo7bar

      Creates the following output:

      something
      3foo7bar
      Is this what you are trying to achieve? In any case, you should heed Limbic~Region's advice on understanding character classes (what goes into square brackets in a regex).
        Thx toolic

        Your solution is exactly the one I am also using to get around the problem, and it works well enough for me.

        Before my very first posting I was doing (in a loop):

        $word =~ /^(\d)\st=(\S+)/;
        next if ($1 == 0);
        print "$2\n";

        But this way, for $word="0", the pattern did not match, so the previous iteration's $1 and $2 were printed and the program functioned incorrectly.

        Anyway, now I am doing:

        next if(!($word =~ /(\d)\st=(\S+)/));
        print "$2\n";

        and it works ok for my task.

        I thank you for your replies.
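The stale `$1`/`$2` problem described above comes from reading the capture variables without checking whether the match succeeded. A minimal sketch of the safer shape follows; the helper name `t_values` and the sample inputs are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the value after t= from each word,
# skipping words where the pattern does not match at all.
sub t_values {
    my @words = @_;
    my @found;
    for my $word (@words) {
        # Test the match itself; $1 and $2 are only trusted on success.
        next unless $word =~ /^(\d)\st=(\S+)/;
        push @found, $2;
    }
    return @found;
}

print join(',', t_values('1 t=something', '0', '2 t=3foo7bar')), "\n";
```

Because the word `0` fails the match, the loop moves on before anything reads `$2`.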

      You don't need character classes for that:

      m/ (\d+) \s t= (\S+) /xms

      This allows one t=something string. If you want multiple, you could do something like this:

      m/ (\d+) (?: \s t= (\S+) ) /xms
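As a related sketch, and assuming the goal is one pattern that accepts both `0` and `1 t=something`, the non-capturing group can be made optional with `?`. This is an illustration rather than the pattern given in the reply above; the helper name `parse_word` and the `\A`/`\z` anchors are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (number, value). The value is undef when
# the word has no t= part; both are undef when nothing matches.
sub parse_word {
    my ($word) = @_;
    my ($num, $value) = $word =~ m/ \A (\d+) (?: \s t= (\S+) )? \z /xms;
    return ($num, $value);
}

for my $word ('0', '1 t=something') {
    my ($num, $value) = parse_word($word);
    print "$num: ", defined $value ? $value : '(none)', "\n";
}
```

Checking `defined $value` separates "matched without a t= part" from "matched with one", so a bare `0` no longer needs special handling before the match.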