vacant has asked for the wisdom of the Perl Monks concerning the following question:

I have a feeling that this question is too easy, but I am having a hard time with it. I am building some very complex re's by combining sub-expressions using variable interpolation, which works fine. The problem is, certain backslash characters are being interpolated, too, in particular a dot match "\." is interpolated to ".", which isn't the same at all in a re. I would like to find a way to retain variable interpolation but turn off character interpolation, if that is possible.

A couple of extra-difficulty points:

The re's are values in a hash, which makes using concatenation not only messy, but impractical, too.

The depth of nesting of sub-expressions is variable, so that multiple slashes would be very difficult to maintain.

If the answer to all this is simple and obvious, I promise to eat several pages of the "Perl Cookbook".

OK, as requested, here is a contrived example I think illustrates the situation:

my $name  = '(\w[\w-]*\w?)';        # any valid hostname segment
my $nname = "(($name\.)*$name)";    # one or more "name" segments
my %domainnames = (
    "$nname\.com" => 'a commercial domain',
    "$nname\.edu" => 'an educational institution',
);

I just made this up, and it might not work as is, but I think it illustrates the point. The re does, indeed, match one or more legal domain name segments followed by the respective TLD. Unfortunately, instead of matching the dots, it matches any single character, because the "\." needed by the re is interpolated by the string operator "" along with the variables $name and $nname. So, if only the interpolation of the characters could be prevented while allowing the interpolation of the variables, this would work, probably.
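
The effect described above can be reproduced in isolation. The following is a minimal sketch (the variable names $single and $double and the test strings are made up for illustration): a double-quoted string consumes the backslash before a dot, a single-quoted string keeps it, and the two resulting patterns behave differently.

```perl
use strict;
use warnings;

# Single quotes keep the backslash; double quotes drop it before a dot.
my $single = '\.';    # two characters: backslash, dot
my $double = "\.";    # one character: dot

print length($single), " vs ", length($double), "\n";

# As regexes, the first matches only a literal dot, the second any character.
for my $text ('a.b', 'axb') {
    printf "%s: single-quoted %s, double-quoted %s\n",
        $text,
        ($text =~ /a${single}b/ ? 'matches' : 'fails'),
        ($text =~ /a${double}b/ ? 'matches' : 'fails');
}
```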

Thanks

Replies are listed 'Best First'.
Re: variable interpolation sans character interpolation
by ikegami (Patriarch) on Oct 27, 2005 at 23:41 UTC

    I'm not sure what you want. It's one of the following two.

    1) Maybe you're looking for qr/.../.

    $re = "\.";
    foreach (qw( . a )) {
        print("$_ ", /$re/ ? "matches" : "doesn't match", " $re\n");
    }

    $re = qr/\./;
    foreach (qw( . a )) {
        print("$_ ", /$re/ ? "matches" : "doesn't match", " $re\n");
    }

    outputs

    . matches .
    a matches .
    . matches (?-xism:\.)
    a doesn't match (?-xism:\.)

    2) Maybe you're looking for \Q..\E (or quotemeta).

    $var = '...';
    $re = qr/abc\Q$var\Eghi/;
    foreach (qw( abcdefghi abc...ghi )) {
        print("$_ ", /$re/ ? "matches" : "doesn't match", "\n");
    }

    outputs

    abcdefghi doesn't match
    abc...ghi matches

Re: variable interpolation sans character interpolation
by Nkuvu (Priest) on Oct 27, 2005 at 23:44 UTC
    First thought is the quotemeta function (or \Q \E in the regex). But this won't work if applied multiple times. Consider:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $string = '\.foo';
    print $string, "\n";
    for (1..3) {
        $string = quotemeta $string;
        print $string, "\n";
    }
    __END__
    # prints:
    \.foo
    \\\.foo
    \\\\\\\.foo
    \\\\\\\\\\\\\\\.foo

    But if you're just concatenating, you can do something like:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $string = '\.foo';
    print $string, "\n";
    for (1..3) {
        # Note the concatenation instead of re-quotemeta-ing
        $string .= quotemeta "*[$_]";
        print $string, "\n";
    }
    __END__
    # prints:
    \.foo
    \.foo\*\[1\]
    \.foo\*\[1\]\*\[2\]
    \.foo\*\[1\]\*\[2\]\*\[3\]

    This help? If not, perhaps some code examples would be beneficial.
Re: variable interpolation sans character interpolation
by ikegami (Patriarch) on Oct 28, 2005 at 01:28 UTC

    In response to your updated question,

    my $name  = '(\w[\w-]*\w?)';         # any valid hostname segment
    my $nname = "(($name\\.)*$name)";    # one or more "name" segments
    my %domainnames = (
        "$nname\\.com" => 'a commercial domain',
        "$nname\\.edu" => 'an educational institution',
    );

    is a solution. When writing the pattern as a double-quoted string literal, you need to escape $ (unless you want interpolation), @ (unless you want interpolation), " and \ by preceding them with \, so the regexp $nname\.com becomes "$nname\\.com" as a string literal.
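
As a quick check of the escaping rule above, here is a small sketch that builds the pattern with doubled backslashes and confirms the dot is now literal. The $com variable, the ^...$ anchoring and the sample hostnames are assumptions added for illustration.

```perl
use strict;
use warnings;

my $name  = '(\w[\w-]*\w?)';         # single quotes: backslashes kept as typed
my $nname = "(($name\\.)*$name)";    # "\\." yields the two characters \.
my $com   = "^$nname\\.com\$";       # anchored; \$ puts a literal $ in the string

print "$com\n";

for my $host ('www.example.com', 'wwwXexampleXcom') {
    print "$host: ", ($host =~ /$com/ ? "matches" : "doesn't match"), "\n";
}
```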

    But it's simpler to use qr/.../ (like I suggested in my earlier post):

    my $name  = qr/(\w[\w-]*\w?)/;        # any valid hostname segment
    my $nname = qr/(($name\.)*$name)/;    # one or more "name" segments
    my %domainnames = (
        qr/$nname\.com/ => 'a commercial domain',
        qr/$nname\.edu/ => 'an educational institution',
    );

    However, it's a big waste to stringify a compiled regexp, so you might want to alter your structure for a major speed boost. I doubt you actually need a hash, so here's a solution using an array:

    my $name  = qr/(\w[\w-]*\w?)/;        # any valid hostname segment
    my $nname = qr/(($name\.)*$name)/;    # one or more "name" segments
    my @domainnames = (
        { re => qr/$nname\.com/, desc => 'a commercial domain' },
        { re => qr/$nname\.edu/, desc => 'an educational institution' },
    );
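
A sketch of how such an array might be used. The classify helper, the ^...$ anchoring and the sample hostnames are assumptions added for illustration; they are not part of the reply above.

```perl
use strict;
use warnings;

my $name  = qr/(\w[\w-]*\w?)/;        # any valid hostname segment
my $nname = qr/(($name\.)*$name)/;    # one or more "name" segments

my @domainnames = (
    { re => qr/^$nname\.com$/, desc => 'a commercial domain' },
    { re => qr/^$nname\.edu$/, desc => 'an educational institution' },
);

# Hypothetical helper: return the description of the first matching entry.
sub classify {
    my ($host) = @_;
    for my $entry (@domainnames) {
        return $entry->{desc} if $host =~ $entry->{re};
    }
    return 'unknown';
}

print "$_: ", classify($_), "\n" for qw( www.example.com cs.example.edu example.org );
```

Because each entry holds the compiled regex itself, nothing is stringified and recompiled on lookup, and further fields can be added to each hash reference later.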

    If you really do need a hash:

      Thanks very much, I think this is just what I needed. Just to make certain I understand what is going on here, let me see if I can explain it to myself.

      The qr// operator creates a compiled re from the string it is given, and this compiled re can be passed in a variable. However, because it is no longer a string (it is a special variable type??) any backslashes it originally contained are no longer subject to interpolation even when a string which contains it as a variable is itself contained in another string, and so on. Is that about right?

      I also think you are right about using an array instead of a hash, particularly because I may want to add more fields to it.

      Thanks again.

        Sounds right. \ is interpreted at compile time in both cases. "..." returns a string with the slashes removed, and qr/.../ returns a compiled regexp (that stringifies to the slashed form, such as when used as a hash key).
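
The point about stringification can be checked directly. In this sketch (all variable names are illustrative), a qr// value reports itself as a Regexp reference, its string form still contains the backslash, and a hash key made from it is a plain string that can be interpolated back into a match. The exact text of the stringified form, such as the (?-xism:...) wrapper shown earlier in the thread, varies between Perl versions, so the code avoids depending on it.

```perl
use strict;
use warnings;

my $dot    = qr/\./;         # compiled regex; the backslash is part of the pattern
my $nested = qr/a${dot}b/;   # interpolating it into another regex keeps the \.

print ref($dot), "\n";       # a qr// value is a reference blessed into "Regexp"
print "$nested\n";           # string form still shows \.

# Used as a hash key, the regex is stringified; the key is an ordinary string.
my %desc = ( $nested => 'a, literal dot, b' );
my ($key) = keys %desc;
print ref($key) ? "key is a reference\n" : "key is a plain string\n";

# The string key still works as a pattern, but is recompiled when used.
for my $text ('a.b', 'axb') {
    print "$text: ", ($text =~ /^$key$/ ? "matches" : "doesn't match"), "\n";
}
```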
Re: variable interpolation sans character interpolation
by Tanktalus (Canon) on Oct 27, 2005 at 23:35 UTC

    Perhaps it will be easy to others, but right now I'm having a hard time wrapping my head around the question - perhaps a more concrete example exhibiting the problem could help me answer this.

    my $str1 = '???';
    my $str2 = '???';
    my $re = qr/$str1$str2/; # ...?