bliako has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monkees

I want to replace a string with another string which contains (lots of) @, [, ( etc. These have special meaning in Perl and I definitely don't want them to be interpolated.

I can use the q{} operator to quote these, but how can I use it inside the substitution, e.g. $xx =~ s/<%xpath%>/\Qq{//div[@id="abc"]}\E/e (which does not recognize q{})?

bw, bliako

Re: Elegant way to escape a string in a regex
by hippo (Archbishop) on Apr 14, 2025 at 21:24 UTC

    I would not use regex for replacing fixed strings like this but rather substr. That means that this problem of avoiding interpolation etc. just vanishes. As a bonus it should usually be faster too.


    🦛

      Good idea, it suits me fine and eliminates all problems, thanks.

        If you use fixed templates, maybe consider sprintf or - depending on the number of templates - just variable interpolation.

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        see Wikisyntax for the Monastery
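For a fixed template, the sprintf suggestion might look like the sketch below. The template text and variable names are made up for illustration; the only thing to watch is that any literal % in the template has to be written as %%.

```perl
use strict;
use warnings;

# Hypothetical fixed template: %s marks the slot to fill.
my $template = 'aaa %s bbb';

# The argument is a single-quoted string, so @id is never interpolated,
# and sprintf does not re-scan the inserted value for format codes.
my $result = sprintf $template, '//div[@id="abc"]';

print "$result\n";
```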

      The only way I can think of is to use index() to find the position of the string to replace:

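      my $template = 'aaa <%xx%> bbb';
      # 4-argument substr edits $template in place and returns the text it replaced.
      # Note: index() returns -1 if the placeholder is not found.
      my $repl = substr $template, index($template, '<%xx%>'), length('<%xx%>'), '//div[@id="abc"]';
      print $template; # aaa //div[@id="abc"] bbb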
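The one-liner above assumes the placeholder is present exactly once. A slightly more defensive sketch of the same index/substr idea follows; replace_literal is a hypothetical helper name, not something from the thread, and it replaces every occurrence while leaving the text alone when nothing matches.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every literal occurrence of $find in $text.
# No regex is involved, so nothing in $find or $repl is ever interpreted.
sub replace_literal {
    my ($text, $find, $repl) = @_;
    return $text if $find eq '';          # avoid looping forever on an empty needle

    my $pos = 0;
    while (($pos = index($text, $find, $pos)) >= 0) {
        substr($text, $pos, length $find, $repl);   # 4-arg substr edits in place
        $pos += length $repl;                       # continue after the inserted text
    }
    return $text;
}

my $result = replace_literal('aaa <%xx%> bbb <%xx%>', '<%xx%>', '//div[@id="abc"]');
print "$result\n";
```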
Re: Elegant way to escape a string in a regex
by ikegami (Patriarch) on Apr 14, 2025 at 18:25 UTC

    Problems:

    • The first thing that happens is that the end of the operator is found. The only characters to which the parser pays attention at this stage are «\» and the delimiter(s). This means the first «/» in your selector is taken as the end of the operator.
    • «\Qq{//div[@id="abc"]}\E» isn't valid Perl code, but it's expected to be when using the «e» modifier.


    It might be best to split out the replacement.

    my $repl = '//div[@id="abc"]';
    s/.../$repl/
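As a runnable sketch of splitting out the replacement (the template text and the $1 in the selector are illustrative assumptions): the variable is interpolated once, and whatever it contains is not scanned again, so @id and even $1 come through literally.

```perl
use strict;
use warnings;

# Hypothetical template; single quotes keep @id and $1 literal in $repl.
my $template = 'aaa <%xpath%> bbb';
my $repl     = '//div[@id="$1"]';

# $repl is interpolated into the replacement once; its contents are
# inserted as-is and are not interpolated a second time.
(my $result = $template) =~ s/<%xpath%>/$repl/;

print "$result\n";
```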

    The following would also work:

    s{...}{ q{//div[@id="abc"]} }e
    s{...}{ '//div[@id="abc"]' }e

    Changed the delimiter to allow «/» to be used unescaped, and removed «\Q\E».
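Filled in with a concrete pattern, the two «e» forms could be tried like this. The template string and variable names are assumptions for the sake of a runnable sketch.

```perl
use strict;
use warnings;

my $template = 'aaa <%xpath%> bbb';   # hypothetical input

# With /e the replacement part is Perl code, here a single-quoted literal.
# The braces of q{...} nest cleanly inside the s{...}{...} delimiters.
(my $with_q = $template) =~ s{<%xpath%>}{ q{//div[@id="abc"]} }e;

# Same idea with ordinary single quotes.
(my $with_quotes = $template) =~ s{<%xpath%>}{ '//div[@id="abc"]' }e;

print "$with_q\n$with_quotes\n";
```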


    If you change the replacement expression's delimiter to «'», it will act as a single-quoted string literal instead of a double-quoted string literal.

    s{...}'//div[@id="abc"]'

    Probably best to avoid this one because it's pretty obscure.
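For completeness, a small sketch of that single-quote delimiter form, with a made-up template and a $1 added to the selector to show that nothing in the replacement is interpolated:

```perl
use strict;
use warnings;

my $template = 'aaa <%xpath%> bbb';   # hypothetical input

# The pattern uses {} delimiters; the replacement uses '' delimiters,
# so it is treated like a single-quoted string: @id and $1 stay literal.
(my $result = $template) =~ s{<%xpath%>}'//div[@id="$1"]';

print "$result\n";
```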

      Thanks. I did not know the last one you gave (s{...}'//div[@id="abc"]'). I should not have used \Q\E then. I did use it in case there were things like $1 in the replacement string. But q{//div=[@id="$1"]} gives no problem and escapes that too.

      So, just to confirm: q{} is absolutely safe for providing a substitution string in which nothing will be interpreted by the regex (e.g. $1) or by Perl (e.g. @id). All contents of q{} will be literal, nothing interpreted.

        In single-quoted string literals (e.g. «q{}»), only «\» and the delimiter(s) (i.e. «{» and «}» when using «q{}») are significant. You may need to escape instances of the former, while instances of the latter need to be escaped or balanced. Finally, you must not escape anything except the aforementioned characters.
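These rules can be checked with a few throwaway literals. The strings below are invented examples, not taken from the thread.

```perl
use strict;
use warnings;

my $literal  = q{//div[@id="$1"]};         # $1 and @id stay exactly as typed
my $balanced = q{a{b}c};                   # balanced braces need no escaping
my $escaped  = q{only a closing \} brace}; # an unbalanced brace must be escaped
my $slash    = q{C:\\dir};                 # \\ yields a single backslash
my $kept     = q{a\nb};                    # \n is not an escape here: backslash, then n

print "$_\n" for $literal, $balanced, $escaped, $slash, $kept;
```

Escaping anything else, as in the last line, keeps the backslash in the string, which is why only «\» and the delimiters should be escaped.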