in reply to Regex - Is there any way to control when the contents of a variable are interpolated? (Using "$1" and '$1' in regex replacements)

I share some confusion about what you want, but perhaps this is a closer approximation to it:

c:\@Work\Perl>perl -wMstrict -le
"my %xlate_map = qw/ ( l_paren  ) r_paren  { l_curly  } r_curly /;
 my $xlate_targets = join '', map quotemeta, keys %xlate_map;
 ;;
 sub xlate {
   my ($s, $targets, $hr_map) = @_;
   ;;
   (my $t = $s) =~ s{ ([$targets]) }{$hr_map->{$1}}xmsg;
   return $t;
   }
 ;;
 my $str = 'a { b } c ( d ) e';
 my $xlt = xlate($str, $xlate_targets, \%xlate_map);
 print qq{'$xlt'};
 "
'a l_curly b r_curly c l_paren d r_paren e'

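Update 1: This can be generalized (and slowed down) a bit by building the target character class inside the subroutine on every call. If you have Perl version 5.14+, it can also be simplified (and perhaps un-slowed) a bit: note the /r modifier in the s///r substitution, which returns the modified copy and leaves the original string untouched: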

c:\@Work\Perl>perl -wMstrict -le
"use 5.014;
 ;;
 my %xlate_map = qw/ ( l_paren  ) r_paren  { l_curly  } r_curly /;
 ;;
 sub xlate {
   my ($s, $hr_map) = @_;
   ;;
   my $targets = join '', map quotemeta, keys %$hr_map;
   return $s =~ s{ ([$targets]) }{$hr_map->{$1}}xmsgr;
   }
 ;;
 my $str = 'a { b } c ( d ) e {{F}} ((G))';
 my $xlt = xlate($str, \%xlate_map);
 print qq{'$xlt'};
 "
'a l_curly b r_curly c l_paren d r_paren e l_curlyl_curlyFr_curlyr_curly l_parenl_parenGr_parenr_paren'

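Update 2: Another approach: complete encapsulation of the map and the targets in a BEGIN block. This works with any version and is perhaps a bit faster (note the /o modifier in the s///o substitution, which asks Perl to compile the interpolated pattern only once):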

c:\@Work\Perl>perl -wMstrict -le
"print qq{Perl version $] };
 ;;
 BEGIN {
   my %xlate_map = qw/ ( l_paren  ) r_paren  { l_curly  } r_curly /;
   my $xlate_targets = join '', map quotemeta, keys %xlate_map;
   ;;
   sub xlate {
     my ($s) = @_;
     ;;
     (my $t = $s) =~ s{ ([$xlate_targets]) }{$xlate_map{$1}}xmsog;
     return $t;
     }
   }
 ;;
 my $str = 'a { b } c ( d ) e {{F}} ((G))';
 my $xlt = xlate($str);
 print qq{'$xlt'};
 "
Perl version 5.008009
'a l_curly b r_curly c l_paren d r_paren e l_curlyl_curlyFr_curlyr_curly l_parenl_parenGr_parenr_paren'

Any of these approaches can be further simplified to operate directly upon  $_ if that's what you really need.
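As a rough sketch of what operating directly upon $_ might look like (the name xlate_in_place and the sample strings are made up for illustration; the map is the same one used above):

```perl
use strict;
use warnings;

# Same illustrative map as in the one-liners above.
my %xlate_map     = qw/ ( l_paren ) r_paren { l_curly } r_curly /;
my $xlate_targets = join '', map quotemeta, keys %xlate_map;

# Hypothetical helper: translates $_ in place and returns whatever
# s///g returns, i.e. the number of substitutions made (false if none).
sub xlate_in_place {
    return s{ ([$xlate_targets]) }{$xlate_map{$1}}xmsg;
}

my @lines = ('a { b } c ( d ) e', '{{F}} ((G))');
for (@lines) {                      # for aliases $_ to each element
    my $n = xlate_in_place();
    print "$n substitutions: '$_'\n";
}
```

Because for aliases $_ to each array element, the elements of @lines themselves are changed. Looping over a literal string instead would fail, since a literal is read-only.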

Re^2: Regex - Is there any way to control when the contents of a variable are interpolated? (Using "$1" and '$1' in regex replacements)
by JDoolin (Novice) on Mar 15, 2014 at 05:33 UTC

    Desired behavior:

    'a { b } c ( d ) e {{F}} ((G))'; should change to

    'a lbracket b rbracket c ( d ) e { lbracket F rbracket } ((G))';

    Using something like print s/\{([^\{^\}]+)\}/ lbracket $1 rbracket /g . " pairs of brackets replaced with lbracket rbracket.\n"

    works, but when I got the idea in my head that I wanted a subroutine to handle this for me (for legibility and debugging), I couldn't let it go.

      It seems you have your solution. However, I see you're still including the  '^' character in your inverted character class; is this what you want? See example below.

      c:\@Work\Perl>perl -wMstrict -le
      "$_ = 'a { b } c { d^ } e {{F}} {{^G}}';
       ;;
       s/\{([^{^}]+)\}/ lbracket $1 rbracket /g;
       print qq{'$_'};
       "
      'a lbracket b rbracket c { d^ } e { lbracket F rbracket } {{^G}}'

        Based on the reactions, I'm removing my personal anecdotes and conjecture from the post, and sticking with my original question.

        Here is the code that I have working.

        #!/usr/bin/perl
        $_='{\selectlanguage{english} \textcolor{black}{\ \ 10.\ \ Three resistors connected in series each carry currents labeled }\textit{\textcolor{black}{I}}\textcolor{black}{\textsubscript{1}}\textcolor{black}{, }\textit{\textcolor{black}{I}}\textcolor{black}{\textsubscript{2}}\textcolor{black}{and}\textit{\textcolor{black}{I}}\textcolor{black}{\textsubscript{3}}\textcolor{black}{. Which of the following expresses the value of the total current }\textit{\textcolor{black}{I}}\textit{\textcolor{black}{\textsubscript{T}}}\textcolor{black}{in the system made up of the three resistors in series?}}.';
        $nobrackets = qr/[^\{}]+/;
        my $pass = 0;
        while(++$pass <=2){
            s/\\textsuperscript\{($nobrackets)\}/ startsuperscript $1 endsuperscript /g;
            s/\\textsubscript\{($nobrackets)\}/ startsubscript $1 endsubscript/g;
            s/\\textit\{($nobrackets)\}/ startitalic $1 enditalic/g;
            s/\\textcolor\{$nobrackets\}//g;
            s/\{($nobrackets)\}/($1)/g;
            print "Pass $pass:\n\n". qq{$_}."\n\n\n";
        }
        This produces output as follows:
        Pass 1:

        {\selectlanguage(english) (\ \ 10.\ \ Three resistors connected in series each carry currents labeled )\textit{(I)}( startsubscript 1 endsubscript)(, )\textit{(I)}( startsubscript 2 endsubscript)(and)\textit{(I)}( startsubscript 3 endsubscript)(. Which of the following expresses the value of the total current )\textit{(I)}\textit{( startsubscript T endsubscript)}(in the system made up of the three resistors in series?)}.

        Pass 2:

        (\selectlanguage(english) (\ \ 10.\ \ Three resistors connected in series each carry currents labeled ) startitalic (I) enditalic( startsubscript 1 endsubscript)(, ) startitalic (I) enditalic( startsubscript 2 endsubscript)(and) startitalic (I) enditalic( startsubscript 3 endsubscript)(. Which of the following expresses the value of the total current ) startitalic (I) enditalic startitalic ( startsubscript T endsubscript) enditalic(in the system made up of the three resistors in series?)).
        Notice that pass 1 removes the inner curly brackets and pass 2 removes the outer ones; additional passes could remove more levels of curly brackets if necessary. What I want(ed) to do was turn these s///g or s///eeg statements into subroutines while keeping the capture pattern and the replacement text as separate variables. The code works fine as is, but I'm still curious whether those variables could be passed to a subroutine.
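One possible way to do this, offered only as a sketch, is to pass the pattern as a compiled qr// regex and the replacement as a code reference. The subroutine then calls the code ref under /e for each match, so $1 is read at substitution time and not when the arguments are built. The names replace_all and @rules, and the shortened input string, are made up for illustration, and the helper assumes at most one capture group per pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of $text with every match of
# $pattern replaced by whatever $replacement->($1) returns.
sub replace_all {
    my ($text, $pattern, $replacement) = @_;
    # /e runs the replacement as code for each match, so $1 holds the
    # capture from that match. It is undef if the pattern has no group.
    (my $result = $text) =~ s{$pattern}{ $replacement->($1) }ge;
    return $result;
}

my $nobrackets = qr/[^{}]+/;

# Each rule pairs a pattern with a code ref that builds the replacement.
my @rules = (
    [ qr/\\textsubscript\{($nobrackets)\}/, sub { " startsubscript $_[0] endsubscript" } ],
    [ qr/\\textit\{($nobrackets)\}/,        sub { " startitalic $_[0] enditalic" } ],
    [ qr/\\textcolor\{$nobrackets\}/,       sub { '' } ],
    [ qr/\{($nobrackets)\}/,                sub { "($_[0])" } ],
);

# Shortened sample input in the style of the LaTeX fragment above.
my $str = '\textit{\textcolor{black}{I}}\textcolor{black}{\textsubscript{1}}';

for my $pass (1 .. 2) {
    for my $rule (@rules) {
        my ($pattern, $replacement) = @$rule;
        $str = replace_all($str, $pattern, $replacement);
    }
    print "Pass $pass: '$str'\n";
}
```

A closure avoids the "$1" versus '$1' quoting problem and the double evaluation of s///ee, because the replacement is ordinary code and never a string to be re-interpolated.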