Here's another approach I thought of: an iterator generator. It's a bit simpler, maybe a bit faster because it does not pass/return strings by value, but by reference. Also runs under 5.8.9. Again, this is not a 'true' module as it exports nothing. You could easily fix this if you want, but the module is simple enough to operate as-is IMHO.
File SML2.pm:
# Strange Markup Language processor 2 - iterator 17mar14waw # Strange Markup Language (anyway, strange to me) processor 2. # iterator generator version. # refer to perlmonks node 1078356. package SML2; use warnings FATAL => 'all'; use strict; sub Iterator (&) { return $_[0]; } # syntactic sugar per mjd my $nobrackets = qr{ [^{}]+ }xms; # maybe use * quantifier?? use constant XLT => ( # basic translation mapping # captured tagged substrings represented by $ below # tag translates as '\textsuperscript' => ' startsuperscript $ endsuperscript ', '\textsubscript' => ' startsubscript $ endsubscript' , '\textit' => ' startitalic $ enditalic' , '\textcolor' => '' , '' => '($)' , ); my %xlate1 = XLT; # tag will be in separate scalar. # tagged string will be returned by anon. subroutine. # convert strings (and placeholders) of %xlate1 to anon. subs. for (values %xlate1) { # convert: # placeholder to function parameter; s{ \$ }'$_[0]'xms; # value to anon. sub returning string w/interpolated param $_ = eval qq{ sub { \\ qq{$_} } }; } sub tag_converter { # works -- iterator generator my (@tags, # tags to process in order of processing ) = @_; # verify existence of tags before making iterator. exists $xlate1{$_} or die qq{unknown tag '$_'} for @tags; return Iterator { my $n = 0; # total substitutions done for all tags for my $tag (@tags) { # processes tags in order $n += $_[0] =~ s{ \Q$tag\E \{ ($nobrackets) \} } {${ $xlate1{$tag}->($1) }}xmsg; } return $n; } } 1;
File SML2.t:
use warnings; use strict; use Test::More # tests => ?? + 1 # Test::NoWarnings adds 1 test 'no_plan' ; use Test::NoWarnings; BEGIN { use_ok 'SML2'; } use constant RAW_SML => <<'EOT'; {\selectlanguage{english} \textcolor{black}{\ \ 10.\ \ Three resistors connected in series each carry currents labeled }\textit{\textcolor{black}{I}}\textcolor{black} +{\textsubscript{1}}\textcolor{black}{, }\textit{\textcolor{black}{I}}\textcolor{black}{\textsubscript{2}}\tex +tcolor{black}{and}\textit{\textcolor{black}{I}}\textcolor{black}{\tex +tsubscript{3}}\textcolor{black}{. Which of the following expresses the value of the total current }\textit{\textcolor{black}{I}}\textit{\textcolor{black}{\textsubscript +{T}}}\textcolor{black}{in the system made up of the three resistors i +n series?}}. EOT use constant PASS_1 => <<'EOT'; {\selectlanguage(english) (\ \ 10.\ \ Three resistors connected in series each carry currents labeled )\textit{(I)}( startsubscript 1 endsubscript)(, )\textit{(I)}( startsubscript 2 endsubscript)(and)\textit{(I)}( starts +ubscript 3 endsubscript)(. Which of the following expresses the value of the total current )\textit{(I)}\textit{( startsubscript T endsubscript)}(in the system m +ade up of the three resistors in series?)}. EOT use constant PASS_2 => <<'EOT'; (\selectlanguage(english) (\ \ 10.\ \ Three resistors connected in series each carry currents labeled ) startitalic (I) enditalic( startsubscript 1 e +ndsubscript)(, ) startitalic (I) enditalic( startsubscript 2 endsubscript)(and) start +italic (I) enditalic( startsubscript 3 endsubscript)(. Which of the following expresses the value of the total current ) startitalic (I) enditalic startitalic ( startsubscript T endsubscrip +t) enditalic(in the system made up of the three resistors in series?) +). EOT note "\n===== JDoolin code =====\n\n"; sub JDoolin_process { my $nobrackets = qr/[^\{}]+/; s/\\textsuperscript\{($nobrackets)\}/ startsuperscript $1 endsuper +script /g; s/\\textsubscript\{($nobrackets)\}/ startsubscript $1 endsubscript +/g; s/\\textit\{($nobrackets)\}/ startitalic $1 enditalic/g; s/\\textcolor\{$nobrackets\}//g; s/\{($nobrackets)\}/($1)/g; } local $_ = RAW_SML; # JDoolin_process() runs against $_ JDoolin_process(); ok $_ eq PASS_1, qq{pass 1}; JDoolin_process(); ok $_ eq PASS_2, qq{pass 2}; note 'try 3rd pass against JDoolin: any change from 2nd pass?'; JDoolin_process(); ok $_ eq PASS_2, qq{3rd pass: no change from pass 2}; note "\n===== SML2 code =====\n\n"; use constant TAG_PROCESS_ORDER => ( qw( \textsuperscript \textsubscript \textit \textcolor ), '' ); FUNT: # functions under test: fully qualified functions from module for my $funt (map qq{SML2::$_}, qw( tag_converter )) { note "\n----- $funt() -----\n\n"; *converter = do { no strict 'refs'; *$funt; }; my $pass = converter(TAG_PROCESS_ORDER); my $text; $text = RAW_SML; ok $pass->($text) > 0, qq{1st pass, processing was done}; ok $text eq PASS_1, qq{1st pass, raw -> pass1}; ok $pass->($text) > 0, qq{2nd pass, processing was done}; ok $text eq PASS_2, qq{2nd pass, pass1 -> pass2}; ok $pass->($text) == 0, qq{3rd pass, no processing done}; ok $text eq PASS_2, qq{3rd pass, no change in text}; ok $pass->($text) == 0, qq{4th pass, no processing done}; ok $text eq PASS_2, qq{4th pass, no change in text}; note 'reset text to raw string for further processing'; $text = RAW_SML; 1 while $pass->($text); ok $text eq PASS_2, qq{run passes to exhaustion, raw -> pass2}; ok $pass->($text) == 0, qq{passes exhausted, nothing to do}; ok $text eq PASS_2, qq{passes exhausted, no change in text}; note 'JDoolin code and iterator code head-to-head'; local $_ = RAW_SML; $text = RAW_SML; JDoolin_process(); $pass->($text); ok $text eq $_, qq{1st pass: JDoolin eq iterator}; JDoolin_process(); $pass->($text); ok $text eq $_, qq{2nd pass: JDoolin eq iterator}; my $t = $text; JDoolin_process(); $pass->($text); ok $text eq $t && $_ eq $t, qq{3rd pass: no further change}; note 'degenerate cases'; $pass = converter(); $text = RAW_SML; ok $pass->($text) == 0, qq{degenerate: no tags to process, no ops} +; ok $text eq RAW_SML, qq{degenerate: no tags, no change in strin +g}; $pass = converter(TAG_PROCESS_ORDER); $text = ''; ok $pass->($text) == 0, qq{degenerate: empty string to process, no + ops}; ok $text eq '', qq{degenerate: empty string, no change in +string}; } # end for FUNT
In reply to Re: Regex - Is there any way to control when the contents of a variable are interpolated? (Using "$1" and '$1' in regex replacements)
by AnomalousMonk
in thread Regex - Is there any way to control when the contents of a variable are interpolated? (Using "$1" and '$1' in regex replacements)
by JDoolin
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |