in reply to C strings, unescaping of

Since the two-character literal sequence '\v' (backslash, "v") seems to be the fly in the ointment, why not handle it as the sole exceptional case? The code is simpler, but the substitution still needs the /e modifier with a string eval in the replacement, and I have not benchmarked it to see whether it is actually faster; it always pays to be suspicious when /e or eval is involved.

>perl -wMstrict -le
"use Test::More 'no_plan';
 use Test::NoWarnings;
 ;;
 my $s = join '', map qq{$_\\$_$_\\$_\\$_$_}, qw(r n b a f t v);
 my $o = join '', 'x', map qq{\\${_}y\\$_\\${_}x}, qw(0 7 10 77 100 377);
 print qq{'$s'};
 print qq{'$o'};
 print '';
 ;;
 for ($s, $o) {
     s{ ( \\ (?: [0-7]{1,3} | [rnbaft])) | \\v }
      { $1 ? eval qq{qq{$1}} : qq{\013} }xmsge;
     }
 print raw_hex($s);
 print raw_hex($o);
 print '';
 ;;
 ok $s eq qq{r\rr\r\rrn\nn\n\nnb\bb\b\bba\aa\a\aaf\ff\f\fft\tt\t\ttv\013v\013\013v};
 ok $o eq qq{x\000y\000\000x\007y\007\007x\010y\010\010x\077y\077\077x\100y\100\100x\377y\377\377x};
 ;;
 sub raw_hex { return join ' ', unpack '(H2)*', $_[0]; }
 "
'r\rr\r\rrn\nn\n\nnb\bb\b\bba\aa\a\aaf\ff\f\fft\tt\t\ttv\vv\v\vv'
'x\0y\0\0x\7y\7\7x\10y\10\10x\77y\77\77x\100y\100\100x\377y\377\377x'

72 0d 72 0d 0d 72 6e 0a 6e 0a 0a 6e 62 08 62 08 08 62 61 07 61 07 07 61 66 0c 66 0c 0c 66 74 09 74 09 09 74 76 0b 76 0b 0b 76
78 00 79 00 00 78 07 79 07 07 78 08 79 08 08 78 3f 79 3f 3f 78 40 79 40 40 78 ff 79 ff ff 78

ok 1
ok 2
ok 3 - no warnings
1..3

Re^2: C strings, unescaping of
by Anonymous Monk on Oct 02, 2013 at 12:23 UTC
    Eval is avoidable:
    {
        my %T = (
            (map {chr() => chr} 0..0377),               # any other char maps to itself
            (map {sprintf("%o",$_) => chr} 0..07),      # 1-digit octal
            (map {sprintf("%02o",$_) => chr} 0..077),   # 2-digit octal
            (map {sprintf("%03o",$_) => chr} 0..0377),  # 3-digit octal
            (split //, "r\rn\nb\ba\af\ft\tv\013")       # named escapes
        );
        sub unescape { s/\\([0-7]{1,3}|.)/$T{$1}/g for @_ }
    }

    I am saddened no-one plays golf in this Monastery.
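For readers who prefer something less golfed, the same eval-free idea might be sketched as below. The names `%UNESCAPE`, `_decode` and `unescape_c` are made up for illustration and appear nowhere in the thread; unlike the `unescape` above, this version returns a new string instead of editing its arguments in place, and it makes no attempt to reject octal values above \377.

```perl
use strict;
use warnings;

# Named escapes: the character after the backslash => its replacement.
# "\013" is spelled out because Perl strings have no "\v" escape.
my %UNESCAPE = (
    r => "\r", n => "\n", b => "\b", a => "\a",
    f => "\f", t => "\t", v => "\013",
);

# Hypothetical helper: decode the text that followed one backslash.
sub _decode {
    my ($body) = @_;
    return chr(oct($body)) if $body =~ /\A[0-7]/;   # octal escape
    return exists $UNESCAPE{$body} ? $UNESCAPE{$body} : $body;
}

# Hypothetical name; returns the unescaped copy of its argument.
sub unescape_c {
    my ($str) = @_;
    $str =~ s{ \\ ( [0-7]{1,3} | . ) }{ _decode($1) }xmsge;
    return $str;
}

# Show the result as hex bytes, in the style of raw_hex() above.
print join(' ', unpack '(H2)*', unescape_c(q{a\tb\vc\101\"})), "\n";
```

Anything not in the table, such as \" or \\, falls through to the character itself, which is what the `chr() => chr` map achieves in the golfed version.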

Re^2: C strings, unescaping of
by Anonymous Monk on Oct 02, 2013 at 15:21 UTC
    Thanks! The eval makes it ~10 times slower. And don't forget \" \\.

    s{\\((v)|[0-7]{1,3}|.)}{$2 ? "\013" : eval qq{qq{$&}}}eg;
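A comparison of this kind can be reproduced with the core Benchmark module. The following is only a sketch: `via_eval`, `via_table`, `%NAMED` and the test input are invented for illustration, the eval variant uses `$1` rather than `$&` so that only the cost of eval is measured, and the ratio will vary with the perl version and the input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my %NAMED = (r => "\r", n => "\n", b => "\b", a => "\a",
             f => "\f", t => "\t", v => "\013");

# Variant 1: string eval for everything except \v.
sub via_eval {
    my ($s) = @_;
    $s =~ s{\\((v)|[0-7]{1,3}|.)}{$2 ? "\013" : eval qq{qq{\\$1}}}eg;
    return $s;
}

# Variant 2: table lookup, no eval.
sub via_table {
    my ($s) = @_;
    $s =~ s{\\([0-7]{1,3}|.)}{
        my $body = $1;
        $body =~ /\A[0-7]/      ? chr(oct($body))
      : exists $NAMED{$body}    ? $NAMED{$body}
      :                           $body
    }eg;
    return $s;
}

# Made-up input: a mix of named and octal escapes, repeated.
my $unit  = join '', map "x\\$_", qw(r n t v 0 7 101 377);
my $input = $unit x 200;

die "variants disagree" unless via_eval($input) eq via_table($input);

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    eval  => sub { via_eval($input) },
    table => sub { via_table($input) },
});
```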

      s{\\((v)|[0-7]{1,3}|.)}{$2 ? "\013" : eval qq{qq{$&}}}eg;

      Part of the enormous speed penalty may be attributable not only to eval but also to the use of the $& match variable (see "Variables related to regular expressions" in perlvar), which can put the brakes not just on an individual regex but on the execution of every regex in an application!
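Two common ways to get the matched text without touching `$&` are sketched below; the sub names are hypothetical. The first wraps the whole pattern in a capture group, the second uses the /p modifier and `${^MATCH}` documented in perlvar (available from Perl 5.10). As far as I know, perls from 5.20 on have largely removed the `$&` penalty anyway, so this matters mostly on older installations.

```perl
use strict;
use warnings;

# 1. Capture the whole escape sequence and use $1 instead of $&.
sub with_capture {
    my ($s) = @_;
    $s =~ s{(\\(?:[0-7]{1,3}|.))}{ eval qq{qq{$1}} }eg;
    return $s;
}

# 2. The /p flag makes ${^MATCH} available for this one regex only.
sub with_p_flag {
    my ($s) = @_;
    $s =~ s{\\(?:[0-7]{1,3}|.)}{ eval qq{qq{${^MATCH}}} }egp;
    return $s;
}

my $text = 'tab:\t bell:\7';
print join(' ', unpack '(H2)*', with_capture($text)), "\n";
print join(' ', unpack '(H2)*', with_p_flag($text)),  "\n";
```

Both still use eval, so they only address the `$&` part of the slowdown, not the eval part.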