in reply to Re: Transform ASCII into UniCode
in thread Transform ASCII into UniCode

> Be sure to only use it for validated strings, never a random user input!

Here is a generic routine to escape only selected meta-characters.

Escaping any / (or other delimiter) in the input should make it safe to apply

eval "\$target =~ tr/$charset/$boldset/";

use v5.12;
use warnings;
use Data::Dump qw/pp dd/;
use Test::More;

sub escape_metas {
    my ( $meta, $e ) = @_;
    $e //= '\\';            # default backslash
    my $ee = "\Q$e";        # don't mess my regex

    s[
        (?|
            $ee($ee)        # ignore double escapes
        |   $ee($meta)      # keep single escapes
        |   ($meta)         # escape meta
        )
     ]
     [$e$1]xgr;
}

my $e = '\\';    # escape code
my $m = '/';     # to be escaped

for ( "$m", "$e$e$m", "$e$e$e$e$m" ) {
    my $got = escape_metas( $m, $e );
    is( $got, "$e$_", "escaping $_ -> $got" );
}

for ( "$e$m", "$e$e$e$m" ) {
    my $got = escape_metas( $m, $e );
    is( $got, $_, "ignoring $_ eq $got" );
}

done_testing;

C:/Strawberry/perl/bin\perl.exe -w d:/tmp/pm/escapism.pl
ok 1 - escaping / -> \/
ok 2 - escaping \\/ -> \\\/
ok 3 - escaping \\\\/ -> \\\\\/
ok 4 - ignoring \/ eq \/
ok 5 - ignoring \\\/ eq \\\/
1..5
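To show how the escaped strings would then be fed to the eval'ed tr, here is a minimal sketch. The helper name escape_delim is hypothetical (not part of the original post); it uses the same three-way alternation as escape_metas but takes the string as an argument instead of reading $_. It is not a complete safety guarantee: for example, a charset ending in a single lone backslash is passed through unchanged and would then escape the closing delimiter.

```perl
use v5.14;
use warnings;

# Hypothetical helper: same alternation as escape_metas, but the string
# is passed explicitly instead of being taken from $_.
sub escape_delim {
    my ( $str, $delim, $esc ) = @_;
    $esc //= '\\';
    my $d = quotemeta $delim;
    my $e = quotemeta $esc;
    return $str =~ s{ (?| $e($e) | $e($d) | ($d) ) }{$esc$1}xgr;
}

my $target  = 'a/b/a';
my $charset = 'a/b';     # contains the tr delimiter
my $boldset = 'xyz';

my ( $from, $to ) = map { escape_delim( $_, '/' ) } $charset, $boldset;

# The generated code is:  $target =~ tr/a\/b/xyz/r
my $result = eval "\$target =~ tr/$from/$to/r";
die $@ if $@;

say $result;    # xyzyx
```

Without the escaping step, the generated code would be tr/a/b/xyz/, which no longer means what was intended.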

Please tell me if I missed a case; I tried to write it as generically as possible.

EDIT

More or better tests are welcome too. =)

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Re^3: Transform ASCII into UniCode (escape_metas)
by choroba (Cardinal) on Mar 23, 2021 at 17:24 UTC
    I'm probably too busy today to understand. We wanted to escape the strings so they can be used in a transliteration, right? Why not test it directly, then?
    sub use_it {
        my ($string, $search, $replace) = @_;
        my ($s, $r);
        $s = escape_metas('/', '\\') for $search;
        $r = escape_metas('/', '\\') for $replace;
        return eval "\$string =~ tr/$s/$r/r"
    }

    sub cheat {
        my ($string, $search, $replace) = @_;
        return eval "\$string =~ tr|\Q$search\E|\Q$replace\E|r"
    }

    sub simulate {
        my ($string, $search, $replace) = @_;
        my $result = $string;
        for my $i (0 .. length($search) - 1) {
            my $from = substr $search, $i, 1;
            my $to   = substr $replace, $i, 1;
            $result =~ s/\Q$from/$to/g;
        }
        return $result
    }

    for my $case (
        # String       search   replace  expect
        ['a/b'      => 'a/b',   'xyz',   'xyz'],
        ['a\\b'     => 'a\\b',  'xyz',   'xyz'],
        ['a/b'      => '\\/',   'xy',    'ayb'],
        ['a\\/b'    => '\\/',   'xy',    'axyb'],
        ['a/\\b'    => '\\/',   'xy',    'ayxb'],
        ['a\\\\b'   => '\\/',   'xy',    'axxb'],
        ['a\\\\/b'  => '\\/',   'xy',    'axxyb'],
    ) {
        is simulate(@$case), $case->[-1], 'simulate';
        is cheat(@$case), simulate(@$case), 'cheat';
        is use_it(@$case), simulate(@$case), 'use';
    }
    I'm not sure I got the "expect" right, but both "simulate" and "cheat" give the same results. "use", on the other hand, doesn't. I based it on your escape_metas - what did I do wrong?

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      > Why not test it directly, then?

      I wanted to have a generic method for escaping selected metas while keeping others as is.

      > what did I do wrong?

      It took me a moment to understand (well, guess) what's happening.

      Take case #2:

      a\\b is internally the 3-character string a\b, so the escape sequence \b is left untouched by escape_metas.

      but you do a string interpolation for

      eval "\$string =~ tr/$s/$r/r"

      so what's happening is

      DB<57> say "a\\b" =~ tr/a\b/xyz/r   # expected 'xyz'
      x\b
      DB<58>

      Actually I'm not sure what tr's interpretation of \b is here
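A small check that may help here, under the assumption (from perlop's description of tr) that the search list accepts the same escape sequences as a double-quoted string, in which case \b would be a backspace (0x08) rather than a backslash followed by b. The variable names are only for illustration.

```perl
use v5.14;
use warnings;

my $with_backspace = "a\x08";    # 'a' followed by a backspace character
my $with_backslash = "a\\b";     # 'a', a literal backslash, 'b'

# Assumption being tested: tr reads \b as backspace, so the set is ('a', BS).
my $r1 = $with_backspace =~ tr/a\b/xy/r;
my $r2 = $with_backslash =~ tr/a\b/xy/r;

say $r1 eq 'xy'   ? 'backspace was mapped'        : 'backspace was NOT mapped';
say $r2 eq 'x\\b' ? 'backslash and b left as is'  : 'backslash or b was changed';
```

If both lines report the first alternative, that would explain the x\b result: only the a is in the search list, and neither the backslash nor the b is.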

      The question is whether your expectation was right, because the literal code gives my result (?)

      C:\tmp\t_wperl>perl -E"say 'a\\b' =~ tr/a\b/xyz/r # expected 'xyz'"
      x\b

      C:\tmp\t_wperl>

      update

      In other words:

      Your expectation is to have a 1-to-1 mapping of characters.

      My expectation was to emulate tr as implemented and to catch injections.

      It's a question of definition.
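The two definitions can be told apart with a hyphen, which tr treats as a range operator unless it is escaped or sits at the start or end of the list. A minimal sketch (variable names are only illustrative):

```perl
use v5.14;
use warnings;

my $input = 'b-';

# tr as implemented: "a-c" is the range a, b, c
my $as_range = $input =~ tr/a-c/xyz/r;       # b -> y, '-' is not in the set

# 1-to-1 mapping: "\-" is a literal hyphen, so the set is 'a', '-', 'c'
my $as_literal = $input =~ tr/a\-c/xyz/r;    # '-' -> y, b is not in the set

say "range:   $as_range";      # y-
say "literal: $as_literal";    # by
```

An escaper that only protects the delimiter keeps the first behaviour, while the \Q...\E approach gives the second.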

      Cheers Rolf