in reply to Octal Weirdness

The only reason that:

$in_msg =~ s/$config{'startblock'}//g;
$in_msg =~ s/$config{'endblock'}//g;
works is because your strings go through an extra interpolation phase when used in a regex.
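
As a rough illustration of that extra phase (the variable names and sample message are made up for this sketch), compare a plain string search with the same value interpolated into a pattern:

```perl
use strict;
use warnings;

my $block = '\013';               # four characters, as if read from a file
my $msg   = "a" . chr(11) . "b";  # contains one real chr(11)

# A plain string search looks for the four literal characters and fails.
my $found_literal = index($msg, $block) >= 0;

# Interpolated into a pattern, the text \013 is parsed by the regex
# compiler as an octal escape, so it matches chr(11).
(my $stripped = $msg) =~ s/$block//g;

print "literal match: ", ($found_literal ? "yes" : "no"), "\n";
print "after s///: $stripped\n";
```

The substitution removes the control character even though the string itself never contained it.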

When you say that your variables end up getting set like:

$config{'startblock'} = "\013"; $config{'endblock'} = "\034";
you are wrong. In Perl source, "\013" gives you a single character. Your values are read from a file, though, so no escape processing happens and each value ends up being four characters, just as '\013' or "\\013" would give you.
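
A small check makes the difference visible. This is only a sketch: the single-quoted string stands in for a value read from the config file, and both variable names are invented here.

```perl
use strict;
use warnings;

# An escape in Perl source is processed at compile time: one character, chr(11).
my $from_source = "\013";

# Single quotes stand in for text read from a file: no escape processing,
# so the value is the four characters \ 0 1 3.
my $from_file = '\013';

printf "source: %d char, file: %d chars\n",
    length $from_source, length $from_file;
```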

If you want \ to mean something special when used in your config file, then you'll have to add code to provide that special meaning. For example:

s#\\(0[0-7]*)#pack("C",oct($1))#ge
could be applied to such values to parse \0 octal escapes.
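
For instance, assuming the values have already been read into %config (the hash contents below are hypothetical stand-ins for the file data), the substitution might be applied like this:

```perl
use strict;
use warnings;

# Hypothetical config values, as they would arrive from a file
# (each holds a literal backslash followed by digits).
my %config = (startblock => '\013', endblock => '\034');

for my $key (keys %config) {
    # Replace each \0... octal escape with the single byte it names.
    $config{$key} =~ s#\\(0[0-7]*)#pack("C", oct($1))#ge;
}

printf "%s = chr(%d)\n", $_, ord $config{$_} for sort keys %config;
```

After the loop each value is a single character, so it can be used in ordinary string operations as well as in a regex.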

Updated as if via s/octal/oct/ to correct the bug noted by japhy. Thanks, japhy.

        - tye (but my friends call me "Tye")

Re: (tye)Re: Octal Weirdness
by japhy (Canon) on Dec 20, 2000 at 05:14 UTC
Re: (tye)Re: Octal Weirdness
by repson (Chaplain) on Dec 20, 2000 at 10:00 UTC
    Instead of writing your own regex to parse escapes, wouldn't it be more sensible to let someone enter any of a set of already-known escapes, such as \xA, \c[, or \033? Hmm, those are all supported by Perl and are already interpreted in any qq() or regex.
    $config{startblock} = qq $config{startblock};
    Or something like that (maybe one which actually works).

    Update: I was looking for
    $config{startblock} = eval "qq($config{startblock})";
    Update 2: Nope, that uses eval, as tye points out below. I forgot about that when I first posted; I thought there was another way using just qq() and not eval, which has all the drawbacks tye mentions.

      Cool! Now if I can get to your config file I can write:

      startblock=@[{system('rm -rf /')}]
      midblock=);system("rm -rf /")
      endblock=$x{system('rm -rf /')}
      You can argue that this is a feature or not. Personally, I do occasionally make config files that are written in Perl and so have this risk associated with them. But if the config file isn't written in Perl, then I define the format and don't allow arbitrary Perl to sneak in. I think that fits the principle of least surprise: if the config file doesn't look like Perl code, then don't allow Perl code in it.

      It would be nice if there were a very simple and efficient way to get Perl to parse all \ escapes without also doing dangerous variable interpolations. You could try to find or write a module to do this and then try to keep it updated so it stays in sync with what Perl does.

      You can't use the same code that Perl uses to do this because it is all muddled up with the lexer so that it can translate "hi\U\l$x ok" into "hi".lcfirst(uc($x))." OK".
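
      A hand-rolled parser along those lines could look roughly like the following. It is only a sketch: unescape and %simple are hypothetical names, not an existing module, and it covers a small subset of Perl's escapes (octal, \x, \c, and a few single-letter ones) without ever evaluating the input as Perl.

```perl
use strict;
use warnings;

# Hypothetical helper, not a real module: handles a small, fixed set of
# backslash escapes and never evaluates its input as Perl.
my %simple = (n => "\n", t => "\t", r => "\r", e => "\e", '\\' => '\\');

sub unescape {
    my ($str) = @_;
    $str =~ s{
        \\ (?: ([0-7]{1,3})          # octal digits
             | x ([0-9A-Fa-f]{1,2})  # hex digits
             | c (.)                 # control character
             | (.)                   # single-letter escapes, or any other char
           )
    }{
          defined $1 ? chr(oct $1)
        : defined $2 ? chr(hex $2)
        : defined $3 ? chr(ord(uc $3) ^ 64)
        : exists $simple{$4} ? $simple{$4} : $4
    }gesx;
    return $str;
}

# '$' and '@' pass through untouched because nothing is interpolated.
print unescape('\033') eq "\e" ? "ok\n" : "not ok\n";
print unescape('$x @y'), "\n";
```

      The cost is exactly the one described above: any escape Perl supports that this table lacks has to be added by hand.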

      You can also try to use eval for this but try to protect '$' and '@' from interpolation:

      $str = $config{startblock};
      $str =~ s#(\\*)([\$\@])#$1 . "\\" x (1 - (1 & length $1)) . $2#ge;
      $str = eval "qq\@$str\@";
      which doesn't look too bad.

              - tye (but my friends call me "Tye")