in reply to Re: String::Escape
in thread efficient char escape sequence substitution

Try printing the value of $var; it will show something you don't expect. Even in single quotes, \\ still escapes the backslash and results in only one backslash.
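A quick way to see this for yourself. This is only a sketch: the string assigned to $var is made up for illustration and is not the OP's actual data.

```perl
use strict;
use warnings;

# In single quotes, \\ is still an escape sequence for ONE backslash.
my $var = 'a\\nb';

print "$var\n";            # shows a single backslash between a and n
print length($var), "\n";  # 4 characters: a, backslash, n, b
```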

Converting the hex strings can be done like this: s#(?<!\\)((?:\\{2})*)\\x([A-Fa-f0-9]{2})#$1 . chr(hex($2))#eg;

It uses a negative look-behind to check that there is no backslash before the double backslashes, then it matches any number of double backslashes (each one a backslash escaping a backslash), then the backslash and the x, and of course the two hex digits.
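Here is a small runnable sketch of that substitution. The sub name unescape_hex is hypothetical (it is not part of String::Escape), and this variant keeps all leading double backslashes in one capture group so they can be copied back unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
sub unescape_hex {
    my ($text) = @_;
    # (?<!\\)              no backslash directly before the match
    # ((?:\\{2})*)         any number of escaped backslashes, all kept in $1
    # \\x([A-Fa-f0-9]{2})  a live \x followed by two hex digits
    $text =~ s#(?<!\\)((?:\\{2})*)\\x([A-Fa-f0-9]{2})#$1 . chr(hex($2))#eg;
    return $text;
}

print unescape_hex('\x41'), "\n";      # the escape becomes the letter A
print unescape_hex('\\\\x41'), "\n";   # escaped backslash: left unchanged
```

In the second call the data is two backslashes followed by x41, so the backslash in front of the x is itself escaped and nothing is converted.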

If you want the correct result after that, you should replace every double backslash with a single backslash, or better said, remove the backslash before every symbol. Perhaps unprintable does that too.
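A minimal sketch of that cleanup step, assuming every remaining backslash is an escape for the character that follows it. The sample string is made up. Note that if the hex step produced a literal backslash (from \x5C), this second pass would treat it as an escape too, so take this as an illustration rather than a complete unescaper.

```perl
use strict;
use warnings;

my $text = 'C:\\\\temp \\$5';   # data: C:, two backslashes, temp, a space, backslash, $5
$text =~ s/\\(.)/$1/gs;          # drop each escaping backslash, keep what it escaped
print "$text\n";                 # one backslash before temp, none before $5
```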

Update: I decided to look at the source of String::Escape, and the code that matches the hex characters seems to be buggy: it matches \AF (basically m/\\[A-Fa-f0-9]/), but not the \xAF form with the x in front. It doesn't seem to have code for octal characters though, but you should be able to work that out from my regex above (which is why I leave it in this post). (I will send a mail to the author about this.)

Update2: Reply from the author:

Looks like you're right. I've gotten a few other suggestions, so I'll try to roll them into a new release soon.

Update3: I noticed that djohnston's regex had a flaw and I explained to him how to fix it. Looking back at the original regex (of the OP), I noticed that it has the same flaw, so here is the explanation (in readmore tags, of course).

The flaw is that if you use "\\\\n" as input (two backslashes = 1 data backslash), it will be replaced with a newline, even though the first backslash escapes the second backslash, not the n.
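The flaw is easy to reproduce. In this sketch s/\\n/\n/g is only a stand-in for the kind of regex under discussion, not a copy of the OP's or djohnston's code.

```perl
use strict;
use warnings;

my $data = "\\\\n";   # data: backslash, backslash, n (an escaped backslash, then a plain n)
(my $naive = $data) =~ s/\\n/\n/g;

# The second backslash and the n were wrongly merged into a newline.
printf "before: %d chars, after: %d chars\n", length($data), length($naive);
print "flaw reproduced\n" if $naive eq "\\\n";   # backslash followed by a newline
```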

(Note the use of the /x modifier (to make things a bit easier to look at))

The key to fixing it is to match the leading backslashes and to copy them into the replacement string.

You might now guess that something like this works:

s/( (?:\\\\)* ) \\ (n)/$1\n/gx;

But it does not. The regex engine will do its very best to find a match, so it may simply skip one of the leading backslashes.

For example, the data (two backslashes = 1 data backslash): qq(something  \\\\n) will match. Why? If the non-capturing group consumes \\\\, the rest of the regex fails, so the engine backtracks, gives up at that starting position, and retries one character later, where the second backslash and the n match on their own.
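You can check this behaviour directly. The variable names below are just for illustration.

```perl
use strict;
use warnings;

my $data = "something  \\\\n";   # data ends in: backslash, backslash, n
(my $out = $data) =~ s/( (?:\\\\)* ) \\ (n)/$1\n/gx;

# The match started at the SECOND backslash, with an empty $1,
# so the escaped backslash was split up and the n became a newline.
print "still broken\n" if $out eq "something  \\\n";
```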

The key to solving that problem is to ensure that there are no backslashes before (?:\\\\)*.

There are two ways to do that. You could use [^\\](?:\\\\)*, but that requires a character in front of the backslashes, so it goes wrong when the escape is the first thing in the string; that would make it a special case requiring more regex code.

The better fix is to use a negative look-behind, since that also succeeds when the escape is the first thing in the string. A negative look-behind is: (?<!\\)

This brings the final regex to: s/ (?<!\\) ( (?:\\ \\)* ) \\ n /$1\n/gx;
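A short sketch that wraps the final regex in a sub and tries it on the problem cases. The sub name unescape_newlines is hypothetical, and /x is needed because the pattern contains whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
sub unescape_newlines {
    my ($text) = @_;
    # Look-behind: no backslash before; $1: escaped backslash pairs; then a live \n.
    $text =~ s/ (?<!\\) ( (?:\\ \\)* ) \\ n /$1\n/gx;
    return $text;
}

print "plain escape converted\n"  if unescape_newlines('a\nb')     eq "a\nb";
print "escaped backslash kept\n"  if unescape_newlines("\\\\n")    eq "\\\\n";
print "pair kept, \\n converted\n" if unescape_newlines("\\\\\\n") eq "\\\\\n";
```

The three inputs are: a live \n, an escaped backslash followed by a plain n, and an escaped backslash followed by a live \n.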