in reply to C style strings

G'day harangzsolt33,

You mentioned that you'd used a Wikipedia description, but gave no details. The following is based on "Wikipedia: Escape sequences in C: Table of escape sequences" and "Wikipedia: Digraphs and trigraphs: C".

In many cases, no conversion is required: Perl & C use the same escapes (\n, \t, and so on). For the remainder, the following code should, I believe, cover all cases.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    my $c_str = q{BEL \a BS \b ESC \e FF \f NL \n CR \r TAB \t};
    $c_str .= q{ VT \v BSLASH \\\\ APOS \' QUOT \" HASH \43};
    $c_str .= q{ DOLLAR \x24 AT \u0040 TILDE \U0000007e};
    $c_str .= q{ TGs \?= \?/ \?' \?( \?) \?! \?< \?> \?-};

    print "\$c_str[$c_str]\n";

    my $p_str = c2p($c_str);

    print "\$p_str[$p_str]\n";

    {
        my %trigraph;
        BEGIN {
            no warnings 'qw';
            %trigraph = qw{= # / \ ' ^ ( [ ) ] ! | < { > } - ~};
        }

        sub c2p {
            my ($str) = @_;

            $str =~ s/\\U([0-9A-Fa-f]{8})/\\x{$1}/g;
            $str =~ s/\\u([0-9A-Fa-f]{4})/\\x{$1}/g;
            $str =~ s/\\x([0-9A-Fa-f]+)/\\x{$1}/g;
            $str =~ s/\\([0-7]{1,3})/0$1/g;
            $str =~ s/\\\?([=\/'\(\)!<>-])/$trigraph{$1}/g;
            $str =~ s/\\e/\\c[/g;
            $str =~ s/\\v/\\x{0b}/g;

            return $str;
        }
    }

As we've discussed previously, you're using "TinyPerl 5.8". I've kept the code as simple as possible to accommodate that; but I've no way of testing it.

Output:

    $c_str[BEL \a BS \b ESC \e FF \f NL \n CR \r TAB \t VT \v BSLASH \\ APOS \' QUOT \" HASH \43 DOLLAR \x24 AT \u0040 TILDE \U0000007e TGs \?= \?/ \?' \?( \?) \?! \?< \?> \?-]
    $p_str[BEL \a BS \b ESC \c[ FF \f NL \n CR \r TAB \t VT \x{0b} BSLASH \\ APOS \' QUOT \" HASH 043 DOLLAR \x{24} AT \x{0040} TILDE \x{0000007e} TGs # \ ^ [ ] | { } ~]

I haven't used trigraphs beyond reading about them long ago. The output I've produced (TGs ...) seems reasonable; however, I wasn't sure how you wanted to handle these. Do feel free to get alternative advice in this area.
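One point that may matter here: as I read the Wikipedia page, C spells trigraphs with two question marks ("??=", "??/" and so on), while "\?" is a separate escape that yields a literal question mark. If your input uses the "??" form, a sketch along the following lines might be closer to what you need. The name "c_trigraphs" is my own invention for illustration, and I've only considered the nine standard trigraphs.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

{
    my %trigraph;
    BEGIN {
        # Third character of each trigraph => the character it stands for.
        %trigraph = (
            '=' => '#', '/' => '\\', "'" => '^',
            '(' => '[', ')' => ']',  '!' => '|',
            '<' => '{', '>' => '}',  '-' => '~',
        );
    }

    # Hypothetical helper: replaces ??X trigraphs and leaves all other
    # text, including \? escapes, untouched.
    sub c_trigraphs {
        my ($str) = @_;
        $str =~ s/\?\?([=\/'()!<>-])/$trigraph{$1}/g;
        return $str;
    }
}

print c_trigraphs(q{??= ??/ ??' ??( ??) ??! ??< ??> ??-}), "\n";
# prints: # \ ^ [ ] | { } ~
```

You could call this before c2p() so that any backslash produced by "??/" is then seen as the start of an escape; whether that ordering suits your data is something you'd need to check.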

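What I've provided is a direct conversion; e.g. "VT \v" to "VT \x{0b}". If, instead, you wanted "VT \v" to become "VT <actual vertical tab character>", just remove one backslash from the replacement: i.e. change "s/\\v/\\x{0b}/g" to "s/\\v/\x{0b}/g". I'm pretty sure that will work for most of the substitutions. For octal elements, change "s/\\([0-7]{1,3})/0$1/g" to "s/\\([0-7]{1,3})/chr oct $1/eg". The hex elements need similar treatment, because "$1" is not interpolated inside "\x{...}": e.g. change "s/\\x([0-9A-Fa-f]+)/\\x{$1}/g" to "s/\\x([0-9A-Fa-f]+)/chr hex $1/eg". You'll also need to add substitutions like "s/\\n/\n/g" for the escapes that currently pass through unchanged. Given I don't even know if that's what you want, I didn't spend much time looking into that aspect.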
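If actual characters are what you're after, here's a rough sketch of that variant. It's an assumption-laden alternative rather than a drop-in replacement: the name "c2chars" and the "%simple" lookup table are hypothetical, it leaves trigraphs alone, and I haven't been able to try it under TinyPerl. It does everything in a single pass so that an escaped backslash is never re-read as the start of another escape.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

{
    my %simple;
    BEGIN {
        # Single-letter escapes => the characters they represent.
        %simple = (
            a => "\a", b => "\b", e => "\e", f => "\f",
            n => "\n", r => "\r", t => "\t", v => "\x0b",
            '\\' => '\\', "'" => "'", '"' => '"', '?' => '?',
        );
    }

    # Hypothetical helper: converts C escapes to actual characters.
    sub c2chars {
        my ($str) = @_;

        # One left-to-right pass; exactly one capture group is defined
        # for each match, which tells us the kind of escape we found.
        $str =~ s{
            \\ (?:
                U([0-9A-Fa-f]{8})
              | u([0-9A-Fa-f]{4})
              | x([0-9A-Fa-f]+)
              | ([0-7]{1,3})
              | ([abefnrtv\\'"?])
            )
        }{
            defined($1) ? chr(hex($1))
          : defined($2) ? chr(hex($2))
          : defined($3) ? chr(hex($3))
          : defined($4) ? chr(oct($4))
          :               $simple{$5}
        }gex;

        return $str;
    }
}

print c2chars(q{HASH \43 DOLLAR \x24 AT \u0040 TILDE \U0000007e}), "\n";
# prints: HASH # DOLLAR $ AT @ TILDE ~
```

Any backslash sequence not listed in the pattern is left as it was. Code points above 255 will give you wide characters, so you may need to set an output encoding before printing those.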

— Ken