in reply to Removing \x092 with a regex

Linux, perl v5.8.6, self-compiled, for what it's worth.

In the first case you would be bitten by the fact that the \xNN form takes only two hex digits, not three as in 092: \x092 is read as \x09 (a tab) followed by a literal 2. So the fact that you always get the same output is the real puzzle here.
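A minimal sketch of the two-digit rule, not taken from the original post: the variable names are made up for illustration, and the strings are built from escapes so the result does not depend on how the file is saved.

```perl
use strict;
use warnings;

# Hypothetical test strings, written with escapes only.
my $tab_two = "\t2";     # TAB (0x09) followed by the digit 2
my $byte_92 = "\x92";    # the single byte 0x92

# /\x092/ is parsed as \x09 followed by a literal "2".
my $tab_two_hits_092 = $tab_two =~ /\x092/   ? 1 : 0;   # 1
my $byte_hits_092    = $byte_92 =~ /\x092/   ? 1 : 0;   # 0
# Two digits, or braces around the digits, name the byte itself.
my $byte_hits_92     = $byte_92 =~ /\x92/    ? 1 : 0;   # 1
my $byte_hits_braced = $byte_92 =~ /\x{092}/ ? 1 : 0;   # 1

printf "%-24s %d\n", 'TAB+2 =~ /\x092/',  $tab_two_hits_092;
printf "%-24s %d\n", '0x92  =~ /\x092/',  $byte_hits_092;
printf "%-24s %d\n", '0x92  =~ /\x92/',   $byte_hits_92;
printf "%-24s %d\n", '0x92  =~ /\x{092}/', $byte_hits_braced;
```

With braces the number of digits is unambiguous, which is why \x{092} and \x92 behave the same.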

#!/usr/bin/perl
use strict;
use warnings;

my $str = q|cat’s|;
printit($str, 'original');

(my $str1 = $str) =~ s/\x092/'/;
printit($str1, 'hex only two nybbles');

(my $str2 = $str) =~ s/\x92/'/;
printit($str2, 'seems good here');

(my $str3 = $str) =~ s/\x{092}/'/;
printit($str3, 'seems good here');

(my $str4 = $str) =~ tr/\x{092}/'/d;
printit($str4, 'seems good here');

sub printit {
    my ($v, $msg) = @_;
    my $h = unpack "H*", $v;
    $h =~ s/(..)/0x$1 /g;
    print "$h\t$v\t$msg\n";
}

__END__
0x63 0x61 0x74 0x92 0x73    cat?s    original
0x63 0x61 0x74 0x92 0x73    cat?s    hex only two nybbles
0x63 0x61 0x74 0x27 0x73    cat's    seems good here
0x63 0x61 0x74 0x27 0x73    cat's    seems good here
0x63 0x61 0x74 0x27 0x73    cat's    seems good here

Now, it would be good to see how this script runs on your machine, in particular to see the hex dumps. The comment messages will become wrong of course :)

Flavio
perl -ple'$_=reverse' <<<ti.xittelop@oivalf

Don't fool yourself.

Re^2: Removing \x092 with a regex
by wfsp (Abbot) on Jul 26, 2005 at 17:09 UTC
    Many thanks for your efforts.

    I added a test suggested by kwaping above.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $str = q|cat’s|;
    printit($str, 'original');

    (my $str1 = $str) =~ s/\x092/'/;
    printit($str1, 'hex only two nybbles');

    (my $str2 = $str) =~ s/\x92/'/;
    printit($str2, 'seems good here');

    (my $str3 = $str) =~ s/\x{092}/'/;
    printit($str3, 'seems good here');

    (my $str4 = $str) =~ tr/\x{092}/'/d;
    printit($str4, 'seems good here');

    (my $str5 = $str) =~ s/’/'/;
    printit($str5, 'added');

    sub printit {
        my ($v, $msg) = @_;
        my $h = unpack "H*", $v;
        $h =~ s/(..)/0x$1 /g;
        print "$h\t$v\t$msg\n";
    }

    __END__
    0x63 0x61 0x74 0xe2 0x80 0x99 0x73    cat’s    original
    0x63 0x61 0x74 0xe2 0x80 0x99 0x73    cat’s    hex only two nybbles
    0x63 0x61 0x74 0xe2 0x80 0x99 0x73    cat’s    seems good here
    0x63 0x61 0x74 0xe2 0x80 0x99 0x73    cat’s    seems good here
    0x63 0x61 0x74 0xe2 0x80 0x99 0x73    cat’s    seems good here
    0x63 0x61 0x74 0x27 0x73              cat's    added

      You're saving your Perl script as a UTF-8 file but not telling Perl that it is supposed to be reading the script as such. (I bet)
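A rough sketch of why the encoding of the saved file matters, assuming the quote in question is U+2019 (RIGHT SINGLE QUOTATION MARK), which is byte 0x92 in Windows-1252 and the bytes e2 80 99 in UTF-8. The core Encode module stands in for "how the editor saved the file"; the variable names and the hexdump helper are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The character is written as an escape, so this sketch does not depend
# on the encoding of the file it is saved in.
my $chars = "cat\x{2019}s";

# Assumption: these mimic what ends up on disk under each encoding.
my $cp1252 = encode('cp1252', $chars);   # one byte for the quote: 92
my $utf8   = encode('UTF-8',  $chars);   # three bytes: e2 80 99

# Hypothetical helper, same idea as printit() in the thread.
sub hexdump {
    my ($bytes) = @_;
    return join ' ', map { sprintf '0x%02x', ord } split //, $bytes;
}

# s/\x92/'/ only helps when the data really contains the byte 0x92.
(my $from_cp1252 = $cp1252) =~ s/\x92/'/;
(my $from_utf8   = $utf8)   =~ s/\x92/'/;

# Decoding the UTF-8 bytes first lets the regex address the character.
(my $from_decoded = decode('UTF-8', $utf8)) =~ s/\x{2019}/'/;

print 'cp1252 before:  ', hexdump($cp1252),       "\n";
print 'cp1252 after:   ', hexdump($from_cp1252),  "\n";
print 'UTF-8 before:   ', hexdump($utf8),         "\n";
print 'UTF-8 after:    ', hexdump($from_utf8),    "\n";
print 'decoded, fixed: ', hexdump($from_decoded), "\n";
```

The UTF-8 dump matches the e2 80 99 sequence in the output above, which contains no 0x92 byte for the substitutions to find.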

      - tye        

        tye++

        0x63 0x61 0x74 0x92 0x73    cat’s    original
        0x63 0x61 0x74 0x92 0x73    cat’s    hex only two nybbles
        0x63 0x61 0x74 0x27 0x73    cat's    seems good here
        0x63 0x61 0x74 0x27 0x73    cat's    seems good here
        0x63 0x61 0x74 0x27 0x73    cat's    seems good here
        0x63 0x61 0x74 0x27 0x73    cat's    added

        Many thanks