http://qs1969.pair.com?node_id=734519

belg4mit has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I've got some code that I'm trying to make usable with as many user setups as possible, and I've run into an odd problem. The first code block below does what it ought (uppercase $_) in 5.005_04, 5.6.0, 5.8.4, 5.8.5 and 5.8.10, but in 5.6.1 and 5.6.2 it turns 'Hello World' into
Hó¿¼ÃÃÃ
WÃÃÃó¿¼
The second, largely identical, code block works fine in 5.6.[12], but that doesn't really explain why things have gone awry.
tr/a-z\x{fff03}-\x{fff05}\x{e0}-\x{f6}\x{f8}-\x{fe}\x{ff}\x{103}\x{105}\x{107}\x{109}\x{10b}\x{10d}\x{10f}\x{103}\x{105}\x{117}\x{119}\x{11b}\x{11d}\x{11f}\x{103}\x{105}\x{127}\x{129}\x{12b}\x{12d}\x{12f}\x{133}\x{135}\x{137}\x{13a}\x{13c}\x{13e}\x{140}\x{142}\x{144}\x{146}\x{148}\x{14b}\x{14d}\x{14f}\x{151}\x{153}\x{155}\x{157}\x{159}\x{15b}\x{15d}\x{15f}\x{161}\x{163}\x{165}\x{167}\x{169}\x{16b}\x{16d}\x{16f}\x{171}\x{173}\x{175}\x{177}\x{17a}\x{17c}\x{17e}/A-Z\x{fff00}-\x{fff02}\x{c0}-\x{d6}\x{d8}-\x{de}\x{178}\x{102}\x{104}\x{106}\x{108}\x{10a}\x{10c}\x{10e}\x{102}\x{104}\x{116}\x{118}\x{11a}\x{11c}\x{11e}\x{102}\x{104}\x{126}\x{128}\x{12a}\x{12c}\x{12e}\x{132}\x{134}\x{136}\x{139}\x{13b}\x{13d}\x{13f}\x{141}\x{143}\x{145}\x{147}\x{14a}\x{14c}\x{14e}\x{150}\x{152}\x{154}\x{156}\x{158}\x{15a}\x{15c}\x{15e}\x{160}\x{162}\x{164}\x{166}\x{168}\x{16a}\x{16c}\x{16e}\x{170}\x{172}\x{174}\x{176}\x{179}\x{17b}\x{17d}/;
tr/a-z/A-Z/;tr/\x{fff03}-\x{fff05}\x{e0}-\x{f6}\x{f8}-\x{fe}\x{ff}\x{103}\x{105}\x{107}\x{109}\x{10b}\x{10d}\x{10f}\x{103}\x{105}\x{117}\x{119}\x{11b}\x{11d}\x{11f}\x{103}\x{105}\x{127}\x{129}\x{12b}\x{12d}\x{12f}\x{133}\x{135}\x{137}\x{13a}\x{13c}\x{13e}\x{140}\x{142}\x{144}\x{146}\x{148}\x{14b}\x{14d}\x{14f}\x{151}\x{153}\x{155}\x{157}\x{159}\x{15b}\x{15d}\x{15f}\x{161}\x{163}\x{165}\x{167}\x{169}\x{16b}\x{16d}\x{16f}\x{171}\x{173}\x{175}\x{177}\x{17a}\x{17c}\x{17e}/\x{fff00}-\x{fff02}\x{c0}-\x{d6}\x{d8}-\x{de}\x{178}\x{102}\x{104}\x{106}\x{108}\x{10a}\x{10c}\x{10e}\x{102}\x{104}\x{116}\x{118}\x{11a}\x{11c}\x{11e}\x{102}\x{104}\x{126}\x{128}\x{12a}\x{12c}\x{12e}\x{132}\x{134}\x{136}\x{139}\x{13b}\x{13d}\x{13f}\x{141}\x{143}\x{145}\x{147}\x{14a}\x{14c}\x{14e}\x{150}\x{152}\x{154}\x{156}\x{158}\x{15a}\x{15c}\x{15e}\x{160}\x{162}\x{164}\x{166}\x{168}\x{16a}\x{16c}\x{16e}\x{170}\x{172}\x{174}\x{176}\x{179}\x{17b}\x{17d}/;
The problem actually seems to persist with any subset of the hex translations in parallel with the ASCII range, e.g.:
tr/a-z\x{fff03}-\x{fff05}/A-Z\x{fff00}-\x{fff02}/;
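For anyone who wants to try this on their own perl, here is a minimal sketch that runs the short combined form and the split form side by side on 'Hello World' and compares the results. The sub names (upcase_combined, upcase_split) are made up for illustration and are not from the original code. It assumes the split form is the reference behaviour, and it deliberately avoids "use warnings" so that it stays loadable on older perls.

```perl
use strict;

# Combined form: ASCII range and a wide-character range in a single tr///.
# This is the shape reported to misbehave on 5.6.1 and 5.6.2.
sub upcase_combined {
    my ($s) = @_;
    $s =~ tr/a-z\x{fff03}-\x{fff05}/A-Z\x{fff00}-\x{fff02}/;
    return $s;
}

# Split form: ASCII first, then the wide characters in a second tr///.
# This is the shape reported to work on 5.6.1 and 5.6.2.
sub upcase_split {
    my ($s) = @_;
    $s =~ tr/a-z/A-Z/;
    $s =~ tr/\x{fff03}-\x{fff05}/\x{fff00}-\x{fff02}/;
    return $s;
}

my $in       = 'Hello World';
my $combined = upcase_combined($in);
my $split    = upcase_split($in);

# Print code points rather than raw characters, so terminal encoding
# cannot hide or distort a difference.
for my $pair ([combined => $combined], [split => $split]) {
    my ($label, $str) = @$pair;
    print "$label: ",
        join(' ', map { sprintf('U+%04X', ord($_)) } split(//, $str)), "\n";
}

print $combined eq $split ? "same\n" : "DIFFERENT\n";
```

If the two forms disagree, the code-point dump should show which characters of the combined result went wrong, which is more useful than the garbled terminal output.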

--
In Bob We Trust, All Others Bring Data.

Re: tr funkiness in 5.6.[12] (but not 5.6.0!)
by JavaFan (Canon) on Jan 07, 2009 at 00:26 UTC
    Well, I guess you've found a bug in the last release of a really old version of Perl. 5.6.1 dates from the first half of 2001 - George W. Bush had only been president for 2.5 months (5.6.2 just fixed build issues). It seems that the bug has been fixed in the (almost) 8 years that have passed since. You've also found a workaround for the bug.

    What else do you want? No one is going to use the time machine to travel back to 2001 and fix 5.6.1.

      As much as it goes against my grain to feed trolls (why else go on about tangents?)... "What else do you want?" How about an explanation, as indicated in the original message?
      doesn't really explain why things have gone awry.
      The "work-around" is in name only. The code is the result of parsing user input, and there may be any number of ways of triggering this behavior... again, as indicated in the original message. Some people want band-aids or magic pills, others want to know what caused the accident.

      If I'd expected the perl 5.6 tree to be fixed (unlikely without concrete knowledge of what the issue is), I'd have filed a perlbug, and linked it. Instead, I was simply hoping for some insight and, umm, *wisdom*.

      --
      In Bob We Trust, All Others Bring Data.

        How about an explanation, as indicated in the original message?
        The answer "it's a bug", as indicated in the first reply, isn't good enough? If you want details of what caused the bug, get a git clone of the perl source, and run 'git bisect' between 5.6.0 and 5.6.1 to see which commit first triggers the unwanted behaviour.
Re: tr funkiness in 5.6.[12] (but not 5.6.0!)
by swampyankee (Parson) on Jan 07, 2009 at 11:38 UTC

    It is certainly a good catch, and you deserve kudos for finding it.

    Well, a bug report is probably pointless. Old versions certainly remain in use, however: at my latest gig, one of their AIX boxes was still using 5.0.

    As an aside, were the three Perl versions (5.6.[012]) running in the same environment and built with the same compiler?


    Information about American English usage here and here. Floating point issues? Please read this before posting. — emc

      .1 and .2 were on the same CentOS 4 system, built with gcc 3. I've not yet built a local copy of .0; that was checked against the system perl on a SunOS box. I'll try to look into that soonish, and hope it's not bitten by the same compilation issues as 5.6.1 under gcc 3; one must munge the makefiles to remove <built-in> and <command line>, etc.

      My original guess was something to do with locales, until I noticed that all of the CPANTS failures were 5.6.2 boxen.

      --
      In Bob We Trust, All Others Bring Data.