in reply to Re: Re: String Manupulation
in thread String Manupulation

Better is $string =~ tr/ /-/; Don't use substitution when transliteration is applicable.

Why? I often hear this advice and it usually stems from the fiction that "tr/// is always faster than s///".

A better rule, IMHO, is to use the tool that fits best. In this case, both fit equally well. I personally prefer s/ /-/g because it will be recognized more widely.
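
For anyone following along, here is a minimal sketch of the two spellings applied to the same input. The sample string and the variable names are mine, made up for illustration, not from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the original thread.
my $with_s  = "a string with spaces";
my $with_tr = $with_s;

$with_s  =~ s/ /-/g;   # substitution: replace every space with a hyphen
$with_tr =~ tr/ /-/;   # transliteration: map each space to a hyphen

print "$with_s\n$with_tr\n";
```

Both lines print the same hyphenated string, which is the sense in which the two tools fit this requirement equally well.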

If I felt that the requirement was likely to become something like "change ' ' to '-' and tab to '_'", then I might start with tr/ /-/ in expectation of changing it to something like tr/ \t/-_/ (which could be done with s/// but not so cleanly). While if I felt that the requirement was likely to become something like "change whitespace to '-'", then I'd start with s/ /-/g in expectation of changing it to something like s/\s+/-/g (which could be done with tr/// but not so cleanly).
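
A rough sketch of those two hypothetical requirement changes, each done the clean way and the less clean way. The inputs, the %map lookup table and the variable names are assumptions of mine for illustration. Note also that the tr/// whitespace list only approximates \s, which matches more characters than the ones spelled out here.

```perl
use strict;
use warnings;

# Requirement A: ' ' becomes '-' and tab becomes '_'.
my $input_a = "one two\tthree";

(my $a_tr = $input_a) =~ tr/ \t/-_/;           # clean: a single character map

my %map = (' ' => '-', "\t" => '_');           # hypothetical lookup table
(my $a_s = $input_a) =~ s/([ \t])/$map{$1}/g;  # works, but needs the extra table

# Requirement B: each run of whitespace becomes a single '-'.
my $input_b = "one  two\t\nthree";

(my $b_s = $input_b) =~ s/\s+/-/g;             # clean: \s+ matches the whole run

# Works for this input, but the whitespace set is spelled out by hand
# and /s is needed to squeeze each run down to one '-'.
(my $b_tr = $input_b) =~ tr/ \t\n\r\f/-/s;

print "$a_tr $a_s $b_s $b_tr\n";
```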

In the very rare case where the performance difference between the two matters, which to use depends on your input. Benchmarking with one 10kB string I get:

           Rate  1tr  0tr   0s   1s
    1tr 35435/s   --  -1% -27% -30%
    0tr 35863/s   1%   -- -26% -29%
    0s  48562/s  37%  35%   --  -4%
    1s  50833/s  43%  42%   5%   --

[ Note that "0s" and "1s" are identical as are "0tr" and "1tr". I usually include such so that runs of each case are interleaved so I get an idea how much variability there is between runs vs. real differences in performance. ]
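
The benchmark code itself is not shown in this post, so the following is only a sketch of how such a comparison could be set up with the core Benchmark module. The input string, the one-second run time and the copy made inside each sub (needed so every iteration starts from unmodified input) are all assumptions of mine; other inputs will give different rates and possibly a different winner.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: 10240 bytes with a space after every fourth character.
my $string = "abcd " x 2048;

# Each case is listed twice under different labels so that the spread
# between identical cases shows how noisy the measurement is.
my %cases = (
    '0s'  => sub { (my $copy = $string) =~ s/ /-/g; $copy },
    '1s'  => sub { (my $copy = $string) =~ s/ /-/g; $copy },
    '0tr' => sub { (my $copy = $string) =~ tr/ /-/; $copy },
    '1tr' => sub { (my $copy = $string) =~ tr/ /-/; $copy },
);

# A negative count asks for at least that many CPU seconds per case.
cmpthese(-1, \%cases);
```

Because the string copy is timed along with the replacement in every case, this setup understates the relative difference between the two operators.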

With a different 10kB string I get:

           Rate   0s   1s  0tr  1tr
    0s  20623/s   --  -2% -38% -38%
    1s  20993/s   2%   -- -37% -37%
    0tr 33175/s  61%  58%   --  -1%
    1tr 33522/s  63%  60%   1%   --

Note that in both cases, the speed difference between s/// and tr/// is under 20 microseconds per operation on a 10kB string, so this is extremely unlikely to matter either way for the vast majority of uses.

                - tye

Re: Re^3: String Manupulation (yarn)
by glwtta (Hermit) on Aug 28, 2003 at 02:23 UTC
    I personally prefer s/ /-/g because it will be recognized more widely.

    Perhaps it's worth using tr// just to correct that? :)

Re4: String Manupulation (yarn)
by dragonchild (Archbishop) on Aug 28, 2003 at 12:40 UTC
    My rationale has to do with the fact that, while the OP is substituting, the more correct statement is that the OP is transliterating. It has nothing to do with speed or the like. s/// is fast enough. tr/// better describes what's going on, given the information above. I agree that the correct tool for the anticipated requirements should be chosen. But, given the requirements as stated, I would argue that tr/// is correct. In fact, because tr/// is used less, I would argue that this is a benefit in the communication to the maintainer.

    ------
    We are the carpenters and bricklayers of the Information Age.

    The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.