thens has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I wanted to know which of these two styles looks better.

my $string = "zero one two three";
# from the list
$number = (split /\s/, $string)[1];
# This is with an anonymous array
$number = [ split /\s/, $string ]->[1];

I personally prefer the second option. Are there any other implications to this? I guess both are the same.

-T

Title edit by tye

Replies are listed 'Best First'.
Re: coding style suggestion
by demerphq (Chancellor) on Sep 19, 2003 at 10:31 UTC

    The second one allocates a whole array and then throws it away. The first one is handled by perl and may be optimized into something much more efficient than it looks. I would not use the second when the first is available. Incidentally you'll find that the following split// variant

    my $string = "zero one two three";
    $number = (split /\s+/, $string, 3)[1];

    is more efficient. The limit needs to be 2 more than the index you want to access: one more for the 0/1 base correction, and one more so that the field at the index you want doesn't include the remaining values after it.
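
To make the effect of the limit concrete, here is a minimal sketch (the array names are only for illustration) showing what split returns for the same string with no limit, a limit of 2, and a limit of 3:

```perl
use strict;
use warnings;

my $string = "zero one two three";

# No limit: every field is split out.
my @all = split /\s+/, $string;        # ("zero", "one", "two", "three")

# Limit 2: field 1 still carries the rest of the string.
my @two = split /\s+/, $string, 2;     # ("zero", "one two three")

# Limit 3: field 1 is clean; the unsplit remainder sits in field 2.
my @three = split /\s+/, $string, 3;   # ("zero", "one", "two three")

print "$three[1]\n";                   # prints "one"
```

With a limit of 3, split can stop scanning after the second separator, which is presumably where the saving comes from on long strings.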

    A benchmark shows that the variant I posted above is around 300% faster than the array variant you posted.
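
The benchmark script itself was not posted, so the following is only a sketch of how such a comparison could be set up with the core Benchmark module. The labels list, anon and limit are made up for this example, and the numbers will differ by machine and perl version:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $string = "zero one two three";

# Hypothetical labels for the three idioms discussed in this thread.
my %idiom = (
    list  => sub { ( split /\s/,  $string )[1] },
    anon  => sub { [ split /\s/,  $string ]->[1] },
    limit => sub { ( split /\s+/, $string, 3 )[1] },
);

# Run each idiom for at least 1 CPU second and print a comparison table.
cmpthese( -1, \%idiom );
```

A negative count tells Benchmark to run each sub for at least that many CPU seconds rather than a fixed number of iterations.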


    ---
    demerphq

    <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...

      For ultimate speed, use the magic split regex ' '. Say what?

      my $number = ( split ' ', $string, 3 )[1];

      Saves another 1% to 10% or so and it's even easier to type, though a little harder to remember:)

      From perlfunc:split

      As a special case, specifying a PATTERN of space (' ') will split on white space just as split with no arguments does. Thus, split(' ') can be used to emulate awk's default behavior, whereas split(/ /) will give you as many null initial fields as there are leading spaces. A split on /\s+/ is like a split(' ') except that any leading whitespace produces a null first field. A split with no arguments really does a split(' ', $_) internally.

      The very fact that there is a special case made for the splitting of whitespace delimited fields suggests that someone felt that this was worthwhile optimising. Who knows, maybe that someone was even LW himself.
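
The difference between the three patterns described in the perlfunc excerpt only shows up when the string has leading whitespace. A small sketch (variable names are illustrative only):

```perl
use strict;
use warnings;

my $padded = "  zero one";   # two leading spaces

my @awk    = split ' ',   $padded;   # ("zero", "one")
my @spaces = split /\s+/, $padded;   # ("", "zero", "one")
my @single = split / /,   $padded;   # ("", "", "zero", "one")

printf "%d %d %d\n", scalar @awk, scalar @spaces, scalar @single;   # prints "2 3 4"
```

So with ' ' the index you ask for does not shift when the input happens to start with whitespace.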


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
      If I understand your problem, I can solve it! Of course, the same can be said for you.

Re: coding style suggestion
by liz (Monsignor) on Sep 19, 2003 at 10:13 UTC
    Some remarks:

    $number = (split /\s/, $string)[1];
    You probably want /\s+/ to allow for multiple spaces.

    $number = [ split /\s/, $string ]->[1];

    The only argument against this one is performance: in this case you are also creating an array, but then also creating a reference to it. So you're doing extra work.

    But if you're happy with that way, go for it. Just do it that way all the time. Be consistent. Avoid premature optimization!

    Liz

      Be consistent. Avoid premature optimization!

      Agreed totally. But I personally don't count knowing the relative costs of various idioms as premature optimization. Premature optimization is when you toil over your code for hours to effect a negligible speed improvement that you haven't already justified by some good profiling. Knowing the cost of a particular idiom, and avoiding the expensive ones in favour of inexpensive ones when the time to write either is equivalent, is good practice, not premature optimization.

      :-)


      ---
      demerphq

      <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
Re: coding style suggestion
by Anonymous Monk on Sep 19, 2003 at 10:21 UTC

    From looks, I like the anonymous array one too because brackets are...well, sexy. But for those interested in micro optimization, the benchmarks say that the list is faster.

    Benchmark: running anon, list for at least 5 CPU seconds...
          anon:  6 wallclock secs ( 5.25 usr +  0.01 sys =  5.26 CPU) @ 87511.49/s (n=460119)
          list:  4 wallclock secs ( 5.29 usr +  0.00 sys =  5.29 CPU) @ 121428.51/s (n=642243)
             Rate anon list
    anon  87511/s   -- -28%
    list 121429/s  39%   --

    Those are the benchmarks from your script above. Regardless of how you vary the length of the string, the ratio stays about the same. Hope this helps.

    Anonymously yours
    Anonymous Nun

      I understand there is a performance penalty with option 2. But for a small list I guess this is affordable. As you say, these brackets stand out from the ()s, which may be mistaken for precedence parentheses.

      Anyway, all your suggestions were enlightening.

      -T

Re: coding style suggestion
by flounder99 (Friar) on Sep 19, 2003 at 12:11 UTC
    use strict;
    my $string = "zero one two three";
    my $offset = 1;
    my $regex = '[^\\s]+\\s+' x $offset . '([^\\s]+)';
    print $string =~ /$regex/;
    __END__
    one
    I don't have time to test if this is faster but it will not create the intermediate array. It might be more memory efficient if $string is really huge. But it won't let you take a slice.

    update: now that I think about it, it might not be very efficient if both $string and $offset are huge, because $regex will also be huge. Just TIMTOWTDI.

    update 'doh! liz's way is even better.

    --

    flounder

      Ah, but you don't need to bother with $regex:

      my $string = "zero one two three";
      my $offset = 1;
      print $string =~ /(?:[^\s]+\s+){$offset}([^\s]+)/;
      __END__
      one

      And in this case the regex is not large, but may take some time to execute.
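
If the quantifier form is used in more than one place, it could be wrapped in a small helper. This is only a sketch: nth_field is a made-up name, \S stands in for [^\s], and the leading ^\s* anchor is an addition so that leading whitespace is skipped, which the regexes above do not do:

```perl
use strict;
use warnings;

# nth_field is a hypothetical helper, not something from the thread.
sub nth_field {
    my ( $string, $offset ) = @_;

    # Skip $offset whitespace-delimited fields, then capture the next one.
    my ($field) = $string =~ /^\s*(?:\S+\s+){$offset}(\S+)/;

    return $field;    # undef if the string has too few fields
}

print nth_field( "zero one two three", 1 ), "\n";   # prints "one"
```

Like the regexes above, this avoids building the intermediate list but cannot return a slice.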

      Liz