mr.dunstan has asked for the wisdom of the Perl Monks concerning the following question:

Which is best practice when testing a string value?
1) if ($thing eq "thing") {
2) if ($thing =~ /thing/) {
Is there any difference in speed, accuracy?

-mr.dunstan

Replies are listed 'Best First'.
Re: Which is better when doing a simple match?
by dragonchild (Archbishop) on Aug 21, 2001 at 23:42 UTC
    eq is always best, if you can use it. It's the simpler method. Doing a regex has to load a lot more schtuff.

    Now, if you had to do a regex, it would be much better if you did ($thing =~ /^thing$/), because that's equivalent to ($thing eq 'thing').
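One caveat on the equivalence: in Perl, `$` also matches just before a trailing newline, so /^thing$/ accepts "thing\n" while eq does not. The stricter anchors \A and \z close that gap. A minimal sketch of the difference follows; the helper names are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
sub is_eq        { my ($s) = @_; return $s eq 'thing'     ? 1 : 0 }
sub is_anchored  { my ($s) = @_; return $s =~ /^thing$/   ? 1 : 0 }
sub is_strict_rx { my ($s) = @_; return $s =~ /\Athing\z/ ? 1 : 0 }

for my $s ("thing", "thing\n", "things") {
    (my $shown = $s) =~ s/\n/\\n/g;   # make the newline visible when printing
    printf "%-10s eq=%d  /^thing\$/=%d  /\\Athing\\z/=%d\n",
        qq{"$shown"}, is_eq($s), is_anchored($s), is_strict_rx($s);
}
```

Only the "thing\n" case separates the three tests: the `$`-anchored regex accepts it, while eq and the \A...\z form reject it.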

    ------
    /me wants to be the brightest bulb in the chandelier!

    Vote paco for President!

      So you're saying eq is always faster ... ?

      -mr.dunstan
        It is always at least as fast. Why are you asking - do you have a counterexample?

        ------
        /me wants to be the brightest bulb in the chandelier!

        Vote paco for President!

Re: Which is better when doing a simple match?
by Anonymous Monk on Aug 21, 2001 at 23:59 UTC
    Why speculate?
    #!/usr/local/bin/perl -w
    use Benchmark;

    $count = 1_000_000;

    print "Match eq => ", check_eq(), "\n";
    print "Match rx => ", check_rx(), "\n";

    timethese( $count, {
        'eq' => sub{ check_eq() },
        'rx' => sub{ check_rx() }
    } );

    sub check_eq {
        my $thing = "thing";
        my $i = 0;
        $i++ if( $thing eq "thing" );
        $i;
    }

    sub check_rx {
        my $thing = "thing";
        my $i = 0;
        $i++ if( $thing =~ /^thing$/ );
        $i;
    }
    reveals
    Match eq => 1
    Match rx => 1
    Benchmark: timing 1000000 iterations of eq, rx...
            eq: 15 wallclock secs (14.27 usr + 0.00 sys = 14.27 CPU) @ 70093.46/s (n=1000000)
            rx: 22 wallclock secs (21.08 usr + 0.00 sys = 21.08 CPU) @ 47430.83/s (n=1000000)
      And to include index as well, covering the two different cases (an exact string and a longer string that merely contains it):
      #!/usr/local/bin/perl -w
      use Benchmark;

      $count = 1_000_000;
      my $thing = "thing";

      print "Match eq => ", check_eq(), "\n";
      print "Match rx => ", check_rx(), "\n";
      print "Match idx => ", check_idx(), "\n";

      timethese( $count, {
          'eq'  => sub{ check_eq() },
          'rx'  => sub{ check_rx() },
          'idx' => sub{ check_idx() }
      } );

      $thing = "asdjgfakjgfashdf___thing___asklfhklajsdhlajsdf";

      print "Match eq => ", check_eq(), "\n";
      print "Match rx => ", check_rx(), "\n";
      print "Match idx => ", check_idx(), "\n";

      timethese( $count, {
          'eq'  => sub{ check_eq() },
          'rx'  => sub{ check_rx() },
          'idx' => sub{ check_idx() }
      } );

      sub check_eq { ($thing eq "thing")?1:0; }
      sub check_rx { ($thing =~ /thing/)?1:0; }
      $i = 0;
      sub check_idx { ((index $thing,'thing') != -1)?1:0; }
      Result:

      Match eq => 1
      Match rx => 1
      Match idx => 1
      Benchmark: timing 1000000 iterations of eq, idx, rx...
              eq: 4 wallclock secs ( 3.40 usr + 0.66 sys = 4.06 CPU) @ 246305.42/s (n=1000000)
             idx: 6 wallclock secs ( 4.01 usr + 0.86 sys = 4.87 CPU) @ 205338.81/s (n=1000000)
              rx: 5 wallclock secs ( 4.30 usr + 0.90 sys = 5.20 CPU) @ 192307.69/s (n=1000000)
      Match eq => 0
      Match rx => 1
      Match idx => 1
      Benchmark: timing 1000000 iterations of eq, idx, rx...
              eq: 3 wallclock secs ( 3.38 usr + 0.62 sys = 4.00 CPU) @ 250000.00/s (n=1000000)
             idx: 6 wallclock secs ( 4.18 usr + 1.05 sys = 5.23 CPU) @ 191204.59/s (n=1000000)
              rx: 6 wallclock secs ( 4.49 usr + 1.05 sys = 5.54 CPU) @ 180505.42/s (n=1000000)


      T I M T O W T D I
Re: Which is better when doing a simple match?
by blakem (Monsignor) on Aug 21, 2001 at 23:45 UTC
    Well, they aren't actually the same thing.... Your second one will match strings such as 'clothing' and 'Worthington' whereas the first one obviously won't. (which answers the 'accurate' part of your question <doh>)

    Still, it's a moot point. 'eq' wins hands down over the equivalent anchored regex, i.e. $thing =~ /^thing$/. It's faster, cleaner, and easier to debug....
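To make the accuracy difference concrete, here is a minimal sketch that runs both tests over the words mentioned in this reply. The helper names exact_match and substring_match are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
sub exact_match     { my ($s) = @_; return $s eq 'thing'  ? 1 : 0 }
sub substring_match { my ($s) = @_; return $s =~ /thing/  ? 1 : 0 }

# 'clothing' and 'Worthington' both contain "thing" without being equal to it.
for my $s ('thing', 'clothing', 'Worthington', 'other') {
    printf "%-12s eq=%d regex=%d\n", $s, exact_match($s), substring_match($s);
}
```

The unanchored regex reports a match for every string that contains "thing" anywhere, which is a different question from string equality.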

    -Blake

      if (index($thing, 'thing') != -1)
      will do the same as
      if ($thing =~ /thing/)
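A note on precedence, since index is easy to get wrong without parentheses: != binds tighter than the comma, so writing index $thing,'thing' != -1 hands the result of the comparison to index as its substring argument instead of comparing index's return value. A minimal sketch with explicit parentheses, next to the regex form; contains_thing and matches_thing are hypothetical names used only here.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $s contains the literal substring "thing".
# The parentheses make sure the return value of index is what gets compared.
sub contains_thing {
    my ($s) = @_;
    return index($s, 'thing') != -1 ? 1 : 0;
}

# The unanchored regex form of the same substring test.
sub matches_thing {
    my ($s) = @_;
    return $s =~ /thing/ ? 1 : 0;
}

for my $s ('thing', 'clothing', 'nothing here', 'other') {
    printf "%-14s index=%d regex=%d\n", $s, contains_thing($s), matches_thing($s);
}
```

Both helpers agree on every input here because "thing" has no regex metacharacters; a literal containing characters such as "." or "*" would need \Q...\E in the regex to stay equivalent.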


      T I M T O W T D I
Re: Which is better when doing a simple match?
by Maclir (Curate) on Aug 22, 2001 at 00:24 UTC
    Now what do you mean by "best"? The fastest - check the benchmark results. Accurate - the match operator will match any string containing "thing" (at least as you have written it).

    My idea of best includes the concept of "what is the most obvious in terms of the programmer/designer's intent". Now if I see a comparison operation like your first example - comparing a variable directly to a string value with the "eq" operator - then it is pretty clear to me that you are looking for string equality. If you throw in some regex and other stuff, then I ask "why has the person done this - is there something deeper I haven't seen yet?". Remember the KISS principle - keep it simple, stupid.

    All things being equal, always use the simplest, clearest construct - ($thing eq "thing") is understandable by almost any programmer - even visual basic people.

Re: Which is better when doing a simple match?
by mr.dunstan (Monk) on Aug 21, 2001 at 23:40 UTC
    I mean "best" practice!

    -mr.dunstan