PerlMonks  

Re: Hash versus chain of elsifs

by dsheroh (Monsignor)
on Nov 22, 2021 at 08:43 UTC


in reply to Hash versus chain of elsifs

Between the two approaches you mentioned, I would be extremely surprised if the hash did not turn out to be more efficient, as it doesn't have to step through a list of thousands of elsifs. It also has the advantage of taking roughly constant time whether or not the string is found. An elsif chain is very quick when the string is near the start of the chain, slower the further down the match sits, and slowest for a non-matching string, since every item in the list has to be checked before it can be known that the string isn't there.
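
To make the comparison concrete, here is a minimal sketch of the two approaches side by side. The host names and the sub names is_junk_elsif and is_junk_hash are made up for illustration; the real list from the question is not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical sample data; a real list would have thousands of entries.
my @sites = qw(example.com www.example.com example.org);

# Approach 1: a chain of elsifs. Each call compares against the entries
# one by one, so a non-matching string pays for every comparison.
sub is_junk_elsif {
    my ($host) = @_;
    if    ( $host eq 'example.com' )     { return 1; }
    elsif ( $host eq 'www.example.com' ) { return 1; }
    elsif ( $host eq 'example.org' )     { return 1; }
    return 0;
}

# Approach 2: a hash built once. A lookup costs roughly the same
# whether or not the key is present.
my %is_junk = map { $_ => 1 } @sites;

sub is_junk_hash {
    my ($host) = @_;
    return exists $is_junk{$host} ? 1 : 0;
}

for my $host (qw(example.org not-listed.example)) {
    printf "%-20s elsif=%d hash=%d\n",
        $host, is_junk_elsif($host), is_junk_hash($host);
}
```

If you want numbers rather than intuition, the core Benchmark module can time both subs against your real data.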

More importantly, though, the hash approach is much more readable than an endless list of elsifs, and it also opens up the possibility of storing your list of items to match against in a config file or a database (since a hash's contents are just data, not code) which will make for easier long-term maintenance.

BTW, the syntax you probably want for checking whether a hash key exists is exists($hash{$string}), not defined (which will return false if the key is there but its value is undef).
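
A small sketch of the difference, using a throwaway hash whose key names are invented for the example:

```perl
use strict;
use warnings;

# One key with a true value, one key whose value is undef.
# 'missing' is deliberately not a key at all.
my %hash = ( present => 1, empty => undef );

for my $key (qw(present empty missing)) {
    printf "%-8s exists=%d defined=%d\n",
        $key,
        ( exists $hash{$key}  ? 1 : 0 ),
        ( defined $hash{$key} ? 1 : 0 );
}
```

Only exists tells the 'empty' key apart from the 'missing' one; defined reports false for both.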

Replies are listed 'Best First'.
Re^2: Hash versus chain of elsifs
by mldvx4 (Friar) on Nov 22, 2021 at 09:34 UTC

    Thanks. I'll go with the hash method then and use exists to check it. Does the hash get built each time the function is called or does it persist in some way? I'm guessing the former, from some tests I've tried.

      Depends on how and where you're building it; as was mentioned, give us a sample and someone can comment on specifics. That being said, though, here's a way to (for example) initialize your hash lazily from a file with one item per line, and only one time (unless you explicitly clear it):

      if( exists _get_cache()->{ $candidate } ) {
        say qq{IT DOES};
      } else {
        say qq{No such luck . . .};
      }

      { ## Block to scope our cache to just these subs
        my $lookup_cache = undef;

        sub _reset_cache { $lookup_cache = undef; }

        sub _get_cache { $lookup_cache //= _load_cache(); }

        sub _load_cache {
          ## presuming you've declared file var somewhere above . . .
          open( my $fh, q{<}, $CACHE_FILE_NAME )
            or die qq{Can't open '$CACHE_FILE_NAME': $!\n};
          $lookup_cache = {};
          while( <$fh> ) {
            chomp;
            $lookup_cache->{ $_ } = 1;
          }
          close( $fh );
          return $lookup_cache;
        }
      } ## End of limited scope block.

      The cake is a lie.

      You should make it persist. That will have a big impact on performance. If you need further help, show us some sample code and we'll be able to show various ways to make it persist.

        Here is some sample code:

        package JunnkSites 0.03;
        use parent qw(Exporter);
        our @EXPORT = qw(KnownJunkSite);

        sub KnownJunkSite {
            my ($a) = (@_);
            my %junksites = (
                "bollyinside.com",
                "www.bollyinside.com",
                ...
                "worldtrademarkreview.com",
                "www.worldtrademarkreview.com",
            );
            if(exists($junksites{$a})) {
                $a = 1;
            } else {
                $a = 0;
            }
            return($a);
        }

        Any other performance and style tips or pointers welcome.
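
One possible direction, as a sketch under stated assumptions rather than a definitive rewrite: declare the hash once at file scope so it persists across calls instead of being rebuilt inside the sub each time. A flat list assigned to a hash is read as alternating key/value pairs, so the sketch uses map to turn every entry into a key. The package name JunkSitesSketch and the example.* host names are placeholders, not the real module or list, and the package block syntax assumes Perl 5.14 or later.

```perl
use strict;
use warnings;

package JunkSitesSketch {    # hypothetical name, not the original module
    use parent qw(Exporter);
    our @EXPORT_OK = qw(KnownJunkSite);

    # Built once, when this block runs, not on every call.
    # map makes each entry a key; a bare list would be read as
    # alternating key/value pairs.
    my %junksites = map { $_ => 1 } qw(
        example.com
        www.example.com
        example.org
        www.example.org
    );

    sub KnownJunkSite {
        my ($host) = @_;    # avoids $a, which sort() treats specially
        return exists $junksites{$host} ? 1 : 0;
    }
}

print JunkSitesSketch::KnownJunkSite('www.example.com'), "\n";
print JunkSitesSketch::KnownJunkSite('not-listed.example'), "\n";
```

Because the sub closes over the file-scoped %junksites, every call reuses the same hash.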
