in reply to Re^2: Hash versus chain of elsifs
in thread Hash versus chain of elsifs
You should make it persist. That will have a big impact on performance. If you need further help, show us some sample code and we'll be able to show various ways to make it persist.
Re^4: Hash versus chain of elsifs
by mldvx4 (Hermit) on Nov 22, 2021 at 10:24 UTC
Here is some sample code:
Any other performance and style tips or pointers welcome.
by eyepopslikeamosquito (Archbishop) on Nov 22, 2021 at 11:22 UTC
Some feedback on your posted code:
Anyway, here is a very simple example of how I would go about it.
Running this little test program produces:
Update: Should you write a Procedural Module or an OO Module or just use a Hash? In your case, if I wrote a module, I'd use OO. See also:
... though I'd also consider not writing a module at all, instead just using a hash/hashref directly, as analysed below in my reply to this reply.
by eyepopslikeamosquito (Archbishop) on Nov 23, 2021 at 11:34 UTC
If performance is an issue, you might consider eliminating the lookup function call overhead (and the arguments about whether your module should use state or our or lexically scoped my ;-) simply by not writing a function at all, and instead performing your hash lookups directly. To test this idea, I wrote the following little benchmark:
Running the little benchmark program above on my laptop displayed:
Note that in the sample code above, to make a direct hash lookup more pleasing to the eye, I eliminated the call to exists simply by ensuring all keys have the true value 1.
No surprise that using the hash directly is a lot faster than calling a function every time you do a lookup. Also of interest is that a hash lookup is only marginally faster than a hash_ref lookup.
Based on this benchmark, rather than agonizing over whether your function should use block lexical scope or the Perl 5.10 state feature or an our variable, you might instead choose not to use a function at all! That is, perform the lookup directly via a hash, rather than a function call. Note that using a hashref, rather than a hash, gives you the flexibility to call your code with many different hashes, at a minuscule performance cost.
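The benchmark listing itself is not preserved in this copy of the thread. As a rough sketch of that kind of comparison (not the original program), something like the following could be used. The site names, the probe list and the sub names `via_function`, `via_hash` and `via_hashref` are all made up for illustration, and the timings will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data: every key has the true value 1, so no exists() is needed.
my %junk     = map { $_ => 1 } qw(example.com example.net example.org);
my $junk_ref = \%junk;
my @probes   = qw(example.com perlmonks.org example.org cpan.org);

# Lookup wrapped in a function call.
sub is_junk { return $junk{ $_[0] } }

# Each variant counts how many probes are found in the hash.
sub via_function { my $n = 0; for my $s (@probes) { $n++ if is_junk($s) }       return $n }
sub via_hash     { my $n = 0; for my $s (@probes) { $n++ if $junk{$s} }         return $n }
sub via_hashref  { my $n = 0; for my $s (@probes) { $n++ if $junk_ref->{$s} }   return $n }

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    function => \&via_function,
    hash     => \&via_hash,
    hash_ref => \&via_hashref,
});
```

All three variants return the same count; only the way the lookup is reached differs, which is what the benchmark isolates.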
by kcott (Archbishop) on Nov 22, 2021 at 11:45 UTC
G'day mldvx4,
I agree with others that a hash is likely to be more efficient than a chain of elsifs. Having said that, as a general rule of thumb, you should Benchmark: Perl may have already optimised what you're trying to do (so you'd be both wasting your time and bloating your code); different algorithms may be more or less efficient depending on the data (e.g. number of strings, individual length of strings, total size of data); and so on. Don't guess; benchmark.
"Any other performance and style tips or pointers welcome."
Example code:
Output:
— Ken
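Ken's example code and output are not preserved in this copy. As a minimal sketch of the "don't guess; benchmark" advice applied to the original question, the core Benchmark module can compare an elsif chain against a hash lookup. The five site names and the subs `by_elsif` and `by_hash` are hypothetical, and the relative results may well change with the number and length of the strings.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical chain of string comparisons.
sub by_elsif {
    my ($s) = @_;
    if    ($s eq 'a.example') { return 1 }
    elsif ($s eq 'b.example') { return 1 }
    elsif ($s eq 'c.example') { return 1 }
    elsif ($s eq 'd.example') { return 1 }
    elsif ($s eq 'e.example') { return 1 }
    else                      { return 0 }
}

# The same set of strings as hash keys, built once at file scope.
my %junk = map { $_ => 1 } qw(a.example b.example c.example d.example e.example);
sub by_hash { return $junk{ $_[0] } ? 1 : 0 }

# Probe the first entry, the last entry, and a miss.
my @probes = qw(a.example e.example z.example);

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    elsif_chain => sub { by_elsif($_) for @probes },
    hash_lookup => sub { by_hash($_)  for @probes },
});
```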
by eyepopslikeamosquito (Archbishop) on Nov 22, 2021 at 10:39 UTC
You can make your %junksites variable persist either by moving it outside the sub (in a block lexical scope) or by making it a state variable. For a simple example of these two approaches, see the %rtoa variable at:
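The two approaches might look roughly like this. It is a sketch only, not the %rtoa code referred to above; the site names and the sub names `is_junk_lexical` and `is_junk_state` are invented for illustration. The state version holds a hash reference because initializing a state hash from a list is only permitted in newer Perls (5.28 and later, as far as I know).

```perl
use strict;
use warnings;
use feature 'state';

# Approach 1: block lexical scope. The hash lives outside the sub, so it is
# built once (when this block runs) rather than on every call. The block must
# run before the sub is first called, or the hash will still be empty.
{
    my %junksites = map { $_ => 1 } qw(example.com example.net);   # hypothetical data

    sub is_junk_lexical {
        my ($site) = @_;
        return exists $junksites{$site} ? 1 : 0;
    }
}

# Approach 2: state variable (Perl 5.10+). The initializer runs only on the
# first call; later calls reuse the same hash reference.
sub is_junk_state {
    my ($site) = @_;
    state $junksites = { map { $_ => 1 } qw(example.com example.net) };
    return exists $junksites->{$site} ? 1 : 0;
}

print is_junk_lexical('example.com'), is_junk_state('perlmonks.org'), "\n";
```

Either way, the cost of building the hash is paid once instead of on each call to the sub.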
by NERDVANA (Priest) on Nov 22, 2021 at 11:04 UTC
by BillKSmith (Monsignor) on Nov 23, 2021 at 02:32 UTC
by NERDVANA (Priest) on Nov 25, 2021 at 07:20 UTC
by choroba (Cardinal) on Nov 22, 2021 at 12:33 UTC
Note that a hash needs key/value pairs, so you're storing only the site names without www as keys; the www.-prefixed ones are stored as values. Probably not what you want. A fast way to initialize the keys is
map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
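The initialization snippet from this reply is not preserved here. To illustrate the key/value pitfall described above, and two common idioms for turning every list item into a key, here is a small sketch. The site names are hypothetical, and neither idiom is claimed to be the one originally posted.

```perl
use strict;
use warnings;

# Hypothetical list: each site appears with and without the www. prefix.
my @sites = qw(example.com www.example.com example.net www.example.net);

# Pitfall: assigning a flat list to a hash pairs the items up as key => value,
# so only every other item becomes a key.
my %paired = @sites;

# Hash slice: every item becomes a key. The values are undef, so lookups
# need exists().
my %junksites;
@junksites{@sites} = ();

# Alternative: give every key the true value 1 so a plain truth test works.
my %junk_true = map { $_ => 1 } @sites;

printf "%d keys when paired, %d with a slice, %d with map\n",
    scalar(keys %paired), scalar(keys %junksites), scalar(keys %junk_true);
```

With the slice, test membership with `exists $junksites{$site}`; with the map version, `$junk_true{$site}` alone is enough.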