Re: Size of Judy::HS array: where is MemUsed()?

G'day kcott,

A few months ago, a mysterious anonymonk popped up out of nowhere to claim the fastest Perl solution in the long-running Long List is Long saga by utilizing Judy arrays ... then just as mysteriously disappeared again. Sadly, I doubt we will hear from him/her again. :-(

While I was watching agog from the sidelines, the remarkable marioroy got his hands dirty with the Judy C code, so I expect he'll reply to this thread on his return.

Some references:

Re^2: Rosetta Code: Long List is Long by anonymonk (Dec 2022) - used Judy::HS in a Perl solution to Long List is Long (see also responses from marioroy)
Re^2: Rosetta Code: Long List is Long - JudySL summary by marioroy (Jan 2023) - AFAICT, llil4judy is C/C++ only
Judy Array References - my list of references on Judy arrays (if you know of other useful references, please let me know)


Re^2: Size of Judy::HS array: where is MemUsed()? by kcott (Archbishop) on Apr 08, 2023 at 08:39 UTC
G'day eyepopslikeamosquito,

Thanks for this. I do remember the "Rosetta Code: Long List is Long" thread (in fact, I front-paged it) and I did follow it for some time. There are certainly things of interest. I've packed up my tinkering for now (heading off for a major family "Easter" reunion) but will no doubt get back to it next week.

Your first reference brought Memory::Usage to my attention. I don't recall using that module, or even being aware of it, previously. It could provide common ground for comparing sizes: I'm currently using `Devel::Size::total_size()` for `%hash` and `Judy::HS::MemUsed()` for `$judy`. It would also get around the `MemUsed()` bug(s) already discussed.

I came across Judy arrays quite some time ago: I added them to my mental list of things to study further at some future date. Curiously, I came across your third reference a couple of days ago (via some link hopping around PM): this prompted me to move Judy arrays from the mental list to the active list. :-)

-- Ken
Re^3: Size of Judy::HS array: where is MemUsed()? by marioroy (Prior) on Apr 10, 2023 at 20:59 UTC
G'day kcott,

During the "Rosetta Code: Long List is Long" experiment, I called the JudySL/HS free functions in C to obtain the amount of memory used. The Judy::HS Perl module's `Free` also returns the number of bytes. Judy::HS wowed me regarding memory utilization versus a native hash. I did not call MemUsed at the time.

`my $bytes = Free( $Judy );`
Re^4: Size of Judy::HS array: where is MemUsed()? by kcott (Archbishop) on Apr 10, 2023 at 23:42 UTC
G'day Mario,

Thanks for the feedback. As mentioned earlier, this was put on hold for a family Easter event; I expect to be working on it again this week.

My main concern with `MemUsed()` was the bug(s) reported by hv: if I were to present `Judy::HS` to $work as a buggy module that needed patching and appeared to be abandonware, it probably wouldn't be received too well. Using Memory::Usage instead of `MemUsed()` would circumvent this problem; other parts of `Judy::HS` seem solid (from what I've read).

Early results do show that `Judy::HS` used a lot less memory than `%hash`. I initially used `/usr/share/dict/australian-english` to populate the hash keys. I chose this because it was the largest of several files I have in `/usr/share/dict/` (the fact that I'm an Aussie was only a secondary consideration); however, I found that this file has entries with characters outside the 7-bit ASCII range (e.g. `Ångström`). This required some encoding manipulation for `Judy::HS`; creating this data structure was slower than for a `%hash`.

`/usr/share/dict/linux.words` is the smallest in that directory and, as far as I can tell, only uses 7-bit ASCII. I'll be giving that a try to see how `Judy::HS` fares against `%hash` when there's no encoding consideration.

There are other areas I intend to address, which will likely include: reading the data structures with and without encoding; non-integer values; and complex structures (e.g. HoH).

All very interesting; there should be a Meditation somewhere down the track with results of this investigation.

-- Ken
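For anyone following along, here is a minimal sketch of the "7-bit ASCII or not" check described above, using only core Perl. It is not the code from the post: the helper names `is_ascii` and `key_bytes` are hypothetical, and the choice of UTF-8 for the "encoding manipulation" is an assumption. Nothing in it touches `Judy::HS`; it only shows how a word-list entry might be turned into a byte string before being used as a key.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: 1 if the string contains only 7-bit ASCII, else 0.
sub is_ascii {
    my ($word) = @_;
    return $word =~ /[^\x00-\x7F]/ ? 0 : 1;
}

# Hypothetical helper: return the key as bytes.
# Assumption: non-ASCII entries are encoded as UTF-8.
sub key_bytes {
    my ($word) = @_;
    return is_ascii($word) ? $word : encode('UTF-8', $word);
}

# Two sample entries: one plain ASCII, one with characters above 0x7F.
my @words = ('kangaroo', "\x{C5}ngstr\x{F6}m");

for my $word (@words) {
    my $key = key_bytes($word);
    printf "ascii=%d chars=%d bytes=%d\n",
        is_ascii($word), length($word), length($key);
}
```

Running the check once over a word list before building either structure would show whether the encoding step is needed at all for that file.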
Re^5: Size of Judy::HS array: where is MemUsed()? - perldelta, Perl Releases and Building Perl by eyepopslikeamosquito (Archbishop) on Apr 11, 2023 at 11:46 UTC
Re^6: Size of Judy::HS array: where is MemUsed()? by kcott (Archbishop) on Apr 11, 2023 at 19:34 UTC