kcott has asked for the wisdom of the Perl Monks concerning the following question:

G'day All,

In Judy::HS, the utility function MemUsed() is shown with this very short description:

"Returns the size of a Judy::HS array. This implementation is not supplied by libJudy."

Does anyone know what I'd need to do in order to gain access to MemUsed()?

Alternatively, does anyone know of another method to get the size of a Judy::HS array?

The Perl variable representing the Judy::HS array is just a number: as far as I can tell, it's a pointer used under the hood in XS-land to access the Judy data structure (and way beyond my level of expertise). Modules like Devel::Size just report on the size of the number. Here's a rough summation:

$ perl -E '
    use Devel::Size "total_size";
    use Devel::Size::Report "report_size";
    use Judy::HS qw{Get Set};
    my $judy;
    Set($judy, Key => 42);
    say q{$judy: }, $judy;
    say q{total_size($judy): }, total_size($judy);
    say q{report_size($judy): }, report_size($judy);
    my ($PValue, $Value) = Get($judy, "Key");
    say "\$PValue[$PValue]";
    say "\$Value[$Value]";
'
$judy: 42949789440
total_size($judy): 24
report_size($judy): Size report v0.13 for '42949789440':
  Scalar 24 bytes
Total: 24 bytes in 1 elements
$PValue[42949813632]
$Value[42]

— Ken

Re: Size of Judy::HS array: where is MemUsed()?
by hv (Prior) on Apr 08, 2023 at 02:26 UTC

    "Does anyone know what I'd need to do in order to gain access to MemUsed()?"

    What happens if you try to call it?

    I took a brief look at the code, and it looks buggy - in particular one of the three cases handled by pvtJudyHSMemUsedV accumulates a result in "sum" but never returns it, so you may get a random number (or worse) if you hit that case - but it should be callable, and if you're lucky your particular arrays won't hit that case.

    Sadly the author last updated any of their (many) modules in 2014, so an update seems unlikely any time soon. If you're happy to run a patched version, I can provide a one-line fix for the obviously buggy case.

      G'day hv,

      ++ Thanks for your response.

      "What happens if you try to call it?"

      A common alias of mine:

      $ alias perle
      alias perle='perl -Mstrict -Mwarnings -Mautodie=:all -MCarp::Always -E'

      I originally just tried to import it. My code crashed at that point; like this:

      $ perle 'use Judy::HS "MemUsed"';
      "MemUsed" is not exported by the Judy::HS module at /home/ken/perl5/perlbrew/perls/perl-5.36.0/lib/site_perl/5.36.0/Sub/Exporter.pm line 778.
              Sub::Exporter::_do_import(HASH(0xa003f6ab8), ARRAY(0xa005d7d00)) called at /home/ken/perl5/perlbrew/perls/perl-5.36.0/lib/site_perl/5.36.0/Sub/Exporter.pm line 744
              Sub::Exporter::__ANON__("Judy::HS", "MemUsed") called at -e line 1
              main::BEGIN() called at -e line 1
              eval {...} called at -e line 1
      BEGIN failed--compilation aborted at -e line 1.

      I basically took the documentation at face value and assumed that I needed something in addition to libJudy. However, I now see that a fully-qualified name will access it:

      $ perle 'use Judy::HS "Set"; my $judy; Set($judy, Key => 42); say Judy::HS::MemUsed($judy);'
      59
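      The general mechanism behind this can be shown without Judy at all. The following is only a minimal sketch: Demo::Lib, exported_fn and hidden_fn are hypothetical names made up for illustration, and it uses the core Exporter module as a stand-in for Sub::Exporter (which is what Judy::HS actually uses, per the stack trace above). The point is that an export list only controls what can be imported; any sub defined in a package can still be reached by its fully-qualified name.

```perl
use strict;
use warnings;

# Hypothetical stand-in package; not part of Judy::HS.
package Demo::Lib {
    use Exporter 'import';
    our @EXPORT_OK = ('exported_fn');    # hidden_fn is deliberately left out

    sub exported_fn { return 'exported' }
    sub hidden_fn   { return 'hidden' }
}

package main;

# Asking the importer for a name that is not on the export list fails
# (Exporter also emits a warning on STDERR) ...
my $import_ok    = eval { Demo::Lib->import('hidden_fn'); 1 };
my $import_error = $@;

# ... but the sub still lives in the package and can be called directly.
my $result = Demo::Lib::hidden_fn();

print 'import ok: ', ($import_ok ? 'yes' : 'no'), "\n";
print "direct call: $result\n";
```

      Run as a script, this should report that the import failed while the direct call still returns 'hidden'.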
      "I took a brief look at the code, and it looks buggy ... If you're happy to run a patched version, I can provide a one-line fix for the obviously buggy case."

      I've just been tinkering with it; comparing $judy with %hash. I've written some basic benchmarks to test speed; I looked into some of the Unicode issues raised in Judy::Mem; and was planning to do a size comparison as well. Depending on how all of that panned out, I had some tentative thoughts on other aspects to investigate.

      Some of my $work involves biological data which, as I'm sure you're aware, can be huge and require long processing times. I'm always on the lookout for more bang-for-your-buck in this area. Having said that, given your comment that "it looks buggy", and the fact that it appears to be abandonware, I'm wondering if this is worth pursuing. In light of that, while I very much appreciate your offer of supplying a patch, I think I need to give some serious consideration as to whether I'll continue with Judy & Co. before taking you up on that.

      — Ken

        I think maybe you were misled by the wording of "This implementation is not supplied by libJudy". I read it as "while other functions in this package are simple wrappers around functions provided by the external library, this function I wrote myself".

        In the event, that looks like a useful warning: it seems quite possible that the external library is solid, and that the simple wrappers do indeed simply wrap. In which case you may well get entirely reliable behaviour with those, even if the additional functionality added by the author is less solid.

        In any case I've added the patch as a ticket on Judy's queue, so it's there if you need it: id=147637.

Re: Size of Judy::HS array: where is MemUsed()?
by eyepopslikeamosquito (Archbishop) on Apr 08, 2023 at 07:35 UTC

    G'day kcott,

    A few months ago, a mysterious anonymonk popped up out of nowhere to claim the fastest Perl solution in the long-running Long List is Long saga by utilizing Judy arrays ... then just as mysteriously disappeared again. Sadly, I doubt we will hear from him/her again. :-(

    While I was watching agog from the sidelines, the remarkable marioroy got his hands dirty with the Judy C code, so I expect he'll reply to this thread on his return.

    Some references:

      G'day eyepopslikeamosquito,

      Thanks for this. I do remember the "Rosetta Code: Long List is Long" thread — in fact, I front-paged it — and I did follow it for some time. There are certainly things of interest. I've packed up my tinkering for now (heading off for a major family "Easter" reunion) but will no doubt get back to it next week.

      Your first reference brought Memory::Usage to my attention. I don't recall using that module, or even being aware of it, previously. It could provide common ground for comparing sizes: currently using Devel::Size::total_size() for %hash and Judy::HS::MemUsed() for $judy. It would also get around the MemUsed() bug(s) already discussed.
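      For what it's worth, here is a rough sketch of how Memory::Usage might serve as that common ground. It rests on assumptions: that Memory::Usage is installed, that the platform is one it supports (it reads process statistics from the OS, so the whole measurement is wrapped in an eval), and the key count and key names are arbitrary choices for illustration. It only measures a plain %hash; the same before/after pattern could be repeated around building a Judy::HS array.

```perl
use strict;
use warnings;

my @rows;
my $ok = eval {
    require Memory::Usage;                 # non-core; assumed to be installed
    my $mu = Memory::Usage->new();

    $mu->record('before building %hash');
    my %hash = map { ("key$_" => $_) } 1 .. 100_000;   # arbitrary test data
    $mu->record('after building %hash');

    # state() returns one array ref per record() call:
    # [ timestamp, message, vsz, rss, shared, code, data ], sizes in KB.
    @rows = @{ $mu->state() };
    $mu->dump();                           # prints the report to STDERR
    1;
};
print "Memory::Usage measurement skipped: $@" unless $ok;
```

      The figures are process-wide, so they include allocator overhead and anything else the interpreter grabbed in between; they are best read as deltas between two samples rather than as exact structure sizes.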

      I came across Judy arrays quite some time ago: I added it to my mental list of things to study further at some future date. Curiously, I came across your third reference a couple of days ago (via some link hopping around PM): this prompted me to move Judy arrays from the mental list to the active list. :-)

      — Ken

        G'day kcott,

        During the "Rosetta Code: Long List is Long" experiment, I called the JudySL/HS free functions in C to obtain the amount of memory used. Free() in the Judy::HS Perl module also returns the number of bytes freed. Judy::HS wowed me regarding memory utilization versus a native hash. I did not call MemUsed at the time.

        my $bytes = Free( $Judy );
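        As a rough illustration of a size comparison along those lines, the sketch below fills a native hash and a Judy::HS array with the same data, then reports Devel::Size::total_size() for the hash and the byte count returned by Free() for the Judy array. The entry count and key names are arbitrary assumptions, and both modules are non-core, so the script skips the measurement if either is missing. Note that Free() destroys the array, so it has to be the last thing done with it.

```perl
use strict;
use warnings;

my %size;

# Both modules are non-core; skip politely if either one is unavailable.
my $have_modules = eval { require Devel::Size; require Judy::HS; 1 };

if ($have_modules) {
    my (%hash, $judy);

    # Same (arbitrary) keys and values in both structures.
    for my $i (1 .. 10_000) {
        my $key = "key$i";
        $hash{$key} = $i;
        Judy::HS::Set($judy, $key, $i);
    }

    $size{hash} = Devel::Size::total_size(\%hash);

    # Free() releases the whole array and returns the bytes freed,
    # so this must come after all other use of $judy.
    $size{judy} = Judy::HS::Free($judy);

    printf "%-5s %d bytes\n", $_, $size{$_} for sort keys %size;
}
else {
    print "Devel::Size and/or Judy::HS not installed; skipping\n";
}
```

        The two numbers come from different accounting methods (Perl-level introspection versus the Judy library's own bookkeeping), so treat the comparison as indicative only.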