rajaguha has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I need an implementation of a sparse array. If the searched key/index does not exist, it needs to return the immediately higher and immediately lower keys/indexes. I also need it to be able to nest to any depth.

I am trying to implement this using a hash coupled with an array containing the keys in sorted order. This can then be packaged as a tied array. I also want to get clever with the search on the hash keys stored in the array. Provision for a user-defined subroutine to return interpolated values is also possible.

But I am wondering if someone has not written something like it already. CPAN searches have failed so far.

Any comments on ways to improve my implementation will also be greatly appreciated.

Thanks

----

(Please YELL if updating the original post is not the done thing.)

Thanks everyone - much food for thought and some very interesting modules.

I will take GrandFather's advice and post the code when ready. For now, I am updating just to state the motivation for the effort.

The original thought was to create a structure for storage and querying of large data sets from experiments having multiple input variables and a single result from each run. The input variables could be reals, integers or even strings(to accommodate categories). For now, I am presuming the output to be a real.

Not only should we be able to return results for the previously loaded combinations(like any db can) but, once 'enough' data is fed in, we should also return approximate results for combinations with intermediate values (for numeric variables) through interpolation.

One can use simple linear interpolation, allow the user to supply some subroutine based on her/his knowledge of the data or get fancy w/ non-linear regression, even try to derive an equation from the data.

But fancy for later. Just the data structure for now.

Replies are listed 'Best First'.
Re: Sparse Array
by Your Mother (Archbishop) on Jul 24, 2011 at 06:03 UTC

    Take a look at Judy.

Re: Sparse Array
by GrandFather (Saint) on Jul 24, 2011 at 05:19 UTC

    It may help us pass comment on your implementation if you actually show some code. It may help too to tell us why you need this solution and perhaps provide a little sample code (a test case) that would exercise the implementation.

    It may be that a database is a good fit for your requirements.

    True laziness is hard work
Re: Sparse Array
by AnomalousMonk (Archbishop) on Jul 24, 2011 at 05:30 UTC

    It may be useful to you that an array element that has never been assigned to will return false when tested by exists (however, calling exists on array values is deprecated).

    >perl -wMstrict -le "my @ra; $#ra = 10; @ra[3, 5, 7] = (33, 55, 77); ;; use Data::Dumper; print Dumper fetch(3, \@ra); print Dumper fetch(6, \@ra); ;; sub fetch { my ($i, $ar) = @_; return [ $i, $ar->[$i] ] if exists $ar->[$i]; return [ $i-1, $ar->[$i-1] ], [ $i+1, $ar->[$i+1] ]; } " $VAR1 = [ 3, 33 ]; $VAR1 = [ 5, 55 ]; $VAR2 = [ 7, 77 ];
Re: Sparse Array
by Anonymous Monk on Jul 24, 2011 at 06:06 UTC

    But I am wondering if someone has not written something like it already. CPAN searches have failed so far.

    ??Sparse - Perl module for Sparse Vectors

    ??PDL::Sparse - Compact storage for mostly-zero PDLs

Re: Sparse Array
by etj (Priest) on May 26, 2022 at 14:03 UTC
    Use PDL::CCS (CCS = Collapsed Column Storage).
Re: Sparse Array
by Khen1950fx (Canon) on Jul 24, 2011 at 07:00 UTC
    I tried this to emulate a sparse array
    #!/usr/bin/perl use strict; use warnings; my %sparse; @sparse{1,1,1,0,0,1,1,0,2,1,1,1,1,1} = (); print "There are ", scalar keys %sparse, " elements in this array\n";
    Since we know that the elements are 0, 1, and 2, the output should be 3 elements.
Re: Sparse Array
by jdporter (Paladin) on Jul 25, 2011 at 17:09 UTC

    By "immediately higher and immediately lower keys/indexes" I infer you mean the nearest existing keys present in the array, not ($key-1, $key+1) regardless of whether present. Correct?

    I do like your approach; using a hash for the actual storage (i.e. values) is sane enough. For the index management, I would look at Set::IntSpan::Fast. Here's the basic idea: use this module to remember the ranges (IntSpan's) of the indices absent from the hash, i.e. the complement of the set of indices present in the hash. (Don't worry about the unbounded ranges at the upper and lower ends; Set::IntSpan::Fast handles them just fine.) Handling a "miss" and getting the nearest keys will be fairly straightforward, though the API doesn't include a method specifically for that so you'll have to code it up yourself. Take a look at the source, particularly the _find_pos and _iterate_ranges internal functions. Note in particular that _find_pos does the binary search you need. :-)

    I reckon we are the only monastery ever to have a dungeon stuffed with 16,000 zombies.
Re: Sparse Array
by JavaFan (Canon) on Jul 25, 2011 at 14:29 UTC
    I need an implementation of a sparse array. If the searched key/index does not exist, it needs to return the immediately higher and immediately lower keys/indexes.
    All binary search trees from data structures 101 will do that. If you keep it balanced, it will do such operations in O(log N) time.
      All binary search trees from data structures 101 will do that.

      That is my 'Non-answer: pick-of-the-week'.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.