in reply to RFC: Set::Select: get intersection or union of sets; or more generally, the set of elements that are in one or more input sets

If we have 3 input sets, then the '110' selector string selects all elements that are in the first and second sets but not in the third.

This looks like an INNER JOIN operation as known from SQL. But there's no facility to select elements that are unique to each set, i.e. select elements from each set which aren't present in the other sets (which would be an OUTER JOIN, right?).

So the selector string should allow three values per position (0, 1, 2), or it should not be a string at all but rather an arrayref holding -1, 0 or 1 for each set. That would be more perlish (compare the -1/0/1 return values of a sort comparator). I haven't thought through all the combinations of intersecting three or more sets, which would require more than just three values per set, or a language for combining the possible intersections of three or more sets; but I think that a good name would be Set::Intersect.

Then, thinking a bit more - if you want the string interface, then a lispish sort of query specification might do the job, as used in LDAP queries - (|(&(1=0)(!(2=-1)))(3=1)(4=1)) - perhaps translated into "and", "or", "in", "not in" and such to make it more perlish. Or "and", "or", "not", "nor", "xor" etc - basically, you need boolean logic and precedence to process things, and a descriptive input specification which supports that.
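To make the boolean-logic idea concrete, here is a rough sketch (in Python, purely as illustration, not a proposal for the module's actual interface). It assumes the query has already been parsed into nested tuples; the operator names "in", "not", "and", "or", "xor" and the functions `matches` and `query` are made up for this example, and sets are referred to by zero-based index.

```python
def matches(expr, elem, sets):
    """Evaluate a nested boolean expression for one element.

    expr is a tuple whose first item is the operator (hypothetical names):
      ("in", i)        -> elem is a member of sets[i]
      ("not", e)       -> negation of sub-expression e
      ("and", e1, ...) -> all sub-expressions hold
      ("or", e1, ...)  -> at least one sub-expression holds
      ("xor", e1, ...) -> an odd number of sub-expressions hold
    """
    op, *args = expr
    if op == "in":
        return elem in sets[args[0]]
    if op == "not":
        return not matches(args[0], elem, sets)
    if op == "and":
        return all(matches(a, elem, sets) for a in args)
    if op == "or":
        return any(matches(a, elem, sets) for a in args)
    if op == "xor":
        return sum(matches(a, elem, sets) for a in args) % 2 == 1
    raise ValueError(f"unknown operator: {op!r}")


def query(expr, sets):
    """Return every element of the union of sets that satisfies expr."""
    universe = set().union(*sets)
    return {e for e in universe if matches(expr, e, sets)}


if __name__ == "__main__":
    sets = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6}]
    # In the first and second sets, but not in the third.
    in_first_two_only = ("and", ("in", 0), ("in", 1), ("not", ("in", 2)))
    print(sorted(query(in_first_two_only, sets)))  # → [3]
```

Parsing a textual form (LDAP-style or otherwise) into such tuples is the part this sketch leaves out, and it is probably where most of the work would go.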

perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'

Re^2: RFC: Set::Select: get intersection or union of sets; or more generally, the set of elements that are in one or more input sets
by kikuchiyo (Hermit) on Dec 16, 2018 at 15:47 UTC

    But there's no facility to select elements that are unique to each set, i.e. select elements from each set which aren't present in the other sets

    There is. '100' selects elements that are unique to the first set. '001|010|100' selects what you want.
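    For readers following along, the selector semantics described here can be sketched roughly as follows. This is an illustration in Python, not the module's code: the function name `select_by_pattern` is invented, and the sketch assumes each '|'-separated alternative has exactly one '0' or '1' per input set and must match an element's membership pattern exactly.

```python
def select_by_pattern(selector, *sets):
    """Return elements whose membership pattern equals one selector alternative.

    selector: alternatives separated by '|', e.g. '001|010|100'.
    Each alternative has one character per input set: '1' means the element
    is in that set, '0' means it is not. (Assumed semantics, see lead-in.)
    """
    alternatives = set(selector.split("|"))
    for alt in alternatives:
        if len(alt) != len(sets) or set(alt) - {"0", "1"}:
            raise ValueError(f"bad selector alternative: {alt!r}")

    selected = set()
    for elem in set().union(*sets):
        # Build this element's membership pattern, e.g. '110'.
        pattern = "".join("1" if elem in s else "0" for s in sets)
        if pattern in alternatives:
            selected.add(elem)
    return selected


if __name__ == "__main__":
    a, b, c = {1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6}
    print(sorted(select_by_pattern("110", a, b, c)))          # → [3]
    print(sorted(select_by_pattern("100", a, b, c)))          # → [1, 2]
    print(sorted(select_by_pattern("001|010|100", a, b, c)))  # → [1, 2, 6]
```

    Under these assumptions '111' gives the plain intersection, and the union is the list of every pattern that contains at least one '1'.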

    then a lispish sort of query specification might do the job, as used in LDAP queries - (|(&(1=0)(!(2=-1)))(3=1)(4=1)) - perhaps translated into "and", "or", "in", "not in" and such to make it more perlish. Or "and", "or", "not", "nor", "xor" etc - basically, you need boolean logic and precedence to process things, and a descriptive input specification which supports that.

    This is precisely what I didn't want to do: a lot of work, a lot of edge cases, and a lot of tests and documentation to write.