I present here a technique for performing a Θ join on two or more data sets.
It does in Perl approximately what SELECT ... WHERE does in SQL.
The input data sets and the result set are all presented via iterators. The implementation below builds on the techniques and classes I posted in "Using Nested Iterators to find a Cross Product" and "A Filtering Iterator".
    # sample data. I use arrays for illustrative purposes,
    # but data could come from anywhere.
    my @author = (
        [ 'Alonzo', 'Church', ],
        [ 'Stephen', 'Kleene', ],
        [ 'Wilhelm', 'Ackermann', ],
        [ 'Willard', 'Quine', ],
    );

    my @author_book = (
        [ 'Alonzo', '0691029067', ],
        [ 'Stephen', '0486425339', ],
        [ 'Wilhelm', '0821820249', ],
        [ 'Wilhelm', 'B000O5Q8QG', ],
        [ 'Willard', '0674554515', ],
        [ 'Willard', '0674802071', ],
    );

    my @book_title = (
        [ '0674802071', 'Set Theory and Its Logic', ],
        [ '0674554515', 'Mathematical Logic', ],
        [ 'B000O5Q8QG', 'Solvable Cases of the Decision Problem', ],
        [ '0821820249', 'Principles of Mathematical Logic', ],
        [ '0486425339', 'Mathematical Logic', ],
        [ '0691029067', 'Introduction to Mathematical Logic', ],
    );

    my $join_authors_books =
        # a filter iterator for implementing our join condition:
        Iterator::Filter->new(
            # iterator for walking the cross product:
            Iterator::Product->new(
                # iterators for each of the above arrays:
                Iterator::Array->new( \@author ),
                Iterator::Array->new( \@author_book ),
                Iterator::Array->new( \@book_title ),
            ),
            # our condition:
            #   where author.name = author_book.name
            #   and author_book.isbn = book_title.isbn
            sub {
                my($author,$author_book,$book_title) = @_;
                # each is an arrayref - a row from the corresponding "table"
                $author->[0] eq $author_book->[0]
                    &&
                $author_book->[1] eq $book_title->[0]
            }
        );

    until ( $join_authors_books->is_exhausted )
    {
        my($author,$author_book,$book_title) = $join_authors_books->value;
        local($,,$\) = ("\t","\n");
        # the $author_book array doesn't contain any info not present in the other two
        print @$author, @$book_title;
    }
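The three iterator classes used above live in the earlier posts and are not reproduced here. For readers who want to experiment without them, the following is a minimal sketch of stand-ins, not the original implementations. The package names (`Sketch::Array`, `Sketch::Product`, `Sketch::Filter`) and the `rewind` method are hypothetical. The interface is inferred only from the usage above, and it assumes that `value` returns the current item and also advances the iterator, since the loop above never advances explicitly.

```perl
use strict;
use warnings;

# Hypothetical stand-in: iterate over an array reference.
package Sketch::Array;
sub new          { my ($class, $aref) = @_; return bless { data => $aref, pos => 0 }, $class }
sub is_exhausted { my $self = shift; return $self->{pos} >= @{ $self->{data} } }
sub value        { my $self = shift; return $self->{data}[ $self->{pos}++ ] }  # assumed: returns and advances
sub rewind       { my $self = shift; $self->{pos} = 0; return }

# Hypothetical stand-in: cross product of several rewindable iterators,
# walked like an odometer (rightmost input varies fastest).
package Sketch::Product;
sub new {
    my ($class, @iters) = @_;
    my $self = bless { iters => \@iters, current => [], done => 0 }, $class;
    $self->{done} = 1 unless @iters;
    for my $it (@iters) {
        # an empty input makes the whole product empty
        if ( $it->is_exhausted ) { $self->{done} = 1; last }
        push @{ $self->{current} }, $it->value;
    }
    return $self;
}
sub is_exhausted { return $_[0]{done} }
sub value {
    my $self  = shift;
    my @tuple = @{ $self->{current} };
    my $i     = $#{ $self->{iters} };
    # roll over every exhausted input on the right, then step the next one
    while ( $i >= 0 && $self->{iters}[$i]->is_exhausted ) {
        $self->{iters}[$i]->rewind;
        $self->{current}[$i] = $self->{iters}[$i]->value;
        $i--;
    }
    if ( $i < 0 ) { $self->{done} = 1 }
    else          { $self->{current}[$i] = $self->{iters}[$i]->value }
    return @tuple;
}

# Hypothetical stand-in: pass through only tuples accepted by the predicate.
package Sketch::Filter;
sub new {
    my ($class, $inner, $pred) = @_;
    my $self = bless { inner => $inner, pred => $pred, next => undef }, $class;
    $self->_advance;    # look ahead so is_exhausted can answer truthfully
    return $self;
}
sub _advance {
    my $self = shift;
    $self->{next} = undef;
    until ( $self->{inner}->is_exhausted ) {
        my @tuple = $self->{inner}->value;
        if ( $self->{pred}->(@tuple) ) { $self->{next} = \@tuple; last }
    }
    return;
}
sub is_exhausted { return !defined $_[0]{next} }
sub value {
    my $self  = shift;
    my $tuple = $self->{next};
    $self->_advance;
    return $tuple ? @$tuple : ();
}

package main;

# A two-table version of the join above, using a subset of the sample data.
my @author      = ( [ 'Alonzo', 'Church' ],     [ 'Stephen', 'Kleene' ] );
my @author_book = ( [ 'Alonzo', '0691029067' ], [ 'Stephen', '0486425339' ] );

my $join = Sketch::Filter->new(
    Sketch::Product->new(
        Sketch::Array->new( \@author ),
        Sketch::Array->new( \@author_book ),
    ),
    # where author.name = author_book.name
    sub { my ($author, $author_book) = @_; $author->[0] eq $author_book->[0] },
);

my @rows;
until ( $join->is_exhausted ) {
    my ($author, $author_book) = $join->value;
    push @rows, join "\t", @$author, $author_book->[1];
}
print "$_\n" for @rows;
```

The filter looks one tuple ahead so that `is_exhausted` stays accurate even when the remaining tuples of the cross product all fail the condition. Note that this approach still visits every tuple of the cross product, so its cost grows with the product of the input sizes.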