Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

I have an application which draws data from different providers, some of them DBIC-based, others file-based, etc. I have all of these data providers nicely wrapped in classes, with roles and collections.

Now I need to query, sort and page the provided data together so that it can be shown to the end user in a sorted, paged, filtered grid. I've looked at several modules on CPAN but can't piece them together into a comprehensive solution.

Basically, I've looked at Data::Page and Data::Bulk, among others, but apparently nothing brings it all together.

For instance, in my program:

    # searching
    for my $provider ( @providers ) {
        $data->add_results(
            $provider->search( query => $query, sorting => { col1 => 'asc' } )
        );
    }

    # group sorting
    $data->sort( sub { return $_[0]->{column} cmp $_[1]->{column} } );

    # paging
    my $rs = $data->list_paged( page => 1, rows => 10 );
    while ( my $row = $rs->next ) {
        ...
    }

Of course, the data provided is humongous and I can't just load it into lists first for splicing.

Am I just nuts trying to do this?

Desperate for enlightenment here. Need a strategy or design pattern to focus on. Any comments are welcome!

--Miguel

Code tags added by GrandFather

Re: data bulk query, sorting and paging
by moritz (Cardinal) on Nov 30, 2009 at 16:28 UTC

    It all depends on what your data looks like, especially how big the datasets are. If they are rather small (like, a few hundred entries) it makes sense to put that functionality into your provider. If not...

    Databases can sort and page, so for larger data sets it makes sense to move that functionality down to the individual provider classes, and provide a generic replacement for those providers that don't support it natively. You can do the sorting with the built-in sort function, and the paging with Data::Pager.
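
A minimal sketch of such a generic replacement, for a provider that can only hand back an unsorted list. The helper name `sort_and_page` and its arguments are hypothetical, and the paging is done here with a plain array slice rather than a pager module. It loads every row into memory, so it only suits the small providers:

```perl
use strict;
use warnings;

# Hypothetical fallback: sort rows in memory, then return one page of them.
sub sort_and_page {
    my (%args) = @_;
    my $rows   = $args{rows};              # array ref of hash refs
    my $column = $args{column};            # column to sort on
    my $page   = $args{page}     || 1;     # 1-based page number
    my $size   = $args{per_page} || 10;    # rows per page

    my @sorted = sort { $a->{$column} cmp $b->{$column} } @$rows;

    my $first = ( $page - 1 ) * $size;
    return [] if $first > $#sorted;        # page is past the end of the data
    my $last = $first + $size - 1;
    $last = $#sorted if $last > $#sorted;  # last page may be short

    return [ @sorted[ $first .. $last ] ];
}

my @rows  = map { +{ name => $_ } } qw(pear apple fig kiwi banana);
my $page2 = sort_and_page( rows => \@rows, column => 'name', page => 2, per_page => 2 );
print join( ',', map { $_->{name} } @$page2 ), "\n";    # prints "fig,kiwi"
```

A database-backed provider would ignore this helper and push the same sort column, page and page size down into its own query.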

      Datasets may run to thousands, or even millions, of rows...

      The problem I have with delegating sorting and querying to providers is that paging gets really tricky.

      my @rows;
      for my $provider ( qw/Provider1 Provider2 Provider3/ ) {
          push @rows, $provider->search( query=>..., sort=>..., page=>1, rows=>10 );
      }
      # now I may end up with 30 rows for page 1
      # but the client grid expects 10, so let's cut it down...
      @rows = @rows[ 0..9 ];

      So, ideally, the "collection manager object" should work with this algorithm:

      1. Ask each provider to search for a certain query and sort
      2. Ask for a row from each provider starting at a given page
      3. Is the page complete? Then call any external filter callbacks.
      4. Did the external filters drop rows? Then keep fetching until the page is complete again.
      5. Sort the combined rows
      6. Return resulting array, but keep track of the current paging state
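
One possible way to realize these steps is sketched below, with one substitution named plainly: instead of fetching a row from each provider and sorting afterwards (steps 2 and 5), it does a k-way merge, always taking the smallest buffered row. It assumes every provider can return an iterator whose rows are already sorted by the same comparison. The names `merge_sorted`, `next_page` and `iterator_for` are hypothetical, and the in-memory iterators stand in for real providers:

```perl
use strict;
use warnings;

# Merge several sorted iterators into one sorted iterator.
# Each source is a code ref returning the next row, or undef when exhausted.
sub merge_sorted {
    my ( $cmp, @sources ) = @_;
    my @heads = map { scalar $_->() } @sources;    # one buffered row per source

    return sub {
        my $best;
        for my $i ( 0 .. $#heads ) {
            next unless defined $heads[$i];
            $best = $i
                if !defined $best || $cmp->( $heads[$i], $heads[$best] ) < 0;
        }
        return unless defined $best;               # every source is exhausted
        my $row = $heads[$best];
        $heads[$best] = $sources[$best]->();       # refill only the consumed source
        return $row;
    };
}

# Pull rows until the page is full, dropping rows the optional filter rejects.
sub next_page {
    my ( $next, $size, $filter ) = @_;
    my @page;
    while ( @page < $size ) {
        my $row = $next->();
        last unless defined $row;
        push @page, $row if !$filter || $filter->($row);
    }
    return \@page;
}

# Stand-in for a provider that already returns its rows in sorted order.
sub iterator_for {
    my @rows = @_;
    return sub { return shift @rows };
}

my $merged = merge_sorted(
    sub { $_[0]{col1} cmp $_[1]{col1} },
    iterator_for( map { +{ col1 => $_ } } qw(a d g) ),
    iterator_for( map { +{ col1 => $_ } } qw(b e h) ),
    iterator_for( map { +{ col1 => $_ } } qw(c f i) ),
);

my $keep  = sub { $_[0]{col1} ne 'b' };            # example external filter
my $page1 = next_page( $merged, 3, $keep );
my $page2 = next_page( $merged, 3, $keep );
print join( ',', map { $_->{col1} } @$page1 ), "\n";    # prints "a,c,d"
print join( ',', map { $_->{col1} } @$page2 ), "\n";    # prints "e,f,g"
```

The merged iterator is the paging state from step 6: calling `next_page` again continues where the previous page stopped, and only one row per provider is held in memory. Jumping straight to a distant page still means reading and discarding everything before it, and with many providers a heap would be a better fit than the linear scan over the buffered rows.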

      --Miguel

      Code tags added by GrandFather

        Please do not use <pre> tags: they usually screw up browser rendering. Please update your reply and, in particular, your original post to use <c> ... </c> tags around code (and output) instead. Please see Markup in the Monastery et al.