in reply to MAX and GROUP BY in DBIx::Class

OP here. I deeply apologize to my brothers for wasting your time with a bad example. My code was, in fact, correct, it was the example that was bad, and you're both absolutely right to have criticized it. I've worked my way around it, but I'm going to explain things here so that there's a record of what I meant. (And for that matter, I'd still be curious what the correct solution is for DBIx::Class.)

This is still simplifying things, but I think I am now doing so correctly. Assume my table has only two columns that are relevant for this question: event_id, which is an autoincrement primary-key column and is guaranteed to be ordered chronologically, and name. I am trying to get the row objects for the most recent event for each name, which (because event_id is ordered chronologically) will be the max value of the event_id column for each name. Thus, if my table is

+----------+-----------+--------+ | event_id | name | other | +----------+-----------+--------+ | 1 | John | foo | | 2 | John | bar | | 3 | Mary | baz | | 4 | John | quux | | 5 | Kate | blah | | 6 | Mary | bling | +----------+-----------+--------+
Then, my query SELECT MAX(event_id), name FROM events GROUP BY name; will (I believe) give me this, which is what I want:
+---------------+------+ | max(event_id) | name | +---------------+------+ | 4 | John | | 5 | Kate | | 6 | Mary | +---------------+------+
My original question is how I can write a query in DBIx::Class to return an array of objects for these rows.

I ended up simply getting a $dbh, and plugging my raw SQL query into this to get an array of event_id values, and then I loop over this array and do a DBIC find() on each event_id to give me the row object I need, which is pretty much instantaneous since it's a primary-key query. I also realize that I should never want to get all the result objects in one array; there's a possibility of a large number of results, and getting them all at once might lead to a memory problem.

(Just to be completely clear about what's going on, I'm using SQL Server; and the reason I specifically need the DBIC object itself is that I'm making use of some other code, which asks for the object. So I have to retrieve the object to pass to this other method.)

Replies are listed 'Best First'.
Re^2: MAX and GROUP BY in DBIx::Class
by ysth (Canon) on Nov 14, 2025 at 03:40 UTC
Re^2: MAX and GROUP BY in DBIx::Class
by vincent_veyron (Beadle) on Nov 26, 2025 at 15:58 UTC

    Not sure I got your needs right, as I don't work with DBIx::Class, but is there any reason you can't use a CTE (common table expression) in your query?

    Something like (for Postgresql) :

    with t1 as (SELECT MAX(event_id) as max_event_id, name FROM events GRO +UP BY name) SELECT * FROM events INNER JOIN t1 ON t1.max_event_id = events.event.i +d;

    that would give you the list of objects directly

    compta.libremen.com : Logiciel libre de comptabilité générale et analytique en partie double

Re^2: MAX and GROUP BY in DBIx::Class
by ysth (Canon) on Nov 14, 2025 at 02:51 UTC
    Update: sorry, I was confused by the third column and didn't read your new query properly.

    Suffers the same problem: there is nothing in that query that tells the server that the name you want is the one from the max id row. Most RDBMS will give an error if you try it, including I'm pretty sure SQL Server.

    That, plus your use of the term auto increment for what SQL Server calls an IDENTITY column, makes me think you are in fact using mysql or mariadb (without the ONLY_FULL_GROUP_BY sql_mode that prevents mistakes like your query, which is on by default in non ancient versions of mysql but sadly many old databases turn off).
Re^2: MAX and GROUP BY in DBIx::Class
by NERDVANA (Priest) on Dec 02, 2025 at 06:11 UTC
    Having fought through nearly that exact situation for SQL Server 2008 (which lacks most modern SQL features) a number of years back, I dug out this code and anonymized it a bit. The end result is that you say
    $schema->resultset("Thing")->search({ ... })->prefetch('last_event')-> +all
    and it generates a query like
    SELECT me.*, thing_max_event.* FROM thing me LEFT JOIN thing_event thing_max_event ON thing_max_event.id = ( SELECT TOP 1 id FROM thing_event WHERE thing_event.thing_id = me.id ORDER BY timestamp DESC ) WHERE ...
    and then you get Thing row objects where you can say $row->last_event and have a full row object of ThingEvent to work with. It's also fairly fast. Maybe not for selecting every single max event in the whole database, but we were usually restricting the Thing collection to a few thousand that we cared about for the specific query. Of course, you also need an index on thing_event(thing_id,timestamp) for it to be fast. Maybe also thing_event(event_type,thing_id,timestamp).

    Start by adding a helper to the ThingEvent resultset class:

    package App::Schema::ResultSet::ThingEvent; sub last_rs { my $self = shift; return $self->search_rs(undef,{ order_by => { -desc => $self->me . 'timestamp' }, rows => 1, # per conversation with ribasushi, this should be safe for a sin +gle row query unsafe_subselect_ok => 1 }); }

    Now add a dynamically generated relationship to Thing:

    package App::Schema::Result::Thing; __PACKAGE__->might_have( last_event => 'MyApp::Schema::Result::ThingEv +ent', sub { my ($rsrc, $us, $them)= @{$_[0]}{'self_resultsource','self_alias',' +foreign_alias'}; my $guard= $rsrc->schema->set_limit_dialect_scope_guard('Top'); # u +se "SELECT TOP x FROM ..." syntax for subquery return { "$them.id" => { '=' => $rsrc->schema->resultset('ThingEvent')->alias('thing_max_even +t')->search({ 'thing_max_event.thing_id' => { -ident => "$us.id" }, })->last_rs->columns(['thing_max_event.id'])->as_query } }; });

    Then you also need that helper that switches to the 'TOP N' limit syntax:

    package App::Schema; =head2 set_limit_dialect_scope_guard { my $dialect_scope= $schema->set_limit_dialect_scope_guard('Top'); # make DBIC searches with "SELECT TOP $n" syntax ... } # back to normal By default, DBIC generates ugly nested queries with a row_id range sel +ected in the outer where clause. This is actually the canonical way to do it o +n SQL Server, but can create all sorts of trouble with namespace between + the different layers of the query. Sometimes for sub-queries it is nice t +o have DBIC generate "SELECT TOP 1 foo FROM bar" instead of the usual mess. +DBIC has a C<< sql_maker->limit_dialect >> option for this, but there is no way + to constrain it to a single statement. This "scope guard" object allows you to control the exact scope in whi +ch you want DBIC to use the alternate dialect without worrying about exceptio +ns. =cut # This scope guard object sets the "limit_dialect" of the storage engi +ne back to whatever # it was originally when the object goes out of scope. {package MSSQL_LimitDialectFlipper; sub new { my ($class, $storage, $value)= @_; my $self= { storage => $storage, prev_value => $storage->sql_mak +er->limit_dialect }; Scalar::Util::weaken($self->{storage}); $storage->sql_maker->limit_dialect($value); bless $self, $class } sub DESTROY { $_[0]{storage}->sql_maker->limit_dialect($_[0]{prev_value}) if $ +_[0]{storage}; } } sub set_limit_dialect_scope_guard { my ($self, $new_dialect_value)= @_; MSSQL_LimitDialectFlipper->new($self->storage, $new_dialect_value); }

    There are also a number of sugar methods littered throughout this; we had some fun with our ResultSet base class. Let me know if you need those recipes.

    Once you get one working, you can make lots of other fun relationships like

    • last_event_of_type1
    • first_event_of_type1
    • last_event_of_type2
    • first_event_of_type2
    and then make all sorts of fun queries like
    $schema->resultset("Thing")->join('last_event_of_type1','last_event_of +_type2') ->where('last_event_of_type1.timestamp' => { '>=', $cutoff1 }, 'last_event_of_type2.timestamp' => { '<', $cutoff2 });