Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have what should be a simple question, but I can't figure out how to do it in DBIx::Class (which I'm using elsewhere in this code, so I need it for this too).

I have a single table, that (for this purpose) contains an ID, a name, and a score (and other data not relevant for this question). I want to get the Result object for the highest score for each name. That's it. So, for the data

+----+-----------+-------+ | id | name | score | +----+-----------+-------+ | 1 | John | 24 | | 2 | John | 60 | | 3 | Mary | 4 | | 4 | John | 10 | | 5 | Kate | 30 | | 6 | Mary | 20 | +---+------------+-------+
I want to get back rows 2 (John's highest score), 5 (Kate's highest (and only) score), and 6 (Mary's highest score).

That is, the SQL would be something simple like SELECT MAX(score), name FROM scores GROUP BY name; (except that I want to get the Result object, not the actual column values).

How can I accomplish this? The docs are pretty sketchy about grouping in general, and I can't find a good parallel anywhere.

Replies are listed 'Best First'.
Re: MAX and GROUP BY in DBIx::Class
by ysth (Canon) on Nov 13, 2025 at 21:19 UTC
    What database & version is this? No, that SQL doesn't do it, since there's nothing that tells it you want specifically the name from the row with the max score. And what do you want in the case of a tie, multiple rows, a list of names, an arbitrary name, a name decided by some tie breaking logic?
Re: MAX and GROUP BY in DBIx::Class
by talexb (Chancellor) on Nov 13, 2025 at 23:04 UTC

    As brother ysth has said, this is database dependent. Generally, you'd order your result set by score DESC; then, in SQL Server, you'd ask for TOP 1 after the select keyword; in Oracle, you'd use ROWNUM <= 1 in the WHERE clause (I think); in MySQL you'd use LIMIT 1 at the end of the query; I can't remember what you'd do in PostgreSQL, maybe LIMIT 1 as well?

    The ugly way to do it would be to get the whole array, then just take the first row, but that goes against the ethos of getting the database to do as much work as possible.

    Alex / talexb / Toronto

    As of June 2025, Groklaw is back! This site was a really valuable resource in the now ancient fight between SCO and Linux. As it turned out, SCO was all hat and no cattle.Thanks to PJ for all her work, we owe her so much. RIP -- 2003 to 2013.

Re: MAX and GROUP BY in DBIx::Class
by tonyc (Hermit) on Nov 15, 2025 at 04:32 UTC
    You can get what you originally asked for with SQL, though since more than one row with the same name can have the same score, you might get more than one row with the same name/score and a different id. Using your data:
    sqlite> select * from scores; 1|John|24 2|John|60 3|Mary|4 4|John|10 5|Kate|30 6|Mary|20 sqlite> select a.id, a.name, a.score from scores a join ( select name, max(score) as "score" from scores group by name ) +b on a.name = b.name and a.score = b.score; 2|John|60 5|Kate|30 6|Mary|20 sqlite> insert into scores values(7, 'John', 60); sqlite> select a.id, a.name, a.score from scores a join ( select name, max(score) as "score" from scores group by name ) +b on a.name = b.name and a.score = b.score; 2|John|60 5|Kate|30 6|Mary|20 7|John|60

    I couldn't figure out has to express that in DBIx::Class.

Re: MAX and GROUP BY in DBIx::Class
by Anonymous Monk on Nov 14, 2025 at 01:37 UTC

    OP here. I deeply apologize to my brothers for wasting your time with a bad example. My code was, in fact, correct, it was the example that was bad, and you're both absolutely right to have criticized it. I've worked my way around it, but I'm going to explain things here so that there's a record of what I meant. (And for that matter, I'd still be curious what the correct solution is for DBIx::Class.)

    This is still simplifying things, but I think I am now doing so correctly. Assume my table has only two columns that are relevant for this question: event_id, which is an autoincrement primary-key column and is guaranteed to be ordered chronologically, and name. I am trying to get the row objects for the most recent event for each name, which (because event_id is ordered chronologically) will be the max value of the event_id column for each name. Thus, if my table is

    +----------+-----------+--------+ | event_id | name | other | +----------+-----------+--------+ | 1 | John | foo | | 2 | John | bar | | 3 | Mary | baz | | 4 | John | quux | | 5 | Kate | blah | | 6 | Mary | bling | +----------+-----------+--------+
    Then, my query SELECT MAX(event_id), name FROM events GROUP BY name; will (I believe) give me this, which is what I want:
    +---------------+------+ | max(event_id) | name | +---------------+------+ | 4 | John | | 5 | Kate | | 6 | Mary | +---------------+------+
    My original question is how I can write a query in DBIx::Class to return an array of objects for these rows.

    I ended up simply getting a $dbh, and plugging my raw SQL query into this to get an array of event_id values, and then I loop over this array and do a DBIC find() on each event_id to give me the row object I need, which is pretty much instantaneous since it's a primary-key query. I also realize that I should never want to get all the result objects in one array; there's a possibility of a large number of results, and getting them all at once might lead to a memory problem.

    (Just to be completely clear about what's going on, I'm using SQL Server; and the reason I specifically need the DBIC object itself is that I'm making use of some other code, which asks for the object. So I have to retrieve the object to pass to this other method.)

      Not sure I got your needs right, as I don't work with DBIx::Class, but is there any reason you can't use a CTE (common table expression) in your query?

      Something like (for Postgresql) :

      with t1 as (SELECT MAX(event_id) as max_event_id, name FROM events GRO +UP BY name) SELECT * FROM events INNER JOIN t1 ON t1.max_event_id = events.event.i +d;

      that would give you the list of objects directly

      compta.libremen.com : Logiciel libre de comptabilité générale et analytique en partie double

      Having fought through nearly that exact situation for SQL Server 2008 (which lacks most modern SQL features) a number of years back, I dug out this code and anonymized it a bit. The end result is that you say
      $schema->resultset("Thing")->search({ ... })->prefetch('last_event')-> +all
      and it generates a query like
      SELECT me.*, thing_max_event.* FROM thing me LEFT JOIN thing_event thing_max_event ON thing_max_event.id = ( SELECT TOP 1 id FROM thing_event WHERE thing_event.thing_id = me.id ORDER BY timestamp DESC ) WHERE ...
      and then you get Thing row objects where you can say $row->last_event and have a full row object of ThingEvent to work with. It's also fairly fast. Maybe not for selecting every single max event in the whole database, but we were usually restricting the Thing collection to a few thousand that we cared about for the specific query. Of course, you also need an index on thing_event(thing_id,timestamp) for it to be fast. Maybe also thing_event(event_type,thing_id,timestamp).

      Start by adding a helper to the ThingEvent resultset class:

      package App::Schema::ResultSet::ThingEvent; sub last_rs { my $self = shift; return $self->search_rs(undef,{ order_by => { -desc => $self->me . 'timestamp' }, rows => 1, # per conversation with ribasushi, this should be safe for a sin +gle row query unsafe_subselect_ok => 1 }); }

      Now add a dynamically generated relationship to Thing:

      package App::Schema::Result::Thing; __PACKAGE__->might_have( last_event => 'MyApp::Schema::Result::ThingEv +ent', sub { my ($rsrc, $us, $them)= @{$_[0]}{'self_resultsource','self_alias',' +foreign_alias'}; my $guard= $rsrc->schema->set_limit_dialect_scope_guard('Top'); # u +se "SELECT TOP x FROM ..." syntax for subquery return { "$them.id" => { '=' => $rsrc->schema->resultset('ThingEvent')->alias('thing_max_even +t')->search({ 'thing_max_event.thing_id' => { -ident => "$us.id" }, })->last_rs->columns(['thing_max_event.id'])->as_query } }; });

      Then you also need that helper that switches to the 'TOP N' limit syntax:

      package App::Schema; =head2 set_limit_dialect_scope_guard { my $dialect_scope= $schema->set_limit_dialect_scope_guard('Top'); # make DBIC searches with "SELECT TOP $n" syntax ... } # back to normal By default, DBIC generates ugly nested queries with a row_id range sel +ected in the outer where clause. This is actually the canonical way to do it o +n SQL Server, but can create all sorts of trouble with namespace between + the different layers of the query. Sometimes for sub-queries it is nice t +o have DBIC generate "SELECT TOP 1 foo FROM bar" instead of the usual mess. +DBIC has a C<< sql_maker->limit_dialect >> option for this, but there is no way + to constrain it to a single statement. This "scope guard" object allows you to control the exact scope in whi +ch you want DBIC to use the alternate dialect without worrying about exceptio +ns. =cut # This scope guard object sets the "limit_dialect" of the storage engi +ne back to whatever # it was originally when the object goes out of scope. {package MSSQL_LimitDialectFlipper; sub new { my ($class, $storage, $value)= @_; my $self= { storage => $storage, prev_value => $storage->sql_mak +er->limit_dialect }; Scalar::Util::weaken($self->{storage}); $storage->sql_maker->limit_dialect($value); bless $self, $class } sub DESTROY { $_[0]{storage}->sql_maker->limit_dialect($_[0]{prev_value}) if $ +_[0]{storage}; } } sub set_limit_dialect_scope_guard { my ($self, $new_dialect_value)= @_; MSSQL_LimitDialectFlipper->new($self->storage, $new_dialect_value); }

      There are also a number of sugar methods littered throughout this; we had some fun with our ResultSet base class. Let me know if you need those recipes.

      Once you get one working, you can make lots of other fun relationships like

      • last_event_of_type1
      • first_event_of_type1
      • last_event_of_type2
      • first_event_of_type2
      and then make all sorts of fun queries like
      $schema->resultset("Thing")->join('last_event_of_type1','last_event_of +_type2') ->where('last_event_of_type1.timestamp' => { '>=', $cutoff1 }, 'last_event_of_type2.timestamp' => { '<', $cutoff2 });
      Update: sorry, I was confused by the third column and didn't read your new query properly.

      Suffers the same problem: there is nothing in that query that tells the server that the name you want is the one from the max id row. Most RDBMS will give an error if you try it, including I'm pretty sure SQL Server.

      That, plus your use of the term auto increment for what SQL Server calls an IDENTITY column, makes me think you are in fact using mysql or mariadb (without the ONLY_FULL_GROUP_BY sql_mode that prevents mistakes like your query, which is on by default in non ancient versions of mysql but sadly many old databases turn off).