Be part of JetBrains PHPverse 2026 on June 9 – a free online event bringing PHP devs worldwide together.

secondman's avatar

Getting only the latest record in query

I have the following query that I can't seem to get only the latest record from.

$date = Carbon::createFromDate(2021, 2, 8)->format('Y-m-d H:i:s');

$items = Model::whereDate('published_at', '<', $date)
    ->whereNull('removed_at')
    ->whereNotNull('example_id')
    ->where('filled', false)
    ->orderBy('example_id', 'asc')
    ->get();

I need to get only the latest example_id from what can be many entries with the same example_id and different data. I tried latest('published_at') in a couple different places but it still returns the entire collection of around 3000 entries.

Where am I going wrong?

0 likes
20 replies
tykus's avatar

If you want only one record, then you would use first rather than get. To get the latest of a particular column, you order descending (or use latest('column_name'):

$items = Model::whereDate('published_at', '<', $date)
    ->whereNull('removed_at')
    ->whereNotNull('example_id')
    ->where('filled', false)
    ->orderByDesc('example_id')
    ->first();
secondman's avatar

No, I only want the latest record of all the records with the same example_id not just a single record.

tykus's avatar

Ah! Latest by which column?

secondman's avatar

published_at

I've tried this:

$items = Model::whereDate('published_at', '<', $date)
    ->whereNull('removed_at')
    ->whereNotNull('example_id')
    ->where('filled', false)->where(function($query) {
        return $query->latest('published_at')->first();
    })
    ->orderBy('example_id', 'asc')
    ->get();

and this:

$items = Model::whereDate('published_at', '<', $date)
    ->whereNull('removed_at')
    ->whereNotNull('example_id')
    ->where('filled', false)
    ->latest('published_at')
    ->orderBy('example_id', 'asc')
    ->get();

all return the same collection with duplicate example_id entries. I need only the latest example_id by published_at date for all rows that have the same example_id.

tykus's avatar

This is a greatest-n-per-group problem, which is pretty common. I don't have a similar data to hand to test the following, but it should be the beginnings of, if not, your solution

Model::query()
// greatest-n-per-group
	->select('a.*')
	->from('table_name as a')
	->leftJoin('table_name as b', function ($join) {
		$join->on('a.id', '=', 'b.id')
			->whereColumn('a.published_at', '>', 'b.published_at');
	})
	->whereNull('b.id')
// the remaining constraints
	->whereDate('a.published_at', '<', $date)
    ->whereNull('a.removed_at')
    ->whereNotNull('a.example_id')
    ->where('a.filled', false)
    ->orderBy('a.example_id')
    ->get()

I don't like that the set will be potentially large before your existing constraints are added, so some room from improvement.

secondman's avatar

Good lord.

So I'm joining the table on itself ...

what is the

->whereNull('b.id')

doing?

b.id will never be null.

secondman's avatar

It doesn't seem to be constraining the date as designed.

Echoing the first 21 rows we have lots of duplicates:

example_id: 10876 unique_id: 20002 
example_id: 10877 unique_id: 20003 
example_id: 10878 unique_id: 20004 // nope
example_id: 10878 unique_id: 20005 
example_id: 10880 unique_id: 20007 // nope
example_id: 10880 unique_id: 20008 
example_id: 10881 unique_id: 20009 
example_id: 10885 unique_id: 20016 // nope
example_id: 10885 unique_id: 20849 // nope
example_id: 10885 unique_id: 20876 
example_id: 10886 unique_id: 20017 // nope
example_id: 10886 unique_id: 20851 // nope
example_id: 10886 unique_id: 20877 
example_id: 10887 unique_id: 20018 // nope
example_id: 10887 unique_id: 20852 // nope
example_id: 10887 unique_id: 20878 
example_id: 10888 unique_id: 20019 // nope
example_id: 10888 unique_id: 20856 // nope
example_id: 10888 unique_id: 20881 
example_id: 10889 unique_id: 20020 
example_id: 10890 unique_id: 20021 

Not sure how to even work on this

petrit's avatar

If you want to get a collection with one entry in it, than use limit.

$items = Model::whereDate('published_at', '<', $date)
    ->whereNull('removed_at')
    ->whereNotNull('example_id')
    ->where('filled', false)
    ->latest('published_at')
    ->orderBy('example_id', 'asc')
    ->limit(1)
    ->get();
secondman's avatar

Nope.

You're not reading the question correctly.

Dhakalsandeep's avatar

I tried this in my table to implement your logic and it work fine for me please check :

$report_ids = DB::table("reports as r")
        ->where(DB::raw("date(created_at)"),'<','2021-5-1')
        ->where('center','<>','')
        ->selectRaw('distinct center,(select id from reports where center = r.center order by id desc limit 1) as id')
        ->pluck('id');

    $reports = Report::find($report_ids);
    dd($reports);

The logic here is to get relevant array of id first and then get the model.

secondman's avatar

Can you translate that into my column names so I can understand what is happening? I have no idea what your code is doing.

Thanks

Dhakalsandeep's avatar
$date = Carbon::createFromDate(2021, 2, 8)->format('Y-m-d H:i:s');

    $itemIds = DB::table("models as m")
        ->where('published_at', '<', $date)
        ->whereNull('removed_at')
        ->whereNotNull('example_id')
        ->where('filled', false)
        ->selectRaw('distinct example_id,(select id from models where example_id = m.example_id order by published_at desc limit 1) as id')
        ->orderBy('example_id')
        ->pluck('id');

    $items = Model::find($itemIds);
    dd($items);

Note: I assume that if your Model name is Model then your table name is "models".

Also is your "published_at" a timestamp?? if not use

order by id desc limit 1

in subquery.

1 like
rodrigo.pedra's avatar
Level 56

Try this:

    $date = Carbon::createFromDate(2021, 2, 8)->format('Y-m-d H:i:s');

    $items = Model::query()
        ->whereIn(
            DB::raw('(example_id, published_at)'),
            Model::query()
                ->select('example_id')
                ->selectRaw('MAX(published_at)')
                ->groupBy('example_id')
                ->whereDate('published_at', '<', $date)
                ->whereNull('removed_at')
                ->whereNotNull('example_id')
                ->where('filled', false)
        )
        ->orderBy('example_id')
        ->get();

If you don't want to use the DB façade, use this:

    $date = Carbon::createFromDate(2021, 2, 8)->format('Y-m-d H:i:s');

    $items = Model::query()
        ->whereIn(
            new Expression('(example_id, published_at)'),
            Model::query()
                ->select('example_id')
                ->selectRaw('MAX(published_at)')
                ->groupBy('example_id')
                ->whereDate('published_at', '<', $date)
                ->whereNull('removed_at')
                ->whereNotNull('example_id')
                ->where('filled', false)
        )
        ->orderBy('example_id')
        ->get();

And add this import:

use Illuminate\Database\Query\Expression;

There was a recent PR to add a ->whereTupleIn() method that would avoid using the raw expression in WHERE IN, but it got rejected,

1 like
rodrigo.pedra's avatar

You're welcome!

After looking at @dhakalsandeep solution, the proposed solution is very similar, albeit their solution is using two queries whereas mine is using a single one.

But the idea is exactly the same.

Therefore, if you want to mark that response as the best one, be my guest.

One thing to note: if two, or more, records match all the criteria (example_id, filled, removed_at) and share the exact same published_at date and time, you could have duplicate records for the same example_id if this published_at value is the most recent one.

Depending of the database you are using you can mitigate that. If you are using MySQL 8, I can show you an example.

1 like
rodrigo.pedra's avatar

Example for MySQL 8, avoiding duplicates with matching criteria:

$date = Carbon::createFromDate(2021, 2, 8)->format('Y-m-d H:i:s');

$items = Model::query()
    ->select((new Model())->qualifyColumn('*'))
    ->joinSub(
        Model::query()
            ->select('id')
            ->selectRaw('ROW_NUMBER() OVER (PARTITION BY example_id ORDER BY published_at DESC) AS rowid')
            ->whereDate('published_at', '<', $date)
            ->whereNull('removed_at')
            ->whereNotNull('example_id')
            ->where('filled', false),
        'limiter',
        function (JoinClause $join) {
            $join->on((new Model())->getQualifiedKeyName(), '=', 'limiter.id');
            $join->where('limiter.rowid', 1);
        })
    ->orderBy('example_id')
    ->get();

You can replace:

  • (new Model())->qualifyColumn('*') by table_name.*
  • (new Model())->getQualifiedKeyName() by table_name.id

I used these functions as I am assuming your model is actually named something else, and as such I can't guess its table name.

This should also work for other RDBMS that support window functions (tested with SQLite and it also works).

1 like
secondman's avatar

This whole process is to backfill missing information to an API my company uses that's built on Sharepoint (of all things) so it's basically a single use function.

I built a command that will function on the scheduler and take small chunks like so:

foreach (Example::query()
    ->whereIn(
        new Expression('(example_id, published_at)'),
        Example::query()
            ->select('example_id')
            ->selectRaw('MAX(published_at)')
            ->groupBy('example_id')
            ->whereDate('published_at', '<', $this->date)
            ->whereNull('removed_at')
            ->whereNotNull('example_id')
            ->where('filled', false)
    )
    ->orderBy('example_id')
    ->take(50)
    ->cursor() as $record) {

    $data = new ExampleResource($record);

    $this->service->backfill([
        'numeric_id' => $record->id,
        'example_id'  => $record->numeric_id,
        'data' => json_encode($data),
    ]);
}

$this->service being the API service class that sends the data down.

It has to be done in small chunks because the amount of data from the ExampleResource is very large.

In the service I get the response back and then update the filled column so it no longer gets selected.

Using your query I dropped from sending 2990 records to only 2145 which is great!

But alas once this process has been completed this code will be removed as the backfill process won't be needed in the future. The worst part, I set the system up to send this data (that's being backfilled now) from the start, but it was rejected ... then 8 months and thousands of records later .... "Oh we need all that data from before."

Aggravating.

Thanks again for your assistance.

1 like
rodrigo.pedra's avatar

Thanks!

I spent around 10 years working on the financial industry, on business units not IT, so we had a lot of limitations from IT on what we could install in our computers.

So basically I only had VBA and SQL Server around, had to learn some tricks to get the work done =)

1 like

Please or to participate in this conversation.