Be part of JetBrains PHPverse 2026 on June 9 – a free online event bringing PHP devs worldwide together.

libertie's avatar

Using Scout on an Index Page with Sorting... Bad Idea?

My Laravel application has index pages for around twenty models, the largest of which has a million records. Each page shows model columns and allows the user to sort by any column. Users can additionally search by keyword. I presently accomplish this search with a sql LIKE operator.

I'm interested in replacing sql searching with Scout Typesense, but I want to maintain sorting on all model fields. The Typesense documentation states that:

"Sorting on a string field requires the construction of a separate index that can consume a lot of memory for long string fields (like description) or in large datasets. So, care must be taken to enable sorting on only relevant string fields."

All of my models have name fields, and many have at one to two additional varchar or text columns that must be sortable. Is my use case inappropriate for Scout Typesense? Is there another technique that I should be using for fuzzy search on an index page?

I'd appreciate any feedback or ideas!

0 likes
9 replies
LaryAI's avatar

Great question! Here’s a breakdown of your situation and some recommendations:

1. Scout + Typesense for Search

Laravel Scout with Typesense is an excellent choice for full-text and fuzzy searching, especially when you have large datasets and want fast, typo-tolerant search. It will outperform SQL LIKE queries in both speed and relevance.

2. Sorting in Typesense

Typesense supports sorting, but as you noted, sorting on string fields (especially long ones) can be memory-intensive. For fields like name or short titles, enabling sorting is generally fine. However, for long text fields (like descriptions), it’s best to avoid enabling sorting.

Typesense best practices:

  • Enable sorting only on fields that users actually need to sort by.
  • Avoid enabling sorting on large text fields.
  • For each sortable field, set "sort": true in your Typesense schema.

Example schema for a model:

{
  "name": "products",
  "fields": [
    {"name": "id", "type": "int32"},
    {"name": "name", "type": "string", "sort": true},
    {"name": "category", "type": "string", "sort": true},
    {"name": "description", "type": "string"} // not sortable
  ]
}

3. Hybrid Approach: Search with Typesense, Sort with SQL

If you need to sort by any column (including those not indexed in Typesense), consider a hybrid approach:

  • Search with Typesense to get relevant IDs.
  • Fetch and sort results with Eloquent using those IDs, applying SQL sorting.

Example:

// 1. Search with Typesense (via Scout)
$ids = Model::search($query)->keys(); // returns array of IDs

// 2. Fetch and sort in SQL
$results = Model::whereIn('id', $ids)
    ->orderBy($sortField, $sortDirection)
    ->paginate(20);

This way, you get fuzzy search and can sort by any column, but you only sort a small subset of IDs, not the entire table.

4. Alternatives

  • Meilisearch: Similar to Typesense, with good support for sorting and fuzzy search.
  • Elasticsearch: More complex, but very powerful for advanced search and sorting needs.

5. Summary

  • Scout + Typesense is appropriate for your use case, as long as you’re careful about which fields are sortable.
  • For sorting on all fields, use a hybrid approach: search with Typesense, then sort with SQL.
  • Avoid enabling sorting on large text fields in Typesense.

References:

Let me know if you’d like code examples for a specific model or more details on the hybrid approach!

EthelMcKain's avatar

That's a fascinating challenge! While Typesense offers incredible speed, relying on it solely for sorting every field might introduce some unexpected complexity, especially with a million records. Could a hybrid approach work? Using Typesense primarily for search, and then selectively employing SQL for sorting certain frequently-used columns might balance performance and maintainability. Or, perhaps investing time in optimizing your SQL queries further is the more prudent move, avoiding the complexities of a dual-system implementation. Ultimately, the best choice depends on your specific performance bottlenecks and tolerance for added complexity.

1 like
libertie's avatar

@EthelMcKain Thanks for the thoughtful response! The hybrid approach is really interesting, but would result in situations where MySQL was being passed hundreds of thousands of IDs in a WHERE clause. Surely that will not perform well?

I think you're right about the need for investigating specific performance bottlenecks. I'll play around with a few approaches and see what works!

Tray2's avatar

Call me old school, but I would probably stay with SQL on this one, with the correct types of indexes, the query should be fast enough.

Some thing to consider.

  1. Can you get away with just using the wildcard on the right side column1 LIKE 'My value%.
  2. If you need the wildcard on both sides, then enable full text search. https://dev.mysql.com/doc/refman/8.4/en/fulltext-search.html
  3. Limit the sorting to as few columns as possible, since you don't want to index all columns.
  4. Don't just create indexes for the columns, just wait until you need them, that way you don't create unnecessary ones.
  5. Consider subscribing, and watch this series about eloquent performance. https://laracasts.com/series/eloquent-performance-patterns
  6. Install something like Debugbar or Clockwork to measure the performance, and use EXPLAIN on your SQL to determine the bottle necks.
1 like
libertie's avatar

@Tray2 I appreciate the reminder that a lot can be accomplished using SQL! Ultimately, I do want fuzzy search, but the considerations and suggestions you outlined are great. Thank you.

libertie's avatar

Grateful for the helpful feedback!

I found some great information in the Typesense docs that helps me to think about performance / system requirement tradeoffs when using the search engine:

If the size of your dataset (only including fields you want to search on) is X MB, you'd typically need 2X-3X MB RAM to index the data in Typesense.
For example: If your dataset size is 5GB, and you want to search on a subset of the fields and the size of that subset of fields is 1GB, then you'd need between 2GB - 3GB RAM to hold the search indices in memory.
You can still store unindexed fields in Typesense (for eg: fields you'd only want for display purposes) - these unindexed fields will be stored on disk and not count towards RAM usage.

I'm not sure how this is impacted by creating sortable string fields, but will ask on the project GitHub page.

jlrdw's avatar

You need to query only whats needed and paginate. Why would every column need a sort. I agree with @tray2 and to add pick the main columns to ORDER BY and have queries for those. But suggestions only.

Please or to participate in this conversation.