spacedog4's avatar

Dealing with huge tables and pagination

I have a table with more then 16000 records (see the update), I need to show it on a list and paginate, but even with simplePagination it takes 15s to query it, is there a better way to paginate it?

Important thing is that I cant select the first 100 and page 2 the next 100 cause I have filters

I'm using a google compute engine and sql with 1vCPU and 3,7GB ram

Update

I forgot to mention that I'm making some relationships, each record can have 6 to 9 relations, so I might be selectiong 100.000 to 200.000 records

My SQL

Here is my sql, the tables are in portuguese so it might be difficult to read

I'm dealing with 4 tables

pedidoproducao has 17567 records

pedidoproducaoetapas has 19360 records

pedidoproducaoocorrencia has just 1 record

pedidoproducaoderivado has 17559 records

produtos has 2301 records

I don't know why the laravel don't built it on sql but I'm making some relationships with with that are not in sql, like etapa_producao and ocorrencia

$query->with('etapas');
$query->with(['etapas.etapa_producao' => function ($q) {
    $q->select('codigoempresa', 'codigoetapasproducao', 'etapa');
}]);
$query->with('etapas.ocorrencias.ocorrencia');
SELECT `pedidoproducao`.*,
  `pedidoproducao`.`hora` AS `hora_previsao`,
  `pedidoproducaoetapas`.`previsao`,
  DATE_ADD(pedidoproducao.data, INTERVAL SUM(pedidoproducaoetapas.previsao) DAY) AS data_previsao,

  (SELECT SUM(previsao) AS previsao
   FROM `pedidoproducaoetapas`
   WHERE `pedidoproducao`.`codigoempresa` = `pedidoproducaoetapas`.`codigoempresa`
         AND `pedidoproducao`.`codigofilial` = `pedidoproducaoetapas`.`codigofilial`
         AND `pedidoproducao`.`codigopedidoproducao` = `pedidoproducaoetapas`.`codigopedidoproducao`) AS `previsao`,

  (SELECT count(*)
   FROM `pedidoproducaoocorrencia`
   WHERE `pedidoproducao`.`codigoempresa` = `pedidoproducaoocorrencia`.`codigoempresa`
         AND `pedidoproducao`.`codigofilial` = `pedidoproducaoocorrencia`.`codigofilial`
         AND `pedidoproducao`.`codigopedidoproducao` = `pedidoproducaoocorrencia`.`codigopedidoproducao`) AS `ocorrencia`
FROM `pedidoproducao`
  INNER JOIN `pedidoproducaoderivado` ON (`pedidoproducaoderivado`.`codigoempresa` = `pedidoproducao`.`codigoempresa`
                                          AND `pedidoproducaoderivado`.`codigofilial` = `pedidoproducao`.`codigofilial`
                                          AND `pedidoproducaoderivado`.`codigopedidoproducao` = `pedidoproducao`.`codigopedidoproducao`)
  INNER JOIN `produtos` ON (`produtos`.`codigoempresa` = `pedidoproducaoderivado`.`codigoempresa`
                            AND `produtos`.`codigopro` = `pedidoproducaoderivado`.`codigopro`)
  LEFT JOIN `pedidoproducaoetapas` ON (`pedidoproducao`.`codigoempresa` = `pedidoproducaoetapas`.`codigoempresa`
                                       AND `pedidoproducao`.`codigopedidoproducao` = `pedidoproducaoetapas`.`codigopedidoproducao`)
WHERE EXISTS
      (SELECT *
       FROM `pedidoproducaoetapas`
       WHERE `pedidoproducao`.`codigoempresa` = `pedidoproducaoetapas`.`codigoempresa`
             AND `pedidoproducao`.`codigofilial` = `pedidoproducaoetapas`.`codigofilial`
             AND `pedidoproducao`.`codigopedidoproducao` = `pedidoproducaoetapas`.`codigopedidoproducao`)
      AND `pedidoproducao`.`codigoempresa` = ?
      AND `pedidoproducao`.`codigofilial` = ?
      AND `pedidoproducao`.`status` = ?
GROUP BY `pedidoproducao`.`codigopedidoproducao`
ORDER BY `pedidoproducao`.`updated_at` DESC,
  `pedidoproducao`.`codigopedidoproducao` ASC

Notice

If is there a way I can do this with cache, cause this page is an history of items that was completed, but I don't know how cache works, I can cache the query and update the cache every time a new item is completed?

0 likes
24 replies
spacedog4's avatar

@PAPA - I don't know what would be Database plugin, but yes I use ajax calls

martinbean's avatar

@spacedog4 ~15 seconds to pull a subset of records sounds far too long. Your table is either very poorly defined, or you have an unoptimised query/queries.

Try using either the Laravel Debugbar or Laravel Telescope to inspect the queries you’re running. You may have an N+1 problem. If you don’t, then you may need to look at your database schema and how to better optimise it.

2 likes
spacedog4's avatar

@MARTINBEAN - i tried to use the laravel debugbar, but it bugs with my project, I have differents guards and providers, also this database wasn't design by me, it's poor, have many primary keys, some tables has 4 or 5 primary keys, so I needed to use comboships for relation

but maybe this 15 seconds its caused by this poorly database than, any advice that I could improve without rebuilding everything?

jlrdw's avatar

16000 is not huge, as a test have you exported all to a test db so you can play with optimizing.

16000 paginated should happen in a blink of the eye.

1 like
spacedog4's avatar

@JLRDW - I'll try that, acctualy, in this particular case, I'm dealing with 16000 records, an relationships, I forgot to mension that, each record can have at least 6 realations that I'm making join so, it can be 100.000 records, even so I think it's not so much

I updated my question with the sql

Snapey's avatar

Looking at your SQL, you need to consider using explain to understand the query and where full table scans are being performed.

Don't convince yourself that 16000 is a lot of records.

jlrdw's avatar

@SPACEDOG4 - Can't you just query a subset of data, I can't see paginating that much.

Example, a customer table, last name starts with wh, so you enter wh in a search field, and query for only last names what start with wh to find and view "Bob White". Just example here.

Does someone actually page through all that data.

You realize a group by has to transverse all each time, right. I'd look at rewriting this and narrow by dates or something.

If it's a report only ran every month, and in house only, export to MS access via odbc and do the reporting in access.

Tray2's avatar

The first thing I'd do is add some indexes to those tables and I think you could shave of at least 50% of the time there. You can run explain plan to see where it uses full table access and add the indexes there. You will also probably see which part of the query takes most of the time and then you can put that last so you have the smallest possible selection to run it on.

I would also make sure that any carthesian products are removed since they are usually bad for performance.

spacedog4's avatar

@SNAPEY - I tried that but the SQL that Laravel generates gives me this error in PHPStorm database manager and Navicat

1055 - Expression #10 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'maxinutri.pedidoproducaoetapas.previsao' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
jlrdw's avatar

In database config file change strict to false.

1 like
spacedog4's avatar

@JLRDW - That's a list of a production history, http://prntscr.com/nb5mnz

And its also slow in the production screen http://prntscr.com/nb5npi

I need to bring somethings in the query, like that date_add where I get the date of the production and sum with the prediction days so I can order by that sum, and others things

spacedog4's avatar

@JLRDW - I manage multiple database, it is already setted to false

class DatabaseConnection {

    public static function setConnection($env_connection)
    {
        $env_connection = strtoupper($env_connection);

        config(['database.connections.mysql' => [
            'driver'   => 'mysql',
            'host'     => env("DB_{$env_connection}_HOST"),
            'database' => env("DB_{$env_connection}_DATABASE"),
            'username' => env("DB_{$env_connection}_USERNAME"),
            'password' => env("DB_{$env_connection}_PASSWORD"),
            'strict' => false,
        ]]);

        DB::reconnect('mysql');
    }
}

my issue is running this code in sql softwares, I can't find a way to set strict in phpstorm or navicat

jlrdw's avatar

Add this to my.cnf or ini

[mysqld]  // under this
sql_mode = "STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"

Restart server.

1 like
Snapey's avatar

As suggested, install Debugbar. This will also show you the actual query syntax being used.

spacedog4's avatar

@SNAPEY - I tried, but it crashs with my application, I'm using different databases, guards, providers, multiple primary keys

spacedog4's avatar

@JLRDW - I configured this in google cloud sql and worked, I'll see the results and analyze, thanks

Snapey's avatar

ok. Thats strange since it should not care?

How do you have multiple primary keys - isn't that a contradiction?

spacedog4's avatar

I used explain, and that is the results http://prntscr.com/nb63ti I don't know what I can improve here

I'm doing some research about how to analyze and improve the sql with explain

spacedog4's avatar

@SNAPEY - I searched about the errors, but I could find any similar errors, it wasnt rendering when I'm logged, I tried to render by my self, but it didn't work neither, and was giving a simple error in log file "Array to string conversion" in debug bar, but It didn't say the line, nothing

jlrdw's avatar

On the ONLY_FULL_GROUP_BY

You only remove that, above was example. On your system, you login to mysql and run

SELECT @@sql_mode;

Gives something like:

ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION

Then you take out ONLY_FULL_GROUP_BY, BACKUP this line somewhere.

Now in my.ini under [mysqld] you put your new line, mine was:

sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION

I think some systems need parenthesis, my.ini does not. Linux is my.cnf I think.

1 like
spacedog4's avatar

I made it in a different way with google sql, but it worked, now, I made same research and I dont see what can I improve in my sql, how you can see above a posted a link to the results of the explain, and all the rows are with a key and potencial key, just two of then has too many rows, but how can I decrease it?

spacedog4's avatar
spacedog4
OP
Best Answer
Level 18

I've made some research and how my query is constantly searched by codigoempresa, codigofilial, codigopedidoproducao and status creating an index of these three columns and another for pedidoproducaoetapas improved 10x my query, now, it takes 1.3 seconds, thank you guys

ALTER TABLE `pedidoproducao` ADD INDEX `pedidoproducao_idx_codigoe_codigof_status_codigop` (`codigoempresa`,`codigofilial`,`status`,`codigopedidoproducao`);

ALTER TABLE `pedidoproducaoetapas` ADD INDEX `pedidoproducaoetapas_idx_codigoempresa_codigopedidopro` (`codigoempresa`,`codigopedidoproducao`);
1 like

Please or to participate in this conversation.