cristian9509's avatar

Slow queries/connections with Eloquent, always over 1600ms

I have two Lumen apps in production (both AWS, EC2) and from time to time I see random queries that take exactly over 1600ms. I am using DB::listen to log the query including the time it takes to complete. I've been looking into the Eloquent code but I don't find anything. Maybe the reconnect if someone the connection was lost could cause a delay but why is it always over 1600ms? It just does not make any sense.

Has anyone seen this before or would anyone have ideas what to look for?

Update: This is not about a single query, it's pretty much any query that runs on the app. The numbers of rows retrieved could 0, 1 or 15 (the pagination limit).

0 likes
14 replies
jlrdw's avatar

You don't mention how many records are being retrieved.

cristian9509's avatar

@jlrdw I've updated the question. Basically it's happening for all queries in the app and the rows returned are 0, 1, or 15.

jlrdw's avatar

Can you show a slow query code and how many total count. Not paginated, but the total if you were not paginating.

1 like
cristian9509's avatar

@CLICK - Usually the query that is slow with over 1600ms is the first one in the request. Not necessary the first query in a while. Also, I've noticed recently that the issue only occurs when having some load but not necessarily a very high one.

I tried turning query cache OFF and tested and can still see the 1600ms queries.

Also, at one point I was turning on profiling SET profiling = 1 and then running the query. It came back that the query to enable the profiling was taking more than 1600ms. I am almost inclined that this might be more related to the connection part that is done before the first query in the request.

click's avatar

@cristian9509 it looks like it. If the connection setup would be it you should be able to reproduce it by only making a connection without doing a db query.

Try to reproduce it with:

<?php
$start = microtime(true);
$servername = "127.0.0.1"; // use exact same config as you currently use
$username = "user";
$password = "secret";

// Create connection
$conn = new mysqli($servername, $username, $password);

// Check connection
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
}
echo "Connected successfully in ". round(microtime(true) - $start, 5) .'s';

And is your DB on another server than your app? Could it have something todo with the connection between the servers (DNS?). How do you connect to the DB? Via an IP or a hostname?

1 like
cristian9509's avatar

@CLICK - So I've been doing what you suggested. Instead of mysqli I used PDO. It looks like the issue is 100% happening on DB connection.

My test was to just add a new endpoint, grab the pdo object from the db manager which will create the pdo connection and hit the endpoint with loader.io just to put it under some load. I've immediately started to see the same 1600ms + a little for making a DB connection. Also, recently (not after this test) I've seen the connection taking even 10-15 seconds -- rarely but still it's happening.

My app is on an EC2 instance while my DB is on AWS Aurora. I connect to it via the DNS name provided by AWS.

click's avatar

Ok interesting, so you are at least a little bit closer.

What does pinging the DB server from within your app server do via de DNS name you use to connect? I'm do not know the ins and outs of AWS but you can't connect via an IP to the DB?

1 like
Xsecrets's avatar

yeah I don't know about aurora, but accessing the db from dns could cause the problems you are describing. because there will need to be a network call to get the ip for the db. if you can't use ip you could try editing the /etc/hosts file and adding that domain name in there to eliminate the network call and see if that helps.

cristian9509's avatar

@xsecrets I think you are totally right. I just ran some tests where all I was doing was resolving the Mysql hostname to it's IP address over and over and I immediately saw the delays I described above. Also found this doc https://docs.aws.amazon.com/vpc/latest/userguide/vpc-dns.html (see DNS Limits) and I think this is what I am encountering.

However, you solution might not actually work since the RDS IPs can change based on different factors (I think availability zone would be one) and thus just pointing a specific IP to the hostname might not work. Not sure what else can be done in this case.

kobear's avatar
kobear
Best Answer
Level 4

@CRISTIAN9509 - Some things you could try in AWS:

  1. Move the EC2 instance and the RDS into their own VPC. That should give you more control over the IP addressing.

  2. On your EC2 instance, setup a resolving/caching DNS bind server. It will cache responses from the AWS DNS servers so it is faster, but receive zone updates when the TTL expires on the DNS record.

kobear's avatar

@Curious if you have tried either of my suggestions or if you found something else.

cristian9509's avatar

@kobear went with #2. Did some tests with and without a dns cache and the results were very good. Still work in progress but I think this fixes the issue. Plus if before the dns resolving took between 2 to 25ms on average and with the 1.6s random delays from the tests I did now it should move to microseconds so pretty much average of less than 1ms. We’ll have to see how that works in production rather than in a few tests.

Thanks.

We used this approach https://aws.amazon.com/premiumsupport/knowledge-center/dns-resolution-failures-ec2-linux/

1 like

Please or to participate in this conversation.