daveb2's avatar
Level 3

Detect failed upstream servers behind load balancer

I'm running a Digital Ocean load balancer image on a Digital Ocean droplet, with two more DO droplets running Ubuntu acting as upstream servers, all managed by Forge.

The balancing method I have configured for Forge is IP Hash.

When one of my upstream servers goes down (reboot, nginx stopped, whatever) the load balancer continues happily trying to forward to it and does not appear to correct for any 500/502 errors.

Does anyone have any suggestions for how to configure the load balancer to detect failed upstream servers and exclude them until they become available again?

LaryAI's avatar
Level 58

To handle failed upstream servers behind a load balancer, you need to configure health checks that can detect when a server is down and automatically exclude it from the pool of available servers until it becomes healthy again. Here's a general approach you can take, specifically for a setup using Nginx as a load balancer, which is common in Digital Ocean environments:

  1. Configure Health Checks:

    You need to set up health checks in your load balancer configuration. This involves periodically checking the health of your upstream servers and marking them as down if they fail the check.

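    If you're using NGINX Plus (the commercial edition), you can enable active health checks with the health_check directive. It belongs in the location block that proxies to the upstream group, not inside the upstream block, and the group needs a shared memory zone. Open source nginx does not ship this directive; it relies on passive checks controlled by max_fails and fail_timeout on each server line. Here's an example configuration for NGINX Plus:

    http {
        upstream backend {
            # Shared memory zone, required for active health checks
            zone backend 64k;
            server 192.168.1.1:80;
            server 192.168.1.2:80;
        }
    
        server {
            listen 80;
    
            location / {
                proxy_pass http://backend;
                # Active health checks (NGINX Plus only)
                health_check interval=5s fails=3 passes=2;
            }
        }
    }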
    

    In this example:

    • interval=5s specifies that health checks should be performed every 5 seconds.
    • fails=3 means that a server will be marked as down after 3 consecutive failed checks.
    • passes=2 means that a server will be marked as up after 2 consecutive successful checks.
  2. Use a Load Balancer with Built-in Health Checks:

    If you're using DigitalOcean's managed Load Balancer product rather than an nginx droplet, it has built-in health checks. You can configure them through the DigitalOcean control panel:

    • Go to your load balancer settings.
    • Find the health check configuration section.
    • Set the protocol, path, and port for the health check. For example, you might check a specific HTTP endpoint that returns a 200 status code when the server is healthy.
    • Configure the interval, timeout, and threshold for marking a server as healthy or unhealthy.
  3. Monitor and Adjust:

    Regularly monitor your load balancer and server logs to ensure that the health checks are functioning as expected. Adjust the health check parameters if necessary to better suit your application's needs.

By implementing these health checks, your load balancer should be able to detect when an upstream server is down and automatically exclude it from the pool, thus preventing 500/502 errors from being served to your users.
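
For the HTTP endpoint mentioned in step 2, a minimal sketch of a health endpoint on each upstream server might look like the following. The /health path and the server_name are assumptions for illustration, not something Forge or DigitalOcean sets up for you. Note that this only shows that nginx itself is answering, not that PHP or the database behind it is healthy.

```nginx
# Hypothetical server block on an upstream droplet; adjust names to your site.
server {
    listen 80;
    server_name mysite.com;

    # Assumed path for health probes; any unused URI would do.
    location = /health {
        access_log off;           # keep probe traffic out of the access log
        default_type text/plain;
        return 200 "ok\n";
    }
}
```

A deeper check would proxy the probe to an application route instead of returning a fixed response, at the cost of more load per probe.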

daveb2's avatar
Level 3

Hmm, does anybody know where the upstream block lives on Forge? I assume it must be integrated somewhere, because I manage the upstream servers via the Servers section of the web UI. I don't want to override the web UI config.

daveb2's avatar
Level 3

OK, getting somewhere... the relevant config is already present on Forge-managed servers; it is stored in /etc/nginx/upstreams/mysite.com

However, while modifying this file and restarting nginx will work, as soon as any changes are made via the Forge web UI this file will be overwritten.

So, note to future me: remaining to do is to solve this web ui issue.

daveb2's avatar
Level 3

Seems the defaults are now working? Looks like upstream host failover is working as expected now, so all good - as long as there is an upstream {} block I think the defaults are good enough for my situation.
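
The defaults in question are presumably open source nginx's passive failure handling: every server line implicitly carries max_fails=1 and fail_timeout=10s, and proxy_next_upstream defaults to error timeout. Here is a sketch that spells those values out, assuming placeholder private IPs and an upstream name of backend (the actual contents of the Forge-generated file are not shown in this thread):

```nginx
upstream backend {
    ip_hash;  # balancing method selected in Forge

    # One failed attempt marks a server unavailable for 10 seconds, after
    # which nginx tries it again. These are nginx's default values.
    server 10.0.0.1:80 max_fails=1 fail_timeout=10s;
    server 10.0.0.2:80 max_fails=1 fail_timeout=10s;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
        # Default: retry on the next server after connection errors and timeouts.
        # Adding http_502 or http_503 (not defaults) would also retry when an
        # upstream answers with those status codes.
        proxy_next_upstream error timeout;
    }
}
```

Since Forge rewrites the upstream file on any web UI change, explicit values placed there may be lost, which is one reason relying on the defaults is the lower-maintenance choice.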
