Using Forge for the first time, having websocket disconnections. Is this a problem with my nginx config?
Howdy, I'm working on an online implementation of a turn-based board game for 4-5 players using websockets. It seems to work great in general, but I've been having some disconnection problems. In some games it might happen once or twice, and it is fixed by that player doing a quick refresh.
But other times it becomes a storm of disconnections, where multiple different players get disconnected over and over (at its worst, players will get disconnected every 20-30 seconds for a while). During these disconnect storms the game becomes very unstable, sometimes causing everyone to get dropped at once, and when that happens people have trouble reconnecting at all.
So far these storms of disconnections have only happened in the 5 player game (our 4 player games have only had intermittent disconnections limited to a player or two), and only once we are a few hours into the game. So I assume those are clues that I'm just pushing some resource a bit too far, but I'm not sure which it could be.
For this project I'm using a LEMP stack on a $10 Digital Ocean droplet (1 vCPU / 2 GB Memory / 50 GB Disk), set up with Laravel Forge. I'm using Laravel for the registration and for storing save game info in the DB, but I'm not using the built-in Laravel broadcasting for the game itself; instead the game is built in pure JS on node.
During the last game where we had a storm of disconnections I checked the usage statistics on Digital Ocean, but to my untrained eye the results seemed to be under control:
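One way to gather more clues about these storms would be to log every disconnect together with the reason socket.io reports. The following is only a minimal sketch, not the game's actual code: `attachDisconnectLogging` and the `log` callback are hypothetical names, and `io` is assumed to be the socket.io server instance.

```javascript
// Sketch: record each disconnect with a timestamp and socket.io's reason string.
// `io` is assumed to be a socket.io Server; `log` is a hypothetical sink
// (defaults to console.log, but could write to a file instead).
function attachDisconnectLogging(io, log = console.log) {
  io.on('connection', (socket) => {
    const connectedAt = Date.now();

    socket.on('disconnect', (reason) => {
      log({
        socketId: socket.id,
        reason, // e.g. "ping timeout" or "transport close"
        connectedForMs: Date.now() - connectedAt,
        at: new Date().toISOString(),
      });
    });
  });
}
```

If most entries read `ping timeout`, heartbeats are probably going missing; `transport close` would suggest the connection is being dropped underneath socket.io. Comparing the timestamps across players should also show whether everyone drops at the same moment.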
- At its worst my CPU usage graph never exceeded 7%, so I assume that can't be the problem.
- The high point on my LOAD 1/5/15 graph was .27 / .12 / .3, which, again, I'm no expert, but that seems reasonable.
- The peak of the MEMORY graph was 53%, and it basically never budged the whole time (before that game began it was hovering at 52%).
- DISK IO capped out at 80kb read / 108kb write.
- And finally PUBLIC BANDWIDTH peaked at 43kbps inbound / 304kbps outbound.
So none of that seems too scary.
The game server is written in JS and run with node, using socket.io. The biggest messages I send over websockets run about 48kb of game data, and come at a rate of anywhere from once every 2-3 seconds to once every few minutes. Is that an unreasonable amount of data to push at that rate?
I'm also using a Cloudflare free plan as my proxy (and as my SSL provider), so I have the websockets traffic coming in and out on port 8443, which is one of the ports they have set aside as websocket safe. Is it possible the issue comes from that end?
Finally, here is my nginx config file for the site. The first server is just for the authentication and loading the main page; the second is the websockets stuff coming in on port 8443 and being proxied internally to 6001, which is the port my Express game server listens on:
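To put a number on that rate, here is a rough back-of-the-envelope sketch. It assumes "48kb" means 48 kilobytes of 1000 bytes each and that every large message goes out to all 5 players, neither of which is confirmed above; `worstCaseOutboundKbps` is a name made up for this example.

```javascript
// Worst-case outbound rate in kilobits per second, assuming every message
// is sent to every player (an assumption, not something measured).
function worstCaseOutboundKbps(messageKilobytes, players, intervalSeconds) {
  const kilobitsPerMessage = messageKilobytes * 8; // 8 bits per byte
  return (kilobitsPerMessage * players) / intervalSeconds;
}

console.log(worstCaseOutboundKbps(48, 5, 2)); // 960
console.log(worstCaseOutboundKbps(48, 5, 3)); // 640
```

Under those assumptions the fastest rate works out to just under 1 Mbps, which is higher than the 304kbps outbound peak on the graph, so the largest messages presumably do not go to everyone at the fastest rate for long.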
# FORGE CONFIG (DO NOT REMOVE!)
include forge-conf/[URL-REDACTED].com/before/*;
server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name [URL-REDACTED].com [URL-REDACTED].com [URL-REDACTED].com [URL-REDACTED].com;
    server_tokens off;
    root /home/forge/[URL-REDACTED].com/public;

    # FORGE SSL (DO NOT REMOVE!)
    ssl_certificate /etc/nginx/ssl/[URL-REDACTED].com/855663/server.crt;
    ssl_certificate_key /etc/nginx/ssl/[URL-REDACTED].com/855663/server.key;
    ssl_protocols TLSv1.2;
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384;
    ssl_prefer_server_ciphers on;
    ssl_dhparam /etc/nginx/dhparams.pem;

    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-XSS-Protection "1; mode=block";
    add_header X-Content-Type-Options "nosniff";

    index index.html index.htm index.php;
    charset utf-8;

    # FORGE CONFIG (DO NOT REMOVE!)
    include forge-conf/[URL-REDACTED].com/server/*;

    location / {
        try_files $uri $uri/ /index.php?$query_string;
    }

    location = /favicon.ico { access_log off; log_not_found off; }
    location = /robots.txt { access_log off; log_not_found off; }

    access_log off;
    error_log /var/log/nginx/[URL-REDACTED].com-error.log error;
    error_page 404 /index.php;

    location ~ \.php$ {
        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_pass unix:/var/run/php/php7.4-fpm.sock;
        fastcgi_index index.php;
        include fastcgi_params;
    }

    location ~ /\.(?!well-known).* {
        deny all;
    }
}
server {
    listen 8443 ssl;
    listen [::]:8443 ssl;
    server_name [URL-REDACTED].com;
    root /home/forge/[URL-REDACTED].com/public;

    # FORGE SSL (DO NOT REMOVE!)
    ssl_certificate /etc/nginx/ssl/[URL-REDACTED].com/855663/server.crt;
    ssl_certificate_key /etc/nginx/ssl/[URL-REDACTED].com/855663/server.key;
    ssl_protocols TLSv1.2;
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384;
    ssl_prefer_server_ciphers on;
    ssl_dhparam /etc/nginx/dhparams.pem;

    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-XSS-Protection "1; mode=block";
    add_header X-Content-Type-Options "nosniff";

    index index.html index.htm index.php;
    charset utf-8;

    # FORGE CONFIG (DO NOT REMOVE!)
    include forge-conf/[URL-REDACTED].com/server/*;

    location / {
        proxy_pass http://localhost:6001;
        proxy_read_timeout 60;
        proxy_connect_timeout 60;
        proxy_redirect off;

        # Allow the use of websockets
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }

    location = /favicon.ico { access_log off; log_not_found off; }
    location = /robots.txt { access_log off; log_not_found off; }

    access_log off;
    error_log /var/log/nginx/[URL-REDACTED].com-error.log error;
    error_page 404 /index.php;

    location ~ \.php$ {
        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_pass unix:/var/run/php/php7.1-fpm.sock;
        fastcgi_index index.php;
        include fastcgi_params;
    }

    location ~ /\.(?!well-known).* {
        deny all;
    }
}
# FORGE CONFIG (DO NOT REMOVE!)
I have no idea what I'm doing on that end, and that's just what I cobbled together from following online tutorials. Perhaps I've got some settings obviously wrong.
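One setting that may be worth checking is how `proxy_read_timeout 60` relates to the socket.io heartbeat. As far as the nginx documentation describes it, a proxied websocket connection is closed if the upstream sends nothing within that timeout, so the heartbeat interval should be comfortably shorter. The sketch below just does that arithmetic; `heartbeatMarginMs` is a hypothetical helper, and the 25000 ms interval is assumed to be the socket.io default rather than read from the actual game server.

```javascript
// Sketch: how much slack is there between the socket.io heartbeat interval
// and nginx's proxy_read_timeout? A positive result means a heartbeat should
// arrive before nginx would treat the connection as idle.
// pingIntervalMs: socket.io heartbeat interval (25000 is assumed here)
// proxyReadTimeoutSec: the proxy_read_timeout value from the nginx config
function heartbeatMarginMs(pingIntervalMs, proxyReadTimeoutSec) {
  return proxyReadTimeoutSec * 1000 - pingIntervalMs;
}

console.log(heartbeatMarginMs(25000, 60)); // 35000
```

With those assumed values the margin is positive, so the timeout alone would not obviously explain the drops unless heartbeats are being delayed or the interval has been changed.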
So yeah, that basically sums it up. I'm pretty well lost, since 90% of this is totally new to me, but hopefully I blasted you with enough information that someone with a more experienced eye can at least point me in the right direction for the likeliest culprit.
Thanks much!