Web Server Load Balancing with NGINX Plus

At nginx.conf 2017, I gave a presentation on how to use NGINX Plus health checks with Docker containers. You can access the presentation as a YouTube video or as a blog post, which includes the PowerPoint slides and a transcription of my talk. In this post, I’ll describe an improved version of the basic approach, then present working configuration code you can use to implement it yourself.


When running containers in a microservices environment, your service instances are susceptible to becoming overloaded due to resource limitations, such as memory or CPU. There are a number of strategies for addressing this issue; this blog post discusses one that relies on NGINX Plus active health checks.

We’ll focus on methods for three use cases:

  • Request‑count–based. Use this method when requests to a service are so heavyweight that a service instance can only handle one request at a time.
  • CPU‑based. Use this method when CPU utilization is the main limiting factor and you want to set a CPU‑usage threshold above which the service doesn’t accept new requests.
  • Memory‑usage–based. Use this method when memory utilization is the main limiting factor and you want to set a memory‑usage threshold, above which the service doesn’t accept new requests.

All three methods work in the same fundamental way. NGINX Plus calls a program that implements an active health check based on one of the methods above and returns a server status of either unhealthy or healthy. NGINX Plus removes an unhealthy server from the load‑balancing rotation, and keeps a healthy server in the rotation (or adds it back if it was previously unhealthy).

Health Check Approaches

Let’s get into the details of each method. Code for the examples is available in the NGINX repo on GitHub.

For all of the examples, we’re using NGINX Plus as the load balancer and NGINX Unit as the application server, with two examples written in PHP and one written in Python. Everything runs in Docker containers.


Request‑Count–Based Health Checks

For this method, the application creates a semaphore file, /tmp/busy, when it receives a request, then removes the file when it has finished processing the request. The health check determines whether the file exists on a given service instance. If it does, the instance is considered unhealthy and NGINX Plus stops sending requests to it. If the file doesn’t exist, the instance is considered healthy and NGINX Plus sends requests to it.

The example uses a single Python program to implement both the application and the health check; the request URI determines which function is executed.

The shortest interval between health checks is one second, so it can take that long for NGINX Plus to see that a service instance is unhealthy (busy). During that time, NGINX Plus might send another request to the service instance. To handle this case, the application returns status code 503 if it is already processing a request when another request arrives. If this happens, NGINX Plus tries another instance.
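The semaphore-file logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the program from the repo: the function names, the default demo path, and the "Busy" response body are assumptions made for the example. Only the /tmp/busy idea, the 503 response, and the {"HealthCheck":"OK" body prefix come from this post.

```python
import os
import tempfile

# Hypothetical demo path; the application in this post uses /tmp/busy.
BUSY_FILE = os.path.join(tempfile.gettempdir(), "busy_demo")


def health_check(busy_file=BUSY_FILE):
    """Report the instance as busy while the semaphore file exists."""
    if os.path.exists(busy_file):
        # Any body that does not match the health check pattern fails it.
        return 200, '{"HealthCheck":"Busy"}'
    return 200, '{"HealthCheck":"OK"}'


def handle_request(work, busy_file=BUSY_FILE):
    """Run work() while holding the semaphore file; answer 503 if busy."""
    try:
        # O_EXCL makes creation atomic, so two requests cannot both succeed.
        fd = os.open(busy_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return 503, "busy"
    os.close(fd)
    try:
        return 200, work()
    finally:
        # Always remove the file, even if work() raises.
        os.remove(busy_file)
```

Creating the file with an exclusive flag is a design choice of this sketch: it closes the window in which a second request could arrive between checking for the file and creating it.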


CPU‑Based Health Checks

You can use the Docker API to get CPU‑usage metrics for a container, but they are relative to the Docker host. In other words, if the Docker API reports that the CPU usage for a container is 25%, that means 25% of the Docker host’s CPU.

For this example, we set a threshold of 70% for the application, and assign each container in the application an equal share of the threshold percentage. For example, if there is one container it can use 70% of the Docker host’s CPU. If there are two containers, each can use 35% of the Docker host’s CPU. We use the NGINX Plus API to get the number of containers for the application.
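The threshold arithmetic can be sketched as follows. The sketch assumes the container count comes from the parsed JSON that the NGINX Plus API returns for a single upstream group, where each server appears as an entry in a peers array. The function names and the sample addresses are hypothetical.

```python
def count_peers(upstream_status):
    """Count the servers in an upstream group from parsed API JSON."""
    return len(upstream_status.get("peers", []))


def per_container_threshold(total_threshold_pct, container_count):
    """Split the application-wide CPU threshold equally across containers."""
    if container_count < 1:
        raise ValueError("container_count must be at least 1")
    return total_threshold_pct / container_count


# Made-up sample resembling an upstream with two containers.
sample = {"peers": [{"server": "10.0.0.2:32768"}, {"server": "10.0.0.3:32769"}]}
print(per_container_threshold(70.0, count_peers(sample)))  # 35.0
```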

There are two PHP programs: testcpu.php generates CPU load and hcheck.php does the health check.

To get statistics for a container, the health‑check program makes the following call to the Docker API on the Docker host:

http://Docker_Host_IP_Address:Docker_API_Port/containers/Container_ID/stats?stream=0

Calculating CPU usage requires two calls to the API, one second apart in the example. CPU usage is calculated by comparing the cpu_stats.cpu_usage.total_usage fields in the two calls.
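As a rough sketch of that calculation, the function below takes two parsed stats responses and converts the difference in total_usage into a percentage. It assumes total_usage is cumulative CPU time in nanoseconds and that you know how many CPUs the Docker host has; the host_cpus parameter and the function names are illustrative and do not come from hcheck.php.

```python
def cpu_percent(first, second, interval_s=1.0, host_cpus=1):
    """Percent of the host's CPU used between two stats samples."""
    used_ns = (second["cpu_stats"]["cpu_usage"]["total_usage"]
               - first["cpu_stats"]["cpu_usage"]["total_usage"])
    # CPU time the whole host could have delivered during the interval.
    available_ns = interval_s * 1e9 * host_cpus
    return 100.0 * used_ns / available_ns


def is_healthy(usage_pct, threshold_pct):
    """Healthy while usage stays at or below this container's threshold."""
    return usage_pct <= threshold_pct


# Two made-up samples taken one second apart on a single-CPU host.
before = {"cpu_stats": {"cpu_usage": {"total_usage": 1_000_000_000}}}
after = {"cpu_stats": {"cpu_usage": {"total_usage": 1_250_000_000}}}
print(cpu_percent(before, after))  # 25.0
```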


Memory‑Usage–Based Health Checks

As in the CPU‑based example, this example uses the Docker API to retrieve memory‑usage metrics. Each container is limited to 128 megabytes of memory and the memory‑usage metrics are relative to this limit.

There are two PHP programs: testmem.php uses memory and hcheck.php does the health check. If memory usage is above 70%, the health check returns a status of unhealthy.

The health check makes the same Docker API call as for the CPU‑usage method, but uses different fields: the percentage of memory used is memory_stats.usage divided by memory_stats.stats.hierarchical_memory_limit.
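The memory calculation can be sketched the same way. The field names come from the Docker stats response described above. The "Fail" status and the MemoryPct field are hypothetical, added only to show a complete response body. The compact JSON separators matter here: Python's default serialization puts a space after the colon, which would not match a pattern that expects {"HealthCheck":"OK".

```python
import json


def memory_percent(stats):
    """Memory in use as a percentage of the container's memory limit."""
    mem = stats["memory_stats"]
    return 100.0 * mem["usage"] / mem["stats"]["hierarchical_memory_limit"]


def health_body(stats, threshold_pct=70.0):
    """Build a JSON health check body; status is OK at or below the threshold."""
    pct = memory_percent(stats)
    status = "OK" if pct <= threshold_pct else "Fail"  # "Fail" is hypothetical
    body = {"HealthCheck": status, "MemoryPct": round(pct, 1)}
    # Compact separators avoid the space json.dumps adds after ':' by default.
    return json.dumps(body, separators=(",", ":"))


# Made-up sample: 96 MiB used out of a 128 MiB limit.
sample = {"memory_stats": {"usage": 96 * 1024 * 1024,
                           "stats": {"hierarchical_memory_limit": 128 * 1024 * 1024}}}
print(memory_percent(sample))  # 75.0
```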

Configuring NGINX

No changes to the main NGINX configuration file (/etc/nginx/nginx.conf) are required. However, if you want to see detailed messages about health checks in the error log, set the severity level to info, as in this example:

error_log /var/log/nginx/error.log info;

The NGINX Plus configuration for the sample applications follows. As you read it, and especially if you use or adapt it, keep these points in mind:

  • Consul is used for DNS service discovery. Both Consul and NGINX Plus support DNS SRV records, which means that NGINX Plus can get the port numbers as well as the IP addresses of the containers. This is necessary because Docker port mapping is used.
  • The first server block, listening on port 80, enables sending requests directly to the program that does health checks. This is required so we can see what the response to a failed health check looks like. We can’t see that type of response if we send a request to a health‑check program via a virtual server that has health checks configured, because NGINX Plus doesn’t forward requests to unhealthy servers.
  • For the sake of easy understanding, the configuration is minimal – it doesn’t include all the directives a best‑practices configuration has.
  • The health‑check intervals are all short, so the system responds quickly during a demo. In a production environment, the one‑second interval for the count‑based health check is probably still suitable, since you want NGINX Plus to stop sending requests as soon as possible after the service becomes busy. For the CPU and memory health checks, a longer interval might be set in production.
  • This configuration and the CPU health‑check program use version 2 of the NGINX Plus API, introduced in NGINX Plus R14, which also powers the built‑in live activity monitoring dashboard.
  • The configuration and programs are examples of possible ways to use active health checks on applications in Docker containers. They have not been tested in production or at scale.

The application configuration (/etc/nginx/conf.d/backend.conf):

# Configure DNS. Point to Consul.
resolver consul:53 valid=2s;
resolver_timeout 2s;

# The upstream groups are populated via DNS
upstream unitcnt {
    zone unitcnt 64k;
    server service.consul service=unitcnt resolve;
}

upstream unitcpu {
    zone unitcpu 64k;
    server service.consul service=unitcpu resolve;
}

upstream unitmem {
    zone unitmem 64k;
    server service.consul service=unitmem resolve;
}

# Health checks are successful if a string in the body starts with {"HealthCheck":"OK"
match server_ok {
    status 200;
    body ~ '{"HealthCheck":"OK"';
}

server {
    # Allows calling upstream health checks directly
    listen 80;

    location /healthcheck {
        proxy_pass http://$arg_server/hcheck.php;
    }

    location /healthcheckpy {
        proxy_pass http://$arg_server/;
    }
}

server {
    listen 8001;
    status_zone unitcnt;
    root /usr/share/nginx/html;
    proxy_http_version 1.1;
    proxy_set_header Connection "";

    # Escape the dot so only URIs ending in ".py" match
    location ~ \.py$ {
        proxy_set_header Host $http_host;
        proxy_pass http://unitcnt;
        proxy_intercept_errors on;
        # If an instance answers 503 (busy), try the next one
        proxy_next_upstream http_503;
        # If all the servers are busy return apibusy.html
        error_page 502 503 =503 /apibusy.html;
        health_check uri=/ match=server_ok interval=1s;
    }
}

server {
    listen 8002;
    status_zone unitcpu;
    root /usr/share/nginx/html;
    proxy_http_version 1.1;
    proxy_set_header Connection "";

    # Escape the dot so only URIs ending in ".php" match
    location ~ \.php$ {
        proxy_set_header Host $http_host;
        proxy_pass http://unitcpu;
        error_page 502 =503 /apibusy.html;
        health_check uri=/hcheck.php match=server_ok interval=5s;
    }
}

server {
    listen 8003;
    status_zone unitmem;
    root /usr/share/nginx/html;
    proxy_http_version 1.1;
    proxy_set_header Connection "";

    # Escape the dot so only URIs ending in ".php" match
    location ~ \.php$ {
        proxy_set_header Host $http_host;
        proxy_pass http://unitmem;
        error_page 502 =503 /apibusy.html;
        health_check uri=/hcheck.php match=server_ok interval=3s;
    }
}

# Configure the status API and dashboard
server {
    listen 8082;

    root /usr/share/nginx/html;

    location = /dashboard.html {
    }

    location = / {
        return 302 /dashboard.html;
    }

    location /api {
        # Enable the NGINX Plus API (read-only unless write=on is added)
        api;
        access_log off;
    }
}
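To make the server_ok match condition concrete, here is a small Python sketch of the same test: a response passes only if its status is 200 and its body contains the string {"HealthCheck":"OK". It imitates the two conditions for illustration and is not how NGINX Plus implements them; the function name and sample bodies are made up.

```python
import re

# Same pattern as the match block; the brace is escaped for Python's re module.
BODY_PATTERN = re.compile(r'\{"HealthCheck":"OK"')


def passes_match(status, body):
    """True if the response satisfies both conditions of the match block."""
    return status == 200 and BODY_PATTERN.search(body) is not None


print(passes_match(200, '{"HealthCheck":"OK","Load":12}'))  # True
print(passes_match(503, '{"HealthCheck":"OK"}'))            # False
print(passes_match(200, '{"HealthCheck": "OK"}'))           # False
```

The last call fails because of the space after the colon, so a health‑check program has to emit the JSON without extra whitespace for this pattern to match.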


NGINX Plus active health checks are an easy way to deal with capacity limitations of services running in Docker, helping to make sure that service instances aren’t overloaded.

Get an NGINX Plus free trial and download the Unit beta and give it a try! All the code for the examples is available in the NGINX repo on GitHub.




Rick Nelson



Rick Nelson is the Manager of Pre‑Sales, with over 30 years of experience in technical and leadership roles at a variety of technology companies, including Riverbed Technology. From virtualization to load balancing to accelerating application delivery, Rick brings deep technical expertise and a proven approach to maximizing customer success.

