Web Server Load Balancing with NGINX Plus

[Editor – This post has been updated to refer to the NGINX Plus API, which replaces and deprecates the separate dynamic configuration module mentioned in the original version of the post.

For step‑by‑step instructions on deploying a highly available NGINX Plus load‑balancing configuration on GCE, see our deployment guide, All-Active HA for NGINX Plus on the Google Cloud Platform.]

Customers who need to load balance applications running on the Google Cloud Platform have several options: NGINX Plus, the Google Compute Engine (GCE) load balancing services, or NGINX Plus in combination with the GCE services. This post compares the solutions and shows how using NGINX Plus with the GCE load balancing services gives you a highly available HTTP load balancer with rich Layer 7 functionality.


GCE has two load balancing solutions – network load balancing and HTTP/HTTPS load balancing. The former does TCP/UDP (Layer 4/network layer) load balancing within a GCE region. The latter does HTTP/HTTPS (Layer 7/application layer) load balancing with support for cross‑region load balancing (for simplicity, we’ll refer to it simply as “HTTP load balancing” in the rest of this blog).

Here is a general comparison between NGINX Plus and the Compute Engine load‑balancing services. (GCE NLB is GCE network load balancing and GCE HLB is GCE HTTP/HTTPS load balancing.)

| Feature | NGINX Plus | GCE NLB | GCE HLB |
| --- | --- | --- | --- |
| TCP load balancing | ✓ | ✓ |  |
| UDP load balancing |  | ✓ |  |
| Load‑balancing methods | Advanced | Simple | Simple |
| SSL/TLS termination | ✓ |  | ✓ |
| URL request mapping | Advanced |  | Simple |
| URL rewriting and redirecting | ✓ |  |  |
| HTTP health checks | Advanced | Simple | Simple |
| TCP health checks | ✓ |  |  |
| Session persistence | Advanced | Simple |  |
| Active‑active NGINX Plus cluster | ✓ (with GCE NLB) |  |  |
| Cross‑region load balancing |  |  | ✓ |

Let’s explore some of the differences between NGINX Plus and the GCE load balancing services, their unique features, and how NGINX Plus can work together with both services.

Comparing NGINX Plus and GCE Load Balancing Services

Request Routing

NGINX Plus can distribute requests among groups of backend servers based on the request URL, fields in the header, or a cookie. For example, you can send all requests with a URL starting with /store to one upstream group of servers and all those with a URL starting with /support to another group. Another example is sending any request with a URL ending with .html, .jpg, .gif, or .png to one group and all those with a URL ending with .php to another.

GCE HTTP load balancer supports the former type of routing (by URL prefix) but not the latter (by URL suffix).

With both GCE HTTP load balancer and NGINX Plus you can do request routing based on the Host field in the request header.

GCE network load balancer does not support HTTP load balancing, and so does not support this type of request routing.
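A minimal sketch of both kinds of NGINX Plus request routing described above (the upstream names and server addresses are illustrative):

```nginx
upstream store_servers   { server 10.0.0.2; }
upstream support_servers { server 10.0.0.3; }
upstream static_servers  { server 10.0.0.4; }
upstream php_servers     { server 10.0.0.5; }

server {
    listen 80;

    # Prefix-based routing
    location /store   { proxy_pass http://store_servers; }
    location /support { proxy_pass http://support_servers; }

    # Suffix (file-extension) routing, which GCE HTTP load balancer cannot do
    location ~* \.(html|jpg|gif|png)$ { proxy_pass http://static_servers; }
    location ~* \.php$                { proxy_pass http://php_servers; }
}
```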

Load‑Balancing Methods

NGINX Plus offers a choice of several load‑balancing methods. In addition to the default Round Robin method there are:

  • Least Connections – A request is sent to the server with the lowest number of active connections.
  • Least Time – A request is sent to the server with the lowest average latency and the lowest number of active connections.
  • IP Hash – A request is sent to the server determined by the source IP address of the request.
  • Generic Hash – A request is sent to the server determined from a user‑defined key, which can contain any combination of text and NGINX variables, for example the variables holding the client’s source IP address and port, or the request URI.

All of the methods can be extended by adding different weight values to each backend server.
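Each method is set with a single directive in the upstream block, and weights are parameters on the server directives (the addresses and weights here are illustrative; least_time is exclusive to NGINX Plus):

```nginx
upstream backend {
    least_time header;                # Least Time; alternatives: least_conn, ip_hash
    # hash $remote_addr$remote_port;  # Generic Hash on client IP address and port

    server 10.0.0.2 weight=3;         # receives three times as many requests
    server 10.0.0.3;
}
```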

GCE network load balancer supports only the equivalent of the Hash method. By default, it uses a key based on the Source IP Address, Source Port, Destination IP Address, Destination Port, and Protocol fields to choose a backend server.

Within a region, GCE HTTP load balancer first distributes requests among the groups of backend server instances based on their remaining capacity, measured either in terms of CPU utilization or the number of requests per second, and then evenly distributes requests among the instances in the group.

Session Persistence

Session persistence, also known as sticky sessions or session affinity, is needed when an application requires that all requests from a specific client continue to be sent to the same backend server because client state is not shared across backend servers.

NGINX Plus supports three advanced session‑persistence methods:

  • Sticky Cookie – NGINX Plus adds a session cookie to the first response from the upstream group for a given client. This cookie identifies the backend server that processed the request. The client includes this cookie in subsequent requests and NGINX Plus uses it to direct the requests to the same backend server.
  • Sticky Learn – NGINX Plus monitors requests and responses to locate session identifiers (usually cookies) and uses them to determine the server for subsequent requests in a session.
  • Sticky Route – A mapping between route values and backend servers can be configured so that NGINX Plus monitors requests for a route value and chooses the matching backend server. This mechanism is often used with Tomcat’s JVMRoute feature, where Tomcat appends a value to the JSESSIONID to identify the server that processed the request.
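Each advanced method is configured with a sticky directive in the upstream block; only one can be active at a time, so the alternatives below are commented out (cookie names, zone sizes, and the route variables are illustrative – the $route_* variables would typically be defined with map blocks, omitted here):

```nginx
upstream backend {
    zone backend 64k;
    server 10.0.0.2;
    server 10.0.0.3;

    # Sticky Cookie: NGINX Plus sets the srv_id cookie itself
    sticky cookie srv_id expires=1h path=/;

    # Sticky Learn: track a session cookie set by the application
    # sticky learn create=$upstream_cookie_jsessionid
    #              lookup=$cookie_jsessionid
    #              zone=client_sessions:1m;

    # Sticky Route: match a route value (e.g. appended to JSESSIONID),
    # with route= parameters on the server directives above
    # sticky route $route_cookie $route_uri;
}
```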

NGINX Plus also offers two basic session‑persistence methods, implemented as two of the load‑balancing methods described above:

  • IP Hash – The backend server is determined by the client’s IP address.
  • Hash – The backend server is determined from a user‑defined key, for example Source IP Address and Source Port, or the URI.

GCE network load balancer supports the equivalent of the NGINX Plus Hash method, although the key is limited to certain combinations of the Source IP Address, Source Port, Destination IP Address, Destination Port, and Protocol fields.

Note: When you use GCE network load balancer, NGINX Plus IP Hash, or NGINX Plus Hash with the source IP address included in the key, session persistence works correctly only if the client’s IP address remains the same throughout the session. This is not always the case, for example when a mobile client switches from a WiFi network to a cellular one. To make sure requests continue hitting the same backend server, it is better to use one of the advanced NGINX Plus session‑persistence methods listed above.

GCE HTTP load balancer does not support session persistence.

Health Checks

Both of the GCE load balancing services support simple application health checks. You can specify the URL that the load balancer requests, and it considers the backend server healthy if it receives the expected HTTP 200 return code. You can also specify the health check frequency and the timeout period before the server is considered unhealthy.

NGINX Plus extends this functionality with advanced health checks. In addition to specifying the URL to use, with NGINX Plus you can insert headers into the request and look for different response codes, and examine both the headers and body of the response.

A useful related feature in NGINX Plus is slow start. NGINX Plus slowly ramps up the load to a new or recently recovered server so that it doesn’t become overwhelmed by connections. This is useful when your backend servers require some warm‑up time and will fail if they are given their full share of traffic as soon as they show as healthy.

NGINX Plus also supports health checks to TCP and UDP servers, which allow you to specify a string to send and a string to look for in the response.
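An advanced HTTP health check combined with slow start might look like this (the URI, match conditions, and timings are illustrative):

```nginx
# Define what a healthy response looks like
match server_ok {
    status 200-399;
    header Content-Type = text/html;
    body !~ "maintenance";              # body must not contain this string
}

upstream backend {
    zone backend 64k;                   # shared memory zone, required for health checks
    server 10.0.0.2 slow_start=30s;     # ramp traffic up over 30s after recovery
    server 10.0.0.3 slow_start=30s;
}

server {
    location / {
        proxy_pass http://backend;
        health_check uri=/healthcheck interval=5s fails=3 passes=2 match=server_ok;
    }
}
```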

Interestingly, GCE network load balancer does not support TCP health checks. To check the health of TCP servers, you need to run a web server on each one to respond to HTTP health checks, even if the servers do not otherwise handle HTTP traffic.

SSL Termination

NGINX Plus supports SSL termination, as does the GCE HTTP load balancer. GCE network load balancer does not.
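In NGINX Plus, SSL termination amounts to an ssl listener plus a certificate and key (the paths are illustrative); traffic to the backends can then travel as plain HTTP:

```nginx
server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;

    location / {
        proxy_pass http://backend;   # backends receive decrypted HTTP
    }
}
```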

HTTP/2 Support

NGINX Plus supports HTTP/2, as does GCE HTTP load balancer.

Because GCE network load balancer operates at the network layer (Layer 4), it can provide TCP load balancing of HTTP/2 traffic, but without support for any application layer (Layer 7) features.

Additional Features of NGINX Plus

NGINX Plus provides many features that the GCE load balancers do not.

URL Rewriting and Redirecting

With NGINX Plus you can rewrite the URL of a request before passing it to a backend server. This allows the location of files, or request paths, to be altered without modifications to the URL advertised to clients. You can also redirect requests. For example, you can redirect all HTTP requests to an HTTPS server.
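Both behaviors can be sketched as follows (the paths are illustrative):

```nginx
# Redirect all HTTP requests to HTTPS
server {
    listen 80;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    # ssl_certificate and ssl_certificate_key omitted for brevity

    # Rewrite /old-app/... to /new-app/... before proxying;
    # clients keep seeing the original URL
    location /old-app/ {
        rewrite ^/old-app/(.*)$ /new-app/$1 break;
        proxy_pass http://backend;
    }
}
```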

Connection and Rate Limits

You can configure multiple limits to control the traffic to and from your NGINX Plus instance. These include limiting inbound connections, the connections to backend nodes, the rate of inbound requests, and the rate of data transmission from NGINX Plus to clients.
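These limits map to a handful of directives (the zone names and thresholds are illustrative):

```nginx
limit_conn_zone $binary_remote_addr zone=perip:10m;
limit_req_zone  $binary_remote_addr zone=reqs:10m rate=10r/s;

upstream backend {
    server 10.0.0.2 max_conns=100;     # cap connections to this backend node
}

server {
    location / {
        limit_conn perip 10;           # at most 10 concurrent connections per client IP
        limit_req  zone=reqs burst=20; # throttle the inbound request rate
        limit_rate 500k;               # cap response bandwidth per connection
        proxy_pass http://backend;
    }
}
```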

WebSocket Support

NGINX Plus supports WebSocket, including the ability to examine the body and the headers of a client request, advanced session persistence options, and other Layer 7 features.
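Proxying WebSocket traffic requires forwarding the Upgrade handshake headers:

```nginx
location /ws/ {
    proxy_pass http://backend;
    proxy_http_version 1.1;                  # WebSocket requires HTTP/1.1
    proxy_set_header Upgrade $http_upgrade;  # pass the Upgrade header through
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 1h;                   # keep long-lived connections open
}
```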

As with HTTP/2 traffic, GCE network load balancer can load balance WebSocket traffic at the TCP level, but without support for any Layer 7 features.

Access Logging

NGINX Plus can log to a local disk or to syslog, and provides an extensive set of values you can configure to be logged.
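For example, a custom log format can capture upstream timings, and logs can go to a local file or a remote syslog server (the addresses are illustrative):

```nginx
log_format upstream_log '$remote_addr [$time_local] "$request" $status '
                        'upstream=$upstream_addr '
                        'request_time=$request_time '
                        'upstream_time=$upstream_response_time';

access_log /var/log/nginx/access.log upstream_log;       # local disk
access_log syslog:server=192.168.1.1:514 upstream_log;   # remote syslog
```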

Additional Features of the GCE Load Balancing Services

Both GCE load balancers are highly available solutions managed by Google Compute Engine, although as of this writing GCE HTTP load balancer is in beta and not covered by any SLA.

Autoscaling Backend Instances with Automatic Reconfiguration

Both GCE load balancers automatically reconfigure themselves when used with GCE Autoscaler. NGINX Plus supports dynamic configuration with the NGINX Plus API for adding or removing backend instances.

UDP Support

GCE network load balancer supports UDP load balancing, which NGINX Plus does not.

Cross-Region Load Balancing

GCE HTTP load balancer supports cross‑region load balancing, which makes backends in multiple regions available through a single global IP address. Client requests are sent to the nearest region that has enough capacity.

NGINX Plus with GCE Load Balancing Services

You can use NGINX Plus alone or in conjunction with the GCE load balancers.

Active‑Active High Availability

By putting multiple NGINX Plus instances in multiple availability zones within a region, and behind GCE network load balancer, you get a set of highly available load balancers within the region.

Cross‑Region Load Balancing

To do global load balancing across regions, you can use the DNS‑based load balancing available with Google Cloud DNS, or you can use GCE HTTP load balancer in front of NGINX Plus to take advantage of its ability to provide a single IP for your application across regions.

Autoscaling of Backend Instances

With GCE Autoscaler you can set up autoscaling of backend instances based on their CPU utilization or other standard or custom metrics. You still need to add or remove backend instances from the NGINX configuration, either by editing the configuration file or using the NGINX Plus API.
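A sketch of enabling the NGINX Plus API and adding a backend from an autoscaling hook (the addresses, port, and API version number are illustrative and may differ in your NGINX Plus release):

```nginx
upstream backend {
    zone backend 64k;    # shared memory zone, required for API reconfiguration
    server 10.0.0.2;
}

server {
    listen 8080;

    location /api {
        api write=on;    # read-write NGINX Plus API
        allow 127.0.0.1; # restrict access to localhost
        deny all;
    }
}

# An autoscaling hook could then add a new instance without reloading NGINX:
# curl -X POST -d '{"server": "10.0.0.4:80"}' \
#      http://127.0.0.1:8080/api/9/http/upstreams/backend/servers
```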

Autoscaling of NGINX Plus Instances

In a similar way, you can set up autoscaling of NGINX Plus instances. You still need to ensure that NGINX configuration files stay synchronized.


When the load balancing solutions provided by GCE are not enough for your requirements, NGINX Plus, with its many advanced features, is a good choice. Using NGINX Plus together with the GCE load balancers gives you a highly available and scalable NGINX Plus setup.

Check out our deployment guide, All‑Active HA for NGINX Plus on the Google Cloud Platform, for step‑by‑step instructions on deploying a highly available NGINX Plus load‑balancing configuration on GCE.



Michael Pleshakov
