
Sampling Requests with NGINX Conditional Logging

Owen Garrett
Published April 24, 2019

NGINX can record a very detailed log of every transaction it processes. Such logs are known as access logs, and you can fine‑tune the detail that is recorded for different services or locations with a customizable log‑file format.

By default, NGINX logs every transaction it processes. This might be necessary for compliance or security purposes, but for a busy website, the volume of data generated can be overwhelming. In this article, we show how to selectively log transactions based on various criteria, and how to use that knowledge to sample data points about requests in a quick and lightweight way.

Except as noted, this post applies to both NGINX Open Source and NGINX Plus. For ease of reading, we’ll refer to NGINX throughout.

Background – Quick Overview of NGINX Access Log Configuration

NGINX access logs are defined using the log_format directive. You can define several different named log formats; for example, a full log format named main, and an abbreviated log format named notes to record three data points about a request:

log_format main  '$remote_addr - $remote_user [$time_local] "$request" '
                 '$status $body_bytes_sent "$http_referer" '
                 '"$http_user_agent" "$http_x_forwarded_for"';

log_format notes '$remote_addr "$request" $status';

The log format can reference NGINX variables and other values calculated at logging time.
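For instance, a format might also capture per-request timing data. The following is purely an illustrative sketch using standard NGINX variables ($request_time and $upstream_response_time), not part of the original configuration:

# Illustrative only: a timing-oriented log format
log_format timing '$remote_addr "$request" $status '
                  '$request_time $upstream_response_time';

Here $request_time is the total time NGINX spent processing the request, and $upstream_response_time is the time spent waiting for an upstream response (empty when no upstream is involved).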

You then use the access_log directive to instruct NGINX to log a transaction once it completes. This directive specifies the location of the log file and the log format to use:

access_log  /var/log/nginx/access.log main;

By default, NGINX logs all transactions using the following configuration:

access_log logs/access.log combined;

If you define your own access_log, it overrides (replaces) the default access log.
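You can also switch access logging off entirely for a given context. As a minimal sketch (the /healthcheck location is hypothetical and only for illustration):

location /healthcheck {
    # Disable access logging for requests handled by this location only
    access_log off;
}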

Conditional Logging

Sometimes you might wish to log certain requests only. This is done using conditional logging, as follows:

server {
    listen 80;

    set $logme 0;
    if ( $uri ~ ^/secure ) {
        set $logme 1;
    }

    # Auditors require an additional log for requests to /secure
    access_log  /var/log/nginx/secure.log notes if=$logme;

    # If we have a global access log, we need to re-declare it here
    access_log  /var/log/nginx/access.log main;

    location / {
        # ...
    }
}
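If you prefer to avoid the if directive, the same condition can be expressed with a map block at the http level. This is a sketch equivalent to the example above, not an addition from the original article:

# Sketch: derive $logme from the request URI without using 'if'
map $uri $logme {
    ~^/secure  1;
    default    0;
}

server {
    listen 80;

    access_log  /var/log/nginx/secure.log notes if=$logme;
    access_log  /var/log/nginx/access.log main;

    # ...
}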

Access Logs Are Not Inherited

Access log settings do not stack or inherit; an access_log directive in one context overrides (replaces) access logs declared in parent contexts.

For example, if you want to log additional information about traffic to the URI /secure, you might define an access log in a location /secure {...} block. This access log replaces the general access log defined elsewhere in the configuration.
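To make the pitfall concrete, here is a minimal sketch (paths and format names are illustrative) of the configuration to avoid:

server {
    listen 80;
    access_log  /var/log/nginx/access.log main;

    location /secure {
        # This directive replaces the server-level access log for /secure;
        # requests to /secure no longer appear in access.log
        access_log  /var/log/nginx/secure.log notes;
    }
}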

The example in the previous section addresses this problem. It uses two access logs in the same context, with conditional logging that logs requests for /secure to a dedicated log file.

Challenges with Access Logs

Suppose that you wish to determine some statistical information about traffic to your website:

  • What’s the typical geographic split of users?
  • Which SSL/TLS ciphers and protocols do my users use?
  • What’s the split of web browsers?

The general access log is often not an appropriate place to log this information. You might not wish to pollute the access log with the additional fields needed for your study, and on a busy site, the overhead of logging all transactions would be too high.

In this case, you can log a limited set of fields to a specialized log. To reduce the load on the system, you might also wish to sample a subset of requests.
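For example, a compact format covering the questions above might look like the following sketch; $geoip_country_code assumes the ngx_http_geoip_module (or its GeoIP2 equivalent) is loaded, so treat that variable as an assumption rather than a drop-in line:

# Illustrative format covering geography, TLS parameters, and browser
log_format study '$geoip_country_code $ssl_protocol $ssl_cipher "$http_user_agent"';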

Sampling Techniques

Sampling from 1% of Requests

The following configuration uses the $request_id variable as a unique identifier for each request. It uses a split_clients block to sample data by logging just 1% of requests:

split_clients $request_id $logme {
    1%     1;
    *      0;
}

server {
    listen 80;

    access_log  /var/log/nginx/secure.log notes if=$logme;

    # ...
}

Sampling from 1% of Unique Users

Suppose that we wish to sample one data point from each user (or from 1% of users), such as the User-Agent header. We can’t just sample from all requests because users who generate a large number of requests are then over‑represented in our data.

We use a map block to detect the presence of a session cookie, which tells us whether a request comes from a new user or from a user we have seen before. We then sample requests coming from new users only:

map $cookie_SESSION $logme {
    ""        $perhaps;  # If the cookie is missing, we log if $perhaps
    default   0;
}

split_clients $request_id $perhaps {
    1%        1;  # $perhaps is true 1% of the time
    *         0;
}

server {
    listen 80;

    access_log  /var/log/nginx/secure.log notes if=$logme;
  
    # Optional: If the application does not generate a session cookie, we 
    # generate our own
    add_header Set-Cookie SESSION=1;

    # ...
}

Sampling Unique Things

Not all clients honor session cookies, however. For example, a web spider might ignore cookies, so every request it issues is identified as coming from a new user, skewing our results.

Wouldn’t it be great if we could sample from requests when it’s the first time we see a new thing? The thing can be a new IP address, a new session cookie value, a new User-Agent header, a not-seen-before host header, or even a combination of these. This way, we sample data for each thing only once.

Clearly, we need to store state (a list of the things we have seen), and for this we turn to NGINX Plus’ key‑value store. The key‑value store maintains an in‑memory key‑value database that can be accessed from NGINX Plus configuration using variables; the database optionally supports automatic expiry of entries (the timeout parameter), persistent storage (state), and cluster synchronization (sync). For each thing that is not already in the store, we log the request and add the thing to the store so it does not get logged again.
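As a sketch of what those optional parameters look like in configuration (the state file path is illustrative, and sync additionally requires a zone_sync setup between cluster members):

# Keep entries for one hour, persist them to disk, and share them across the cluster
keyval_zone zone=clients:80m timeout=3600s state=/var/lib/nginx/state/clients.keyval sync;
keyval $remote_addr $seen zone=clients;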

In NGINX Plus R18 and later, it’s very easy to set key‑value pairs while processing a transaction:

# Define a keyval zone with appropriate parameters
keyval_zone zone=clients:80m timeout=3600s;

# Create a variable $seen for each unique $remote_addr
keyval $remote_addr $seen zone=clients;

log_format notes '$remote_addr "$request" $status';

server {
    listen 80;

    # if $seen is empty, update the keyval (set $seen 1;) and log this 
    # request (set $logme 1;)
    # Otherwise, $logme is not set and we don't log the request
    # Note that $seen resets to "" after the configured timeout
    if ($seen = "") {
       set $seen 1;
       set $logme 1;
    }
    access_log  /var/log/nginx/secure.log notes if=$logme;

    location / {
        return 200 "All OK: -$seen-$logme-\n";
    }

    location /api {
        api;
    }
}

A Real-World Example – Sampling TLS Parameters

This article was inspired by a real‑world problem – how can I configure TLS according to good practices without excluding users with legacy devices?

TLS best practice is a moving target. TLS 1.3 was ratified a year ago, but many clients only talk previous TLS versions; ciphers are declared ‘insecure’ and retired, but older implementations rely on them; ECC certificates offer greater performance than RSA, but not all clients can accept ECC. Many TLS attacks rely on a “man in the middle” who intercepts the cipher negotiation handshake and forces the client and server to select a less secure cipher. Therefore, it’s important to configure NGINX Plus to not support weak or legacy ciphers, but doing so might exclude legacy clients.

In the following configuration example, we sample each TLS client, logging the SSL protocol, cipher, and User-Agent header. Assuming that each client selects the most recent protocols and most secure ciphers it supports, we can then evaluate the sampled data and determine what proportion of clients get excluded if we remove support for older protocols and ciphers.

We identify each client by its unique combination of IP address and User-Agent, but identifying clients by session cookie or another method works just as well.

log_format sslparams '$ssl_protocol $ssl_cipher '
                     '$remote_addr "$http_user_agent"';

# Define a keyval zone with appropriate parameters
keyval_zone zone=clients:80m timeout=3600s;

# Create a variable $seen for each unique combination of $remote_addr and 
# 'User-Agent' header
keyval $remote_addr:$http_user_agent $seen zone=clients;

server {
    listen 443 ssl;

    # default NGINX SSL configuration
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;

    if ($seen = "") {
       set $seen 1;
       set $logme 1;
    }
    access_log  /tmp/sslparams.log sslparams if=$logme;

    # ...
}

This generates a log file with entries like the following:

TLSv1.2 AES128-SHA 1.1.1.1 "Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0"
TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 2.2.2.2 "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1"
TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 3.3.3.3 "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:58.0) Gecko/20100101 Firefox/58.0"
TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 4.4.4.4 "Mozilla/5.0 (Android 4.4.2; Tablet; rv:65.0) Gecko/65.0 Firefox/65.0"
TLSv1 AES128-SHA 5.5.5.5 "Mozilla/5.0 (Android 4.4.2; Tablet; rv:65.0) Gecko/65.0 Firefox/65.0"
TLSv1.2 ECDHE-RSA-CHACHA20-POLY1305 6.6.6.6 "Mozilla/5.0 (Linux; U; Android 5.0.2; en-US; XT1068 Build/LXB22.46-28) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 UCBrowser/12.10.2.1164 Mobile Safari/537.36"

We can then process the file using a variety of methods to determine the spread of data:

$ cat /tmp/sslparams.log | cut -d ' ' -f 2,2 | sort | uniq -c | sort -rn | perl -ane 'printf "%30s %s\n", $F[1], "="x$F[0];'
    ECDHE-RSA-AES128-GCM-SHA256 =========================
    ECDHE-RSA-AES256-GCM-SHA384 ========
                     AES128-SHA ====
    ECDHE-RSA-CHACHA20-POLY1305 ==
        ECDHE-RSA-AES256-SHA384 ==

We identify the low‑volume, less secure ciphers, check the logs to determine which clients are using them, and then make an informed decision about removing ciphers from the NGINX Plus configuration.
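For instance, if the sampled data shows that almost no clients depend on TLSv1 or TLSv1.1, the server block might eventually be tightened along these lines; the protocol and cipher lists below are illustrative, not a recommendation from the original article:

# Sketch: restrict protocols and ciphers once the sampled data supports it
# (TLSv1.3 requires NGINX built against OpenSSL 1.1.1 or later)
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-CHACHA20-POLY1305;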

Conclusion

NGINX’s conditional logging can be used to sample a subset of the requests that NGINX manages, and write a standard or special‑purpose log. This technique is useful if you ever need to take a quick sample of traffic for statistical analysis, such as determining the spread of SSL parameters.

You need to put some thought into how you sample data so that busy users or spiders are not over‑represented. You can use variables in NGINX configuration, along with the map and split_clients directives, to select and filter requests.

For situations where the decision is more complex, or where high confidence of accuracy is desired, you can build sophisticated selectors in NGINX configuration. The NGINX Plus key‑value store enables you to accumulate state and share it across NGINX Plus instances in a cluster if needed.

Try out request sampling with NGINX Plus for yourself – start your free 30-day trial today or contact us to discuss your use cases.


"This blog post may reference products that are no longer available and/or no longer supported. For the most current information about available F5 NGINX products and solutions, explore our NGINX product family. NGINX is now part of F5. All previous NGINX.com links will redirect to similar NGINX content on F5.com."