Use IPTables NOTRACK to implement stateless rules and reduce packet loss.

I recently struck a performance problem with a high-volume Linux DNS server and found a very satisfying way to overcome it. This post is not about DNS specifically; it applies to any service with a high rate of connections/sessions (UDP or TCP), such as HTTP/HTTPS. It is especially useful for UDP-based traffic, as the stateful firewall doesn't really buy you much with UDP.

We observed times when DNS would not respond, but retrying very soon after would generally work. For TCP, you may find that you get a connection timeout (or possibly a connection reset? I haven't checked that recently).

Observing logs, you might see the following in kernel logs:
kernel: nf_conntrack: table full, dropping packet.
You might be inclined to increase net.netfilter.nf_conntrack_max and net.nf_conntrack_max, but a better response might be found by looking at what is actually taking up those entries in your connection tracking table.
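
Before touching those limits, it helps to know how full the table actually is. Here is a minimal sketch, assuming the kernel exposes nf_conntrack_count and nf_conntrack_max under /proc/sys/net/netfilter (which it does once the nf_conntrack module is loaded). The function name conntrack_usage_pct is only for illustration.

```shell
#!/bin/sh
# Print conntrack table usage as a whole percentage.
# Arguments: current entry count, table maximum.
conntrack_usage_pct() {
    count="$1"
    max="$2"
    # Guard against a zero or missing maximum.
    if [ -z "$max" ] || [ "$max" -le 0 ]; then
        echo "unknown"
        return 1
    fi
    echo $(( count * 100 / max ))
}

# Typical use on a live box (paths assumed; skipped if they are absent).
dir=/proc/sys/net/netfilter
if [ -r "$dir/nf_conntrack_count" ] && [ -r "$dir/nf_conntrack_max" ]; then
    pct=$(conntrack_usage_pct "$(cat "$dir/nf_conntrack_count")" "$(cat "$dir/nf_conntrack_max")")
    echo "conntrack table is ${pct}% full"
fi
```

If the figure sits near 100 during the periods when queries go unanswered, the table is the likely culprit.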

We found that connection tracking was happening even for UDP rules. You can see this with some simple filtering of /proc/net/ip_conntrack (/proc/net/nf_conntrack on newer kernels), counting how many entries relate to port 53, for example. Here are the basic rules that most Linux people would likely write for iptables.

-A INPUT  -p udp --dport 53 -j ACCEPT
-A INPUT -p tcp --dport 53 -m state --state=NEW -j ACCEPT
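
The "simple filtering" of the conntrack table mentioned above could look something like this. It is a sketch, not the exact command I ran: count_conntrack_port is a made-up helper name, and it assumes the usual one-entry-per-line layout with sport=/dport= fields. Reading the real table generally needs root.

```shell
#!/bin/sh
# Count conntrack entries that mention a given port as source or
# destination. Reads the table from FILE (default /proc/net/nf_conntrack;
# older kernels use /proc/net/ip_conntrack instead).
count_conntrack_port() {
    port="$1"
    file="${2:-/proc/net/nf_conntrack}"
    awk -v p="$port" '
        {
            for (i = 1; i <= NF; i++) {
                # Compare whole fields so port 53 does not match 5353.
                if ($i == ("sport=" p) || $i == ("dport=" p)) { n++; break }
            }
        }
        END { print n + 0 }
    ' "$file"
}

# Example on a live server:
#   count_conntrack_port 53
```

Comparing that number with the total line count of the file shows how much of the table DNS alone is consuming.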

NOTRACK for Stateless Firewall Rules in a Stateful Firewall

Thankfully, I had heard of the NOTRACK rule some time back but never had cause to use it, so at least I knew where to begin my research. Red Hat have an article about it, though the rules below do not necessarily come from that.

So we needed to use the 'raw' table to disable stateful inspection for DNS packets. That does mean we need to explicitly match all incoming and outgoing packets (which is four UDP flows for a recursive server, plus TCP if you want to do stateless TCP). It's rather like IPChains way back in the day: you lose all the benefits you get from a stateful firewall, and gain all the responsibility of making sure you explicitly match all traffic flows.

# raw table: don't do connection tracking for DNS
-A PREROUTING -p tcp --dport 53 -j NOTRACK
-A PREROUTING -p udp --dport 53 -j NOTRACK
-A PREROUTING -p tcp --sport 53 -j NOTRACK
-A PREROUTING -p udp --sport 53 -j NOTRACK
-A OUTPUT -p tcp --sport 53 -j NOTRACK
-A OUTPUT -p udp --sport 53 -j NOTRACK
-A OUTPUT -p tcp --dport 53 -j NOTRACK
-A OUTPUT -p udp --dport 53 -j NOTRACK
# filter table: allow stateless UDP serving
-A INPUT  -p udp --dport 53 -j ACCEPT
-A OUTPUT -p udp --sport 53 -j ACCEPT
# Allow stateless UDP backending
-A OUTPUT -p udp --dport 53 -j ACCEPT
-A INPUT  -p udp --sport 53 -j ACCEPT
# Allow stateless TCP serving
-A INPUT  -p tcp --dport 53 -j ACCEPT
-A OUTPUT -p tcp --sport 53 -j ACCEPT
# Allow stateless TCP backending
-A OUTPUT -p tcp --dport 53 -j ACCEPT
-A INPUT  -p tcp --sport 53 -j ACCEPT
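
The eight NOTRACK rules follow a regular pattern (two chains, two directions, two protocols), so for other services they can be generated rather than typed. A small sketch, where notrack_rules is a hypothetical helper and the output is in the same iptables-save syntax as the listing above, intended for the raw table:

```shell
#!/bin/sh
# Print raw-table NOTRACK rules (iptables-save syntax) for one port.
# Covers both chains, both directions and both protocols: eight rules.
notrack_rules() {
    port="$1"
    for chain in PREROUTING OUTPUT; do
        for dir in dport sport; do
            for proto in tcp udp; do
                printf '%s\n' "-A $chain -p $proto --$dir $port -j NOTRACK"
            done
        done
    done
}

# Same set of rules as the DNS listing above.
notrack_rules 53
```

On newer iptables releases the man page describes NOTRACK as equivalent to -j CT --notrack, so if your version complains about the NOTRACK target, that is the spelling to try.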

Beware the moving bottleneck

That worked well... perhaps a little too well. Now the service receives more traffic than it did before, and you need to be prepared for that, as you may find that a new limit (and potentially new negative behaviour) is reached.

DNS is particularly prone to very large spikes of activity due to misconfigured clients. Common culprits, particularly on Linux clients, are things like Wireshark, scripts that do lookups (often using dig; see my post on how to do this better), and the lack of a local name-service cache (e.g. nscd or better).

Assuming you can identify such clients (see my other DNS posts which have some ideas and tools), you could (perhaps in conjunction with fail2ban or similar) have some firewall rules that limit allowable request rates from segments of your network.

These rules would go prior to your filter table rules allowing access (listed earlier).

# This chain is where the actual rate limiting is put in place.
# Note that it is using just the srcip method in its hashing
-A DNS_TOO_FREQUENT_BLACKLIST -p udp -m udp --dport 53 -m hashlimit --hashlimit-mode srcip --hashlimit-srcmask 32 --hashlimit-above 10/sec --hashlimit-burst 20 --hashlimit-name dns_too_frequen -m comment --comment "drop_overly_frequent_DNS_requests" -j DROP

# This matches a pair of machines I judged to be innocently bombarding DNS
# It so happens that they could be nicely summarised with a /31
# The second line is so we can see counters of what made it through
#... more rules here as needed
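
The hashlimit rule above appends to a user-defined chain, so that chain has to be declared and something has to jump to it. The sketch below prints that plumbing in iptables-save syntax. It assumes the jump is made from INPUT for inbound UDP queries, which is my guess at a sensible placement rather than a copy of the production ruleset, and dns_ratelimit_plumbing is a made-up name.

```shell
#!/bin/sh
# Print filter-table lines (iptables-save syntax) that wire up the
# rate-limiting chain. The chain name matches the rule above; the
# jump from INPUT is an assumption.
dns_ratelimit_plumbing() {
    chain="${1:-DNS_TOO_FREQUENT_BLACKLIST}"
    port="${2:-53}"
    # Declare the user-defined chain ("-" means it has no policy).
    printf '%s\n' ":$chain - [0:0]"
    # Send inbound UDP queries through the chain before any ACCEPT rule.
    printf '%s\n' "-A INPUT -p udp --dport $port -j $chain"
}

dns_ratelimit_plumbing
```

In an iptables-restore file the chain declaration belongs with the other chain lines at the top of the filter section. Packets that the chain does not drop return to INPUT and carry on to the ACCEPT rules.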

Concluding Remarks

I've been running this configuration for some time now, and am very happy with it. I do intend to implement this technique on other services where I feel it may be needed (Samba perhaps, perhaps logging servers).

I hope you find this useful for you; if you've got any comments, I'd be happy to see them. 



  1. I guess we can skip the NOTRACK rules altogether if all other rules are written like this, without the state module? It's working that way.

  2. No, unfortunately, if you have a rule such as:

    -A INPUT -p udp --dport 53 -j ACCEPT

    then state will still be tracked, even though we haven't put any "stateful" condition on the rule. This has to happen in a stateful firewall because at the top of our stateful firewall we test for state. I'm not sure if the state starts getting tracked as soon as we start requiring a match on state, or as soon as the ipt_state module is loaded, but you can verify this behaviour by looking for (in this example) a UDP port of 53 in /proc/net/nf_conntrack, which is where you can see the state currently tracked (warning: it can be very large).

