Wednesday, April 19, 2017

Forcing the password gropers through a smaller hole with OpenBSD's PF queues

While preparing material for the upcoming BSDCan PF and networking tutorial, I realized that the pop3 gropers were actually not much fun to watch anymore. So I used the traffic shaping features of my OpenBSD firewall to let the miscreants inflict some pain on themselves. Watching logs became fun again.

Yes, in between a number of other things I am currently in the process of creating material for new and hopefully better PF and networking session.

I've been fishing for suggestions for topics to include in the tutorials on relevant mailing lists, and one suggestion that keeps coming up (even though it's actually covered in the existling slides as well as The Book of PF) is using traffic shaping features to punish undesirable activity, such as


What Dan had in mind here may very well end up in the new slides, but in the meantime I will show you how to punish abusers of essentially any service with the tools at hand in your OpenBSD firewall.

Regular readers will know that I'm responsible for maintaining a set of mail services including a pop3 service, and that our site sees pretty much round-the-clock attempts at logging on to that service with user names that come mainly from the local part of the spamtrap addresses that are part of the system to produce our hourly list of greytrapped IP addresses.

But do not let yourself be distracted by this bizarre collection of items that I've maintained and described in earlier columns. The actual useful parts of this article follow - take this as a walkthrough of how to mitigate a wide range of threats and annoyances.

First, analyze the behavior that you want to defend against. In our case that's fairly obvious: We have a service that's getting a volume of unwanted traffic, and looking at our logs the attempts come fairly quickly with a number of repeated attempts from each source address. This similar enough to both the traditional ssh bruteforce attacks and for that matter to Dan's website scenario that we can reuse some of the same techniques in all of the configurations.

I've written about the rapid-fire ssh bruteforce attacks and their mitigation before (and of course it's in The Book of PF) as well as the slower kind where those techniques actually come up short. The traditional approach to ssh bruteforcers has been to simply block their traffic, and the state-tracking features of PF let you set up overload criteria that add the source addresses to the table that holds the addresses you want to block.

I have rules much like the ones in the example in place where there I have a SSH service running, and those bruteforce tables are never totally empty.

For the system that runs our pop3 service, we also have a PF ruleset in place with queues for traffic shaping. For some odd reason that ruleset is fairly close to the HFSC traffic shaper example in The Book of PF, and it contains a queue that I set up mainly as an experiment to annoy spammers (as in, the ones that are already for one reason or the other blacklisted by our spamd).

The queue is defined like this:

   queue spamd parent rootq bandwidth 1K min 0K max 1K qlimit 300

yes, that's right. A queue with a maximum throughput of 1 kilobit per second. I have been warned that this is small enough that the code may be unable to strictly enforce that limit due to the timer resolution in the HFSC code. But that didn't keep me from trying.

And now that I had another group of hosts that I wanted to just be a little evil to, why not let the password gropers and the spammers share the same small patch of bandwidth?

Now a few small additions to the ruleset are needed for the good to put the evil to the task. We start with a table to hold the addresses we want to mess with. Actually, I'll add two, for reasons that will become clear later:

table <longterm> persist counters
table <popflooders> persist counters 

 
The rules that use those tables are:

block drop log (all) quick from <longterm> 


pass in quick log (all) on egress proto tcp from <popflooders> to port pop3 flags S/SA keep state \ 
(max-src-conn 1, max-src-conn-rate 1/1, overload <longterm> flush global, pflow) set queue spamd 

pass in log (all) on egress proto tcp to port pop3 flags S/SA keep state \ 
(max-src-conn 5, max-src-conn-rate 6/3, overload <popflooders> flush global, pflow) 
 
The last one lets anybody connect to the pop3 service, but any one source address can have only open five simultaneous connections and at a rate of six over three seconds.

Any source that trips up one of these restrictions is overloaded into the popflooders table, the flush global part means any existing connections that source has are terminated, and when they get to try again, they will instead match the quick rule that assigns the new traffic to the 1 kilobyte queue.

The quick rule here has even stricter limits on the number of allowed simultaneous connections, and this time any breach will lead to membership of the longterm table and the block drop treatment.

The for the longterm table I already had in place a four week expiry (see man pfctl for detail on how to do that), and I haven't gotten around to deciding what, if any, expiry I will set up for the popflooders.

The results were immediately visible. Monitoring the queues using pfctl -vvsq shows the tiny queue works as expected:

 queue spamd parent rootq bandwidth 1K, max 1K qlimit 300
  [ pkts:     196136  bytes:   12157940  dropped pkts: 398350 bytes: 24692564 ]
  [ qlength: 300/300 ]
  [ measured:     2.0 packets/s, 999.13 b/s ]


and looking at the pop3 daemon's log entries, a typical encounter looks like this:

Apr 19 22:39:33 skapet spop3d[44875]: connect from 111.181.52.216
Apr 19 22:39:33 skapet spop3d[75112]: connect from 111.181.52.216
Apr 19 22:39:34 skapet spop3d[57116]: connect from 111.181.52.216
Apr 19 22:39:34 skapet spop3d[65982]: connect from 111.181.52.216
Apr 19 22:39:34 skapet spop3d[58964]: connect from 111.181.52.216
Apr 19 22:40:34 skapet spop3d[12410]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[63573]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[76113]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[23524]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[16916]: autologout time elapsed - 111.181.52.216


here the miscreant comes in way too fast and only manages to get five connections going before they're shunted to the tiny queue to fight it out with known spammers for a share of bandwidth.

I've been running with this particular setup since Monday evening around 20:00 CEST, and by late Wednesday evening the number of entries in the popflooders table had reached approximately 300.

I will decide on an expiry policy at some point, I promise. In fact, I welcome your input on what the expiry period should be.

One important takeaway from this, and possibly the most important point of this article, is that it does not take a lot of imagination to retool this setup to watch for and protect against undesirable activity directed at essentially any network service.

You pick the service and the ports it uses, then figure out what are the parameters that determine what is acceptable behavior. Once you have those parameters defined, you can choose to assign to a minimal queue like in this example, block outright, redirect to something unpleasant or even pass with a low probability.

All of those possibilities are part of the normal pf.conf toolset on your OpenBSD system. If you want, you can supplement these mechanisms with a bit of log file parsing that produces output suitable for feeding to pfctl to add to the table of miscreants. The only limits are, as always, the limits of your imagination (and possibly your programming abilities). If you're wondering why I like OpenBSD so much, you can find at least a partial answer in my OpenBSD and you presentation.

FreeBSD users will be pleased to know that something similar is possible on their systems too, only substituting the legacy ALTQ traffic shaping with its somewhat arcane syntax for the modern queues rules in this article.

Will you be attending our PF and networking session in Ottawa, or will you want to attend one elsewhere later? Please let us know at the email address in the tutorial description.



Update 2017-04-23: A truly unexpiring table, and downloadable datasets made available

Soon after publishing this article I realized that what I had written could easily be taken as a promise to keep a collection of POP3 gropers' IP addresses around indefinitely, in a table where the entries never expire.

Table entries do not expire unless you use a pfctl(8) command like the ones mentioned in the book and other resources I referenced earlier in the article, but on the other hand table entries will not survive a reboot either unless you arrange to have table contents stored to somewhere more permanent and restored from there. Fortunately our favorite toolset has a feature that implements at least the restoring part.

Changing the table definition quoted earler to read

 table <popflooders> persist counters file "/var/tmp/popflooders"

takes part of the restoring, and the backing up is a matter of setting up a cron(8) job to dump current contents of the table to the file that will be loaded into the table at ruleset load.

Then today I made another tiny change and made the data available for download. The popflooders table is dumped at five past every full hour to pop3gropers.txt, a file desiged to be read by anything that takes a list of IP addresses and ignores lines starting with the # comment character. I am sure you can think of suitable applications.

In addition, the same script does a verbose dump, including table statistiscs for each entry, to pop3gropers_full.txt for readers who are interested in such things as when an entry was created and how much traffic those hosts produced, keeping in mind that those hosts are not actually blocked here, only subjected to a tiny bandwidth.

As it says in the comment at the top of both files, you may use the data as you please for your own purposes, for any re-publishing or integration into other data sets please contact me via the means listed in the bsdly.net whois record.

As usual I will answer any reasonable requests for further data such as log files, but do not expect prompt service and keep in mind that I am usually in the Central European time zone (CEST at the moment).

I suppose we should see this as a tiny, incremental evolution of the "Cybercrime Robot Torture As A Service" (CRTAAS) concept.

Update 2017-04-29: While the world was not looking, I supplemented the IP address dumps with versions including one with geoiplocation data added and a per country summary based on the geoiplocation data.

Spending a few minutes with an IP address dump like the one described here and whois data is a useful excersise for anyone investigating incidents of this type. This .csv file is based on the 2017-04-29T1105 dump (preserved for reference), and reveals that not only is the majority of attempts from one country but also a very limited number of organizations within that country are responsible for the most active networks.

The spammer blacklist (see this post for background) was of course ripe for the same treatment, so now in addition to the familiar blacklist, that too comes with a geoiplocation annotated version and a per country summary.

Note that all of those files except the .csv file with whois data are products of automatic processes. Please contact me (the email address in the files works) if you have any questions or concerns.

Update 2017-05-17: After running with the autofilling tables for approximately a month, and I must confess, extracting bad login attempts that didn't actually trigger the overload at semi-random but roughly daily intervals, I thought I'd check a few things about the catch. I already knew roughly how many hosts total, but how many were contactin us via IPv6? Let's see:

[Wed May 17 19:38:02] peter@skapet:~$ doas pfctl -t popflooders -T show | wc -l
    5239
[Wed May 17 19:38:42] peter@skapet:~$ doas pfctl -t popflooders -T show | grep -c \:
77

Meaning, that of a total 5239 miscreants trapped, only 77, or just short of 1.5 per cent tried contacting us via IPv6. The cybercriminals, or at least the literal bottom feeders like the pop3 password gropers, are still behind the times in a number of ways.

Update 2017-06-13: BSDCan 2017 past, and the PF and networking tutorial with OpenBSD session had 19 people signed up for it. We made the slides available on the net here during the presentation and announced them on Twitter and elsewhere just after the session concluded. The revised tutorial was fairly well received, and it is likely that we will be offering roughly equivalent but not identical sessions at future BSD events or other occasions as demand dictates.

Update 2017-07-05: Updated the overload criteria for the longterm table to what I've had running for a while: max-src-conn 1, max-src-conn-rate 1/1.

Update 2017-10-07: I've decided to publish a bit more of the SSH bruteforcer data. The origin is from two gateways in my care, both with these entries in their pf.conf:

table persist counters file "/var/tmp/bruteforce"
block drop log (all) quick from

supplemented with cron jobs that dump the current data to the file so the data survives a reboot. After years of advocating 24-hour expiry on blacklists, I recently changed my mind so, both hosts now run with 28-day expiry. For further severity on my part, the hosts also exchange updates to their bruteforce tables via cron jobs that dump table contents to file, fetch the partner's data and load into their own local table. In addition, a manual process extracts (at quasi-random but approximately daily intervals) addresses of failures that do not reach the limits and add those to the tables as well.

The data comes in three varieties: a raw address list, (with a #-prepended comment at the start) suitable for importing into such things as a PF table you block traffic from, the address list with the country code for each entry appended, and finally a summary of list entries per country code. All varieties are generated twice per hour. 

1 comment:

  1. Hey, Peter. It's me, Pete. :) My ancestors are Norwegian, but I'm American and living in the USA.

    I read your article with interest, and it occurs to me that you could use a technique I started using recently. It's loosely related to port-knocking, but in many ways the exact opposite of the usual brute-force blocking methods. Since I've never been able to find any hints of anyone else doing this, I claimed naming rights, so I decree it to be called "knocking harder."

    I use Linux, so my solution uses iptables and its xt_recent module, but I assume the BSDs have something equivalent (and likely better). xt_recent is most often used to thwart brute-force attempts by limiting the connection rate. For instance, if you receive five SYN packets within 30 seconds, you can block all further SYNs for a day or so. My problem with this is it's reactive, not proactive, and does not hide the service's existence from scanners and attackers like port-knocking would. However, vanilla port-knocking is a pain to setup and use, and requires something on the client host to emit the knock sequence.

    My solution is to force attackers to be more noisy, or "knock harder" to gain entrance. Since scanners are likely to send one, possibly two SYNs to discover a service, I throw the first few SYNs away. True TCP stacks, as used by real clients, will retransmit TCP packets that aren't ACKed. So I don't allow an incoming SYN until I get the 3rd one within 4 seconds. For services like SSH, that means about a 3 second delay in session startup, but empirical testing shows that ALL brute force attacks stopped as soon as I implemented this. I was surprised at how well it worked.

    This technique can be used in conjunction with blacklisting or throttling of connections that do make it through, but it cuts out the vast majority (if not all) of opportunistic scanners and brute-forcers early on. Feel free to adapt to your services, as true pop3 clients should still function if they use a true TCP stack. Please let me know how it works for you if you do try it.

    Thanks for the articles. I look forward to more interesting posts.

    ReplyDelete

Note: Comments are moderated. On-topic messages will be liberated from the holding queue at semi-random (hopefully short) intervals.

I invite comment on all aspects of the material I publish and I read all submitted comments. I occasionally respond in comments, but please do not assume that your comment will compel me to produce a public or immediate response.

Please note that comments consisting of only a single word or only a URL with no indication why that link is useful in the context will be immediately recycled so those poor electrons get another shot at a meaningful existence.

If your suggestions are useful enough to make me write on a specific topic, I will do my best to give credit where credit is due.