Why is this resource asymmetry unavoidable? It seems reasonable that a new protocol could solve it.
Consider augmenting TCP with a "pre-SYN" and "pre-SYNACK". Suppose you have to send a pre-SYN which declares your identity, IP X. Then the server sends X a "pre-SYNACK": a message containing (X, t = timestamp()), encrypted under a server-only key.
Afterwards, X may send the server a real SYN. From the moment it receives that SYN back, the server only maintains bookkeeping for a timeout of length, say, timestamp() - t.
As benevolent, cooperating clients, we might ourselves wait an exponentially-backed-off amount of time between receiving the pre-SYNACK from the server and sending our SYN back (so that we establish our connection in logarithmically many rounds).
Then any malevolent actor must keep a resource on X alive for at least half as long as the server does (since the server requires zero bookkeeping after firing off the pre-SYNACK).
Edit: this is oversimplified; you could just own a machine at X far away from the DDoS'd server and then have your botnet spoof requests from X. By augmenting this procedure with even more preamble rounds, you could measure how long communication actually takes between the server and client X and subtract that out.
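A minimal sketch of the token mechanics I'm imagining, with hypothetical names. It substitutes an HMAC tag under the server-only key for the encryption in the description above, which is enough for the server to recognize its own (X, t) tokens without storing any state:

    # Sketch only; names are hypothetical. HMAC stands in for encryption,
    # since the server just needs to recognize tokens it issued itself.
    import hashlib
    import hmac
    import os
    import struct
    import time

    SERVER_KEY = os.urandom(32)  # the server-only key

    def make_pre_synack(client_ip: str) -> bytes:
        """Issued in response to a pre-SYN; the server stores nothing."""
        payload = client_ip.encode() + struct.pack("!Q", int(time.time()))
        tag = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
        return payload + tag

    def accept_syn(token: bytes, client_ip: str):
        """On the real SYN, verify the token and return the bookkeeping
        timeout the server will allow (the elapsed time since the token
        was issued), or None if the token is bogus."""
        payload, tag = token[:-32], token[-32:]
        expected = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tag):
            return None
        ip, t = payload[:-8].decode(), struct.unpack("!Q", payload[-8:])[0]
        if ip != client_ip:
            return None
        return time.time() - t  # keep connection state only this long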
The server has to do an encryption and a decryption. A legit client has to do the pre-SYN dance. In your protocol, I don’t spam pre-SYNs; I spam SYNs with fake tokens. Thus I send only the same number of packets as before, but the server has to do these decryptions! Or I can mix and match spoofed SYNs with pre-SYNs and legitimate SYNs in whatever way finds the best leverage.
Your pre-SYN design is very close to the design of TLS 1.2, by the way. That's how I knew how to attack it. TLS is attackable in just this way: open a real connection, then send lots of fake handshake requests. No crypto is required to generate them, but the server has to do lots of it.
Now that said, BCP 38 would go a LONG way towards addressing this. There will come a day when the well-behaved networks decide they’re not accepting traffic from non-compliant peers.
You do raise a good point. Maybe it's outlandish, but see my sibling comment (https://news.ycombinator.com/item?id=28578832) where the server's overhead for rejecting invalid packets, which the attacker made for free, goes down to performing a hash.
Mind you, BCP 38 will do nothing against legitimate IPs used in DDoS attacks (which are more and more common because of large botnets, thanks to IoT devices and their abysmal security).
This method is basically already in use for DoS mitigation via a technique called SYN cookies. But as the parent said, there is a huge spectrum of different ways to produce a DoS, so there is no single method of addressing them. The problem tends to be much harder as you get to higher levels, for example requesting things from an HTTP server. There are mitigations available there as well, such as Cloudflare's approach of capturing DoS-like requests and subjecting them to a CAPTCHA, but once again no one size fits all.
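For concreteness, a heavily simplified sketch of the SYN cookie idea (illustrative names, not the exact TCP arithmetic): the server derives its initial sequence number from a hash of the connection 4-tuple and a coarse timestamp, stores nothing, and later checks that the ACK number the client echoes back is consistent:

    # Simplified illustration of SYN cookies; real implementations also
    # encode an MSS index and use the exact TCP sequence-number arithmetic.
    import hashlib
    import time

    SECRET = b"server-only-secret"  # placeholder

    def _cookie_for(t, src_ip, src_port, dst_ip, dst_port) -> int:
        h = hashlib.sha256(
            f"{src_ip}:{src_port}:{dst_ip}:{dst_port}:{t}".encode() + SECRET
        ).digest()
        return int.from_bytes(h[:4], "big")

    def syn_cookie(src_ip, src_port, dst_ip, dst_port) -> int:
        # Sent as the server's initial sequence number; no state is stored.
        return _cookie_for(int(time.time()) >> 6,
                           src_ip, src_port, dst_ip, dst_port)

    def check_ack(ack_number, src_ip, src_port, dst_ip, dst_port) -> bool:
        # The client echoes cookie + 1 as the ACK number; accept the current
        # and previous 64-second bucket.
        now = int(time.time()) >> 6
        return any(
            ack_number == (_cookie_for(now - age, src_ip, src_port,
                                       dst_ip, dst_port) + 1) % 2**32
            for age in (0, 1)
        )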
I guess the problem is "simply" the use of these resource-asymmetric protocols. As a server owner, if there were a robust alternative protocol which ensured resource symmetry, wouldn't you be naturally incentivized to adopt it and reject all asymmetric ones?
It still feels like there's a way of specifying a protocol (which of course would need to be adopted) where (1) authentication is a first-class entity and (2) all non-authenticated requests enforce resource symmetry at the protocol level, while still allowing asymmetric work to be done (after a resource-symmetric auth).
Edit: augmenting the rough outline of what I originally proposed, you can essentially require an arbitrary amount of proof-of-work, however useless, on the requester side (e.g., literally require a nonce such that the hash of the packet is below some value). Since only auth needs to be symmetric, the overhead doesn't seem too bad.
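A sketch of that nonce idea (names and the difficulty value are illustrative): the client grinds hashes until one falls below a target, while the server verifies with a single hash.

    # Illustrative proof-of-work sketch; TARGET is an arbitrary difficulty.
    import hashlib
    import itertools

    TARGET = 2**256 >> 20  # roughly 2^20 expected hash attempts per request

    def find_nonce(request: bytes) -> int:
        # Client side: brute-force a nonce until the hash is below TARGET.
        for nonce in itertools.count():
            digest = hashlib.sha256(request + nonce.to_bytes(8, "big")).digest()
            if int.from_bytes(digest, "big") < TARGET:
                return nonce

    def verify(request: bytes, nonce: int) -> bool:
        # Server side: a single hash to check, regardless of how much work
        # the requester spent finding the nonce.
        digest = hashlib.sha256(request + nonce.to_bytes(8, "big")).digest()
        return int.from_bytes(digest, "big") < TARGET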
The problem with the proof-of-work concept, which is raised frequently, is that the vast majority of DoS traffic today comes from attackers that don't pay for their work. They're using botnets of compromised devices. So all a PoW requirement tends to do, in practice, is make their botnet run a little bit warmer.
Sure, you could crank up the difficulty until it makes the cost of the attack too high, but then all your users will get mad - that's basically what CloudFlare does and it attracts non-stop criticism.
Another way we could put this is that asymmetric work requirements are basically the definition of a network service. If we require the client to do enough computation that it expends as many resources as the backend, in a lot of ways there's no reason for the backend to even exist! People will be prone to migrate to a peer-to-peer or desktop solution instead of using your very very slow website.
Or a little more of a hot take: the idea of proof-of-work requirements to prevent abuse of systems with asymmetric workload is almost as old as network systems. A prominent example is "hashcash" for email, introduced in 1997. These have so routinely failed to gain any traction that we need to consider that the proof-of-work idea tends to come from a fundamental misunderstanding of the problem. Increasing the cost to the attacker sounds great until you consider that in most real-world situations, the attacker's budget is effectively infinite... because for a couple decades now these types of abuses have originated mostly from compromised systems, not from systems the attacker controls. Proof-of-work requirements still reduce the volume the attacker is capable of producing, but trends like IoT and the general proliferation of computing mean that the attacker's potential resources continually expand. More difficult PoW, in most cases, will only cause the attacker to make a relatively one-time investment of obtaining a new set of compromised resources (a new botnet, a new exploit to distribute an existing botnet, etc.). The increase in "per-item" cost to the attacker tends to never materialize.
> Sure, you could crank up the difficulty until it makes the cost of the attack too high, but then all your users will get mad - that's basically what CloudFlare does and it attracts non-stop criticism.
This is interesting, where can I read more about both sides?
I do think you hit the nail on the head here, with resource symmetry being the center of the give-get, and making the botnet run warmer being the win (because it requires larger botnets, because it's more detectable, etc.). I think this might make me upset as a client, but if it's economically advantageous for the server, then I'd still tolerate it to get the server's content.
I think that's why you need auth to be first-class in such a protocol. Yes, auth might be painful and annoying and require that your browser essentially mine hashes for a while on connection setup, but after that presumably your website would be as fast as usual, and perhaps there might be a way to safely cache such authentication with a lease for valid users. Perhaps this kind of approach is too anti-internet, though, since it basically says that to do any useful backend work you need to be a registered user for some backend service.
Yes, I think there is a lot of room for better protocols that reduce the asymmetry. A lot of our internet is built on protocols that assumed good behavior, since back in the 70s and 80s and even 90s that was plausible.
But I don’t think we will ever get to a point where we bring that asymmetry to zero. We often really do want complex tasks to be done by a server in response to a simple client request.
Software people rarely put this asymmetry at the front of their minds during design, so I think it's something we will continue to see in new systems, even if we (magically) universally adopted better versions of the old, too-trusting protocols.
If we do some form of authentication, it seems likely that the receiving side needs to do some kind of computation. What if I just send garbage? Garbage is easy to create, but the receiver needs to do some work to figure that out.
Hrm, I don't think you're engaging with the spirit of my proposal, which does require resource symmetry to establish auth in the protocol.
I admit I haven't specified a full counterfactual protocol here, but see my edit for a rough outline of how you wouldn't just be able to generate garbage.
You said "I feel that..." so I suppose you're an mbti 'F' - excellent choice! But I feel you'll work on this idea for a bit and then throw your hands up.
The symmetry is irrelevant when the attacker has stolen someone else's resources. Granny may notice her computer is even slower than usual, but what then?
If we stipulate a future where attackers cannot steal other people's resources, then work backwards, I can't see any internet with any degree of freedom.
It would be necessary for every system to be 100% watertight. Mathematically provably so. Spectre/Meltdown demonstrates that this isn't just a matter of formal proofs about software. Eradicating the pathways to DDoS is an intractable problem. You'd need to seriously consider preventing all network access except through certified and regulated kiosks to which users do not have direct physical access. Like, hand a librarian a piece of paper with a URL on it and get a printout.
I'm no expert; you shouldn't have confidence that I'm correct.
On your somewhat off-topic meta-comment about my lack of due diligence: lucky for me I don't need to break out Coq to validate an RFC draft to post a comment on HN, which despite my mushy-gushy feelings has resulted in a productive, curious, and educational discussion, at least for me!
On the point of stolen resources, true, the attacker doesn't care, but I think if we get to the level of resource symmetry in a protocol we've effectively throttled a class of attacks. There are only so many grannies whom a given attacker can pwn. Symmetry is relevant because it makes attacks that much harder and that much more demanding of your botnet. Besides, like you mention at the end of your comment, demanding some kind of additional Byzantine DoS tolerance is likely too hard of an ask.
> There are only so many grannies whom a given attacker can pwn
And at how many millions does that number start to taper down? Plus, zombie devices of all sorts are being used of course, so while it certainly does feel like some sort of resource-symmetry scheme would be, if nothing else, a satisfying solution, it's hard to see logically how that would really help all that much against these sorts of attacks.
Not to mention, implementation of new protocols is one thing, adoption of such protocols is another - I would dare say the harder of the two.
If you are required to serve a population that won't keep up, it's an uphill battle.
You're too laser-focused on SYN attacks; there are countless others. What it boils down to is the sheer size of the botnets. You stop SYN floods? Great, I'll make it UDP with a max payload and send it from 500k hosts. You will eventually fall over if my botnet is big enough.
> Why is this resource asymmetry unavoidable? It seems reasonable that a new protocol could solve it.
Sending an HTTP GET is easy, but the server has to process the request and send the whole page... think a few tens of bytes for the request, and a whole webpage in the reply.
The fundamental asymmetry is bandwidth. If you can overwhelm my incoming connection, I can't do much useful work.
I've got to pay for capacity, but a botnet controller doesn't need a whole lot of bandwidth to trigger a lot of it. If you control 10,000 hosts with 10 Mbps upload each, that's 100 Gbps; and botnets are growing, as are typical upload bandwidths.
And that's without spoofing and amplified reflection attacks. Some of the reflection attacks have high amplification, so if you've got access to 100 Gbps of spoofable traffic, you can direct Tbps of traffic.
If my server has a 10G connection and you're sending me 1 Tbps, there's just no way I can get any work done.
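The back-of-the-envelope arithmetic, with the amplification factor being an assumed round number (real reflection attacks vary widely):

    hosts = 10_000
    upload_mbps = 10
    botnet_gbps = hosts * upload_mbps / 1_000       # 100 Gbps of direct traffic
    amplification = 10                              # assumed reflection factor
    reflected_gbps = botnet_gbps * amplification    # ~1 Tbps after reflection
    server_link_gbps = 10
    print(f"{reflected_gbps:.0f} Gbps of attack vs. a {server_link_gbps} Gbps server link")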
Syncookies work pretty well for TCP. I had run into some effective SYN floods against the hosts I managed in 2017, but upgrading to FreeBSD 11 got me to handling line-rate SYNs on a 2x1 Gbps box; I didn't take the time to test 2x10G as I wasn't really seeing that many SYN floods. I don't recall any more effective SYN floods after that point. We didn't tend to have dedicated attackers though; people seemed to be testing DDoS-as-a-service against us, and we'd tend to see exactly 90 or 300 seconds of abuse at random intervals.
Our hosting provider would null route IPs that were attacked if the volume was high enough for long enough. Null routing is kind of the best you can hope for your upstreams to do, but it's also a denial of service, so there you go.
> Why is this resource asymmetry unavoidable? It seems reasonable that a new protocol could solve it.
Protocols are trying to avoid this problem. For TCP there exist SYN cookies to reduce the amount of work a host has to do. For QUIC there are retry packets, as well as anti-amplification rules. However, applying those mechanisms also has a cost. One is that the computational cost doesn't go to zero, since you still have to receive and reject packets. The other is that your legitimate users experience a latency penalty, since another round trip is potentially required.
The latter means you don't want to make those mechanisms the default, but rather use them selectively in certain situations.
The former means that even if you apply countermeasures you are not fully protected. If you get flooded purely by the sheer number of packets and the system stalls just from handling them, all the higher-level mitigations won't work very well anymore. That's the difference between a DDoS and a DoS - the DDoS might not even need an asymmetry in cost, it just brute-forces the system down.
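As an illustration of the anti-amplification rules mentioned above, here is a sketch (hypothetical class and method names) of the QUIC-style rule that a server may not send more than a small multiple of the bytes it has received - 3x in QUIC - until the client's address has been validated:

    # Sketch of an anti-amplification limit; names are illustrative.
    class PathState:
        AMPLIFICATION_FACTOR = 3  # QUIC's pre-validation limit

        def __init__(self):
            self.bytes_received = 0
            self.bytes_sent = 0
            self.address_validated = False

        def on_receive(self, n: int):
            self.bytes_received += n

        def can_send(self, n: int) -> bool:
            # Before validation, a spoofed source only buys the attacker a
            # small, bounded amount of reflected traffic.
            if self.address_validated:
                return True
            return (self.bytes_sent + n
                    <= self.AMPLIFICATION_FACTOR * self.bytes_received)

        def on_send(self, n: int):
            self.bytes_sent += n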
> Why is this resource asymmetry unavoidable? It seems reasonable that a new protocol could solve it.
There's some sense in which a little asymmetry is unavoidable -- a malicious attacker can always just send whatever bytes they want over the wire, and a server on the other end with any practical purpose must at a minimum receive each of those bytes AND perform some operation equivalent to determining if any more work needs to be done.
To the extent that any real work or state management needs to be done by the server, the only way to avoid much asymmetry is to use a protocol forcing the client calls to be artificially more expensive before the server will agree to respond.