
> 1) That would do nothing to fix resolvers that had already cached NSEC responses lacking type maps.

The TTL for NSEC records is presumably way lower than the TTL for the DS records.
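
(A rough sketch of how you could eyeball this yourself, in Python with dnspython; example.com is just a placeholder zone, not Slack's. The DS TTL comes from the parent, while the zone's SOA TTL/minimum is what typically bounds how long NSEC answers sit in resolver caches.)

    import dns.resolver

    zone = "example.com"  # placeholder zone

    ds = dns.resolver.resolve(zone, "DS")    # DS lives in the parent zone
    soa = dns.resolver.resolve(zone, "SOA")

    print("DS TTL:", ds.rrset.ttl)
    print("SOA TTL:", soa.rrset.ttl, "| SOA minimum:", soa[0].minimum)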

> 2) That presumes the wildcard record was superfluous and could have been replaced with a simple A record for a single or small number of records. Would love to see a citation supporting that.

It’s theoretically possible that it would not have worked for all cases, but that is, in my experience, very unlikely.
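
(To illustrate, a toy sketch: if you know which hostnames are actually in use, you can publish them explicitly instead of relying on a wildcard. The hostnames and the 192.0.2.10 documentation address are made up, not Slack's.)

    # emit explicit A records in place of a wildcard like *.example.com
    hostnames = ["app", "files", "status"]  # hypothetical list of names in use
    target_ip = "192.0.2.10"                # RFC 5737 documentation address

    for host in hostnames:
        print(f"{host}.example.com. 300 IN A {target_ip}")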

> Any way you slice it, there was no quick way to fully recover from this bug once they hit it

The bug seems to me to have been reasonably easy to mitigate, but their problem was that Slack did not know what they were doing. The bug itself was minor, but Slack tried to fix it by ceasing to serve DNSSEC-signed DNS data while long-TTL DS records were still unexpired out in the world. This is the worst possible thing you could do.
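
(A sketch of the kind of check this implies, again in Python with dnspython and a placeholder zone: before you stop serving signed data, the DS record at the parent needs to be removed and to have expired from caches, or validating resolvers will reject your unsigned answers.)

    import dns.resolver

    zone = "example.com"  # placeholder zone

    try:
        ds = dns.resolver.resolve(zone, "DS")
        print(f"Parent still publishes DS (TTL {ds.rrset.ttl}s); "
              "going unsigned now would break validating resolvers.")
    except dns.resolver.NoAnswer:
        print("No DS at the parent; unsigned answers should be treated as insecure and accepted.")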

> Of the top 100 domains -- the operators you would assume would be the most concerned about DNS response poisoning -- *six* have turned DNSSEC on.

1. That number used to be zero, as tptacek liked to point out.

2. The huge operators often have fundamentally different security priorities than regular companies and users.

3. People said the same about IPv6 and SSL, which were also very slow to adopt. But they are all climbing.



> The TTL for NSEC records is presumably way lower than the TTL for the DS records.

Possibly. It was still an outage they had to wait out the TTL for, due to the design of DNSSEC.

> It’s theoretically possible that it would not have worked for all cases, but that is, in my experience, very unlikely.

This is completely unsubstantiated speculation on your part.

> The bug seems to me to have been reasonably easy to mitigate, but their problem was that Slack did not know what they were doing.

It is indeed reasonably easy to Monday-morning quarterback someone else's outage and blame operators for the sharp edges around poorly designed protocols.

> 1. That number used to be zero, as tptacek liked to point out.

Cool. So, at this rate, in another 100 years or so we should be at 50% adoption.

> 2. The huge operators often have fundamentally different security priorities than regular companies and users.

Priorities like uptime?

> 3. People said the same about IPv6 and SSL, which were also very slow to adopt. But they are all climbing

1) People started rolling IPv6 out once v4 addresses got scarce. There is no such compelling event to drive DNSSEC adoption. 2) SSL is easy to roll out and provides compelling security benefits. It is also exceedingly unlikely in practice to blow up in your face and result in run-out-the-clock outages -- unlike DNSSEC.


> This is completely unsubstantiated speculation on your part.

Do you have any support for your assumption that the wildcard record was vital and practically impossible to replace with regular records?

> It is indeed reasonably easy to Monday-morning quarterback someone else's outage and blame operators for the sharp edges around poorly designed protocols.

When Slack, a large company presenting themselves as proficient in tech, make a tech mistake so bad that they lock themselves out of the internet for an entire day, a mistake even I know not to make, then I get to criticize them.


Cloudflare has been promoting DNSSEC for almost as long as I've been writing about DNSSEC, so no, nothing has really changed with the Top 100.



