Name Collisions II — A call for research

This post is a heads up to all uber-geeks about a terrific research initiative to try to figure out causes and mitigation of name-collision risk. There’s a $50,000 prize for the first-place paper, a $25,000 prize for the second place paper and up to five $10,000 prizes for third-place papers. That kind of money could buy a lot of toys, my peepul. And the presentation of those papers will be in London — my favorite town for curry this side of India. Interested? Read on. Here’s a link to the research program — you can skip the rest of this post and get right to the Real Deal by clicking here:

www.NameCollisions.net

Background and refresher course — what is the DNS name-collision problem?

Key points

Even now, after months of research and public discussion, I still don’t know what’s going to happen
I still don’t know what the impact is going to be, but in some cases it could be severe
Others claim to know both of those things but I’m still not convinced by their arguments right now
Thus, I still think the best thing to do is learn more
That’s why I’m so keen on this research project.

Do note that there is a strong argument raging in the DNS community about all this. There are some (myself included) who never met or even heard of the DNS purists who currently maintain that this whole problem is our fault and that none of this would have happened if we’d all configured our private networks with fully-qualified domain names right from the start.

Where were those folks in 1995 when I opened my first shrink-wrapped box of Windows NT and created the name that would ultimately become the root of a huge Active Directory network with thousands of nodes? Do you know how hard it was to get a domain name back then? The term “registrar” hadn’t been invented yet. All we were trying to do is set up a shared file, print and mail server for crying out loud. The point is that there are lots of legacy networks that look like the one depicted below, some of them are going to be very hard and expensive to rename, and some of them are likely to break (perhaps catastrophically) when second level names in new gTLDs hit the root. m’Kay?

Private networks, the way we’ve thought about them for a decade

Here’s my depiction of the difference between a private network (with all kinds of domain names that don’t route on the wider Internet) and the public Internet (with the top level names you’re familiar with) back in the good old days before the arrival of 1400 new gTLDs.

Private networks, the way they may look AFTER 1400 new gTLDs get dropped into the root

The next picture shows the namespace collision problem that the research efforts should be aimed at addressing. This depiction is still endorsed by nobody, your mileage may vary, etc. etc. But you see what’s happening. At some random point in the future, when a second-level name matching the name of your highly-trusted resource gets delegated, there’s the possibility that traffic which has consistently been going to the right place in your internal network will suddenly be routed to an unknown, untrusted destination on the worldwide Internet.

The new TLDs may unexpectedly cause traffic that you’re expecting to go to your trusted internal networks (or your customer’s networks) to suddenly start being routed to an untrusted external network, one that you didn’t anticipate. Donald Rumsfeld might call those external networks “unknown unknowns” — something untrusted that you don’t know about in advance.

Think of all the interesting and creative ways your old network could fail. Awesome to contemplate, no? But wait…

What if the person who bought that matching second-level name in a new gTLD is a bad actor? What if they surveyed the error traffic arriving at that new gTLD and bought that second-level name ON PURPOSE, so that they could harvest that error traffic with the intention of doing harm? But wait…

What if you have old old old applications that are hard-coded to count on a consistent NXDOMAIN response from a root server? Suppose that the application gets a new response when the new gTLD gets delegated (and thus the response from the root changes from the expected NXDOMAIN to an unexpected pointer to the registry). What if the person that wrote that old old old application is long gone and the documentation is… um… sketchy? But wait…

To top it all off, with this rascal, problems may look like a gentle random rain of breakage over the next decade or so as 2nd-level names get sold. It’s not going to happen on gTLD-delegation day, it’s going to happen one domain at a time. Nice isolated random events sprinkled evenly across the world. Hot damn. But wait…

On the other end of the pipe, imagine the surprise when some poor unsuspecting domain-registrant lights up their shiny new domain and is greeted by a flood of email from network operators who are cranky because their networks just broke. What are THEY going to be able to do about those problems? Don’t think it can happen? Check out my www.corp.com home page — those cats are BUSY. That domain gets 2,000,000 error hits A DAY. Almost all of it from Microsoft Active Directory sites.

So argue all you want. From my perch here on the sidelines it looks like life’s going to get interesting when those new gTLDs start rolling into the root. And that, dear reader, is an introduction to the Name Collision problem.

Mitigation approaches.

Once upon a time, 3 or 4 months ago when I was young and stupid, I thought this might be a good way to approach this problem. I’m going to put it in this post as well, but then I’m going to tell you why it won’t work. Another explanation of why we need this research and we need it now.

Start here:

If you have private networks that use new gTLDs (look on this list) best start planning for a future when those names (and any internal certificates using those names) may stop working right.
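If you want a head start on that planning, here is a rough Python sketch of the first step: comparing an inventory of your internal names against a list of new gTLD strings. Everything here is an assumption for illustration. The function names are made up, the sample names are hypothetical, and the three-entry gTLD list is a stand-in for the real list, which you would have to load yourself.

```python
def rightmost_label(name: str) -> str:
    """Return the last label of a DNS name, lower-cased, ignoring a trailing dot."""
    return name.rstrip(".").lower().rsplit(".", 1)[-1]


def find_collisions(internal_names, new_gtlds):
    """Map each colliding gTLD string to the internal names that end in it."""
    # Normalise the gTLD list: drop blanks, leading dots and case differences.
    gtlds = {t.strip().lower().lstrip(".") for t in new_gtlds if t.strip()}
    hits = {}
    for name in internal_names:
        label = rightmost_label(name)
        if label in gtlds:
            hits.setdefault(label, []).append(name)
    return hits


if __name__ == "__main__":
    # Hypothetical sample data; substitute your own inventory and the real list.
    sample_gtlds = ["corp", "home", "mail"]
    sample_names = ["fileserver.hq.corp", "printer.home.", "www.example.com"]
    for tld, names in sorted(find_collisions(sample_names, sample_gtlds).items()):
        print(tld, names)
```

This only looks at the rightmost label, so it will not catch names buried in certificates, search-list suffixes or firmware. It is a starting point for the inventory, not the inventory itself.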

A bad solution:

In essence, I thought the key to this puzzler was to take control of when the new gTLDs become visible to your internal network. It’s still not a terrible idea, but I’ve added a few reasons why it won’t work down at the end. Here’s the scheme that I cooked up way back then.

By becoming authoritative for new gTLDs in your DNS servers now, before ICANN has delegated them, you get to watch the NXD error traffic right now rather than having to wait for messages from new registries. Here’s a list of the new gTLDs to use in constructing your router configuration.

This is the part where you look at the NXD traffic and find the trouble spots. Then, with a mere wave of my hand and one single bullet point, I encourage you to fix all your networks. Maybe you’ve got a few hundred nodes of a distributed system all over the world that you need to touch? Shouldn’t be a problem, right?
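To make "look at the NXD traffic" a little more concrete, here is a minimal Python sketch that tallies NXDOMAIN responses per queried name for the gTLDs you are watching. The log format is an assumption: one line per query, holding the queried name and the response code separated by whitespace. Real DNS server logs look different, so the parsing would need to be adapted; the function name is made up too.

```python
from collections import Counter


def tally_nxdomain(log_lines, watched_tlds):
    """Count NXDOMAIN responses per (tld, queried name) for the watched TLDs.

    Assumes each log line looks like "<qname> <rcode>" (a hypothetical format).
    Lines that do not match that shape are skipped.
    """
    watched = {t.lower() for t in watched_tlds}
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        qname, rcode = parts
        if rcode.upper() != "NXDOMAIN":
            continue
        qname = qname.rstrip(".").lower()
        tld = qname.rsplit(".", 1)[-1]
        if tld in watched:
            counts[(tld, qname)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines for illustration only.
    sample = [
        "dc1.hq.corp. NXDOMAIN",
        "dc1.hq.corp NXDOMAIN",
        "www.example.com NOERROR",
        "nas.home NXDOMAIN",
    ]
    for (tld, qname), n in tally_nxdomain(sample, ["corp", "home"]).most_common():
        print(n, tld, qname)
```

The names with the highest counts are the likely trouble spots, which at least tells you where to start walking around with the screwdriver.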

This is the Good Guy part of this approach. Of course, because we all subscribe to the One World, One Internet, Everybody Can Reach Everything credo, we will of course remember to remove the preventative blocking from our routers just as soon as possible. Right? Right?

The reasons why this won’t work:

The first thing that blows my idea out of the water is that you probably don’t have complete control over the DNS provider your customers use. I still think this is a pretty good idea in tightly-run corporate shops that don’t permit end users to modify the configuration of their machines. But in this Bring Your Own Device world we live in, there’s going to be a large population of people who configure their machines to point at DNS providers who aren’t blocking the names that conflict with your private network space.

Let’s assume for a minute that everything is fine in the internal network, and the corporate DNS resolver is blocking the offending names while repairs are being made (hopefully cheaply). Suppose a road warrior goes out to Starbucks and starts using a laptop that’s configured to point at Google’s 8.8.8.8 DNS resolver. In the old days before new gTLDs, the person would fire up their computer, go to the private name, the query would fail and they would be reminded to fire up the VPN to get to those resources. Tomorrow, with a conflicting new gTLD in the root, that query might succeed, but they wouldn’t be going to the right place.
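One crude sanity check a client could run in that situation is sketched below in Python. It rests on a big assumption that will not hold everywhere: that your internal resources live on private address space, so an internal name that suddenly resolves to a public address deserves suspicion. The function name and the three labels are invented for this example, and the lookup itself is left out; the function only classifies the addresses you hand it.

```python
import ipaddress


def classify_answer(addresses):
    """Classify the addresses returned for a name that should be internal.

    Assumption: internal resources use private address space, so any
    public address in the answer is treated as suspect.
    """
    if not addresses:
        # The old behaviour: the lookup failed, time to start the VPN.
        return "nxdomain-or-empty"
    if all(ipaddress.ip_address(a).is_private for a in addresses):
        return "internal"
    return "suspect"


if __name__ == "__main__":
    print(classify_answer([]))
    print(classify_answer(["10.1.2.3"]))
    print(classify_answer(["10.1.2.3", "8.8.8.8"]))
```

A "suspect" result does not prove a collision, and a clean result does not rule one out (the other end could just as easily answer with a private address). Treat it as a tripwire and nothing more.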

Here’s the second problem. My tra-la-la scheme above assumes that most mitigation will be easy, and successful. But what if it’s not? What if you have a giant Active Directory tree which, by all accounts, is virtually impossible to rename without downtime? What if you have to “touch” a LOT of firmware in machines that are hard-wired to use new gTLDs? What if vendors haven’t prepared fixes for the devices that are on your network looking at a new gTLD with the presumption that it won’t route to the Internet (yet now it does)? Or the nightmare scenario: something breaks that has to be diagnosed and repaired in minutes?

The research project

See why we need you to look hard at this problem? Like, right now?? ICANN is already delegating these domains into the root. Here’s a page that lists the ones that have already been delegated.

http://newgtlds.icann.org/en/program-status/delegated-strings

If you see one of your private network names on THIS list, you’re already in the game. Hooyah! So this is moving FAST. This research should have been done years ago, long before we got to this stage. But here we are. We, the vast galaxy of network operators and administrators who don’t even know this is coming, need your help. Please take a look at the NameCollisions.net site and see if you can come up with some cool ideas. I hope you win — because you’ll help the rest of us a lot. I’ll buy you a curry.