This is another scratch-pad post that's aimed at a narrow audience -- network geeks, especially in ISPs and corporations. The first bit is a 3-minute read, followed by a 20-minute "more detail" section. If you're baffled by this, but maybe a little concerned after you read it, please push this page along to your network-geek friends and colleagues and get their reaction. Feel free to repost any/all of this.
Key points before we get started
- I don't know what's going to happen
- I don't know what the impact is going to be, but in some cases it could be severe
- Others claim to know both of those things but I'm not convinced by their arguments right now
- Thus, I think the best thing to do is learn more, hope for the best and prepare for the worst
- My goal with this post is just to give you a heads-up
If I were you, I'd:
- Scan my private network and see if any of my names collide with the new gTLDs that are coming
- Check my recursive DNS server logs and see if any name collisions are appearing there
- Start thinking about remediation now
- Participate in the discussion of this topic at ICANN, especially if you foresee major impacts
- Spread the word that this is coming to friends and colleagues
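For the first item on that list, here's a minimal sketch of what "scan for collisions" could look like. It assumes you can export your internal host names to a plain list and that you have the applied-for gTLD strings as another list. The sample data and function names below are made up for illustration; swap in your own inventory.

```python
def tld_of(name: str) -> str:
    """Return the rightmost label of a DNS name, lowercased, ignoring a trailing dot."""
    return name.strip().rstrip(".").lower().rsplit(".", 1)[-1]


def find_collisions(internal_names, applied_for_tlds):
    """Map each applied-for TLD to the internal names that end in it."""
    applied = {t.strip().lower() for t in applied_for_tlds if t.strip()}
    hits = {}
    for name in internal_names:
        if not name.strip():
            continue  # skip blank lines in the export
        tld = tld_of(name)
        if tld in applied:
            hits.setdefault(tld, []).append(name.strip())
    return hits


# Hypothetical sample data; substitute your own inventory and the full gTLD list.
internal = ["fileserver.corp", "printer.home.", "intranet.example.com", "mx1.mail"]
applied_for = ["corp", "home", "mail", "exchange"]

for tld, names in sorted(find_collisions(internal, applied_for).items()):
    print(f"{tld}: {', '.join(names)}")
```

Anything that shows up in the output is a name worth thinking about before the matching gTLD is delegated.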
Do note that there is a strong argument raging in the DNS community about all this. There are some (myself included) who never met or even heard of the DNS purists who currently maintain that this whole problem is our fault and that none of this would have happened if we'd all configured our private networks with fully-qualified domain names right from the start.
Where were those folks in 1995 when I opened my first shrink-wrapped box of Windows NT and created the name that would become the root of a huge Active Directory network with thousands of nodes? Do you know how hard it was to get a domain name back then? The term "registrar" hadn't been invented yet. All we were trying to do is set up a shared file, print and mail server for crying out loud. The point is that there are lots of legacy networks that look like the one depicted below, they're going to be very hard and expensive to rename, and some of them are likely to break when new gTLDs hit the root. m'Kay?
Private networks, the way we've thought about them for a decade
Here's my depiction of the difference between a private network (with all kinds of domain names that don't route on the wider Internet) and the public Internet (with the top-level names you're familiar with) back in the good old days before the arrival of 1400 new gTLDs.
This next picture shows the namespace collision problem. This depiction is still endorsed by nobody, your mileage may vary, etc. etc. But you see what's happening. At some random point in the future, when a second-level name matching the name of your highly-trusted resources gets delegated, there's the possibility that traffic which has consistently been going to the right place in your internal network will suddenly be routed to an unknown, untrusted destination on the worldwide Internet.
But wait, there are more bad things that might happen. What if the person who bought that matching second-level name in a new gTLD is a bad actor? What if they surveyed the error traffic arriving at that new gTLD and bought that second-level name ON PURPOSE, so that they could harvest that error traffic with the intention of doing you harm?
But wait, there's more. What if you have old, old applications that are counting on a consistent NXDOMAIN response from a root server? Suppose that the application was written in such a way that it falls over when the new gTLD gets delegated (and thus the response from the root changes from the expected NXDOMAIN to an unexpected pointer to the registry). Does this start to feel a little bit like Y2K?
Well, one of the good things about Y2K was that most of the "breakage" events would have all happened on the same day -- with this rascal, things might look more like a gentle random rain of breakage over the next decade or so as 2nd-level names get sold.
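To make that concrete, here's a hypothetical illustration of the kind of logic I'm worried about. It is not taken from any real application. The function names are invented, and the two fake resolvers just simulate "before" and "after" delegation so the example runs without touching the network. Note that a failed lookup through the system resolver is only a rough stand-in for NXDOMAIN.

```python
import socket


def resolves(name, lookup=socket.getaddrinfo):
    """True if the name resolves, False if the resolver reports a failure."""
    try:
        lookup(name, None)
        return True
    except socket.gaierror:
        return False


def pick_server(name, lookup=socket.getaddrinfo):
    """Fragile legacy logic: 'does not resolve publicly' is taken to mean 'internal'."""
    return "public" if resolves(name, lookup) else "internal-fallback"


def before_delegation(name, port):
    # Simulates the failure an application sees while the TLD is undelegated.
    raise socket.gaierror("simulated NXDOMAIN")


def after_delegation(name, port):
    # Simulates a successful answer once the TLD and a matching name exist.
    # 192.0.2.10 is a documentation-only address.
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.10", 0))]


print(pick_server("mail.corp", before_delegation))
print(pick_server("mail.corp", after_delegation))
```

Same name, same code, different answer from the DNS, and the application quietly takes a different path. Nothing in the application changed.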
Imagine the surprise when some poor unsuspecting domain-registrant wakes up to a flood of email from network operators who are cranky because their networks just broke. Don't think it can happen? Check out my www.corp.com home page -- those cats are BUSY. That domain gets 2,000,000 error hits A DAY. Almost all of it from Microsoft Active Directory sites.
The new TLDs may unexpectedly cause traffic that you're expecting to go to your trusted internal networks (or your customer's networks) to suddenly start being routed to an untrusted external network, one that you didn't anticipate. Donald Rumsfeld might call those external networks "unknown unknowns" -- something untrusted that you don't know about in advance. The singular goal of this post is to let you know about this possibility in advance. Here's the key message:
If you have private networks that use TLDs on this list, best start planning for a future when those names (and any internal certificates using those names) are going to stop working right.
That's it. If you want, you can quit reading here. I'm going to stick updates in this section, followed by the "More detail" part at the bottom.
Update 1 -- Mikey's first-try at a near-term mitigation plan
After conversations with a gaggle of smart people, I've decided that the following three pictures are a relatively low-impact way to address this problem in a network that you control.
In essence, I think the key to this approach is to take control of when the new gTLDs become visible to your internal network. By becoming authoritative for new gTLDs in your DNS servers now, before ICANN has delegated them, you get to watch the NXD error traffic right now rather than having to wait for messages from new registries. Here's a list of the new gTLDs to use in constructing your router configuration.
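Here's a rough sketch of how you might turn that list into DNS server configuration. It assumes a BIND-style name server and a shared, essentially empty zone file (called db.empty-gtld here, a name I made up) that you would have to create yourself with a valid SOA and NS record. It also assumes you turn on query logging so you can actually see the traffic. Treat it as a starting point, test it in a lab first, and leave out any names your network already serves internally.

```python
def zone_stanza(tld, zone_file="db.empty-gtld"):
    """Return a BIND-style zone declaration making this server authoritative for one TLD."""
    label = tld.strip().rstrip(".").lower()
    return f'zone "{label}" {{ type master; file "{zone_file}"; }};'


def build_config(tlds):
    """One stanza per distinct, non-empty TLD, in sorted order."""
    labels = sorted({t.strip().rstrip(".").lower() for t in tlds if t.strip()})
    return "\n".join(zone_stanza(label) for label in labels)


# Hypothetical short list; in practice you would read the full gTLD list from a file.
print(build_config(["corp", "home", "MAIL", "corp"]))
```

The duplicate "corp" and the upper-case "MAIL" are there to show that the list gets de-duplicated and normalized before the stanzas are written.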
Note: all the color, bold, highlighting in this section is mine -- just to draw your eye to things that I find interesting.
There are over 1000 names on that list I linked to above. Here is a shorter list drawn from Interisle Consulting Group's 2-August, 2013 report entitled "Name Collisions in the DNS" [PDF, 3.34 MB]. This list is the top 100 names in order of frequency of queries that they saw in their study. I've taken the liberty of highlighting a few that might be interesting for you to keep an eye out for on your network or your customer's networks.
Here's the executive summary of the Interisle report.
Executive Summary -- Interisle Consulting Report
Names that belong to privately-defined or “local” name spaces often look like DNS names and are used in their local environments in ways that are either identical to or very similar to the way in which globally delegated DNS names are used. Although the semantics of these names are properly defined only within their local domains, they sometimes appear in query names (QNAMEs) at name resolvers outside their scope, in the global Internet DNS.
The context for this study is the potential collision of labels that are used in private or local name spaces with labels that are candidates to be delegated as new gTLDs. The primary purpose of the study is to help ICANN understand the security, stability, and resiliency consequences of these collisions for end users and their applications in both private and public settings.
The potential for name collision with proposed new gTLDs is substantial. Based on the data analyzed for this study, strings that have been proposed as new gTLDs appeared in 3% of the requests received at the root servers in 2013. Among all syntactically valid TLD labels (existing and proposed) in requests to the root in 2013, the proposed TLD string home ranked 4th, and the proposed corp ranked 21st. DNS traffic to the root for these and other proposed TLDs already exceeds that for well-established and heavily-used existing TLDs.
Several options for mitigating the risks associated with name collision have been identified. For most of the proposed TLDs, collaboration among ICANN, the new gTLD applicant, and potentially affected third parties in the application of one or more of these risk mitigation techniques is likely to substantially reduce the risk of delegation.
The potential for name collision with proposed new gTLDs often arises from well-established policies and practices in private network environments. Many of these were widely adopted industry practices long before ICANN decided to expand the public DNS root; the problem cannot be reduced to “people should have known better.”
The delegation of almost any of the applied-for strings as a new TLD label would carry some risk of collision. Of the 1,409 distinct applied-for strings, only 64 never appear in the TLD position in the request stream captured during the 2012 “Day in the Life of the Internet” (DITL) measurement exercise, and only 18 never appear in any position. In the 2013 DITL stream, 42 never appear in the TLD position, and 14 never appear in any position.
The risk associated with delegating a new TLD label arises from the potentially harmful consequences of name collision, not the name collision itself. This study was concerned primarily with the measurement and analysis of the potential for name collision at the DNS root. An additional qualitative analysis of the harms that might ensue from those collisions would be necessary to definitively establish the risk of delegating any particular string as a new TLD label, and in some cases the consequential harm might be apparent only after a new TLD label had been delegated.
The rank and occurrence of applied-for strings in the root query stream follow a power-law distribution. A relatively small number of proposed TLD strings account for a relatively large fraction of all syntactically valid non-delegated labels observed in the TLD position in queries to the root.
The sources of queries for proposed TLD strings also follow a power-law distribution. For most of the most-queried proposed TLD strings, a relatively small number of distinct sources (as identified by IP address prefixes) account for a relatively large fraction of all queries.
A wide variety of labels appear at the second level in queries when a proposed TLD string is in the TLD position. For most of the most-queried proposed TLD strings, the number of different second-level labels is very large, and does not appear to follow any commonly recognized empirical distribution.
Name collision in general threatens the assumption that an identifier containing a DNS domain name will always point to the same thing. Trust in the DNS (and therefore the Internet as a whole) may erode if Internet users too often get name-resolution results that don’t relate to the semantic domain they think they are using. This risk is associated not with the collision of specific names, but with the prevalence of name collision as a phenomenon of the Internet experience.
The opportunity for X.509 public key certificates to be erroneously accepted as valid is an especially troubling consequence of name collision. An application intended to operate securely in a private context with an entity authenticated by a certificate issued by a widely trusted public Certification Authority (CA) could also operate in an apparently secure manner with another equivalently named entity in the public context if the corresponding TLD were delegated at the public DNS root and some party registered an equivalent name and obtained a certificate from a widely trusted CA. The ability to specify wildcard DNS names in certificates potentially amplifies this risk.
The designation of any applied-for string as “high risk” or “low risk” with respect to delegation as a new gTLD depends on both policy and analysis. This study provides quantitative data and analysis that demonstrate the likelihood of name collision for each of the applied-for strings in the current new gTLD application round and qualitative assessments of some of the potential consequences. Whether or not a particular string represents a delegation risk that is “high” or “low” depends on policy decisions that relate those data and assessments to the values and priorities of ICANN and its community; and as Internet behavior and practice change over time, a string that is “high risk” today may be “low risk” next year (or vice versa).
For a broad range of potential policy decisions, a cluster of proposed TLDs at either end of the delegation risk spectrum are likely to be recognizable as “high risk” and “low risk.” At the high end, the cluster includes the proposed TLDs that occur with at least order-of-magnitude greater frequency than any others (corp and home) and those that occur most frequently in internal X.509 public key certificates (mail and exchange in addition to corp). At the low end, the cluster includes all of the proposed TLDs that appear in queries to the root with lower frequency than the least-frequently queried existing TLD; using 2013 data, that would include 1114 of the 1395 proposed TLDs.
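If you want to run a small-scale version of that kind of count on your own recursive resolver, here's a minimal sketch. It assumes you've already pulled the query names out of your logs into a list (log formats vary too much for me to guess yours), and the sample names are invented. It counts queries whose TLD label is on the applied-for list and shows how concentrated they are, which is the "small number of strings, large fraction of queries" observation from the summary.

```python
from collections import Counter


def tld_counts(qnames, applied_for):
    """Count queries whose rightmost label is an applied-for TLD string."""
    applied = {t.strip().lower() for t in applied_for}
    counts = Counter()
    for q in qnames:
        tld = q.strip().rstrip(".").lower().rsplit(".", 1)[-1]
        if tld in applied:
            counts[tld] += 1
    return counts


def top_share(counts, n):
    """Fraction of all counted queries that belong to the n most-queried strings."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(n)) / total


# Hypothetical query names extracted from a resolver log.
qnames = ["dc1.corp"] * 6 + ["nas.home"] * 3 + ["mx.mail", "www.example.com"]
counts = tld_counts(qnames, ["corp", "home", "mail", "exchange"])

for tld, n in counts.most_common():
    print(tld, n)
print(f"share of the top string: {top_share(counts, 1):.0%}")
```

On a real network the interesting part is the list itself: which applied-for strings show up at all, and which clients are asking for them.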
And here is their list of risk-mitigation options.
9 Name collision risk mitigation
ICANN and its partners in the Internet community have a number of options available to mitigate the risks associated with name collision in the DNS. This section describes each option; its advantages and disadvantages; and the residual risk that would remain after it had been successfully implemented.
The viability, applicability, and cost of different risk mitigation options are important considerations in the policy decision to delegate or not delegate a particular string. For example, a string that is considered to be “high risk” because risk assessment finds that it scores high on occurrence frequency or severity of consequences (or both), but for which a very simple low-cost mitigation option is available, may be less “risky” with respect to the delegation policy decision than a string that scores lower during risk assessment but for which mitigation would be difficult or impossible.
It is important to note that in addition to these strategies for risk mitigation, there is a null option to “do nothing”—to make no attempt to mitigate the risks associated with name collision, and let the consequences accrue when and where they will. As a policy decision, this approach could reasonably be applied, for example, to strings in the “low risk” category and to some or all of the strings in the “uncalculated risk” category.
It is also important to note that this study and report are concerned primarily with risks to the Internet and its users associated with the occurrence and consequences of name collision—not risks to ICANN itself associated with new TLD delegation or risk mitigation policy decisions.
9.1 Just say no
An obvious solution to the potential collision of a new gTLD label with an existing string is to simply not delegate that label, and formally proscribe its future delegation—e.g., by updating  to permanently reserve the string, or via the procedure described in  or . This approach has been suggested for the “top 10” strings by [ ], and many efforts have been made over the past few years to add to the list of formally reserved strings  other non-delegated strings that have been observed in widespread use    .
A literal “top 10” approach to this mitigation strategy would be indefensibly arbitrary (the study data provide no answer to the obvious question “why 10?”), but a policy decision could set the threshold at a level that could be defended by the rank and occurrence data provided by this study combined with a subjective assessment of ICANN’s and the community’s tolerance for uncertainty.
A permanently reserved string cannot be delegated as a TLD label, and therefore cannot collide with any other use of the same string in other contexts. A permanently reserved string could also be recommended for use in private semantic domains.
There is no disadvantage for the Internet or its users. The disadvantages to current or future applicants for permanently proscribed strings are obvious. Because the “top N” set membership inclusion criteria will inevitably change over time, this mitigation strategy would be effective beyond the current new gTLD application round only if those criteria (and the resulting set membership) were periodically re-evaluated.
9.1.3 Residual risk
This mitigation strategy leaves no residual risk to the Internet or its users.
9.2 Further study
For a string in the “non-customary risk” or “calculated risk” category, further study might lead to a determination that the “severity of consequences” factor in the risk assessment formula is small enough to ensure that the product of occurrence and severity is also small.
Further study might shift a string from the “uncalculated risk” to the “calculated risk” category by providing information about the magnitude of the “severity of consequences” factor. It might also reduce the uncertainty constant in the risk assessment formula, facilitating a policy decision with respect to delegation of the string as a new TLD.
Further study obviously involves a delay that may or may not be agreeable to applicants, and it may also require access to data that are not (or not readily) available. Depending on the way in which a resolution request arrives at the root, it may be difficult or impossible to determine the original source; and even if the source can be discovered, it might be difficult or impossible (because of lack of cooperation or understanding at the source) to determine precisely why a particular request was sent to the root.
The “further study” option also demands a termination condition: “at what point, after how much study, will it be possible for ICANN to make a final decision about this string?”
9.2.3 Residual risk
Unless further study concludes that the “severity of consequences” factor is zero, some risk will remain.
9.3 Wait until everyone has left the room
At least in principle, some uses of names that collide with proposed TLD strings could be eliminated: either phased out in favor of alternatives or abandoned entirely. For example, hardware and software systems that ship pre-configured to advertise local default domains such as home could be upgraded to behave otherwise. In these cases, a temporary moratorium on delegation, to allow time for vendors and users to abandon the conflicting use or to migrate to an alternative, might be a reasonable alternative to the permanent “just say no.” Similarly, a delay of 120 days before activating a new gTLD delegation could mitigate the risk associated with internal name certificates described in Sections 6.2 and 7.2.
A temporary injunction that delays the delegation of a string pending evacuation of users from the “danger zone” would be less restrictive than a permanent ban.
Anyone familiar with commercial software and hardware knows that migrating even a relatively small user base from one version of the same system to another—much less from one system to a different system—is almost never as straightforward in practice as it seems to be in principle. Legacy systems may not be upgradable even in principle, and consumer-grade devices in particular are highly unlikely to upgrade unless forced by a commercial vendor to do so. The time scales are likely to be years—potentially decades—rather than months.
Embracing “wait until...” as a mitigation strategy would therefore require policy decisions with respect to the degree of evacuation that would be accepted as functionally equivalent to “everyone” and a mechanism for coordinating the evacuation among the many different agents (vendors, users, industry consortia, etc.) who would have to cooperate in order for it to succeed.
9.3.3 Residual risk
Because no evacuation could ever be complete, the risks associated with name collision would remain for whatever fraction of the affected population would not or could not participate in it.
9.4 Look before you leap
Verisign  and others (including ) have recommended that before a new TLD is permanently delegated to an applicant, it undergo a period of “live test” during which it is added to the root zone file with a short TTL (so that it can be flushed out quickly if something goes wrong) while a monitoring system watches for impacts on Internet security or stability.
A “trial run” in which a newly-delegated TLD is closely monitored for negative effects and quickly withdrawn if any appear could provide a level of confidence in the safety of a new delegation comparable to that which is achieved by other product-safety testing regimes, such as pharmaceutical and medical-device trials or probationary-period licensing of newly trained skilled craftsmen.
The practical barriers to instrumenting the global Internet in such a way as to effectively perform the necessary monitoring may be insurmountable. Not least among these is the issue of trust and liability—for example, would the operator of a “live test” be expected to protect Internet users from harm during the test, or be responsible for damages that might result from running the test?
9.4.3 Residual risk
No “trial run” (particularly one of limited duration) could perfectly simulate the dynamics of a fully-delegated TLD and its registry, so some risk would remain even after some period of running a live test.
9.5 Notify affected parties
For some proposed TLDs in the current round, it may be possible to identify the parties most likely to be affected by name collision, and to notify them before the proposed TLD is delegated as a new gTLD.
Prior notice of the impending delegation of a new gTLD that might collide with the existing use of an identical name string could enable affected parties to either change their existing uses or take other steps to prepare for potential consequences.
Notification increases awareness, but does not directly mitigate any potential consequence of name collision other than surprise. For many proposed TLDs it might be difficult or impossible to determine which parties could be affected by name collision. Because affected parties might or might not understand the potential risks and consequences of name collision and how to manage them, either in general or with respect to their own existing uses, notification might be ineffective without substantial concomitant technical and educational assistance.
9.5.3 Residual risk
In most cases at least some potentially affected parties will not be recognized and notified; and those that are recognized and notified may or may not be able to effectively prepare for the effects of name collision on their existing uses, with or without assistance.
Here are some of the tasty bits from a risk-mitigation proposal issued by ICANN staff several days later (5-August, 2013).
[ICANN Staff] PROPOSAL TO MITIGATE RISK
The Study establishes a low-risk profile for 80% of the strings. ICANN proposes to move forward with its established processes and procedures with delegating strings in this category (e.g., resolving objections, addressing GAC advice, etc.) after implementing two measures in an effort to mitigate the residual namespace collision risks.
First, registry operators will implement a period of no less than 120 days from the date that a registry agreement is signed before it may activate any names under the TLD in the DNS. This measure will help mitigate the risks related to the internal name certificates issue as described in the Study report and SSAC Advisory on Internal Name Certificates. Registry operators, if they wish, may allocate names during this period, i.e., accept registrations, but they will not activate them in DNS. If a registry operator were to allocate names during this 120-day period, it would have to clearly inform the registrants about the impossibility to activate names until the period ends.
Second, once a TLD is first delegated within the public DNS root to name servers designated by the registry operator, the registry operator will not activate any names under the TLD in the DNS for a period of no less than 30 days. During this 30-day period, the registry operator will notify the point of contacts of the IP addresses that issue DNS requests for an un-delegated TLD or names under it. The minimum set of requirements for the notification is described in Appendix A of this paper. This measure will help mitigate the namespace collision issues in general. Note that both no-activate-name periods can overlap.
The TLD name servers may see DNS queries for an un-delegated name from recursive resolvers – for example, a resolver operated by a subscriber’s ISP or hosting provider, a resolver operated by or for a private (e.g., corporate) network, or a global public name resolution service. These queries will not include the IP address of the original requesting host, i.e., the source IP address that will be visible to the TLD is the source address of the recursive resolver. In the event that the TLD operator sees a request for a non-delegated name, it must request the assistance of these recursive resolver operators in the notification process as described in Appendix A.
ICANN considers that the Study presents sufficient evidence to classify home and corp as high-risk strings. Given the risk level presented by these strings, ICANN proposes not to delegate either one until such time that an applicant can demonstrate that its proposed string should be classified as low risk based on the criteria described above. An applicant for one of these strings would have the option to withdraw its application, or work towards resolving the issues that led to its categorization as high risk (i.e., those described in section 7 of the Study report). An applicant for a high-risk string can provide evidence of the results from the steps taken to mitigate the name collision risks to an acceptable level. ICANN may seek independent confirmation of the results before allowing delegation of such string.
For the remaining 20% of the strings that do not fall into the low or high-risk categories, further study is needed to better assess the risk and understand what mitigation measures may be needed to allow these strings to move forward. The goal of the study will be to classify the strings as either low or high-risk using more data and tests than those currently available. While this study is being conducted, ICANN would not allow delegation of the strings in this category. ICANN expects the further study to take between three and six months. At the same time, an applicant for these strings can work towards resolving the issues that prevented their proposed string from being categorized as low risk (e.g., those described in section 7 of the Study report). An applicant can provide evidence of the results from the steps taken to mitigate the name collision risks to an acceptable level. ICANN may seek independent confirmation of the results before allowing delegation of such string. If and when a string from this category has been reclassified as low-risk, it can proceed as described above for the low-risk category strings.
ICANN is fully committed to the delegation of new gTLDs in a secure and stable manner. As with most things on the Internet, it is not possible to eliminate risk entirely. Nevertheless, ICANN would only proceed to delegate a new gTLD when the risk profile of such string had been mitigated to an acceptable level. We appreciate the community's involvement in the process and look forward to further collaboration on the remaining work.
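My reading of the two waiting periods in the proposal above, worked out as arithmetic: names can't go live until at least 120 days after the registry agreement is signed AND at least 30 days after delegation, and since the periods can overlap, the later of the two dates wins. This is my interpretation, not ICANN's wording, and the dates are made up.

```python
from datetime import date, timedelta


def earliest_activation(agreement_signed: date, delegated: date) -> date:
    """Earliest day names could be activated under both minimum waiting periods."""
    after_agreement = agreement_signed + timedelta(days=120)
    after_delegation = delegated + timedelta(days=30)
    # The periods may overlap, so whichever ends later is the binding one.
    return max(after_agreement, after_delegation)


# Hypothetical dates for illustration only.
signed = date(2013, 10, 1)
delegated = date(2013, 12, 15)
print(earliest_activation(signed, delegated))  # 2014-01-29
```

In this made-up case the 120-day clock is the one that matters. If delegation had slipped to late January, the 30-day clock would take over.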
APPENDIX A – NOTIFICATION REQUIREMENTS
Registry operator will notify the point of contact of each IP address block that issue any type of DNS requests (the Requestors) for names under the TLD or its apex. The point of contact(s) will be derived from the respective Regional Internet Registry (RIR) database. Registry operator will offer customer support for the Requestors or their clients (origin of the queries) in, at least, the same languages and mechanisms the registry plans to offer customer support for registry services. Registry operator will avoid sending unnecessary duplicate notifications (e.g. one notification per point of contact).
The notification should be sent, at least, over email and must include, at least the following elements: 1) the TLD string; 2) why the IP address holder is receiving this email; 3) the potential problems the Requestor or its clients could encounter (e.g., those described in section 6 of the Study report); 4) the date when the gTLD signed the registry agreement with ICANN, and the date of gTLD delegation; 5) when the domain names under the gTLD will first become active in DNS; 6) multiple points of contact (e.g. email address, phone number) should people have questions; 7) will be in English and may be in other languages the point of contact is presumed to know; 8) ask the Requestors to pass the notification to their clients in case the Requestors are not the origin of the queries, e.g., if they are providers of DNS resolution services; 9) a sample of timestamps of DNS request in UTC to help identify the origin of queries; 10) email digitally signed with valid S/MIME certificate from well-known public CA.
It's that last appendix, where people are going to get notified, that really caught my eye. I can imagine a day when an ISP is going to get notifications from all kinds of different registry operators listing the IP addresses of their customer-facing recursive DNS servers. The notification will say that their customers are generating this kind of error traffic -- but it will leave the puzzle of figuring out which customer up to the ISP. Presumably this leaves the ISPs to comb through DNS logs to ferret out which customer it actually was, carry the bad news to the customer, and deal with the outraged fallout. In other cases these notifications will go directly to corporate network operators with the same result. In either case, ponder the implications of a 30-day lead time to fix these things. Maybe easy. Maybe not.
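Here's a minimal sketch of that log-combing step. It assumes your resolver logs have already been parsed into (UTC timestamp, client IP, query name) tuples, and that the notification gives you the TLD string plus a sample of timestamps (item 9 in Appendix A). The log format, the five-second matching window and the sample data are all my assumptions. Keep in mind that caching on your resolver means the timestamp the registry saw may only line up with the first client that asked.

```python
from datetime import datetime, timedelta


def clients_for_notification(log_entries, tld, sample_times, window_seconds=5):
    """Return {client_ip: set of query names} for queries under tld near the sample times."""
    tld = tld.strip().rstrip(".").lower()
    suffix = "." + tld
    window = timedelta(seconds=window_seconds)
    matches = {}
    for ts, client, qname in log_entries:
        name = qname.rstrip(".").lower()
        if not (name == tld or name.endswith(suffix)):
            continue  # not a query for this TLD
        if any(abs(ts - t) <= window for t in sample_times):
            matches.setdefault(client, set()).add(name)
    return matches


# Hypothetical parsed resolver log and notification timestamps (all UTC).
log = [
    (datetime(2013, 11, 1, 12, 0, 1), "10.0.0.5", "dc1.corp."),
    (datetime(2013, 11, 1, 12, 0, 2), "10.0.0.9", "www.example.com."),
    (datetime(2013, 11, 1, 15, 30, 0), "10.0.0.7", "files.corp."),
]
samples = [datetime(2013, 11, 1, 12, 0, 0)]

for client, names in clients_for_notification(log, "corp", samples).items():
    print(client, sorted(names))
```

If you drop the timestamp filter and just look for every client querying names under the TLD, you get the complete list of customers you'd want to talk to, not just the ones who happened to trip the notification.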
What's next? Where do we go from here?
For me, "learning more" and "spreading the word" are the next steps. People on all sides of the argument are weighing in, but as Interisle points out, there is a lot of analysis that still should be done. They were able to identify the number of queries, the new TLDs that were queried and the range of IP addresses the queries came from. What they point out we don't know (and need to) is the impact of those collisions. How bad would the breakdowns be? Opinions are loudly stated, but facts are scarce.
If you want to learn more, the best place to get started is probably ICANN's "Public Comment" page on this issue. You'll have some reading to do, but right now (until 17-September, 2013) you have the opportunity to submit comments. The more of you that do that the better. The spin-doctors on all sides are hard at work -- it's very difficult to find unbiased information. There aren't very many comments as I write this in mid-August, but they should make interesting reading as they come in -- and you can read them too.
Click HERE for the ICANN public-comment page
That's more than enough for one blog post. Sorry this "little bit more detail" section got so long. There's plenty more if you want to dig further.
DISCLAIMER: Be aware that almost everybody in this debate is conflicted in one way or another (including me - here's a link to my "Statement of Interest" on the ICANN site). I participate in ICANN as the representative of a regional internet exchange point (MICE) and also as the owner of a gaggle of really generic .COM domains (click HERE for that story). I haven't got a clue what the impact of new gTLDs will be on my domains. I also don't know what the impact will be on ISPs and corporate network operators but I am very uneasy right now. I may write some more opinionated posts about that unease, once I understand better what's going on.