IDN - Internationalized Domain Names

Domain names with accents and umlauts

Domain names under .ch and .li can also contain non-ASCII characters such as umlauts and accents. Domain names that contain umlauts, other diacritical characters or letters from non-Latin alphabets are referred to as Internationalized Domain Names (IDN). The characters permitted for .ch and .li domain names are listed in Annexes 1 and 2 of the GTC.

ACE string

There were essentially two options for introducing internationalized domain names (IDN). The first was to adjust the domain name system (DNS) so that Unicode characters could be used directly. This was felt to be too drastic a measure, so the second option was chosen: defining an algorithm that specifies how a Unicode string is converted into a permitted ASCII domain name. The resulting ACE string (ACE stands for ASCII Compatible Encoding) is then entered into the DNS. With the introduction of IDN, the entry in the DNS is, for the very first time, no longer identical with the domain name.

Name Preparation, Punycode

A number of requirements have to be fulfilled before a Unicode string can be converted into an ACE string. This is done by the so-called "Nameprep" procedure, which makes sure that no inadmissible characters are included. Umlauts that are made up of two characters have to be replaced by a single character, e.g. a + ¨ = ä. This process is referred to as "normalization". In addition, all uppercase Latin letters are converted into lowercase letters. This is known as "case mapping" or "case folding".
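
The two steps named above can be sketched with Python's standard library. This is a simplified illustration, not the full Nameprep procedure: it assumes that Unicode NFKC normalization and str.lower() are close enough stand-ins for the normalization and case-mapping steps, and it does not check for inadmissible characters. The function name nameprep_sketch is hypothetical.

```python
import unicodedata

def nameprep_sketch(label):
    """Simplified stand-in for Nameprep (assumption: no prohibited-character checks)."""
    # Case mapping: convert uppercase letters to lowercase.
    lowered = label.lower()
    # Normalization: compose a base letter plus combining mark into one character.
    return unicodedata.normalize("NFKC", lowered)

# "U" followed by a combining diaeresis (U+0308), i.e. an umlaut made of two characters.
decomposed = "BU\u0308CHER"
prepared = nameprep_sketch(decomposed)

print(prepared)
print(len(decomposed), len(prepared))
```

After preparation the label is one character shorter, because the letter and the combining mark have been merged into a single ü.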

If the string still contains non-ASCII characters after name preparation, it is encoded with Punycode and the prefix xn-- is placed in front of the result. Punycode takes the non-ASCII characters out of the actual domain name, notes their position, and appends them in coded form at the end of the name, separated by a further hyphen.
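
The structure of the encoded name can be seen with Python's built-in punycode codec. The following is a minimal sketch that assumes the label has already been through name preparation; the helper name to_ace_label is hypothetical.

```python
ACE_PREFIX = "xn--"

def to_ace_label(label):
    """Encode one prepared label; pure-ASCII labels are returned unchanged."""
    if label.isascii():
        return label
    # The punycode codec yields the ASCII characters, a hyphen, and the coded remainder.
    encoded = label.encode("punycode").decode("ascii")
    return ACE_PREFIX + encoded

raw = "bücher".encode("punycode").decode("ascii")
ascii_part, coded_part = raw.rsplit("-", 1)

print(raw)                     # ASCII characters, hyphen, coded non-ASCII part
print(ascii_part, coded_part)
print(to_ace_label("bücher"))
```

The ASCII characters "bcher" are kept as they are, and the part after the last hyphen encodes both the ü and its position in the name.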

An example
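
As an illustration, the conversion in both directions can be tried with Python's built-in idna codec. Note that this codec implements the older IDNA 2003 rules rather than the RFC 589x series listed on this page, so it is only a sketch and may differ from what a registry applies for other names.

```python
domain = "bücher.ch"

# Domain name -> ACE string (the form that is entered in the DNS).
ace = domain.encode("idna").decode("ascii")

# ACE string -> domain name (the form shown to the user).
restored = ace.encode("ascii").decode("idna")

print(ace)
print(restored)
```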

Consequences

The domain name and the entry in the DNS are two different things with IDN.

bücher.ch is the domain name,
xn--bcher-kva.ch is the ACE string, and it is this string that is entered in the DNS.

For technical reasons, the character string that has been processed by the algorithm is several characters longer than the domain name itself. The domain name "bücher" is six characters long. The corresponding ACE string "xn--bcher-kva", however, is 13 characters long.

bücher.ch = domain name = must be at least three characters long,
xn--bcher-kva.ch = DNS entry = may be a maximum of 63 characters long.
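
The two limits can be checked side by side. This is a minimal sketch that assumes both limits apply to the name without the .ch or .li ending and that the name has already been through name preparation; the constants and the function check_label are hypothetical names chosen for illustration.

```python
MIN_NAME_LENGTH = 3   # minimum length of the domain name as the user sees it
MAX_ACE_LENGTH = 63   # maximum length of the entry in the DNS

def check_label(label):
    """Return (name length, ACE length, whether both limits are met)."""
    if label.isascii():
        ace = label
    else:
        ace = "xn--" + label.encode("punycode").decode("ascii")
    within_limits = len(label) >= MIN_NAME_LENGTH and len(ace) <= MAX_ACE_LENGTH
    return len(label), len(ace), within_limits

print(check_label("bücher"))
```

A name made up of a single umlaut fails the check even though its ACE string is longer than three characters, because the minimum applies to the domain name and not to the DNS entry.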

IETF Standards

  • RFC 3492 Encoding Scheme (Punycode)
  • RFC 5890 IDNA (Internationalized Domain Names for Applications): Framework
  • RFC 5891 IDNA: Protocol
  • RFC 5892 IDNA: Unicode Code Points
  • RFC 5893 IDNA: Right-to-Left Scripts
  • RFC 5894 IDNA: Background, Explanations, and Rationale

Current browsers and e-mail programs support IDNs. However, you should not rely solely on an IDN for important applications.

Switch does not guarantee that domain names with umlauts and accents as per Annex 2 of the GTC are suitable for use in conjunction with programs such as browsers and e-mail programs and does not accept any liability in this respect.