Tool to convert domain names to Punycode and vice versa, simplifying the management of special characters for universal compatibility.
Punycode - dCode
Punycode was developed to solve a major problem: traditional domain names only support 37 ASCII characters (the 26 letters from a to z, the 10 digits from 0 to 9, and the hyphen -).
With the global expansion of the Internet, it became necessary to allow characters from other alphabets and writing systems in domain names.
Punycode converts any Unicode string into an ASCII-only sequence, making international domain names compatible with existing infrastructure.
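The 37-character restriction described above can be checked with a short test. This is a minimal sketch: the helper name is_ldh_label is made up for illustration, and it checks the character set only, not other DNS rules such as label length.

```python
import re

# Letters, digits and the hyphen: the 37 characters named above.
# Upper-case letters are listed too, because DNS names ignore case.
LDH_LABEL = re.compile(r"[A-Za-z0-9-]+")

def is_ldh_label(label: str) -> bool:
    """Return True if the label uses only ASCII letters, digits and hyphens."""
    return LDH_LABEL.fullmatch(label) is not None

print(is_ldh_label("example"))  # True
print(is_ldh_label("météo"))    # False: é is outside the allowed set
```

A label that fails this test is the kind of input Punycode is meant to handle.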
Punycode works in two main steps, applied to each label of the domain name (the parts separated by dots):
— Character separation: The label is divided into two parts: ASCII characters (which remain unchanged) and non-ASCII characters.
— Non-ASCII character encoding: Non-ASCII characters are converted into a sequence of ASCII characters using a specific algorithm (called Bootstring). This sequence is appended after the ASCII characters, separated from them by a hyphen, and the prefix xn-- is placed at the start of the label.
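Example: météo.fr is converted into xn--mto-bmab.fr via Punycode. The ASCII letters mto are kept, bmab encodes the two é characters together with their positions, and the label fr, already ASCII, is left unchanged.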
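The example can be reproduced with the "punycode" and "idna" codecs from the Python standard library. Note that the idna codec follows the older IDNA 2003 rules, so this is an illustration rather than a reference converter; other tools may treat some characters differently.

```python
# "punycode" encodes a single label, without the xn-- prefix.
print("météo".encode("punycode"))          # b'mto-bmab'

# "idna" works on a whole domain name: it encodes each label
# and adds the xn-- prefix where needed.
print("météo.fr".encode("idna"))           # b'xn--mto-bmab.fr'

# The same codec converts back to Unicode.
print(b"xn--mto-bmab.fr".decode("idna"))   # météo.fr
```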
Punycode is an instance of the Bootstring algorithm, a general algorithm for representing Unicode strings with a limited set of characters. The encoding process (Unicode to Punycode) is as follows:
— The Unicode string is scanned and the ASCII characters (code points 0 to 127) are copied directly, in their original order.
— Non-ASCII characters are extracted, deduplicated, and sorted in ascending Unicode code point order.
— Each occurrence of a non-ASCII character is encoded by a number defining both the character to be inserted and its location in the string. The number is then written as a variable-length integer in a system close to (but not identical to) base 36, using the letters a to z for the values 0 to 25 and the digits 0 to 9 for the values 26 to 35.
— The resulting sequence of letters and digits is added to the end of the Punycode string under construction, separated from the copied ASCII characters by a hyphen - (the hyphen is omitted when the string contains no ASCII characters).
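The steps above can be sketched in Python, following the pseudocode and parameter values of RFC 3492. This is an illustrative sketch, not dCode's implementation: the function names are chosen for this example, and the overflow checks from the RFC are left out because Python integers do not overflow.

```python
# Punycode parameters from RFC 3492
BASE, TMIN, TMAX = 36, 1, 26
SKEW, DAMP = 38, 700
INITIAL_BIAS, INITIAL_N = 72, 128
DIGITS = "abcdefghijklmnopqrstuvwxyz0123456789"  # values 0..35

def adapt(delta: int, num_points: int, first_time: bool) -> int:
    """Bias adaptation: tunes the digit thresholds to the size of recent deltas."""
    delta = delta // DAMP if first_time else delta // 2
    delta += delta // num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + ((BASE - TMIN + 1) * delta) // (delta + SKEW)

def encode_number(q: int, bias: int) -> str:
    """Write q as a variable-length integer (the 'almost base 36' system)."""
    out = []
    k = BASE
    while True:
        t = min(max(k - bias, TMIN), TMAX)  # threshold for this digit position
        if q < t:
            break
        out.append(DIGITS[t + (q - t) % (BASE - t)])
        q = (q - t) // (BASE - t)
        k += BASE
    out.append(DIGITS[q])
    return "".join(out)

def punycode_encode(text: str) -> str:
    """Encode one label, without the xn-- prefix."""
    basic = [c for c in text if ord(c) < 128]
    output = "".join(basic)          # ASCII characters are copied directly
    if basic:
        output += "-"                # delimiter, only if something was copied
    n, delta, bias = INITIAL_N, 0, INITIAL_BIAS
    handled = first = len(basic)
    # Distinct non-ASCII code points, in ascending order
    for m in sorted({ord(c) for c in text if ord(c) >= 128}):
        delta += (m - n) * (handled + 1)
        n = m
        for c in text:
            if ord(c) < n:
                delta += 1           # one more position skipped
            elif ord(c) == n:
                output += encode_number(delta, bias)
                bias = adapt(delta, handled + 1, handled == first)
                delta = 0
                handled += 1
        delta += 1
        n += 1
    return output

print(punycode_encode("météo"))  # mto-bmab
```

Adding the xn-- prefix in front of the result gives the label used in the domain name.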
In a domain name, every Punycode-encoded label starts with the prefix xn-- followed by a series of letters, digits, and possibly hyphens.
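The prefix makes encoded labels easy to recognize. The following sketch decodes only the labels that carry it, using the standard "punycode" codec; the helper name to_unicode is hypothetical and no validation of the decoded text is attempted.

```python
ACE_PREFIX = "xn--"

def to_unicode(domain: str) -> str:
    """Decode every xn-- label of a domain name, leaving other labels as they are."""
    labels = []
    for label in domain.split("."):
        if label.lower().startswith(ACE_PREFIX):
            # Drop the prefix, then decode the remaining Punycode string
            label = label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
        labels.append(label)
    return ".".join(labels)

print(to_unicode("xn--mto-bmab.fr"))  # météo.fr
print(to_unicode("example.com"))      # example.com
```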
Punycode is currently the standard method for encoding internationalized domain names (IDN).
RFC 3492 describes in detail how Punycode encoding works.
The encoded form can be significantly longer than the original text, and each encoded label, prefix included, must still fit within the DNS limit of 63 characters per label.
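The growth in length can be measured directly. This is a small sketch relying on the standard "punycode" codec; the names ace_length and MAX_LABEL_LENGTH are assumptions made for this example.

```python
MAX_LABEL_LENGTH = 63  # DNS limit for a single label

def ace_length(label: str) -> int:
    """Length of a label as it appears in the DNS (xn-- prefix included if needed)."""
    if label.isascii():
        return len(label)            # ASCII labels are stored unchanged
    return len("xn--") + len(label.encode("punycode"))

for label in ("example", "météo", "日本語"):
    encoded = ace_length(label)
    print(label, len(label), encoded, encoded <= MAX_LABEL_LENGTH)
```

For météo, the 5 original characters become the 12 characters of xn--mto-bmab.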
Punycode itself is safe, but internationalized domain names can be used to create domain names that look like other domain names (homograph attacks, often used for phishing).
Example: The Cyrillic letter а (U+0430) looks like the Latin letter a (U+0061).
Modern browsers try to detect and warn users of these potential risks.
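One simple heuristic is to flag labels that mix several writing systems. The sketch below is not the policy used by any particular browser, which is considerably more elaborate. It assumes that the first word of a character's Unicode name (LATIN, CYRILLIC, GREEK, and so on) is a good enough stand-in for its script, and the function names are hypothetical.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Rough set of scripts used by the letters of a label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def looks_mixed(label: str) -> bool:
    """True if the label combines letters from more than one script."""
    return len(scripts_in(label)) > 1

suspicious = "p\u0430ypal"             # second letter is the Cyrillic а (U+0430)
print(sorted(scripts_in(suspicious)))  # ['CYRILLIC', 'LATIN']
print(looks_mixed(suspicious))         # True
print(looks_mixed("paypal"))           # False
```

A label written entirely in one script passes this test even if it imitates another name, so the check reduces the risk without removing it.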
Cite as source (bibliography):
Punycode on dCode.fr [online website], retrieved on 2025-01-18,