Tool to decrypt a message encoded with a Unicode shift, a substitution cipher obtained by adding a value to the code point of each character.
Unicode Shift - dCode
Tag(s) : Substitution Cipher
Each character has a unique identifier (a number called a code point) in the Unicode standard. Adding a value N to this number identifies a different character, which makes it possible to create a substitution cipher by character shift, like the Caesar cipher.
For each character in the plain message, note its numeric value (its code point) and add a shift (offset) value N.
Example: The Unicode symbol 🔑 (U+1F511) has the code point 128273. Adding 23 to it gives the code point 128296, which is the symbol 🔨 (U+1F528).
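The encryption step can be sketched in Python as follows. This is a minimal illustration, not dCode's implementation: the function name unicode_shift is hypothetical, and rejecting surrogates and out-of-range results is an assumption about how invalid code points should be handled.

```python
def unicode_shift(text: str, offset: int) -> str:
    """Shift every character's code point by `offset` (hypothetical helper)."""
    shifted = []
    for ch in text:
        code = ord(ch) + offset
        # Assumption: refuse results outside the Unicode range or inside
        # the surrogate block U+D800..U+DFFF, which are not characters.
        if not 0 <= code <= 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            raise ValueError(f"shifted code point {code} is not a valid character")
        shifted.append(chr(code))
    return "".join(shifted)


print(unicode_shift("\U0001F511", 23))  # 🔨 (U+1F528)
print(unicode_shift("dCode", 1234))     # ԶԕՁԶԷ
```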
For each character of the encrypted message, note its numeric value (its code point) and subtract the offset value N.
Example: Decrypt the coded message ԶԕՁԶԷ with offset 1234. The corresponding Unicode code points are 1334,1301,1345,1334,1335. Subtracting 1234 from each gives the plain values 100,67,111,100,101, i.e. the characters dCode.
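The decryption example can be reproduced with a short Python sketch. The name unicode_unshift is hypothetical, and the sketch assumes the offset is known and never pushes a code point below zero (chr would raise ValueError in that case).

```python
def unicode_unshift(ciphertext: str, offset: int) -> str:
    """Subtract `offset` from every code point (hypothetical helper)."""
    # chr() raises ValueError if a result falls below 0, which signals
    # that the offset cannot be the right one for this message.
    return "".join(chr(ord(ch) - offset) for ch in ciphertext)


cipher = "ԶԕՁԶԷ"
print([ord(ch) for ch in cipher])     # [1334, 1301, 1345, 1334, 1335]
print(unicode_unshift(cipher, 1234))  # dCode
```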
A plain message composed of the usual alphanumeric characters (from the ASCII code) tends to have code points between 32 and 126, i.e. a spread of fewer than a hundred values.
If such a message is shift-encrypted, the shifted code points are not spread any further: the gap between the smallest and the largest code point stays the same.
If the offset is large (greater than 100 or 1000), the message will be composed only of exotic characters, from non-Latin alphabets or symbols/emoji.
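A quick way to check this property is to compare the spread (largest code point minus smallest) before and after the shift. The sketch below uses a hypothetical helper code_point_spread and an arbitrary sample message; it only illustrates that a shift moves the code points without widening their range.

```python
def code_point_spread(text: str) -> int:
    """Distance between the largest and smallest code point in `text`."""
    codes = [ord(ch) for ch in text]
    return max(codes) - min(codes)


plain = "Hello, World!"
cipher = "".join(chr(ord(ch) + 1234) for ch in plain)

# Both spreads are identical: the shift only translates the range.
print(code_point_spread(plain), code_point_spread(cipher))  # 82 82
# Every encrypted character now lies outside the ASCII range.
print(min(ord(ch) for ch in cipher) > 126)                  # True
```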
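Examine the smallest and the largest code points in the message to estimate the shift: if the plain message is ASCII, the smallest code point minus the shift must be at least 32 and the largest code point minus the shift must be at most 126.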
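The smallest and largest code points bound the possible offsets. The sketch below assumes the plain message uses only printable ASCII (32 to 126); the function name candidate_offsets and its low/high parameters are hypothetical. It returns every offset that would bring the whole message back into that range, which leaves a short list to try by hand.

```python
def candidate_offsets(ciphertext: str, low: int = 32, high: int = 126) -> range:
    """Offsets that map every character back into [low, high].

    Assumption: the plain message is printable ASCII. The range is
    empty if the message is spread too widely for that to hold.
    """
    codes = [ord(ch) for ch in ciphertext]
    return range(max(codes) - high, min(codes) - low + 1)


candidates = candidate_offsets("ԶԕՁԶԷ")
print(candidates[0], candidates[-1])  # 1219 1269
print(1234 in candidates)             # True
```

When the plain message contains a space (code point 32), the largest candidate is the actual offset, since no printable ASCII character has a smaller code point.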
Like any shift cipher, the Unicode shift remains a substitution and is therefore vulnerable to frequency analysis: the most frequent characters in the encrypted message correspond to the most frequent characters in the plain message (usually the letter E).
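A frequency attack can be sketched as follows. The helper guess_offsets is hypothetical: it assumes the most frequent plain character is either a space or the letter e, derives one offset per assumption, and keeps only the offsets whose decryption is entirely printable ASCII. Short or unusual messages may defeat these assumptions.

```python
from collections import Counter


def guess_offsets(ciphertext: str, likely: str = " e") -> list[tuple[int, str]]:
    """Guess offsets by matching the most frequent cipher character
    against each character in `likely` (an assumption, not a rule)."""
    most_common_char, _ = Counter(ciphertext).most_common(1)[0]
    guesses = []
    for assumed in likely:
        offset = ord(most_common_char) - ord(assumed)
        try:
            plain = "".join(chr(ord(ch) - offset) for ch in ciphertext)
        except ValueError:
            continue  # a code point fell below 0: offset is impossible
        # Keep the guess only if everything decrypts to printable ASCII.
        if all(32 <= ord(ch) <= 126 for ch in plain):
            guesses.append((offset, plain))
    return guesses


cipher = "".join(chr(ord(ch) + 1234) for ch in "meet me here at three")
print(guess_offsets(cipher))  # [(1234, 'meet me here at three')]
```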
Upper and lower case are distinct with a Unicode shift cipher.
ROT8000 is a variant of ROT-13 or ROT-47 adapted to Unicode with a rotation of 0x8000 (hexadecimal value) but with some adjustments.
dCode retains ownership of the "Unicode Shift" source code. Unless an explicit open source licence is indicated (Creative Commons / free), the "Unicode Shift" algorithm, the applet or snippet (converter, solver, encryption / decryption, encoding / decoding, ciphering / deciphering, breaker, translator), or the "Unicode Shift" functions (calculate, convert, solve, decrypt / encrypt, decipher / cipher, decode / encode, translate) written in any programming language (Python, Java, PHP, C#, Javascript, Matlab, etc.), and all data downloads, scripts, or API access for "Unicode Shift", are not public; the same applies to offline use on PC, mobile, tablet, iPhone or Android app.
Copying and pasting the page "Unicode Shift" or any of its results is allowed (even for commercial purposes) as long as you credit dCode.
Cite as source (bibliography):
Unicode Shift on dCode.fr [online website], retrieved on 2024-11-21,