Deciphering Ceasar basic concept

Deciphering

 

Ciphertext: “VaqrprzoreoeratraWhyvhfraJnygreUbyynaqreqrgjrroebrefinaRqvguZnetbganne

NzfgreqnzNaarjvytenntzrrxbzraznnezbrgabtrrarragvwqwrovwbzn

oyvwiraBznmnyurgzbrvyvwxuroorabzNaarabtrracnnejrxraqnnegrubhqrafpuev

wsgRqvguSenaxvarraoevrsnnaTregehqAnhznaauhaiebrtrerohhezrvfwrvaSenaxs

hegnzZnva

The given ciphertext has only letters without space, punctuation or separated key, there are two classic cipher systems such as substitution cipher and transposition cipher which are known to be easy to attack by using frequency analysis or bruteforce techniques.

1. Bruteforce

Firstly, it is suggested that some sorts of simple substitution cipher has been used and Ceasar cipher is one of the most popular and simplest known encryption techniques. This encryption can be represented by aligning two alphabets; the ciphertext is the plain alphabet rotated left or right by some numbers of positions. It can be represented using modular arithmetic as well by first converting the letters into numbers such as A = 0, B = 1,..., Z = 25. Encryption of a letter x by a shift n can be described as E(x)=(x+n)mod 26 and the decryption is similarly D(x)=(x-n) mod 26. One way to break this is printing out all results with possible shifts of ciphertext by using bruteforce where the shift ranges from 1 to 26 (assuming they are only lower/uppercase letters in ASCII table). By using the Python program [1], there is a readable output as follow:

This result is in Dutch language and it means: “In December, bringing Juliusen Walter Hollander's two brothers Edith Margot to Amsterdam Anne would like to come but it must no be zero in some time with Grandma Omazal have tired of hard-to for Anne no no few weeks to keep writes Edith Frank Smash letter to Gertrud Naumann, their former neighbor in Frankfurtam Main”.

2. Frequency analysis

By analysing the frequency of ciphertext’s letters, and by knowing the expected distribution of letters in the original language of the plaintext, we can easily decipher substitution ciphertext by trying several combinations of mappings between ciphertext and plaintext letters based on the statistic. The letter frequency of the given ciphertext was derived by using the script [2] which shows the most frequently used letter: “R”. Given that the plaintext is in English, Dutch, German or Turkish, and there are many published frequency result extracted from common books such as: “etaoinsrhldcumfpgwybvkxjqz”, “enatirodslghvkmubpwjczfxy(ëéó)q”, “enisratdhulcgmobw fkzvüpäßjöyqx”, and “aeinrlıdkmuytsboüşzgçhğvcöpfjwxq” respectively[1].

In substituting the letters appeared the most in ciphertext by letters from those result above, I recognize the language of plaintext must be in Dutch. Because after replacing 2 most frequently used Dutch letter R, A in ciphertext with E, N respectively, I found out that the 3-letter-word “een(means “a” in English) appeared with high frequency and this word is one of the most popular Dutch words[2]. Keep replacing others based on the popular Dutch frequency letter with some guessing as described in [3], we get the final text which is similar to the first result.

3.Encryption algorithm improvements

Method 1:

This encryption can be improved by rotating left or right each letters by different numbers of positions. Firstly we convert the letters into numbers such as A = 0, B = 1,.., Z = 25. Encryption of a letter xi can be described as E(x_i )=(x_i+n*i)mod 26 and the decryption is similarly D(x_i )=(x_i-n*i) mod 26 where i is the letter’s position in plaintext or ciphertext. It will be difficult for attackers to use frequency analysis without knowing the encipher method and number n. An example of n=2 is described in [4].

Method 2:

This encryption can be improved by substituting each letters with a pseudo-random generated array of letters. Given an alphabet array consisting of uppercase and lowercase letters:

[a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z]

We shuffle this array by initializing a basic random generator with a seed number which can be the key or the time when this text was enciphered, then we have a Pseudo-random Array (PRA). We convert the letters into numbers such as A = 0, B = 1,.., Z = 25. Encryption of a letter x can be described as and the decryption is similarly where index is a function shows the letter’s position in array. The result consists of random uppercase or lowercase letters independent of plaintext. It will be difficult for attacker to use frequency analysis despite knowing the encipher method, however attackers can still bruteforce the seed if they know initial array’s order and a small value of seed was implemented. An example of seed=0xFFFFFFF is described in [5].

However, I've got a question for viewers: Do you think this kind of encryption seems tobe complex but it can still be beaten by simple substitution cipher analysis?

4. Conclusion

ROT13 is used in this encipher method with replaces a letter with other 13 letters after it in the alphabet. It can be easily broken even in a ciphertext-only scenario by using bruteforce and frequency analysis.

 

-cmxd

APPENDIX

[1]

import string
ciphertext = 'VaqrprzoreoeratraWhyvhfraJnygreUbyynaqreqrgjrroebrefinaRqvguZnetbganneNzfgreqnzNaarjvytenntzrrxbzraznnezbrgabtrrarragvwqwrovwbznoyvwiraBznmnyurgzbrvyvwxuroorabzNaarabtrracnnejrxraqnnegrubhqrafpuevwsgRqvguSenaxvarraoevrsnnaTregehqAnhznaauhaiebrtrerohhezrvfwrvaSenaxshegnzZnva'
for i in range(0,26):
    plaintext = ''
    for c in ciphertext:
        if c in string.uppercase:
            newpos = string.uppercase.find(c)+i
            if newpos>25 : newpos -= 26
            plaintext += string.uppercase[newpos]
        if c in string.lowercase:
            newpos = string.lowercase.find(c)+i
            if newpos>25 : newpos -= 26
            plaintext += string.lowercase[newpos]
    print "Shift:"+str(i)+" : "+plaintext

 

[2]

  1. # -*- coding: utf-8 -*-
    import collections, string
    cipher = 'VAQRPRZOREOERATRAWHYVHFRAJNYGREUBYYNAQREQRGJRROEBREFINARQVGUZNETBGANNENZFGREQNZNAARJVYTENNTZRRXBZRAZNNEZBRGABTRRARRAGVWQWROVWBZNOYVWIRABZNMNYURGZBRVYVWXUROORABZNAARABTRRACNNEJRXRAQNNEGRUBHQRAFPUEVWSGRQVGUSENAXVARRAOEVRSNNATREGEHQANHZNAAUHAIEBRTREROHHEZRVFWRVASENAXSHEGNZZNVA'
    dem = [[0, chr(i + ord('A'))] for i in range(26)]
    for ch in cipher:
        if ch not in string.uppercase:
            continue
        dem[ord(ch) - ord('A')][0] += 1
    count = {}
    dem.sort()
    dem = dem[::-1]
    print "Letter Frequency: ", dem
    for ch in [cipher[i:i+2] for i in xrange(0, len(cipher)-1)]:
        if (not count.get(ch)):
            count[ch]=1
        else:
            count[ch]+=1
    print "2 letters Frequency: ", collections.OrderedDict(sorted(count.items(), key=lambda t: t[1]))
    count = {}
    for ch in [cipher[i:i+3] for i in xrange(0, len(cipher)-2)]:
        if (not count.get(ch)):
            count[ch]=1
        else:
            count[ch]+=1
    print "3 letters Frequency: ",  collections.OrderedDict(sorted(count.items(), key=lambda t: t[1]))

     

[3]

  1. a = {}
    # #e n a t i r o d s l g h v k m u b p w j c z f x y (e e o) q DUTCH
    #e n a r m i t o d u b l j h g s k f w v c z p y x q
    a['R'] = 'E'
    a['A'] = 'N'
    a['N'] = 'A'
    a['E'] = 'R'
    a['Z'] = 'M'
    a['V'] = 'I'
    a['G'] = 'T'
    a['B'] = 'O'
    a['Q'] = 'D'
    a['O'] = 'B'
    a['H'] = 'U'
    a['Y'] = 'L'
    a['W'] = 'J'
    a['U'] = 'K'
    a['T'] = 'G'
    a['X'] = 'U'
    a['S'] = 'F'
    a['F'] = 'S'
    a['J'] = 'W'
    a['I'] = 'V'
    a['P'] = 'C'
    a['M'] = 'Z'
    a['C'] = 'P'
    plaintext = ''
    for ch in cipher:
        if ch in a.keys():
            plaintext = plaintext + a[ch]
        else:
            plaintext = plaintext + '*'
    print plaintext

     

[4]

  1. import string
    plaintext="IndecemberbrengenJuliusenWalterHollanderdetweebroersvanEdithMargotnaarAmsterdamAnnewilgraagmeekomenmaarmoetnogeeneentijdjebijomablijvenOmazalhetmoeilijkhebbenomAnnenogeenpaarwekendaartehoudenschrijftEdithFrankineenbriefaanGertrudNaumannhunvroegerebuurmeisjeinFrankfurtamMain"
    ciphertext=""
    for i in range(1,len(plaintext)+1):
        shift = (2*i%26)
        if plaintext[i-1] in string.uppercase:
            newpos = string.uppercase.find(plaintext[i-1])+shift
            if newpos>25 : newpos -= 26
            ciphertext += string.uppercase[newpos]
        elif plaintext[i-1] in string.lowercase:
            newpos = string.lowercase.find(plaintext[i-1])+shift
            if newpos>25 : newpos -= 26
            ciphertext += string.lowercase[newpos]
    print "Ciphertext: ", ciphertext
    #Decipher
    plaintext=""
    for i in range(1,len(ciphertext)+1):
        shift = (2*i%26)
        if ciphertext[i-1] in string.uppercase:
            newpos = string.uppercase.find(ciphertext[i-1])-shift
            if newpos<0 : newpos += 26
            plaintext += string.uppercase[newpos]
        elif ciphertext[i-1] in string.lowercase:
            newpos = string.lowercase.find(ciphertext[i-1])-shift
            if newpos<0 : newpos += 26
            plaintext += string.lowercase[newpos]
    print "Plaintext: ", plaintext

     

[5]

  1. import random
    def random_encipher(string,seed):
       random.seed(seed)
       alphabet = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z","A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]
       random.shuffle(alphabet)
       ciphers = ""
       for letter in string:
        if ord(letter) >=97:
           index = ord(letter)-97
           cipher = alphabet[index]
           ciphers+=cipher
        else:
           index = ord(letter)-65
           cipher = alphabet[26+index]
           ciphers+=cipher
       return ciphers
    def random_decipher(string,seed):
       random.seed(seed)
       alphabet = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z","A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]
       random.shuffle(alphabet)
       plaintext = ""
       for letter in string:
        index=alphabet.index(letter)
        if index >=26:
           plaintext += chr(65+index-26)
        else:
           plaintext += chr(97+index)
       return plaintext
    ciphertext= random_encipher("IndecemberbrengenJuliusenWalterHollanderdetweebroersvanEdithMargotnaarAmsterdamAnnewilgraagmeekomenmaarmoetnogeeneentijdjebijomablijvenOmazalhetmoeilijkhebbenomAnnenogeenpaarwekendaartehoudenschrijftEdithFrankineenbriefaanGertrudNaumannhunvroegerebuurmeisjeinFrankfurtamMain",0xFFFFFFF )
    print "Ciphertext: ", ciphertext
    print "Plaintext: ",random_decipher(ciphertext, 0xFFFFFFF )

     

  1. http://www.letterfrequency.org
  2. https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists#Dutch