$c_{1} = E( IV \oplus p_{1})$

$c_{2} = E( c_{1} \oplus p_{2})$

$...$

$c_{n} = E( c_{n-1} \oplus p_{n})$





image stolen from https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation

$p_{i} = D(c_{i}) \oplus c_{i-1}$





also $D : C $ X $ K \rightarrow P$

and $D = E^{-1}$





Padding Scheme

$ p = \big( \sum^{n}_{i} p_{i} \big) + \xi^{e} $

preserve plain-text content

make the encryption algorithm aware of how much padding to remove

the length of characters that need to be added is calculated as a number $k$ and then the padding is made up of $k$ characters of the value $k$

0x1 0x2 0x2 0x3 0x3 0x3

...

n. 0xn 0xn 0xn 0xn ... 0xn





Formulating the attack

So now we know how padding, encryption and decryption works for Cipher Block Chaining mode. Lets play around with some of the algebra and talk about some of the weird properties it has. We're going to formulate an attack, one that allows us to learn about the plain-text without knowing the secret key. Algebraically this means we need an expression of the plain-text that depends only on values that we can determine (in a time less than a full brute-force attack) and that we already know.





As an attacker we control the cipher text that needs to be decrypted, we now need an expression involving this that will give us information about the plain-text. As we know in the decryption operation the cipher text is first decrypted and then XOR'd with the previous block of cipher text, as follows:





$p_{i} = D(c_{i}) \oplus c_{i-1}$ (1)





What we need to remember here is that we control all the c's so to denote a block that has been changed by an attacker as follows $c^{'}_{i}$. Also the decryption operation starts from the last block, meaning the block that most likely contains some padding. Here's what it looks like when an attacker has influenced the last block of cipher text for decryption:





$p^{'}_{i} = D(c^{'}_{i}) \oplus c_{i-1}$ (2)





But this would (as noted above) change the decryption operation for the current block; it changes to $c^{'}_{i}$ from $c_{i}$. We want to know what the result of the $D(c_{i})$ operation is so that means we can't touch what goes into it; rather we should be looking at what happens when we change the second last block, namely $c^{'}_{i-1}$. Here's what that will look like:





$p^{'}_{i} = D(c_{i}) \oplus c^{'}_{i-1}$ (3)

We can then expand the decryption operation (since we know algebraically what the result will be):





$c_{i} = E(p_{i} \oplus c_{i -1})$ (4)

pluggin this into (3) we get:

$p^{'}_{i} = D(E(p_{i} \oplus c_{i -1})) \oplus c^{'}_{i-1}$ (5)





and since $D = E^{-1}$ and thus because they are inverse functions

$D(E(x)) = E^{-1}(E(x)) = x$ therefore





$p^{'}_{i} = p_{i} \oplus c_{i -1} \oplus c^{'}_{i-1}$ (6)





So now we have an equation with two unknowns; we don't know $p_{i}$ because its the plain-text, and we don't know $p^{'}_{i}$ because it depends on $p_{i}$. So how do we solve this equation? Well the truth is we do know some of the values $p^{'}_{i}$ will assume. Since we are decrypting the last block, this means $p^{'}_{i}$ will need to (at some point) be padding bytes (0xk 0xk ...), algebraically this means we can solve the equation! so lets rearrange the terms in (6) so we can target the correct term:





$ p_{i} = p^{'}_{i} \oplus c_{i -1} \oplus c^{'}_{i-1}$

Awesome! Now to put the final nail in the coffin we need to write this in terms of bytes (we've been talking about blocks so far). Here's the byte wise definition:





$p_{i}[j] = p^{'}_{i}[j] \oplus c_{i -1}[j] \oplus c^{'}_{i-1}[j]$ (7)





Where $[j]$ means the $j_{th}$ byte of the annotated block.

So given that some of the bytes in $p^{'}_{i}$ will be known padding bytes we can write them as $k_{j}$. Which gives us:





$p_{i}[j] = k[j] \oplus c_{i -1}[j] \oplus c^{'}_{i-1}[j]$ (8)





now officially we have proved that this equation is solvable :)





Decryption by Oracle So the attack works by choosing leaking formation about the bytes that make up $p_{i}$, we propose a value for $k[j]$ (while modifying the values for $c^{'}_{i-1}[j]$) and then asking the decryption oracle if this value is true. Lets say we are decrypting last byte, we can choose 0x1 as $k[j]$ and 0x90 as the value for $c^{'}_{i-1}[j]$ then as the oracle the following question:





0x1 $= p_{i}[j] \oplus c_{i -1}[j] \oplus$ 0x90 ?





The "oracle" here could be a web service or a mail server or anything that is reporting on the validity of of the CBC decryption operation.



Because the cipher text we are decrypting depends on the original value of our modified cipher text block the equation includes $c_{i-1}[j]$. So when you're performing this attack, you will need to keep track of what you changed the byte value from.



So depending on whether the oracle manages to decrypt the byte or not (decryption will succeed if the byte we are influencing with these guessed values actually decrypts to a valid padding value). So the attack might mean guessing a bunch of times like this:



0x1 $= p_{i}[j] \oplus c_{i -1}[j] \oplus$ 0x90 0x1 $= p_{i}[j] \oplus c_{i -1}[j] \oplus$ 0x255 ... 0x1 $= p_{i}[j] \oplus c_{i -1}[j] \oplus$ 0x97 So depending on whether the oracle manages to decrypt the byte or not (decryption will succeed if the byte we are influencing with these guessed values actually decrypts to a valid padding value). So the attack might mean guessing a bunch of times like this:

0x1

$p_{i}[j] =$ 0x1 $ \oplus c_{i -1}[j] \oplus$ 0x97

0x2

0x2

0x2 $= p_{i}[j-1] \oplus c_{i -1}[j-1] \oplus$ 0x112

0x2 $= p_{i}[j-1] \oplus c_{i -1}[j-1] \oplus$ 0x87

...

0x2 $= p_{i}[j-1] \oplus c_{i -1}[j-1] \oplus$ 0x48

0x03

Parallelization of the attack

Interesting Questions

Are there any other block cipher modes that are susceptible to the Vaudenay attack? Given that the fastest known worst running time for the depends on re-ordering the decrypted plain-text is there a way to make it not depend on this? In a sense find an algorithm that will beat the speed of one dependent on how fast the plain-text can be re-ordered. In other words find an algorithm that makes this fast in a way that doesn't depending on a sorting algorithm, use the inherent properties of the decryption attack to make it faster.

What about decryption? Well you could probably imagine what that looks like:And in algebraic notation:There is some information missing from this operation though, what about padding? Block ciphers are only designed to handle block length multiples of input data, so either we just hope that all plain-text we need to encrypt is a multiple of the block length or we add something to the plain-text to make sure it is :)Well in order to account for that we need to expand our definition of plain-text blocks as follows:and where $+$ and $\sum$ are short hands for the string concatenation operation; and $ \xi \in S$ where $S$ is the set of all padding characters; $e$ is the length of padding character we need to add to make $p$ a string that's a length that's a multiple of the block length. Which means that $ e + n $ $mod$ blocklength $ = 0$.There are standards for padding plain-text before encrypting it, each scheme works a little differently but each of them must accomplish two things:Here we will look at the padding scheme proposed in PKCS#7 defined in RFC 5652 . Here's how it works:For example in increasing length the padding works as follows:So in essence we are proposing in the first round of the attack that the plain-text has one byte of padding (hence the guess of 0x1), this means that the $k[j]$ should have been. we are essentially guessing values for $c_{i-1}[j]$ until this is true and the oracle can move onto decrypting the next byte in the block. So if the Oracle replies with a YES (yes I can decrypt the next byte because this one decrypted to the correct padding value) then we can work out what the $j_{th}$ is with a couple rudimentary algebraic tricks. By rearranging we can then express this as:We then need to move onto guessing the value for the next byte, this means we need to fixate a different value for $k[j-1]$ ( $j-1$ since we're working on the next byte), and according to how the padding scheme works we need to make sure that we setfor $k[j-1]$,for $k[j]$ and see to it that we only modify the $2_{nd}$ last byte of $c^{'}_{i-1}$ namely $c^{'}_{i-1}[j-1]$.We can then guess until we get a yes from the Oracle:Once we get a yes on this round we can move onto the $3_{rd}$ last byte setting $k[j-3] \rightarrow k[j]$ as. Overall the attack is merely a way of abusing the fact that we have a way to ask about the value of the padding in a block, if we know about the padding we can stuff this information into the algebraic forms above and jimmy them around until they tell us about the plain-text!You should also notice that when we wrote the notation in (6) we where strict about the order of the blocks involved, though this is merely semantic,it means to express that the attack essentially abuses control of the $i_{th}$ block to decrypt the $(i-1)_{th}$ block. In fact, you can use any two pairs of cipher blocks in the attack, regardless of when they occur in the actual cipher-text. All that is required as that the attack takes in two whole cipher text blocks.This independence with regard to ordering is a classic attribute in problems that can be solved using parallelization; namely we can processes decryption of a number of blocks simultaneously and then reorder them once the plain-text is available.