We now estimate the cost of generating a near-collision block. Since the bitconditions are concentrated on the first 16 working states and the tunnel T_8 is used, we assume that the algorithm can be broken down into the following steps:
1. Generate a full differential path/generate a set of initial working states that connects to the lower differential path.
2. Select Q_1, ..., Q_{16} according to the path and according to tunnel requirements.
3. Try to obtain a solution up to step 24 (see footnote 25) with the help of tunnels T_4 and T_5. If it is not possible to obtain a solution, return to step 2 of this algorithm and choose different Q_i. Use early abort to reduce the cost of this step. The message words are computed as needed, as in the attack by Stevens et al.
4. Attempt to generate a solution for the whole path from our solution up to step 24 using tunnel T_8. We use early abort to some extent. We do not try to solve the differential path exactly since we only care about the values of δQ_{61}, ..., δQ_{64}, but we check whether δQ_{35} = ... = δQ_{60} = 2^{31} and abort if that is not the case.
5. Check whether the values for δQ_{61}, ..., δQ_{64} are correct. If so, we have found a solution; if not, we have to continue searching.
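The control flow of steps 2 to 5 can be sketched as follows. This is only an illustrative skeleton, not the attackers' implementation: the function name and all five callables are hypothetical placeholders for the operations described above, step 1 (path generation) is assumed to have been done already, and the loop bound `max_outer` is added only so that the sketch terminates.

```python
from typing import Callable, Iterable, Optional, TypeVar

S = TypeVar("S")  # hypothetical type of a (partial) solution


def search_near_collision_block(
    select_states: Callable[[], S],                   # step 2: choose Q_1..Q_16
    solve_to_step_24: Callable[[S], Optional[S]],     # step 3: tunnels T_4, T_5
    tunnel_t8_variants: Callable[[S], Iterable[S]],   # step 4: candidates from T_8
    middle_ok: Callable[[S], bool],                   # dQ_35 = ... = dQ_60 = 2^31
    tail_ok: Callable[[S], bool],                     # step 5: dQ_61..dQ_64 correct
    max_outer: int,
) -> Optional[S]:
    """Return the first candidate passing all checks, or None if none is found."""
    for _ in range(max_outer):
        partial = solve_to_step_24(select_states())
        if partial is None:
            continue  # no solution up to step 24: go back to step 2
        for candidate in tunnel_t8_variants(partial):
            if not middle_ok(candidate):
                continue  # early abort on the middle segment
            if tail_ok(candidate):
                return candidate
    return None
```

Passing the individual operations as callables keeps the sketch independent of how working states and message words are actually represented.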
We will estimate the expected number of calls to the Step function of MD5Compress that the algorithm makes. Dividing this number by 64 gives us the equivalent cost in terms of MD5Compress calls. Let C_{path} be the cost of generating the full differential path. We do not know the exact value of C_{path}, but we can estimate the costs that the other steps of the message block construction algorithm contribute.
In step 2, we can choose working states Q_i without computing the Step function. But we also need to verify whether the rotations in these steps are correct. Since we do not know how the states Q_1, ..., Q_{16} are selected (the attackers may have introduced additional bitconditions to improve the rotation probabilities, as in the attack by Stevens et al.; see Section 2.2.9 and [24, Section 6.3.3]), we assume that this cost is negligible compared to the cost of the other parts. For each t such that 16 ≤ t ≤ 24, we estimate the expected cost C_t of generating a solution up to step t. Let p_t be the probability that a solution up to step t is also a solution up to step t+1. With b_t the number of bitconditions on step t+1 and r_t the rotation probability of step t+1, we have p_t = 2^{-b_t} · r_t. We let s_t be the combined strength of the tunnels that are applicable to step t, i.e., the tunnels that only change working states Q_{t+1}, ..., Q_{64}. The expected amortized cost for generating a solution up to step t is
\[
C'_t = \frac{C_t + 2^{s_t} - 1}{2^{s_t}}
\]
which can be seen as follows: Using the tunnels, if we have one solution up to step t, we can generate 2^{s_t} solutions up to step t. But for each solution beyond the initial one, we have to recompute one message word (see footnote 26). The cost of this recomputation per message word is equivalent to the cost of Step. Thus, we can generate 2^{s_t} solutions up to step t at an expected cost of C_t + 2^{s_t} − 1, and therefore the above formula for the expected amortized cost holds. If there are no applicable tunnels, we have s_t = 0 and C'_t = C_t.

Footnote 25: Recall that we say that a pair of inputs solves a path up to step t if it agrees with the bitconditions q_{-3}, ..., q_t and with the δQ_{t+1} from the differential path.
Footnote 26: We have to recompute only one word because we compute the message words as they are needed. The other words that might be changed by the tunnel are not computed at this point.

We estimate the expected cost of finding a solution up to step t as
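\[
C_t =
\begin{cases}
0 & \text{for } t = 16,\\[4pt]
\dfrac{1}{p_{t-1}} \cdot \left( C'_{t-1} + 2 \right) & \text{for } 16 < t \le 32,\\[8pt]
\dfrac{1}{p_{t-1}} \cdot \left( C'_{t-1} + 1 \right) & \text{for } t > 32.
\end{cases}
\]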
This estimate is justified as follows: We have a probability of p_{t-1} that a solution up to step t−1 is a solution up to step t. Thus, we expect that we need to examine 1/p_{t-1} solutions up to step t−1 until we find a solution up to step t. The expected amortized cost of generating a solution up to step t−1 is C'_{t-1}. We need to compute one additional step of MD5Compress to determine whether a solution up to step t−1 is a solution up to step t. For t ≤ 32, we need to compute one additional message word in order to do this. After step 32, we already have the whole message block. The cost of computing a message word is equivalent to one execution of Step.
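A single application of the recurrence can be written down directly. The following is a minimal sketch under the definitions above; the function and parameter names are illustrative choices, not taken from the source. It returns costs measured in calls to Step.

```python
def step_probability(num_bitconditions: int, rotation_probability: float) -> float:
    """p_t = 2^(-b_t) * r_t for the step following step t."""
    return 2.0 ** (-num_bitconditions) * rotation_probability


def expected_cost(t: int, prev_amortized_cost: float, p_prev: float) -> float:
    """C_t computed from C'_{t-1} (prev_amortized_cost) and p_{t-1} (p_prev)."""
    if t < 16:
        raise ValueError("the estimate is only defined for t >= 16")
    if t == 16:
        return 0.0  # choosing Q_1..Q_16 is assumed to cost nothing
    # Each candidate costs one Step call, plus one message word while t <= 32.
    per_candidate = 2.0 if t <= 32 else 1.0
    # On average 1/p_{t-1} candidates are examined before one succeeds.
    return (prev_amortized_cost + per_candidate) / p_prev
```

For example, `expected_cost(17, 0.0, 2.0 ** -6)` evaluates to 128.0, i.e. 2^7 calls to Step.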
The probabilities p_{16}, ..., p_{23} are roughly equal for each of the four differential paths. They are given in the following table.

t      16         17         18         19         20         21         22        23
p_t    2^{-6.0}   2^{-3.0}   2^{-2.2}   2^{-2.0}   2^{-3.0}   2^{-2.0}   2^{0.0}   2^{-1.0}
Furthermore, averaging over the tunnel strengths in all four observed near-collision blocks, we have s_{21} ≈ 10.8, s_{24} ≈ 9.8, and s_t = 0 for all t ≠ 21, 24. Thus, we have
C_{17} = C'_{17} ≈ 2^{7.0}
C_{18} = C'_{18} ≈ 2^{10.0}
C_{19} = C'_{19} ≈ 2^{12.2}
C_{20} = C'_{20} ≈ 2^{14.2}
Then, C_{21} ≈ 2^{17.2}. The amortized cost is C'_{21} ≈ 2^{6.4}. Continuing on towards step 24, we have
C_{22} = C'_{22} ≈ 2^{8.5}
C_{23} = C'_{23} ≈ 2^{8.5}
and C_{24} ≈ 2^{9.5}. The amortized cost is C'_{24} ≈ 2^{0.9}.
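These values can be reproduced numerically from the table of probabilities and the two averaged tunnel strengths. The sketch below is a plain re-evaluation of the formulas for C_t and C'_t; the names are illustrative, and because the inputs are rounded, the results should only be expected to match the quoted exponents to within about 0.1.

```python
import math

# log2 of p_16..p_23, taken from the table above
LOG2_P = {16: -6.0, 17: -3.0, 18: -2.2, 19: -2.0,
          20: -3.0, 21: -2.0, 22: 0.0, 23: -1.0}
# averaged tunnel strengths s_t; s_t = 0 for every other t
TUNNEL_STRENGTH = {21: 10.8, 24: 9.8}


def amortized(cost: float, strength: float) -> float:
    """C'_t = (C_t + 2^s_t - 1) / 2^s_t."""
    n = 2.0 ** strength
    return (cost + n - 1.0) / n


def costs_up_to_step_24():
    """Return two dicts mapping t to C_t and to C'_t for t = 16..24."""
    cost = {16: 0.0}
    amort = {16: amortized(0.0, TUNNEL_STRENGTH.get(16, 0.0))}
    for t in range(17, 25):
        # t <= 32 here, so each candidate costs one Step plus one message word
        cost[t] = (amort[t - 1] + 2.0) / 2.0 ** LOG2_P[t - 1]
        amort[t] = amortized(cost[t], TUNNEL_STRENGTH.get(t, 0.0))
    return cost, amort


if __name__ == "__main__":
    cost, amort = costs_up_to_step_24()
    for t in range(17, 25):
        print(f"t={t}: C_t = 2^{math.log2(cost[t]):.2f}, "
              f"C'_t = 2^{math.log2(amort[t]):.2f}")
```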
Since we have only trivial bitconditions with δQ_i = 0 from step 24 to step 33, we have C'_{33} = C_{33} = C'_{24} + 17 ≈ 2^{4.2} (steps 25 to 32 cost 2 each and step 33 costs 1, all with success probability 1). In the next step, we have one non-constant bitcondition and a rotation probability of 1/2, so we get C_{34} = C'_{34} ≈ 2^{6.4}. Then, there is a sequence of trivial steps with δQ_i = 2^{31}. From now on, we relax our definition of solving a differential path. We require only that the δQ_i are correct. By Theorem 1.14, part 3, we have δQ_i = 2^{31} with certainty up to i = 47. Thus, C_{47} = C'_{47} = C_{34} + 13 ≈ 2^{6.6}. For the steps after that, Theorem 1.14, part 4, tells us that δQ_i = 2^{31} implies δQ_{i+1} = 2^{31} as long as either ∆Q_i[31] = ∆Q_{i-2}[31] and δW_t = 0, or ∆Q_i[31] ≠ ∆Q_{i-2}[31] and δW_t = 2^{31}. We assume that the condition of Theorem 1.14, part 4, holds with probability 1/2.
This is justified as follows: Assuming that Q_i is selected uniformly at random and Q'_i = Q_i + 2^{31},
we have ∆Q_i[31] = 1 or −1 with probability 1/2 each and exactly one of these two alternatives
That gives us C_{59} = 2^{19.6}. If the path is solved up to step 59 (modulo the sign in ∆Q_i[31]), we have δQ_{57} = ... = δQ_{60} = 2^{31}. Let the random variable attempts be the number of input pairs with δQ_{57} = ... = δQ_{60} = 2^{31} that we need to evaluate until we find an input pair that has the desired values for δQ_{61}, ..., δQ_{64}. The expected cost of finding a solution with the correct values for δQ_{61}, ..., δQ_{64} is then E[attempts] · (C_{60} + 4) ≈ E[attempts] · 2^{19.6}. In terms of calls to MD5Compress, the complexity of generating a near-collision block is
C_{block} = C_{path} + E[attempts] · 2^{13.6}
since MD5Compress consists of 64 = 2^6 steps. We will estimate E[attempts] in the next section using Monte Carlo simulations.
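The unit conversion in the last formula is a division by 64. The following sketch makes it explicit; the function names are illustrative, and it assumes that C_{path} is already expressed in MD5Compress calls. No value for E[attempts] is assumed here, so the arguments in any call are placeholders.

```python
import math

STEPS_PER_COMPRESS = 64  # MD5Compress consists of 64 = 2^6 steps


def steps_to_compress_calls(step_calls: float) -> float:
    """Convert a cost in Step calls into equivalent MD5Compress calls."""
    return step_calls / STEPS_PER_COMPRESS


def block_cost(c_path: float, expected_attempts: float,
               log2_steps_per_attempt: float = 19.6) -> float:
    """C_block = C_path + E[attempts] * 2^13.6, in MD5Compress calls.

    c_path is assumed to be given in MD5Compress calls already; the default
    exponent 19.6 is the per-attempt cost in Step calls quoted in the text.
    """
    per_attempt = steps_to_compress_calls(2.0 ** log2_steps_per_attempt)
    return c_path + expected_attempts * per_attempt


if __name__ == "__main__":
    per_attempt = steps_to_compress_calls(2.0 ** 19.6)
    print(f"cost per attempt = 2^{math.log2(per_attempt):.1f} MD5Compress calls")
```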
3.6.2 Estimating the Expected Number of Attempts at Solving the End-Segment