schemes for securing network coding systems against pollution attacks. Within this framework, we roughly divide these schemes into three phases:
• Parameter setup phase: The source determines security parameters, chooses its keys including secret keys or public and private keys, and selects its hash or signature function.
• MAC (hash or signature) calculation phase: The source calculates the authentication information such as the hashes, MACs or signatures of its messages. This information is either securely transmitted to the forwarders and sinks, or directly attached to the original messages.
• Message verification phase: The forwarders and sinks verify received messages. Verification is based on encoding vectors, authentication information, shared secret keys or the source’s public keys. If verification succeeds, the received messages are accepted and will be used for further encoding or decoding. Otherwise, they are discarded.
Table 6.1 Notation
Symbol Explanation
Mi, mi,j i-th source message and its j-th codeword
E, ej encoded message and its j-th codeword
n the number of source messages transmitted
m the number of codewords of each message
t the number of random keys each node has
u the number of codewords hashed in each MAC
wi,j message Mi’s hash embedded in its j-th MAC
K, |K| global key pool and its size
ks,i i-th key of the source
id(ks,i) ks,i’s index in key pool
{x}ks,i encrypting x with random key ks,i
ri random seed used to generate hash chain for i-th MAC
ri,j j-th element of hash chain computed from ri
The first phase can be done offline, but the other two must be executed online. Hence, the second and third phases mainly determine the efficiency of schemes.
Note: Once a forwarder detects a polluted message, it may either encode other unpolluted messages by selecting a new encoding vector, or ask its upstream node to send the message again, because the pollution may be due to transmission error. Of course, the number of retransmissions should be pre-defined. We do not discuss this issue here, because it is out of the scope of our work.
6.4.2 The Detailed Procedure of Our Scheme
We assume that each node can randomly pick a number of secret keys from a global key pool, using a probabilistic key pre-distribution approach such as [25, 90]. Thus, any two nodes have a certain probability of sharing a common secret key. The source generates the same number of MACs for each message using its random keys. Each MAC is calculated from some codewords randomly selected from the message; hence, it can authenticate those codewords of the message. In this way, each forwarder sharing some secret key(s) with the source can verify the corresponding codewords of an input message by checking the MACs using the shared key(s).
However, this shared-key based verification has a vulnerability: a compromised forwarder that holds a shared key knows which codewords were used to generate the corresponding MAC. It can therefore pollute the codewords of a message that are not covered by that MAC without being detected by nodes holding only the same key, although any codeword it pollutes that is authenticated by another MAC, for which it has no shared key, can still be detected by the holders of that key. To address this vulnerability, we choose to overlap the codewords authenticated by any two MACs for the same message. By carefully controlling the overlapping ratio, we ensure that a polluted message can be detected within a certain number of hops with high probability. We describe the detailed procedure of each phase of our scheme in the rest of this section.
Parameter setup phase: In this phase, the source first chooses the following security parameters, functions and secret keys:
• Two parameters t and u, where t is the number of MACs attached to each source message, and u is the number of codewords used to generate a MAC. These two parameters are public.
• t random integers r1, · · · , rt, where each rj ∈ [1, m] for j = 1, · · · , t. Each integer will be embedded into a MAC to identify the indexes of the codewords from which the MAC is generated.
• A pseudo-random permutation function f : [1, m] → [1, m], where f is public and any node can compute a hash chain from a given seed rj using this function.
• A hash function h : Z_q^u → Z_q, where Z_q constrains the range of codewords and h is public. Using h, any node can generate a hash from u codewords, where the length of the hash is the same as that of the codewords.
• t random keys ks,1, · · · , ks,t from a global key pool K, where s is the index of the source.
The index of each key ks,i in the key pool for i = 1, · · · , t is denoted as id(ks,i).
Note: We suppose that each node picks t random keys from K. The keys of node j are denoted as kj,1, · · · , kj,t.
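The key-sharing probability mentioned above can be made concrete with a short calculation. The following is a minimal sketch, assuming that every node draws its t keys uniformly and without replacement from the pool K; the cited pre-distribution approaches may differ in detail. The function names `share_probability` and `draw_key_ring` are hypothetical and chosen only for illustration.

```python
from math import comb
import random


def share_probability(pool_size: int, t: int) -> float:
    """Probability that two nodes share at least one key.

    Assumption: each node holds t distinct keys drawn uniformly from a
    pool of pool_size keys, independently of the other node.
    """
    if 2 * t > pool_size:
        # Two rings of t keys cannot be disjoint in a pool this small.
        return 1.0
    # 1 - P(second ring avoids all t keys of the first ring)
    return 1.0 - comb(pool_size - t, t) / comb(pool_size, t)


def draw_key_ring(pool_size: int, t: int, rng: random.Random) -> set:
    """Draw the indexes id(k) of t distinct keys from the pool."""
    return set(rng.sample(range(pool_size), t))


if __name__ == "__main__":
    rng = random.Random(1)
    source_ring = draw_key_ring(1000, 30, rng)
    forwarder_ring = draw_key_ring(1000, 30, rng)
    print("shared key indexes:", sorted(source_ring & forwarder_ring))
    print("sharing probability:", share_probability(1000, 30))
```

Under this assumption, enlarging t or shrinking |K| raises the chance that a forwarder can verify at least one MAC, at the cost of more MACs per message or a smaller key space.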
MAC calculation phase: In this phase, the source attaches t MACs to each message Mi. Each MAC is obtained by encrypting the hash of u randomly selected codewords using a random key. For XOR network coding, a hash is simply an XOR of the selected codewords, whereas for normal network coding, the hash is a random linear combination of the selected codewords.
More precisely, message Mi is attached with t MACs MACi,1, · · · , MACi,t as well as the
corresponding indexes of the random keys that are used to generate MACs. Thus, in our scheme, the source actually generates and transmits
Mi, id(ks,1), MACi,1, · · · , id(ks,t), MACi,t . (6.5)
For j = 1, · · · , t, we define
MACi,j = {id(ks,j), rj, hi,j}ks,j , (6.6)
where {·}ks,j denotes encryption using key ks,j, and hi,j is the hash of u randomly selected
codewords of message Mi. The indexes of these codewords are determined by a hash chain that
is computed from a seed rj using function f . For v = 1, · · · , u, let rj,v denote each element of
the hash chain. Then, we have
rj,v= f (rj,v−1) , (6.7)
where rj,0= rj. Here, rj,1, · · · , rj,u are the indexes of selected codewords.
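The hash chain of equation (6.7) can be sketched as follows. This is only an illustration: the text requires f to be a public pseudo-random permutation on [1, m] but does not fix its construction, so the seeded shuffle in `make_permutation` is an assumption, and both function names are hypothetical.

```python
import random


def make_permutation(m: int, public_seed: int) -> dict:
    """Build a public permutation f on [1, m].

    Assumption: a shuffle driven by a public seed stands in for the
    pseudo-random permutation function f of the scheme.
    """
    values = list(range(1, m + 1))
    shuffled = values[:]
    random.Random(public_seed).shuffle(shuffled)
    return dict(zip(values, shuffled))


def codeword_indexes(f: dict, seed: int, u: int) -> list:
    """Return r_{j,1}, ..., r_{j,u} with r_{j,0} = seed and r_{j,v} = f(r_{j,v-1})."""
    indexes = []
    current = seed
    for _ in range(u):
        current = f[current]
        indexes.append(current)
    return indexes


if __name__ == "__main__":
    f = make_permutation(16, public_seed=7)
    print(codeword_indexes(f, seed=3, u=4))
```

Since f is a permutation, the chain returns to an earlier index only if the seed lies on a cycle of f shorter than u; this sketch does not check for that case.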
Once u codewords are selected, the source can generate the hash from these codewords using function h. For XOR network coding, the hash is
hi,j = mi,rj,1⊕ · · · ⊕ mi,rj,u . (6.8)
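Equation (6.8) translates directly into code. The sketch below assumes that codewords are represented as non-negative Python integers and that indexes are 1-based, as in the text; the name `xor_hash` is hypothetical. It also illustrates the property that verification relies on later: the hash of an XOR-encoded message equals the XOR of the hashes of its source messages.

```python
from functools import reduce
from operator import xor


def xor_hash(codewords: list, indexes: list) -> int:
    """XOR of the codewords at the given 1-based indexes, as in equation (6.8)."""
    return reduce(xor, (codewords[i - 1] for i in indexes), 0)


if __name__ == "__main__":
    m_i = [0x12, 0x34, 0x56, 0x78]
    m_j = [0x9A, 0xBC, 0xDE, 0xF0]
    selected = [2, 3, 4]  # r_{j,1}, r_{j,2}, r_{j,3}

    # Encoded message E = M_i XOR M_j, codeword by codeword.
    e = [a ^ b for a, b in zip(m_i, m_j)]

    # The hash of E equals the XOR of the two source hashes.
    print(xor_hash(e, selected) == xor_hash(m_i, selected) ^ xor_hash(m_j, selected))
```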
Note: we also consider normal network coding. In this case, the hash becomes
h_{i,j} = β_1 m_{i,r_{j,1}} + · · · + β_u m_{i,r_{j,u}} mod q = Σ_{v=1}^{u} β_v m_{i,r_{j,v}} mod q , (6.9)
where the coefficients βv ∈ Zq for v = 1, · · · , u are randomly generated to combine these
codewords. A simple way to generate these coefficients is to let them form a hash chain, which is generated from the seed rj using another pseudo-random permutation function f′. (We do
Finally, each source message is attached with t MACs, and each MAC is computed from u codewords. Equivalently, each MAC authenticates u codewords of a message. We emphasize that the codewords authenticated by different MACs may overlap, that is, the same codeword may be used to generate different MACs. On average, each codeword is authenticated by (t × u)/m MACs.
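The average coverage follows from a simple count: t MACs, each covering u distinct codewords, give t × u (MAC, codeword) pairs spread over m codewords. The sketch below checks this on a small hand-picked example; the index sets are illustrative values, not taken from the scheme, and the name `coverage` is hypothetical.

```python
from collections import Counter


def coverage(mac_indexes: list, m: int) -> list:
    """Number of MACs that authenticate each codeword 1..m."""
    counts = Counter(i for indexes in mac_indexes for i in indexes)
    return [counts.get(i, 0) for i in range(1, m + 1)]


if __name__ == "__main__":
    m, t, u = 6, 3, 4
    # Illustrative overlapping index sets, one per MAC.
    mac_indexes = [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 1, 2]]
    per_codeword = coverage(mac_indexes, m)
    print(per_codeword)
    print(sum(per_codeword) / m == t * u / m)
```

In this example every codeword is covered by exactly two MACs, so a node holding only one of the three keys cannot tell which codewords remain unchecked by the other two.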
In our scheme, when each forwarder generates its output message, it always attaches the MACs of all source messages from which this output message is produced. For example, when a forwarder generates E = Mi⊕ Mj, it will attach MACi,1, · · · , MACi,t and MACj,1, · · · , MACj,t
to its output message E. We observe that the source generates the MACs for different messages using the same set of random keys, so the indexes of keys such as id(ks,j) in equation (6.6) do
not need to be transmitted multiple times.
Message verification phase: In this phase, each forwarder or sink verifies its input messages based on the MACs for which it shares key(s) with the source. When receiving a message along with the MACs of all source messages from which this message is encoded, the node proceeds as follows:
1. It first checks the indexes prefixed to each MAC to see if it has any shared key with the source.
2. Once it finds a shared key, it decrypts the corresponding MACs of the source messages and generates the indexes of the u codewords from the seed embedded in the MACs.
3. For normal network coding, it also needs to generate the coefficients used to combine the codewords.
4. After identifying the indexes of codewords, it takes the corresponding codewords out of the received message and calculates the hash of these codewords following equation (6.8) for XOR coding (or equation (6.9) for normal linear coding).
5. It further takes out the hashes embedded into the decrypted MACs of source messages and encodes them using the encoding vector transmitted along with the received message.
6. Finally, it checks whether the hash of the received message (calculated in step 4) equals the combination of the hashes embedded in the corresponding MACs (obtained in step 5). If they are equal, the verification succeeds. Otherwise, the received message is assumed to be polluted and is discarded.
For example, suppose a node receives E = Mi ⊕ Mj and finds that it can decrypt MACi,l and
MACj,l of messages Mi and Mj. If the decrypted MACs show that they were calculated from
the codewords with indexes x, y and z, the node checks whether
ex⊕ ey⊕ ez = hi,l⊕ hj,l , (6.10)
where ex, ey and ez are the corresponding codewords of message E, and hi,l and hj,l are the
hashes encrypted in MACi,l and MACj,l. When equation (6.10) is satisfied, the node accepts
message E. Otherwise, it discards E.
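The verification steps for XOR coding can be sketched end to end as follows. Several parts are assumptions made only for illustration: `toy_cipher` is an insecure placeholder for the encryption {·}k (a real deployment would use a proper cipher), codewords are 32-bit integers, the permutation f is given as a dictionary, and all function names are hypothetical. The list `macs` holds one MAC per source message named by the encoding vector, all generated with the same shared key.

```python
import hashlib
import struct
from functools import reduce
from operator import xor


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Placeholder cipher, NOT secure: XOR with a SHA-256-derived keystream.

    Encryption and decryption are the same operation.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def chain(f: dict, seed: int, u: int) -> list:
    """Codeword indexes r_{j,1..u} derived from the seed, as in equation (6.7)."""
    out, cur = [], seed
    for _ in range(u):
        cur = f[cur]
        out.append(cur)
    return out


def xor_hash(codewords: list, indexes: list) -> int:
    """XOR of the selected codewords (1-based indexes), as in equation (6.8)."""
    return reduce(xor, (codewords[i - 1] for i in indexes), 0)


def make_mac(key: bytes, key_id: int, seed: int, message: list, f: dict, u: int) -> bytes:
    """Source side: MAC = {id(k), r, h}_k, as in equation (6.6)."""
    h = xor_hash(message, chain(f, seed, u))
    return toy_cipher(key, struct.pack(">III", key_id, seed, h))


def verify_xor(received: list, macs: list, key: bytes, key_id: int, f: dict, u: int) -> bool:
    """Forwarder side: steps 2 to 6 for one shared key."""
    if not macs:
        return False
    combined, seed = 0, None
    for mac in macs:
        mac_key_id, mac_seed, h = struct.unpack(">III", toy_cipher(key, mac))
        if mac_key_id != key_id:
            return False  # wrong key or corrupted MAC
        if seed is None:
            seed = mac_seed
        elif seed != mac_seed:
            return False  # MACs under one key must carry the same seed
        combined ^= h  # combine the embedded hashes
    return xor_hash(received, chain(f, seed, u)) == combined


if __name__ == "__main__":
    f = {1: 2, 2: 3, 3: 4, 4: 5, 5: 1}
    m_i, m_j = [1, 2, 3, 4, 5], [10, 20, 30, 40, 50]
    e = [a ^ b for a, b in zip(m_i, m_j)]
    key, key_id, seed, u = b"shared-key", 7, 1, 3
    macs = [make_mac(key, key_id, seed, m, f, u) for m in (m_i, m_j)]
    print(verify_xor(e, macs, key, key_id, f, u))
```

With seed 1 and u = 3, this MAC covers the codewords with indexes 2, 3 and 4. A change to one of those codewords is rejected, whereas a change to codeword 1 passes this particular check, which is why the scheme overlaps the codewords covered by different MACs.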
6.4.3 Batch Verification
Our scheme supports batch verification, which can further reduce computation overhead and speed up message verification. Suppose a node receives three messages Ea, Eb and Ec.
For XOR network coding, it generates a new message E = Ea⊕ Eb⊕ Ec. For normal network
coding, it first chooses three random coefficients γa, γb, and γc, then, generates a new message
E = γaEa+ γbEb+ γcEc. The node further calculates E’s encoding vector as an XOR or linear
combination (with coefficients γa, γb, and γc) of those of messages Ea, Eb and Ec. And it also
attaches to E all unique MACs appended to Ea, Eb and Ec. Finally, it verifies E as normal.
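For normal network coding, batch verification relies on the linearity of the hash in equation (6.9): the hash of a linear combination of messages equals the same linear combination of their hashes, modulo q. The following sketch checks this on small illustrative values; the modulus q = 257, the coefficients and the function names `linear_hash` and `combine` are assumptions made for the example.

```python
def linear_hash(codewords: list, indexes: list, betas: list, q: int) -> int:
    """Linear hash of the selected codewords (1-based indexes), as in equation (6.9)."""
    return sum(b * codewords[i - 1] for b, i in zip(betas, indexes)) % q


def combine(messages: list, gammas: list, q: int) -> list:
    """Codeword-wise linear combination E = sum_a gamma_a * E_a mod q."""
    length = len(messages[0])
    return [sum(g * msg[k] for g, msg in zip(gammas, messages)) % q
            for k in range(length)]


if __name__ == "__main__":
    q = 257
    e_a, e_b, e_c = [1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]
    gammas = [2, 3, 5]               # random coefficients in practice
    indexes, betas = [1, 3, 4], [7, 11, 13]

    e = combine([e_a, e_b, e_c], gammas, q)
    lhs = linear_hash(e, indexes, betas, q)
    rhs = sum(g * linear_hash(x, indexes, betas, q)
              for g, x in zip(gammas, (e_a, e_b, e_c))) % q
    print(lhs == rhs)
```

A single hash comparison on E therefore stands in for one comparison per input message, which is where the saving in computation comes from.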
If the new message passes verification, all the input messages are accepted. Otherwise, one or more messages must have been polluted. In this case, further verification should be carried out to find the polluted one(s). The node needs to re-check each input message individually or use batch verification repeatedly on subsets of the input messages. For example, we can speed up re-checking by using binary-checking, which is similar to the binary-search algorithm. Binary-checking splits the suspected input messages into two halves at each step, and encodes and checks each half separately. If a half passes, all of its messages are accepted. Otherwise,
the two sub-halves of the suspected half are re-checked. This binary-checking process is carried out iteratively until all polluted messages are found.
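Binary-checking can be sketched as a short recursive procedure. The predicate `batch_ok` is a hypothetical stand-in for batch verification of a subset of input messages (encode the subset, then verify the result as normal); in the usage example it is replaced by a simple lookup so that the control flow can be followed.

```python
def find_polluted(messages: list, batch_ok) -> list:
    """Return the polluted messages by recursively halving failed batches.

    batch_ok(batch) is assumed to return True when the batch passes
    batch verification and False otherwise.
    """
    if not messages or batch_ok(messages):
        return []                      # whole batch accepted
    if len(messages) == 1:
        return list(messages)          # a single failing message is polluted
    mid = len(messages) // 2
    return (find_polluted(messages[:mid], batch_ok)
            + find_polluted(messages[mid:], batch_ok))


if __name__ == "__main__":
    polluted = {3, 6}                  # illustrative ground truth

    def batch_ok(batch):
        # Stand-in for real batch verification of the encoded subset.
        return not any(msg in polluted for msg in batch)

    print(find_polluted(list(range(8)), batch_ok))
```

When only a few inputs are polluted, most halves pass on the first check, so the number of verifications stays well below one per message; when many inputs are polluted, the procedure degrades toward checking messages individually.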