In this section, we prove Theorem 3.9 by giving an algorithm for $(k, q)$-BinRankApx($\mathbb{F}$). All arithmetic operations, binary rank, etc. in this section are taken over the given field $\mathbb{F}$, although we will not mention this explicitly. First we show how to extend the kernelization rule described in Section 3.3 to the approximate version. The idea is similar: we remove identical rows and columns, but some extra book-keeping is needed. The reason is that one entry of the reduced matrix may represent two or more entries of the original matrix, say $p$ entries, and making an error in such an entry actually means making $p$ errors. So we keep track of the number of entries represented by an entry $A_{i,j}$, which we call the error cost $E_{i,j}$. Initially all error costs are 1. For two matrices $A$ and $A'$, we define
$$E(A, A') = \sum_{i,j \,:\, A_{i,j} \neq A'_{i,j}} E_{i,j}.$$
We will solve the following more general problem:
$(k, q)$-BinRankApxCost($\mathbb{F}$)
Input: $A \in \mathbb{Z}^{m \times n}$ and an error cost matrix $E \in \mathbb{Z}_{\geq 1}^{m \times n}$
Output: whether there exists a matrix $A' \in \mathbb{Z}^{m \times n}$ such that $E(A, A') \leq q$ and the binary rank of $A'$ over $\mathbb{F}$ is at most $k$
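To make the cost function concrete, the following is a minimal sketch of $E(A, A')$ for matrices stored as lists of lists. The function name `weighted_error` and the example matrices are assumptions made for illustration and do not appear in the text.

```python
def weighted_error(A, A_prime, E):
    """Sum the error costs E[i][j] over all positions where A and A_prime differ."""
    return sum(
        E[i][j]
        for i in range(len(A))
        for j in range(len(A[0]))
        if A[i][j] != A_prime[i][j]
    )


# Hypothetical example: the matrices differ only at position (0, 1),
# whose error cost is 3.
A = [[1, 0], [1, 1]]
A_prime = [[1, 1], [1, 1]]
E = [[1, 3], [2, 1]]
print(weighted_error(A, A_prime, E))
```

With all error costs equal to 1, this is simply the number of positions in which the two matrices differ.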
The reduction rule for kernelization is as follows:
Reduction rule 3.28. If the $i$th and $j$th rows of $A$ are identical, set $E_{i,\ell} := E_{i,\ell} + E_{j,\ell}$ for all $\ell \in [n]$ and then delete the $j$th row from $A$ and $E$. Similarly, if the $i$th and $j$th columns of $A$ are identical, set $E_{\ell,i} := E_{\ell,i} + E_{\ell,j}$ for all $\ell \in [m]$ and then delete the $j$th column from $A$ and $E$.
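The reduction rule can be sketched as follows. This is an illustration under stated assumptions, not the implementation from Section 3.3: it groups identical rows with a dictionary keyed on the row contents, and the names `merge_identical_rows`, `transpose`, `kernelize`, `row_rep` and `col_rep` are hypothetical. Columns are handled by transposing and reusing the row routine. Merging columns cannot make two distinct rows equal, since two rows that differ in some column still differ in the column that represents it, so one pass over rows followed by one pass over columns suffices.

```python
def merge_identical_rows(A, E):
    """Merge identical rows of A, adding up the error costs of merged rows.

    Returns (A_hat, E_hat, rep), where rep[r] is the index in A_hat of the
    row that represents original row r.
    """
    first_seen = {}  # row contents -> index of that row in the reduced matrix
    A_hat, E_hat, rep = [], [], []
    for row, cost in zip(A, E):
        key = tuple(row)
        if key in first_seen:
            k = first_seen[key]
            # Duplicate row: add its costs to the representative's costs.
            E_hat[k] = [a + b for a, b in zip(E_hat[k], cost)]
        else:
            first_seen[key] = len(A_hat)
            A_hat.append(list(row))
            E_hat.append(list(cost))
        rep.append(first_seen[key])
    return A_hat, E_hat, rep


def transpose(M):
    return [list(col) for col in zip(*M)]


def kernelize(A, E):
    """Apply the rule to rows, then to columns (via the transpose)."""
    A1, E1, row_rep = merge_identical_rows(A, E)
    A2t, E2t, col_rep = merge_identical_rows(transpose(A1), transpose(E1))
    return transpose(A2t), transpose(E2t), row_rep, col_rep


# Hypothetical 3 x 3 instance with unit error costs.
A = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
E = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
A_hat, E_hat, row_rep, col_rep = kernelize(A, E)
```

In this example the entry of `E_hat` at position (0, 0) ends up as 4, because it stands for the four entries of `A` lying in rows 0 and 1 and columns 0 and 2.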
For proving the correctness of the rule, first we need to show that it is fine to make only identical errors in the identical rows/columns.
Lemma 3.29. Consider a matrix $A$ whose $i$th and $j$th rows (or columns) are identical. If there is a matrix $A'$ such that $E(A', A) \leq q$, then there is a matrix $A''$ such that $E(A'', A) \leq q$, the binary rank of $A''$ is at most the binary rank of $A'$, and $A''_{i,:} = A''_{j,:}$ (or $A''_{:,i} = A''_{:,j}$, respectively).
Proof. We will prove the lemma for rows and the proof for columns follows by symmetry.
Assume without loss of generality that $E(A'_{i,:}, A_{i,:}) \leq E(A'_{j,:}, A_{j,:})$. Let $A''$ be the matrix obtained by replacing $A'_{j,:}$ with $A'_{i,:}$ in $A'$. It is easy to see that $A''$ has binary rank at most that of $A'$ and that $E(A'', A) \leq E(A', A)$.
Consider an instance $(A, E)$ of $(k, q)$-BinRankApxCost($\mathbb{F}$). Suppose rows $i$ and $j$ of $A$ are identical and we delete row $j$ from $A$ according to Reduction rule 3.28. Let $\hat A$ be the matrix after deletion of row $j$, and let $\hat E$ be the new error cost matrix, modified as given in Reduction rule 3.28. Suppose $\hat A'$ is a solution for the instance $(\hat A, \hat E)$. Let $A'$ be the matrix obtained from $\hat A'$ by copying the $i$th row and inserting it as the $j$th row (increasing the indices of the following rows by 1). We prove that $A'$ has binary rank at most $k$ as follows. Since $\hat A'$ has binary rank at most $k$, there exist $\hat B \in \{0, 1\}^{(m-1) \times k}$ and $\hat C \in \{0, 1\}^{k \times n}$ such that $\hat B \hat C = \hat A'$. Let $B$ be the matrix obtained from $\hat B$ by copying the $i$th row and inserting it as the $j$th row (increasing the indices of the following rows by 1). Then $B \hat C = A'$, and hence $A'$ has binary rank at most $k$. By the construction of $\hat E$ and $\hat A$, it is easy to see that $E(A', A) = \hat E(\hat A', \hat A) \leq q$. Hence a solution of $(\hat A, \hat E)$ can be converted to a solution of $(A, E)$ in $O(n)$ time. We also need to show the converse direction, namely that a solution to $(A, E)$ yields a solution to $(\hat A, \hat E)$. So let $A'$ be a solution to $(A, E)$. We can assume due to Lemma 3.29 that $A'_{i,:} = A'_{j,:}$. Let $\hat A' := A'_{[m] \setminus \{j\},:}$. Then,
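$$\begin{aligned}
\hat E(\hat A', \hat A) &= E(A', A) - E(A'_{i,:}, A_{i,:}) - E(A'_{j,:}, A_{j,:}) + \hat E(\hat A'_{i,:}, \hat A_{i,:}) \\
&= E(A', A) - \sum_{\ell : A_{i,\ell} \neq A'_{i,\ell}} E_{i,\ell} - \sum_{\ell : A_{j,\ell} \neq A'_{j,\ell}} E_{j,\ell} + \sum_{\ell : \hat A_{i,\ell} \neq \hat A'_{i,\ell}} \hat E_{i,\ell} \\
&= E(A', A) - \sum_{\ell : A_{i,\ell} \neq A'_{i,\ell}} E_{i,\ell} - \sum_{\ell : A_{i,\ell} \neq A'_{i,\ell}} E_{j,\ell} + \sum_{\ell : A_{i,\ell} \neq A'_{i,\ell}} \left( E_{i,\ell} + E_{j,\ell} \right) \\
&= E(A', A) \leq q.
\end{aligned}$$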
Hence $\hat A'$ is a solution to $(\hat A, \hat E)$, and the reduction rule is correct. We gave the argument only for the deletion of rows, but the argument for columns is similar. Some more book-keeping is also needed: for each deleted row (or column), we need to remember the row (or column) that was identical to it, so that we can reconstruct the corresponding row of $B$ (or column of $C$). This is straightforward, and the reconstruction procedure takes only $O(mn)$ time. The whole kernelization can be implemented in $O(mn(\log m + \log n))$ time, in the same way as the one in Section 3.3.
After the reduction rule has been applied exhaustively, the number of rows and the number of columns are each at most $2^k + q$. The reason is that at most $q$ rows (or columns) can contain errors, and if we remove those rows from $A$ and $A'$, the remaining parts of the two matrices are the same. Hence there exist $q$ rows (and $q$ columns) of $A$ whose removal gives a matrix with binary rank at most $k$. The remaining part can therefore have at most $2^k$ rows by Lemma 3.19. Thus we have the following lemma.
Lemma 3.30. If $(A, E)$ is a YES instance of $(k, q)$-BinRankApxCost($\mathbb{F}$) and $A$ has all rows and columns distinct, then $A$ has at most $2^k + q$ rows and at most $2^k + q$ columns.
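The book-keeping for undoing the reduction rule, mentioned before the lemma, can be sketched as follows. The arrays `row_rep` and `col_rep` are hypothetical: they record, for every original row (or column), which row (or column) of the kernel represents it. The helper `matmul` uses ordinary integer arithmetic, which is an assumption made only so that the example can be checked; the text works over an arbitrary field.

```python
def lift_factors(B_hat, C_hat, row_rep, col_rep):
    """Expand a decomposition of the kernel to one of the original matrix.

    Row r of B is a copy of row row_rep[r] of B_hat, and column c of C is a
    copy of column col_rep[c] of C_hat.
    """
    B = [list(B_hat[r]) for r in row_rep]
    C = [[c_row[c] for c in col_rep] for c_row in C_hat]
    return B, C


def matmul(X, Y):
    # Plain integer matrix product (assumption: arithmetic as over the rationals).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


# Hypothetical kernel decomposition and representative maps.
B_hat = [[1, 0], [0, 1]]
C_hat = [[1, 0], [0, 1]]
row_rep = [0, 0, 1]   # original rows 0 and 1 were identical
col_rep = [0, 1, 0]   # original columns 0 and 2 were identical
B, C = lift_factors(B_hat, C_hat, row_rep, col_rep)
A_lifted = matmul(B, C)
```

Each copied row or column costs time proportional to its length, and the inner dimension $k$ is unchanged, so the lifted pair certifies binary rank at most $k$ for the lifted matrix.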
Now, we guess the error positions in $A$. Note that there can be at most $q$ error positions, as each position has error cost at least 1. Since the total number of positions is at most $(2^k + q)(2^k + q)$, the number of possible guesses is at most
$$\binom{(2^k + q)(2^k + q) + 1}{q} = O\left(2^{2kq} + 2^{2q \log q} + 2^{kq + q \log q}\right).$$
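The guessing step can be illustrated by enumerating candidate sets of error positions in the kernel. The sketch below is an assumption-laden illustration: the name `error_position_guesses` is hypothetical, and discarding sets whose total error cost exceeds $q$ is an extra pruning step that the counting argument above does not rely on.

```python
from itertools import combinations


def error_position_guesses(E, q):
    """Yield every set of positions whose total error cost is at most q."""
    positions = [(i, j) for i in range(len(E)) for j in range(len(E[0]))]
    # Every position costs at least 1, so no valid set has more than q positions.
    for size in range(q + 1):
        for guess in combinations(positions, size):
            if sum(E[i][j] for i, j in guess) <= q:
                yield guess


# Hypothetical kernel cost matrix and error budget.
E = [[4, 2], [2, 1]]
guesses = list(error_position_guesses(E, 3))
```

For this cost matrix and $q = 3$, the position with cost 4 can never be an error position, which shows how the error costs restrict the guesses.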
Once we have guessed the positions of the errors, we can compute the required $A'$ as follows. At the positions where there are no errors, we know that $A'_{i,j} = A_{i,j}$. It only remains to compute the entries at the positions where errors occur. Let $(B, C)$ be a $k$-rank binary decomposition of $A'$ over $\mathbb{F}$. The entry $A'_{i,j}$ is completely determined by $B_{i,:}$ and $C_{:,j}$. Hence the entries at the error positions of $A'$ are completely determined by at most $q$ rows of $B$ and $q$ columns of $C$. We guess these rows and columns, which adds a multiplicative factor of at most $2^{2kq}$ to the running time. Now we have completely fixed $A'$.
Now it only remains to check whether $A'$ has binary rank at most $k$. For this we can just call the algorithm for $k$-BinRank($\mathbb{F}$) on $(A', k)$. This runs in time $O\left(2^{2k^2 + 2k} (2^k + q)^2 \log(2^k + q)\right)$ according to Theorem 3.6. Thus the total running time is
$$O\left(\left(2^{2kq} + 2^{2q \log q} + 2^{kq + q \log q}\right) 2^{2kq + 2k^2 + 4k} (k + \log q) + mn(\log m + \log n)\right).$$
This completes the proof of Theorem 3.9.