I never expected to dedicate a casual blog entry to a hardness proof, especially one fairly academic as this. However, the nature of it evokes certain elegance beyond what I typically encounter in the domain. Authentic theoreticians may not sympathize with this excitement. Yet I find enough aesthetic substance in the matter to indulge in the exercise that follows.
Subset Sum: given a set A of integers, and an integer k, does some subset S of A sum to k?
Subset Sum is a NP Complete problem, possessing a pseudo-polynomial dynamic-programming solution. However, I’m less intrigued by the NP proof and more by the NP-hardness proof which I present below.
Subset Sum is NP hard
Proof: We’ll construct a polynomial reduction from 3-CNFSAT (3-literal per clause Conjunctive Normal Form Satisfiability) problem. Let Ψ be a 3-CNF formula with n variables and m clauses. Let’s build set A as follows:
For each variable xi, 1 ≤ i ≤ n, append the following two integers to A:
- yi = 10n+m-i + c1ic2i…cmi, where digit cji is 1 if literal xi appears in clause j, and 0 otherwise.
- zi = 10n+m-i + d1id2i…dmi, where digit dji is 1 if the negative literal ¬xi appears in clause j, and 0 otherwise.
Then for each clause cj, add two rows
- gj = 10m-j
- hj = 10m-j
Finally, let k = int(1n3m) (exponents in string arithmetic). This concludes our reduction.
For example, the formula
(x1 ∨ x_2 ∨ x_3) ∧ (¬x2 ∨ x3 ∨ x4) ∧ (x1 ∨ ¬x2 ∨ ¬x3)
yields the following set A and k:
Note, k = 1111333 for the current example, given 4 variables and 3 clauses.
- The reduction trivially constructs A in polynomial time.
Assignment t satisfies Ψ ==> exists subset S of A that sums to k:
Construct S as follows:
- Add yi to S if t(xi) is true, and zi if t(xi) is false.
- For each clause cj, add gj to S if less than 3 literals are satisfied in cj, and additionally add hj if less than 2 literals are satisfied.
The elements of S sum to k: since we add only yi or zi to S, every significant (variable) digit will contain a 1, and every insignificant (clause) digit will contain a 3, the sum of the number of literals satisfied in cj plus the number of literals not satisfied (maximum two), compensated by gj and hj.
Assignment t doesn’t satisfy Ψ ==> no subset S of A exits that sums to k:
Since we cannot satisfy Ψ, exists some clause cj that cannot be satisfied irrespective of the truth assignment. Given an arbitrary assignment t, clause cj doesn’t contain a single literal that would satisfy it, and the yi or zi added to S would contain a 0 in the respective decimal point. Consequently, S can only sum to 2 in that decimal point from the compensating gj and hj.
End of proof
This proof draws inspiration from CLRS, but as I don’t presently have access to the full text, I can’t ascertain the original source. Does my somewhat terse coverage make sense? Do you find this sort of proof admirable, or entirely unremarkable?