I never expected to dedicate a casual blog entry to a hardness proof, especially one fairly academic as this. However, the nature of it evokes certain elegance beyond what I typically encounter in the domain. Authentic theoreticians may not sympathize with this excitement. Yet I find enough aesthetic substance in the matter to indulge in the exercise that follows.
Introduction
Subset Sum: given a set A of integers, and an integer k, does some subset S of A sum to k?
Subset Sum is a NP Complete problem, possessing a pseudo-polynomial dynamic-programming solution. However, I’m less intrigued by the NP proof and more by the NP-hardness proof which I present below.
Subset Sum is NP hard
Proof: We’ll construct a polynomial reduction from 3-CNFSAT (3-literal per clause Conjunctive Normal Form Satisfiability) problem. Let Ψ be a 3-CNF formula with n variables and m clauses. Let’s build set A as follows:
For each variable xi, 1 ≤ i ≤ n, append the following two integers to A:
- yi = 10n+m-i + c1ic2i…cmi, where digit cji is 1 if literal xi appears in clause j, and 0 otherwise.
- zi = 10n+m-i + d1id2i…dmi, where digit dji is 1 if the negative literal ¬xi appears in clause j, and 0 otherwise.
Then for each clause cj, add two rows
- gj = 10m-j
- hj = 10m-j
Finally, let k = int(1n3m) (exponents in string arithmetic). This concludes our reduction.
For example, the formula
(x1 ∨ x2 ∨ x3) ∧ (¬x2 ∨ x3 ∨ x4) ∧ (x1 ∨ ¬x2 ∨ ¬x3)
yields the following set A and k:
A | x1x2x3x4,c1c2c3 |
---|---|
y1 | 1 0 0 0, 1 0 1 |
z1 | 1 0 0 0, 0 0 0 |
y2 | 1 0 0, 1 0 0 |
z2 | 1 0 0, 0 1 1 |
y3 | 1 0, 1 1 0 |
z3 | 1 0, 0 0 1 |
y4 | 1, 0 1 0 |
z4 | 1, 0 0 0 |
g1 | 1 0 0 |
h1 | 1 0 0 |
g2 | 1 0 |
h2 | 1 0 |
g3 | 1 |
h3 | 1 |
k | 1 1 1 1 3 3 3 |
Note, k = 1111333 for the current example, given 4 variables and 3 clauses.
Reduction correctness
- The reduction trivially constructs A in polynomial time.
Assignment t satisfies Ψ ==> exists subset S of A that sums to k:
Construct S as follows:
- Add yi to S if t(xi) is true, and zi if t(xi) is false.
- For each clause cj, add gj to S if less than 3 literals are satisfied in cj, and additionally add hj if less than 2 literals are satisfied.
The elements of S sum to k: since we add only yi or zi to S, every significant (variable) digit will contain a 1, and every insignificant (clause) digit will contain a 3, the sum of the number of literals satisfied in cj plus the number of literals not satisfied (maximum two), compensated by gj and hj.
Assignment t doesn’t satisfy Ψ ==> no subset S of A exits that sums to k:
Since we cannot satisfy Ψ, exists some clause cj that cannot be satisfied irrespective of the truth assignment. Given an arbitrary assignment t, clause cj doesn’t contain a single literal that would satisfy it, and the yi or zi added to S would contain a 0 in the respective decimal point. Consequently, S can only sum to 2 in that decimal point from the compensating gj and hj.
End of proof
Conclusion
This proof draws inspiration from CLRS, but as I don’t presently have access to the full text, I can’t ascertain the original source. Does my somewhat terse coverage make sense? Do you find this sort of proof admirable, or entirely unremarkable?
Questions, comments? Connect.