Subset Sum algorithm

Subset Sum is the first NP-complete problem I learned at Macalester. This question is viewed 36000+ times yet I don’t see a sufficient answer that explains the algorithm in detail with logic. So I thought I make an attempt to do so.

Assumption:

For the sake of simplicity first I made the assumption that the input set X contains only positive integers and k is positive. However, we can tweak the algorithm to handle negative integers and the case if k is negative.

Logic:

The key to this algorithm or really any DP problem is to break down the problem and start simply from a base case. then we can build on the base case using some knowledge that we know:

  1. we know that if the set X is empty then there is no way we can sum to any value of k.
  2. If a set X contains k then it has a subset sum to k.
  3. we know that if a subset of set x1 who is a subset of X sum to k1 then X will have a subset that sum to k1 namely x1.
  4. we have a set X = {x1, x1, x3, ......., xn, xn+1}. We know it has a subset sum to k1 if x1 = {x1, x1, x3, ......., xn} has a subset sum to k - k1.

Example to illustrate 1,2,3,4:

  1. it is easy. if you have an empty set {}. you can’t have a subset thus
    you can’t have any subset sum.
  2. A set X = {4} has a subset sum to 4 because 4 it self is part of the set

  3. say you have a set x1 = {1,3,5} who is a subset of set X = {1,3,5,2,8}. if x1 has a subset sum to k1 = 8 then that means X also has a subset sum to 8 because x1 is a subset of X

  4. say you have a set X = {1,3,5,2,19} and we want to know does it have a subset sum to 20. It does and one way to can know if that is x1 = {1,3,5,2} can sum to (20 – 19) = 1. Since x1 has a subset sum to 1 then when we add 19 to the set x1
    we can take that new number 1 + 19 = 20 to create our desired sum 20.

Dynamically build a matrix
Cool! now let us utilize the above four logics and start building from the base case. We are going to build a matrix m. We define:

  • matrix m has i+1 rows and k + 1 columns.

  • Each cell of the matrix has value true or false.

  • m[i][s] returns true or false to indicate the answer to this question: “using the first i items in the array can we find a subset sum to s? ” m[i][s]returns true for yes and false for no

(note the Wikipedia answer or most of the people build a function m(i,s) but I thought the matrix is an easy way to understand dynamic programming. It works well when we only have positive numbers in the set or array. However the function route is better because you don’t have to deal with index out of range, match index of the array and sum to the matrix…..)

Let us build the matrix using an example:

X = {1,3,5,2,8}
k = 9

We are going to build the matrix row by row. We ultimately want to know the cell m[n][k] contain true or false.

First Row:
Logic 1. told us that the first row of the matrix should all be false.

   0 1 2 3 4 5 6 7 8 9
   _ _ _ _ _ _ _ _ _ _
0| F F F F F F F F F F
1|
2|
3|
4|
5|

Second Row and above:
Then for the second row or above, we can use logic 2,3,4 to help us populate the matrix.

  • logic 2 tells us that m[i][s] = (X[i-1] == s) rememebr m[i] is referring to the ith item in X which is X[i-1]
  • logic 3 tells us that m[i][s] = (m[i-1][s]) this is looking at the cell direclty above.
  • logic 4 tells us that m[i][s] = (m[i-1][s - X[i-1]]) this is looking at the row above and left of X[i-1] cells.

If any of these is true then m[i][s] is true otherwise false. so we can rewrite 2,3,4 into m[i][s] = (m[i-1][s] || a[i-1] == s || m[i-1][s - a[i-1]])

Use these above logics to populate the matrix m. In our example, it looks like this.

   0 1 2 3 4 5 6 7 8 9
   _ _ _ _ _ _ _ _ _ _
0| F F F F F F F F F F
1| F T F F F F F F F F
2| F T F T T F F F F F 
3| F T F T T T T F T T
4| F T T T T T T T T T 
5| F T T T T T T T T T

Now use the matrix to answer your question:

look at m[5][9] which is the original question. using the first 5 items (which is all the items) can we find a subset sum to 9 (k)? and the answer is indicated by that cell which is true

Here is the Code:

import java.util.*;

public class SubSetSum {

    public static boolean subSetSum(int[] a, int k){

        if(a == null){
            return false;
        }

        //n items in the list
        int n = a.length; 
        //create matrix m
        boolean[][] m = new boolean[n + 1][k + 1]; //n + 1 to include 0, k + 1 to include 0 

        //set first row of matrix to false. This also prevent array index out of bounds: -1
        for(int s = 0; s <= k; s++){
            m[0][s] = false;
        }

        //populate matrix m
        for(int i = 1; i <= n; i++){
            for(int s = 0; s <= k; s++){    
                if(s - a[i-1] >= 0){ //when it goes left we don't want it to go out of bounds. (logic 4)
                    m[i][s] = (m[i-1][s] || a[i-1] == s || m[i-1][s - a[i-1]]); 
                } else {
                    m[i][s] = (m[i-1][s] || a[i-1] == s);
                }       

            }
        }

        //print matrix
        print(m);

        return m[n][k];

    }

    private static void print(boolean[][] m){
        for(int i = 0; i < m.length; i++){
            for(int j = 0; j < m[i].length; j++){
                if(m[i][j]){
                    System.out.print("T");
                } else {
                    System.out.print("F");
                }           
            }
            System.out.print("\n");
        }
    }

    public static void main(String[] args){
        int[] array = {1,3,5,2,8};
        int k = 9;

        System.out.println(subSetSum(array,k));

    }
}

To build the matrix m takes O((n+1)(k+1)) which is O(nk). it seems like it should be polynomial but it is not! It’s actually pseudo polynomial. Read about it here

Again this only works if the input only contains positive numbers. You can easily tweak it to work with negative numbers tho. The matrix would still have n+1 rows but B - A + 1 columns. Where B is the upperbound and A is the lowerbound ( +1 to include zero).The matrix would still be You would have to offset s with the lowerbound.

It is pretty hard to explain DP problem over text from start to end. But I hope this helps those out there that try to understand this problem.

Note that in the examples above the rows of the DP table are sorted. That does not have to be the case.

Here is a DP table for the case of the question, i.e. given a set of {5, 3, 11, 8, 2}. For brevity, I have omitted the false values.

┌─────────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┐
│ (index) │  0   │  2   │  3   │  5   │  7   │  8   │  10  │  11  │  13  │  14  │  15  │  16  │
├─────────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┤
│    0    │ true │      │      │      │      │      │      │      │      │      │      │      │
│    5    │ true │      │      │ true │      │      │      │      │      │      │      │      │
│    3    │ true │      │ true │ true │      │ true │      │      │      │      │      │      │
│    11   │ true │      │ true │ true │      │ true │      │ true │      │ true │      │ true │
│    8    │ true │      │ true │ true │      │ true │      │ true │ true │ true │      │ true │
│    2    │ true │ true │ true │ true │ true │ true │ true │ true │ true │ true │ true │ true │
└─────────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┘

Below is an implementation in JavaScript which will output the target set {5, 11}:

var subSetSum = function(input, sum) {

    let y = input.length;
    let x = sum;

    if(input.length === 0) return 0;

    let d = [];

    //fill the rows
    for (let i = 0; i <= y; i++) {
      d[i] = [];
      d[i][0] = true;
    }
    
    for (let j = 1; j <= y; j++) { //j row
      for (let i = 1; i <= x; i++) { //i column
      let num = input[j-1];
        if(num === i) {
          d[j][i] = true;
        } else if(d[j-1][i]) {
          d[j][i] = true;
        } else if (d[j-1][i-num]) {
          d[j][i] = true;
        }
      }
    }
    
    //console.table(d); //uncomment to see the table
    if(!d[y][x]) return null;

    let searchedSet = [];
    for(let j=input.length, i=sum; j>0 && i != 0; j--) {
      if(input[j-1] !== i) {
        while(d[j-1][i]) { // go up
          j--;
        }
      }
      searchedSet.push(input[j-1]);
      i = i-input[j-1];
    }

    return searchedSet;
};

console.log('searched set:'+ JSON.stringify(subSetSum([5, 3, 11, 8, 2], 16)));

Leave a Comment