Subset Sum Problem-Dynamic programming

Subset sum problem

Given a set or an array of integers, find if there is subset with a given sum K. It is know as subset sum problem. For example, if array  A = [2,3,5,6,7,8,9] and K = 15, subsets which have K as sum are [ 3,5,7], [7,8], [6,9],[2,5,8]. Answer to the problem will be True.

Brute force solution is to find all subsets of array and check each one of them to see if it’s members add up to given sum. There can be 2n subsets for a set with n elements and hence the complexity of this solution is exponential for obvious reasons.

Understanding subset sum problem

First thing we must understand is that in order to apply dynamic programming to any problem, it should satisfy two basic conditions. First, it can be subdivided into smaller subproblems and solutions to those subproblems lead to solution of original problem. Second, subproblems should be overlapping, so that optimization can be done using memorization.

Here, ask is to find a subset of set whose sum does not exceed S. This condition is similar to what we had in knapsack problem. There we had a limited capacity and we could take anything which does not exceed the limit. However there is a difference, in knapsack problem allows to take items less than capacity if value was greater in all others combinations. In subset sum problem discard subsets which have sum less or more than S.

What was strategy to solve knapsack problem? Yes, take each item available and check if it fits in constraint in current context. If it does, add item to solution and reduce problem to subproblem with N-1 items and reduced capacity of knapsack ( C-v ) where v is value of included item. If item does not satisfy constraint, ignore it and reduce problem to N-1 items and knapsack capacity C. Same approach can be used to solve subset sum problem too.

To apply DP to problem, let’s first come up with recursive solution to problem. Consider, first element of set to add as part of subset?  What all possible scenarios can happen here? Either element is added to subset or it is not added to subset. Right?

What if element is included to subset? If element is included, problem reduces to n-1 elements with sum to be found as S-value of element.

If element is not added to subset, it can happen if including element will make subset sum greater than S. Even then, problem reduces to n-1 elements, with required sum still being S.

Recurrence relation for subset sum problem

isSubsetSum(arr, i, n, sum)  = isSubsetSum(A, n-1, S-A[i]) 
                             if item A[i] can be included in subset 
                             isSubsetSum(A, n-1, sum)
                             if item A[i] cannot be included in subset

What would be the base case for recursive function? At any given point of time, if required sum is zero, subset is found; return true. Else, if all elements of array are considered and yet required is non zero, there is no subset with given sum; hence, return false.

If S == 0  return true
If n == 0 && S != 0 return false

With some book keeping we can also print all subsets with given sum.

Subset sum problem implementation

#include<stdlib.h>
#include<stdio.h>
 
#define true 1
#define false 0
int isSubsetSum(int arr[], int n, int sum ){
	printf("\nCalled with sum : %d, N = %d", sum, n );
	if(!sum){
		return true;
	}
	if(sum < 0) return false;
 
	if(n <= 0  && sum > 0)  return false;
	return isSubsetSum(arr, n-1, sum-arr[n])  
         + isSubsetSum(arr, n-1, sum );
}
 
/* Driver program */
int main(){
 
  int set[] = {1,3,5,4,6};
  int n  =  sizeof(set)/sizeof(set[0]);
 
  printf("\n Is there subset with given sum  : %s",
              isSubsetSum(set, n-1, 10 ) ? "Yes" : "No");
  return 0;
}

Complexity of solving subset sum problem using recursion is (2n). It is because, in worst case, all subsets (of length 1 to N ) of set are considered.

Subset sum problem using dynamic programming

Look at computation tree of recursive function calls, it will be evident that subproblems are recalculated. Can this information be stored somehow, so that it not recalculated? Answer to question is yes.

Let’s create a two dimensional table with size S * N called Subset. Subset[i][j] is true, if there is a subset in A[0..j-1] with sum i. Otherwise Subset[i][j] is false.Final goal is to find value of Subset[S][N], which will be 1 if there is a subset of array with sum equal to S. To calculate Subset[S][N], we have to fill table.

Subset[i][j] is true iff one of the following two conditions is true:

1. Subset[i][j-1] is true. If Subset[i][j-1] is true, it means that there is a subset with sum i in A[0..j-1], so obviously, there must be subset with sum i in A[0..j]

2. Subset[i-A[j]][j-i] is true. If Subset[i-A[j]][j-i]  is true, it means that there is subset with sum of (i–A[j]) in A[0..j-1]. Now if we select A[j], then a subset with sum of i can be obtained. Therefore Subset[i][j] is true.

Set Subset[0][j] = true (for 0<=j<=N). Its possible to obtain a sum = 0 from any value of j with empty subset.

Implementation of dynamic programming algorithm for subset sum problem

#include<stdlib.h>
#include<stdio.h>
 
#define true 1
#define false 0
 
int isSubsetSum(int arr[], int n, int sum)
{
  /* The value of subset[i][j] will be true if there is a 
  subset of set[0..j-1] */
  int subset[sum+1][n+1];
  int i,j;
 
  /* If sum == 0, there will be empty set satisfying that condition
  hence row with sum = 0 will have all true values. */
  for (i = 0; i <= n; i++)
    subset[0][i] = true;
 
  /* If sum is not 0 and set is empty, no subset is there */ 
  for (i = 1; i <= sum; i++)
    subset[i][0] = false;
 
  for (i=1; i<=sum; i++){
    for (j=1; j<=n; j++){
        /* If there is subset with j-1 elements, copy value */
        subset[i][j] = subset[i][j-1];
 
        /* Now if the latest element can be added */
        if (i >= arr[j-1])
            subset[i][j] = subset[i][j] || subset[i-arr[j-1]][j-1];
    }
  }
  return subset[sum][n];
}

/* Driver program */
int main(){
 
  int set[] = {1,3,5,4,6};
  int n  =  sizeof(set)/sizeof(set[0]);
  int subset[n];
  printf("Is there a subset with given sum : %s", 
  			isSubsetSum(set, n, 10) ? "Yes" : "No");
  return 0;
}

Dynamic programming algorithm has time complexity O(N*S) and space complexity O(N*S) where N is size of set and S is sum required.

Please share if you find something wrong or missing. Also, if you want to contribute to website, please refer Publishing and contact us. We would love to publish your article and at the same time, will pay you too.

  • M8

    > 18 return isSubsetSum(arr, n-1, sum-arr[n], subset, count + 1)
    > 19 + isSubsetSum(arr, n-1, sum, subset, count );
    In the first approach, why do you use ‘+’ and not the logical OR operator?

    • http://algorithmsandme.in/ Jitendra Sangar

      We need to print all subsets with given sum. If we do logical OR, then there will be short circuiting and second function call will never be executed once the first function returns true. If the question was to just check if there is a subset with given sum, then logical OR would have work. Hope this makes sense.

  • javabinder.blogspot.in

    Thank you jitu….nice work !!!!!!

    • http://algorithmsandme.in/ Jitendra Sangar

      You’re welcome 🙂

  • locs

    > This code has complexity of O(N^2) and uses O(N^2) extra space.

    No it isn’t. The algorithm you have there is O(n*Sum). Much like knapsack, it is depending on a value that isn’t the size of your set (ie. you are depending on the value of sum as _well_ as the size of the set). This leads to depending on small values of n or Sum if you want to keep it reasonably fast and so depends on the number of bits of these values and thus pseudo-polynomial not O(n^2) which would make it polynomial.

    • http://algorithmsandme.in/ Jitendra Sangar

      Thanks for pointing it out, corrected the text.

  • Ankit

    here is a link to my solution. finds all subsets and prints them
    http://www.edufyme.com/code?id=45c48cce2e2d7fbdea1afc51c7c6ad26

    • http://algorithmsandme.in/ Jitendra Sangar

      No worries about comments on some posts if you have good code. But just to get back links, please don’t spam. I see you have already commented three times with some link to your site. I am approving only one.

      • Ankit

        sorry about that, will keep that in mind

  • Pingback: Balanced partition problem: Dynamic programming - Algorithms and Me()