Free AI web copilot to create summaries, insights and extended knowledge, download it at here

6363

Abstract

j</code>. Note that since we have <code>target — a + 1</code> columns, the final value of <code>j</code> will be <code>target — a</code>, meaning that in the end, we try to find subsets that sum to <code>a + j = a + target — a = target</code>.</li><li>So, <code>DP[i, j] = 1</code> means that there exists a subset in <code>nums[0...i]</code> which sums to <code>a + j</code>. If <code>DP[i, j] = 0</code> there does not exist such a subset.</li><li>Recovering the solution: if <code>DP[n — 1, target — a] == 1</code>, there exists a subset of <code>nums</code> which sums to <code>target</code>.</li></ul><h2 id="b427">The update rule ⭐️</h2>We start by initializing the <code>DP</code> table with zeros and filling the first row of <code>DP</code>, noticing that <code>DP[0, j]</code> can only be 1 if <code>nums[0]</code> is equal to <code>a + j</code>.<div id="b44d"><pre>for j in range(0, target - a + 1): DP[0, j] = (nums[0] == a + j)</pre></div>For the remaining rows, <code>DP[i, j]</code> can be marked 1 under the following scenarios:<ul><li><code>DP[i — 1, j] == 1</code>: If the “intermediate” target <code>a + j</code> can be achieved using only a subset from <code>nums[0...(i — 1)]</code> then clearly, it can also be achieved if the <code>i</code>-th number is allowed to be in the subset.</li><li><code>nums[i] == a + j</code>: In this case, the intermediate target <code>a + j</code> can be achieved from a subset with a single integer: <code>{nums[i]}</code>.</li><li><code>DP[i — 1, j — nums[i]] == 1</code>: The trickiest rule, reminiscent of dynamic programming solutions to <a href="https://readmedium.com/how-to-solve-the-knapsack-problem-with-dynamic-programming-eb88c706d3cf">the knapsack problem</a>. If there exists a subset of <code>nums[0...(i — 1)]</code> which sums to <code>a + j — nums[i]</code>, then we know there is also a subset that sums to <code>a + j</code> by including <code>nums[i]</code> in the subset.</li></ul><div id="9bf8"><pre>for i in range(1, n): for j in range(0, target - a + 1): DP[i, j] = DP[i - 1, j] or nums[i] == (a + j) if DP[i, j] == False: next_j = j - nums[i] if 0 <= next_j < target - a + 1: DP[i, j] = DP[i - 1, next_j]</pre></div><h2 id="4622">Time and space complexity ⏱</h2>The DP solution described here is what’s known as a <a href="https://en.wikipedia.org/wiki/Pseudo-polynomial_time">pseudo-polynomial algorithm</a>. The size of the DP table depends not only (linearly) on the number of elements in <code>nums</code>, but also (linearly) on the values of <code>nums</code> and <code>target</code>, since the number of columns in the table is the distance between the <code>target</code> and the sum of negative integers in <code>nums</code>.Each cell of the table must be set to 0 or 1 once, and this is determined using a constant number of queries to other cells, so the algorithm’s runtime scales equally with the table size.<h2 id="c1ec">DP has shortcomings, brute-force can be better 💪</h2>Let’s says <code>nums = [-1000000, 1000000]</code> and <code>target = 999999</code>. Using the DP method, we would have 2 rows and <code>999999 + 1000000 + 1 = 2000000</code> columns. That’s a lot of memory usage for an obviously unsolvable problem! If there are few numbers in <code>nums</code> but the range of values is large, you’re better off brute-force checking all possible subsets.<h1 id="47f8">3. Finding all solutions 🤯</h1><figure id="4f66"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/0*KKP33DOsQiujVp_Z"><figcaption>We will use a stack, but unfortunately not this kind of stack! Photo by <a href="https://unsplash.com/@brigittetohm?utm_source=medium&utm_medium=referral">Brigitte Tohm</a> on <a href="https://unsplash.com?utm_source=medium&utm_medium=referral">Unsplash</a></figcaption></figure>Let’s move on to probably the most challenging topic, and also the least-discussed in other tutorials: how to actually find which subsets achieve the <code>target</code> sum!To do this, we need to use our <code>DP</code> table and backtrack through it. We’re going to use a non-recursive technique: a <a href="https://en.wikipedia.org/wiki/Stack_%28abstract_data_type%29">stack</a>. Each item in the stack will be a simple data structure that I call <code>StackItem</code>.<div id="b422"><pre>class StackItem: row: int # Row index in the DP table col: int # Column index in the <span class=

Options

"hljs-symbol">DP table take: list[int] # Indices of integers to include in the subset togo: int # Value "to go" until reaching the target sum</pre></div>We know by now, there exists a solution if the bottom-right cell of the <code>DP</code> table is 1. Otherwise, don’t even bother trying to find solutions, because there aren’t any! We’re going to initialize the stack / backtracking by starting in the bottom-right cell. We assume that the final integer in <code>nums</code> is included in the subset, so <code>togo</code> will be the <code>target</code> value minus <code>nums[n — 1]</code>.<div id="64b2"><pre>stack.push( StackItem( row=n - 1, col=target - a, take=[n - 1], togo=target - nums[n - 1] ) )</pre></div>Now we continuously pop items from the stack and add new items until it is empty. Once finished, we will have enumerated all possible solutions.Consider that we just popped an item: <code>item = stack.pop()</code> whose <code>row = i</code> and <code>col = j</code>. We will check three scenarios for <code>item</code>:<ol><li>If <code>DP[i — 1, j] == 1</code> then there could be a solution that doesn’t use the <code>i</code>-th integer in <code>nums</code>. In this case, we will add a new <code>StackItem</code> as though the <code>i</code>-th integer is not included in the subset, but the <code>(i-1)</code>-th integer is.</li><li>If <code>DP[i — 1, j — nums[i]] == 1</code> then there is a solution that uses the remaining <code>nums[0...(i — 1)]</code> integers to form an “intermediate” subset that sums to <code>a + j — nums[i]</code>. In this case, we will add a new <code>StackItem</code> assuming the <code>i</code>-th integer is included in the “final” subset.</li><li>If <code>item.togo == 0</code> then we’ve got a solution! Then <code>item.take</code> stores the indices of integers in (sorted) <code>nums</code> that form a subset whose sum is <code>target</code>.</li></ol><div id="05a2"><pre>while len(stack) > 0: item = stack.pop() i, j, take, togo = item.unpack()</pre></div><div id="ed47"><pre> # Scenario 1 if i > 0 and DP[i - 1, j]: new_take = take.copy() new_take[-1] = i - 1 # Replace the last element new_togo = togo + nums[i] - nums[i - 1] stack.push(StackItem(i - 1, j, new_take, new_togo))</pre></div><div id="7a57"><pre> # Scenario 2 next_j = j - nums[i] if i > 0 and 0 <= next_j < (target - a + 1): if DP[i - 1, next_j]: new_take = take.copy() new_take.append(row - 1) # Add a new element new_togo = togo - nums[i - 1] stack.push(StackItem(i - 1, next_j, new_take, new_togo))</pre></div><div id="e51d"><pre> # Scenario 3 if togo == 0: yield [nums[t] for t in take]</pre></div><h1 id="4bc7">Conclusion 🎉</h1>I hope you enjoyed learning how to solve the SUBSET-SUM problem using dynamic programming! There are quite a few online resources which explain how to determine if a solution exists for a particular set of (<code>nums</code> , <code>target</code>) but often these tutorials assume that all numbers are positive. The algorithm described here also works for negative numbers!💡Moreover, almost none of the existing tutorials explain how to leverage the DP table to backtrack through all solutions, which was the main motivation of this article.For a more concrete implementation (C++ and Python, not pseudo-code) please visit <a href="https://github.com/trevphil/subsetsum">GitHub</a> or download the pip module via <code>pip install subsetsum</code>.</article></body>

How to find all solutions to the SUBSET-SUM problem

A dynamic programming solution that works for negative, positive, and repeating numbers

The SUBSET-SUM problem involves determining whether or not a subset from a list of integers can sum to a target value. For example, consider the list of nums = [1, 2, 3, 4]. If the target = 7, there are two subsets that achieve this sum: {3, 4} and {1, 2, 4}. If target = 11, there are no solutions.

In general, determining if there are even any solutions to SUBSET-SUM is NP-hard: if there are n integers in the nums list, there exist 2^n — 1 subsets that need to be checked (excluding the empty set). In this article, we are going to look at a more efficient solving method using dynamic programming (DP). However, unlike most tutorials, not only will we determine if there exists a solution, we will see how to discover all solutions. The algorithm works for negative and positive input values, as well as repeating non-unique integers in nums.

TLDR; Python package 🐍

Looking for a quick solution and don’t necessarily want to know the underlying details? I’ve created a Python package called “subsetsum” with a super-fast solver: pip install subsetsum

The logic of the solver is implemented in C++ and uses Pybind11 to expose a Python interface. Source code is available on GitHub.

1. Pre-processing

Before going into the DP solution, we will apply some pre-processing to the problem inputs (nums and target).

Flip the sign 🔄

The first thing we’ll do is flip the sign of all integers in nums as well as the target, if the target is negative. This ensures that our target will always be 0 or greater, which just makes life easier for the whole algorithm! We can do this without worrying, since the solutions before and after flipping the sign are equivalent:

x0 + x1 + ... + x10 = target
# Multiply both sides by -1
-x0 - x1 - ... - x10 = -target
# The equations are equivalent!

Sort the numbers ➡️

The next pre-processing step is to sort nums in ascending order, which is necessary for the DP algorithm to work (described later). If you need to remember the original order of nums, you could perform an argsort which maps from an integer’s current position in the list to its position in the sorted list. Then you can always re-map back to the original position if necessary.

nums = [-2, 1, -3, 0, 4, 5]
index = argsort(nums) = [2, 0, 3, 1, 4, 5]
nums[index] = [-3, -2, 0, 1, 4, 5]

Check if the target is too low or too high 🛑

Consider the case where nums = [-3, -2, -1, 1, 2, 3, 4]. What’s the smallest sum achievable? -6. What’s the largest sum possible? 10. If the target sum is less than the sum of all negative integers in nums or greater than the sum of all positive integers in nums, no solution exists.

We will store the sum of all negative integers in variable a and the sum of all positive integers in variable b. If target < a or target > b, we can stop early with “No solution!”

2. Dynamic programming

With the pre-processing finished, we’re ready to fill up a dynamic programming table called DP. The DP table will have n rows (given n numbers) and target — a + 1 columns. Values stored in the table will simply be True or False. Row and column indices start at 0.

If we’re on row i, we are considering all subsets using integers up to and including the i-th integer in the sorted nums. The i-th integer does not need to be included, but it is “available” if we want to use it.
If we’re on column j, we are trying to find subsets that sum up to an “intermediate” target of a + j. Note that since we have target — a + 1 columns, the final value of j will be target — a, meaning that in the end, we try to find subsets that sum to a + j = a + target — a = target.
So, DP[i, j] = 1 means that there exists a subset in nums[0...i] which sums to a + j. If DP[i, j] = 0 there does not exist such a subset.
Recovering the solution: if DP[n — 1, target — a] == 1, there exists a subset of nums which sums to target.

The update rule ⭐️

We start by initializing the DP table with zeros and filling the first row of DP, noticing that DP[0, j] can only be 1 if nums[0] is equal to a + j.

for j in range(0, target - a + 1):
    DP[0, j] = (nums[0] == a + j)

For the remaining rows, DP[i, j] can be marked 1 under the following scenarios:

DP[i — 1, j] == 1: If the “intermediate” target a + j can be achieved using only a subset from nums[0...(i — 1)] then clearly, it can also be achieved if the i-th number is allowed to be in the subset.
nums[i] == a + j: In this case, the intermediate target a + j can be achieved from a subset with a single integer: {nums[i]}.
DP[i — 1, j — nums[i]] == 1: The trickiest rule, reminiscent of dynamic programming solutions to the knapsack problem. If there exists a subset of nums[0...(i — 1)] which sums to a + j — nums[i], then we know there is also a subset that sums to a + j by including nums[i] in the subset.

for i in range(1, n):
    for j in range(0, target - a + 1):
        DP[i, j] = DP[i - 1, j] or nums[i] == (a + j)
        if DP[i, j] == False:
            next_j = j - nums[i]
            if 0 <= next_j < target - a + 1:
                DP[i, j] = DP[i - 1, next_j]

Time and space complexity ⏱

The DP solution described here is what’s known as a pseudo-polynomial algorithm. The size of the DP table depends not only (linearly) on the number of elements in nums, but also (linearly) on the values of nums and target, since the number of columns in the table is the distance between the target and the sum of negative integers in nums.

Each cell of the table must be set to 0 or 1 once, and this is determined using a constant number of queries to other cells, so the algorithm’s runtime scales equally with the table size.

DP has shortcomings, brute-force can be better 💪

Let’s says nums = [-1000000, 1000000] and target = 999999. Using the DP method, we would have 2 rows and 999999 + 1000000 + 1 = 2000000 columns. That’s a lot of memory usage for an obviously unsolvable problem! If there are few numbers in nums but the range of values is large, you’re better off brute-force checking all possible subsets.

3. Finding all solutions 🤯

We will use a stack, but unfortunately not this kind of stack! Photo by Brigitte Tohm on Unsplash

Let’s move on to probably the most challenging topic, and also the least-discussed in other tutorials: how to actually find which subsets achieve the target sum!

To do this, we need to use our DP table and backtrack through it. We’re going to use a non-recursive technique: a stack. Each item in the stack will be a simple data structure that I call StackItem.

class StackItem:
    row: int  # Row index in the DP table
    col: int  # Column index in the DP table
    take: list[int]  # Indices of integers to include in the subset
    togo: int  # Value "to go" until reaching the `target` sum

We know by now, there exists a solution if the bottom-right cell of the DP table is 1. Otherwise, don’t even bother trying to find solutions, because there aren’t any! We’re going to initialize the stack / backtracking by starting in the bottom-right cell. We assume that the final integer in nums is included in the subset, so togo will be the target value minus nums[n — 1].

stack.push(
    StackItem(
        row=n - 1,
        col=target - a,
        take=[n - 1],
        togo=target - nums[n - 1]
    )
)

Now we continuously pop items from the stack and add new items until it is empty. Once finished, we will have enumerated all possible solutions.

Consider that we just popped an item: item = stack.pop() whose row = i and col = j. We will check three scenarios for item:

If DP[i — 1, j] == 1 then there could be a solution that doesn’t use the i-th integer in nums. In this case, we will add a new StackItem as though the i-th integer is not included in the subset, but the (i-1)-th integer is.
If DP[i — 1, j — nums[i]] == 1 then there is a solution that uses the remaining nums[0...(i — 1)] integers to form an “intermediate” subset that sums to a + j — nums[i]. In this case, we will add a new StackItem assuming the i-th integer is included in the “final” subset.
If item.togo == 0 then we’ve got a solution! Then item.take stores the indices of integers in (sorted) nums that form a subset whose sum is target.

while len(stack) > 0:
    item = stack.pop()
    i, j, take, togo = item.unpack()

    # Scenario 1
    if i > 0 and DP[i - 1, j]:
        new_take = take.copy()
        new_take[-1] = i - 1  # Replace the last element
        new_togo = togo + nums[i] - nums[i - 1]
        stack.push(StackItem(i - 1, j, new_take, new_togo))

    # Scenario 2
    next_j = j - nums[i]
    if i > 0 and 0 <= next_j < (target - a + 1):
        if DP[i - 1, next_j]:
            new_take = take.copy()
            new_take.append(row - 1)  # Add a new element
            new_togo = togo - nums[i - 1]
            stack.push(StackItem(i - 1, next_j, new_take, new_togo))

    # Scenario 3
    if togo == 0:
        yield [nums[t] for t in take]

Conclusion 🎉

I hope you enjoyed learning how to solve the SUBSET-SUM problem using dynamic programming! There are quite a few online resources which explain how to determine if a solution exists for a particular set of (nums , target) but often these tutorials assume that all numbers are positive. The algorithm described here also works for negative numbers!💡Moreover, almost none of the existing tutorials explain how to leverage the DP table to backtrack through all solutions, which was the main motivation of this article.

For a more concrete implementation (C++ and Python, not pseudo-code) please visit GitHub or download the pip module via pip install subsetsum.