Prefix sum with Hashmaps
I'm going through some DSA stuff, and I hit a roadblock on using Prefix Sum with Hashmaps. This is a case of trying to understand by reading other's solutions, but I just can't understand why this works. I don't need to be explained what the code is doing but how the algorithm works. This right here is one of the solutions for 525. Contiguous Array. I'll spoiler mark it just in case.
5 Replies
560. Subarray Sum Equals K is the introductory problem to this concept.
So, image walking through the array, element by element. If we see a 0 we subtract 1 from
count
, and if we see a 1 we add 1 to count
. Let's take the Example 3 from the leetcode page (0,1,1,1,1,1,0,0,0), and we'll record count
against each element:
If we calculate a count
value which is equal to a count
value that we calculated for a previous index (for example, at index 6 we calculate a count
value of 3, which is the same as the count
for index 4), then we know there must be an equal number of 1's and 0's between those two indexes. Because we had to sum an equal number of 1's and 0's to get back to where we started.
So, we keep track of the first position at which we first found each count
. And when we calculate a count
which we've seen before, we look at the number of elements between that first position and the current position: that's the length of the sub-array between those two positions. Then we keep track of the max length of such sub-arrays, and that's our answer.
Actually "Approach #2" here explains it well: https://leetcode.com/problems/contiguous-array/editorial/So in the case when you're counting subarrays. It's not the fact that the first time you come across your count that matters, it's when you get a duplicate count that matters. When you get a duplicate value, the difference between the first index and the current index, is the number of subarrays. And more importantly, you're never updating the value for any keys in the map. It's just a reference so you can measure the distance.
The image in the link above shows it quite well I think
Every time you arrive at a particular count, there must be an equal number of 1's and 0's since the first time you saw that count
There's an equal number of 1's and 0's since every subsequent time you saw that count too, but since you're after the longest subarray, you're just interested in the first time
Sorry I looked at the second problem I referenced which focuses on counting subarrays. But whether it's counting subarrays or finding the longest subarray, you want to change your return value based on when you come across a duplicate value. Either increment by 1 if counting or measure your distance for longest subarray.