The problem is:
Given an integer array
nums, return all the triplets[nums[i], nums[j], nums[k]]wherei,jandkare distinct andnums[i] + nums[j] + nums[k] == 0.The solution set must not contain duplicate triplets and the order of the output and the order of the triplets does not matter.
This is my code:
def threeSum(nums): triples = set() # this is where the triples will be stored num_map = {} n = len(nums) for i in range(n): # creating a hash map of the array for quick lookups num_map[nums[i]] = i for i in range(n-1): for j in range(i+1,n): comp = - nums[i] - nums[j] # the complement if (comp in num_map and num_map[comp] != i and num_map[comp] != j): triples.add(tuple(sorted([nums[i],nums[j],comp]))) return triplesMy questions are:
- I believe the time complexity of my solution is O(n^2), and it is not known if there is an algorithm which can perform better than O(n^2). Am I correct?
- In any event, it would seem that my code is not efficient. Which part of the code is slowing it down?
I suspected the use ofsorted() might be adding to it, but given that it is only sorting an array of 3 elements (and is only triggered if a solution is found), I thought it might not be having much impact. The reason I sort the array is to avoid duplicates.
- \$\begingroup\$"it would seem that my code is not efficient" - What exactly makes you think that?\$\endgroup\$no comment– no comment2024-03-03 13:49:05 +00:00CommentedMar 3, 2024 at 13:49
- 1\$\begingroup\$@nocomment When I submit it on leetcodeleetcode.com/problems/3sum/submissions/1192682909 the runtime is 6553 ms. Most of the submissions run in under 1000ms\$\endgroup\$Oliver– Oliver2024-03-03 14:03:16 +00:00CommentedMar 3, 2024 at 14:03
3 Answers3
It would be nice to see some tests of the function. That would help reviewers see what's known to work, and identify untested scenarios.
This transformation is strongly tied to the specific requirements of the question:
for i in range(n): # creating a hash map of the array for quick lookups num_map[nums[i]] = i
We're storing only a single index in this reverse lookup table, and overwriting any previous value. The wording of the question permits this because we are required to return a set of the values. But this approach makes the code less reusable for similar problems (such as those where we must return theindices rather than values).
for i in range(n) is unidiomatic. I'd write:
for i,n in enumerate(nums): num_map[n] = i print(num_map)Or more likely:
num_map = {n:i for i,n in enumerate(nums)}Since we don't need to know the actual indices used (only that they be distinct), we can transform the input by sorting it input first, thus allowing us to know thatnums[i] ≤nums[j] ≤nums[k] wheni ≤j ≤k. Then we don't needsorted when adding, and our condition can be simpler:
nums = sorted(nums) num_map = dict((n,i) for i,n in enumerate(nums)) for i,nᵢ in enumerate(nums[:-1]): for j,nⱼ in enumerate(nums[i+1:], i+1): comp = -nᵢ -nⱼ # the complement if comp in num_map and j < num_map[comp]: triples.add((nᵢ, nⱼ, comp))We can then reduce work by observing that when we reachnⱼ that would makek too small, then no higherj will succeed, so we can break out of thej loop early:
for i,nᵢ in enumerate(nums[:-1]): for j,nⱼ in enumerate(nums[i+1:], i+1): comp = -nᵢ -nⱼ # the complement if nⱼ > comp: # we won't find any more matches in this j or higher break try: if j < num_map[comp]: triples.add((nᵢ, nⱼ, comp)) except KeyError: passI also eliminated the duplication ofcomp in num_map andnum_map[comp] here - remember that Python exceptions are cheap.
We could probably go further, and remove values fromcomp as they become unreachable. Or maintain an index into it, searching backwards asj increases, rather than doing thein test each iteration. I'll leave that as an exercise for the motivated reader.
- \$\begingroup\$Thank you for the response - this is very helpful. According towiki.python.org/moin/TimeComplexity the "average case" for a lookup in a set is O(1). Why is it in fact O(log n) in this scenario?\$\endgroup\$Oliver– Oliver2024-03-03 11:40:38 +00:00CommentedMar 3, 2024 at 11:40
- \$\begingroup\$My error - confused with C++
std::set.\$\endgroup\$Toby Speight– Toby Speight2024-03-03 12:38:39 +00:00CommentedMar 3, 2024 at 12:38 - 1\$\begingroup\$I'd say
num_map[comp]rather frequently doesn't exist, so raising and catching so many errors seems like a very bad idea, given that they're interested in speed andcatching exceptions is expensive.\$\endgroup\$no comment– no comment2024-03-03 14:05:37 +00:00CommentedMar 3, 2024 at 14:05
Theexcellent answer by user @Toby Speightaddresses the main goal of performance.
I'm not that smart, but I'll offer some feedback on more mundane styleissues with your code.
Overview
The code layout is nice and compact.
Documentation
It would be nice to add a docstring to the function to describe ts purpose,what the input is and what it returns. The problem statement in your question would be a good starting point.
Comments
The following comment is not very useful because it does not tell the useranything new; it is obvious you are storing something into a variable.
# this is where the triples will be storedA "hash map" in Python parlance is a "dictionary".
Naming
The variabletriples could be namedtriplets to be consistent with the documentation.
Lint check
pylint identified trailingwhitespace on a couple lines.Since it is not needed and can be quite annoying, just delete it.
While we're on the topic, I think adding a single space after each commamakes the code easier to read.
Here is new code with the suggestions above:
def threeSum(nums): ''' Given an integer array nums, return all the triplets [ nums[i], nums[j], nums[k] ] where i, j and k are distinct and nums[i] + nums[j] + nums[k] == 0. <MORE DETAILS TO BE ADDED BY CODE AUTHOR> ''' triplets = set() num_map = {} n = len(nums) for i in range(n): # creating a dictionary of the array for quick lookups num_map[nums[i]] = i for i in range(n-1): for j in range(i+1, n): comp = - nums[i] - nums[j] # the complement if (comp in num_map and num_map[comp] != i and num_map[comp] != j): triplets.add(tuple(sorted([nums[i], nums[j], comp]))) return triplets- \$\begingroup\$Your docstring suggests that
threeSum([0,0,0,0])returns 24 triplets.\$\endgroup\$no comment– no comment2024-03-03 15:27:04 +00:00CommentedMar 3, 2024 at 15:27 - \$\begingroup\$@nocomment: Please recommend a better docstring. Go ahead and suggest an improvement by clicking the "Edit" link. My suggestion was really meant as a placeholder for the code author to fill in. I updated the answer to make that more evident.\$\endgroup\$toolic– toolic2024-03-04 11:40:19 +00:00CommentedMar 4, 2024 at 11:40
- \$\begingroup\$
for i in range(n): num_map[nums[i]] = i=>for i, num in enumerate(nums): num_map[num] = i\$\endgroup\$Chris– Chris2025-09-11 14:47:12 +00:00CommentedSep 11 at 14:47 - \$\begingroup\$Your second loop can also be structured as a set comprehension.\$\endgroup\$Chris– Chris2025-09-11 14:47:55 +00:00CommentedSep 11 at 14:47
I'm surprised no one on Leetcode mentioned: Assuming you are working with integers within a fixed range [-N,N], CLRS gives a O(n + N log N) solution for 3SUM.
- Represent
numsas a bitset of length N. - Here is the trick: compute
nums + nums(all pair sums) in O(N log N) time using FFT. If it helps, imagine the bitset as a generating function, and the convolution will add the exponents. - Check
nums + numsagainstnums.
3SUM only checks if a solution exists. There are up to n^2 possible triplets to be returned. Otherwise O(n^2) is the best known algorithm. It's simple: for everynums[i], nums[j], simply check ifnums[k]=-(nums[i]+nums[j] is in a hash table. If hashing is not allowed, then you can use the hi and lo pointer method like in 2sum to avoid binary search.
- \$\begingroup\$I haven't either :P but I know it exists from reading other people's project euler solutions and cp-algorithms.com\$\endgroup\$qwr– qwr2024-03-04 17:26:40 +00:00CommentedMar 4, 2024 at 17:26
You mustlog in to answer this question.
Explore related questions
See similar questions with these tags.
