numpy.isin
- numpy.isin(element, test_elements, assume_unique=False, invert=False, *, kind=None)
Calculates element in test_elements, broadcasting over element only. Returns a boolean array of the same shape as element that is True where an element of element is in test_elements and False otherwise.
- Parameters:
- element : array_like
Input array.
- test_elements : array_like
The values against which to test each value of element. This argument is flattened if it is an array or array_like. See notes for behavior with non-array-like parameters.
- assume_unique : bool, optional
If True, the input arrays are both assumed to be unique, which can speed up the calculation. Default is False.
- invert : bool, optional
If True, the values in the returned array are inverted, as if calculating element not in test_elements. Default is False. np.isin(a, b, invert=True) is equivalent to (but faster than) np.invert(np.isin(a, b)).
- kind : {None, 'sort', 'table'}, optional
The algorithm to use. This will not affect the final result, but will affect the speed and memory use. The default, None, will select automatically based on memory considerations.
If 'sort', will use a mergesort-based approach. This will have a memory usage of roughly 6 times the sum of the sizes of element and test_elements, not accounting for size of dtypes.
If 'table', will use a lookup table approach similar to a counting sort. This is only available for boolean and integer arrays. This will have a memory usage of the size of element plus the max-min value of test_elements. assume_unique has no effect when the 'table' option is used.
If None, will automatically choose 'table' if the required memory allocation is less than or equal to 6 times the sum of the sizes of element and test_elements, otherwise will use 'sort'. This is done to not use a large amount of memory by default, even though 'table' may be faster in most cases. If 'table' is chosen, assume_unique will have no effect.
- Returns:
- isin : ndarray, bool
Has the same shape as element. The values element[isin] are in test_elements.
Notes
isin is an element-wise function version of the Python keyword in. isin(a, b) is roughly equivalent to np.array([item in b for item in a]) if a and b are 1-D sequences.
element and test_elements are converted to arrays if they are not already. If test_elements is a set (or other non-sequence collection) it will be converted to an object array with one element, rather than an array of the values contained in test_elements. This is a consequence of the array constructor's way of handling non-sequence collections. Converting the set to a list usually gives the desired behavior.
Using kind='table' tends to be faster than kind='sort' if the following relationship is true: log10(len(test_elements)) > (log10(max(test_elements) - min(test_elements)) - 2.27) / 0.927, but may use greater memory. The default value for kind will be automatically selected based only on memory usage, so one may manually set kind='table' if memory constraints can be relaxed.
Examples
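The two claims in the Notes can be tried out directly. The sketch below first compares isin with the list-comprehension form for 1-D sequences, then evaluates the speed rule of thumb. The helper name table_likely_faster is hypothetical (it is not part of NumPy), and the handling of empty or constant inputs is an assumption made so the logarithm stays defined.

```python
import math
import numpy as np

a = [0, 2, 4, 6]
b = [1, 2, 4, 8]

# For 1-D sequences the two expressions give the same boolean array.
via_isin = np.isin(a, b)
via_in = np.array([item in b for item in a])

def table_likely_faster(test_elements):
    """Hypothetical helper: evaluate the rule of thumb quoted in the Notes."""
    values = np.asarray(test_elements).ravel()
    if values.size == 0:
        raise ValueError("test_elements must not be empty")
    value_range = int(values.max()) - int(values.min())
    if value_range == 0:
        # log10(0) is undefined; the right-hand side tends to -inf,
        # so the inequality is treated as satisfied (an assumption).
        return True
    lhs = math.log10(values.size)
    rhs = (math.log10(value_range) - 2.27) / 0.927
    return lhs > rhs

print(via_isin, via_in)
print(table_likely_faster(b))             # small value range
print(table_likely_faster([0, 10**9]))    # two values spread over a huge range
```

The rule only estimates speed; a wide value range also means a large lookup table, so memory use still needs a separate check.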
>>> import numpy as np
>>> element = 2*np.arange(4).reshape((2, 2))
>>> element
array([[0, 2],
       [4, 6]])
>>> test_elements = [1, 2, 4, 8]
>>> mask = np.isin(element, test_elements)
>>> mask
array([[False,  True],
       [ True, False]])
>>> element[mask]
array([2, 4])
The indices of the matched values can be obtained with nonzero:
>>> np.nonzero(mask)
(array([0, 1]), array([1, 0]))
The test can also be inverted:
>>> mask = np.isin(element, test_elements, invert=True)
>>> mask
array([[ True, False],
       [False,  True]])
>>> element[mask]
array([0, 6])
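The documented equivalence between invert=True and an explicit np.invert can be checked with a short sketch. It reuses the same element and test_elements values as the examples and only compares results; it does not measure the speed difference.

```python
import numpy as np

element = 2 * np.arange(4).reshape((2, 2))
test_elements = [1, 2, 4, 8]

# Inversion done inside isin.
inverted = np.isin(element, test_elements, invert=True)

# Inversion applied afterwards to the plain membership mask.
negated = np.invert(np.isin(element, test_elements))

print(np.array_equal(inverted, negated))
```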
Because of how array handles sets, the following does not work as expected:
>>> test_set = {1, 2, 4, 8}
>>> np.isin(element, test_set)
array([[False, False],
       [False, False]])
Casting the set to a list gives the expected result:
>>> np.isin(element, list(test_set))
array([[False,  True],
       [ True, False]])
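To see why the set fails, it can help to inspect what the array constructor produces from it. The sketch below does that and then wraps the list conversion in a small helper. The name isin_any_collection is hypothetical and not part of NumPy, and it only handles set and frozenset inputs.

```python
import numpy as np

element = 2 * np.arange(4).reshape((2, 2))
test_set = {1, 2, 4, 8}

# The set becomes a single object, not an array of its values.
as_array = np.array(test_set)
print(as_array.dtype, as_array.size)

def isin_any_collection(element, test_elements):
    """Hypothetical wrapper: convert sets to lists before calling np.isin."""
    if isinstance(test_elements, (set, frozenset)):
        test_elements = list(test_elements)
    return np.isin(element, test_elements)

result = isin_any_collection(element, test_set)
print(result)
```

Other non-sequence collections, such as dict views or generators, would need their own conversion; this sketch leaves them untouched.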