python - Find the index of the k smallest values of a numpy array

Question


asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

In order to find the index of the smallest value, I can use argmin:

import numpy as np
A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
print(A.argmin())     # 4 because A[4] = 0.1

But how can I find the indices of the k-smallest values?

I'm looking for something like:

print(A.argmin(numberofvalues=3))   # hypothetical argument, not a real NumPy parameter
# [4, 0, 7]  because A[4] <= A[0] <= A[7] <= all other A[i]

Note: in my use case A has between roughly 10 000 and 100 000 values, and I'm only interested in the indices of the k=10 smallest values. k will never be > 10.


1 Answer

深蓝 · Answer 1 · 2021-10-17T00:13:22+0000

Use np.argpartition. It does not sort the entire array; it only guarantees that the element at index k ends up in its sorted position, with all smaller values moved before it and all equal or greater values moved after it. Thus the first k elements of the result are the indices of the k smallest values.

import numpy as np

A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
k = 3

idx = np.argpartition(A, k)
print(idx)
# [4 0 7 3 1 2 6 5]

The first k entries of idx are the indices of the k smallest values. Note that these may not be in sorted order.

print(A[idx[:k]])
# [ 0.1  1.   1.5]
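
If the k indices are also needed in order of increasing value, as in the question's [4, 0, 7] example, one option is to sort just those k candidates afterwards. The following is a minimal sketch, not part of the original answer; the helper name k_smallest_indices is made up for illustration, and it partitions at k - 1 rather than k so that it also works when k equals the array length.

```python
import numpy as np

def k_smallest_indices(a, k):
    """Hypothetical helper: indices of the k smallest values of a, ordered by value."""
    a = np.asarray(a)
    # Partition so that the k smallest values occupy the first k slots (in no particular order).
    candidates = np.argpartition(a, k - 1)[:k]
    # Sort only those k candidates by value; this is cheap because k is small.
    return candidates[np.argsort(a[candidates])]

A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
result = k_smallest_indices(A, 3)
print(result)
# [4 0 7]
```

The extra argsort only touches k elements, so for k = 10 and an array of 100 000 values the cost should remain dominated by the partition step.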

To obtain the k-largest values use

idx = np.argpartition(A, -k)
# [4 0 7 3 1 2 6 5]

A[idx[-k:]]
# [  9.  17.  17.]

WARNING: Do not (re)use idx = np.argpartition(A, k); A[idx[-k:]] to obtain the k-largest. That won't always work. For example, these are NOT the 3 largest values in x:

x = np.array([100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 0])
idx = np.argpartition(x, 3)
x[idx[-3:]]
# array([ 70,  80, 100])
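
For contrast, here is a small sketch of the approach the warning recommends, applied to the same x: partition at -k and take the last k indices. The variable names idx_largest and largest are chosen here for illustration, and the final descending sort is an optional extra step.

```python
import numpy as np

x = np.array([100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 0])
k = 3

# Partition at -k: the last k positions then hold the indices of the k largest values.
idx_largest = np.argpartition(x, -k)[-k:]

# The k values come back in no particular order, so sort them (descending) for display.
largest = np.sort(x[idx_largest])[::-1]
print(largest)
# [100  90  80]
```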

Here is a comparison against np.argsort, which also works but just sorts the entire array to get the result.

In [2]: x = np.random.randn(100000)

In [3]: %timeit idx0 = np.argsort(x)[:100]
100 loops, best of 3: 8.26 ms per loop

In [4]: %timeit idx1 = np.argpartition(x, 100)[:100]
1000 loops, best of 3: 721 μs per loop

In [5]: np.all(np.sort(np.argsort(x)[:100]) == np.sort(np.argpartition(x, 100)[:100]))
Out[5]: True
