Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
859 views
in Technique[技术] by (71.8m points)

python - Find consecutive unmasked values

I have a large 3 dimensional (time, longitude, latitude) input array. Most of the entries are masked. I need to find those entries where the mask is False for longer than a specific number of consecutive time steps (which I call threshold here). The result should be a mask with the same shape as the input mask.

Here is some pseudo-code to hopefully make clearer what I mean:

new_mask = find_consecutive(mask, threshold=3)
mask[:, i_lon, i_lat]
# [1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0]
new_mask[:, i_lon, i_lat]
# [1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

EDIT:

I'm not sure if my approach so far make sense. It does good performance-wise and gives me a labeled array and knowledge about which labels I want. I just couldn't figure out an efficient way to transform labels into a mask again.

from scipy.ndimage import measurements
structure = np.zeros((3, 3, 3))
structure[:, 1, 1] = 1
labels, nr_labels = measurements.label(1 - mask, structure=structure)
_, counts = np.unique(labels, return_counts=True)
labels_selected = [i_count for i_count, count in enumerate(counts)
                   if count >= threshold]
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

That's a classical case of binary closing operation in image-processing. To solve it you can take help from scipy module, specifically - scipy.ndimage.morphology.binary_closing after we feed an appropriate 1D kernel of all ONES and of length threshold. Also, Scipy's binary closing function gives us the closed mask only. So, to get the desired output, we need to OR it with the input mask. Thus, the implementation would look something like this -

from scipy.ndimage import binary_closing
out = mask | binary_closing(mask, structure=np.ones(threshold))

How about a NumPy version of binary closing!

Now, a closing operation is basically image-dilation and image-erosion, so we can simulate that behiviour using the trusty convolution operation and we do have that here in NumPy as np.convolve. Similar to scipy's binary closing operation, we need the same kernel here as well and we would use it both for dilation and erosion. The implementation would be -

def numpy_binary_closing(mask,threshold):

    # Define kernel
    K = np.ones(threshold)

    # Perform dilation and threshold at 1
    dil = np.convolve(mask,K,mode='same')>=1

    # Perform erosion on the dilated mask array and threshold at given threshold
    dil_erd = np.convolve(dil,K,mode='same')>= threshold
    return dil_erd

Sample run -

In [133]: mask
Out[133]: 
array([ True, False, False, False, False,  True,  True, False, False,
        True, False], dtype=bool)

In [134]: threshold = 3

In [135]: binary_closing(mask, structure=np.ones(threshold))
Out[135]: 
array([False, False, False, False, False,  True,  True,  True,  True,
        True, False], dtype=bool)

In [136]: numpy_binary_closing(mask,threshold)
Out[136]: 
array([False, False, False, False, False,  True,  True,  True,  True,
        True, False], dtype=bool)

In [137]: mask | binary_closing(mask, structure=np.ones(threshold))
Out[137]: 
array([ True, False, False, False, False,  True,  True,  True,  True,
        True, False], dtype=bool)

In [138]: mask| numpy_binary_closing(mask,threshold)
Out[138]: 
array([ True, False, False, False, False,  True,  True,  True,  True,
        True, False], dtype=bool)

Runtime tests (Scipy vs Numpy!)

Case #1 : Uniformly sparse

In [163]: mask = np.random.rand(10000) > 0.5

In [164]: threshold = 3

In [165]: %timeit binary_closing(mask, structure=np.ones(threshold))
1000 loops, best of 3: 582 μs per loop

In [166]: %timeit numpy_binary_closing(mask,threshold)
10000 loops, best of 3: 178 μs per loop

In [167]: out1 = binary_closing(mask, structure=np.ones(threshold))

In [168]: out2 = numpy_binary_closing(mask,threshold)

In [169]: np.allclose(out1,out2) # Verify outputs
Out[169]: True

Case #2 : More sparse and bigger threshold

In [176]: mask = np.random.rand(10000) > 0.8

In [177]: threshold = 11

In [178]: %timeit binary_closing(mask, structure=np.ones(threshold))
1000 loops, best of 3: 823 μs per loop

In [179]: %timeit numpy_binary_closing(mask,threshold)
1000 loops, best of 3: 331 μs per loop

In [180]: out1 = binary_closing(mask, structure=np.ones(threshold))

In [181]: out2 = numpy_binary_closing(mask,threshold)

In [182]: np.allclose(out1,out2) # Verify outputs
Out[182]: True

Winner is Numpy and by a big margin!


Boundary conditions

It seems the boundaries need the closing too, if the 1s are close enough to the bounadries. To solve those cases, you can pad one 1 each at the start and end of the input boolean array, use the posted code and then at the end de-select the first and last element. Thus, the complete implementation using scipy's binary_closing approach would be -

mask_ext = np.pad(mask,1,'constant',constant_values=(1))
out = mask_ext | binary_closing(mask_ext, structure=np.ones(threshold))
out = out[1:-1]

Sample run -

In [369]: mask
Out[369]: 
array([False, False,  True, False, False, False, False,  True,  True,
       False, False,  True, False], dtype=bool)

In [370]: threshold = 3

In [371]: mask_ext = np.pad(mask,1,'constant',constant_values=(1))
     ...: out = mask_ext | binary_closing(mask_ext, structure=np.ones(threshold))
     ...: out = out[1:-1]
     ...: 

In [372]: out
Out[372]: 
array([ True,  True,  True, False, False, False, False,  True,  True,
        True,  True,  True,  True], dtype=bool)

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...