Evaluating segmentation metrics

When trying out different segmentation methods, how do you know which one is best? If you have a ground truth or gold standard segmentation, you can use various metrics to check how close each automated method comes to the truth. In this example we use an easy-to-segment image to show how to interpret various segmentation metrics. We will use the adapted Rand error and the variation of information as example metrics, and see how oversegmentation (splitting of true segments into too many sub-segments) and undersegmentation (merging of different true segments into a single segment) affect the different scores.
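To make the two failure modes concrete before turning to the real image, here is a minimal NumPy sketch of the two conditional entropies that make up the variation of information. It is not the scikit-image implementation; the helper name `conditional_entropies` and the toy label arrays are assumptions made for illustration. It follows the convention used later in this example: the first value measures false splits, the second false merges, both in bits.

```python
import numpy as np


def conditional_entropies(true_labels, test_labels):
    """Return (H(test | true), H(true | test)) in bits.

    Hypothetical helper for illustration only.
    """
    true_labels = np.asarray(true_labels).ravel()
    test_labels = np.asarray(test_labels).ravel()

    # Build the joint distribution over (true label, test label) pairs.
    _, true_idx = np.unique(true_labels, return_inverse=True)
    _, test_idx = np.unique(test_labels, return_inverse=True)
    joint = np.zeros((true_idx.max() + 1, test_idx.max() + 1))
    np.add.at(joint, (true_idx, test_idx), 1)
    joint /= joint.sum()

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    h_joint = entropy(joint.ravel())
    h_true = entropy(joint.sum(axis=1))
    h_test = entropy(joint.sum(axis=0))
    # H(test | true) = H(true, test) - H(true), and symmetrically.
    return h_joint - h_true, h_joint - h_test


truth = np.array([1, 1, 1, 1, 2, 2, 2, 2])
oversegmented = np.array([1, 1, 2, 2, 3, 3, 4, 4])  # each true segment split in two
undersegmented = np.array([1, 1, 1, 1, 1, 1, 1, 1])  # both true segments merged

over = conditional_entropies(truth, oversegmented)
under = conditional_entropies(truth, undersegmented)
print('oversegmented  (splits, merges):', over)
print('undersegmented (splits, merges):', under)
```

Splitting every true segment in half costs one bit of false splits and no false merges; merging the two true segments costs one bit of false merges and no false splits.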

import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage as ndi
import skimage as ski

image = ski.data.coins()

First, we generate the true segmentation. For this simple image, we know exact functions and parameters that will produce a perfect segmentation. In a real scenario, you would typically generate ground truth by manual annotation or “painting” of a segmentation.

elevation_map = ski.filters.sobel(image)
markers = np.zeros_like(image)
markers[image < 30] = 1
markers[image > 150] = 2
im_true = ski.segmentation.watershed(elevation_map, markers)
im_true = ndi.label(ndi.binary_fill_holes(im_true - 1))[0]

Next, we create three different segmentations with different characteristics. The first one uses skimage.segmentation.watershed() with compactness, which is a useful initial segmentation but too fine as a final result. We will see how this causes the oversegmentation metrics to shoot up.

edges = ski.filters.sobel(image)
im_test1 = ski.segmentation.watershed(edges, markers=468, compactness=0.001)

The next approach uses the Canny edge filter, skimage.feature.canny(). This is a very good edge finder, and gives balanced results.

edges = ski.feature.canny(image)
fill_coins = ndi.binary_fill_holes(edges)
im_test2 = ndi.label(ski.morphology.remove_small_objects(fill_coins, max_size=20))[0]
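The fill-then-label step can be seen in isolation on a toy array. The sketch below uses two hand-drawn closed outlines as a hypothetical stand-in for Canny output; the array shape and outline positions are assumptions chosen only to keep the example small.

```python
import numpy as np
from scipy import ndimage as ndi

# Toy "edge map": two closed rectangular outlines (hypothetical data).
edges = np.zeros((7, 12), dtype=bool)
edges[1:6, 1:6] = True
edges[2:5, 2:5] = False    # hollow out the first outline
edges[1:6, 7:11] = True
edges[2:5, 8:10] = False   # hollow out the second outline

# Closed contours become solid regions once their holes are filled.
filled = ndi.binary_fill_holes(edges)

# Each connected solid region receives its own integer label.
labels, num_regions = ndi.label(filled)

print('edge pixels:  ', int(edges.sum()))
print('filled pixels:', int(filled.sum()))
print('regions:      ', num_regions)
```

An outline that is not fully closed would not be filled, which is one reason small leftover fragments are removed before labeling in the example above.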

Finally, we use morphological geodesic active contours, skimage.segmentation.morphological_geodesic_active_contour(), a method that generally produces good results, but requires a long time to converge on a good answer. We purposefully cut short the procedure at 100 iterations, so that the final result is undersegmented, meaning that many regions are merged into one segment. We will see the corresponding effect on the segmentation metrics.

image = ski.util.img_as_float(image)
gradient = ski.segmentation.inverse_gaussian_gradient(image)
init_ls = np.zeros(image.shape, dtype=np.int8)
init_ls[10:-10, 10:-10] = 1
im_test3 = ski.segmentation.morphological_geodesic_active_contour(
    gradient,
    num_iter=100,
    init_level_set=init_ls,
    smoothing=1,
    balloon=-1,
    threshold=0.69,
)
im_test3 = ski.measure.label(im_test3)

method_names = [
    'Compact watershed',
    'Canny filter',
    'Morphological Geodesic Active Contours',
]
short_method_names = ['Compact WS', 'Canny', 'GAC']

precision_list = []
recall_list = []
split_list = []
merge_list = []
for name, im_test in zip(method_names, [im_test1, im_test2, im_test3]):
    error, precision, recall = ski.metrics.adapted_rand_error(im_true, im_test)
    splits, merges = ski.metrics.variation_of_information(im_true, im_test)
    split_list.append(splits)
    merge_list.append(merges)
    precision_list.append(precision)
    recall_list.append(recall)
    print(f'\n## Method: {name}')
    print(f'Adapted Rand error: {error}')
    print(f'Adapted Rand precision: {precision}')
    print(f'Adapted Rand recall: {recall}')
    print(f'False Splits: {splits}')
    print(f'False Merges: {merges}')

fig, axes = plt.subplots(2, 3, figsize=(9, 6), constrained_layout=True)
ax = axes.ravel()

ax[0].scatter(merge_list, split_list)
for i, txt in enumerate(short_method_names):
    ax[0].annotate(txt, (merge_list[i], split_list[i]), verticalalignment='center')
ax[0].set_xlabel('False Merges (bits)')
ax[0].set_ylabel('False Splits (bits)')
ax[0].set_title('Split Variation of Information')

ax[1].scatter(precision_list, recall_list)
for i, txt in enumerate(short_method_names):
    ax[1].annotate(txt, (precision_list[i], recall_list[i]), verticalalignment='center')
ax[1].set_xlabel('Precision')
ax[1].set_ylabel('Recall')
ax[1].set_title('Adapted Rand precision vs. recall')
ax[1].set_xlim(0, 1)
ax[1].set_ylim(0, 1)

ax[2].imshow(ski.segmentation.mark_boundaries(image, im_true))
ax[2].set_title('True Segmentation')
ax[2].set_axis_off()

ax[3].imshow(ski.segmentation.mark_boundaries(image, im_test1))
ax[3].set_title('Compact Watershed')
ax[3].set_axis_off()

ax[4].imshow(ski.segmentation.mark_boundaries(image, im_test2))
ax[4].set_title('Edge Detection')
ax[4].set_axis_off()

ax[5].imshow(ski.segmentation.mark_boundaries(image, im_test3))
ax[5].set_title('Morphological GAC')
ax[5].set_axis_off()

plt.show()

[Figure: six panels titled Split Variation of Information, Adapted Rand precision vs. recall, True Segmentation, Compact Watershed, Edge Detection, and Morphological GAC]

## Method: Compact watershed
Adapted Rand error: 0.5421684624091794
Adapted Rand precision: 0.2968781380256405
Adapted Rand recall: 0.9999664222191392
False Splits: 6.036024332525564
False Merges: 0.0825883711820654

## Method: Canny filter
Adapted Rand error: 0.0027247598212836177
Adapted Rand precision: 0.9946425605360896
Adapted Rand recall: 0.9999218934767155
False Splits: 0.20042002116129515
False Merges: 0.18076872508600775

## Method: Morphological Geodesic Active Contours
Adapted Rand error: 0.8346015951433162
Adapted Rand precision: 0.9191321393095933
Adapted Rand recall: 0.09087577915161697
False Splits: 0.6466330168716372
False Merges: 1.4656270133195097
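One way to read the three adapted Rand numbers together is to recompute the error from precision and recall. The sketch below assumes the default equal weighting of the two (alpha=0.5 in adapted_rand_error), in which case the error is one minus their harmonic mean. The helper name `rand_error_from_precision_recall` is hypothetical; the numbers are copied from the output above.

```python
def rand_error_from_precision_recall(precision, recall):
    """1 - F-score with equal weighting (assumes the default alpha=0.5)."""
    return 1.0 - 2.0 * precision * recall / (precision + recall)


# (error, precision, recall) as printed above for each method.
reported = {
    'Compact watershed': (
        0.5421684624091794, 0.2968781380256405, 0.9999664222191392),
    'Canny filter': (
        0.0027247598212836177, 0.9946425605360896, 0.9999218934767155),
    'Morphological Geodesic Active Contours': (
        0.8346015951433162, 0.9191321393095933, 0.09087577915161697),
}

recomputed = {}
for name, (error, precision, recall) in reported.items():
    recomputed[name] = rand_error_from_precision_recall(precision, recall)
    print(f'{name}: reported {error:.6f}, recomputed {recomputed[name]:.6f}')
```

Because the harmonic mean is dominated by the smaller of its two inputs, a single low value is enough to produce a large error: low precision for the compact watershed, low recall for the active contours. Only the Canny result scores well on both.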

Total running time of the script: (0 minutes 1.789 seconds)
