Load COCO Layout Annotations

Preparation

In this notebook, I will illustrate how to use LayoutParser to load and visualize layout annotations in the COCO format.

Before starting, please download the PubLayNet annotations and images from their website (let’s just use the validation set for now, as the training set is very large). Then put all extracted files in the data/publaynet/annotations and data/publaynet/val folders.

We also need to install an additional library for conveniently handling the COCO data format:

pip install pycocotools

OK, let’s get to the code:

Loading and visualizing layouts using Layout-Parser

from pycocotools.coco import COCO
import layoutparser as lp
import random
import cv2

def load_coco_annotations(annotations, coco=None):
    """
    Args:
        annotations (List):
            a list of coco annotations for the current image
        coco (`optional`, defaults to `None`):
            COCO annotation object instance. If set, this function will
            convert the loaded annotation category ids to category names
            set in COCO.categories
    """
    layout = lp.Layout()
    for ele in annotations:
        # COCO boxes are [x, y, width, height]; lp.Rectangle takes corner coordinates
        x, y, w, h = ele['bbox']
        layout.append(
            lp.TextBlock(
                block=lp.Rectangle(x, y, w + x, h + y),
                type=ele['category_id'] if coco is None else coco.cats[ele['category_id']]['name'],
                id=ele['id'],
            )
        )
    return layout

The load_coco_annotations function converts COCO annotations into layoutparser objects.
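The only arithmetic in that conversion is the change of bounding-box convention: a COCO box is stored as [x, y, width, height], while lp.Rectangle is built from the two corners (x_1, y_1, x_2, y_2). As a minimal sketch of just that step in plain Python (the helper name coco_bbox_to_corners and the example numbers are made up for illustration and are not part of LayoutParser or pycocotools):

```python
def coco_bbox_to_corners(bbox):
    """Convert a COCO-style [x, y, width, height] box to (x_1, y_1, x_2, y_2)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


# Made-up example box: top-left corner at (10, 20), 30 wide and 40 tall
corners = coco_bbox_to_corners([10, 20, 30, 40])
print(corners)  # (10, 20, 40, 60)
```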

COCO_ANNO_PATH = 'data/publaynet/annotations/val.json'
COCO_IMG_PATH = 'data/publaynet/val'

coco = COCO(COCO_ANNO_PATH)

loading annotations into memory...
Done (t=1.17s)
creating index...
index created!
color_map = {
    'text': 'red',
    'title': 'blue',
    'list': 'green',
    'table': 'purple',
    'figure': 'pink',
}

# random.sample needs a sequence, so turn the image ids into a list first
for image_id in random.sample(list(coco.imgs.keys()), 1):
    image_info = coco.imgs[image_id]
    annotations = coco.loadAnns(coco.getAnnIds([image_id]))

    image = cv2.imread(f'{COCO_IMG_PATH}/{image_info["file_name"]}')
    layout = load_coco_annotations(annotations, coco)

    viz = lp.draw_box(image, layout, color_map=color_map)
    display(viz)  # show the results
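Before plotting, it can help to check which block types a page actually contains. The following is a minimal sketch that counts annotations per category name using plain dictionaries; the helper name count_categories and the sample data are made up for illustration, and the data only mimics the shape of coco.cats and of the list returned by coco.loadAnns.

```python
from collections import Counter


def count_categories(annotations, cats):
    """Count annotations per category name.

    `annotations` is a list of COCO annotation dicts, and `cats` maps a
    category id to a dict with a 'name' key (the shape of `coco.cats`).
    """
    return Counter(cats[ann['category_id']]['name'] for ann in annotations)


# Made-up stand-ins for `coco.cats` and the result of `coco.loadAnns(...)`
example_cats = {1: {'name': 'text'}, 2: {'name': 'title'}}
example_annotations = [
    {'id': 101, 'category_id': 1, 'bbox': [10, 20, 30, 40]},
    {'id': 102, 'category_id': 1, 'bbox': [10, 70, 30, 40]},
    {'id': 103, 'category_id': 2, 'bbox': [10, 5, 30, 10]},
]

counts = count_categories(example_annotations, example_cats)
print(counts['text'], counts['title'])  # 2 1
```

With the real data, the same call would be count_categories(annotations, coco.cats).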

You can add more information to the visualization, for example by showing each block’s id and type next to its box.

lp.draw_box(
    image,
    [b.set(id=f'{b.id}/{b.type}') for b in layout],
    color_map=color_map,
    show_element_id=True,
    id_font_size=10,
    id_text_background_color='grey',
    id_text_color='white',
)

Model Predictions on loaded data

We can also check how a trained layout model performs on the input image. Following this instruction, we can conveniently load a layout prediction model and run predictions on the existing image.

model = lp.Detectron2LayoutModel(
    'lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config',
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
    label_map={0: "text", 1: "title", 2: "list", 3: "table", 4: "figure"},
)

layout_predicted = model.detect(image)

lp.draw_box(
    image,
    [b.set(id=f'{b.type}/{b.score:.2f}') for b in layout_predicted],
    color_map=color_map,
    show_element_id=True,
    id_font_size=10,
    id_text_background_color='grey',
    id_text_color='white',
)
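To go one step beyond comparing the two pictures by eye, a common way to score a predicted box against an annotated one is intersection over union (IoU). This is not part of the original notebook; it is a minimal sketch in plain Python that assumes both boxes are given as (x_1, y_1, x_2, y_2) tuples. The function name box_iou and the example boxes are made up for illustration.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x_1, y_1, x_2, y_2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero when the boxes do not intersect
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Two made-up 10x10 boxes that overlap on half their width:
# overlap area is 50, union area is 150, so the IoU is one third
iou = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
print(round(iou, 3))
```

For the blocks above, the corner tuples could presumably be taken from each block’s coordinates attribute, for example box_iou(layout[0].coordinates, layout_predicted[0].coordinates); treat that pairing as an assumption, since matching predictions to annotations properly needs its own logic.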