Shape Operations

[BETA: the API and behavior will be changed in the future.]

Starting from v0.2, Layout Parser supports two types of shape operations, union and intersection, across all BaseCoordElements and TextBlock. We’ve made some design choices to construct a set of generalized APIs across different shape classes, detailed as follows:

The union Operation

▲ Illustration of Union Operations. The resulting matrix is symmetric, so the lower triangular region is left empty. Each cell shows a visualization of the shape objects, their coordinates, and their object class. In the output visualization, the gray and dashed lines delineate the original obj1 and obj2, respectively, for reference.

Notes:

  1. The x-interval and y-interval are both from the Interval class but with different axes. The union of two intervals on different axes is ill-defined, so in this case Layout Parser will raise an InvalidShapeError.

  2. The union of two rectangles is still a rectangle, which is the minimum covering rectangle of the two input rectangles.

  3. For the outputs associated with Quadrilateral inputs, please see the details in the Problems related to the Quadrilateral Class section.
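The first two notes can be illustrated with a small standalone sketch. It does not use Layout Parser itself: the Interval and Rectangle dataclasses, the union_intervals and union_rectangles helpers, and the locally defined InvalidShapeError are hypothetical stand-ins that only mirror the behavior described above. Treating the union of two same-axis intervals as their smallest covering interval is an assumption made by analogy with note 2.

```python
from dataclasses import dataclass


class InvalidShapeError(Exception):
    """Local stand-in for the error named in note 1 (not imported from Layout Parser)."""


@dataclass(frozen=True)
class Interval:
    start: float
    end: float
    axis: str  # "x" or "y"


@dataclass(frozen=True)
class Rectangle:
    x_1: float
    y_1: float
    x_2: float
    y_2: float


def union_intervals(a: Interval, b: Interval) -> Interval:
    # Note 1: intervals on different axes cannot be combined.
    if a.axis != b.axis:
        raise InvalidShapeError("cannot union intervals on different axes")
    # Assumption: the union is the smallest interval covering both inputs.
    return Interval(min(a.start, b.start), max(a.end, b.end), a.axis)


def union_rectangles(a: Rectangle, b: Rectangle) -> Rectangle:
    # Note 2: the result is the minimum rectangle covering both inputs.
    return Rectangle(
        min(a.x_1, b.x_1),
        min(a.y_1, b.y_1),
        max(a.x_2, b.x_2),
        max(a.y_2, b.y_2),
    )
```

For example, under these assumptions the union of the rectangles (0, 0, 10, 10) and (5, 5, 20, 15) is (0, 0, 20, 15), which also covers area that belongs to neither input.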

The intersect Operation

▲ Illustration of Intersection Operations. Similar to the previous visualization, the resulting matrix is symmetric, so the lower triangular region is left empty. Each cell shows a visualization of the shape objects, their coordinates, and their object class. In the output visualization, the gray and dashed lines delineate the original obj1 and obj2, respectively, for reference.

Problems related to the Quadrilateral Class

Performing shape operations on Quadrilateral objects can generate arbitrary shapes. Layout Parser does not currently support Polygon objects (though we plan to support them in the near future), so it is tricky to support these operations for Quadrilateral. The temporary solution is as follows:

  1. By default, when performing shape operations on Quadrilateral objects, Layout Parser will raise a NotSupportedShapeError.

  2. A workaround is to set strict=False in the input (e.g., obj1.union(obj2, strict=False)). In this case, any Quadrilateral objects are first converted to Rectangles, and then the operation is executed. The results may not be strictly equivalent to those performed on the original objects.
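The effect of the strict flag can be sketched without the library. The code below is an illustration only, not Layout Parser's implementation: bounding_box, union_quadrilaterals, and the local NotSupportedShapeError class are hypothetical names, quadrilaterals are represented as plain lists of four (x, y) points, and converting a quadrilateral to its axis-aligned bounding rectangle is an assumption about what the conversion to a rectangle involves.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_1, y_1, x_2, y_2)


class NotSupportedShapeError(Exception):
    """Local stand-in for the error named above (not imported from Layout Parser)."""


def bounding_box(points: List[Point]) -> Box:
    # Assumption: a quadrilateral is replaced by its axis-aligned bounding rectangle.
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def union_quadrilaterals(quad1: List[Point], quad2: List[Point], strict: bool = True) -> Box:
    if strict:
        # Strict mode: the exact result may not be a quadrilateral, so refuse.
        raise NotSupportedShapeError("union of quadrilaterals is not supported in strict mode")
    # Non-strict mode: convert both inputs to rectangles, then take the covering rectangle.
    b1, b2 = bounding_box(quad1), bounding_box(quad2)
    return (min(b1[0], b2[0]), min(b1[1], b2[1]), max(b1[2], b2[2]), max(b1[3], b2[3]))
```

Because each quadrilateral is enlarged to a rectangle before the operation, the non-strict result can cover more area than the exact union of the original shapes, which is why the results are described as not strictly equivalent.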