Layout Elements

Coordinate System

class layoutparser.elements.Interval(start, end, axis, canvas_height=None, canvas_width=None)

Bases: layoutparser.elements.base.BaseCoordElement

This class describes the coordinate system of an interval: a block defined by a pair of start and end points on the designated axis, with the same length as the base canvas on the other axis.

Parameters
  • start (numeric) – The coordinate of the start point on the designated axis.

  • end (numeric) – The end coordinate on the same axis as start.

  • axis (str) – The designated axis that the end points belong to ("x" or "y").

  • canvas_height (numeric, optional, defaults to 0) – The height of the canvas that the interval is on.

  • canvas_width (numeric, optional, defaults to 0) – The width of the canvas that the interval is on.

property height

Calculate the height of the interval. If the interval is along the x-axis, the height will be the height of the canvas; otherwise, it will be the difference between the start and end points.

Returns

Output the numeric value of the height.

Return type

numeric

property width

Calculate the width of the interval. If the interval is along the y-axis, the width will be the width of the canvas; otherwise, it will be the difference between the start and end points.

Returns

Output the numeric value of the width.

Return type

numeric

property coordinates

This property treats the interval as a rectangle and calculates the coordinates of the upper left and lower right corners that define it.

Returns

Output the numeric values of the coordinates in a Tuple of size four.

Return type

Tuple(numeric)

property points

Return the coordinates of all four corners of the interval in clockwise order, starting from the upper left.

Returns

A Numpy array of shape 4x2 containing the coordinates.

Return type

Numpy array

property center

Calculate the mid-point between the start and end point.

Returns

The coordinates of the center.

Return type

Tuple(numeric)

property area

Return the area of the region covered by the interval. The area is bounded by the canvas. If the interval is put on a canvas, the area equals interval width * canvas height (axis='x') or interval height * canvas width (axis='y'). Otherwise, the area is zero.
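The width, height, coordinates, and area rules described above can be sketched without the library. This is a minimal stand-in rather than layoutparser's implementation: the class name IntervalSketch is hypothetical, the canvas origin is assumed to be (0, 0), and a missing canvas is assumed to behave like a canvas of size 0.

```python
from dataclasses import dataclass


@dataclass
class IntervalSketch:
    """Hypothetical stand-in for an interval on a canvas (not the library class)."""

    start: float
    end: float
    axis: str  # "x" or "y"
    canvas_height: float = 0  # assumption: no canvas behaves like size 0
    canvas_width: float = 0

    @property
    def width(self):
        # Along the x-axis the interval has its own extent; otherwise it spans the canvas.
        return self.end - self.start if self.axis == "x" else self.canvas_width

    @property
    def height(self):
        # Along the y-axis the interval has its own extent; otherwise it spans the canvas.
        return self.end - self.start if self.axis == "y" else self.canvas_height

    @property
    def coordinates(self):
        # (x_1, y_1, x_2, y_2) of the covered rectangle, assuming the origin is (0, 0).
        if self.axis == "x":
            return (self.start, 0, self.end, self.canvas_height)
        return (0, self.start, self.canvas_width, self.end)

    @property
    def area(self):
        return self.width * self.height


on_canvas = IntervalSketch(10, 30, axis="x", canvas_height=100, canvas_width=200)
print(on_canvas.width, on_canvas.height, on_canvas.area)

no_canvas = IntervalSketch(10, 30, axis="x")
print(no_canvas.area)
```

Under these assumptions, an interval without a canvas has zero extent on the other axis, which is why its area is zero.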

put_on_canvas(canvas)

Set the height and the width of the canvas that the interval is on.

Parameters

canvas (Numpy array or BaseCoordElement or PIL.Image.Image) – The base element that the interval is on. A numpy array should be in the format [height, width].

Returns

A copy of the current Interval with its canvas height and width set to those of the input canvas.

Return type

Interval

condition_on(other)

Given the current element in coordinates relative to another element, which is itself in absolute coordinates, generate a new element representing the current element in absolute coordinates.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raise error when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the absolute coordinate system.

Return type

BaseCoordElement

relative_to(other)

Given the current element and another element, both in absolute coordinates, generate a new element representing the current element in coordinates relative to the other element.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raise error when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the relative coordinate system.

Return type

BaseCoordElement

is_in(other, soft_margin={}, center=False)

Identify whether the current element is within another element.

Parameters
  • other (BaseCoordElement) – The other layout element involved in the geometric operations.

  • soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.

  • center (bool, optional, defaults to False) – The toggle to determine whether the center (instead of the four corners) of the current element is in the other element.

Returns

True if the current element is in the other element and False if not.

Return type

bool

intersect(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)

Intersect the current shape with the other object, with operations defined in Shape Operations.

union(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)

Union the current shape with the other object, with operations defined in Shape Operations.

pad(left=0, right=0, top=0, bottom=0, safe_mode=True)

Pad the layout element on the four sides of the polygon by the user-defined number of pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.

Parameters
  • left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.

  • right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.

  • top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.

  • bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.

  • safe_mode (bool, optional, defaults to True) – Whether to cut off padding that would extend into negative coordinates.

Returns

The padded BaseCoordElement object.

Return type

BaseCoordElement

shift(shift_distance)

Shift the interval by a user-specified amount along the same axis that the interval is defined on.

Parameters

shift_distance (numeric) – The number of pixels used to shift the interval.

Returns

The shifted Interval object.

Return type

BaseCoordElement

scale(scale_factor)

Scale the interval by a user-specified amount along the same axis that the interval is defined on.

Parameters

scale_factor (numeric) – The amount for downscaling or upscaling the element.

Returns

The scaled Interval object.

Return type

BaseCoordElement
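As a rough illustration of how shifting and scaling act only along the interval's own axis, the sketch below works on plain (start, end) pairs. The helper names are hypothetical, and it assumes that scaling multiplies both end points by the factor (that is, scaling is relative to the origin); the library itself is not called.

```python
def shift_interval(start, end, shift_distance):
    """Move both end points by the same distance along the interval's axis."""
    return (start + shift_distance, end + shift_distance)


def scale_interval(start, end, scale_factor):
    """Scale both end points; assumption: scaling is relative to the origin."""
    return (start * scale_factor, end * scale_factor)


print(shift_interval(4, 5, 2))
print(scale_interval(4, 5, 2))
```

Note that under this assumption scaling changes the interval's position as well as its length.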

crop_image(image)

Crop the input image according to the coordinates of the element.

Parameters

image (Numpy array) – The array of the input image.

Returns

The array of the cropped image.

Return type

Numpy array

to_rectangle()

Convert the Interval to a Rectangle element.

Returns

The converted Rectangle object.

Return type

Rectangle

to_quadrilateral()

Convert the Interval to a Quadrilateral element.

Returns

The converted Quadrilateral object.

Return type

Quadrilateral

class layoutparser.elements.Rectangle(x_1, y_1, x_2, y_2)

Bases: layoutparser.elements.base.BaseCoordElement

This class describes the coordinate system of an axis-aligned rectangular box using two points, as indicated below:

(x_1, y_1) ----
|             |
|             |
|             |
---- (x_2, y_2)

Parameters
  • x_1 (numeric) – x coordinate on the horizontal axis of the upper left corner of the rectangle.

  • y_1 (numeric) – y coordinate on the vertical axis of the upper left corner of the rectangle.

  • x_2 (numeric) – x coordinate on the horizontal axis of the lower right corner of the rectangle.

  • y_2 (numeric) – y coordinate on the vertical axis of the lower right corner of the rectangle.

property height

Calculate the height of the rectangle.

Returns

Output the numeric value of the height.

Return type

numeric

property width

Calculate the width of the rectangle.

Returns

Output the numeric value of the width.

Return type

numeric

property coordinates

Return the coordinates of the two points that define the rectangle.

Returns

Output the numeric values of the coordinates in a Tuple of size four.

Return type

Tuple(numeric)

property points

Return the coordinates of all four corners of the rectangle in clockwise order, starting from the upper left.

Returns

A Numpy array of shape 4x2 containing the coordinates.

Return type

Numpy array

property center

Calculate the center of the rectangle.

Returns

The coordinates of the center.

Return type

Tuple(numeric)

property area

Return the area of the rectangle.

condition_on(other)

Given the current element in coordinates relative to another element, which is itself in absolute coordinates, generate a new element representing the current element in absolute coordinates.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raise error when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the absolute coordinate system.

Return type

BaseCoordElement

relative_to(other)

Given the current element and another element, both in absolute coordinates, generate a new element representing the current element in coordinates relative to the other element.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raise error when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the relative coordinate system.

Return type

BaseCoordElement
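The relationship between relative_to and condition_on can be sketched with plain (x_1, y_1, x_2, y_2) tuples. This is an illustration under assumptions, not the library's code: both elements are taken to be axis-aligned rectangles, and the upper left corner of the other element is assumed to serve as the origin of the relative coordinate system.

```python
def relative_to(box, other):
    """Express `box` (absolute) in coordinates relative to `other` (absolute)."""
    dx, dy = other[0], other[1]  # assumed origin: upper left corner of `other`
    x_1, y_1, x_2, y_2 = box
    return (x_1 - dx, y_1 - dy, x_2 - dx, y_2 - dy)


def condition_on(box, other):
    """Express `box` (relative to `other`) in absolute coordinates."""
    dx, dy = other[0], other[1]
    x_1, y_1, x_2, y_2 = box
    return (x_1 + dx, y_1 + dy, x_2 + dx, y_2 + dy)


figure = (100, 50, 400, 300)
caption_abs = (120, 260, 380, 290)

caption_rel = relative_to(caption_abs, figure)
print(caption_rel)
print(condition_on(caption_rel, figure) == caption_abs)
```

In this sketch the two operations are inverses of each other, which is handy when a block is detected inside a cropped region and then needs to be mapped back onto the full page.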

is_in(other, soft_margin={}, center=False)

Identify whether the current element is within another element.

Parameters
  • other (BaseCoordElement) – The other layout element involved in the geometric operations.

  • soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.

  • center (bool, optional, defaults to False) – The toggle to determine whether the center (instead of the four corners) of the current element is in the other element.

Returns

True if the current element is in the other element and False if not.

Return type

bool
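The effect of soft_margin and center can be illustrated with a small stand-alone check on (x_1, y_1, x_2, y_2) tuples. This is only a sketch: the function below is hypothetical, and it assumes that soft_margin uses the keys "left", "right", "top", and "bottom" to enlarge the other element.

```python
def is_in(box, other, soft_margin=None, center=False):
    """Check whether `box` lies inside `other`; both are (x_1, y_1, x_2, y_2)."""
    # Assumption: the margin keys mirror the four sides of the other element.
    margin = {"left": 0, "right": 0, "top": 0, "bottom": 0}
    margin.update(soft_margin or {})

    # Enlarge the other element by the soft margins.
    o_x1 = other[0] - margin["left"]
    o_y1 = other[1] - margin["top"]
    o_x2 = other[2] + margin["right"]
    o_y2 = other[3] + margin["bottom"]

    if center:
        # Only the center of the box has to fall inside the other element.
        c_x = (box[0] + box[2]) / 2
        c_y = (box[1] + box[3]) / 2
        return o_x1 <= c_x <= o_x2 and o_y1 <= c_y <= o_y2

    # Otherwise all four corners have to fall inside the other element.
    return o_x1 <= box[0] and o_y1 <= box[1] and box[2] <= o_x2 and box[3] <= o_y2


word = (5, 5, 25, 15)
column = (10, 0, 30, 20)

print(is_in(word, column))
print(is_in(word, column, center=True))
print(is_in(word, column, soft_margin={"left": 5}))
```

Here the word sticks out of the column on the left, so the strict check fails, while both the center-based check and the check with a 5-pixel left margin succeed.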

intersect(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)

Intersect the current shape with the other object, with operations defined in Shape Operations.

union(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)

Union the current shape with the other object, with operations defined in Shape Operations.

pad(left=0, right=0, top=0, bottom=0, safe_mode=True)

Pad the layout element on the four sides of the polygon by the user-defined number of pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.

Parameters
  • left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.

  • right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.

  • top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.

  • bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.

  • safe_mode (bool, optional, defaults to True) – Whether to cut off padding that would extend into negative coordinates.

Returns

The padded BaseCoordElement object.

Return type

BaseCoordElement
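A minimal sketch of the padding rule on an (x_1, y_1, x_2, y_2) tuple follows. The function is a hypothetical stand-in, and it assumes that safe_mode simply clamps the upper left corner at 0 so that no coordinate becomes negative.

```python
def pad(box, left=0, right=0, top=0, bottom=0, safe_mode=True):
    """Grow `box` outward by the given number of pixels on each side."""
    x_1, y_1, x_2, y_2 = box
    x_1, y_1 = x_1 - left, y_1 - top
    x_2, y_2 = x_2 + right, y_2 + bottom

    if safe_mode:
        # Assumption: excess padding on the negative side is cut off at 0.
        x_1, y_1 = max(x_1, 0), max(y_1, 0)

    return (x_1, y_1, x_2, y_2)


print(pad((5, 5, 20, 20), left=10, top=2))
print(pad((5, 5, 20, 20), left=10, top=2, safe_mode=False))
```

With safe_mode enabled, the 10-pixel left padding is truncated to the 5 pixels that are available before the coordinate would become negative.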

shift(shift_distance=0)

Shift the layout element by user-specified amounts along the x and y axes respectively. If shift_distance is a single numeric value, the element will be shifted by the same amount along both the x and y axes.

Parameters

shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.

Returns

The shifted BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement

scale(scale_factor=1)

Scale the layout element by user-specified amounts along the x and y axes respectively. If scale_factor is a single numeric value, the element will be scaled by the same amount along both the x and y axes.

Parameters

scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.

Returns

The scaled BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement
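The scalar-or-pair convention shared by shift and scale can be sketched as below. The helpers are hypothetical and operate on (x_1, y_1, x_2, y_2) tuples; the sketch assumes a pair is ordered (x, y) and that scaling multiplies the coordinates directly, that is, relative to the origin.

```python
def _as_pair(value):
    """Expand a single number to (value, value); pass an (x, y) pair through."""
    if isinstance(value, (list, tuple)):
        x, y = value
        return x, y
    return value, value


def shift(box, shift_distance=0):
    dx, dy = _as_pair(shift_distance)
    x_1, y_1, x_2, y_2 = box
    return (x_1 + dx, y_1 + dy, x_2 + dx, y_2 + dy)


def scale(box, scale_factor=1):
    # Assumption: coordinates are multiplied directly (scaling about the origin).
    fx, fy = _as_pair(scale_factor)
    x_1, y_1, x_2, y_2 = box
    return (x_1 * fx, y_1 * fy, x_2 * fx, y_2 * fy)


print(shift((0, 0, 10, 20), 5))
print(shift((0, 0, 10, 20), (1, 2)))
print(scale((2, 4, 10, 20), 2))
print(scale((2, 4, 10, 20), [0.5, 1]))
```

Passing a pair is useful when an image has been resized by different ratios horizontally and vertically.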

crop_image(image)

Crop the input image according to the coordinates of the element.

Parameters

image (Numpy array) – The array of the input image.

Returns

The array of the cropped image.

Return type

Numpy array

to_interval(axis, **kwargs)

to_quadrilateral()

class layoutparser.elements.Quadrilateral(points: Union[numpy.ndarray, List, List[List]], height=None, width=None)

Bases: layoutparser.elements.base.BaseCoordElement

This class describes the coordinate system of a four-sided polygon. A quadrilateral is defined by the coordinates of its 4 corners in clockwise order, starting with the upper left corner (as shown below):

points[0] -...- points[1]
|                      |
.                      .
.                      .
.                      .
|                      |
points[3] -...- points[2]

Parameters
  • points (Numpy array or list) – An np.ndarray of shape 4x2 holding the four corner coordinates, or a list of length 8 in the format [p0_x, p0_y, p1_x, p1_y, p2_x, p2_y, p3_x, p3_y], or a list of length 4 in the format [[p0_x, p0_y], [p1_x, p1_y], [p2_x, p2_y], [p3_x, p3_y]].

  • height (numeric, optional, defaults to None) – The height of the quadrilateral. This is to better support the perspective transformation from the OpenCV library.

  • width (numeric, optional, defaults to None) – The width of the quadrilateral. As with height, this is to better support the perspective transformation from the OpenCV library.
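The three accepted input formats for points can be brought into a single 4x2 form along the following lines. This is a sketch using plain lists, not the constructor's actual logic; normalize_points is a hypothetical helper, and array-like inputs are assumed to expose a tolist() method as NumPy arrays do.

```python
def _is_pair(item):
    return isinstance(item, (list, tuple)) and len(item) == 2


def normalize_points(points):
    """Return four [x, y] pairs from a flat list of 8 values or 4 pairs."""
    if hasattr(points, "tolist"):  # e.g. a NumPy array of shape 4x2
        points = points.tolist()
    points = list(points)

    # Flat format: [p0_x, p0_y, p1_x, p1_y, p2_x, p2_y, p3_x, p3_y]
    if len(points) == 8 and not any(isinstance(p, (list, tuple)) for p in points):
        return [[points[i], points[i + 1]] for i in range(0, 8, 2)]

    # Nested format: [[p0_x, p0_y], [p1_x, p1_y], [p2_x, p2_y], [p3_x, p3_y]]
    if len(points) == 4 and all(_is_pair(p) for p in points):
        return [list(p) for p in points]

    raise ValueError("expected 8 flat values or 4 [x, y] pairs")


flat = [0, 0, 10, 0, 10, 5, 0, 5]
nested = [[0, 0], [10, 0], [10, 5], [0, 5]]
print(normalize_points(flat) == normalize_points(nested))
```

Both inputs describe the same corners in clockwise order starting from the upper left, so they normalize to the same result.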

property height

Return the user defined height, otherwise the height of its circumscribed rectangle.

Returns

Output the numeric value of the height.

Return type

numeric

property width

Return the user defined width, otherwise the width of its circumscribed rectangle.

Returns

Output the numeric value of the width.

Return type

numeric

property coordinates

Return the coordinates of the upper left and lower right corner points that define the circumscribed rectangle.

Returns

Tuple(numeric): Output the numeric values of the coordinates in a Tuple of size four.
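The circumscribed rectangle, and the fallback of height and width to its dimensions, can be sketched as follows. The helper names are hypothetical and points is taken to be a list of four [x, y] pairs; this only illustrates the documented behavior and is not the library's code.

```python
def circumscribed_rectangle(points):
    """Return (x_1, y_1, x_2, y_2) of the smallest axis-aligned box around the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def quad_size(points, height=None, width=None):
    """Return (height, width): user-defined values win, else use the bounding box."""
    x_1, y_1, x_2, y_2 = circumscribed_rectangle(points)
    resolved_height = height if height is not None else y_2 - y_1
    resolved_width = width if width is not None else x_2 - x_1
    return resolved_height, resolved_width


corners = [[2, 1], [12, 3], [11, 9], [1, 8]]
print(circumscribed_rectangle(corners))
print(quad_size(corners))
print(quad_size(corners, height=5))
```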

property points

Return the coordinates of all four corners of the quadrilateral in clockwise order, starting from the upper left.

Returns

A Numpy array of shape 4x2 containing the coordinates.

Return type

Numpy array

property center

Calculate the center of the quadrilateral.

Returns

The coordinates of the center.

Return type

Tuple(numeric)

property area

Return the area of the quadrilateral.

property mapped_rectangle_points

property perspective_matrix

map_to_points_ordering(x_map, y_map)

condition_on(other)

Given the current element in coordinates relative to another element, which is itself in absolute coordinates, generate a new element representing the current element in absolute coordinates.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raise error when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the absolute coordinate system.

Return type

BaseCoordElement

relative_to(other)

Given the current element and another element, both in absolute coordinates, generate a new element representing the current element in coordinates relative to the other element.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raise error when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the relative coordinate system.

Return type

BaseCoordElement

is_in(other, soft_margin={}, center=False)

Identify whether the current element is within another element.

Parameters
  • other (BaseCoordElement) – The other layout element involved in the geometric operations.

  • soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.

  • center (bool, optional, defaults to False) – The toggle to determine whether the center (instead of the four corners) of the current element is in the other element.

Returns

True if the current element is in the other element and False if not.

Return type

bool

intersect(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)

Intersect the current shape with the other object, with operations defined in Shape Operations.

union(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)

Union the current shape with the other object, with operations defined in Shape Operations.

pad(left=0, right=0, top=0, bottom=0, safe_mode=True)

Pad the layout element on the four sides of the polygon by the user-defined number of pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.

Parameters
  • left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.

  • right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.

  • top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.

  • bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.

  • safe_mode (bool, optional, defaults to True) – Whether to cut off padding that would extend into negative coordinates.

Returns

The padded BaseCoordElement object.

Return type

BaseCoordElement

shift(shift_distance=0)

Shift the layout element by user-specified amounts along the x and y axes respectively. If shift_distance is a single numeric value, the element will be shifted by the same amount along both the x and y axes.

Parameters

shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.

Returns

The shifted BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement

scale(scale_factor=1)

Scale the layout element by user-specified amounts along the x and y axes respectively. If scale_factor is a single numeric value, the element will be scaled by the same amount along both the x and y axes.

Parameters

scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.

Returns

The scaled BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement

crop_image(image)

Crop the input image using the points of the quadrilateral instance.

Parameters

image (Numpy array) – The array of the input image.

Returns

The array of the cropped image.

Return type

Numpy array

to_interval(axis, **kwargs)

to_rectangle()

to_dict() → Dict[str, Any]

Generate a dictionary representation of the current object:

{
    "block_type": "quadrilateral",
    "points": [
        p[0,0], p[0,1],
        p[1,0], p[1,1],
        p[2,0], p[2,1],
        p[3,0], p[3,1]
    ],
    "height": value,
    "width": value
}

TextBlock

class layoutparser.elements.TextBlock(block, text=None, id=None, type=None, parent=None, next=None, score=None)

Bases: layoutparser.elements.base.BaseLayoutElement

This class stores content-related information about a layout element in addition to its coordinate definition (i.e., an Interval, Rectangle, or Quadrilateral).

Parameters
  • block (BaseCoordElement) – The shape-specific coordinate system that the text block belongs to.

  • text (str, optional, defaults to None) – The OCR'ed text results within the boundaries of the text block.

  • id (int, optional, defaults to None) – The id of the text block.

  • type (int, optional, defaults to None) – The type of the text block.

  • parent (int, optional, defaults to None) – The id of the parent object.

  • next (int, optional, defaults to None) – The id of the next block.

  • score (numeric, defaults to None) – The prediction confidence of the block.

property height

Return the height of the shape-specific block.

Returns

Output the numeric value of the height.

Return type

numeric

property width

Return the width of the shape-specific block.

Returns

Output the numeric value of the width.

Return type

numeric

property coordinates

Return the coordinates of the two corner points that define the shape-specific block.

Returns

Output the numeric values of the coordinates in a Tuple of size four.

Return type

Tuple(numeric)

property points

Return the coordinates of all four corners of the shape-specific block in clockwise order, starting from the upper left.

Returns

A Numpy array of shape 4x2 containing the coordinates.

Return type

Numpy array

property area

Return the area of the associated block.

condition_on(other)

Given the current element in coordinates relative to another element, which is itself in absolute coordinates, generate a new element representing the current element in absolute coordinates.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raise error when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the absolute coordinate system.

Return type

BaseCoordElement

relative_to(other)

Given the current element and another element, both in absolute coordinates, generate a new element representing the current element in coordinates relative to the other element.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raise error when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the relative coordinate system.

Return type

BaseCoordElement

is_in(other, soft_margin={}, center=False)

Identify whether the current element is within another element.

Parameters
  • other (BaseCoordElement) – The other layout element involved in the geometric operations.

  • soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.

  • center (bool, optional, defaults to False) – The toggle to determine whether the center (instead of the four corners) of the current element is in the other element.

Returns

True if the current element is in the other element and False if not.

Return type

bool

union(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)

Union the current shape with the other object, with operations defined in Shape Operations.

intersect(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)

Intersect the current shape with the other object, with operations defined in Shape Operations.

shift(shift_distance)

Shift the layout element by user-specified amounts along the x and y axes respectively. If shift_distance is a single numeric value, the element will be shifted by the same amount along both the x and y axes.

Parameters

shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.

Returns

The shifted BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement

pad(left=0, right=0, top=0, bottom=0, safe_mode=True)

Pad the layout element on the four sides of the polygon by the user-defined number of pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.

Parameters
  • left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.

  • right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.

  • top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.

  • bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.

  • safe_mode (bool, optional, defaults to True) – Whether to cut off padding that would extend into negative coordinates.

Returns

The padded BaseCoordElement object.

Return type

BaseCoordElement

scale(scale_factor)

Scale the layout element by user-specified amounts along the x and y axes respectively. If scale_factor is a single numeric value, the element will be scaled by the same amount along both the x and y axes.

Parameters

scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.

Returns

The scaled BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement

crop_image(image)

Crop the input image according to the coordinates of the element.

Parameters

image (Numpy array) – The array of the input image.

Returns

The array of the cropped image.

Return type

Numpy array

to_interval(axis: Optional[str] = None, **kwargs)

to_rectangle()

to_quadrilateral()

to_dict() → Dict[str, Any]

Generate a dictionary representation of the current textblock in the format:

{
    "block_type": <name of self.block>,
    <attributes of self.block combined with non-empty self._features>
}

classmethod from_dict(data: Dict[str, Any]) → layoutparser.elements.layout_elements.TextBlock

Initialize the textblock based on the dictionary representation. It generates the block based on the block_type and block_attr, and loads the textblock-specific features from the dict.

Parameters

data (dict) – The dictionary representation of the object.

Layout

class layoutparser.elements.Layout(blocks: Optional[List] = None, *, page_data: Dict = None)

Bases: collections.abc.MutableSequence

The Layout class is designed for processing a list of layout elements on a page. It stores the layout elements in a list together with the related page_data, and provides handy APIs for processing all the layout elements in batch.

Parameters
  • blocks (list) – A list of layout element blocks.

  • page_data (Dict, optional) – A dictionary storing the page (canvas) related information like height, width, etc. It should be passed in as a keyword argument to avoid any confusion. Defaults to None.

insert(key, value)

S.insert(index, value) – insert value before index

copy()

relative_to(other)

condition_on(other)

is_in(other, soft_margin={}, center=False)

sort(key=None, reverse=False, inplace=False) → Optional[layoutparser.elements.layout.Layout]

Sort the list of blocks based on the given key.

Parameters
  • key (callable, optional) – key specifies a function of one argument that is used to extract a comparison key from each list element. Defaults to None.

  • reverse (bool, optional) – reverse is a boolean value. If set to True, then the list elements are sorted as if each comparison were reversed. Defaults to False.

  • inplace (bool, optional) – Whether to perform the sort in place. If set to False, it will return another object instance with _block sorted in the order. Defaults to False.

Examples::
>>> import layoutparser as lp
>>> i = lp.Interval(4, 5, axis="y")
>>> l = lp.Layout([i, i.shift(2)])
>>> l.sort(key=lambda x: x.coordinates[1], reverse=True)

filter_by(other, soft_margin={}, center=False)

Return a Layout object containing the elements that are in the other object.

Parameters

other (BaseCoordElement) – The block to filter the current elements.

Returns

A new layout object after filtering.

Return type

Layout
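Conceptually, filtering keeps only the elements for which an is_in-style check against the other block succeeds, and returns them in a new container. The sketch below shows this on plain (x_1, y_1, x_2, y_2) tuples; both helpers are hypothetical stand-ins that ignore soft_margin and center, and they do not call the library.

```python
def is_in(box, other):
    """Strict containment of one (x_1, y_1, x_2, y_2) box in another."""
    return (
        other[0] <= box[0]
        and other[1] <= box[1]
        and box[2] <= other[2]
        and box[3] <= other[3]
    )


def filter_by(boxes, other):
    """Return a new list holding only the boxes that lie inside `other`."""
    return [box for box in boxes if is_in(box, other)]


blocks = [(10, 10, 50, 30), (10, 200, 50, 230), (60, 20, 90, 40)]
header_region = (0, 0, 100, 50)

kept = filter_by(blocks, header_region)
print(kept)
print(len(blocks))
```

The original list is left untouched, mirroring the documented behavior of returning a new layout object.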

shift(shift_distance)

Shift all layout elements by user-specified amounts along the x and y axes respectively. If shift_distance is a single numeric value, the elements will be shifted by the same amount along both the x and y axes.

Parameters

shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the elements.

Returns

A new layout object with all the elements shifted in the specified values.

Return type

Layout

pad(left=0, right=0, top=0, bottom=0, safe_mode=True)

Pad all layout elements on the four sides of the polygon by the user-defined number of pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.

Parameters
  • left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.

  • right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.

  • top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.

  • bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.

  • safe_mode (bool, optional, defaults to True) – Whether to cut off padding that would extend into negative coordinates.

Returns

A new layout object with all the elements padded in the specified values.

Return type

Layout

scale(scale_factor)

Scale all layout elements by user-specified amounts along the x and y axes respectively. If scale_factor is a single numeric value, the elements will be scaled by the same amount along both the x and y axes.

Parameters

scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the elements.

Returns

A new layout object with all the elements scaled in the specified values.

Return type

Layout

crop_image(image)

get_texts()

Iterate through all the text blocks in the list and append their OCR'ed text results.

Returns

A list of text strings of the text blocks in the list of layout elements.

Return type

List[str]

get_info(attr_name)

Given a user-provided attribute name, check all the elements in the list and return the corresponding attribute values.

Parameters

attr_name (str) – The text string of certain attribute name.

Returns

The list of the corresponding attribute value (if exist) of each element in the list.

Return type

List
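The "if exist" behavior can be pictured as an attribute lookup that skips elements lacking the attribute. The following is a sketch on simple stand-in objects; the function is hypothetical, and the skipping of elements without the attribute is an assumption drawn from the description above.

```python
from types import SimpleNamespace


def get_info(elements, attr_name):
    """Collect `attr_name` from every element that actually has it."""
    # Assumption: elements without the attribute are skipped, not reported as None.
    return [getattr(e, attr_name) for e in elements if hasattr(e, attr_name)]


blocks = [
    SimpleNamespace(text="Title", score=0.9),
    SimpleNamespace(text="Body"),
    SimpleNamespace(score=0.4),
]

print(get_info(blocks, "text"))
print(get_info(blocks, "score"))
```

Because missing attributes are skipped in this sketch, the returned list can be shorter than the list of elements.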

to_dict() → Dict[str, Any]

Generate a dict representation of the layout object with the page_data and all the blocks in their dict representations.

Returns

The dictionary representation of the layout object.

Return type

Dict

get_homogeneous_blocks() → List[layoutparser.elements.base.BaseLayoutElement]

Convert all elements into blocks of the same type based on the type casting rule:

Interval < Rectangle < Quadrilateral < TextBlock

Returns

A list of base layout elements of the maximal compatible type.

Return type

List[BaseLayoutElement]
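The casting rule amounts to picking the highest-ranked type present in the list. A minimal sketch on type names (not the library's implementation; both names below are hypothetical) could look like this:

```python
# Ordering taken from the casting rule: Interval < Rectangle < Quadrilateral < TextBlock
CAST_ORDER = ("Interval", "Rectangle", "Quadrilateral", "TextBlock")


def maximal_compatible_type(type_names):
    """Return the highest-ranked type name that appears in `type_names`."""
    return max(type_names, key=CAST_ORDER.index)


print(maximal_compatible_type(["Rectangle", "Interval", "Quadrilateral"]))
print(maximal_compatible_type(["Interval", "Interval"]))
```

Every element would then be converted to that type; for instance, a list mixing intervals and rectangles would be treated as a list of rectangles.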

to_dataframe(enforce_same_type=False) → pandas.core.frame.DataFrame

Convert the layout object into a dataframe. Warning: the page data won't be exported.

Parameters

enforce_same_type (bool, optional) – If True, it will convert all the contained blocks to the maximal compatible data type. Defaults to False.

Returns

The dataframe representation of layout object

Return type

pd.DataFrame