JPH02210586A


Info

Publication number: JPH02210586A
Application number: JP63171729A
Authority: JP
Inventors: Noboru Shimizu; 昇清水
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1988-07-12
Filing date: 1988-07-12
Publication date: 1990-08-21
Anticipated expiration: 2014-05-10
Also published as: JP2890306B2

Abstract

PURPOSE: To extract an entire table area, including the characters in the table, by extracting the line segments that make up the table as long black pixel strings, detecting the coordinates of the start and end points of each line segment, determining from those coordinates the diagonal coordinates of a rectangle representing the table area, and extracting the table area according to the diagonal coordinates. CONSTITUTION: A picture input means 1 digitizes and inputs a picture including a table, and a first image memory 2 stores the input picture. A line segment extracting means 3 extracts black pixel strings that continue in the horizontal and vertical directions for at least a predetermined threshold length, and a second image memory 4 stores the extracted black pixel strings. A rectangle coordinate detecting means 5 detects the coordinates of the rectangle representing the table area based on the black pixel strings stored in the second image memory 4, and a table area extracting means 6 extracts the table area from the first image memory 2 according to the detected coordinates. Thus, document picture recognition and document editing can be executed efficiently, and the table area including its characters can be extracted.

Description

Translated from Japanese

Detailed Description of the Invention

(Field of Industrial Application) The present invention relates to a device that separates table areas from documents in which characters, figures, tables, photographs, and the like are mixed. It is intended for document image recognition of general documents, that is, documents that have not been marked up by a person to aid recognition.

(Prior Art) Conventional character/figure separation processing only separates characters from figures. That is, as shown in FIG. 2, for a document image 70 containing a table, it merely separates the table lines from the characters (both the characters outside the table area, such as body text and headings, and the characters inside the table area) and extracts a character image 71 and a line image 72. It cannot separate the table area together with the characters inside it, so it cannot extract a character area image 73 and a table area image 74.

Because such conventional techniques cannot extract a table area that includes the characters in the table, they have the following problems.

(1) In document image recognition, the body text area and the characters in a table cannot be distinguished during character recognition, so efficient character recognition that makes use of context is not possible.

(2) In a table, the same characters, characters with the same meaning, or characters with opposite meanings are often used in the same row (or column), but this structural meaning of the table cannot be exploited to perform character recognition efficiently.

(3) In document editing, only the table lines can be edited; the table area as a meaningful unit cannot be edited.

(Problems to be Solved by the Invention) An object of the present invention is to make it possible to extract a table area including its characters, so that document image recognition and document editing can be performed efficiently.

(Means for Solving the Problems) The present invention is a table area separation device comprising: image input means for digitizing and inputting an image including a table; a first image memory for storing the input image; line segment extraction means for extracting black pixel strings that continue in the horizontal or vertical direction for at least a predetermined threshold length; a second image memory for storing the extracted black pixel strings; rectangle coordinate detection means for detecting the coordinates of a rectangle representing the table area based on the black pixel strings stored in the second image memory; and table area extraction means for extracting the table area from the first image memory using the detected coordinates of the rectangle representing the table area.

(Operation) In the present invention, a general document containing a table, that is, a document that has not been marked up for human-assisted processing, is input in digital form by the image input means, and the original image is stored in the first image memory. The line segment extraction means extracts from the original image the black pixel strings that run long and continuously in the horizontal or vertical direction. The extracted strings, that is, an image consisting only of horizontal line segments and an image consisting only of vertical line segments, are stored in the second image memory. The rectangle coordinate detection means detects the start and end points of each line segment present in the two images, confirms whether a table area exists, and determines the diagonal coordinates representing the table area. The table area extraction means then uses these coordinates to separate the original image stored in the first image memory into a table area image and a character area image (the image outside the table area).

(Embodiment) FIG. 1 shows an embodiment of the present invention. This table area separation device consists of an image input unit 1, a first image memory 2, a long run length extraction unit 3, a second image memory 4, a rectangle coordinate detection unit 5, and a table area extraction unit 6.

The image input unit 1 inputs a document image containing body text and tables. For example, the original image 70 of FIG. 2(a) is input as binary digital data.

The first image memory 2 stores the input binary digital data.

The long run length extraction unit 3 scans the image in the first image memory in the horizontal and vertical directions and extracts the long run lengths (long strings of consecutive black pixels) that are at or above a predetermined threshold.

The method for extracting long horizontal run lengths of black pixels is explained with reference to FIG. 3. FIG. 3(a) shows part of the original image: each square corresponds to one dot, the hatched squares are black pixels, the white squares are white pixels, and the thick vertical lines mark the byte boundaries in memory.

This image is scanned in the horizontal direction, and black runs at or above the threshold (10 dots in this example) are extracted.
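The horizontal extraction step can be sketched as follows. This is a minimal illustration, not the device's actual implementation: it assumes the binary image is held as a list of rows of 0/1 integers (1 = black) rather than as packed bytes, and the function name `extract_long_runs` is hypothetical.

```python
def extract_long_runs(image, threshold=10):
    """Return a copy of `image` that keeps only horizontal runs of black
    pixels (value 1) whose length is at least `threshold` dots."""
    out = [[0] * len(row) for row in image]
    for y, row in enumerate(image):
        x = 0
        width = len(row)
        while x < width:
            if row[x] == 1:
                start = x
                # Advance to the end of the current black run.
                while x < width and row[x] == 1:
                    x += 1
                # Keep the run only if it reaches the threshold.
                if x - start >= threshold:
                    for i in range(start, x):
                        out[y][i] = 1
            else:
                x += 1
    return out


sample = [
    [1] * 12 + [0] * 4,   # a 12-dot run: kept
    [1] * 5 + [0] * 11,   # a 5-dot run (e.g. part of a character): dropped
]
lines_only = extract_long_runs(sample, threshold=10)
```

Short runs, which typically belong to characters, disappear, while ruled lines survive.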

The result of this horizontal extraction is shown in FIG. 3(b). The method for extracting long vertical run lengths of black pixels is explained with reference to FIG. 4. FIG. 4(a) shows part of the original image. The memory used in this embodiment is accessed in byte units, so if the image is scanned vertically in the layout of FIG. 4(a), then, unlike horizontal scanning, every read of a single dot reads out the entire byte containing that dot.

In other words, horizontal scanning can read 8 dots in a single access, whereas vertical scanning with the layout of FIG. 4(a) requires 8 memory accesses to read 8 dots, so the scan takes longer.
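The access-count argument above can be illustrated with a small sketch. It assumes a 1-bit-per-pixel bitmap packed 8 dots to a byte with the leftmost dot in the most significant bit; the bit order and the class name `PackedBitmap` are assumptions for illustration, not details given in the source.

```python
class PackedBitmap:
    """Byte-addressable bitmap: 8 dots per byte, leftmost dot in the MSB (assumed)."""

    def __init__(self, width, height):
        self.stride = (width + 7) // 8          # bytes per row
        self.data = bytearray(self.stride * height)
        self.reads = 0                          # counts byte accesses

    def read_byte(self, byte_x, y):
        self.reads += 1
        return self.data[y * self.stride + byte_x]

    def pixel(self, x, y):
        # Reading one dot still costs one whole-byte access.
        byte = self.read_byte(x // 8, y)
        return (byte >> (7 - x % 8)) & 1


bmp = PackedBitmap(16, 16)

bmp.read_byte(0, 0)              # 8 horizontally adjacent dots: one access
horizontal_reads = bmp.reads

bmp.reads = 0
for y in range(8):               # 8 vertically adjacent dots: one access each
    bmp.pixel(0, y)
vertical_reads = bmp.reads
```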

Therefore, so that vertical scanning can be done in the same way as horizontal scanning, an image obtained by rotating the original image by 90 degrees is created in memory, as shown in FIG. 4(b). The same processing as for the horizontal direction is then applied to the rotated image.
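The rotate-then-scan approach can be sketched as below. The source does not state the direction of rotation, so clockwise rotation is an assumption here, as are the list-of-rows image representation and the helper names.

```python
from itertools import groupby


def long_runs_in_row(row, threshold):
    """Keep only runs of 1s that are at least `threshold` long in one row."""
    out = []
    for value, group in groupby(row):
        run = list(group)
        keep = value == 1 and len(run) >= threshold
        out.extend(run if keep else [0] * len(run))
    return out


def rotate_cw(image):
    """Rotate a list-of-rows image by 90 degrees clockwise (assumed direction)."""
    return [list(col) for col in zip(*image[::-1])]


def extract_vertical_runs(image, threshold):
    """Find long vertical runs by rotating, then reusing the horizontal scan.
    The result is left in the rotated frame, as in the text."""
    rotated = rotate_cw(image)
    return [long_runs_in_row(row, threshold) for row in rotated]


# A 4-row, 3-column image with one full-height vertical line in column 1
# and a stray dot in column 2.
page = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
vertical_lines = extract_vertical_runs(page, threshold=3)
```

Each row of the rotated result corresponds to one column of the original image.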

Through this processing, two images are created in the second image memory 4, as shown in FIG. 5: an image (a) containing only horizontal line segments and an image (b) containing only vertical line segments. Note that the image containing only vertical line segments remains rotated by 90 degrees.

For the two images in the second image memory 4, the rectangle coordinate detection unit 5 detects the diagonal coordinates of the table (the upper-left and lower-right coordinates, or the upper-right and lower-left coordinates). Specifically, the image containing only horizontal line segments, as shown in FIG. 5(a), is first scanned in the scanning direction 77 of that figure (direction 77 is used here, but the opposite direction would also work) to obtain the x and y coordinates of the start and end points of the horizontal line segments 75. The image containing only vertical line segments, as shown in FIG. 5(b), is likewise scanned in the scanning direction 77 to obtain the x and y coordinates of the start and end points of the line segments 76. Note that the coordinate system in this case is rotated by 90 degrees, as shown in FIG. 5(b).
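A rough sketch of the start/end point detection on a line-only image follows. It assumes the image is a list of rows of 0/1 values scanned left to right, top to bottom; the name `find_segments` is hypothetical, and a ruled line thicker than one dot would be reported as several segments in this simplified form.

```python
def find_segments(line_image):
    """Return ((x_start, y), (x_end, y)) for every run of 1s in each row."""
    segments = []
    for y, row in enumerate(line_image):
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                segments.append(((start, y), (x - 1, y)))
            else:
                x += 1
    return segments


horizontal_only = [
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
]
segments = find_segments(horizontal_only)
```

Applied to the rotated vertical-line image, the same function yields coordinates in the rotated frame.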

From the x and y coordinates of the horizontal and vertical line segments, it is confirmed that line segments 75 and 76 intersect one another (this check removes line segments that do not form part of a table, such as underlines). Then, for the intersecting line segments only, the minimum x coordinate x0 among the start points of the horizontal line segments and the maximum x coordinate x1 among their end points are obtained, together with the minimum y coordinate y0 among the end points of the vertical line segments and the maximum y coordinate y1 among their start points. This gives the upper-left coordinates (x0, y0) and the lower-right coordinates (x1, y1) of the table.
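The intersection check and the min/max computation can be sketched as follows. One simplifying assumption is made loudly here: the vertical segments are taken to have already been mapped back into the original (unrotated) coordinate system, so the roles of start and end points differ from the rotated-frame description above. The tuple layouts and function names are hypothetical.

```python
def intersects(h, v):
    """h = (x_start, x_end, y) is horizontal; v = (x, y_start, y_end) is vertical.
    Both are in the original image frame (assumption)."""
    hx0, hx1, hy = h
    vx, vy0, vy1 = v
    return hx0 <= vx <= hx1 and vy0 <= hy <= vy1


def table_bounding_box(horizontals, verticals):
    """Return ((x0, y0), (x1, y1)) for the table, or None if no table is found."""
    # Keep only segments that cross a segment of the other orientation;
    # this discards underlines and other isolated rules.
    hs = [h for h in horizontals if any(intersects(h, v) for v in verticals)]
    vs = [v for v in verticals if any(intersects(h, v) for h in horizontals)]
    if not hs or not vs:
        return None
    x0 = min(h[0] for h in hs)
    x1 = max(h[1] for h in hs)
    y0 = min(v[1] for v in vs)
    y1 = max(v[2] for v in vs)
    return (x0, y0), (x1, y1)


horizontals = [(10, 50, 20), (10, 50, 40), (5, 60, 90)]  # last one is an underline
verticals = [(10, 20, 40), (50, 20, 40)]
box = table_bounding_box(horizontals, verticals)
```

Taking the extremes over all intersecting segments, rather than the first and last lines found, is what lets the same computation cover tables whose outer frame is incomplete.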

The upper-left and lower-right coordinates are not simply taken from the x and y coordinates of the start point of the first horizontal line found in the line group 75 and of the end point of the last horizontal line found. This is so that open tables, such as the one in FIG. 6 (tables not enclosed by horizontal and vertical lines), can also be handled.

Using the upper-left coordinates (x0, y0) and the lower-right coordinates (x1, y1) passed from the rectangle coordinate detection unit 5, the table area extraction unit 6 cuts out the corresponding rectangular area of the first image memory 2 and thereby extracts a table area 74 that includes the characters in the table, as in FIG. 2(e). In addition, by filling the inside of the rectangle with white, it obtains the image 73 outside the table area, as in FIG. 2(d).
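The cut-out and white-fill steps can be sketched as below, assuming a list-of-rows image with 0 = white and 1 = black and inclusive corner coordinates; the function name `split_table_area` is hypothetical.

```python
def split_table_area(image, top_left, bottom_right):
    """Return (table_image, text_image) for an inclusive bounding rectangle."""
    (x0, y0), (x1, y1) = top_left, bottom_right
    # Cut the rectangle out of the original image: table lines plus characters.
    table = [row[x0:x1 + 1] for row in image[y0:y1 + 1]]
    # Copy the original and paint the rectangle white to leave only the rest.
    text = [row[:] for row in image]
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            text[y][x] = 0
    return table, text


original = [[1] * 4 for _ in range(4)]
table_image, text_image = split_table_area(original, (1, 1), (2, 2))
```

The original image is left untouched, so both outputs can be derived from the same stored copy.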

In the above embodiment, a byte-addressable memory is used, and the long run length extraction unit 3 rotates the image by 90 degrees so that the vertical direction can also be scanned efficiently. If a memory that can also be accessed in bit units is used instead, the vertical direction can be scanned as efficiently as the horizontal direction without using a rotated coordinate system such as that of FIG. 5(b).

Furthermore, by specifying the structure of the tables to be extracted, for example by restricting them to tables that are always closed by horizontal and vertical lines, simplifications such as omitting the extraction of vertical line segments become possible. For example, if only closed tables are targeted and the document contains no underlines, the table area can be extracted merely by obtaining the x and y coordinates of the start point of the topmost horizontal line and of the end point of the bottommost horizontal line.
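A minimal sketch of this simplified variant is shown below. It assumes horizontal segments are given as hypothetical `(x_start, x_end, y)` tuples, that the page holds a single closed table, and that there are no underlines, as the text requires.

```python
def closed_table_box(horizontals):
    """Bounding box from the topmost and bottommost horizontal lines only."""
    if not horizontals:
        return None
    top = min(horizontals, key=lambda h: h[2])       # smallest y
    bottom = max(horizontals, key=lambda h: h[2])    # largest y
    # Start point of the top line and end point of the bottom line.
    return (top[0], top[2]), (bottom[1], bottom[2])


rules = [(10, 50, 40), (10, 50, 20), (10, 50, 30)]
simple_box = closed_table_box(rules)
```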

(Effects of the Invention) As described above, the present invention extracts the line segments that are the elements of a table by extracting long black pixel strings, detects the coordinates of the start and end points of each line segment, determines from those coordinates the diagonal coordinates of the rectangle representing the table area, and extracts the table area using those diagonal coordinates. Unlike the prior art, which extracts only the line segments of a table, it can therefore extract the entire table area, including the characters in the table. Accordingly, when the present invention is used as preprocessing for document image recognition, it provides separated table and character areas that meet the conditions needed for recognizing document images efficiently, which makes it extremely useful.

In addition, because the present invention extracts the table area through the simple operations of extracting long black pixel strings and detecting and comparing the coordinates of the start and end points of each string, the configuration of the device can be kept simple and the processing can be made faster.

Brief Description of the Drawings

FIG. 1 is a block diagram showing one embodiment of the present invention.
FIG. 2 is a diagram for explaining how characters and tables are separated, in which (a) shows the original input image, (b) a character image obtained by the conventional method, (c) a line image obtained by the conventional method, (d) an example of a character area image, and (e) an example of a table area image.
FIG. 3 is an explanatory diagram for extracting horizontally long black pixel strings.
FIG. 5 is an explanatory diagram for extracting the rectangle coordinates of the table area from the long black pixel strings.
FIG. 6 is a diagram showing an example of an open table (a table not enclosed on all four sides by horizontal and vertical lines).

1: image input unit; 2: first image memory; 3: long run length extraction unit; 4: second image memory; 5: rectangle coordinate detection unit; 6: table area extraction unit; 70: original image; 71: character image obtained by the conventional method; 72: line image obtained by the conventional method; 73: character area image according to the present invention; 74: table area image according to the present invention; 75: horizontal line segment; 76: vertical line segment; 77: scanning direction.

Procedural amendment (formality), October 24, 1988, addressed to the Commissioner of the Patent Office. Indication of the case: Japanese Patent Application No. Sho 63-171729. Title of the invention: Table area separation device.

Claims

Translated from Japanese

[Claims] A table area separation device comprising: image input means for digitizing and inputting an image including a table; a first image memory for storing the input image; line segment extraction means for extracting black pixel strings that continue in the horizontal or vertical direction for at least a predetermined threshold length; a second image memory for storing the extracted black pixel strings; rectangle coordinate detection means for detecting the coordinates of a rectangle representing a table area based on the black pixel strings stored in the second image memory; and table area extraction means for extracting the table area from the first image memory using the detected coordinates of the rectangle representing the table area.