CN101122999B

Info

Publication number: CN101122999B
Application number: CN2007101439463A
Authority: CN
Inventors: 娄海涛; 鲍泓; 唐智星; 康乐; 张鑫蕊
Original assignee: Beijing Union University
Current assignee: Beijing Union University
Priority date: 2007-04-16
Filing date: 2007-08-15
Publication date: 2010-07-07
Anticipated expiration: 2027-08-15
Also published as: CN101122999A

Abstract

The invention relates to a method for automatically extracting seal images from Chinese calligraphy and painting works. The method comprises the following steps: non-red color information is filtered from the target image using L*a*b* component color analysis and a mapping of the analysis results to the RGB color space; the remaining image information is de-noised; non-seal information in the remaining image information is eliminated using secondary segmentation and filtering based on geometric regions, image filtering based on connected regions, and edge detection; and the positional correspondence between the matrix L3 and the work is used to segment the image and extract the seals. The invention provides a method for accurately and automatically extracting all seal image information from a digital image of all or part of a Chinese calligraphy or painting work, and thus lays a foundation for a content-based image retrieval system for such works that uses the seal image as key information. The invention can be widely used in the field of cultural relic digitization.

Description

Translated from Chinese

A method for automatically extracting seal images from Chinese calligraphy and painting works

Technical field

The invention relates to an image extraction method, and in particular to a method for automatically extracting seal images from Chinese calligraphy and painting works.

Background

Seals in Chinese calligraphy and painting works have significant artistic value and are an inseparable part of the works. Identifying and retrieving the seal images in a work helps to retrieve and authenticate information related to that work.

Image retrieval has been a very active research topic since the 1970s. To date there are two main approaches: semantic-based retrieval and content-based retrieval. Early image retrieval was semantic, that is, based on image keywords. This approach requires each image to be annotated manually according to its content, with the annotations stored in a text database for later retrieval. As the number of images grows, manual annotation becomes very difficult; moreover, different people understand image content differently, which makes the annotations highly subjective and hinders retrieval. Since the 1990s, research has focused on content-based image retrieval (CBIR), the process of finding images in a database that satisfy a given description of visual features. Its basic idea is to retrieve images by analyzing their visual features and contextual relationships. Using specific algorithms and techniques, a computer automatically extracts visual features of the image content, such as color, texture, shape, and the positions and mutual relationships of objects. The set of distinguishing features extracted from each image is stored in an image feature database, and images similar to a query sample are retrieved by similarity matching between the database images and the sample in feature space.

Since the 1990s, research on and application of content-based image retrieval have made great progress abroad, and several well-known image retrieval systems have been released. QBIC (Query By Image Content), an image and dynamic-scene retrieval system developed by IBM in the 1990s, was the first commercial content-based image retrieval system. VIR Image Engine is a content-based image retrieval engine developed by Virage; it supports retrieval based on visual features such as color, color layout, texture, and structure. RetrievalWare is a content-based image retrieval tool developed by Excalibur Technology Co., Ltd. that provides retrieval based on six image attributes: color, shape, texture, color structure, brightness structure, and aspect ratio. Photobook is an interactive tool for image query and browsing developed by the Multimedia Laboratory of the Massachusetts Institute of Technology; its three subsystems support retrieval based on shape, texture, and facial features, respectively. VisualSEEK and WebSEEK, developed by Columbia University, are retrieval tools based on visual features and oriented to text or images on the WWW, respectively.

In China, Tsinghua University developed a prototype system for content-based retrieval of static images on the Internet in 1997; the Institute of Computing Technology of the Chinese Academy of Sciences studied a feature-based multimedia information retrieval system; and Beijing Huaqi Image Data Intelligent Technology Co., Ltd. developed intelligent image retrieval software that can search design patents by image content.

In the field of cultural relics, as digitization deepens, large numbers of relic images are being preserved as digital images. How to retrieve relic images and related information from the image itself (or a sketch of it) has become one of the core problems of cultural relic digitization. Judging from the material retrieved so far, no report on a method for automatically extracting seal images from Chinese calligraphy and painting works has been found in China or abroad.

Given the special status of seals in calligraphy and painting works, retrieval using extracted seal information would greatly improve retrieval accuracy. Chinese calligraphy and painting have a long history, and the seals in these works are affected by factors such as age, seal material, ink color, seal script, carving technique, shape, typeface, the material of the work, the force of stamping, and external forces during mounting, all of which make seal image extraction difficult. At present, research on seal image digitization is limited to official seals, accounting seals, and corporate seals. Recognizing and extracting such seals, which are stamped on modern documents, is comparatively easy because the background is relatively simple. For extracting and recognizing seal images in Chinese calligraphy and painting works, these official-seal extraction methods are not effective.

Summary of the invention

In view of the above problems, the object of the invention is to provide a method for accurately and automatically extracting seal images from a digital image of an entire calligraphy or painting work or of part of one.

To achieve the above object, the invention adopts the following technical solution: a method for automatically extracting seal images from Chinese calligraphy and painting works, characterized in that it comprises the following steps: (1) filtering out non-red color information from the target digital image using L^*a^*b^* component color analysis and a mapping of the analysis result to the RGB color space; (2) processing the noise contained in the remaining image information; (3) eliminating non-seal information from the noise-processed remaining image information: using secondary segmentation and filtering based on geometric regions to remove low-density color information from the image; using image filtering based on connected regions to remove image information whose area is far larger or far smaller than the possible range for a seal; using edge detection to convert the high-density non-seal color regions of the remaining image into low-density regions; and again using the secondary segmentation and filtering based on geometric regions to remove the color information that has been converted from high density to low density; (4) segmenting the image and extracting the seals.

The image is converted between color spaces, first from the RGB color space to the XYZ color space and then from the XYZ color space to the L^*a^*b^* color space:

$\begin{bmatrix} X_n \\ Y_n \\ Z_n \end{bmatrix} = \begin{bmatrix} k_r x_r & k_g x_g & k_b x_b \\ k_r y_r & k_g y_g & k_b y_b \\ k_r z_r & k_g z_g & k_b z_b \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} x_r & x_g & x_b \\ y_r & y_g & y_b \\ z_r & z_g & z_b \end{bmatrix} \begin{bmatrix} k_r \\ k_g \\ k_b \end{bmatrix}$

$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} 0.607 & 0.174 & 0.200 \\ 0.299 & 0.587 & 0.114 \\ 0.000 & 0.066 & 1.116 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$

$L^* = 116\,f(Y/Y_n) - 16$

$a^* = 500\,(f(X/X_n) - f(Y/Y_n))$

$b^* = 200\,(f(Y/Y_n) - f(Z/Z_n))$

where k_r, k_g and k_b are scale factors; x_r, x_g, x_b, y_r, y_g, y_b, z_r, z_g, z_b are the chromaticity coordinates of red, green and blue in the xyY chromaticity diagram of the International Commission on Illumination (CIE); X_n, Y_n and Z_n are the tristimulus values of the reference white in the CIE XYZ system; X, Y, Z and R, G, B are the color components of their respective color spaces; and L^*, a^*, b^* are the components of the L^*a^*b^* color space. f(t) is as follows (standard CIE 1976 form):

$f(t) = \begin{cases} t^{1/3}, & t > 0.008856 \\ 7.787\,t + 16/116, & t \le 0.008856 \end{cases}$

Based on the general visual characteristics of seal images in Chinese calligraphy and painting works, background with cooler tones is removed from the image. Let a be an adjustable real value greater than 0, and let b be the set of all real numbers whose absolute value is less than 120. The L^*a^*b^* component color analysis result is then mapped onto the image in the RGB color space: a pixel is removed when its b^* value is not in b or its a^* value is less than a, so that only the information perceived as red is retained.

The secondary segmentation and filtering based on geometric regions comprises the following steps: (1) the image is divided into rectangular regions with a set step size, and the color density of each region is calculated; (2) taking the intersection points of the regions of the first division as the centers of rectangular regions, the image is divided a second time and the color density of each region is calculated; (3) the color densities of all regions from the two divisions are determined, giving color density matrices, where L1 is the color density matrix of the first division and L2 that of the second; a matrix element is 1 when the color density of its region exceeds the color density threshold and 0 otherwise; (4) the cells of L1 adjacent to the elements equal to 1 in L2 are filled with 1, giving a new matrix L3 in which the information of L2 has been added to L1; the image regions represented by elements of L3 equal to 1 are kept, and those represented by elements equal to 0 are removed.

By adopting the above technical solution, the invention has the following advantages: 1. Because the invention provides a method that can accurately and automatically extract all the information of seal images from a digital image of an entire work or of part of one, it lays the foundation for a content-based image retrieval system for Chinese calligraphy and painting works that uses seal images as key information. 2. The invention provides L^*a^*b^* component color analysis together with a mapping of its results to the RGB color space, and thus offers a basic approach to extracting red image information within the color range visually perceived as seal color. 3. The invention provides a secondary segmentation and filtering method based on geometric regions that filters images containing seal information, and thus offers a research model for seal image extraction based on color and structural features. The invention can be widely used in the field of cultural relic digitization.

Description of drawings

Fig. 1 is a flow chart of the method of the invention.

Fig. 2 is the image before L^*a^*b^* component color filtering.

Fig. 3 is the image after L^*a^*b^* component color filtering.

Fig. 4 is the image before noise processing.

Fig. 5 is the image after noise processing.

Fig. 6 is a schematic diagram of the two geometric region divisions of the image.

Fig. 7 shows the relationship between the regions of the first division and those of the second division.

Fig. 8 is the color density matrix L1 of the first geometric region division.

Fig. 9 is the color density matrix L2 of the second geometric region division.

Fig. 10 is a schematic diagram of how the values in L2 affect the values in L1.

Fig. 11 is the new matrix L3 obtained after adding the influence of L2 to L1.

Fig. 12 is a seal image extracted by the method of the invention.

Fig. 13 is a landscape painting (detail) by the Qing dynasty painter Wang Yuanqi, used in the embodiment.

Fig. 14 shows the seal images extracted by the method from the embodiment shown in Fig. 13.

Fig. 15 is a system block diagram of a retrieval system for Chinese calligraphy and painting works based on the method of the invention.

Detailed description

The method of the invention is described in detail below with reference to the drawings and an embodiment.

The method obtains, by conventional means, a digital image of a Chinese calligraphy or painting work, or of part of one, that contains seal information. It then applies a combination of image processing techniques to identify the seal images in the work and extract them one by one.

As shown in Fig. 1, the seal image extraction method comprises the following steps:

1. Filter out non-red color information from the target image using L^*a^*b^* component color analysis and a mapping of the analysis result to the RGB color space.

2. Process the noise contained in the remaining image information.

3. Eliminate non-seal information from the remaining image information.

4. Segment the image and extract the seals.

Image colors in computers are generally described in the RGB color space. Although the RGB color space has an R component, that component alone cannot be used to filter the color perceived as "red". The image is therefore mapped from the RGB color space to the L^*a^*b^* color space, and the target color is filtered by analyzing the a^* and b^* components, where +a^* indicates red, -a^* green, +b^* yellow, and -b^* blue, and the lightness of a color is expressed by L^* as a percentage. The image submitted by the user is converted from the RGB color space to the XYZ color space and then to the L^*a^*b^* color space:

$\begin{bmatrix} X_n \\ Y_n \\ Z_n \end{bmatrix} = \begin{bmatrix} k_r x_r & k_g x_g & k_b x_b \\ k_r y_r & k_g y_g & k_b y_b \\ k_r z_r & k_g z_g & k_b z_b \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} x_r & x_g & x_b \\ y_r & y_g & y_b \\ z_r & z_g & z_b \end{bmatrix} \begin{bmatrix} k_r \\ k_g \\ k_b \end{bmatrix}$

where k_r, k_g and k_b are scale factors; x_r, x_g, x_b, y_r, y_g, y_b, z_r, z_g, z_b are the chromaticity coordinates of red, green and blue in the xyY chromaticity diagram of the International Commission on Illumination (CIE); and X_n, Y_n and Z_n are the tristimulus values of the reference white in the CIE XYZ system. From this formula, the conversion from the RGB color space to the XYZ color space for ITU-R BT.601 under illuminant C follows:

$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} 0.607 & 0.174 & 0.200 \\ 0.299 & 0.587 & 0.114 \\ 0.000 & 0.066 & 1.116 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$

where X, Y, Z and R, G, B are the color components of their respective color spaces. The conversion from XYZ to L^*a^*b^* is then:

$L^* = 116\,f(Y/Y_n) - 16$

$a^* = 500\,(f(X/X_n) - f(Y/Y_n))$

$b^* = 200\,(f(Y/Y_n) - f(Z/Z_n))$

where L^*, a^*, b^* are the components of the L^*a^*b^* color space and f(t) is as follows (standard CIE 1976 form):

$f(t) = \begin{cases} t^{1/3}, & t > 0.008856 \\ 7.787\,t + 16/116, & t \le 0.008856 \end{cases}$
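The conversion chain can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation. It assumes 8-bit RGB input scaled to [0, 1] with no gamma linearization (the text does not mention any), takes the reference white as the XYZ value of R = G = B = 1, and uses the standard CIE 1976 form of f(t). The names `RGB_TO_XYZ`, `WHITE`, `f` and `rgb_to_lab` are hypothetical.

```python
# RGB -> XYZ matrix quoted in the text (ITU-R BT.601, illuminant C).
RGB_TO_XYZ = (
    (0.607, 0.174, 0.200),
    (0.299, 0.587, 0.114),
    (0.000, 0.066, 1.116),
)

# Assumption: the reference white (X_n, Y_n, Z_n) is the XYZ value of
# R = G = B = 1, i.e. the row sums of the matrix.
WHITE = tuple(sum(row) for row in RGB_TO_XYZ)


def f(t):
    """Standard CIE 1976 function (assumed form)."""
    if t > 0.008856:
        return t ** (1.0 / 3.0)
    return 7.787 * t + 16.0 / 116.0


def rgb_to_lab(r, g, b):
    """Convert 8-bit R, G, B values to a (L*, a*, b*) tuple."""
    rgb = (r / 255.0, g / 255.0, b / 255.0)
    x, y, z = (sum(m * c for m, c in zip(row, rgb)) for row in RGB_TO_XYZ)
    xn, yn, zn = WHITE
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)
```

With these assumptions, pure white maps to L^* = 100 with a^* and b^* at zero, pure red gives a positive a^*, and pure green a negative one, which matches the sign convention described above.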

Based on the general visual characteristics of seal images in Chinese calligraphy and painting works, background with cooler tones is removed from the image. Two thresholds a and b are set: a is a real value greater than 0 (adjustable, usually an integer), and b is the set of all real numbers whose absolute value is less than 120. A pixel is removed when its b^* value is not in b or its a^* value is less than a. Mapping the filtered result back onto the image in the RGB color space retains only the information perceived as red (see Figs. 2 and 3: Fig. 2 shows the image before filtering and Fig. 3 the image after filtering).
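The threshold rule can be written as a per-pixel test. The sketch below assumes the image is already available as rows of (L^*, a^*, b^*) tuples. The function names and the default `a_min=20.0` are hypothetical; the text only says that a is positive and adjustable.

```python
def is_seal_red(a_star, b_star, a_min=20.0, b_limit=120.0):
    """Keep a pixel only if a* >= a_min and |b*| < b_limit.

    a_min is a hypothetical default; the text only requires a > 0.
    """
    return a_star >= a_min and abs(b_star) < b_limit


def red_mask(lab_rows, a_min=20.0, b_limit=120.0):
    """Map rows of (L*, a*, b*) tuples to rows of 0/1 flags (1 = keep)."""
    return [
        [1 if is_seal_red(a, b, a_min, b_limit) else 0 for (_, a, b) in row]
        for row in lab_rows
    ]
```

The resulting 0/1 mask has the same shape as the image, so it can be used to blank the rejected pixels in the original RGB data.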

Because of factors such as the pigments, materials, state of preservation, and patina of the work, the image produced by the above step contains some noise, which must be handled according to the actual situation. One or a combination of filters such as an adaptive filter, a median filter, or a Gaussian filter may be used, or a filter may be designed for the case at hand (see Figs. 4 and 5: Fig. 4 shows the image before filtering and Fig. 5 the image after filtering).
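Of the filters named above, the median filter is the simplest to illustrate. The following sketch applies a 3×3 median to any 2-D grid of numbers, for example a 0/1 mask of retained pixels. The window is clipped at the image border, and for even-sized windows the upper median is taken. Both are choices made for this example, not details from the text.

```python
def median_filter3(img):
    """Return a copy of img with each value replaced by its 3x3 median."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            # Collect the neighbourhood, clipped at the image border.
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            window.sort()
            out[y][x] = window[len(window) // 2]
    return out
```

An isolated speck disappears under this filter, while the interior of a solid patch is left unchanged.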

In the filtered image, the red parts are not necessarily all seal images. The red non-seal information consists of image regions containing large amounts of red information with either low or high distribution density, while the distribution density of seal information lies between the two. Non-seal information therefore still has to be removed.

Low-density color information is removed by secondary segmentation and filtering based on geometric regions. Experiments show that a region whose color density is below a certain threshold cannot contain seal information. (The threshold is an empirical setting obtained from a large number of experiments, and it can be adjusted according to the proportion of raised to recessed areas in the characters or image on the seal.) The whole image is therefore divided into rectangular regions with a set step size, and the color density of each region is calculated. To avoid splitting a seal across regions, which would cause part of the seal's area to be discarded as non-seal, the original image is divided a second time. The second division takes the intersection points of the regions of the first division as the centers of its rectangular regions. For example, the image is divided into 3×3 regions (see Figs. 6 and 7: Fig. 6 shows the result of the two divisions, and Fig. 7 shows the correspondence between the shaded region of the first division and the regions of the second). In addition, the region division determines the position of each seal image within the work and provides coordinate information for the later extraction.
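The two divisions can be sketched as one density function called with two offsets. This is an illustration under stated assumptions: the input is a 0/1 mask, "color density" is taken to be the fraction of retained pixels in a cell, the step is even, the image dimensions are multiples of the step, and the second pass uses only the interior intersection points (so its grid is one cell smaller in each direction). The names `density_grid` and `binarize` and the threshold used in the example are hypothetical.

```python
def density_grid(mask, step, offset=0):
    """Fraction of 1-pixels in each step x step cell, starting at offset."""
    h, w = len(mask), len(mask[0])
    grid = []
    for top in range(offset, h - step + 1, step):
        row = []
        for left in range(offset, w - step + 1, step):
            filled = sum(
                mask[y][x]
                for y in range(top, top + step)
                for x in range(left, left + step)
            )
            row.append(filled / (step * step))
        grid.append(row)
    return grid


def binarize(grid, threshold):
    """1 where the density exceeds the threshold, otherwise 0."""
    return [[1 if d > threshold else 0 for d in row] for row in grid]


# A 2x2 patch sitting exactly on the intersection of four first-pass cells.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
l1 = binarize(density_grid(mask, 2), 0.3)            # first division
l2 = binarize(density_grid(mask, 2, offset=1), 0.3)  # second division
```

In this example every first-pass cell has density 0.25 and falls below the threshold, while the single second-pass cell has density 1.0. This is the situation the second division is meant to catch.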

The color densities of all regions from the two divisions are determined, giving color density matrices: an element is 1 when the color density of its region exceeds the color density threshold and 0 otherwise. Let L1 be the color density matrix of the first division and L2 that of the second. For example, the image is divided into 4×4 rectangular regions (see Figs. 8 and 9: Fig. 8 shows the color matrix from the first division and Fig. 9 the color matrix from the second). The elements of L1 that correspond to elements equal to 1 in L2 are set to 1 (Fig. 10), giving a new matrix in which the information of L2 has been added to L1 (Fig. 11). Call this matrix L3. An element of L3 equal to 1 represents an image region whose color density is high enough that it may contain seal information, so its image information is kept; an element equal to 0 represents a region whose density is too low, so its image information is removed.
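The construction of L3 can be sketched as follows. The indexing is an assumption made for this example: element (i, j) of L2 is taken to be centered on the intersection shared by the L1 cells (i, j), (i, j+1), (i+1, j) and (i+1, j+1), so L2 has one row and one column fewer than L1. The function name `merge_density_matrices` is hypothetical.

```python
def merge_density_matrices(l1, l2):
    """Return L3: a copy of L1 with the four cells around each 1 in L2 set to 1."""
    l3 = [row[:] for row in l1]
    for i, row in enumerate(l2):
        for j, value in enumerate(row):
            if value:
                # The second-pass cell overlaps these four first-pass cells.
                for di in (0, 1):
                    for dj in (0, 1):
                        l3[i + di][j + dj] = 1
    return l3


l1 = [
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]
l2 = [
    [1, 0],
    [0, 0],
]
l3 = merge_density_matrices(l1, l2)
```

Here `l3` keeps the 1 that L1 already had and gains the four cells around the single 1 in L2. L1 itself is left unmodified.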

After the low-density color information has been removed, the remaining image contains only information of higher color density. First, image filtering based on connected regions removes image information whose area is far larger or far smaller than the possible range for a seal; this embodiment uses a seed fill algorithm. At this point the image contains only seal images and isolated high-density color patches of similar size, and the non-seal patches clearly lack the rich structural information of a seal image. An edge detection method such as the Sobel, Roberts, Canny, or Laplacian operator is then used to convert the high-density non-seal color regions into low-density regions, while the seal images retain a certain density. This embodiment uses the Canny edge detector. After edge tracing, the seal parts still carry fairly rich feature information, whereas the isolated high-density color regions are weakened because they lack rich edge information. In terms of information distribution, the seal parts become relatively high-density and the isolated high-density color parts degrade to relatively low-density. After the edge information of the remaining image has been extracted, secondary segmentation and filtering based on geometric regions is applied again to remove the formerly high-density color information, and the corresponding elements of matrix L3 are updated.
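Both ideas in this paragraph can be illustrated on a 0/1 mask. The first function is a seed fill (breadth-first flood fill) that keeps only 4-connected components whose pixel count lies in a given range. The second is a deliberately crude stand-in for an edge detector: it keeps foreground pixels that touch background or the image border. It is not Canny; it only shows why a solid patch loses its interior while thin strokes survive. Function names and area limits are hypothetical.

```python
from collections import deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def filter_components_by_area(mask, min_area, max_area):
    """Keep 4-connected components with min_area <= pixel count <= max_area."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            # Seed fill starting from this unvisited foreground pixel.
            seen[sy][sx] = True
            queue, component = deque([(sy, sx)]), []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if min_area <= len(component) <= max_area:
                for y, x in component:
                    out[y][x] = 1
    return out


def boundary_edges(mask):
    """Mark foreground pixels that touch background or the image border."""
    h, w = len(mask), len(mask[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    edges[y][x] = 1
                    break
    return edges
```

On a solid 5×5 patch, `boundary_edges` keeps the 16 perimeter pixels and drops the 9 interior ones, so the density of the patch falls. A pattern made only of thin strokes keeps every pixel.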

From the positional correspondence between matrix L3 and the image of the work, the image boundaries of the regions marked as seals in L3 can be identified fairly easily and the seal images extracted (Fig. 12).
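One way to turn L3 back into image coordinates is to group adjacent 1-cells and scale each group's bounding box by the step size. The sketch below assumes that cell (i, j) of L3 covers pixel rows i·step to (i+1)·step and the matching columns, and that cells belonging to one seal are 4-connected. The names `seal_boxes` and `crop` are hypothetical.

```python
def seal_boxes(l3, step):
    """Return (top, left, bottom, right) pixel boxes, one per group of 1-cells."""
    rows, cols = len(l3), len(l3[0])
    seen = set()
    boxes = []
    for i in range(rows):
        for j in range(cols):
            if not l3[i][j] or (i, j) in seen:
                continue
            # Collect the 4-connected group of cells containing (i, j).
            stack, cells = [(i, j)], []
            seen.add((i, j))
            while stack:
                ci, cj = stack.pop()
                cells.append((ci, cj))
                for ni, nj in ((ci - 1, cj), (ci + 1, cj), (ci, cj - 1), (ci, cj + 1)):
                    if 0 <= ni < rows and 0 <= nj < cols and l3[ni][nj] and (ni, nj) not in seen:
                        seen.add((ni, nj))
                        stack.append((ni, nj))
            top = min(c[0] for c in cells) * step
            left = min(c[1] for c in cells) * step
            bottom = (max(c[0] for c in cells) + 1) * step
            right = (max(c[1] for c in cells) + 1) * step
            boxes.append((top, left, bottom, right))
    return boxes


def crop(image, box):
    """Cut the box out of an image stored as a list of pixel rows."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]
```

Each box is half-open (bottom and right are exclusive), so `crop` can slice the original image directly and return one sub-image per candidate seal.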

As shown in Fig. 13, this embodiment extracts the three seal images contained in a landscape painting (detail) by the Qing dynasty painter Wang Yuanqi (Fig. 14); the extraction rate reaches 100%.

The invention provides a method that can accurately and automatically extract all the information of seal images from an entire calligraphy or painting work or from a partial image of one. This makes it possible to build a content-based retrieval system for Chinese calligraphy and painting works that uses seal images as key information (Fig. 15). Image features and seal image features are extracted from a large number of works and stored in a work feature library and a seal library, respectively. A user submits a query as needed (a work or a partial image of one); the seal information is used to locate the target work precisely, and the related information about the work is returned.

Although a specific embodiment and drawings of the invention have been disclosed for illustration, in order to help the reader understand and practice the invention, those skilled in the art will appreciate that various substitutions, changes and modifications are possible without departing from the spirit and scope of the invention and the appended claims. The invention should therefore not be limited to what is disclosed in the preferred embodiment and the drawings; the scope of protection is that defined by the claims.

Claims

1. A method for automatically extracting seal images from Chinese calligraphy and painting works, characterized in that it comprises the following steps:

1) filtering out non-red color information from the target digital image by means of L^*a^*b^* component color analysis and a method of mapping the analysis result onto the RGB color space;

2) processing the noise contained in the remaining image information;

3) removing the non-seal information from the remaining image information after noise processing: using the geometric-region-based secondary splitting and filtering method to remove low-density color information from the image; using a connected-region-based image filtering method to remove, from the remaining part of the image, image information whose area is much larger or much smaller than the possible range of a seal; using an edge detection method to convert the high-density non-seal color regions in the remaining part of the image into low-density regions; and applying the geometric-region-based secondary splitting and filtering method once more to remove the color information of the image that has been converted from high density to low density;

4) segmenting the image and extracting the seals;

The geometric-region-based secondary splitting and filtering method comprises the following steps:

(1) dividing the image into a number of rectangular regions according to a set step value, and calculating the color density of each region;

(2) taking the crossing points of the regions of the first division as the centers of rectangular regions, dividing the image into regions a second time, and calculating the color density of each region;

(3) determining the color density of all regions of the image obtained by the two divisions to obtain the color density matrices, where L1 is the color density matrix of the first region division and L2 is the color density matrix of the second region division; when a color density value is greater than the color density threshold, the corresponding matrix element is 1, otherwise it is 0;

(4) setting to 1 the elements of L1 for the grid cells adjacent to each element of L2 that equals 1, thereby obtaining a new matrix L3 that adds the corresponding information of L2 to L1; retaining the image regions corresponding to the elements of L3 equal to 1, and removing the image regions corresponding to the elements of L3 equal to 0.
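As an illustrative sketch only (not part of the claims), steps (1) to (4) of the secondary splitting and filtering method might be written as follows. The sketch assumes a binary pixel mask as input and assumes that "color density" means the fraction of retained pixels in a region; the names `cell_density` and `two_pass_density_filter`, the square cell shape and the half-step offset of the second grid are hypothetical choices made for illustration.

```python
def cell_density(mask, top, left, size):
    """Fraction of non-zero pixels in a size x size window, clipped to the image.

    Assumption: 'color density' is taken to be this fraction.
    """
    rows = range(max(top, 0), min(top + size, len(mask)))
    cols = range(max(left, 0), min(left + size, len(mask[0])))
    area = len(rows) * len(cols)
    if area == 0:
        return 0.0
    return sum(mask[r][c] for r in rows for c in cols) / area


def two_pass_density_filter(mask, step, threshold):
    """Return the binary matrices (L1, L2, L3) for a 0/1 pixel mask."""
    height, width = len(mask), len(mask[0])
    n_rows = -(-height // step)  # ceiling division
    n_cols = -(-width // step)

    # Steps (1) and (3): first division, thresholded into L1.
    l1 = [[int(cell_density(mask, i * step, j * step, step) > threshold)
           for j in range(n_cols)]
          for i in range(n_rows)]

    # Steps (2) and (3): second division, one cell centred on each interior
    # crossing point (i * step, j * step) of the first grid, thresholded into L2.
    half = step // 2
    l2 = [[int(cell_density(mask, i * step - half, j * step - half, step) > threshold)
           for j in range(1, n_cols)]
          for i in range(1, n_rows)]

    # Step (4): every L2 element equal to 1 fills the four L1 cells that
    # meet at its crossing point; the result is L3.
    l3 = [row[:] for row in l1]
    for a, row in enumerate(l2):
        for b, value in enumerate(row):
            if value:
                for da in (0, 1):
                    for db in (0, 1):
                        l3[a + da][b + db] = 1
    return l1, l2, l3


# A 4 x 4 blob that straddles the crossing point of a 2 x 2 grid (step 4).
mask = [[1 if 2 <= r <= 5 and 2 <= c <= 5 else 0 for c in range(8)]
        for r in range(8)]
l1, l2, l3 = two_pass_density_filter(mask, step=4, threshold=0.3)
```

In this small example each first-pass cell holds only a quarter of the blob and falls below the threshold, while the second-pass cell centred on the crossing point captures it, so L3 keeps all four surrounding cells. This is the situation the second division appears intended to handle.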

2. the method for seal image in a kind of automatic extraction Chinese Painting and Calligraphy works as claimed in claim 1 is characterized in that: described image is carried out the conversion of color space, promptly by the RGB color space conversion to the XYZ color space, again by the XYZ color space conversion to L^*a^*b^*Color space:

\begin{bmatrix} X_n \\ Y_n \\ Z_n \end{bmatrix} = \begin{bmatrix} k_r x_r & k_g x_g & k_b x_b \\ k_r y_r & k_g y_g & k_b y_b \\ k_r z_r & k_g z_g & k_b z_b \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} x_r & x_g & x_b \\ y_r & y_g & y_b \\ z_r & z_g & z_b \end{bmatrix} \begin{bmatrix} k_r \\ k_g \\ k_b \end{bmatrix}

\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} 0.607 & 0.174 & 0.200 \\ 0.299 & 0.587 & 0.114 \\ 0.000 & 0.066 & 1.116 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}

a^* = 500 × (f(X/X_n) - f(Y/Y_n))

b^* = 200 × (f(Y/Y_n) - f(Z/Z_n))

Wherein k_r, k_g and k_b are scale coefficients; x_r, x_g, x_b, y_r, y_g, y_b, z_r, z_g, z_b are the coordinates of red, green and blue in the xyY chromaticity diagram of the International Commission on Illumination (CIE); X_n, Y_n, Z_n are the tristimulus values of the reference white light in the XYZ international coordinate system; X, Y, Z and R, G, B are the corresponding color components in their respective color spaces; L^*, a^*, b^* are the components of the L^*a^*b^* color space; and f(t) is as follows:
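The expression for f(t) is not reproduced in this text. Assuming the claim relies on the standard CIE 1976 definition, which is consistent with the a^* and b^* formulas given for this claim, f(t) and the lightness component L^* would read as follows; the constants shown are those of the CIE standard and are an assumption as far as this patent is concerned.

```latex
f(t) =
\begin{cases}
  t^{1/3}, & t > 0.008856 \\
  7.787\,t + \dfrac{16}{116}, & t \le 0.008856
\end{cases}
\qquad
L^* = 116\, f(Y/Y_n) - 16
```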

According to the general visual characteristics of seal images in Chinese calligraphy and painting works, the background of colder tone in the image is rejected: let a be a real value greater than 0 and let b be the set of all real numbers whose absolute value is less than 120; the L^*a^*b^* component color analysis result is then used to map the image onto the RGB color space, that is, a pixel is rejected when its b^* value is not in b or its a^* value is less than a, thereby filtering the image down to the information that appears visually red.
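As an illustrative sketch only (not part of the claims), the rejection rule can be applied per pixel as shown below. The sketch uses the RGB-to-XYZ matrix and the a^* and b^* formulas of claim 2. It assumes RGB components scaled to [0, 1], assumes the standard CIE 1976 form of f(t), and assumes the reference white is the image of R = G = B = 1 under that matrix. The function names and the example value a = 20 are hypothetical; the claim only requires a > 0.

```python
# RGB -> XYZ matrix from claim 2.
RGB_TO_XYZ = (
    (0.607, 0.174, 0.200),
    (0.299, 0.587, 0.114),
    (0.000, 0.066, 1.116),
)


def f(t):
    # Assumption: standard CIE 1976 piecewise function.
    return t ** (1.0 / 3.0) if t > 0.008856 else 7.787 * t + 16.0 / 116.0


def rgb_to_ab(r, g, b):
    """Return (a*, b*) for RGB components given in [0, 1]."""
    x, y, z = (row[0] * r + row[1] * g + row[2] * b for row in RGB_TO_XYZ)
    # Assumption: reference white = matrix applied to R = G = B = 1.
    xn, yn, zn = (sum(row) for row in RGB_TO_XYZ)
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 500.0 * (fx - fy), 200.0 * (fy - fz)


def keep_pixel(r, g, b, a_min=20.0, b_limit=120.0):
    """True if the pixel survives: a* >= a_min and |b*| < b_limit.

    a_min=20.0 is a hypothetical example value for the threshold a.
    """
    a_star, b_star = rgb_to_ab(r, g, b)
    return a_star >= a_min and abs(b_star) < b_limit


red_kept = keep_pixel(1.0, 0.0, 0.0)
green_kept = keep_pixel(0.0, 1.0, 0.0)
white_kept = keep_pixel(1.0, 1.0, 1.0)
```

With these assumptions, pure red is retained, while pure green (negative a^*) and neutral white (a^* near 0) are rejected. A practical value for a would have to be tuned on real scans.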