Identification method for single-layer one-to-many and many-to-one share graphs
Technical Field
The invention belongs to the technical field of image recognition, and relates to a recognition method for single-layer one-to-many and many-to-one share graphs.
Background
With the rapid development of internet technology, the field of artificial intelligence is flourishing, and related technologies and products account for a growing share of daily life. Image recognition is an important area of artificial intelligence and the basis of many practical technologies, such as stereoscopic vision, motion analysis and data fusion; it has important application value in navigation, weather forecasting, natural resource analysis, environmental monitoring, physiological lesion research and other fields. Target recognition is already mature for features such as license plates, faces and pedestrians. Researchers therefore hope to recognize and analyze more complex relational images (such as share graphs), so that the personnel concerned are freed from traditional manual share analysis, can grasp the distribution of share rights efficiently and accurately, and can work more efficiently.
However, existing share graphs mostly come from annual or quarterly reports published by companies and from related software (such as Tianyancha). The pictures are complex, so the shareholding structure of a company is hard to grasp intuitively, and an analysis usually covers more than one graph and one company, which makes the work time-consuming, laborious and hard to follow. In addition, no research at home or abroad currently uses image recognition technology to identify share graphs, and no technology addresses the analysis of share relationship graphs.
Disclosure of Invention
The invention aims to provide a method for identifying single-layer one-to-many and many-to-one share graphs, which solves the prior-art problem that an original share graph does not intuitively reflect the shareholding of a company.
The technical solution adopted by the invention is as follows:
A recognition method for single-layer one-to-many and many-to-one share graphs comprises the following specific steps:
Step 1, inputting a one-to-many or many-to-one share image as the share image to be identified;
Step 2, extracting the coordinates of the companies (individuals), arrows and percentages in the picture with a Faster R-CNN network;
Step 3, determining corner coordinates from the arrow coordinates, and determining the direction of each arrow from its corner coordinates; dividing the companies (individuals) into pointing objects and pointed-to objects according to the direction of the arrow, and binding, one to one, the members of whichever of the two groups is larger with the percentages; finally, recognizing the characters in the pointing objects and pointed-to objects with an OCR method;
Step 4, constructing a directed weighted graph of the shareholding flow (pointing object, arrow, percentage, pointed-to object) from the pointing relations obtained in step 3.
Wherein step 2 comprises:
Step 2.1, collecting a large number of share graphs and manually marking the companies (individuals), arrows and percentages in them to form a data set;
Step 2.2, establishing a VGG-16 network model, wherein VGG-16 comprises 13 convolution layers, 3 fully connected layers and 5 pooling layers;
Step 2.3, training the VGG-16 network model on the data set;
Step 2.4, detecting the share graph to be identified with the trained VGG-16 network model and outputting a detection result, the detection result being the coordinates of the companies (individuals), arrows and percentages.
The 13 convolution layers in step 2 all use 3×3 convolution kernels with stride = 1 and padding = same, and each convolution layer uses a ReLU activation function; positive anchors and the corresponding bounding-box regression offsets are generated, and the proposals are then computed from them;
the pooling layers all use a 2×2 pooling kernel with stride = 2 and max pooling; the proposals are used to extract proposal features from the feature maps, and these features are sent to the subsequent fully connected and softmax layers for classification (that is, deciding what object each proposal is).
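The text does not say how an anchor comes to count as positive. In the published Faster R-CNN design, anchors are labeled by their intersection-over-union (IoU) overlap with the manually marked boxes, and the sketch below assumes that rule. The (x_min, y_min, x_max, y_max) box layout, the function names and the 0.7 threshold are assumptions for illustration, not details taken from this specification.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def positive_anchors(anchors, ground_truth, threshold=0.7):
    """Keep anchors that overlap any marked box by at least `threshold` (assumed value)."""
    return [a for a in anchors
            if any(iou(a, g) >= threshold for g in ground_truth)]
```

For example, an anchor shifted by one pixel against a 10×10 marked box still overlaps it by about 0.82 and would be kept, while one shifted by half the box width overlaps by one third and would not.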
Step 3.1, determining corner coordinates from the arrow coordinates, and determining the direction of each arrow from its corner coordinates:
the three corner points of one of the arrows obtained in step 2 are denoted A(x1, y1), B(x2, y2) and C(x3, y3). If the difference between y1 and y2 is less than a given threshold e1, the corner points A and B are considered to lie on a horizontal line; y3 is then compared with y1: if y3 > y1 the arrow is considered to point downward, and if y3 < y1 the arrow is considered to point upward. All arrow corner coordinates are traversed and the directions are judged one by one;
Step 3.2, dividing the company (individual) names into pointing objects and pointed-to objects according to the direction of the arrow, and binding the members of whichever of the two groups is larger with the percentages: the objects in the input share graph are divided into two groups according to the ordinate of the company names; if the arrow points upward, the group with the largest ordinates among the company (individual) coordinates is the pointing object, and if the arrow points downward, the group with the smallest ordinates is the pointing object. The members of the larger group are then bound one to one with the percentages: let the smallest and largest abscissas among the four corner coordinates of one member of the larger group be (xmin, xmax); the percentage whose abscissa lies within (xmin, xmax) is found, and the two are bound in a suitable data structure (such as a dictionary); the remaining members of the larger group are traversed and bound one to one with the percentages in the same way;
Step 3.3, recognizing the characters within the coordinates of the pointing objects and pointed-to objects with OCR technology.
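The direction test of step 3.1 can be sketched as follows. This is a minimal illustration under stated assumptions: image coordinates with the ordinate growing downward (which is what "y3 > y1 means downward" implies), a hypothetical default of 5 pixels for the threshold e1, and a return value of None for cases the text does not cover.

```python
def arrow_direction(a, b, c, e1=5.0):
    """Classify an arrow from its three corner points A, B, C, each an (x, y) pair.

    A and B are expected to lie on a roughly horizontal line; C is the
    remaining corner. The default e1 is a hypothetical value.
    """
    (_, y1), (_, y2), (_, y3) = a, b, c
    if abs(y1 - y2) >= e1:
        return None  # A and B are not level; this case is not covered by step 3.1
    if y3 > y1:
        return "down"
    if y3 < y1:
        return "up"
    return None  # C is level with A; direction undetermined


def all_directions(arrows, e1=5.0):
    """Traverse every arrow's corner points and judge the directions one by one."""
    return [arrow_direction(a, b, c, e1) for a, b, c in arrows]
```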
Step 4 comprises:
Step 4.1, establishing an empty directed graph G, and adding the company (individual) names obtained in step 3.3 to G one by one as nodes, giving a directed graph G' that contains only the base nodes;
Step 4.2, on the basis of the directed graph G' of step 4.1, converting the pointing relations of step 3.2 into triples [u, v, w], wherein u is the starting point and represents the pointing object, v is the end point and represents the pointed-to object, and w is the weight and represents the shareholding percentage; the converted triples are passed as parameters and added to the directed graph G', which finally yields the directed weighted graph of the shareholding flow.
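The conversion into triples [u, v, w] in step 4.2 might look like the sketch below. It assumes the bindings of step 3.2 are held in a dictionary that maps each member of the larger group to its OCR-recognized percentage text, and that percentages are written with an ASCII "%" sign. The function names and the `single_is_source` flag are hypothetical.

```python
def parse_percentage(text):
    """Convert OCR text such as '35.5%' to the number 35.5."""
    return float(text.strip().rstrip("%"))


def to_triples(single_name, bindings, single_is_source):
    """Build [u, v, w] triples for a one-to-many or many-to-one graph.

    single_name      -- the lone object on the smaller side
    bindings         -- {name on the larger side: percentage text}
    single_is_source -- True for one-to-many (the lone object points to the
                        others), False for many-to-one
    """
    triples = []
    for name, pct_text in bindings.items():
        w = parse_percentage(pct_text)
        u, v = (single_name, name) if single_is_source else (name, single_name)
        triples.append([u, v, w])
    return triples
```

With `to_triples("Parent Co", {"Sub A": "60%", "Sub B": "40%"}, True)`, each subsidiary becomes the end point v of one triple whose weight is its percentage.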
The beneficial effects of the invention are as follows:
The method uses the Faster R-CNN deep learning framework and image recognition technology to recognize and analyze share images. It overcomes the time-consuming, laborious and hard-to-follow nature of share analysis carried out by individuals or companies, fills the gap in research on this topic at home and abroad, and provides an efficient and accurate method.
Drawings
FIG. 1 is a schematic diagram of the single-layer one-to-many or many-to-one share graph identification and analysis method of the present invention;
FIG. 2 is a schematic diagram of the VGG-16 network structure of Faster R-CNN used in the single-layer one-to-many or many-to-one share graph identification and analysis method of the present invention;
FIG. 3 is the input share graph of Example 1 of the single-layer one-to-many or many-to-one share graph identification and analysis method of the present invention;
FIG. 4 is the resulting complex network of Example 1 of the single-layer one-to-many or many-to-one share graph identification and analysis method of the present invention.
Detailed Description
The invention will be described in detail below with reference to the drawings and the detailed description.
As shown in FIG. 1, the one-to-many or many-to-one share graph identification and analysis method comprises the following specific steps:
Step 1, inputting a one-to-many or many-to-one share image as the share image to be identified;
Step 2, extracting the coordinates of the companies (individuals), arrows and percentages in the picture with a Faster R-CNN network;
Step 3, determining corner coordinates from the arrow coordinates, and determining the direction of each arrow from its corner coordinates; dividing the companies (individuals) into pointing objects and pointed-to objects according to the direction of the arrow, and binding, one to one, the members of whichever of the two groups is larger with the percentages; finally, recognizing the characters in the pointing objects and pointed-to objects with an OCR method;
Step 4, constructing a directed weighted graph of the shareholding flow (pointing object, arrow, percentage, pointed-to object) from the pointing relations obtained in step 3.
In step 1, a share image to be identified that depicts a multilayer shareholding relationship must first be scaled to a fixed size;
Step 2 comprises the following steps:
Step 2.1, collecting a large number of share graphs and manually marking the companies (individuals), arrows and percentages in them to form a data set.
Step 2.2, as shown in FIG. 2, establishing a VGG-16 network model, wherein VGG-16 comprises 13 convolution layers, 3 fully connected layers and 5 pooling layers;
the 13 convolution layers all use 3×3 convolution kernels with stride = 1 and padding = same, and each convolution layer uses a ReLU activation function; positive anchors and the corresponding bounding-box regression offsets are generated, and the proposals are then computed from them;
the pooling layers all use a 2×2 pooling kernel with stride = 2 and max pooling; the proposals are used to extract proposal features from the feature maps, and these features are sent to the subsequent fully connected and softmax layers for classification (that is, deciding what object each proposal is).
Step 2.3, training the VGG-16 network model on the data set.
Step 2.4, detecting the share graph to be identified with the trained VGG-16 network model and outputting a detection result, the detection result being the coordinates of the companies (individuals), arrows and percentages.
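As a quick check on the layer description in step 2.2, the arithmetic below traces the spatial size of the feature map through the network. A 3×3 convolution with stride 1 and "same" padding leaves the size unchanged, and each 2×2, stride-2 max pool roughly halves it. The grouping of the 13 convolution layers into blocks of 2, 2, 3, 3 and 3 is the standard VGG-16 layout; the 600×800 input size is a hypothetical example, since the specification only says the image is scaled to a fixed size.

```python
VGG16_BLOCKS = [2, 2, 3, 3, 3]  # convolution layers per block; each block ends with a pool


def conv_same(size, stride=1):
    """Output size of a 'same'-padded convolution: ceil(size / stride)."""
    return -(-size // stride)


def max_pool(size, kernel=2, stride=2):
    """Output size of an unpadded max pool."""
    return (size - kernel) // stride + 1


def trace(height, width, blocks=VGG16_BLOCKS):
    """Return the (height, width) at the input and after each pooling layer."""
    sizes = [(height, width)]
    for n_conv in blocks:
        for _ in range(n_conv):
            height, width = conv_same(height), conv_same(width)
        height, width = max_pool(height), max_pool(width)
        sizes.append((height, width))
    return sizes
```

Calling `trace(600, 800)` gives the sizes after each of the five pooling layers, ending at 18×25 for this hypothetical input.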
Step 3 comprises the following steps:
Step 3.1, determining corner coordinates from the arrow coordinates, and determining the direction of each arrow from its corner coordinates.
The three corner points of one of the arrows obtained in step 2 are denoted A(x1, y1), B(x2, y2) and C(x3, y3). If the difference between y1 and y2 is less than a given threshold e1, the corner points A and B are considered to lie on a horizontal line; y3 is then compared with y1: if y3 > y1 the arrow is considered to point downward, and if y3 < y1 the arrow is considered to point upward. All arrow corner coordinates are traversed and the directions are judged one by one.
Step 3.2, dividing the company (individual) names into pointing objects and pointed-to objects according to the direction of the arrow, and binding the members of whichever of the two groups is larger with the percentages: the objects in the input share graph are divided into two groups according to the ordinate of the company names; if the arrow points upward, the group with the largest ordinates among the company (individual) coordinates is the pointing object, and if the arrow points downward, the group with the smallest ordinates is the pointing object. The members of the larger group are then bound one to one with the percentages: let the smallest and largest abscissas among the four corner coordinates of one member of the larger group be (xmin, xmax); the percentage whose abscissa lies within (xmin, xmax) is found, and the two are bound in a suitable data structure (such as a dictionary); the remaining members of the larger group are traversed and bound one to one with the percentages in the same way.
Step 3.3, recognizing the characters within the coordinates of the pointing objects and pointed-to objects with OCR technology.
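The one-to-one binding of step 3.2 can be sketched with a dictionary, as the text suggests. The sketch assumes each object on the larger side is given by its four corner points and each percentage by a single abscissa (for instance the centre of its detected box) plus its text; those input layouts and the function names are assumptions. Each percentage is used at most once, so that two objects cannot claim the same label.

```python
def x_range(corners):
    """Smallest and largest abscissa among an object's four (x, y) corner points."""
    xs = [x for x, _ in corners]
    return min(xs), max(xs)


def bind_percentages(objects, percentages):
    """Bind each object to the percentage whose abscissa falls inside its x range.

    objects     -- {name: [four (x, y) corner points]}
    percentages -- [(x, text), ...]
    Returns {name: percentage text}; objects with no match are left out.
    """
    bindings = {}
    used = set()  # indices of percentages that are already bound
    for name, corners in objects.items():
        x_min, x_max = x_range(corners)
        for i, (x, text) in enumerate(percentages):
            if i not in used and x_min <= x <= x_max:
                bindings[name] = text
                used.add(i)
                break
    return bindings
```

The rule relies on each percentage label sitting horizontally within the span of the box it belongs to, which holds for the single-layer layouts the method targets.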
Wherein step 4 comprises:
Step 4.1, establishing an empty directed graph G, and adding the company (individual) names obtained in step 3.3 to G one by one as nodes, giving a directed graph G' that contains only the base nodes.
Step 4.2, on the basis of the directed graph G' of step 4.1, converting the pointing relations of step 3.2 into triples [u, v, w], wherein u is the starting point and represents the pointing object, v is the end point and represents the pointed-to object, and w is the weight and represents the shareholding percentage; the converted triples are passed as parameters and added to the directed graph G', which finally yields the directed weighted graph of the shareholding flow.
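Steps 4.1 and 4.2 map closely onto NetworkX, the tool named in Example 1. The sketch below is one plausible realization and not necessarily the inventors' code: `DiGraph`, `add_nodes_from` and `add_weighted_edges_from` are existing NetworkX calls, while the function name and the sample company names are hypothetical.

```python
import networkx as nx


def build_share_graph(names, triples):
    """Build the directed weighted shareholding graph.

    names   -- company (individual) names recognized by OCR
    triples -- (u, v, w): pointing object, pointed-to object, percentage
    """
    graph = nx.DiGraph()                    # step 4.1: empty directed graph G
    graph.add_nodes_from(names)             # G': base nodes only
    graph.add_weighted_edges_from(triples)  # step 4.2: w is stored as 'weight'
    return graph


share_graph = build_share_graph(
    ["Parent Co", "Sub A", "Sub B"],
    [("Parent Co", "Sub A", 60.0), ("Parent Co", "Sub B", 40.0)],
)
```

The percentage on an edge can then be read back as `share_graph["Parent Co"]["Sub A"]["weight"]`.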
Example 1
Step 1 is executed: the share graph to be identified is input as shown in FIG. 3, which is a one-to-many share graph;
Steps 2-3 are executed: the data set comes mainly from the China Bidding website and the CNINFO (Juchao Information) website and totals more than 100 GB. Because a single share image contains many target objects, 3200 original images are used; the existing data set is flipped with the OpenCV library, which expands it to 11000 images and brings the number of target objects in each category to more than 60000. OCR is performed by calling an existing, mature OCR interface (such as the Baidu OCR API), which improves the recognition rate;
Step 4 is executed: the network of pointing relations is a visualized network built with NetworkX, a graph-theory and complex-network modeling tool, and the final directed weighted graph of the shareholding flow is shown in FIG. 4.
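When the data set is expanded by flipping, the manually marked boxes have to be mirrored along with the pixels. The example does not say which flip axes were used or how the annotations were updated, so the sketch below is only an assumption about that bookkeeping. It uses (x_min, y_min, x_max, y_max) boxes in continuous coordinates. In OpenCV the image itself would be flipped with `cv2.flip`, where flip code 1 mirrors horizontally and 0 vertically.

```python
def flip_box_horizontal(box, image_width):
    """Mirror a box left-to-right inside an image of the given width."""
    x_min, y_min, x_max, y_max = box
    return (image_width - x_max, y_min, image_width - x_min, y_max)


def flip_box_vertical(box, image_height):
    """Mirror a box top-to-bottom inside an image of the given height."""
    x_min, y_min, x_max, y_max = box
    return (x_min, image_height - y_max, x_max, image_height - y_min)
```

A vertical flip also turns every upward arrow into a downward one, so a one-to-many graph flipped this way reads as a many-to-one graph. That is harmless for training the detector, but any direction labels kept alongside the boxes would need to be swapped.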