Detailed Description
For a clearer understanding of the objects, features and advantages of the present application, the present application will now be described in detail with reference to the accompanying drawings and specific examples. It should be noted that the embodiments of the present application and the features of the embodiments may be combined with each other without conflict. In the following description, numerous specific details are set forth to provide a thorough understanding of the present application; the described embodiments are merely a subset of the embodiments of the present application and not all of the embodiments.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The embodiment of the application provides a dog face key point detection method based on artificial intelligence, which can be applied to one or more electronic devices, wherein the electronic devices are devices capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and hardware of the electronic devices includes but is not limited to a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a user, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an Internet Protocol television (IPTV), an intelligent wearable device, and the like.
The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud-computing-based cloud consisting of a large number of hosts or network servers.
The network where the electronic device is located includes, but is not limited to, the internet, a wide area network, a metropolitan area network, a local area network, a Virtual Private Network (VPN), and the like.
Fig. 1 is a flowchart of a method for detecting key points of a dog face based on artificial intelligence according to a preferred embodiment of the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
S10, performing transcoding processing on a historical dog face image to obtain a transcoded image.
In an optional embodiment, the transcoding the historical dog face image to obtain a transcoded image includes:
carrying out up-sampling on the historical dog face image to obtain a sampling image;
calculating the gray value of each pixel point in the sampling image to obtain a gray image;
and sharpening the gray level image to obtain a transcoding image.
In this optional embodiment, in order to enhance the spatial information of the historical dog face image, the dog face image may be up-sampled according to a preset interpolation algorithm to obtain a sampled image.
In some embodiments, the interpolation algorithm may be an existing interpolation algorithm such as a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm, or a cubic interpolation algorithm, which is not limited in this application.
In this optional embodiment, taking the nearest-neighbor interpolation algorithm as an example, the sampled image may be obtained as follows:
constructing a sampling matrix corresponding to each pixel point in the dog face image according to a preset sampling factor, wherein the number of elements in the sampling matrix is the square of the sampling factor; for example, in this embodiment, if the sampling factor is 2, the sampling matrix contains 4 elements (i.e., a 2×2 matrix);
initializing and setting the value of each element in all sampling matrixes to be 0;
taking the coordinates of each pixel point in the dog face image as the coordinates of a sampling matrix corresponding to each pixel point, wherein the exemplary sampling matrix coordinates corresponding to the pixel point with the coordinates of [2,3] are [2,3 ];
taking the pixel value of each pixel point in the dog face image as the value of all elements in a sampling matrix with the same coordinate;
and combining the sampling matrixes according to the coordinates corresponding to all the sampling matrixes to obtain a sampling image.
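As an illustration of the nearest-neighbor up-sampling described above, the following is a minimal sketch in Python using only NumPy; the function name and the sampling factor of 2 are chosen here for illustration and are not mandated by this embodiment:

```python
import numpy as np

def nearest_neighbor_upsample(image: np.ndarray, factor: int = 2) -> np.ndarray:
    """Up-sample an H x W (or H x W x C) image by repeating each pixel
    into a factor x factor sampling matrix, as described above."""
    # Each pixel at (i, j) becomes a factor x factor block whose elements
    # all take the pixel's original value.
    return np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)

# Usage example: a 2 x 2 image becomes a 4 x 4 sampled image.
img = np.array([[10, 20],
                [30, 40]], dtype=np.uint8)
print(nearest_neighbor_upsample(img, factor=2))
# [[10 10 20 20]
#  [10 10 20 20]
#  [30 30 40 40]
#  [30 30 40 40]]
```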
In this optional embodiment, the gray value of each pixel point in the sampled image may be calculated according to a preset gray value calculation formula to obtain a gray image, where the gray value calculation formula satisfies the following relation:
Gray=0.299*R+0.587*G+0.114*B
wherein Gray represents the Gray value of a certain pixel point in the Gray image; r represents the value of the R channel corresponding to the pixel point; g represents a G channel value corresponding to the pixel point; and B represents the value of the B channel corresponding to the pixel point.
In this optional embodiment, since the grayscale image is obtained by upsampling, a blur condition may occur, and therefore, a preset sharpening formula may be used to sharpen the grayscale image to obtain a sharpened image, where taking any one pixel in the grayscale image as an example, the preset sharpening formula satisfies the following relational expression:
P=(1+γ)*Gray-γ*m
wherein, P represents the sharpening value of the pixel point; gray represents the Gray value of the pixel point; m represents the mean value of the gray values of all pixel points in the gray image; γ represents a preset sharpening parameter, which may be 0.5 based on experience of multiple experiments.
Illustratively, when the gray value of a certain pixel in the gray image is 100 and the mean value of the gray values of all pixels in the gray image is 80, the method for calculating the sharpening value corresponding to the pixel is as follows:
P=(1+0.5)*100-0.5*80=110
the sharpening value corresponding to the pixel point is 110.
In this alternative embodiment, each sharpening value may be used as a pixel value corresponding to each pixel point to obtain a transcoded image.
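The grayscale conversion and sharpening steps described above can be summarised in a short sketch; it assumes an H×W×3 RGB input array and uses the weighting formula and γ = 0.5 given in this embodiment (the function names are illustrative only):

```python
import numpy as np

def to_gray(rgb: np.ndarray) -> np.ndarray:
    """Gray = 0.299*R + 0.587*G + 0.114*B for an H x W x 3 RGB image."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def sharpen(gray: np.ndarray, gamma: float = 0.5) -> np.ndarray:
    """P = (1 + gamma) * Gray - gamma * m, where m is the mean gray value."""
    m = gray.mean()
    return (1 + gamma) * gray - gamma * m

# Usage: a pixel with gray value 100 and image mean 80 yields 1.5*100 - 0.5*80 = 110.
gray = np.array([[100.0, 80.0], [60.0, 80.0]])
print(sharpen(gray))  # the mean is 80, so the first pixel becomes 110.0
```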
In this way, the dog face image is up-sampled to obtain a sampled image that highlights the spatial information of the dog face image, the gray value of each pixel point in the sampled image is calculated to obtain a grayscale image, and the grayscale image is further sharpened to obtain a transcoded image; this enhances the contrast of the dog face image while enriching its spatial information, thereby improving the accuracy of the subsequent image block division.
S11, the transcoded image is divided into a plurality of image blocks.
In an optional embodiment, the segmenting the transcoded image to obtain a plurality of image blocks includes:
constructing a feature descriptor of each pixel point in the transcoding image;
setting a plurality of clustering centers in the transcoded image according to the preset number of the categories of the key points of the dog face;
and clustering each pixel point in the transcoded image based on the feature descriptors and the clustering center to obtain a plurality of image blocks.
In this optional embodiment, a feature descriptor of each pixel point may be constructed according to a pixel value and a coordinate of each pixel point in the transcoded image, where the feature descriptor is used to represent a spatial feature of each pixel point and a light-dark feature of each pixel point, and the form of the feature descriptor may be [ P, x, y ], where P represents a sharpening value of each pixel point in the transcoded image, x represents an abscissa of each pixel point, and y represents an ordinate of each pixel point.
Illustratively, when the sharpening value of a certain pixel point in the transcoded image is 50 and the coordinates of the pixel point are [2,3], the feature descriptor of the pixel point is [50,2,3].
In this optional embodiment, a plurality of pixel points may be initialized in the transcoded image as a clustering center according to a preset number of dog face key point categories, where the number of the dog face key point categories is a positive integer greater than 1.
For example, when the category of the key points of the dog face includes [ eyes, ears, nose ], the number of the key point categories is 3, and 3 pixel points can be selected from the transcoded image as a cluster center.
In this optional embodiment, the specific implementation steps of the clustering are as follows:
a, selecting a pixel point from the transcoded image as a target pixel point;
b, respectively calculating the similarity between the target pixel point and each clustering center, and classifying the target pixel point into the clustering center corresponding to the maximum similarity, wherein the similarity is calculated according to the following relation:
S = (A_1×B_1 + A_2×B_2 + ... + A_k×B_k) / (√(A_1² + ... + A_k²) × √(B_1² + ... + B_k²))
wherein S represents the similarity between the target pixel point and the clustering center, and a greater similarity indicates that the target pixel point and the clustering center are more similar; A_i represents the value of the i-th dimension in the feature descriptor of the clustering center; B_i represents the value of the i-th dimension in the feature descriptor of the target pixel point; and k represents the number of dimensions of the feature descriptor, k being 3 in the present scheme.
For example, when the feature descriptor of a certain clustering center is [50,2,3] and the feature descriptor of a target pixel point is [60,1,2], substituting these descriptors into the above relation, the similarity between the target pixel point and the clustering center is 0.99.
In this alternative embodiment, the target pixel point and the cluster center corresponding to the maximum similarity may be classified into the same cluster;
c, respectively taking each pixel point in the transcoded image as a target pixel point, and repeating the step b to obtain a plurality of cluster clusters, wherein each cluster comprises a plurality of pixel points;
d, calculating the mean value of the feature descriptors of all the pixel points in each cluster as the mean value corresponding to that cluster, and calculating the difference between the mean value and the clustering center; if the difference is sufficiently small (that is, the clustering has converged), outputting the plurality of clusters; otherwise, taking the mean value as the new clustering center and repeating steps a to d.
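A minimal sketch of the clustering procedure in steps a to d is given below; it assumes cosine similarity as the similarity measure (consistent with the [P, x, y] descriptors and the 0.99 example above) and a simple convergence threshold, both of which are illustrative assumptions rather than fixed choices of this embodiment:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # S = sum(A_i * B_i) / (||A|| * ||B||); a larger S means more similar.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def cluster_descriptors(descriptors: np.ndarray, n_clusters: int,
                        tol: float = 1e-3, max_iter: int = 100) -> np.ndarray:
    """descriptors: (num_pixels, 3) array of [P, x, y] feature descriptors.
    Returns one cluster index per pixel (each cluster = one image block)."""
    rng = np.random.default_rng(0)
    # Initialization: pick n_clusters pixels as the initial clustering centers.
    centers = descriptors[rng.choice(len(descriptors), n_clusters, replace=False)].astype(float)
    for _ in range(max_iter):
        # Steps a-c: assign every pixel to the center with the maximum similarity.
        labels = np.array([
            int(np.argmax([cosine_similarity(d, c) for c in centers]))
            for d in descriptors
        ])
        # Step d: recompute each cluster mean and check how far it moved.
        new_centers = np.array([descriptors[labels == k].mean(axis=0)
                                if np.any(labels == k) else centers[k]
                                for k in range(n_clusters)])
        if np.max(np.abs(new_centers - centers)) < tol:  # difference small: stop
            break
        centers = new_centers  # difference large: use the means as new centers
    return labels

# Usage: 3 key point categories (e.g. eyes, ears, nose) -> 3 clusters / image blocks.
# descriptors = build_descriptors(transcoded_image)   # hypothetical helper
# labels = cluster_descriptors(descriptors, n_clusters=3)
```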
In the optional embodiment, each cluster represents an image block, each image block includes a plurality of pixel points, and the similarity of all the pixel points in each image block is high.
Illustratively, fig. 4 is a schematic diagram of the plurality of image blocks.
Therefore, a plurality of pixel points in the transcoded image are initialized and set as clustering centers through the preset number of the key point types of the dog face, the pixel points in the transcoded image are clustered according to the clustering centers to generate a plurality of image blocks, and data support is provided for subsequently calculating the key points of the image.
S12, calculating the key points of each image block, and labeling the transcoded image according to the key points to obtain a labeled image.
In an optional embodiment, the calculating the keypoints of each image block and labeling the transcoded image according to the keypoints to obtain a labeled image includes:
calculating the coordinates of key points of each image block according to the coordinates of all pixel points in each image block;
setting a virtual anchor frame of each image block according to the key point coordinates and a preset width and height parameter;
and setting a category label for each virtual anchor frame by using a preset labeling tool to obtain a labeled image.
In this optional embodiment, the key point coordinates of each image block may be calculated according to the coordinates of all pixel points in each image block, and taking an example of a certain image block in the transcoded image, the calculation manner of the key point coordinates in the image block satisfies the following relational expression:
x = (x_min + x_max)/2
y = (y_min + y_max)/2
wherein x represents the abscissa of the key point; x_min represents the minimum value of the abscissas of all the pixel points in the image block, and x_max represents the maximum value of the abscissas of all the pixel points in the image block; y represents the ordinate of the key point; y_min represents the minimum value of the ordinates of all the pixel points in the image block, and y_max represents the maximum value of the ordinates of all the pixel points in the image block.
In this optional embodiment, the virtual anchor frame of each image block may be set according to the coordinates of the key point and a preset width-height parameter, where the width-height parameter includes a width parameter w and a height parameter h, and for example, when the coordinates of the key point of a certain image block are [ x, y ], and the width parameter w corresponding to the key point is 10 and the height parameter h is 20, the coordinates of the geometric center of the virtual anchor frame corresponding to the image block are [ x, y ], and the width of the virtual anchor frame corresponding to the image block is 10 and the height is 20.
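As a small illustration of the key point coordinate formula and the virtual anchor frame construction described above (the width and height parameters follow the example values in this embodiment; the function names and sample pixel coordinates are illustrative):

```python
import numpy as np

def block_keypoint(xs: np.ndarray, ys: np.ndarray) -> tuple[float, float]:
    """x = (x_min + x_max)/2, y = (y_min + y_max)/2 over all pixels of the block."""
    return (xs.min() + xs.max()) / 2.0, (ys.min() + ys.max()) / 2.0

def virtual_anchor(keypoint: tuple[float, float], w: float, h: float):
    """Anchor frame centered on the key point, returned as (x_min, y_min, x_max, y_max)."""
    x, y = keypoint
    return (x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0)

# Usage: pixels of one image block and the preset width/height parameters from the example.
xs = np.array([4, 5, 6, 7, 8]); ys = np.array([10, 11, 12, 13, 14])
kp = block_keypoint(xs, ys)            # (6.0, 12.0)
print(virtual_anchor(kp, w=10, h=20))  # (1.0, 2.0, 11.0, 22.0)
```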
In this optional embodiment, a label may be set for each virtual anchor frame according to a preset labeling tool, and in this embodiment, the labeling tool may be a labelme tool, and the function of the labeling tool is to give a label to a pixel point in an image. And setting the category label of each key point by using the labelme tool, and taking the category label of each key point as the category labels of all pixel points in the virtual anchor frame corresponding to the key point.
In this alternative embodiment, a transcoded image with multiple key points may be used as a marker image, each marker image includes at least one key point, each key point corresponds to a virtual anchor frame, and the categories of all pixel points in the virtual anchor frame are the same, for example, the categories may include dog face key parts such as [ ear, eye, nose ], and the like.
In this optional embodiment, the marker image has a plurality of key points, and each key point corresponds to a category label and a set of width and height parameters. In this scheme, the probability that the pixel point at coordinates [x, y] in the marker image belongs to a key point of category i is denoted as p_{i,x,y}. Illustratively, when the pixel point at coordinates [2,3] in the marker image is a key point of the category 'ear', then p_{ear,2,3} = 1.
Illustratively, fig. 5 is a schematic diagram of the marker image.
Therefore, the positions of the key points of each image block are calculated through the coordinates of all the pixel points in each image block, and further data labeling is carried out on all the key points to obtain a labeled image, so that the accuracy and the efficiency of the data labeling can be improved.
S13, training a dog face key point detection model based on the marked image and the transcoded image.
In an optional embodiment, the training of the dog face keypoint detection model based on the labeled image and the transcoded image comprises:
taking the transcoding image as a sample image, taking the marking image as a label image, and storing the sample image and the label image to obtain a training data set;
constructing an initial dog face key point detection model, wherein the initial dog face key point detection model comprises an encoder and a decoder;
and sequentially inputting the sample images into the initial dog face key point detection model to obtain a detection result, and updating parameters of the initial dog face key point detection model according to a preset loss function to obtain the dog face key point detection model.
In this alternative embodiment, all the transcoded images may be used as sample images, and the label image corresponding to each transcoded image may be used as a label image, and further, all the sample images and the label image may be stored to obtain a training data set.
In this optional embodiment, the encoder may be an existing neural network structure such as ResNet (residual network), DLA (deep layer aggregation network), or Hourglass (stacked hourglass network); the decoder includes a first decoder, a second decoder and a third decoder, and each decoder may be an existing feature extraction network such as CNN (convolutional neural network) or R-CNN (region-based convolutional neural network).
In this alternative embodiment, the input data and the output data of the encoder and the decoder include:
the input of the encoder is a sample image, the output of the encoder is a feature map, and the size of the feature map is the same as that of the sample image;
the input of the decoder is the feature map; the output of the first decoder is a plurality of predicted heat maps corresponding to the sample image, all the predicted heat maps have the same size as the sample image, each predicted heat map includes at least one key point, all the key points in each predicted heat map have the same category, and the value of each pixel point in each predicted heat map represents the probability that the pixel point belongs to a key point of the category corresponding to that heat map; illustratively, when the label of a predicted heat map is 'ear' and there is a pixel point with coordinates [1,2] and a pixel value of 1 in the predicted heat map, the probability that the pixel point is a key point of the category 'ear' is 1; in this embodiment, the probability value of each pixel point in the predicted heat map may be recorded as p̂_{i,x,y}, which represents the probability that the pixel point at coordinates [x, y] in the predicted heat map belongs to a key point of category i;
the output of the second decoder is a predicted coordinate, the predicted coordinate being the coordinate of a key point in the sample image predicted by the second decoder, and the predicted coordinate may be recorded as (x̂, ŷ);
the output of the third decoder is a predicted width and height parameter, the predicted width and height parameter being the width and height of the virtual anchor frame corresponding to each key point in the sample image predicted by the third decoder; in this scheme, the predicted width and height parameters may be recorded as ŵ and ĥ, wherein ŵ represents the predicted width parameter output by the third decoder and ĥ represents the predicted height parameter output by the third decoder.
Illustratively, as shown in fig. 6, a schematic structural diagram of the initial dog face key point detection model is shown.
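The encoder/decoder arrangement described above resembles a CenterNet-style detector: a shared backbone followed by a heat map head, a coordinate head and a width-height head. The following PyTorch sketch is only a minimal illustration of that structure under this assumption; the backbone, channel counts and head shapes are placeholders rather than the exact architecture of this embodiment:

```python
import torch
import torch.nn as nn

class DogFaceKeypointNet(nn.Module):
    def __init__(self, num_categories: int = 3, feat_channels: int = 64):
        super().__init__()
        # Encoder: placeholder for ResNet / DLA / Hourglass; here a tiny conv stack
        # that keeps the spatial size, since the feature map matches the sample image size.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, feat_channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_channels, feat_channels, 3, padding=1), nn.ReLU(),
        )
        def head(out_channels: int) -> nn.Sequential:  # one small decoder head per output
            return nn.Sequential(
                nn.Conv2d(feat_channels, feat_channels, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat_channels, out_channels, 1),
            )
        self.heatmap_head = head(num_categories)  # first decoder: one heat map per category
        self.coord_head = head(2)                 # second decoder: predicted coordinates
        self.size_head = head(2)                  # third decoder: predicted width/height

    def forward(self, x):
        feat = self.encoder(x)
        return (torch.sigmoid(self.heatmap_head(feat)),  # probabilities in (0, 1)
                self.coord_head(feat),
                self.size_head(feat))

# Usage: a single-channel (transcoded/grayscale) 128x128 sample image.
model = DogFaceKeypointNet(num_categories=3)
heatmaps, coords, sizes = model(torch.randn(1, 1, 128, 128))
print(heatmaps.shape, coords.shape, sizes.shape)  # (1,3,128,128) (1,2,128,128) (1,2,128,128)
```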
In this optional embodiment, in order to ensure that the output of the dog face key point detection model is as similar as possible to the tag image, it is necessary to perform iterative training on the initial dog face key point detection model according to a preset loss function to update parameters of the initial dog face key point detection model, where the preset loss function includes a heat map loss, a displacement loss, and a regression loss, and the heat map loss satisfies the following relation:
wherein L_heatmap represents the heat map loss, and a smaller heat map loss value indicates that the key points in the predicted heat map are more similar to the key points in the sample image; N represents the number of key point categories in the label image; n_x represents the width of the predicted heat map, namely the number of coordinate points in the row direction of the predicted heat map; n_y represents the height of the predicted heat map, namely the number of coordinate points in the column direction of the predicted heat map; x and y represent the coordinates of a pixel point in the heat map; α and β represent preset blending parameters, and α = 2 and β = 4 may be set in this embodiment; p̂_{i,x,y} represents the probability that the pixel point with coordinates (x, y) in the predicted heat map belongs to a key point of category i; illustratively, p̂_{ear,2,3} = 1 indicates that the probability that the pixel point at coordinates [2,3] in the heat map belongs to a key point of the 'ear' category is 1, and p̂_{nose,50,50} = 0.8 indicates that the probability that the pixel point at coordinates [50,50] in the heat map belongs to a key point of the 'nose' category is 0.8.
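The blending parameters α = 2 and β = 4 above match the penalty-reduced focal loss commonly used for heat maps in CenterNet-style detectors; the following sketch shows that commonly used form purely as an illustrative assumption, not necessarily the exact heat map loss relation of this embodiment:

```python
import torch

def heatmap_focal_loss(pred: torch.Tensor, gt: torch.Tensor,
                       alpha: float = 2.0, beta: float = 4.0) -> torch.Tensor:
    """CenterNet-style penalty-reduced focal loss (an assumed form).
    pred, gt: (N_categories, n_y, n_x) heat maps with values in [0, 1]."""
    pred = pred.clamp(1e-6, 1 - 1e-6)            # numerical stability
    pos = gt.eq(1).float()                       # pixels that are key points
    neg = 1.0 - pos
    pos_loss = pos * (1 - pred) ** alpha * torch.log(pred)
    neg_loss = neg * (1 - gt) ** beta * pred ** alpha * torch.log(1 - pred)
    num_pos = pos.sum().clamp(min=1.0)
    return -(pos_loss.sum() + neg_loss.sum()) / num_pos

# Usage with a single 'ear' heat map whose key point sits at coordinates [2, 3].
gt = torch.zeros(1, 8, 8); gt[0, 3, 2] = 1.0
pred = torch.full((1, 8, 8), 0.05); pred[0, 3, 2] = 0.9
print(heatmap_focal_loss(pred, gt))
```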
In this optional embodiment, a regression loss function may be constructed according to the width and height parameters and the predicted width and height parameters in the label image, where the regression loss function satisfies the following relation:
wherein L_regression represents the regression loss, the regression loss is used for representing the difference between the key point range size in the predicted heat map and the key point range size in the label image, and a smaller regression loss value indicates that the predicted width and height parameters are more similar to the width and height parameters in the label image; M represents the number of key points in the label image; w and h respectively represent the width and height parameters corresponding to the key point of the k-th category in the label image; and ŵ and ĥ represent the predicted width and height parameters corresponding to the predicted k-th key point.
In this alternative embodiment, the displacement loss function satisfies the following relationship:
wherein L_offset represents the loss value of the displacement loss function, and a smaller displacement loss value indicates that the predicted key point coordinates are closer to the coordinates of the key points in the label image; M represents the number of key points in the label image, and i represents the index of a key point in the label image; and d_i represents the Euclidean distance between the predicted coordinates and the coordinates of the i-th key point in the label image.
In this alternative embodiment, the overall loss value of the initial dog face key point detection model may be calculated according to the heat map loss, the regression loss and the displacement loss, and the overall loss value may be calculated in a manner satisfying the following relation:
Loss = L_regression + A×L_heatmap + B×L_offset
wherein Loss represents the total loss value, and a smaller total loss value indicates that the output of the initial dog face key point detection model is more similar to the label image and that the performance of the initial dog face key point detection model is better; L_regression represents the regression loss value; L_heatmap represents the heat map loss value; L_offset represents the displacement loss value; and A and B represent preset weighting parameters, where, according to experience obtained through multiple experiments, A may be 2 and B may be 4.
In this optional embodiment, the sample image may be sequentially input into the initial dog face key point detection model to obtain a key point detection result corresponding to the sample image, where the detection result includes a plurality of heatmaps, coordinates of each key point, and a width and height parameter corresponding to each key point. Further, the overall loss value may be calculated according to the label image corresponding to the sample image and the detection result, the overall loss value is used to characterize the difference between the label image and the prediction result, and a smaller overall loss value indicates that the detection result is more similar to the label image.
In this optional embodiment, parameters of the initial dog face key point detection model may be iteratively updated by using a gradient descent method, until the total loss value is smaller than a preset termination threshold value, the iteration is stopped to obtain the dog face key point detection model, and the termination threshold value may be 0.001 according to experience obtained through multiple experiments.
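As a sketch of how the total loss and the termination check described above fit together (only the weights A = 2 and B = 4 and the termination threshold 0.001 come from this embodiment; the example loss values are illustrative):

```python
import torch

def total_loss(l_regression: torch.Tensor, l_heatmap: torch.Tensor,
               l_offset: torch.Tensor, a: float = 2.0, b: float = 4.0) -> torch.Tensor:
    """Loss = L_regression + A * L_heatmap + B * L_offset (A = 2, B = 4 per this embodiment)."""
    return l_regression + a * l_heatmap + b * l_offset

# Usage: combine the per-term losses and decide whether to stop the iterative update.
loss = total_loss(torch.tensor(0.0004), torch.tensor(0.0001), torch.tensor(0.00005))
print(loss.item())          # 0.0004 + 2*0.0001 + 4*0.00005 = 0.0008
print(loss.item() < 0.001)  # below the termination threshold -> stop iterating
```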
In this way, the initial dog face key point detection model is trained based on a large number of marked images and transcoded images to obtain the dog face key point detection model, and the parameters of the initial dog face key point detection model are continuously updated according to the preset loss function, which can improve the performance of the dog face key point detection model and further improve the accuracy of dog face key point detection.
S14, inputting the image to be detected into the dog face key point detection model to obtain a detection result.
In an optional embodiment, the inputting the image to be detected into the dog face key point detection model to obtain a detection result includes:
inputting the image to be detected into the dog face key point detection model to obtain a plurality of prediction heat maps, the category of key points in each prediction heat map and prediction width and height parameters corresponding to each key point;
and dividing a real anchor frame in the image to be detected according to the category of the key points in the prediction heat map and the prediction width and height parameters, and taking the real anchor frame as a detection result.
In this alternative embodiment, a dog face image to be detected may be input into the dog face key point detection model to obtain a plurality of prediction heat maps and prediction width and height parameters, each prediction heat map corresponds to one prediction category tag, the prediction category tags are categories of all key points in the heat map, each prediction heat map includes at least one key point, and the prediction width and height parameters are used to characterize a range of key portions in the dog face image, as an example, as shown in fig. 7, a schematic diagram of the detection result is shown.
In this optional embodiment, the pixel points at the same position in the image to be detected can be searched for as the predicted key points of the image to be detected according to the pixel points with the pixel value of 1 in the predicted heat map, and the category of the heat map is used as the category of the key points. Further, dividing a real anchor frame in the image to be detected according to the prediction width and height parameters and the prediction key points, and taking the real anchor frame as a detection result, wherein the real anchor frame is used for representing that all pixel points in the anchor frame belong to corresponding categories.
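A minimal sketch of the inference step described above, namely locating predicted key points from heat map pixels with value 1 and drawing the real anchor frame around them, is shown below; representing the predicted width and height parameters as per-pixel maps and the function name are illustrative assumptions:

```python
import numpy as np

def decode_detections(heatmaps: np.ndarray, widths: np.ndarray, heights: np.ndarray,
                      categories: list[str]):
    """heatmaps: (C, H, W) predicted heat maps; widths/heights: (C, H, W) predicted
    width/height parameters. Returns (category, keypoint, anchor_frame) triples."""
    results = []
    for c, name in enumerate(categories):
        # Key points are pixels whose predicted probability equals 1 in the heat map.
        ys, xs = np.where(heatmaps[c] == 1)
        for x, y in zip(xs, ys):
            w, h = widths[c, y, x], heights[c, y, x]
            # Real anchor frame centered on the key point, as (x_min, y_min, x_max, y_max).
            anchor = tuple(float(v) for v in (x - w / 2, y - h / 2, x + w / 2, y + h / 2))
            results.append((name, (int(x), int(y)), anchor))
    return results

# Usage: one 'nose' heat map with a single key point at [4, 5] and a 10 x 20 anchor frame.
hm = np.zeros((1, 16, 16)); hm[0, 5, 4] = 1
w = np.full((1, 16, 16), 10.0); h = np.full((1, 16, 16), 20.0)
print(decode_detections(hm, w, h, ["nose"]))
# [('nose', (4, 5), (-1.0, -5.0, 9.0, 15.0))]
```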
Therefore, based on the detection result calculated from the output data of the dog face key point detection model, a relatively accurate detection result can be obtained without non-maximum suppression, which improves the efficiency and accuracy of dog face key point detection.
According to the dog face key point detection method based on artificial intelligence, the transcoding image is obtained by transcoding the dog face image, the transcoding image is divided to obtain a plurality of image blocks, the key points of each image block are calculated, data labeling is carried out to obtain the labeled image, the dog face key point detection model is trained based on the labeled image and the transcoding image, key points in the image can be automatically labeled, the efficiency of model training is improved while the labeling cost is reduced, the spatial characteristics are applied in the model training process, the generalization capability of the model is improved, and the accuracy of dog face key point detection is improved.
Fig. 2 is a functional block diagram of a preferred embodiment of the artificial intelligence-based dog face key point detection apparatus according to the embodiment of the present application. The artificial intelligence-based dog face key point detection apparatus 11 comprises a transcoding unit 110, a segmentation unit 111, a marking unit 112, a training unit 113 and a detection unit 114. The module/unit referred to in this application refers to a series of computer program segments that can be executed by the processor 13 and that can perform a fixed function, and that are stored in the memory 12. In this embodiment, the functions of the modules/units will be described in detail in the following embodiments.
In an alternative embodiment, the transcoding unit 110 is configured to transcode the historical dog face image to obtain a transcoded image.
In an optional embodiment, the transcoding the historical dog face image to obtain a transcoded image includes:
carrying out up-sampling on the historical dog face image to obtain a sampling image;
calculating the gray value of each pixel point in the sampling image to obtain a gray image;
and sharpening the gray level image to obtain a transcoding image.
In this optional embodiment, in order to enhance the spatial information of the historical dog face image, the dog face image may be up-sampled according to a preset interpolation algorithm to obtain a sampled image.
In some embodiments, the interpolation algorithm may be an existing interpolation algorithm such as a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm, or a cubic interpolation algorithm, which is not limited in this application.
In this optional embodiment, taking the nearest-neighbor interpolation algorithm as an example, the sampled image may be obtained as follows:
constructing a sampling matrix corresponding to each pixel point in the dog face image according to a preset sampling factor, wherein the number of elements in the sampling matrix is the square of the sampling factor; for example, in this embodiment, if the sampling factor is 2, the sampling matrix contains 4 elements (i.e., a 2×2 matrix);
initializing and setting the value of each element in all sampling matrixes to be 0;
taking the coordinates of each pixel point in the dog face image as the coordinates of a sampling matrix corresponding to each pixel point, wherein exemplarily, the coordinates of the sampling matrix corresponding to the pixel point with the coordinates of [2,3] are [2,3 ];
taking the pixel value of each pixel point in the dog face image as the value of all elements in a sampling matrix with the same coordinate;
and combining the sampling matrixes according to the coordinates corresponding to all the sampling matrixes to obtain a sampling image.
In this optional embodiment, the gray value of each pixel point in the sampled image may be calculated according to a preset gray value calculation formula to obtain a gray image, where the gray value calculation formula satisfies the following relation:
Gray=0.299*R+0.587*G+0.114*B
wherein Gray represents the Gray value of a certain pixel point in the Gray image; r represents the value of the R channel corresponding to the pixel point; g represents a G channel value corresponding to the pixel point; and B represents the value of the B channel corresponding to the pixel point.
In this optional embodiment, since the grayscale image is obtained by upsampling, a blur condition may occur, and therefore, a preset sharpening formula may be used to sharpen the grayscale image to obtain a sharpened image, where taking any one pixel in the grayscale image as an example, the preset sharpening formula satisfies the following relational expression:
P=(1+γ)*Gray-γ*m
wherein, P represents the sharpening value of the pixel point; gray represents the Gray value of the pixel point; m represents the mean value of the gray values of all pixel points in the gray image; γ represents a preset sharpening parameter, which may be 0.5 based on experience with multiple trials.
Illustratively, when the gray value of a certain pixel in the gray image is 100 and the mean value of the gray values of all pixels in the gray image is 80, the method for calculating the sharpening value corresponding to the pixel is as follows:
P=(1+0.5)*100-0.5*80=110
the sharpening value corresponding to the pixel point is 110.
In this alternative embodiment, each sharpened value may be used as a pixel value corresponding to each pixel point to obtain a transcoded image.
In an alternative embodiment, the segmentation unit 111 is configured to segment the transcoded image to obtain a plurality of image blocks.
In an optional embodiment, the segmenting the transcoded image to obtain a plurality of image blocks includes:
constructing a feature descriptor of each pixel point in the transcoding image;
setting a plurality of clustering centers in the transcoded image according to the preset number of the categories of the key points of the dog face;
and clustering each pixel point in the transcoded image based on the feature descriptors and the clustering center to obtain a plurality of image blocks.
In this optional embodiment, a feature descriptor of each pixel point may be constructed according to a pixel value and a coordinate of each pixel point in the transcoded image, where the feature descriptor is used to represent a spatial feature of each pixel point and a light-dark feature of each pixel point, and the form of the feature descriptor may be [ P, x, y ], where P represents a sharpening value of each pixel point in the transcoded image, x represents an abscissa of each pixel point, and y represents an ordinate of each pixel point.
Illustratively, when the sharpening value of a certain pixel point in the transcoded image is 50 and the coordinates of the pixel point are [2,3], the feature descriptor of the pixel point is [50,2,3].
In this optional embodiment, a plurality of pixel points may be initialized in the transcoded image as a clustering center according to a preset number of dog face key point categories, where the number of the dog face key point categories is a positive integer greater than 1.
For example, when the category of the key points of the dog face includes [ eyes, ears, nose ], the number of the key point categories is 3, and 3 pixel points can be selected from the transcoded image as a cluster center.
In this optional embodiment, the specific implementation steps of the clustering are as follows:
a, selecting a pixel point from the transcoded image as a target pixel point;
b, respectively calculating the similarity between the target pixel point and each clustering center, and classifying the target pixel point into the clustering center corresponding to the maximum similarity, wherein the similarity is calculated according to the following relation:
S = (A_1×B_1 + A_2×B_2 + ... + A_k×B_k) / (√(A_1² + ... + A_k²) × √(B_1² + ... + B_k²))
wherein S represents the similarity between the target pixel point and the clustering center, and a greater similarity indicates that the target pixel point and the clustering center are more similar; A_i represents the value of the i-th dimension in the feature descriptor of the clustering center; B_i represents the value of the i-th dimension in the feature descriptor of the target pixel point; and k represents the number of dimensions of the feature descriptor, k being 3 in the present scheme.
For example, when the feature descriptor of a certain clustering center is [50,2,3] and the feature descriptor of a target pixel point is [60,1,2], substituting these descriptors into the above relation, the similarity between the target pixel point and the clustering center is 0.99.
In this alternative embodiment, the target pixel point and the cluster center corresponding to the maximum similarity may be classified into the same cluster;
c, respectively taking each pixel point in the transcoded image as a target pixel point and repeating the step b to obtain a plurality of cluster clusters, wherein each cluster comprises a plurality of pixel points;
d, calculating the mean value of the feature descriptors of all the pixel points in each cluster as the mean value corresponding to that cluster, and calculating the difference between the mean value and the clustering center; if the difference is sufficiently small (that is, the clustering has converged), outputting the plurality of clusters; otherwise, taking the mean value as the new clustering center and repeating steps a to d.
In this optional embodiment, each cluster represents an image block, each image block includes a plurality of pixel points, and the similarity of all the pixel points in each image block is high.
Illustratively, fig. 4 is a schematic diagram of the plurality of image blocks.
In an alternative embodiment, the labeling unit 112 is configured to calculate a key point for each image block and label the transcoded image according to the key points to obtain a labeled image.
In an optional embodiment, the calculating the keypoints of each image block and labeling the transcoded image according to the keypoints to obtain a labeled image includes:
calculating the coordinates of key points of each image block according to the coordinates of all pixel points in each image block;
setting a virtual anchor frame of each image block according to the key point coordinates and a preset width and height parameter;
and setting a category label for each virtual anchor frame by using a preset labeling tool to obtain a labeled image.
In this optional embodiment, the key point coordinates of each image block may be calculated according to the coordinates of all pixel points in each image block, and taking an example of a certain image block in the transcoded image, the calculation manner of the key point coordinates in the image block satisfies the following relational expression:
x = (x_min + x_max)/2
y = (y_min + y_max)/2
wherein x represents the abscissa of the key point; x_min represents the minimum value of the abscissas of all the pixel points in the image block, and x_max represents the maximum value of the abscissas of all the pixel points in the image block; y represents the ordinate of the key point; y_min represents the minimum value of the ordinates of all the pixel points in the image block, and y_max represents the maximum value of the ordinates of all the pixel points in the image block.
In this optional embodiment, the virtual anchor frame of each image block may be set according to the coordinates of the key points and a preset width-height parameter, where the width-height parameter includes a width parameter w and a height parameter h, and for example, when the coordinates of the key points of a certain image block are [ x, y ], and the width parameter w corresponding to the key points is 10 and the height parameter h is 20, the coordinates of the geometric center of the virtual anchor frame corresponding to the image block are [ x, y ], and the width of the virtual anchor frame corresponding to the image block is 10 and the height is 20.
In this optional embodiment, a label may be set for each virtual anchor frame according to a preset labeling tool, and in this embodiment, the labeling tool may be a labelme tool, and the function of the labeling tool is to give a label to a pixel point in an image. And setting the category label of each key point by using the labelme tool, and taking the category label of each key point as the category labels of all pixel points in the virtual anchor frame corresponding to the key point.
In this alternative embodiment, a transcoded image with multiple key points may be used as a marker image, each marker image includes at least one key point, each key point corresponds to a virtual anchor frame, and the categories of all pixel points in the virtual anchor frame are the same, for example, the categories may include key parts of a dog face such as [ ear, eye, nose ], and the like.
In this optional embodiment, the marker image has a plurality of key points, and each key point corresponds to a category label and a set of width and height parameters. In this scheme, the probability that the pixel point at coordinates [x, y] in the marker image belongs to a key point of category i is denoted as p_{i,x,y}. Illustratively, when the pixel point at coordinates [2,3] in the marker image is a key point of the category 'ear', then p_{ear,2,3} = 1.
Illustratively, fig. 5 is a schematic diagram of the marker image.
In an alternative embodiment, the training unit 113 is configured to train a dog face key point detection model based on the labeled images and the transcoded images.
In an optional embodiment, the training of the dog face keypoint detection model based on the labeled image and the transcoded image comprises:
taking the transcoded image as a sample image, taking the marked image as a label image, and storing the sample image and the label image to obtain a training data set;
constructing an initial dog face key point detection model, wherein the initial dog face key point detection model comprises an encoder and a decoder;
and sequentially inputting the sample images into the initial dog face key point detection model to obtain a detection result, and updating parameters of the initial dog face key point detection model according to a preset loss function to obtain the dog face key point detection model.
In this alternative embodiment, all the transcoded images may be used as sample images, and the label image corresponding to each transcoded image may be used as a label image, and further, all the sample images and the label image may be stored to obtain a training data set.
In this optional embodiment, the encoder may be an existing neural network structure such as ResNet (residual network), DLA (deep layer aggregation network), or Hourglass (stacked hourglass network); the decoder includes a first decoder, a second decoder and a third decoder, and each decoder may be an existing feature extraction network such as CNN (convolutional neural network) or R-CNN (region-based convolutional neural network).
In this alternative embodiment, the input data and the output data of the encoder and the decoder include:
the input of the encoder is a sample image, the output of the encoder is a feature map, and the size of the feature map is the same as that of the sample image;
the input of the decoder is the feature map; the output of the first decoder is a plurality of predicted heat maps corresponding to the sample image, all the predicted heat maps have the same size as the sample image, each predicted heat map includes at least one key point, all the key points in each predicted heat map have the same category, and the value of each pixel point in each predicted heat map represents the probability that the pixel point belongs to a key point of the category corresponding to that heat map; illustratively, when the label of a predicted heat map is 'ear' and there is a pixel point with coordinates [1,2] and a pixel value of 1 in the predicted heat map, the probability that the pixel point is a key point of the category 'ear' is 1; in this embodiment, the probability value of each pixel point in the predicted heat map may be recorded as p̂_{i,x,y}, which represents the probability that the pixel point at coordinates [x, y] in the predicted heat map belongs to a key point of category i;
the output of the second decoder is a predicted coordinate, the predicted coordinate being the coordinate of a key point in the sample image predicted by the second decoder, and the predicted coordinate may be recorded as (x̂, ŷ);
the output of the third decoder is a predicted width and height parameter, the predicted width and height parameter being the width and height of the virtual anchor frame corresponding to each key point in the sample image predicted by the third decoder; in this scheme, the predicted width and height parameters may be recorded as ŵ and ĥ, wherein ŵ represents the predicted width parameter output by the third decoder and ĥ represents the predicted height parameter output by the third decoder.
Illustratively, fig. 6 is a schematic structural diagram of the initial dog face key point detection model.
In this optional embodiment, in order to ensure that the output of the dog face key point detection model is as similar as possible to the tag image, it is necessary to perform iterative training on the initial dog face key point detection model according to a preset loss function to update parameters of the initial dog face key point detection model, where the preset loss function includes a heat map loss, a displacement loss, and a regression loss, and the heat map loss satisfies the following relation:
wherein L_heatmap represents the heat map loss, and a smaller heat map loss value indicates that the key points in the predicted heat map are more similar to the key points in the sample image; N represents the number of key point categories in the label image; n_x represents the width of the predicted heat map, namely the number of coordinate points in the row direction of the predicted heat map; n_y represents the height of the predicted heat map, namely the number of coordinate points in the column direction of the predicted heat map; x and y represent the coordinates of a pixel point in the heat map; α and β represent preset blending parameters, and α = 2 and β = 4 may be set in this embodiment; p̂_{i,x,y} represents the probability that the pixel point with coordinates (x, y) in the predicted heat map belongs to a key point of category i; illustratively, p̂_{ear,2,3} = 1 indicates that the probability that the pixel point at coordinates [2,3] in the heat map belongs to a key point of the 'ear' category is 1, and p̂_{nose,50,50} = 0.8 indicates that the probability that the pixel point at coordinates [50,50] in the heat map belongs to a key point of the 'nose' category is 0.8.
In this alternative embodiment, a regression loss function may be constructed according to the width-height parameter and the predicted width-height parameter in the label image, where the regression loss function satisfies the following relation:
wherein L_regression represents the regression loss, the regression loss is used for representing the difference between the key point range size in the predicted heat map and the key point range size in the label image, and a smaller regression loss value indicates that the predicted width and height parameters are more similar to the width and height parameters in the label image; M represents the number of key points in the label image; w and h respectively represent the width and height parameters corresponding to the key point of the k-th category in the label image; and ŵ and ĥ represent the predicted width and height parameters corresponding to the predicted k-th key point.
In this alternative embodiment, the displacement loss function satisfies the following relationship:
wherein L_offset represents the loss value of the displacement loss function, and a smaller displacement loss value indicates that the predicted key point coordinates are closer to the coordinates of the key points in the label image; M represents the number of key points in the label image, and i represents the index of a key point in the label image; and d_i represents the Euclidean distance between the predicted coordinates and the coordinates of the i-th key point in the label image.
In this alternative embodiment, the overall loss value of the initial dog face keypoint detection model may be calculated according to the heat map loss, the regression loss, and the displacement loss, and the overall loss value may be calculated in a manner that satisfies the following relation:
Loss = L_regression + A×L_heatmap + B×L_offset
wherein Loss represents the total loss value, and a smaller total loss value indicates that the output of the initial dog face key point detection model is more similar to the label image and that the performance of the initial dog face key point detection model is better; L_regression represents the regression loss value; L_heatmap represents the heat map loss value; L_offset represents the displacement loss value; and A and B represent preset weighting parameters, where, according to experience obtained through multiple experiments, A may be 2 and B may be 4.
In this optional embodiment, the sample image may be sequentially input into the initial dog face key point detection model to obtain a key point detection result corresponding to the sample image, where the detection result includes a plurality of heatmaps, coordinates of each key point, and a width and height parameter corresponding to each key point. Further, the overall loss value may be calculated according to the label image corresponding to the sample image and the detection result, the overall loss value is used to characterize the difference between the label image and the prediction result, and a smaller overall loss value indicates that the detection result is more similar to the label image.
In this optional embodiment, parameters of the initial dog face key point detection model may be iteratively updated by using a gradient descent method, until the total loss value is smaller than a preset termination threshold value, the iteration is stopped to obtain the dog face key point detection model, and the termination threshold value may be 0.001 according to experience obtained through multiple experiments.
In an alternative embodiment, the detection unit 114 is configured to input the image to be detected into the dog face key point detection model to obtain a detection result.
In an optional embodiment, the inputting the image to be detected into the dog face key point detection model to obtain a detection result includes:
inputting the image to be detected into the dog face key point detection model to obtain a plurality of prediction heat maps, the category of key points in each prediction heat map and prediction width and height parameters corresponding to each key point;
and dividing a real anchor frame in the image to be detected according to the category of the key points in the prediction heat map and the prediction width and height parameters, and taking the real anchor frame as a detection result.
In this alternative embodiment, a dog face image to be detected may be input into the dog face key point detection model to obtain a plurality of prediction heat maps and prediction width and height parameters, each prediction heat map corresponds to one prediction category tag, the prediction category tag is a category of all key points in the heat map, each prediction heat map includes at least one key point, and the prediction width and height parameters are used to characterize a range of key portions in the dog face image, as an example, as shown in fig. 7, a schematic diagram of the detection result is shown.
In this optional embodiment, the pixel points at the same position in the image to be detected can be searched for as the predicted key points of the image to be detected according to the pixel points with the pixel value of 1 in the predicted heat map, and the category of the heat map is used as the category of the key points. Further, dividing a real anchor frame in the image to be detected according to the prediction width and height parameters and the prediction key points, and taking the real anchor frame as a detection result, wherein the real anchor frame is used for representing that all pixel points in the anchor frame belong to corresponding categories.
According to the dog face key point detection method based on artificial intelligence, the transcoding image is obtained by transcoding the dog face image, the transcoding image is divided to obtain a plurality of image blocks, the key points of each image block are calculated, data labeling is carried out to obtain the labeled image, the dog face key point detection model is trained based on the labeled image and the transcoding image, key points in the image can be automatically labeled, the efficiency of model training is improved while the labeling cost is reduced, the spatial characteristics are applied in the model training process, the generalization capability of the model is improved, and the accuracy of dog face key point detection is improved.
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 1 comprises a memory 12 and a processor 13. The memory 12 is used for storing computer readable instructions, and the processor 13 is used for executing the computer readable instructions stored in the memory to implement the artificial intelligence based dog face key point detection method of any one of the above embodiments.
In an alternative embodiment, the electronic device 1 further comprises a bus and a computer program stored in the memory 12 and executable on the processor 13, such as an artificial intelligence based dog face key point detection program.
Fig. 3 only shows the electronic device 1 with components 12-13, and it will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, which may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
Referring to fig. 1, the memory 12 of the electronic device 1 stores a plurality of computer-readable instructions to implement an artificial intelligence based method for detecting key points on a dog face, and the processor 13 can execute the plurality of instructions to implement:
transcoding the historical dog face image to obtain a transcoded image;
segmenting the transcoding image to obtain a plurality of image blocks;
calculating key points of each image block, and marking the transcoded images according to the key points to obtain marked images;
training a dog face key point detection model based on the marked image and the transcoding image;
and inputting the image to be detected into the dog face key point detection model to obtain a detection result.
Specifically, for the specific implementation method of the above instructions by the processor 13, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated herein.
It will be understood by those skilled in the art that the schematic diagram is merely an example of the electronic device 1 and does not constitute a limitation of the electronic device 1; the electronic device 1 may have a bus-type structure or a star-type structure, and the electronic device 1 may further include more or less hardware or software than that shown in the figures, or a different arrangement of components; for example, the electronic device 1 may further include an input and output device, a network access device, etc.
It should be noted that the electronic device 1 is only an example, and other existing or future electronic products that may be adapted to the present application should also be included in the scope of protection of the present application and are included herein by reference.
Memory 12 includes at least one type of readable storage medium, which may be non-volatile or volatile. The readable storage medium includes flash memory, removable hard disks, multimedia cards, card type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. Thememory 12 may in some embodiments be an internal storage unit of theelectronic device 1, for example a removable hard disk of theelectronic device 1. Thememory 12 may also be an external storage device of theelectronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (FlashCard), and the like, provided on theelectronic device 1. Further, thememory 12 may also include both an internal storage unit and an external storage device of theelectronic device 1. Thememory 12 can be used not only to store application software installed in theelectronic device 1 and various types of data, such as codes of an artificial intelligence-based dog face key point detection program, etc., but also to temporarily store data that has been output or is to be output.
The processor 13 may, in some embodiments, be composed of an integrated circuit, for example a single packaged integrated circuit, or of a plurality of packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 13 is the control unit of the electronic device 1: it connects the various components of the electronic device 1 through various interfaces and lines, and executes the various functions of the electronic device 1 and processes its data by running or executing the programs or modules stored in the memory 12 (for example, the artificial intelligence based dog face key point detection program) and calling the data stored in the memory 12.
The processor 13 executes the operating system of the electronic device 1 and the various installed application programs. By executing the application programs, the processor 13 implements the steps of the above embodiments of the artificial intelligence based dog face key point detection method, such as the steps shown in fig. 1.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to complete the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments describe the execution of the computer program in the electronic device 1. For example, the computer program may be divided into a transcoding unit 110, a segmentation unit 111, a labeling unit 112, a training unit 113 and a detection unit 114.
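For illustration only, one possible way to partition the program into the units named above is sketched below; the class interfaces and method signatures are assumptions of this example rather than the embodiment's actual code.

```python
# Sketch of one possible partition of the program into the five named units;
# the interfaces shown are assumptions of this example, not the embodiment's code.
class TranscodingUnit:     # transcoding unit 110
    def run(self, historical_images): ...

class SegmentationUnit:    # segmentation unit 111
    def run(self, transcoded_image): ...

class LabelingUnit:        # labeling unit 112
    def run(self, image_blocks, transcoded_image): ...

class TrainingUnit:        # training unit 113
    def run(self, marked_images, transcoded_images): ...

class DetectionUnit:       # detection unit 114
    def run(self, image_to_detect, model): ...

# The processor 13 would execute the units in sequence to carry out the method steps.
UNITS = (TranscodingUnit, SegmentationUnit, LabelingUnit, TrainingUnit, DetectionUnit)
```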
An integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) or a processor to execute parts of the artificial intelligence based dog face key point detection method according to the embodiments of the present application.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the processes in the methods of the embodiments described above may be implemented by a computer program, which may be stored in a computer-readable storage medium and executed by a processor to implement the steps of the embodiments of the methods described above.
The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), and the like.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The blockchain referred to in the present application is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks associated with one another by cryptographic methods, where each data block contains the information of a batch of network transactions and is used for verifying the validity (anti-counterfeiting) of that information and for generating the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
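Purely as an illustration of the chaining described above (and not part of the claimed detection method), each block may carry the hash of its predecessor, so that altering any earlier block breaks the link to every later one; the field names below are assumptions of this example.

```python
# Minimal hash-chained block illustration; field names are assumptions and this
# is not part of the claimed detection method.
import hashlib
import json

def make_block(transactions, prev_hash):
    """Create a block whose hash covers its transactions and the previous block's hash."""
    body = {"transactions": transactions, "prev_hash": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

genesis = make_block(["tx0"], prev_hash="0" * 64)
block_1 = make_block(["tx1", "tx2"], prev_hash=genesis["hash"])
# Changing genesis would change its hash and thereby invalidate block_1's prev_hash link.
```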
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one arrow is shown in fig. 3, but this does not mean that there is only one bus or only one type of bus. The bus is arranged to enable communication between the memory 12, the at least one processor 13 and other components.
Although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to the various components. Preferably, the power supply may be logically connected to the at least one processor 13 through a power management device, so that functions such as charge management, discharge management and power consumption management are implemented through the power management device. The power supply may also include one or more DC or AC power sources, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like. The electronic device 1 may further include various sensors, a Bluetooth module, a Wi-Fi module and the like, which are not described herein again.
Further, the electronic device 1 may include a network interface; optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface, a Bluetooth interface, etc.), which is generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may include a display and an input unit (such as a keyboard); optionally, the user interface may also include a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.
The embodiment of the present application further provides a computer-readable storage medium (not shown), in which computer-readable instructions are stored, and the computer-readable instructions are executed by a processor in an electronic device to implement the method for detecting a key point of a dog face based on artificial intelligence according to any of the above embodiments.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the specification may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present application and not for limiting, and although the present application is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made on the technical solutions of the present application without departing from the spirit and scope of the technical solutions of the present application.