CN113132755B - Method and system for encoding extensible man-machine cooperative image and method for training decoder - Google Patents


Info

Publication number
CN113132755B
Authority
CN
China
Prior art keywords
edge map
code stream
auxiliary information
image
compact representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911415561.7A
Other languages
Chinese (zh)
Other versions
CN113132755A (en)
Inventor
刘家瑛
胡越予
杨帅
王德昭
郭宗明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN201911415561.7A
Publication of CN113132755A
Application granted
Publication of CN113132755B
Legal status: Active
Anticipated expiration


Abstract

The invention discloses a scalable human-machine collaborative image coding method and system. The method comprises the following steps: extracting the edge map of each sample picture and vectorizing it as a compact representation that drives machine vision tasks; extracting key points from the vectorized edge map as auxiliary information; applying lossless entropy coding to the compact representation and the auxiliary information separately to obtain two code streams; preliminarily decoding the two code streams to obtain the edge map and the auxiliary information; feeding the decoded edge map and auxiliary information into a neural network for forward computation; computing a loss function between the network output and the corresponding original picture, and back-propagating the loss to update the network weights until the network converges, which yields a dual-stream decoder; obtaining the edge map and auxiliary information of an image to be processed and encoding and compressing them into two code streams; and decoding the received streams with the dual-stream decoder to reconstruct the image.

Description

Translated from Chinese
Scalable human-machine collaborative image coding method and system, and decoder training method

Technical field

The invention belongs to the field of image coding and relates to a scalable human-machine collaborative image coding method and coding system. The invention can simultaneously improve image quality under both human vision and machine vision.

Background

In the use and distribution of digital images, lossy image compression is an indispensable key technology. Traditional lossy compression schemes transform the image into a compact representation and then quantize and entropy-code it, greatly reducing storage and transmission overhead and enabling the everyday use of digital images.

With the development of computer vision technology, more and more application scenarios must consider image quality under machine vision; that is, a lossily compressed image should still perform comparably to the lossless image on machine vision tasks. Traditional lossy compression schemes, however, are optimized only for human vision and cannot guarantee quality under machine vision. Conversely, if only the features required by machine vision tasks are compressed, without guaranteeing image reconstruction, the result cannot be viewed by the human eye.

To guarantee performance under both human vision and machine vision, the present invention proposes a scalable human-machine collaborative image coding system. On this basis, depending on the requirements, code streams of different levels can be transmitted and decoded to obtain a reconstructed image for machine vision only, or a reconstructed image for human vision.

Summary of the invention

Against this background, the present invention designs a scalable human-machine collaborative image coding method and coding system. Unlike the single code stream of traditional coding, the invention proposes a scalable coding framework that simultaneously generates two code streams, a vision-driven compact-representation stream and an auxiliary-information stream, so that decoding and reconstruction can be adapted to different task requirements. The decoder adopts a generative model and can decode streams of different levels: from the compact-representation stream alone it generates a reconstructed image for machine vision, and from the compact-representation stream together with the auxiliary-information stream it generates a reconstructed image for human vision. The overall framework is shown in Figure 1.

The technical scheme of the present invention is as follows:

A scalable human-machine collaborative image coding method, comprising the steps of:

1) extract the edge map of each sample picture;

2) vectorize the edge map with Bezier curves as a compact representation that drives machine vision tasks; then extract key points from the vectorized edge map as auxiliary information;

3) apply lossless entropy coding to the compact representation and the auxiliary information separately to obtain two code streams;

4) preliminarily decode the two code streams to obtain the edge map and the auxiliary information;

5) for the task of reconstructing images for human vision, feed the decoded edge map and auxiliary information into the generative neural network for forward computation; for the task of reconstructing images for machine vision, feed only the decoded edge map into the generative neural network for forward computation;

6) compute a loss function between the result of step 5) and the corresponding original picture, and back-propagate the loss to the network to update its weights;

7) repeat steps 2) to 6) until the network loss converges, yielding a dual-stream decoder for the human-vision reconstruction task or a compact-representation stream decoder for the machine-vision reconstruction task;

8) for an image I to be processed, obtain its edge map and auxiliary information, and encode and compress them into two code streams, denoted B_E and B_C respectively;

9) select the dual-stream decoder or the compact-representation stream decoder according to the task requirements, decode the received streams, and reconstruct the image.

Further, the key points are extracted as follows: if a vectorized line of the edge map is a straight segment, the key points are extracted in line mode; otherwise they are extracted in Bezier-curve mode.

Further, in line mode the key points are extracted as follows: if the angle between the segment and the horizontal is greater than a set angle, two color values are sampled at equal distances to the left and right of the segment midpoint along the horizontal; if it is less than or equal to the set angle, two color values are sampled at equal distances above and below the midpoint along the vertical. In Bezier-curve mode, the point where a line parallel to the chord joining the curve's start and end points is tangent to the curve is recorded; if the angle between the tangent and the horizontal is greater than the set angle, one color value inside the curve is sampled at the tangent point along the horizontal; if it is less than or equal to the set angle, one color value inside the curve is sampled at the tangent point along the vertical.

Further, the set angle is 45°.
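
The line-mode sampling rule above can be sketched as follows. This is an illustrative reading of the rule only: the 2-pixel sampling offset and the tie-breaking at exactly 45° are assumptions not fixed by the text.

```python
import math

def line_sample_points(p0, p1, offset=2, threshold_deg=45.0):
    """Return the two pixel positions whose colors are recorded for a
    straight edge segment. Steep segments (angle with the horizontal
    above the threshold) are sampled left/right of the midpoint along
    the horizontal; shallow ones above/below along the vertical."""
    mx = (p0[0] + p1[0]) / 2.0
    my = (p0[1] + p1[1]) / 2.0
    # angle between the segment and the horizontal, in [0, 90] degrees
    angle = math.degrees(math.atan2(abs(p1[1] - p0[1]), abs(p1[0] - p0[0])))
    if angle > threshold_deg:
        return (mx - offset, my), (mx + offset, my)
    return (mx, my - offset), (mx, my + offset)
```

For example, a vertical segment from (0, 0) to (0, 10) is steeper than 45°, so the two samples are taken to the left and right of its midpoint (0, 5).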

Further, for machine vision tasks, in step 8) the stream B_E corresponding to the edge map is sent to the compact-representation stream decoder, and in step 9) that decoder decodes B_E into the vectorized edge map E and performs a forward pass to obtain the decoded image Î_E. For human vision tasks, in step 8) the streams B_E and B_C are sent to the dual-stream decoder, and in step 9) that decoder decodes B_E and B_C into E and C and performs a forward pass to obtain the decoded image Î.

A method for training a dual-stream decoder, comprising the steps of:

1) extract the edge map of each sample picture;

2) vectorize the edge map with Bezier curves as a compact representation that drives machine vision tasks; then extract key points from the vectorized edge map as auxiliary information;

3) apply lossless entropy coding to the compact representation and the auxiliary information separately to obtain two code streams;

4) preliminarily decode the two code streams to obtain the edge map and the auxiliary information;

5) feed the decoded edge map and auxiliary information into the generative neural network for forward computation;

6) compute a loss function between the result of step 5) and the corresponding original picture, and back-propagate the loss to the network to update its weights;

7) repeat steps 2) to 6) until the network loss converges, yielding a dual-stream decoder for the human-vision image reconstruction task.

A method for training a compact-representation stream decoder, comprising the steps of:

1) extract the edge map of each sample picture and vectorize it as a compact representation that drives machine vision tasks;

2) apply lossless entropy coding to the compact representation to obtain one code stream;

3) preliminarily decode the code stream to obtain the edge map;

4) feed the decoded edge map into the generative neural network for forward computation;

5) compute a loss function between the result of step 4) and the corresponding original picture, and back-propagate the loss to the network to update its weights;

6) repeat steps 2) to 5) until the network loss converges, yielding a compact-representation stream decoder for the machine-vision image reconstruction task.

A scalable human-machine collaborative image coding system, characterized by comprising an encoder, a dual-stream decoder, and a compact-representation stream decoder, wherein:

the encoder extracts the edge map of a picture, vectorizes it with Bezier curves as a compact representation that drives machine vision tasks, extracts key points from the vectorized edge map as auxiliary information, and applies lossless entropy coding to the compact representation and the auxiliary information separately to obtain two code streams;

the dual-stream decoder decodes the two code streams into the edge map and the auxiliary information, then performs a forward pass on them to obtain the decoded image for the human-vision reconstruction task;

the compact-representation stream decoder decodes the stream corresponding to the edge map, then performs a forward pass on the decoded edge map to obtain the decoded image for the machine-vision reconstruction task.

The present invention extracts the image edge map, vectorizes it with Bezier curves as the compact representation that drives machine vision tasks, derives key-point coordinates from the positions and parameters of the lines and curves in the vectorized edge map, samples those key points from the original image, and encodes both into the corresponding two code streams, as shown in Figure 2.

The main steps of the method are described next.

Step 1: collect a batch of pictures and extract their edge maps; the collected pictures are kept as the target outputs of the network.

Step 2: vectorize the edge map with Bezier curves, and sample key points from the vectorized edge map as auxiliary information. (After vectorization the edge map is represented as straight segments and curves; from their positions and parameters the key-point coordinates are computed, and these coordinates are used to sample the originally captured image.) Key-point extraction has two modes, line mode and Bezier-curve mode: a straight segment uses line mode, otherwise Bezier-curve mode is used. In line mode, if the angle between the segment and the horizontal exceeds 45°, two color values are sampled at equal distances to the left and right of the segment midpoint along the horizontal; if it is at most 45°, two color values are sampled at equal distances above and below the midpoint along the vertical. In Bezier-curve mode, an edge segment is described by a Bezier curve, as shown in Figure 2(c) of the specification: the curve starts at Ps and ends at Pt; the chord PsPt is drawn, and the point where a line parallel to PsPt is tangent to the curve is taken. If the angle between the tangent and the horizontal exceeds 45°, one color value inside the curve is sampled at the tangent point along the horizontal; if it is at most 45°, one color value inside the curve is sampled at the tangent point along the vertical.
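
The tangent point used in Bezier-curve mode has a closed form if the curves are quadratic (the text does not fix the curve degree; quadratic is an assumption here): the tangent of a non-degenerate quadratic Bezier is parallel to the chord PsPt exactly at parameter t = 0.5, since the cross product of B'(t) = 2((1-t)(P1-P0) + t(P2-P1)) with the chord P2-P0 vanishes at t = 1/2.

```python
def bezier_tangent_point(p0, p1, p2):
    """Point on a quadratic Bezier curve (start p0, control p1, end p2)
    where the tangent is parallel to the chord p0-p2. For a
    non-degenerate quadratic this always occurs at t = 0.5."""
    t = 0.5
    x = (1 - t) ** 2 * p0[0] + 2 * (1 - t) * t * p1[0] + t ** 2 * p2[0]
    y = (1 - t) ** 2 * p0[1] + 2 * (1 - t) * t * p1[1] + t ** 2 * p2[1]
    return x, y
```

For higher-degree curves the parallel-tangent parameter would instead be found by solving for the root of the cross product numerically.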

Step 3: apply lossless entropy coding to the vectorized edge map (the compact representation) and the key-point auxiliary information to obtain two code streams.
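
A minimal sketch of producing the two lossless streams. The patent does not name a specific entropy coder or serialization; zlib and JSON below are illustrative stand-ins only.

```python
import json
import zlib

def encode_two_streams(edges, keypoints):
    """Losslessly compress the compact representation (vectorized edges)
    and the auxiliary information (key-point samples) into two separate
    byte streams B_E and B_C."""
    b_e = zlib.compress(json.dumps(edges).encode("utf-8"))
    b_c = zlib.compress(json.dumps(keypoints).encode("utf-8"))
    return b_e, b_c

def decode_two_streams(b_e, b_c):
    """Invert encode_two_streams: recover the edge map and key points."""
    edges = json.loads(zlib.decompress(b_e).decode("utf-8"))
    keypoints = json.loads(zlib.decompress(b_c).decode("utf-8"))
    return edges, keypoints
```

Keeping the two streams separate is what makes the scheme scalable: a receiver that only needs machine vision can request B_E alone.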

Step 4: preliminarily decode the two code streams to obtain the edge map and the key-point auxiliary information.

Step 5: for the dual-stream decoder, feed the edge map and the corresponding key-point auxiliary information into the corresponding generative neural network (for example a Pixel2Pixel network) for forward computation; for the decoder of the vision-driven compact-representation stream, feed only the edge map into its generative network for forward computation.

Step 6: compute the loss function between the result of step 5 and the original image.

Step 7: back-propagate the computed loss through the layers of the two neural networks to update their weights, so that the next iteration's result is closer to the target.

Step 8: repeat steps 2 to 7 until the losses of both neural networks converge. This yields the decoder network for the dual streams and the decoder network for the vision-driven compact-representation stream.

Compared with the prior art, the positive effects of the present invention are:

The present invention is a scalable lossy image compression scheme that guarantees both human visual quality and performance on machine vision tasks. Unlike traditional lossy compression methods, which output a single code stream, the scheme generates two: a vision-driven compact-representation stream and an auxiliary-information stream. Specifically, the invention represents the image's edge information with Bezier curves as the base stream, extracts key points from the image as the supplementary stream, and characterizes the image with these two streams to compress it efficiently. In addition, the invention builds the decoder from a generative neural network model: given the base stream alone it generates an image for machine vision, and given the base and supplementary streams together it generates an image for human vision, achieving excellent reconstruction quality in both cases.

The following data show the performance improvement of this method over the existing JPEG image compression method. The test measures, at a very low bit rate, the accuracy (error rate) of each method on a face landmark detection task, as well as subjective visual quality scored by human subjects:

[Table: accuracy on face landmark detection and subjective quality scores for the proposed method versus JPEG at very low bit rates]

As can be seen, the present invention achieves better performance at a lower bit rate.

Brief description of the drawings

Figure 1 shows the framework of the scalable human-machine collaborative image encoder.

Figure 2 shows the method for extracting key-point auxiliary information from the vectorized image edge map:

(a) vectorized edge map, (b) straight line (>45°), (c) straight line (<45°), (d) Bezier curve.

Detailed description

To further illustrate the technical method, the scalable human-machine collaborative image encoder of the present invention is described in more detail below with reference to the drawings and a concrete example.

This example focuses on the encoder's encoding procedure and the training of the decoder generation networks. Suppose the required decoder generation networks have already been built, and that N training images {I1, I2, …, IN} are available as training data.

1. Training process

Step 1: denote the vectorized edge map of each image in {I1, I2, …, IN} as {E1, E2, …, EN}, and the corresponding key-point auxiliary information as {C1, C2, …, CN}.

Step 2: as shown in Figure 1, feed {E1, E2, …, EN} and {C1, C2, …, CN} into the generation network for a forward pass. For the decoder generation network targeting machine vision tasks, the input is only {E1, E2, …, EN}.

Step 3: the forward pass produces the outputs {Î1, Î2, …, ÎN}; compute the loss error between these outputs and {I1, I2, …, IN}.

Step 4: after the error value is obtained, back-propagate it through the network to update the model weights.

Step 5: repeat steps 1 to 4 until the neural network converges.
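
The forward pass / loss / backward update cycle of steps 1 to 4 can be illustrated with a deliberately tiny stand-in. The real decoder is a generative network; here a single scalar weight plays its role, and the data and hyperparameters are purely illustrative.

```python
def train_decoder(samples, epochs=200, lr=0.01):
    """Toy stand-in for the training loop: forward pass, L2 loss against
    the original, gradient computation, weight update, repeated until
    (effectively) convergence."""
    w = 0.0
    for _ in range(epochs):
        grad = 0.0
        for edge, pixel in samples:
            pred = w * edge                    # forward pass
            grad += 2 * (pred - pixel) * edge  # gradient of the L2 loss
        w -= lr * grad / len(samples)          # backward update
    return w
```

On data generated by pixel = 2 * edge, the loop recovers the weight 2, mirroring how the real network's weights converge toward reproducing the original images.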

2. Encoding and decoding process

Step 1: extract the edge map of image I, and store the image obtained by vectorizing that edge map with Bezier curves, denoted E.

Step 2: extract key-point auxiliary information from the vectorized edge map by traversing all of its segments and sampling key points according to each segment's mode; denote the extracted auxiliary information C.

Step 3: encode E in the Scalable Vector Graphics (SVG) format, then entropy-code it together with C, obtaining two code streams denoted B_E and B_C respectively.
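A sketch of the SVG serialization in this step, assuming each vectorized edge is a quadratic Bezier mapped to an SVG 'Q' path command; the actual on-disk layout used by the patent is not specified.

```python
def edges_to_svg(width, height, curves):
    """Serialize vectorized edges as an SVG document. Each curve is a
    ((x0, y0), (cx, cy), (x1, y1)) quadratic Bezier, emitted as an SVG
    'Q' (quadratic Bezier) path command."""
    paths = []
    for (x0, y0), (cx, cy), (x1, y1) in curves:
        paths.append(f'<path d="M {x0} {y0} Q {cx} {cy} {x1} {y1}"/>')
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{width}" height="{height}">' + "".join(paths) + "</svg>")
```

A straight segment can reuse the same scheme with an 'L' command, or with its control point placed on the chord.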

Step 4: select a decoder according to the required stream level. For machine vision tasks the decoder only needs to decode B_E into the vectorized edge map E, which is fed into the corresponding network for a forward pass to obtain the decoded image Î_E. For human vision tasks, B_E and B_C are decoded into E and C, which are fed into the corresponding generation network for a forward pass to obtain the decoded image Î.
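
The level selection in this step amounts to a simple dispatch on which streams are present. The sketch below stubs out the entropy decoding and the two generation networks with placeholder functions; all names here are hypothetical.

```python
def decode_stream(stream):
    # stand-in for the preliminary (entropy) decoding of a code stream
    return stream

def machine_vision_net(edges):
    # stand-in for the compact-representation stream decoder network
    return {"task": "machine", "edges": edges}

def human_vision_net(edges, cues):
    # stand-in for the dual-stream decoder network
    return {"task": "human", "edges": edges, "cues": cues}

def reconstruct(b_e, b_c=None):
    """Scalable decoding: the base stream alone yields the machine-vision
    reconstruction; base plus auxiliary stream yields the human-vision
    reconstruction."""
    edges = decode_stream(b_e)
    if b_c is None:
        return machine_vision_net(edges)
    return human_vision_net(edges, decode_stream(b_c))
```

A receiver thus upgrades from machine-vision to human-vision quality simply by fetching and passing the second stream, with no change to the base stream.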

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If such modifications and variations fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to include them.

Claims (8)

1.一种可扩展人机协同图像编码方法,其步骤包括:1. A scalable human-machine collaborative image coding method, the steps comprising:1)提取各样本图片的边缘图;1) Extract the edge map of each sample image;2)利用贝塞尔曲线对边缘图进行矢量化,作为驱动机器视觉任务的紧凑表示;然后在矢量化后的边缘图中进行关键点提取,将提取的关键点作为辅助信息;2) Use the Bezier curve to vectorize the edge map as a compact representation for driving machine vision tasks; then extract key points in the vectorized edge map, and use the extracted key points as auxiliary information;3)对所述紧凑表示和所述辅助信息分别进行熵编码无损压缩,获得两路码流;3) Entropy coding and lossless compression are respectively performed on the compact representation and the auxiliary information to obtain two code streams;4)对两路码流进行初步解码,获得边缘图以及辅助信息;4) Preliminarily decode the two code streams to obtain edge maps and auxiliary information;5)对于需生成针对人眼视觉的重建图像任务,则将解码得到的边缘图以及辅助信息输入生成神经网络中,进行网络的前向计算;对于需生成针对机器视觉的重建图像任务,则将解码得到的边缘图输入生成神经网络中,进行网络的前向计算;5) For the task of generating reconstructed images for human vision, input the decoded edge map and auxiliary information into the generating neural network, and perform forward calculation of the network; for the task of generating reconstructed images for machine vision, then The decoded edge map is input into the generating neural network, and the forward calculation of the network is performed;6)根据步骤5)得到的计算结果与对应原始图片进行损失函数计算,并将计算的损失反向传播到神经网络进行网络权值更新;6) Calculate the loss function according to the calculation result obtained in step 5) and the corresponding original picture, and back-propagate the calculated loss to the neural network to update the network weights;7)重复步骤2)~6)直到神经网络的损失收敛,得到针对人眼视觉重建图像任务的双路码流解码器或针对机器视觉重建图像任务的紧凑表示码流解码器;7) Repeat steps 2) to 6) until the loss of the neural network converges, and obtain a dual-channel code stream decoder for the task of reconstructing images with human vision or a compact representation code stream decoder for the task of reconstructing images with machine 
vision;

8) For an image I to be processed, extract the edge map of I and store the edge map vectorized with Bezier curves, denoted E; extract key-point auxiliary information from the vectorized edge map, denoted C; encode E in the Scalable Vector Graphics format, then entropy-code the result together with C to obtain two code streams, denoted BE and BC respectively;

9) According to the task requirements, select the dual-channel code stream decoder or the compact representation code stream decoder to decode the received code streams and reconstruct the image.

2. The method according to claim 1, wherein the key points are extracted as follows: if a vectorized line of the edge map is a straight line segment, the straight-line mode is used to extract key points; otherwise, the Bezier-curve mode is used.

3. The method according to claim 2, wherein in the straight-line mode, if the angle between the line segment and the horizontal is greater than a set angle, two color values are sampled at equal distances to the left and right on the horizontal through the midpoint of the segment; if the angle is less than or equal to the set angle, two color values are sampled at equal distances above and below on the vertical through the midpoint; and in the Bezier-curve mode, the point at which a line parallel to the chord joining the start and end points of the Bezier curve is tangent to the curve is recorded; if the angle between the tangent and the horizontal is greater than the set angle, one color value inside the curve is sampled on the horizontal through the tangent point; if the angle is less than or equal to the set angle, one color value inside the curve is sampled on the vertical through the tangent point.

4. The method according to claim 3, wherein the set angle is 45°.

5. The method according to claim 1, wherein for a machine vision task, in step 8) the code stream BE corresponding to the edge map is sent to the compact representation code stream decoder, and in step 9) the compact representation code stream decoder decodes BE to obtain the vectorized edge map E and performs a forward pass on it to obtain the decoded image
[formula image FDA0003360606150000021];
for a human-vision task, in step 8) the code streams BE and BC are sent to the dual-channel code stream decoder, and in step 9) the dual-channel code stream decoder decodes BE and BC to obtain E and C and performs a forward pass on them to obtain the decoded image
[formula image FDA0003360606150000022].
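Claims 2 to 4 above define a purely geometric rule for where colour key points are sampled. The following is a minimal sketch of that rule, assuming points are (x, y) tuples, colours are looked up as `image[y][x]`, Bezier segments are quadratic (for which the tangent parallel to the start-end chord sits at t = 0.5), and a small sampling offset; the patent text fixes none of these representation details.

```python
import math

SET_ANGLE = 45.0  # claim 4: the set angle is 45 degrees


def segment_angle(p0, p1):
    """Angle (degrees, 0..90) between the segment p0->p1 and the horizontal."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    a = abs(math.degrees(math.atan2(dy, dx)))  # 0..180
    return min(a, 180.0 - a)                   # fold to the acute angle


def sample_line(image, p0, p1, offset=2):
    """Straight-line mode (claim 3): two colour values around the midpoint."""
    mx = (p0[0] + p1[0]) // 2
    my = (p0[1] + p1[1]) // 2
    if segment_angle(p0, p1) > SET_ANGLE:
        # steep segment: sample left and right of the midpoint on the horizontal
        return [image[my][mx - offset], image[my][mx + offset]]
    # shallow segment: sample above and below the midpoint on the vertical
    return [image[my - offset][mx], image[my + offset][mx]]


def sample_bezier(image, p0, ctrl, p1, offset=2):
    """Bezier-curve mode (claim 3), sketched for a quadratic curve: the
    tangent parallel to the chord p0->p1 touches the curve at t = 0.5."""
    tx = int(0.25 * p0[0] + 0.5 * ctrl[0] + 0.25 * p1[0])
    ty = int(0.25 * p0[1] + 0.5 * ctrl[1] + 0.25 * p1[1])
    # the curve bends toward the control point; "inside" lies on that side
    sx = 1 if ctrl[0] > tx else -1
    sy = 1 if ctrl[1] > ty else -1
    if segment_angle(p0, p1) > SET_ANGLE:
        return [image[ty][tx + sx * offset]]   # one value on the horizontal
    return [image[ty + sy * offset][tx]]       # one value on the vertical
```

With the 45° threshold of claim 4, a near-vertical stroke is sampled to its left and right and a near-horizontal stroke above and below, so the sampled colours always straddle the edge rather than run along it.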
6. A method for training and generating a dual-channel code stream decoder, comprising the steps of:

1) extracting the edge map of each sample picture;

2) vectorizing the edge map with Bezier curves to serve as the compact representation driving machine vision tasks, then extracting key points from the vectorized edge map as auxiliary information;

3) applying lossless entropy coding to the compact representation and the auxiliary information respectively to obtain two code streams;

4) preliminarily decoding the two code streams to obtain the edge map and the auxiliary information;

5) feeding the decoded edge map and auxiliary information into a generative neural network and performing a forward pass;

6) computing a loss function between the result of step 5) and the corresponding original picture, and back-propagating the loss to the network to update its weights;

7) repeating steps 2) to 6) until the loss of the neural network converges, yielding a dual-channel code stream decoder for the task of reconstructing images for human vision.

7.
A method for training and generating a compact representation code stream decoder, comprising the steps of:

1) extracting the edge map of each sample picture, and vectorizing the edge map to serve as the compact representation driving machine vision tasks;

2) applying lossless entropy coding to the compact representation to obtain one code stream;

3) preliminarily decoding the code stream to obtain the edge map;

4) feeding the decoded edge map into a generative neural network and performing a forward pass;

5) computing a loss function between the result of step 4) and the corresponding original picture, and back-propagating the loss to the network to update its weights;

6) repeating steps 2) to 5) until the loss of the neural network converges, yielding a compact representation code stream decoder for the task of reconstructing images for machine vision.

8.
A scalable human-machine collaborative image coding system, characterized in that it comprises an encoder, a dual-channel code stream decoder trained by the method of claim 6, and a compact representation code stream decoder trained by the method of claim 7, wherein:

the encoder extracts the edge map of a picture, vectorizes the edge map with Bezier curves as the compact representation driving machine vision tasks, extracts key points from the vectorized edge map as auxiliary information, and then applies lossless entropy coding to the compact representation and the auxiliary information respectively to obtain two code streams;

the dual-channel code stream decoder decodes the two code streams to obtain the edge map and the auxiliary information, then performs a forward pass on them to obtain a decoded image for the task of reconstructing images for human vision;

the compact representation code stream decoder decodes the code stream corresponding to the edge map to obtain the edge map, then performs a forward pass on the decoded edge map to obtain a decoded image for the task of reconstructing images for machine vision.
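The scalability in claims 1, 5, 8 and 9 comes down to one encoder emitting two independently decodable streams, with the receiver choosing how much of them to decode. The sketch below shows only that stream layering, using zlib as a stand-in for the patent's entropy coder; the function names and dictionary outputs are illustrative, not from the patent, and the learned generative decoders that turn the edge map into an image are omitted.

```python
import zlib


def encode(edge_svg: str, keypoints: str):
    """Encoder side (claim 1, steps 8-9): losslessly compress the vectorized
    edge map E and the key-point auxiliary information C into two streams
    BE and BC. zlib stands in for the patent's entropy coder."""
    b_e = zlib.compress(edge_svg.encode("utf-8"))
    b_c = zlib.compress(keypoints.encode("utf-8"))
    return b_e, b_c


def decode(b_e: bytes, b_c=None, task: str = "machine"):
    """Receiver side (claims 5, 9): a machine vision task decodes only the
    edge stream BE (compact representation path); a human-vision task
    additionally decodes the auxiliary stream BC (dual-channel path)."""
    edge_svg = zlib.decompress(b_e).decode("utf-8")
    if task == "machine" or b_c is None:
        return {"edge_map": edge_svg}
    keypoints = zlib.decompress(b_c).decode("utf-8")
    return {"edge_map": edge_svg, "aux": keypoints}
```

In the patented system, the decoded edge map (and, for human vision, the colour key points) would then be passed forward through the trained generative network to reconstruct the image; dropping BC degrades gracefully to the machine-vision reconstruction.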
Application CN201911415561.7A | Priority date 2019-12-31 | Filing date 2019-12-31 | Method and system for encoding extensible man-machine cooperative image and method for training decoder | Active | Granted as CN113132755B (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201911415561.7A | CN113132755B (en) | 2019-12-31 | 2019-12-31 | Method and system for encoding extensible man-machine cooperative image and method for training decoder

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201911415561.7A | CN113132755B (en) | 2019-12-31 | 2019-12-31 | Method and system for encoding extensible man-machine cooperative image and method for training decoder

Publications (2)

Publication Number | Publication Date
CN113132755A (en) | 2021-07-16
CN113132755B (en) | 2022-04-01

Family

ID=76770772

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201911415561.7A | Active | CN113132755B (en) | 2019-12-31 | 2019-12-31 | Method and system for encoding extensible man-machine cooperative image and method for training decoder

Country Status (1)

Country | Link
CN (1) | CN113132755B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication Number | Priority Date | Publication Date | Assignee | Title
CN113949880B (en)* | 2021-09-02 | 2022-10-14 | Peking University | Extremely-low-bit-rate man-machine collaborative image coding training method and coding and decoding method
CN116366863A (en)* | 2023-03-07 | 2023-06-30 | Shanghai University | Image feature compression and decompression method with cooperative human eye vision and machine vision

Citations (7)

* Cited by examiner, † Cited by third party
Publication Number | Priority Date | Publication Date | Assignee | Title
CN106846253A (en)* | 2017-02-14 | 2017-06-13 | Shenzhen Weiteshi Technology Co., Ltd. | Image super-resolution reconstruction method based on a back-propagation neural network
CN107610140A (en)* | 2017-08-07 | 2018-01-19 | Institute of Automation, Chinese Academy of Sciences | Near-edge detection method and device based on deep-fusion correction networks
CN108364262A (en)* | 2018-01-11 | 2018-08-03 | Shenzhen University | Blurred image restoration method, apparatus, device, and storage medium
CN109255794A (en)* | 2018-09-05 | 2019-01-22 | South China University of Technology | Deep fully convolutional edge feature detection method for standard parts
CN109920049A (en)* | 2019-02-26 | 2019-06-21 | Tsinghua University | Method and system for fine 3D face reconstruction assisted by edge information
WO2019141258A1 (en)* | 2018-01-18 | 2019-07-25 | Hangzhou Hikvision Digital Technology Co., Ltd. | Video encoding method, video decoding method, device, and system
US10482603B1 (en)* | 2019-06-25 | 2019-11-19 | Artificial Intelligence, Ltd. | Medical image segmentation using an integrated edge guidance module and object segmentation network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication Number | Priority Date | Publication Date | Assignee | Title
US7861207B2 (en)* | 2004-02-25 | 2010-12-28 | Mentor Graphics Corporation | Fragmentation point and simulation site adjustment for resolution enhancement techniques


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Real-Time Deep Image Super-Resolution via Global Context Aggregation and Local Queue Jumping; Yueyu Hu, Jiaying Liu et al.; IEEE; 2017-12-13; full text *
Towards Image Understanding from Deep Compression Without Decoding; Robert Torfason et al.; ICLR 2018; 2018-03-16; full text *
Neural-network-based image and video coding; Jia Chuanmin et al.; Telecommunications Science; 2019-05-20 (No. 05); full text *
Image super-resolution reconstruction with an edge-enhanced deep network; Xie Zhenzhu et al.; Journal of Image and Graphics; 2018-01-16 (No. 01); full text *

Also Published As

Publication Number | Publication Date
CN113132755A (en) | 2021-07-16

Similar Documents

Publication | Title
CN110290387B (en) | Generative-model-based image compression method
CN103607591A (en) | Image compression method combining super-resolution reconstruction
CN115880762B (en) | Human-machine hybrid vision-oriented scalable face image coding method and system
CN113132727A (en) | Scalable machine vision coding method based on image generation
CN112492313B (en) | Picture transmission system based on a generative adversarial network
CN105392009B (en) | Low-bit-rate image sequence coding method based on block-adaptive sampling and super-resolution reconstruction
CN103686177B (en) | Image compression and decompression method, apparatus, and image system
CN113132755B (en) | Method and system for encoding extensible man-machine cooperative image and method for training decoder
CN114422795B (en) | Face video encoding method, decoding method and device
CN114501034B (en) | Image compression method and medium based on a discrete Gaussian mixture hyperprior and mask
CN118101972A (en) | End-to-end instance-separable semantic-image joint compression coding and decoding system and method
CN115052147A (en) | Human body video compression method and system based on a generative model
CN116980611A (en) | Image compression method, apparatus, device, computer program product, and medium
CN117692662A (en) | Point cloud encoding and decoding method based on a dual-octree structure
CN112907494A (en) | Unpaired face image translation method based on self-supervised learning
CN118864690A (en) | High-fidelity 3D image reconstruction method based on brain EEG signals
Yin et al. | Enabling translatability of generative face video coding: A unified face feature transcoding framework
CN113660386A (en) | Color image encryption, compression, and super-resolution reconstruction system and method
WO2023143101A1 (en) | Facial video encoding method and apparatus, and facial video decoding method and apparatus
CN119152051A (en) | Three-dimensional medical image compression system for human and machine vision
CN114897189A (en) | Model training method, video coding method and decoding method
CN110958417A (en) | Method for removing compression noise from video-call video based on voice cues
CN106251373A (en) | Single color image compression coding method fusing super-resolution technology and the JPEG2000 standard
CN119484858A (en) | Neural-representation video coding method based on high-frequency feature enhancement
CN112866715B (en) | Universal video compression coding system supporting human-machine hybrid intelligence

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
