An acceleration system and method for multi-channel video encoding and decoding
Technical Field
The present invention relates to the technical field of server video encoding and decoding, and specifically provides an acceleration system and method for multi-channel video encoding and decoding.
Background Art
In recent years, the continuous development of network technology has made our lives increasingly rich and colorful. With the network as a convenient carrier, multimedia technology has also advanced rapidly. As the core and key of multimedia technology, video encoding and decoding has made major progress in both technology and application in recent years. The main function of video encoding is to compress video pixel data, such as RGB or YUV, into a video bitstream, thereby reducing the data volume of the video.
Demand for video encoding and decoding applications keeps growing. Current customers generally rely on a CPU with video codec functions, such as the Intel E3, paired with an NVIDIA P4 card. The codec integrated in the current E3 can process at most 12 channels of 1080P video. The P4 card has only 2 chips dedicated to video encoding and decoding; if the card is used solely for encoding and decoding, its other computing capabilities are wasted, and the P4 card is also relatively expensive. As many customers now handle short video, security surveillance, live streaming and other services, higher requirements are placed on the number of codec channels and on cost.
Summary of the Invention
In view of the above disadvantages, embodiments of the present invention propose an acceleration system and method for multi-channel video encoding and decoding, which reduce cost while increasing the speed of multi-channel video encoding and decoding.
An embodiment of the present invention proposes an acceleration system for multi-channel video encoding and decoding, including a Camera, a Server and an accelerator card, the accelerator card including a BMC control module, a chip processing module and a power module;
The BMC control module includes a BMC, a watchdog, a fan, a temperature sensor and an EEPROM; the BMC is connected to the watchdog through UART; the BMC is connected to the fan, the temperature sensor and the EEPROM respectively through I2C;
The chip processing module includes 2, 3 or 4 FPGA chips; the FPGA chips are each connected to two DDR4 memory modules, a JTAG circuit, an SPI Flash and an EMMC; the FPGA chips are interconnected using Chiplink, and the FPGA chips are also connected to a Switch through UART;
The power module includes a power regulator; the power regulator is connected to an external +12V power supply and is also connected to the BMC control module and the chip processing module;
The power module supplies power to the BMC control module and the chip processing module; the BMC control module is used for out-of-band monitoring and management; the chip processing module is used for multi-channel video encoding and decoding.
Further, the accelerator card also includes a programmable clock chip, an indicator light, a reset button, a test point button and an external interface;
The programmable clock chip is used to maintain a synchronized clock and to display and record the time;
The indicator light is used to indicate whether the accelerator card is faulty or in place;
The reset button is used to restart the accelerator card without powering it off when the accelerator card fails;
The test point button is used for pin input and output when the FPGA chips are tested;
The external interface includes USB and HUB.
Further, the FPGA chip is a Xilinx ZU7EV chip.
Further, the capacity of each DDR4 memory module is 4 GB.
Further, the FPGA chips are interconnected using Chiplink; the port used for the Chiplink interconnection between the FPGA chips is Serdes x4, and the trace rate of the FPGA chip is 10 Gbps.
An acceleration method for multi-channel video encoding and decoding is implemented on the basis of the acceleration system for multi-channel video encoding and decoding, the method including the following steps:
S1: the Server sends the H.264/H.265 encoded data transmitted by the Camera network cameras to the chip processing module in the accelerator card;
S2: the chip processing module in the accelerator card distributes the H.264/H.265 encoded data transmitted by the Camera network cameras evenly among the FPGA chips by bitstream; each FPGA chip first decodes the bitstream data it receives and then performs CNN inference acceleration and retrieval acceleration;
S3: the accelerator card sends the bitstream data processed by each FPGA chip in the chip processing module to the Server memory.
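The even distribution by bitstream in step S2 can be illustrated with a minimal Python sketch. The description does not state which allocation policy the Switch applies; round-robin assignment is an assumption made here for illustration, and the function name `distribute_streams` is hypothetical.

```python
from collections import defaultdict


def distribute_streams(stream_ids, num_fpgas):
    """Assign streams to FPGA chips round-robin (assumed policy).

    With round-robin assignment the per-chip stream counts differ
    by at most one, which is one way to distribute streams evenly.
    """
    if num_fpgas not in (2, 3, 4):
        raise ValueError("the accelerator card carries 2, 3 or 4 FPGA chips")
    assignment = defaultdict(list)
    for index, stream_id in enumerate(stream_ids):
        assignment[index % num_fpgas].append(stream_id)
    return dict(assignment)


# 32 camera streams spread over 4 FPGA chips: 8 streams per chip.
streams = [f"Camera{i}" for i in range(32)]
plan = distribute_streams(streams, 4)
```

Any other policy that keeps the per-chip counts balanced, such as assigning contiguous blocks of streams, would satisfy the same step.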
Further, step S1 includes:
The NIC in the Server writes the H.264/H.265 encoded data transmitted by the Camera network cameras into a first memory space of the App process in the Server memory;
The accelerator card driver is called and requests, in the accelerator card memory, the second memory space required by the accelerator card driver, and the H.264/H.265 encoded data transmitted by the Camera network cameras is copied or mapped to the second memory space.
Further, step S2 includes:
The Server writes the PCIE accelerator card register in the MMCFG space; the FPGA reads the H.264/H.265 encoded data transmitted by the Camera network cameras using DMA and moves the data from the second memory space to the chip processing module, where the Switch distributes it evenly among the FPGA chips by bitstream;
Each FPGA chip performs H.264/H.265 decoding on the bitstream data distributed to it, and then performs CNN inference acceleration and retrieval acceleration on the decoded data.
Further, step S3 includes:
After the FPGA chip completes the CNN inference acceleration and retrieval acceleration, it issues an MSI interrupt to the Server; the Server writes the PCIE accelerator card register in the MMCFG space, and the FPGA uses a DMA write operation to copy the data resulting from the CNN inference acceleration and retrieval acceleration from the FPGA to the second memory space;
The Server copies or maps the data resulting from the CNN inference acceleration and retrieval acceleration from the second memory space to the first memory space of the App process;
The accelerator card driver call returns.
The effects described in this summary are only the effects of the embodiments, not all of the effects of the invention. One of the above technical solutions has the following advantages or beneficial effects:
An embodiment of the present invention proposes an acceleration system for multi-channel video encoding and decoding, the system including a Camera, a Server and an accelerator card. The accelerator card includes a BMC control module, a chip processing module and a power module; the BMC control module is used for out-of-band monitoring and management, the chip processing module is used for multi-channel video encoding and decoding, and the power module supplies power to the BMC control module and the chip processing module. The BMC control module includes a BMC, a watchdog, a fan, a temperature sensor and an EEPROM; the BMC is connected to the watchdog through UART, and is also connected to the fan, the temperature sensor and the EEPROM respectively through I2C. The chip processing module includes 2, 3 or 4 FPGA chips; the FPGA chips are each connected to two DDR4 memory modules, a JTAG circuit, an SPI Flash and an EMMC, the FPGA chips are interconnected using Chiplink, and the FPGA chips are connected to a Switch. The power module includes a power regulator, which is connected to an external +12V power supply and is also connected to the BMC control module and the chip processing module. Based on this acceleration system for multi-channel video encoding and decoding, an acceleration method for multi-channel video encoding and decoding is also proposed. The present invention is implemented with FPGA chips and does not require a CPU with video codec functions, such as the Intel E3, paired with an NVIDIA P4 card, so cost is greatly reduced. In addition, 2, 3 or 4 FPGA chips are used, so the speed of video encoding and decoding can be increased according to actual needs. The FPGA chip is a Xilinx ZU7EV chip, and each ZU7EV chip can encode and decode 8 channels of video: 2 FPGA chips can handle 16 channels, 3 FPGA chips can handle 24 channels, and 4 FPGA chips can handle 32 channels, which increases the speed of video encoding and decoding.
Brief Description of the Drawings
Fig. 1 is a schematic structural connection diagram of the accelerator card in the acceleration system for multi-channel video encoding and decoding according to Embodiment 1 of the present invention;
Fig. 2 is an interconnection topology diagram of the 4 FPGA chips of the accelerator card in the acceleration system for multi-channel video encoding and decoding according to Embodiment 1 of the present invention;
Fig. 3 is an overall data flow diagram of the acceleration method for multi-channel video encoding and decoding according to Embodiment 1 of the present invention.
Detailed Description of the Embodiments
In order to clearly explain the technical features of the present solution, the present invention is described in detail below through specific embodiments and with reference to the accompanying drawings. The following disclosure provides many different embodiments or examples for implementing different structures of the invention. To simplify the disclosure, the components and arrangements of specific examples are described below. In addition, the present invention may repeat reference numerals and/or letters in different examples. This repetition is for the purposes of simplicity and clarity and does not in itself indicate a relationship between the various embodiments and/or arrangements discussed. It should be noted that the components illustrated in the drawings are not necessarily drawn to scale. Descriptions of well-known components, processing techniques and processes are omitted so as not to limit the present invention unnecessarily.
Embodiment 1
Embodiment 1 of the present invention provides an acceleration system for multi-channel video encoding and decoding, the system including a Camera, a Server and an accelerator card.
The Camera is used to provide video data; the Server is used to receive the video data and to control and manage the codec data of the accelerator card.
Fig. 1 shows a schematic structural connection diagram of the accelerator card in the acceleration system for multi-channel video encoding and decoding according to Embodiment 1 of the present invention. The accelerator card includes a BMC control module, a chip processing module and a power module;
The BMC control module is used for out-of-band monitoring and management, the chip processing module is used for multi-channel video encoding and decoding, and the power module supplies power to the BMC control module and the chip processing module.
The BMC control module includes a BMC, a watchdog, a fan, a temperature sensor and an EEPROM; the BMC is connected to the watchdog through UART, and is also connected to the fan, the temperature sensor and the EEPROM respectively through I2C.
The chip processing module includes 2, 3 or 4 FPGA chips, and the FPGA chips are each connected to two DDR4 memory modules, a JTAG circuit, an SPI Flash and an EMMC. The two DDR4 memory modules each have a capacity of 4 GB, support ECC, and run at 2400 MHz.
The FPGA chips are interconnected using Chiplink. Fig. 2 shows the interconnection topology diagram of the 4 FPGA chips of the accelerator card in the acceleration system for multi-channel video encoding and decoding according to Embodiment 1 of the present invention; the port specification of each FPGA chip is Serdes (GTH) x4, and the trace rate is 10 Gbps.
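The port specification can be turned into a rough bandwidth figure with a small calculation. The Python sketch below assumes that the 10 Gbps trace rate applies to each of the four Serdes lanes, which the description does not state explicitly; the `encoding_efficiency` parameter and the function name are hypothetical, since the line coding used by Chiplink is not given.

```python
LANES_PER_PORT = 4      # Serdes (GTH) x4, from the description
LANE_RATE_GBPS = 10.0   # assumed to be the rate of each lane


def port_bandwidth_gbps(lanes=LANES_PER_PORT,
                        lane_rate_gbps=LANE_RATE_GBPS,
                        encoding_efficiency=1.0):
    """Aggregate one-direction bandwidth of a single interconnect port.

    encoding_efficiency is a hypothetical parameter; 1.0 gives the raw
    line rate, and a smaller value models line-coding overhead.
    """
    return lanes * lane_rate_gbps * encoding_efficiency


raw = port_bandwidth_gbps()  # 4 lanes x 10 Gbps = 40 Gbps raw
```

Under that assumption, each x4 port offers 40 Gbps of raw bandwidth per direction before any protocol overhead.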
The FPGA chips are connected to the Switch through UART. The accelerator card communicates with the Server through a PCIE gold finger; the Server provides an x8 signal, which the PCIe Switch converts into 4 x8 signals for the FPGA chips.
The FPGA chip is a Xilinx ZU7EV chip, and each ZU7EV chip can encode and decode 8 channels of video: 2 FPGA chips can handle 16 channels, 3 FPGA chips can handle 24 channels, and 4 FPGA chips can handle 32 channels. Embodiment 1 of the present invention uses 4 FPGA chips for illustration; the protection scope of the present invention is not limited to Embodiment 1.
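The channel counts above follow from a fixed capacity of 8 channels per ZU7EV chip. As a minimal sketch, the Python snippet below reproduces that arithmetic and also works out how many chips a given channel count would require; the helper names `card_capacity` and `chips_needed` are hypothetical and are not part of the described system.

```python
import math

CHANNELS_PER_CHIP = 8            # per ZU7EV chip, from the description
ALLOWED_CHIP_COUNTS = (2, 3, 4)  # chip counts the accelerator card supports


def card_capacity(num_chips):
    """Number of video channels a card with num_chips FPGA chips can handle."""
    if num_chips not in ALLOWED_CHIP_COUNTS:
        raise ValueError("the accelerator card carries 2, 3 or 4 FPGA chips")
    return num_chips * CHANNELS_PER_CHIP


def chips_needed(num_channels):
    """Smallest supported chip count that covers num_channels channels."""
    needed = max(min(ALLOWED_CHIP_COUNTS),
                 math.ceil(num_channels / CHANNELS_PER_CHIP))
    if needed > max(ALLOWED_CHIP_COUNTS):
        raise ValueError("more channels than a single card can handle")
    return needed


capacities = {n: card_capacity(n) for n in ALLOWED_CHIP_COUNTS}
```

For example, a deployment with 20 camera channels would need the 3-chip configuration, since 2 chips cover only 16 channels.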
The power module includes a power regulator, which is connected to an external +12V power supply and is also connected to the BMC control module and the chip processing module.
The accelerator card further includes a programmable clock chip, an indicator light, a reset button, a test point button and an external interface;
The programmable clock chip is used to maintain a synchronized clock and to display and record the time; the programmable clock chip is connected to the ZU7EV0 chip, the ZU7EV1 chip, the ZU7EV2 chip and the ZU7EV3 chip respectively.
The indicator light is used to indicate whether the accelerator card is faulty or in place;
The reset button is used to restart the accelerator card without powering it off when the accelerator card fails;
The test point button is used for pin input and output when the FPGA chips are tested;
The external interface includes USB and HUB.
The accelerator card further includes a high-definition digital display interface, DisplayPort, which is connected to the ZU7EV0 chip, the ZU7EV1 chip, the ZU7EV2 chip and the ZU7EV3 chip.
Based on the acceleration system for multi-channel video encoding and decoding proposed in Embodiment 1 of the present invention, an acceleration method for multi-channel video encoding and decoding is also proposed.
Before the acceleration method for multi-channel video encoding and decoding is executed, the 32 channels of camera video codec data in the Camera, namely Camera0, Camera1 ... Camera31, are first sent through the Switch to the NIC of the Server.
Then step S1 is executed: the Server sends the 32 channels of camera H.264/H.265 encoded data transmitted by the Camera network cameras to the chip processing module in the accelerator card;
S2: the chip processing module in the accelerator card distributes the 32 channels of camera H.264/H.265 encoded data transmitted by the Camera network cameras so that each FPGA chip receives 8 channels of bitstream data; each FPGA chip first decodes the 8 channels of bitstream data it receives and then performs CNN inference acceleration and retrieval acceleration;
S3: the accelerator card sends the bitstream data processed by each FPGA chip in the chip processing module to the Server memory.
Fig. 3 shows the overall data flow diagram of the acceleration method for multi-channel video encoding and decoding according to Embodiment 1 of the present invention.
Process 1: the NIC in the Server writes the 32 channels of camera H.264/H.265 encoded data transmitted by the Camera network cameras into the first memory space of the App process in the Server memory.
Process 2: the accelerator card driver is called and requests, in the accelerator card memory, the second memory space required by the accelerator card driver, and the 32 channels of camera H.264/H.265 encoded data transmitted by the Camera network cameras are copied or mapped to the second memory space.
Process 3: the Server writes the PCIE accelerator card register in the MMCFG space; the FPGA reads the 32 channels of camera H.264/H.265 encoded data transmitted by the Camera network cameras using DMA and moves the data from the second memory space to the chip processing module, where the Switch distributes 8 channels of bitstream data to each FPGA chip.
Process 4: each FPGA chip performs H.264/H.265 decoding on the 8 channels of bitstream data distributed to it.
Process 5: each FPGA chip performs CNN inference acceleration and retrieval acceleration on the decoded data.
Process 6: after the FPGA chip completes the CNN inference acceleration and retrieval acceleration, it issues an MSI interrupt to the Server; the Server writes the PCIE accelerator card register in the MMCFG space, and the FPGA uses a DMA write operation to copy the data resulting from the CNN inference acceleration and retrieval acceleration from the FPGA chip to the second memory space;
Process 7: the Server copies or maps the data resulting from the CNN inference acceleration and retrieval acceleration from the second memory space to the first memory space of the App process.
Process 8: the accelerator card driver call returns.
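The order of processes 2 to 8 can be traced with a small host-side simulation. The Python sketch below is only a stand-in: `FakeAcceleratorCard`, its methods and the `accelerate` function are hypothetical names, the buffers are ordinary lists rather than DMA memory, and decoding and CNN inference are replaced by a string placeholder. It also assumes the stream count divides evenly among the FPGA chips, as in the 32-channel, 4-chip case.

```python
class FakeAcceleratorCard:
    """Hypothetical stand-in for the accelerator card, not a real driver API."""

    def __init__(self, num_fpgas=4):
        self.num_fpgas = num_fpgas
        self.second_memory = None  # buffer requested by the driver (process 2)
        self.log = []              # order of simulated bus events

    def dma_read_and_distribute(self):
        # Process 3: DMA read from the second memory space, then the
        # Switch hands an equal share of streams to each FPGA chip.
        streams = self.second_memory
        if len(streams) % self.num_fpgas:
            raise ValueError("stream count must divide evenly in this sketch")
        per_chip = len(streams) // self.num_fpgas
        self.log.append("dma_read")
        return [streams[i * per_chip:(i + 1) * per_chip]
                for i in range(self.num_fpgas)]

    def decode_and_infer(self, chunk):
        # Processes 4 and 5: placeholder for decoding plus CNN inference.
        return [f"result({stream})" for stream in chunk]

    def dma_write(self, results):
        # Process 6: MSI interrupt, then DMA write back to the second space.
        self.log.append("msi_interrupt")
        self.second_memory = results
        self.log.append("dma_write")


def accelerate(first_memory, card):
    card.second_memory = list(first_memory)   # process 2: copy to second space
    chunks = card.dma_read_and_distribute()   # process 3
    results = [r for chunk in chunks
               for r in card.decode_and_infer(chunk)]  # processes 4 and 5
    card.dma_write(results)                   # process 6
    first_memory[:] = card.second_memory      # process 7: copy back to App
    return first_memory                       # process 8: driver call returns


card = FakeAcceleratorCard()
app_buffer = [f"Camera{i}" for i in range(32)]  # process 1 already done by NIC
output = accelerate(app_buffer, card)
```

The sketch copies data between the two memory spaces; the description also allows mapping instead of copying, which would avoid the extra copies in processes 2 and 7.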
Although the invention has been described in detail in the specification with reference to the drawings and embodiments, those skilled in the art should understand that the invention may still be modified or equivalently replaced; all technical solutions and improvements thereof that do not depart from the spirit and scope of the invention shall be covered by the protection scope of this patent.