Video processing method, device, and storage medium

Technical field

This application relates to multimedia technology, and in particular to a video processing method, device, and storage medium.
Background art
With the continuous development of communication and the mobile Internet, the era of text and pictures has passed; network live streaming and short-video services have grown rapidly. The emergence of various video applications has significantly lowered the threshold for producing video, and more and more users are participating in video creation.
However, video production schemes in the related art can only synthesize static objects into a template video and cannot synthesize dynamic video.
Summary of the invention
The embodiments of the present invention provide a video processing method, device, and storage medium capable of synthesizing dynamic video.
The technical solutions of the embodiments of the present invention are implemented as follows:
The embodiment of the present invention provides a video processing method, including:
obtaining a target video;
in response to a splitting operation for a target object in the target video, obtaining from the target video a foreground video with the target object as foreground, the foreground video including at least one foreground video frame;
obtaining a background video, the background video including at least one background video frame;
in response to a synthesis operation for the foreground video and the background video, superimposing the foreground video frames in the foreground video with the background video frames in the background video, and
encapsulating the superimposed video frames as a synthesized video.
The embodiment of the present invention provides a video processing device, including:
a first acquisition unit, configured to obtain a target video;
a splitting unit, configured to, in response to a splitting operation for a target object in the target video, obtain from the target video a foreground video with the target object as foreground, the foreground video including at least one foreground video frame;
a second acquisition unit, configured to obtain a background video, the background video including at least one background video frame;
a synthesis unit, configured to, in response to a synthesis operation for the foreground video and the background video, superimpose the foreground video frames in the foreground video with the background video frames in the background video, and
encapsulate the superimposed video frames as a synthesized video.
In the above scheme, the splitting unit is further configured to:
receive a batch splitting operation for at least two target videos;
in response to the batch splitting operation, obtain from each target video a video clip with the target object as foreground, and determine it as the corresponding foreground video.
In the above scheme, the synthesis unit is further configured to:
receive a batch synthesis operation for at least two foreground videos and the background video;
in response to the batch synthesis operation, superimpose the foreground video frames in the at least two foreground videos respectively with the background video frames in the background video.
In the above scheme, the second acquisition unit is further configured to:
load and display a video selection window containing candidate background videos;
receive a video selection operation for the video selection window;
obtain the background video selected by the video selection operation.
In the above scheme, the device further includes a preview unit, configured to:
in response to a preview operation for the foreground video and the background video, present the superposition effect of the foreground video frames and the background video frames.
In the above scheme, the splitting unit is further configured to:
identify the target area where the target object is located in the video frames of the target video, and make the areas other than the target area in the video frames transparent;
encapsulate the transparency-processed video frames as the foreground video.
In the above scheme, the splitting unit is further configured to:
identify the target area where the target object is located in the video frames of the target video, and obtain, according to the target area, image matrices corresponding to the video frames of the target video, the elements in each image matrix respectively characterizing the probability that the corresponding pixels of the video frame belong to the target area;
perform mask processing on each image matrix and the corresponding video frame, so as to make the areas other than the target area in the video frame transparent.
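The mask processing described above can be sketched as follows. This is an illustrative Python sketch, not part of the claimed implementation: the image matrix of per-pixel probabilities is thresholded into an alpha channel so that pixels outside the target area become transparent. The threshold value of 0.5 is an assumption for illustration.

```python
def apply_mask(frame, prob_matrix, threshold=0.5):
    """Make pixels outside the target area transparent.

    frame:       H x W list of (R, G, B) tuples
    prob_matrix: H x W list of floats, the probability that each
                 pixel belongs to the target area
    Returns an H x W list of (R, G, B, A) tuples where A is 255
    inside the target area and 0 (transparent) elsewhere.
    """
    out = []
    for row_px, row_p in zip(frame, prob_matrix):
        out.append([
            (r, g, b, 255 if p >= threshold else 0)
            for (r, g, b), p in zip(row_px, row_p)
        ])
    return out

# A 1x2 "frame": one foreground pixel, one background pixel
frame = [[(200, 10, 10), (0, 0, 0)]]
probs = [[0.9, 0.1]]
masked = apply_mask(frame, probs)
```

Encapsulating the sequence of such transparency-processed frames then yields the foreground video.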
In the above scheme, the synthesis unit is further configured to:
obtain the timestamp alignment relationship between the foreground video frames and the background video frames;
superimpose each foreground video frame in the foreground video with the background video frame in the background video that satisfies the timestamp alignment relationship.
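One possible form of the timestamp alignment relationship is sketched below: each foreground video frame is paired with the background video frame whose timestamp is closest. The nearest-neighbor rule and the tolerance value are assumptions for illustration; the patent does not fix a particular alignment rule.

```python
def align_frames(fg_ts, bg_ts, tolerance=0.02):
    """Pair each foreground frame (by index) with the background frame
    whose timestamp (seconds) is closest, keeping only pairs whose
    timestamps differ by at most the tolerance."""
    pairs = []
    for i, t in enumerate(fg_ts):
        j = min(range(len(bg_ts)), key=lambda k: abs(bg_ts[k] - t))
        if abs(bg_ts[j] - t) <= tolerance:
            pairs.append((i, j))
    return pairs

# A 25 fps foreground aligned against a 50 fps background
fg = [0.00, 0.04, 0.08]
bg = [0.00, 0.02, 0.04, 0.06, 0.08]
pairs = align_frames(fg, bg)
```

Each resulting pair identifies one foreground frame and one background frame to be superimposed.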
In the above scheme, the synthesis unit is further configured to:
in response to an editing operation setting synthesis parameters for the foreground video and the background video, cover the background video frames with the foreground video frames, such that the coverage area of each foreground video frame in the background video frame satisfies the set synthesis parameters.
In the above scheme, the synthesis unit is further configured to:
construct an initial matrix with the same size as the foreground video frames;
adjust the elements in the initial matrix according to the editing operation to obtain a target matrix characterizing the variation of the set synthesis parameters.
In the above scheme, the synthesis unit is further configured to:
multiply the target matrix with the foreground video frames in the foreground video to obtain adjusted foreground video frames;
cover the background video frames with the adjusted foreground video frames.
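The multiplication of the target matrix with a foreground video frame, followed by covering the background video frame, can be sketched as below. Representing frames as small nested lists of pixel tuples and treating each target-matrix element as a per-pixel scaling factor are illustrative assumptions, not the claimed implementation.

```python
def adjust_and_overlay(fg_frame, target_matrix, bg_frame):
    """Multiply each foreground pixel by the target-matrix element at
    the same position, then cover the background with every
    non-transparent (A > 0) foreground pixel.

    fg_frame:      H x W list of (R, G, B, A) tuples
    target_matrix: H x W list of floats
    bg_frame:      H x W list of (R, G, B) tuples
    """
    out = [row[:] for row in bg_frame]  # leave the input background intact
    for y, (px_row, m_row) in enumerate(zip(fg_frame, target_matrix)):
        for x, ((r, g, b, a), m) in enumerate(zip(px_row, m_row)):
            if a > 0:  # transparent foreground pixels do not cover anything
                out[y][x] = (int(r * m), int(g * m), int(b * m))
    return out

fg = [[(100, 100, 100, 255), (50, 50, 50, 0)]]   # one opaque, one transparent
tm = [[0.5, 1.0]]
bg = [[(0, 0, 0), (9, 9, 9)]]
result = adjust_and_overlay(fg, tm, bg)
```

The transparent foreground pixel leaves the background pixel visible, while the opaque pixel replaces it after scaling.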
The embodiment of the present invention provides a video processing device, including:
a memory, configured to store executable instructions;
a processor, configured to implement the video processing method provided by the embodiments of the present invention when executing the executable instructions stored in the memory.
The embodiment of the present invention provides a storage medium storing executable instructions which, when executed, cause a processor to implement the video processing method provided by the embodiments of the present invention.
The foreground video with the target object as foreground is split from the target video, and the video frames obtained by synthesizing the foreground video frames of the split foreground video with the background video frames of the background video are encapsulated as a synthesized video. Thus, based on video content, a new video is synthesized with the target object in the target video as foreground and the video frames of the background video as background, obtaining a dynamic video with coordinated picture content.
Brief description of the drawings
Fig. 1 is an optional architecture diagram of a video processing system provided by an embodiment of the present invention;
Fig. 2 is an optional structural diagram of a video processing device provided by an embodiment of the present invention;
Fig. 3 is an optional flow diagram of a video processing method provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of an optional display interface provided by an embodiment of the present invention;
Fig. 5A is a schematic diagram of an optional superposition effect provided by an embodiment of the present invention;
Fig. 5B is a schematic diagram of an optional superposition effect provided by an embodiment of the present invention;
Fig. 6 is an optional flow diagram of a video processing method provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of an optional training sample provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of an optional editing interface provided by an embodiment of the present invention;
Fig. 9 is a schematic diagram of an optional editing interface provided by an embodiment of the present invention;
Fig. 10 is an optional encoding and decoding architecture diagram of a video encoder provided by an embodiment of the present invention;
Fig. 11 is an optional flow diagram of a video processing method in the related art;
Fig. 12 is a schematic diagram of the synthesis effect of a video processing method in the related art;
Fig. 13 is an optional flow diagram of a video processing method in the related art;
Fig. 14 is an optional flow diagram of a video processing method provided by an embodiment of the present invention;
Fig. 15 is an optional flow diagram of a video processing method provided by an embodiment of the present invention;
Fig. 16 is a schematic diagram of an optional display interface provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings. The described embodiments are not to be construed as limiting the present invention; all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
In the following description, "some embodiments" are referred to, which describe subsets of all possible embodiments. It can be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other where no conflict arises.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field to which the present invention belongs. The terms used herein are merely for the purpose of describing the embodiments of the present invention and are not intended to limit the present invention.
Before the embodiments of the present invention are further elaborated, the nouns and terms involved in the embodiments of the present invention are explained; the nouns and terms involved in the embodiments of the present invention apply to the following explanations.
1) Background: the scenery behind the subject, which can show the spatio-temporal environment of a person or event in the video picture, for example, the buildings, walls, and ground behind a person.
2) Foreground: the content in the video picture that is closer to the lens than the background; the subject presented by the video, for example, a person standing in front of a building.
3) Target video: the video from which the foreground is extracted when performing video synthesis.
4) Background video: the video from which the background is extracted when performing video synthesis.
5) Superposition: synthesizing a partial region of one image (or multiple images) as foreground with another image as background to obtain a new image. For example: a region of image A is synthesized with image B to obtain image C. Here, an image can be a video frame in a video.
6) Mask: an image matrix used to shield (some or all of the pixels in) an image to be processed, so that a specific part of the image stands out. A mask can be a two-dimensional binary matrix array, and sometimes multi-valued matrix data is also used.
7) Mask processing: processing that shields (for example, makes transparent) some regions in an image based on a mask. Each pixel in the image is ANDed with the binary number (also called the mask value) at the same position in the mask, for example, 1 & 1 = 1; 1 & 0 = 0.
8) Encapsulation: converting multiple video frames into a video file based on a certain frame rate and video format. The frame rate represents the number of frames per second, for example: 25 frames per second (fps), 60 fps, etc. Video formats can include: Matroska Multimedia Container (MKV), Audio Video Interleaved (AVI), Moving Picture Experts Group (MPEG)-4, and other video file formats.
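Mask processing (definition 7) and the frame rate used in encapsulation (definition 8) can be illustrated with a minimal sketch. Using 8-bit mask values of 255/0 instead of binary 1/0 is an assumption that generalizes the 1 & 1 = 1, 1 & 0 = 0 example to full-range pixel values.

```python
def mask_and(pixels, mask):
    """Bitwise-AND each 8-bit pixel value with the mask value at the
    same position: 255 keeps the pixel unchanged, 0 shields it."""
    return [[p & m for p, m in zip(prow, mrow)]
            for prow, mrow in zip(pixels, mask)]

def frame_timestamps(num_frames, fps):
    """Presentation time (in seconds) of each frame when frames are
    encapsulated at a fixed frame rate (frames per second)."""
    return [i / fps for i in range(num_frames)]

shielded = mask_and([[200, 77]], [[255, 0]])  # second pixel is shielded
times = frame_timestamps(3, 25)               # 25 fps -> 0.04 s spacing
```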
First, the technical solutions for video synthesis in the related art are analyzed.
Technical solution 1) synthesis of a still image and a dynamic video
AI segmentation is performed on a still image to segment out the region corresponding to the target object, and the segmented region is fused with a background video serving as the background to obtain the synthesized video. Here, the objects of fusion are a static image and a dynamic video, and the target object in the synthesized video is static; that is, in the synthesized video, the target object in each video frame is static.
Technical solution 2) whole-picture splicing of videos
The video frames of two videos are stitched together side by side to synthesize a larger video. The backgrounds of the videos are not processed, and the pictures of the synthesized video are not fused in terms of content.
In view of the problems of the above technical solutions, the embodiments of the present invention provide a video processing method, in which a foreground video with the target object as foreground is split from the target video, and the video frames obtained by synthesizing the foreground video frames of the split foreground video with the background video frames of the background video are encapsulated as a synthesized video, so that dynamic videos are synthesized based on video content, obtaining a dynamic video with coordinated picture content.
The following describes exemplary applications of the video processing device implementing the embodiments of the present invention. The video processing device provided by the embodiments of the present invention can be integrated into various forms of electronic equipment. The electronic equipment provided by the embodiments of the present invention can be implemented as various terminals, such as mobile phones, tablet computers, laptops, and other mobile terminals with wireless communication capability, or, for example, desktop computers and the like. In addition, the electronic equipment can also be implemented as a server or a server cluster composed of multiple servers, which is not limited herein.
Referring to Fig. 1, Fig. 1 is an optional architecture diagram of a video processing system 100 provided by an embodiment of the present invention. The terminal 400 connects to the server 200 through a network 300; the network 300 can be a wide area network or a local area network, or a combination of the two, and uses wireless links for data transmission. A video processing application runs on the terminal 400, and the video processing application is provided with an interface 410 to receive user operations related to video synthesis.
Taking the case where the video processing device provided by the embodiments of the present invention is deployed in the server 200 as an example, in one exemplary application, when the terminal 400 needs to synthesize video, the target video and the background video can be videos recorded with the terminal. In this case, the terminal 400 can send the target video and the background video to the server and request the server 200 to perform video synthesis. After receiving the target video and the background video, the server 200 uses the video processing method provided by the embodiments of the present invention to split out the target object in the target video, takes the split foreground video as foreground and the background video as background, superimposes the foreground video frames in the foreground video with the background video frames in the background video, encapsulates the superimposed video frames to obtain a synthesized video, and finally sends the encapsulated synthesized video back to the terminal 400.
For example: as shown in Fig. 1, the target video is video 101 and the background video is video 102. The terminal 400 sends video 101 and video 102 to the server 200. The server 200 extracts from video 101 the foreground video 104 with the portrait 103 as foreground, and superimposes the foreground video frames 1041 of foreground video 104 (including 1041-1 to 1041-n) respectively with the background video frames 1021 of background video 102 (including 1021-1 to 1021-n), obtaining the video frames 1051 of synthesized video 105 (including 1051-1 to 1051-n), where n is an integer greater than 1.
In another exemplary application in which the video processing device provided by the embodiments of the present invention is deployed in the server 200, when the terminal 400 needs to synthesize video, it can send the identification information of the target video and the background video to the server 200. The server 200 determines the corresponding target video and background video based on the received identification information, uses the video processing method provided by the embodiments of the present invention to split out the target object in the target video, takes the split foreground video as foreground and the background video as background, superimposes the foreground video frames in the foreground video with the background video frames in the background video, encapsulates the superimposed video frames to obtain a synthesized video, and finally sends the encapsulated video back to the terminal 400. The terminal 400 can then publish the synthesized video.
In an example in which the terminal 400 serves as the electronic equipment, the target video and the background video can be video files encapsulated in the terminal 400. The terminal 400 itself uses the video processing method provided by the embodiments of the present invention to superimpose the foreground video frames in the foreground video with the background video frames in the background video, and encapsulates the superimposed video frames to obtain the synthesized video file.
The above descriptions take the cases where the video processing device provided by the embodiments of the present invention is deployed in the server and in the terminal, respectively, as examples. It can be understood that the video processing device provided by the embodiments of the present invention can also be deployed in a distributed manner across the terminal and the server, so that the video processing method provided by the embodiments of the present invention is completed by the terminal and the server in cooperation.
It should be noted that in the embodiments of the present invention, the types of the target video and the background video can be the same or different. For example: both the target video and the background video are encapsulated video files. For another example: the target video is a video stream, and the background video is an encapsulated video file.
The video processing device provided by the embodiments of the present invention can be implemented in hardware, in software, or in a combination of hardware and software.
As an example of software implementation, the video processing device can include one or more software modules for implementing, alone or in cooperation, the video processing method provided by the embodiments of the present invention; the software modules can use various programming languages for various front ends or back ends.
As an example of hardware implementation, the video processing device can include one or more hardware modules. The hardware modules can use hardware such as Application Specific Integrated Circuits (ASIC), Complex Programmable Logic Devices (CPLD), and Field-Programmable Gate Arrays (FPGA), programmed to implement, alone or in cooperation, the video processing method provided by the embodiments of the present invention.
Taking the combination of software and hardware as an example again, an exemplary implementation of the video processing device provided by the embodiments of the present invention is described below.
Referring to Fig. 2, Fig. 2 is an optional structural diagram of a video processing device 20 provided by an embodiment of the present invention. Based on the structure of the video processing device 20 shown in Fig. 2, other exemplary structures of the video processing device 20 can be anticipated, so the structure described herein should not be regarded as limiting; for example, some of the components described below can be omitted, or components not described below can be added to meet the specific needs of certain applications.
The video processing device 20 shown in Fig. 2 includes: at least one processor 210, a memory 240, at least one network interface 220, and a user interface 230. The various components in the video processing device 20 are coupled through a bus system 250. It can be understood that the bus system 250 is used to implement connection and communication between these components. In addition to a data bus, the bus system 250 also includes a power bus, a control bus, and a status signal bus. However, for the sake of clarity, all the various buses are labeled as the bus system 250 in Fig. 2.
The memory 240 can be volatile memory or nonvolatile memory, and can also include both volatile and nonvolatile memory. The nonvolatile memory can be read-only memory (ROM), and the volatile memory can be random access memory (RAM). The memory 240 described in the embodiments of the present invention is intended to include any suitable type of memory.
The memory 240 in the embodiments of the present invention can store data to support the operation of the server 200. Examples of such data include: any computer program for running on the video processing device 20, such as an operating system and application programs. The operating system includes various system programs, such as a framework layer, a core library layer, and a driver layer, for implementing various basic services and processing hardware-based tasks. The application programs can include various applications.
As an example in which the video processing method provided by the embodiments of the present invention is implemented by a combination of software and hardware, the method provided by the embodiments of the present invention can be directly embodied as a combination of software modules executed by the processor 210. The software modules can be located in a storage medium, the storage medium is located in the memory 240, and the processor 210 reads the executable instructions included in the software modules in the memory 240 and, in combination with necessary hardware (for example, including the processor 210 and other components connected to the bus 250), completes the video processing method provided by the embodiments of the present invention.
The video processing method implementing the embodiments of the present invention is described below in combination with the exemplary applications and implementations of the aforementioned video processing device. It can be understood that the video processing method shown in Fig. 3 can be executed by various electronic equipment, for example executed by a terminal or a server, or performed jointly by a terminal and a server.
Referring to Fig. 3, Fig. 3 is an optional flow diagram of the video processing method provided by an embodiment of the present invention. The steps shown in Fig. 3 are described below.
Step S301, obtain a target video.
The target video can be an encapsulated video file. The target video can also be a video stream, for example: streaming media data of a live video.
The quantity of target videos can be one or more.
A video processing application runs in the terminal, and a target video selection window is provided in the video processing application. Identification information of candidate target videos, such as video thumbnails and video names, is provided in the target video selection window. The terminal receives a selection operation of the user and takes the video corresponding to the identification information selected by the selection operation as the target video. When the electronic equipment is a server, the video processing application running in the terminal is a client of the server.
For the target video selected by the user, the target video can be presented on the terminal so that the user can preview the selected target video to determine whether the selected target video is the required target video. If it is not the target video required by the user, the target video can be reselected based on the target video selection window.
Illustratively, the target video selection window can be the window 401 shown in Fig. 4, where icon 402 and icon 403 in window 401 are respectively icons of candidate target videos. When the selection operation selects icon 402, the video corresponding to icon 402 is the target video. Window 401 further includes a "more" option 404 for presenting more candidate target videos; when the more option 404 receives the user's touch operation, the identification information of more candidate target videos can be presented. After the target video is obtained, window 405 can serve as a preview window, and the picture of the target video is presented in window 405.
Step S302, in response to a splitting operation for a target object in the target video, obtain from the target video a foreground video with the target object as foreground.
A splitting entrance for receiving the splitting operation for the target object can be loaded in the terminal. Based on the received splitting operation, the terminal can generate a splitting instruction instructing to obtain from the target video the foreground video with the target object as foreground.
In one example, when the electronic equipment is a terminal, the terminal obtains the foreground video from the target video locally based on the splitting instruction.
In another example, when the electronic equipment is a server, the terminal sends the splitting instruction to the server, and the server obtains the foreground video from the target video based on the received splitting instruction.
Based on the splitting instruction, the electronic equipment can call the interface of a video encoder and input the target video into the video encoder through the called interface; the video encoder decomposes the target video into video frames. The electronic equipment performs image recognition on each video frame, identifies the target object from each video frame, and obtains, based on the region where the target object is located, the foreground video frames constituting the foreground video. The target object can be an object in the foreground of the video frames of the target video, such as a person or an animal. Here, the video frames constituting the foreground video are called foreground video frames, and the foreground video includes at least one foreground video frame.
To identify from the target video the video with the target object as foreground, the target object in the video frames of the target video can be identified in at least one of the following manners:
Identification manner one: calibration
A calibration operation of the user for a video frame in the target video is received, and the object calibrated by the user's calibration operation is determined as the target object. The object calibrated by the user's calibration operation can be a specific object, for example, one person among multiple people, or a class of objects, for example, males or females.
Identification manner two: automatic identification by an image recognition model
The foreground of the video frame (such as a person or an animal) is automatically identified by an image recognition model and used as the target object.
Step S303, obtain a background video.
The background video can be an encapsulated video file. The background video can also be a video stream.
The video processing application running in the terminal can provide a video selection window to receive the video selection operation by which the user selects the background video, and determine the identification information of the background video based on the user's video selection operation.
Here, the video frames in the background video are called background video frames, and the background video includes at least one background video frame.
It should be noted that in the embodiments of the present invention, steps S301, S302, and S303 are not executed in a fixed order: steps S301 and S302 can be executed first, or step S303 can be executed first.
Step S304, in response to a synthesis operation for the foreground video and the background video, superimpose the foreground video frames in the foreground video with the background video frames in the background video, and encapsulate the superimposed video frames as a synthesized video.
The video processing application running in the terminal can provide an interactive entrance for triggering video synthesis, so as to receive the synthesis operation instructing that the foreground video and the background video be synthesized, and generate a synthesis instruction based on the synthesis operation.
When the electronic equipment is a server, the terminal sends the synthesis instruction to the server, and the server superimposes the foreground video frames in the foreground video with the background video frames in the background video based on the synthesis instruction, thereby realizing the synthesis of the foreground video and the background video.
As shown in Fig. 5A, for example: when the foreground video includes video A' and the background video is video D, the video frames of video A' are superimposed with the background video frames in the background video; the superposition effect can be as shown in Fig. 5A, where the background area 501 is the picture of video D, and the object 502 is the region corresponding to the object a in the foreground of video A.
The electronic equipment superimposes the foreground video frames of the foreground video with the background video frames of the background video according to synthesis parameters. Here, the relative position and/or relative imaging size of the target object in the target video can be used as synthesis parameters, and an editing operation of the user can be received based on an editing page so that the user can adjust the synthesis parameters.
It should be noted that the manner of the user operations such as the selection operation, the splitting operation, and the synthesis operation in the embodiments of the present invention can be: touch, voice, gesture, etc.; the embodiments of the present invention place no restriction on the user's operation manner.
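Superimposing according to a relative-position synthesis parameter can be sketched as follows. Representing frames as nested lists of RGB(A) tuples and expressing the parameter as a (top, left) placement are illustrative assumptions.

```python
def overlay_at(bg_frame, fg_frame, top, left):
    """Overlay an RGBA foreground frame onto an RGB background frame
    at the given (top, left) position; transparent (A == 0) foreground
    pixels and out-of-bounds positions are skipped.

    The (top, left) offset stands in for the "relative position"
    synthesis parameter adjusted by the user's editing operation.
    """
    out = [row[:] for row in bg_frame]
    for dy, row in enumerate(fg_frame):
        for dx, (r, g, b, a) in enumerate(row):
            y, x = top + dy, left + dx
            if a > 0 and 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = (r, g, b)
    return out

bg = [[(0, 0, 0), (0, 0, 0)], [(0, 0, 0), (0, 0, 0)]]  # 2x2 background
fg = [[(9, 9, 9, 255)]]                                 # 1x1 opaque foreground
composited = overlay_at(bg, fg, 1, 1)
```

A relative-size parameter could be handled the same way by resampling the foreground frame before calling this function.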
In the video processing method provided by the embodiments of the present invention, the foreground video with the target object as foreground is split from the target video, and the video frames obtained by superimposing the foreground video frames of the split foreground video with the background video frames of the background video are encapsulated as a synthesized video, so that dynamic videos are synthesized based on video content, obtaining a dynamic video with coordinated picture content. Here, based on image segmentation technology, the target object is extracted from one video in real time and synthesized with another video, realizing the automatic fusion of two videos. This can significantly improve the efficiency of user video production and inspire users to create more interesting videos, allowing ordinary users to produce videos with film-like special effects.
In some embodiments, when the number of target videos is at least two, step S302 may be performed as: receiving a batch segmentation operation for the at least two target videos; and, in response to the batch segmentation operation, obtaining from each target video a video clip taking the target object as the foreground and determining it as the corresponding foreground video.
When there are multiple target videos, the segmentation operation may be a batch segmentation operation. For the multiple target videos, the target object serving as the foreground of each target video may or may not be the same. The target objects corresponding to different target videos may be objects of the same class. Here, the segmented foreground video is a video clip composed of the target object in the target video.
At this point, step S303 may be performed as: receiving a batch synthesis operation for the at least two foreground videos and the background video; and, in response to the batch synthesis operation, superimposing the foreground video frames of the at least two foreground videos respectively on the background video frames of the background video.
For example: when the foreground videos include video A', video B', and video C', and the background video is video D, the video frames of video A', video B', and video C' are jointly superimposed on the background video frames of the background video. The superposition effect may be as shown in Figure 5B, where background area 501 is the picture of video D, and objects 502, 503, and 504 are the regions corresponding to objects a, b, and c, the foregrounds in videos A, B, and C, respectively.
In some embodiments, step S303 may be performed as: loading and displaying a video selection window with candidate background videos; receiving a video selection operation for the video selection window; and obtaining the background video selected by the video selection operation.
A video selection window is provided in the video processing application running on the terminal, and identification information of the candidate background videos is displayed in the video selection window. The identification information of the candidate background videos in the video selection window may be obtained locally or from the network side. The terminal receives the video selection operation via the video selection window, so that the user selects, from the candidate background videos, the background video used for video synthesis.
For the background video selected by the user, the background video may be presented on the terminal so that the user can preview the selected background video to determine whether it is the desired background video. If it is not the background video the user needs, the background video may be reselected via the video selection window, replacing the selected background video. Illustratively, the video selection window may be as shown by window 401 in Figure 4; the selection process of the background video is not repeated here.
In some embodiments, in response to a preview operation for the foreground video and the background video, the superposition effect of the foreground video frames and the background video frames is presented.
The video processing application running on the terminal may provide an interactive entry for receiving a preview operation, so as to receive a preview operation instructing preview of the superposition effect of the foreground video and the background video.
In some embodiments, as shown in Figure 6, after the segmentation operation is received in step S302, the foreground video may be segmented from the target video through the following steps:
Step S3021: identifying the target area where the target object is located in the video frames of the target video, and applying transparency processing to the areas other than the target area in the video frames;
The target area of the target object is identified from the video frames of the target video by means of an image recognition model or user calibration. After the target area is identified, the pixel values of the pixels belonging to the target area are kept unchanged, and the pixel values of the pixels belonging to the areas other than the target area are set to 0, so that the areas other than the target area are made transparent and the target object is segmented from the video frames of the target video.
Step S3022: encapsulating the video frames after the transparency processing as the foreground video.
The foreground video frames after transparency processing are encapsulated as the foreground video based on a video codec.
In some embodiments, step S3021 may be implemented as follows:
identifying the target area where the target object is located in the video frames of the target video, and obtaining, according to the target area, an image matrix corresponding to each video frame of the target video, where the elements in the image matrix respectively characterize the probability that the pixels of the corresponding video frame belong to the target area; and performing mask processing on the image matrix with the corresponding video frame, so as to make the areas other than the target area in the video frame transparent.
Here, the target area where the target object is located in the target video frame may be identified by an image recognition model, and the image recognition model outputs a binarized image matrix based on the identified target area. The target area where the target object is located in the target video frame may also be identified through user calibration, and the binarized image matrix is obtained according to the determined target area. In the image matrix, the element corresponding to a pixel outside the target area is 0, characterizing that the pixel does not belong to the target area, and the element corresponding to a pixel of the target area is 1, characterizing that the pixel belongs to the target area. Mask processing is performed on the image matrix with the video frame of the target video: the pixel values of the pixels of the target area remain unchanged, and the pixel values of the pixels of the areas other than the target area become 0, thereby making the areas other than the target area in the video frame transparent.
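The mask processing described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the frame is an H x W array of RGBA pixels and the binarized image matrix is an H x W array of 0/1 values, with 1 marking the target area.

```python
# Hypothetical sketch of the mask processing described above: a binarized
# image matrix (1 inside the target area, 0 outside) is applied to an RGBA
# frame so that pixels outside the target area become fully transparent.
# The frame and mask layouts are assumptions for illustration only.

def apply_mask(frame, mask):
    """frame: H x W list of [R, G, B, A] pixels; mask: H x W list of 0/1."""
    out = []
    for row_px, row_m in zip(frame, mask):
        out_row = []
        for (r, g, b, a), m in zip(row_px, row_m):
            if m == 1:          # target area: keep pixel values unchanged
                out_row.append([r, g, b, a])
            else:               # outside target area: set values to 0 (transparent)
                out_row.append([0, 0, 0, 0])
        out.append(out_row)
    return out

frame = [[[255, 0, 0, 255], [0, 255, 0, 255]]]
mask = [[1, 0]]
result = apply_mask(frame, mask)
# result[0][0] stays [255, 0, 0, 255]; result[0][1] becomes [0, 0, 0, 0]
```

In practice this per-pixel loop would run on the GPU or as a vectorized array operation, but the logic is the same elementwise product of mask and pixel values.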
Here, the image recognition model may be trained with a sample set annotated with the target object. When the target object is a portrait, a training sample in the sample set may be as shown in Figure 7, where portrait 702 is annotated in portrait picture 701.
In some embodiments, superimposing the foreground video frames of the foreground video on the background video frames of the background video comprises:
obtaining a timestamp alignment relation between the foreground video frames and the background video frames; and superimposing the foreground video frames of the foreground video on the background video frames of the background video that satisfy the timestamp alignment relation.
Here, before the foreground video frames are superimposed on the background video frames, the timestamp of each foreground video frame of the foreground video and the timestamp of each background video frame of the background video are obtained, and the timestamp alignment relation between the foreground video frames and the background video frames, that is, the relation between the time period of the foreground video and the time period of the background video, is determined according to the obtained timestamps; foreground video frames and background video frames having the timestamp alignment relation are then superimposed. The timestamp alignment relation may be determined automatically according to the position of each foreground video frame and each background video frame on the timeline, or may be determined based on an editing function provided by the video processing application. The editing function provided by the video processing application may adjust the timestamps of the foreground video frames, i.e., the positions of the foreground video frames on the timeline, based on a user adjustment operation.
For example: the duration of the background video is 2 minutes, its time period on the timeline being 0 to 2 minutes; the duration of the foreground video is 30 seconds, and its timestamps are aligned with the period from 1 minute 16 seconds to 1 minute 45 seconds of the background video. Then the first frame of the foreground video and the background video frame at 1 minute 16 seconds have the timestamp alignment relation, and the frames correspond frame by frame, so that each foreground video frame is superimposed on the corresponding background video frame between 1 minute 16 seconds and 1 minute 45 seconds. Here, the frame rates of the foreground video frames and background video frames may be identical.
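The frame-by-frame correspondence in the example above can be sketched as follows, assuming equal frame rates for the two videos (the simple case the text describes); the function name and argument layout are illustrative, not from the patent.

```python
# A minimal sketch of the timestamp alignment described above: with equal
# frame rates, each foreground frame maps to the background frame whose
# timestamp equals the foreground start offset plus the foreground frame's
# own offset within the foreground video.

def align_frames(fg_frame_count, fps, fg_start_sec):
    """Return (foreground_index, background_index) pairs for superposition."""
    start_index = int(fg_start_sec * fps)   # first aligned background frame
    return [(i, start_index + i) for i in range(fg_frame_count)]

# A 30 s foreground at 30 fps, aligned to 1 min 16 s (76 s) of the background:
pairs = align_frames(30 * 30, 30, 76)
# first pair: foreground frame 0 with background frame 2280 (76 s * 30 fps)
```

When the frame rates differ, the mapping would instead be computed from each frame's actual timestamp, but the alignment relation is the same: a fixed offset between the two timelines.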
As another example: continuing the example above, the timestamp alignment relation between the foreground video frames and the background video frames is adjusted as shown in Figure 8. Before the adjustment, the start time of the foreground video is aligned with T1 of the background video, where T1 is 1 minute 16 seconds, so the foreground video is aligned with the period from 1 minute 16 seconds to 1 minute 45 seconds of the background video. The user makes an adjustment along the direction shown by the dotted arrow based on a slidable control, adjusting the start time of the foreground video to T2 of the background video, where T2 is 1 minute 6 seconds. The time adjustment operation thus moves the start position of the foreground video frames on the timeline of the background video from 1 minute 16 seconds to 1 minute 6 seconds, so that the period of the foreground video frames is aligned with the background video frames from 1 minute 6 seconds to 1 minute 35 seconds, and each foreground video frame is superimposed on the corresponding background video frame between 1 minute 6 seconds and 1 minute 35 seconds.
In the embodiments of the present invention, the terminal may provide a slidable control or similar time adjustment interface in the user interface, so that the user selects, through the interface, the start time and end time at which the foreground video is synthesized with the background video. It should be noted that the start time and end time of the synthesis are between the start time and end time of the background video. When the electronic device superimposes the foreground video frames and the background video frames, it starts decoding background video frames from the selected start time of synthesis, and superimposes the decoded background video frames on the foreground video frames frame by frame until the end time of the synthesis. If the interval between the start time and end time of synthesis is longer than the duration of the foreground video, the end time of the foreground video prevails; if the interval is shorter than the duration of the foreground video, the selected end time of synthesis prevails, that is, the synthesis ends before the foreground video reaches its end.
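The end-time rule described above amounts to clamping the synthesis window to the shorter of the selected interval and the foreground duration. A hedged sketch, with all times in seconds on the background timeline and names chosen for illustration:

```python
# Sketch of the end-time rule described above: synthesis stops at whichever
# comes first, the selected end time or the end of the foreground video.

def synthesis_window(synth_start, synth_end, fg_duration):
    """All times in seconds, relative to the background video timeline."""
    selected = synth_end - synth_start
    if selected > fg_duration:
        # interval longer than the foreground: the foreground end prevails
        return synth_start, synth_start + fg_duration
    # otherwise the selected end time prevails (foreground is cut short)
    return synth_start, synth_end

# Foreground lasts 30 s; a 40 s window is clamped to 30 s:
print(synthesis_window(76, 116, 30))   # (76, 106)
# A 20 s window ends before the foreground does:
print(synthesis_window(76, 96, 30))    # (76, 96)
```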
In some embodiments, superimposing the foreground video frames of the foreground video on the background video frames of the background video comprises:
in response to an edit operation setting synthesis parameters for the foreground video and the background video, covering the background video frames with the foreground video frames, where the coverage area of the foreground video frames in the background video frames satisfies the set synthesis parameters. The synthesis parameters include at least one of the following: position, size, and the like, characterizing superposition attributes of the foreground video frame in the background video frame such as relative position and relative size.
The video processing application running on the terminal may provide an edit page, in which the foreground video frames of the foreground video and the background video frames of the background video may be displayed; here, foreground video frames and background video frames having the timestamp alignment relation may be displayed.
An editing interaction control is loaded on the editing interface to receive the edit operation setting the synthesis parameters. The edit operation may be translation, rotation, scaling, and the like.
In practical applications, the editing interface for the edit operation may be as shown in Figure 9: a rectangular frame 902 with the same size as the target object in the foreground video frame is provided in the editing interface 901, and the user's edit operation on the foreground video is received via this rectangular frame.
After it is determined that the user has completed the edit operation, the synthesis operation may be triggered automatically, or the synthesis operation may be received through an interactive entry operated by the user on the display interface. In response to the synthesis operation, the foreground video frames cover the background video frames based on the synthesis parameters set by the edit operation, so that the coverage of the foreground video in the background video frames satisfies the set synthesis parameters.
In some embodiments, determining the synthesis parameters of the foreground video frame in the background video frame comprises: constructing an initial matrix with the same size as the foreground video frame; and adjusting the elements in the initial matrix according to the edit operation to obtain a target matrix characterizing the variation of the synthesis parameters.
Here, a matrix with the same height and width as the target object in the foreground video frame is constructed and is referred to as the initial matrix. The initial matrix is adjusted according to the edit operation to obtain the target matrix characterizing the variation of the synthesis parameters. When the edit operation is translation, the values of the elements corresponding to the translated position are modified to the translation distance; when the edit operation is scaling, the values of the elements corresponding to the scaled position are modified to the scaling ratio; when the edit operation is rotation, the values of the elements corresponding to the rotated position are modified to trigonometric functions of the rotation angle.
Illustratively, when the height and width of the foreground video frame are 3, the initial matrix may be a 3*3 matrix. The target matrix for translation may be [1, 0, tx; 0, 1, ty; 0, 0, 1], the target matrix for scaling may be [sx, 0, 0; 0, sy, 0; 0, 0, 1], and the target matrix for rotation may be [cos(q), -sin(q), 0; sin(q), cos(q), 0; 0, 0, 1], where tx and ty respectively indicate the translation distances along the x and y directions, sx and sy respectively indicate the scaling ratios along the x and y directions, and q in sin(q)/cos(q) indicates the angle of rotation. Scaling by sx and sy means that a two-dimensional spatial coordinate (x, y) is scaled by sx times in the horizontal direction and by sy times in the vertical direction, centered at (0, 0); that is, after the transformation, the horizontal distance of the coordinate from (0, 0) becomes sx times the original horizontal distance, and the vertical distance becomes sy times the original vertical distance. The elements 1 and 0 carry no particular meaning of their own; they are default parameters introduced when the computation is expressed as a mathematical matrix.
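The three target matrices above are the conventional homogeneous affine transforms. A short sketch using plain lists (no external libraries), with function names chosen for illustration:

```python
import math

# Sketch of the 3*3 target matrices described above: translation by (tx, ty),
# scaling by (sx, sy) about the origin, and rotation by angle q (radians).

def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def rotation(q):
    return [[math.cos(q), -math.sin(q), 0],
            [math.sin(q),  math.cos(q), 0],
            [0, 0, 1]]

def matmul(m, v):
    """Multiply a 3*3 matrix by a homogeneous column vector [x, y, 1]."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

# Scaling (4, 5) by (2, 3) about the origin (0, 0):
print(matmul(scaling(2, 3), [4, 5, 1]))      # [8, 15, 1]
# Translating (4, 5) by (2, 3):
print(matmul(translation(2, 3), [4, 5, 1]))  # [6, 8, 1]
```

The third row (0, 0, 1) and the third vector component 1 are the default parameters mentioned in the text: they let translation be expressed as a matrix product alongside scaling and rotation.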
In some embodiments, covering the background video frame with the foreground video frame such that the coverage area of the foreground video frame in the background video frame satisfies the set synthesis parameters comprises:
multiplying the target matrix with the foreground video frames of the foreground video to obtain adjusted foreground video frames; and covering the background video frames with the adjusted foreground video frames.
Here, the target matrix may be multiplied with the bitmap of the foreground video frame to obtain the bitmap of the adjusted foreground video frame. A bitmap (Bitmap) is stored as a two-dimensional array of RGBA pixels. When the pixel at coordinate position p0 (x0, y0) in the foreground video frame is transformed, parameters of the transformation such as the translation distance and scaling ratio are input into the reference matrix to obtain the corresponding target matrix M (x0, y0); the coordinate position of the pixel in the adjusted foreground video frame is then p1 (x1, y1), and the formula for calculating p1 (x1, y1) is:
p1 (x1, y1) = p0 (x0, y0) * M (x0, y0);
where p0 (x0, y0) participates in the calculation as the transpose [x y]^T of the matrix [x y].
For example: when a spatial coordinate p0 (x0, y0) is first translated by tx along the x direction and then translated by ty along the y direction, the finally obtained coordinate is p1 (x1, y1) = (x0 + tx, y0 + ty), which, expressed in matrix form, is: [x1; y1; 1] = [1, 0, tx; 0, 1, ty; 0, 0, 1] * [x0; y0; 1].
Each pixel in the foreground video frame thus obtains a new coordinate position, yielding a new two-dimensional pixel array, which can be restored to a new Bitmap, i.e., the adjusted bitmap.
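The per-pixel mapping can be sketched as follows. This is an illustrative sketch under assumed conventions, not the patent's implementation: only integer translation is shown so that no interpolation is needed, and pixels mapped outside the output bounds are dropped.

```python
# Sketch of building the adjusted bitmap: each pixel at (x0, y0) is moved to
# the position given by the target matrix (here the translation matrix
# [[1, 0, tx], [0, 1, ty], [0, 0, 1]]), producing a new 2-D pixel array.

def transform_bitmap(bitmap, tx, ty, out_w, out_h, empty=0):
    """bitmap: H x W array of pixel values; returns the translated bitmap."""
    out = [[empty] * out_w for _ in range(out_h)]
    for y0, row in enumerate(bitmap):
        for x0, px in enumerate(row):
            x1, y1 = x0 + tx, y0 + ty       # p1 = M * p0 for pure translation
            if 0 <= x1 < out_w and 0 <= y1 < out_h:
                out[y1][x1] = px
    return out

src = [[1, 2], [3, 4]]
print(transform_bitmap(src, 1, 0, 3, 2))   # [[0, 1, 2], [0, 3, 4]]
```

For scaling and rotation the same loop applies with the corresponding matrix, except that fractional target coordinates then require rounding or interpolation; production code would delegate this to the platform's matrix-aware bitmap APIs or to the GPU.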
The video processing method provided by the embodiments of the present invention can provide an edit page and receive, via the edit page, the user's edit operation on the foreground video frame, adjusting the relative position and imaging size of the foreground video frame relative to the background video frame when they are synthesized.
Illustratively, taking an electronic device using the Android platform as an example, the video codec involved in the embodiments of the present invention is described; the encoding/decoding framework of the video codec is shown in Figure 10:
A codec can process input data to generate output data, using a set of input buffers and output buffers to process data asynchronously. An empty input buffer can be created by the loader and, after being filled with data, sent to the codec for processing. The codec converts the input data provided by the client and then outputs it to an empty output buffer. Finally, the client gets the data of the output buffer, consumes the data inside, and releases the occupied output buffer back to the codec. If there is still data to be processed, the codec repeats these operations.
The data types a codec can handle include compressed data and raw video data. These data can be processed through byte buffers (ByteBuffers); in that case a screen buffer (Surface) is needed to display the raw video data, which can also improve codec performance. A Surface uses a native screen buffer that is not mapped or copied to ByteBuffers; this mechanism makes the codec more efficient. When using a Surface, the raw video data usually cannot be accessed directly, but an ImageReader can be used to access the decoded raw video frames.
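The buffer exchange described above can be modeled with a toy queue-based codec. This is a conceptual sketch only, not Android's MediaCodec API: the class and method names merely mirror the described cycle of dequeuing an empty input buffer, filling it, converting, and draining the output.

```python
from collections import deque

# Conceptual model of the asynchronous buffer exchange described above: the
# client fills empty input buffers, the codec converts them into output data,
# and input buffers return to the free pool once consumed.

class ToyCodec:
    def __init__(self, n_buffers=2):
        self.free_inputs = deque(range(n_buffers))
        self.pending = deque()      # filled input buffers awaiting conversion
        self.outputs = deque()      # converted data awaiting the client

    def dequeue_input_buffer(self):
        return self.free_inputs.popleft() if self.free_inputs else None

    def queue_input_buffer(self, index, data):
        self.pending.append((index, data))

    def process(self):
        # the codec's conversion step; upper() stands in for decode/encode
        while self.pending:
            index, data = self.pending.popleft()
            self.outputs.append(data.upper())
            self.free_inputs.append(index)   # buffer returns to the pool

    def dequeue_output_buffer(self):
        return self.outputs.popleft() if self.outputs else None

codec = ToyCodec()
i = codec.dequeue_input_buffer()
codec.queue_input_buffer(i, "frame")
codec.process()
print(codec.dequeue_output_buffer())   # FRAME
```

The real MediaCodec cycle additionally involves buffer timeouts, format changes, and end-of-stream flags, but the ownership hand-off between client and codec follows this shape.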
In the following, taking an actual application scenario in which the electronic device is a terminal and the target object is a portrait, an exemplary application of the embodiments of the present invention in an actual application scenario will be described.
In the related art, the video synthesis scheme may be as shown in Figure 11, comprising: selecting a background template 1101, where background template 1101 is a background video; choosing a portrait picture 1102 whose picture content includes a portrait; selecting the portrait area of portrait picture 1102 by way of user smearing; based on the portrait area selected by the user, dividing portrait picture 1102 into a portrait part and a background part through AI segmentation, so as to pluck out portrait 1103; displaying portrait 1103 on background template 1101 to form an edit image 1104, and editing the positions of portrait 1103 and background template 1101 displayed in edit image 1104; and, after the editing is completed, merging portrait 1103 with background template 1101 based on the edited synthesis parameters to obtain a synthetic video 1105 whose picture content includes the portrait.
The synthesis effect of the video synthesis scheme shown in Figure 11 is shown in Figure 12: the portrait 1103 in the static portrait picture 1102 is synthesized into the background template 1101, yielding the display page 1105 of the synthetic video 1104. The video synthesis scheme shown in Figure 11 performs portrait-background segmentation on a static picture and then synthesizes it into a video, which is rather limited: first, a specially produced template background video is required; second, only static pictures can be segmented, so the plucked-out portrait is static, losing much of the interest. In addition, the portrait segmentation of the picture requires manual smearing to select the region, which is too inefficient for processing a video with multiple frames of images.
In the related art, when watching a short video, a user can initiate a video duet function to synthesize two videos into one video frame by frame. The technical implementation of the video duet is to directly splice the two videos left and right; because the scenes differ, the two videos can look rather stiff. The effect of the video duet is shown in Figure 13, where picture 1301 is the picture of one of the videos in the duet and picture 1302 is the picture of the other video.
Therefore, the video duet scheme simply stitches two videos together side by side to synthesize one larger video; the backgrounds of the two videos are not processed, so the synthesized video contains two scenes and appears abrupt.
In order to solve the above technical defects, namely the limitation of the video synthesis scheme that can only synthesize static pictures into a video, and the abrupt scenes of the video synthesis scheme that splices two videos, the embodiments of the present invention provide a video processing method comprising: video selection, video decoding, portrait segmentation, picture editing, and video synthesis, as shown in Figure 14:
Video selection is performed from local videos to obtain a background video 1401 and a portrait video 1402, i.e., the target video. The background video 1401 is decoded to obtain video frames 1403, i.e., the background video frames. The portrait video 1402 is decoded to obtain video frames 1404. The video frames 1404 are input into a neural network model 1405 for portrait segmentation, which outputs the portrait mask map 1406; through mask processing of the portrait mask map 1406 and the video frames 1404, portrait images 1407, i.e., the foreground video frames, are obtained.
When a start-editing operation is received, the background video frame 1402 and the portrait image 1407 are displayed on the editing interface. The portrait image 1407 on the editing interface receives the user's edit operation; based on the user's edit operation, the relative position and relative size of the portrait image 1407 with respect to the video frame 1402 are adjusted to obtain a relative relationship, and the portrait image is edited based on the relative relationship. The edit operations performed on the portrait image 1407 may include translation, scaling, rotation, and the like. When a preview operation is received, the edited portrait image 1408 and the video frame 1402 are rendered, and the superposition effect 1409 of the portrait image 1407 and the video frame 1402 is output. When a synthesis operation is received, the edited portrait image 1408 and the video frame 1402 are rendered again, obtaining a synthetic frame 1410, and the synthetic frame 1410 is packaged into a synthetic video 1411 by a multimedia encoder. After the superposition effect 1409 is output, edit operations may continue to be received to adjust the relative position and relative size of the portrait image 1407 with respect to the video frame 1402.
The terminal device may display video selection options through the system album or a customized album page, and select the background video 1401 and the portrait video 1402 based on the displayed options. The terminal device decodes the background video 1401 and the portrait video 1402 into multiple single-frame images respectively through MediaCodec. Portrait segmentation is performed on each frame image decoded from the portrait video 1402 to obtain the portrait image 1407. The terminal device performs a matrix transformation on the Bitmap of each segmented frame of portrait image 1407 through the target matrix characterizing the synthesis parameters to obtain the Bitmap of the edited portrait image, and then uploads the edited Bitmap to the texture unit of the graphics processing unit (Graphics Processing Unit, GPU) through the OpenGL ES API. The GPU of the terminal device performs an image blending operation on the texture corresponding to the background video frame 702 and the texture of the edited portrait image through a shader to obtain the final synthetic frame, and the synthetic frame is encoded into the synthetic video through MediaCodec.
In the following, the stages of the video processing method provided by the embodiments of the present invention are described: portrait segmentation, picture editing, rendering, and video decoding and synthesis.
1. Portrait segmentation
On the server, a neural network model is trained using a set of multiple manually annotated portrait-class pictures as the training set, the trained neural network model is saved, and the trained neural network model is transplanted to the terminal device.
The server collects portrait-class pictures and annotates the collected portrait-class pictures manually, taking the region corresponding to the portrait in each portrait-class picture as the foreground and the region other than the portrait as the background, distinguishing each pixel of the foreground and the background. The manually annotated portrait-class picture may be as shown in Figure 7, where portrait 702 is annotated in portrait picture 701.
For a dynamic target video, the target video is decoded into static frames in real time by a video decoder (MediaCodec), the static frames are then input into the trained neural network model for processing, and the segmented picture mask (binary map) is returned; through transparency blending of the mask and the original image in the target video frame, the segmented portrait, i.e., the foreground video frame of the foreground video, can be cut out.
2. Picture editing
The user can translate, scale, rotate, and otherwise edit the segmented portrait picture, and can control the edited position and size according to their own needs.
The segmented portrait picture is stored in memory as a Bitmap, and the stored Bitmap can be transformed through a Matrix. By constructing a rectangular frame equal to the width and height of the portrait picture and then providing, on the graphical interface, an interactive entry for the user to drag and rotate, the Matrix generated by the user's editing of the rectangular frame, i.e., the target matrix, can be obtained; multiplying the original portrait picture pixels by the Matrix yields the transformed Bitmap after translation, scaling, and rotation.
3. Rendering
The portrait can be previewed in real time after being segmented from the target video, and can also be previewed in real time after the size and position information is edited. For real-time preview, OpenGL ES is used for rendering on the terminal: the RGB data of each frame image is uploaded to the texture unit of the GPU and then rendered through the rendering pipeline of the GPU; finally, the GPU renders the output image data into the screen frame buffer so that it is displayed on the screen.
The GPU has an efficient parallel processing and rendering architecture and is very suitable for image processing and rendering; therefore, rendering with the GPU through the OpenGL ES API can achieve the goal of rendering special effects in real time.
4. Video decoding and synthesis
The video frames are obtained by decoding the video, so that the video can be processed frame by frame. When merging the final video, video synthesis technology, i.e., video encoding, is needed.
Taking the Android platform as an example, video encoding and decoding can be performed based on the MediaCodec module on Android.
The video processing method provided by the embodiments of the present invention can be used in video processing applications of the mobile video and live streaming class to rapidly synthesize multiple videos, improving the efficiency of entertaining video editing. As shown in Figure 15, in use, the user can successively select multiple videos, for example: video 151-1, video 151-2, ..., video 151-m, where video 151-2 is the template video (i.e., the background video). The user clicks the one-key portrait matting interactive interface provided by the video processing application, and the application performs portrait segmentation on each video other than video 151-2, i.e., video 151-1, ..., video 151-m, generating multiple corresponding portrait videos: portrait video 152-1, ..., portrait video 152-m. In a portrait video, the non-portrait background area is transparent. The user can successively adjust the size and relative position of each portrait video relative to the template video (which may include a background) and preview the fusion effect in real time. When the application receives the user's click on the synthesis button, video synthesis is performed: portrait video 152-1, ..., portrait video 152-m are jointly blended into video 151-2 to obtain a synthetic video 153, and the completed synthetic video 153 can be saved locally.
The video matting interface provided by the video processing application may be as shown at 1601 in Figure 16. Window 1602 in the work area is the list of videos whose portraits need to be matted; in addition there are a "matte portrait" button 1603 and a "start editing" button 1604. When the user clicks the "matte portrait" button 1603, the portrait in the currently selected video can be extracted and displayed in real time in the preview area; clicking the "start editing" button 1604 enters the editing interface 1605. The editing interface 1605 is used to edit the relative position and size of the portrait video 1606 and the background video 1607; in addition there are a "replace background video" button 1608 and a "start synthesis" button 1609, where the "replace background video" button 1608 is used to replace the currently selected background video and the "start synthesis" button 1609 is used to start the final video synthesis.
By performing automatic portrait-background segmentation on dynamic videos, the video processing method provided by the embodiments of the present invention plucks out a moving, lively image, allows the user to select an arbitrary background video, and synthesizes the portrait into that video, thereby achieving the seamless fusion of two or even multiple video segments. For example, using two videos of users dancing indoors, the two portraits are plucked out respectively and synthesized into a scene video of another stage, achieving the effect of two people performing together from different places. Since portrait-background segmentation has been performed, the scene of the finally synthesized video is unified; therefore, this scheme introduces more creative space and can fully stimulate the imagination and creativity of users, improving the overall playability and interest of the software.
Illustrate the exemplary structure of software module below, in some embodiments, as shown in Fig. 2, in video process apparatusSoftware module may include:
First acquisition unit 2401, for obtaining target video;
Cutting unit 2402, for the cutting operation in response to being directed to target object in the target video, from the meshIt marks in video and obtains using the target object as the foreground video of prospect;The foreground video includes at least one foreground videoFrame;
a second acquisition unit 2403, configured to obtain a background video; the background video includes at least one background video frame;
a synthesis unit 2404, configured to, in response to a synthesis operation directed at the foreground video and the background video, superimpose the foreground video frames in the foreground video onto the background video frames in the background video, and encapsulate the superimposed video frames as a synthesized video.
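The superimposition performed by the synthesis unit can be sketched as follows. This is a minimal illustration only: frames are modeled as row-major lists of (r, g, b, a) tuples with alpha in [0.0, 1.0], a hypothetical simplification; the embodiment does not prescribe a pixel format or blending rule.

```python
def superimpose_frame(fg_frame, bg_frame):
    """Alpha-blend a (partially transparent) foreground frame over a
    background frame of the same size, returning the composited frame."""
    out = []
    for fg_row, bg_row in zip(fg_frame, bg_frame):
        row = []
        for (fr, fgr, fb, fa), (br, bgr, bb, _) in zip(fg_row, bg_row):
            # standard "over" blend: foreground weighted by its alpha
            row.append((
                round(fr * fa + br * (1.0 - fa)),
                round(fgr * fa + bgr * (1.0 - fa)),
                round(fb * fa + bb * (1.0 - fa)),
                1.0,  # the composited frame is fully opaque
            ))
        out.append(row)
    return out
```

Fully transparent foreground pixels (alpha 0.0) let the background show through unchanged, which is how the matted foreground video composites cleanly onto any background video frame.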
In some embodiments, the cutting unit 2402 is further configured to:
receive a batch splitting operation directed at at least two target videos; and, in response to the batch splitting operation, obtain from each target video a video clip with the target object as the foreground and determine it as the corresponding foreground video.
In some embodiments, the synthesis unit 2404 is further configured to:
receive a batch synthesis operation directed at at least two foreground videos and the background video; and, in response to the batch synthesis operation, superimpose the foreground video frames in the at least two foreground videos respectively onto the background video frames in the background video.
In some embodiments, the second acquisition unit 2403 is further configured to:
load and display a video selection window containing candidate background videos; receive a video selection operation directed at the video selection window; and obtain the background video selected by the video selection operation.
In some embodiments, the apparatus further includes a preview unit, configured to:
in response to a preview operation directed at the foreground video and the background video, present the superimposition effect of the foreground video frames and the background video frames.
In some embodiments, the cutting unit 2402 is further configured to:
identify the target area where the target object is located in the video frames of the target video, and make the area other than the target area in the video frames transparent; and encapsulate the video frames after the transparency processing as the foreground video.
In some embodiments, the cutting unit 2402 is further configured to:
identify the target area where the target object is located in the video frames of the target video, and obtain, according to the target area, an image matrix corresponding to each video frame of the target video, where the elements in the image matrix respectively characterize the probability that the pixels of the corresponding video frame belong to the target area; and perform mask processing of the image matrix with the corresponding video frame, so as to make the area other than the target area in the video frame transparent.
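The mask processing described above can be sketched as follows. The probability matrix stands in for the output of a segmentation model; the threshold value is a hypothetical parameter for illustration, not one taken from the embodiment.

```python
def mask_frame(frame, prob_matrix, threshold=0.5):
    """Attach an alpha channel to an (r, g, b) frame: pixels whose
    probability of belonging to the target area is at least the
    threshold keep that probability as alpha, all others become
    fully transparent."""
    out = []
    for pixel_row, prob_row in zip(frame, prob_matrix):
        row = []
        for (r, g, b), p in zip(pixel_row, prob_row):
            alpha = p if p >= threshold else 0.0  # hard cut outside target
            row.append((r, g, b, alpha))
        out.append(row)
    return out
```

Keeping the raw probability as alpha (rather than a binary 0/1 mask) gives soft edges around the portrait, which blend more naturally when the frame is later superimposed onto a background frame.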
In some embodiments, the synthesis unit 2404 is further configured to:
obtain the timestamp alignment relationship between the foreground video frames and the background video frames; and superimpose the foreground video frames in the foreground video onto the background video frames in the background video that satisfy the timestamp alignment relationship.
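One possible timestamp alignment relationship is sketched below: each foreground frame is paired with the latest background frame whose timestamp does not exceed its own. This is an illustrative rule only; the embodiment does not fix a specific alignment relationship.

```python
def align_by_timestamp(fg_frames, bg_frames):
    """fg_frames / bg_frames: lists of (timestamp_ms, frame) tuples
    sorted by timestamp. Returns (foreground, background) frame pairs
    satisfying the alignment relationship."""
    pairs = []
    bi = 0
    for ts, fg in fg_frames:
        # advance to the last background frame not later than ts
        while bi + 1 < len(bg_frames) and bg_frames[bi + 1][0] <= ts:
            bi += 1
        pairs.append((fg, bg_frames[bi][1]))
    return pairs
```

Such a rule matters when the two videos have different frame rates: a 25 fps foreground and a 30 fps background cannot be paired index-by-index without drifting out of sync.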
In some embodiments, the synthesis unit 2404 is further configured to:
in response to an edit operation that sets synthesis parameters for the foreground video and the background video, overlay the foreground video frames onto the background video frames, such that the coverage area of each foreground video frame in the background video frame satisfies the set synthesis parameters.
In some embodiments, the synthesis unit 2404 is further configured to:
construct an initial matrix of the same size as the foreground video frames; and adjust the elements in the initial matrix according to the edit operation, obtaining a target matrix characterizing the variation of the synthesis parameters.
In some embodiments, the synthesis unit 2404 is further configured to:
multiply the target matrix with the foreground video frames in the foreground video to obtain adjusted foreground video frames; and overlay the adjusted foreground video frames onto the background video frames.
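The target-matrix adjustment can be sketched as follows: an initial all-ones matrix of the same size as the foreground frame has its elements adjusted by the edit operation, then is multiplied element-wise with the frame. Here the adjusted synthesis parameter is a uniform opacity value applied to the alpha channel — a hypothetical choice for illustration; the embodiment leaves the parameter and multiplication concrete form open.

```python
def build_target_matrix(height, width, opacity=1.0):
    """Initial all-ones matrix, adjusted here by a single uniform
    opacity value carried in every element."""
    return [[opacity] * width for _ in range(height)]

def adjust_foreground_frame(frame, target_matrix):
    """Element-wise multiply: each pixel's alpha is scaled by the
    corresponding target-matrix element, yielding the adjusted
    foreground frame to overlay onto the background frame."""
    return [
        [(r, g, b, a * t) for (r, g, b, a), t in zip(pixel_row, t_row)]
        for pixel_row, t_row in zip(frame, target_matrix)
    ]
```

Because the matrix has one element per pixel, non-uniform edits (for example, fading only the edges of the foreground) fit the same multiply without changing the overlay code.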
As an example of hardware implementation, the method provided in the embodiment of the present invention may be executed directly by a processor 410 in the form of a hardware decoding processor, for example, by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field-Programmable Gate Arrays (FPGAs), or other electronic elements.
The embodiment of the present invention provides a storage medium storing executable instructions; when the executable instructions are executed by a processor, the processor is caused to execute the method provided in the embodiment of the present invention, for example, the method shown in Fig. 3.
In some embodiments, executable instruction can use program, software, software module, the form of script or code,By any form of programming language (including compiling or interpretative code, or declaratively or process programming language) write, and itsIt can be disposed by arbitrary form, including be deployed as independent program or be deployed as module, component, subroutine or be suitble toCalculate other units used in environment.
As an example, executable instruction can with but not necessarily correspond to the file in file system, can be stored inA part of the file of other programs or data is saved, for example, being stored in hypertext markup language (HTML, Hyper TextMarkup Language) in one or more scripts in document, it is stored in the single file for being exclusively used in discussed programIn, alternatively, being stored in multiple coordinated files (for example, the file for storing one or more modules, subprogram or code section).
As an example, executable instruction can be deployed as executing in a calculating equipment, or it is being located at one placeMultiple calculating equipment on execute, or, be distributed in multiple places and by multiple calculating equipment of interconnection of telecommunication networkUpper execution.
In conclusion, through the embodiment of the present invention, a foreground video with the target object as the foreground is split out from the target video, the foreground video frames of the split foreground video are taken as the foreground and the background video frames of the background video as the background to synthesize video frames, and the synthesized video frames are encapsulated as a synthesized video, so that dynamic video is synthesized based on video content and a dynamic video with coordinated picture content is obtained. Based on a one-key operation on the display interface, batch target object segmentation is performed on multiple target videos. Moreover, an editing interface is provided to the user, and based on the user's edit operations, the position and imaging size of the foreground video relative to the background video are edited.
The above are merely embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modifications, equivalent replacements, and improvements made within the spirit and scope of the present invention are all included within the protection scope of the present invention.