Disclosure of Invention
Embodiments of the invention provide a method for synthesizing a recommendation video, and a related product, which realize automatic video synthesis and have the advantages of low cost and a uniform synthesis effect.
In a first aspect, an embodiment of the present invention provides a method for synthesizing a recommendation video, where the method includes the following steps:
acquiring a plurality of sub-videos to be synthesized, wherein the plurality of sub-videos comprise: sub-videos of the same person shot multiple times in the same scene;
extracting a plurality of person images from each sub-video of the plurality of sub-videos, and determining a first range of the person images according to the plurality of person images;
dividing the scene into a left near scene area, a right near scene area, a left far scene area and a right far scene area with the first range as the center; extracting, from the plurality of sub-videos, the n sub-videos with the best effect in the left near scene area, the right near scene area, the left far scene area and the right far scene area; and combining the left near scene area video, the left far scene area video, the right near scene area video and the right far scene area video of the n sub-videos according to the ranges of the 4 areas to obtain a combined video, where n is an integer less than or equal to 4.
Optionally, the first range is the minimum range that includes each person image in the plurality of sub-videos.
Optionally, the dividing the scene into a left near scene area, a right near scene area, a left far scene area, and a right far scene area specifically includes:
determining the sizes of the left near scene area, the right near scene area, the left far scene area and the right far scene area according to the size of the scene.
Optionally, the combining the left near scene area video, the left far scene area video, the right near scene area video and the right far scene area video of the n sub-videos according to the ranges of the 4 areas to obtain a combined video specifically includes:
if n = 4: materializing the left near scene area video of the first sub-video and transparentizing the remaining area of the first sub-video; materializing the left far scene area video of the second sub-video and transparentizing the remaining area of the second sub-video; materializing the right near scene area video of the third sub-video and transparentizing the remaining area of the third sub-video; materializing the right far scene area video of the fourth sub-video and transparentizing the remaining area of the fourth sub-video; and then superimposing the 4 processed sub-videos to obtain the combined video.
In a second aspect, a terminal is provided, which includes: a processor, a communication unit and a display screen,
the communication unit is configured to acquire a plurality of sub-videos to be synthesized, where the plurality of sub-videos include: sub-videos of the same person shot multiple times in the same scene;
the processor is configured to extract a plurality of person images from each sub-video of the plurality of sub-videos and determine a first range of the person images according to the plurality of person images; divide the scene into a left near scene area, a right near scene area, a left far scene area and a right far scene area with the first range as the center; extract, from the plurality of sub-videos, the n sub-videos with the best effect in the left near scene area, the right near scene area, the left far scene area and the right far scene area; and combine the left near scene area video, the left far scene area video, the right near scene area video and the right far scene area video of the n sub-videos according to the ranges of the 4 areas to obtain a combined video, where n is an integer less than or equal to 4.
Optionally, the first range is the minimum range that includes each person image in the plurality of sub-videos.
Optionally, the processor is specifically configured to determine sizes of a left near scene area, a right near scene area, a left far scene area, and a right far scene area according to the size of the scene.
Optionally, the processor is specifically configured to, if n = 4: materialize the left near scene area video of the first sub-video and transparentize the remaining area of the first sub-video; materialize the left far scene area video of the second sub-video and transparentize the remaining area of the second sub-video; materialize the right near scene area video of the third sub-video and transparentize the remaining area of the third sub-video; materialize the right far scene area video of the fourth sub-video and transparentize the remaining area of the fourth sub-video; and then superimpose the 4 processed sub-videos to obtain the combined video.
In a third aspect, a computer-readable storage medium is provided, which stores a program for electronic data exchange, wherein the program causes a terminal to execute the method provided in the first aspect.
The embodiment of the invention has the following beneficial effects:
It can be seen that, in the technical scheme provided by the application, after the plurality of sub-videos to be synthesized are obtained, the plurality of person images are obtained, the first range of the person is determined, the 4 scene areas are determined according to the first range, the best n sub-videos in the 4 scene areas are extracted, and the n sub-videos are then spliced and synthesized according to the scene areas to obtain the video.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of the invention and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a terminal. As shown in fig. 1, the terminal may be an intelligent terminal, specifically a tablet computer, such as an Android tablet computer, an iOS tablet computer, or a Windows Phone tablet computer; the terminal may alternatively be a personal computer, a server, or the like. The terminal includes: a processor 101, a display screen 104, a communication module 102, a memory 103 and an image processor.
The processor 101 is a control center of the terminal, connects various parts of the entire terminal using various interfaces and lines, and performs various functions of the terminal and processes data by operating or executing software programs and/or modules stored in the memory and calling data stored in the memory, thereby monitoring or controlling the terminal as a whole. Alternatively, processor 101 may include one or more processing units; optionally, the processor 101 may integrate an application processor, a modem processor, and an artificial intelligence chip, wherein the application processor mainly processes an operating system, a user interface, an application program, and the like.
Further, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The communication module can be used for receiving and sending information. Typically, the communication module includes, but is not limited to, an antenna, at least one Amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the communication module can also communicate with a network and other devices through wireless communication. The wireless communication may use any communication standard or protocol, such as a mobile communication protocol or a short-range communication protocol (including but not limited to bluetooth, WIFI, etc.).
The image processor may be specifically configured to perform relevant processing on an image (e.g., a video), and in practical applications, the image processor may be integrated into the processor 101.
The display screen may be used to display advertisements, and may specifically be an LCD display screen, but may also be other forms of display screens, such as a touch display screen.
Referring to fig. 2, fig. 2 provides a method for synthesizing a recommendation video, which is executed by the terminal shown in fig. 1. As shown in fig. 2, the method includes the following steps:
Step S201, obtaining a plurality of sub-videos to be synthesized, where the plurality of sub-videos include: sub-videos of the same person shot multiple times in the same scene;
Step S202, extracting a plurality of person images from each sub-video of the plurality of sub-videos, and determining a first range of the person images according to the plurality of person images;
the determining the first range of the person images may specifically include: determining the range of each person image of each sub-video, and superposing the ranges of the person images to obtain the first range, where the first range is the minimum range containing every person image.
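As an illustrative, non-limiting sketch, the superposition of per-frame person-image ranges into a first range amounts to a bounding-box union. The `(x1, y1, x2, y2)` tuple representation below is an assumption of this sketch, not mandated by the embodiment:

```python
def first_range(person_boxes):
    """Minimal rectangle containing every person bounding box.

    Each box is an assumed (x1, y1, x2, y2) tuple; superposing the
    ranges means taking the extreme coordinates over all boxes.
    """
    x1 = min(b[0] for b in person_boxes)
    y1 = min(b[1] for b in person_boxes)
    x2 = max(b[2] for b in person_boxes)
    y2 = max(b[3] for b in person_boxes)
    return (x1, y1, x2, y2)
```

For example, boxes (10, 10, 50, 80) and (20, 5, 60, 70) yield the first range (10, 5, 60, 80).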
Step S203, dividing the scene into a left near scene area, a right near scene area, a left far scene area and a right far scene area with the first range as the center.
The 4 scene areas may be divided differently according to the scene. For a relatively open field, such as an outdoor scene or a large indoor scene (e.g., an airport), the 2 near scene areas may have a smaller range; for example, each near scene area may be the same as the first range, the remaining left area is the left far scene area (the left side being defined by the vertical centerline of the first range), and the remaining right area is the right far scene area. For an ordinary indoor scene, the 2 near scene areas may have a larger range; for example, the first range is multiplied by a factor greater than 1, such as 1.2, the remaining left area is the left far scene area (again divided by the vertical centerline of the first range), and the remaining right area is the right far scene area.
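The division described above can be sketched as follows. This is a simplified illustration that splits only horizontally (full scene height for every area); the 1.2 factor follows the indoor example above, and all other choices are assumptions of the sketch:

```python
def divide_scene(scene_w, scene_h, first_rng, indoor=False):
    """Split the scene into left/right near and far areas around the
    first range (x1, y1, x2, y2); only the horizontal split is modeled.
    """
    x1, y1, x2, y2 = first_rng
    cx = (x1 + x2) / 2          # vertical centerline of the first range
    if indoor:
        # Ordinary indoor scene: widen the near range by a factor > 1.
        half_w = (x2 - x1) * 1.2 / 2
        x1, x2 = cx - half_w, cx + half_w
    left_near = (max(0, x1), 0, cx, scene_h)
    right_near = (cx, 0, min(scene_w, x2), scene_h)
    # Far areas: the remainder on each side of the centerline.
    left_far = (0, 0, max(0, x1), scene_h)
    right_far = (min(scene_w, x2), 0, scene_w, scene_h)
    return left_near, right_near, left_far, right_far
```

For an open scene of width 100 with first range (40, 0, 60, 100), the near areas split the first range at x = 50 and the far areas cover the remaining left and right margins.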
Step S204, extracting, from the plurality of sub-videos, the n sub-videos with the best effect (for example, the highest definition) in the left near scene area, the right near scene area, the left far scene area and the right far scene area, and combining the left near scene area video, the left far scene area video, the right near scene area video and the right far scene area video of the n sub-videos according to the ranges of the 4 areas to obtain a combined video; n is an integer less than or equal to 4.
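The embodiment does not fix a specific definition metric for "best effect". As one hypothetical proxy, the variance of a Laplacian response is a common sharpness measure; a minimal pure-Python sketch, with 2-D lists standing in for grayscale frames:

```python
def sharpness(gray):
    """Variance of a simple 4-neighbor Laplacian response: a common
    proxy for definition (edge clarity). `gray` is a 2-D intensity list.
    """
    h, w = len(gray), len(gray[0])
    vals = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            vals.append(lap)
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def best_sub_video(crops):
    """Index of the sub-video crop with the highest sharpness score."""
    return max(range(len(crops)), key=lambda i: sharpness(crops[i]))
```

A flat (featureless) crop scores zero, so a crop with visible detail in the area is preferred.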
According to the technical scheme, after the plurality of sub-videos to be synthesized are obtained, the plurality of person images are obtained, the first range of the person is determined, the 4 scene areas are determined according to the first range, the best n sub-videos in the 4 scene areas are extracted, and the n sub-videos are then spliced and synthesized according to the scene areas to obtain the video. Automatic synthesis of the video is thus achieved. In addition, dividing the sub-videos into 4 scene areas allows the strengths of the plurality of sub-videos to be better exploited, improving the video synthesis effect, so the method has the advantages of low cost and a uniform synthesis effect.
If n = 4, the video combination may specifically include: materializing the left near scene area video of the first sub-video and transparentizing the remaining area of the first sub-video; materializing the left far scene area video of the second sub-video and transparentizing the remaining area of the second sub-video; materializing the right near scene area video of the third sub-video and transparentizing the remaining area of the third sub-video; materializing the right far scene area video of the fourth sub-video and transparentizing the remaining area of the fourth sub-video; and then superimposing the 4 processed sub-videos to obtain the combined video. Here, definition refers to the clarity of each detail and its boundary in the image.
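The materialize/transparentize-and-superimpose step amounts to masking each sub-video to its assigned area and stacking the results. A simplified single-frame sketch, assuming rectangular areas that tile the scene (2-D lists stand in for frames):

```python
def composite(frames, areas):
    """Superimpose frames: for frame i, pixels inside areas[i] remain
    opaque (materialized); pixels outside are transparent, so the
    stacked output takes each pixel from exactly one frame.
    Areas are (x1, y1, x2, y2) rectangles assumed to tile the scene.
    """
    h, w = len(frames[0]), len(frames[0][0])
    out = [[None] * w for _ in range(h)]
    for frame, (x1, y1, x2, y2) in zip(frames, areas):
        for y in range(y1, y2):
            for x in range(x1, x2):
                out[y][x] = frame[y][x]
    return out
```

Since the 4 areas cover the scene without overlap, every output pixel is filled by exactly one of the processed sub-videos.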
The method for determining the first range may specifically include:
determining the face range of a face through a face recognition algorithm, and setting, with the face range as a reference, 1 chest region (a rectangle), a left-hand region (a rectangle), a right-hand region (a rectangle) and a two-leg region (a rectangle). The RGB value of each pixel point in the chest region is extracted, the numbers of identical RGB values are counted, and the most frequent RGB value is determined as the first RGB value. Adjacent pixel points having the first RGB value are connected to obtain a first pixel frame. If the first pixel frame is closed, the region within the first pixel frame is determined to be the trimmed chest region. If the first pixel frame is discontinuous, the distances of the broken (non-frame) line segments are determined; if the distance of each broken line segment is smaller than a set threshold, the broken line segments are connected with straight lines to obtain a closed second pixel frame, and the second pixel frame is taken as the trimmed chest region.
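The first part of this trimming step, counting identical RGB values and keeping the most frequent one, can be sketched as follows. The gap-closing of a discontinuous pixel frame is omitted, and all names are illustrative:

```python
from collections import Counter

def dominant_color_mask(region):
    """Trim a rectangular chest region: count identical RGB values
    (tuples), keep the most frequent one (the "first RGB value"),
    and mark its pixels. The boundary of the returned mask plays the
    role of the first pixel frame described in the text.
    """
    counts = Counter(px for row in region for px in row)
    top_rgb, _ = counts.most_common(1)[0]
    mask = [[px == top_rgb for px in row] for row in region]
    return mask, top_rgb
```

Pixels sharing the dominant RGB value are retained; everything else in the rectangle is discarded by the trim.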
The 3 limb areas are divided into the left-hand area, the right-hand area and the two-leg area. For the two-leg area, the trimming method of the chest region can be used to obtain the trimmed two-leg area;
the left-hand area trimming method specifically includes:
extracting the RGB value of each pixel point in the left-hand area, counting the numbers of identical RGB values, and determining the most frequent RGB value (the first RGB value) and the second most frequent RGB value (the second RGB value); connecting adjacent pixel points having the first RGB value to obtain a first pixel frame, and connecting adjacent pixel points having the second RGB value to obtain a second pixel frame; and if the first pixel frame and the second pixel frame are both closed and the first pixel frame is connected with the second pixel frame, determining that the first pixel frame and the second pixel frame enclose the trimmed left-hand area. The trimmed right-hand area is obtained in the same way.
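Analogously, a hand-region sketch can keep the two most frequent RGB values (e.g., skin and sleeve) and check that their pixel sets touch, as a rough stand-in for the two connected closed pixel frames described above. All names and the 4-adjacency test are assumptions of this sketch:

```python
from collections import Counter

def trim_hand_region(region):
    """Keep the pixels of the two most frequent RGB values; accept
    the trim only if the two pixel sets are 4-adjacent somewhere
    (approximating the "first frame connected with second frame" test).
    Returns a boolean mask, or None if the two sets do not touch.
    """
    counts = Counter(px for row in region for px in row)
    (c1, _), (c2, _) = counts.most_common(2)
    h, w = len(region), len(region[0])
    touching = any(
        region[y][x] == c1 and region[ny][nx] == c2
        for y in range(h) for x in range(w)
        for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1))
        if 0 <= ny < h and 0 <= nx < w
    )
    mask = [[px in (c1, c2) for px in row] for row in region]
    return mask if touching else None
```

If the two dominant-color regions are disjoint, the trim is rejected, mirroring the connectedness requirement above.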
The determination of the main character's range is only an approximate confirmation: when the recommendation video is cut, only the approximate range of the main character needs to be determined, and a refined determination is unnecessary, because the scene and the person in the source videos are the same; the range determination can therefore be processed directly.
Referring to fig. 3, fig. 3 provides a terminal including: a processor 301, a communication unit 302 and a display screen 303;
the communication unit is configured to acquire a plurality of sub-videos to be synthesized, where the plurality of sub-videos include: sub-videos of the same person shot multiple times in the same scene;
the processor is configured to extract a plurality of person images from each sub-video of the plurality of sub-videos and determine a first range of the person images according to the plurality of person images; divide the scene into a left near scene area, a right near scene area, a left far scene area and a right far scene area with the first range as the center; extract, from the plurality of sub-videos, the n sub-videos with the best effect in the left near scene area, the right near scene area, the left far scene area and the right far scene area; and combine the left near scene area video, the left far scene area video, the right near scene area video and the right far scene area video of the n sub-videos according to the ranges of the 4 areas to obtain a combined video, where n is an integer less than or equal to 4.
Embodiments of the present invention also provide a computer storage medium, where the computer storage medium stores a computer program for electronic data exchange, and the computer program enables a computer to execute part or all of the steps of any one of the methods for synthesizing a recommendation video as described in the above method embodiments.
Embodiments of the present invention also provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods for synthesizing a recommendation video as set forth in the above method embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that the acts and modules illustrated are not necessarily required to practice the invention.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implementing, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some interfaces, devices or units, and may be an electric or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software program module.
The integrated units, if implemented in the form of software program modules and sold or used as stand-alone products, may be stored in a computer readable memory. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned memory comprises: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable memory, which may include: flash Memory disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The above embodiments of the present invention are described in detail, and the principle and the implementation of the present invention are explained by applying specific embodiments, and the above description of the embodiments is only used to help understanding the method of the present invention and the core idea thereof; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.