Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, embodiments of the present application are described in further detail below with reference to the accompanying drawings.
An embodiment of the present application provides a video generation method: a display window is arranged in a background image, a foreground image (obtained from a network or locally from a terminal) and a recording object are displayed in the display window to obtain a composite image, and a recorded video is formed from multiple frames of composite images.
In the technical solution provided by the embodiments of the present application, the execution subject of each step may be a terminal. The terminal may be a smartphone, a tablet computer, a laptop, a desktop computer, or the like. In some embodiments, a recording application is installed in the terminal, and the execution subject of each step may also be the recording application.
Referring to fig. 1, a flowchart of a video generation method according to an embodiment of the present application is shown. The method may include the following steps:
Step 101: collect a background image.
The background image includes a target object, which depicts a recording object. The recording object may be a person, an animal, an article, a virtual character, or the like, which is not limited in this embodiment of the present application.
The number of background images may be one or more; the number of background images is not limited in the embodiments of the present application. The terminal is provided with hardware or software having an image capture function, through which images are captured.
In some embodiments, the terminal is configured with hardware having an image capture function (such as a camera), through which the terminal captures the background image. It should be noted that the hardware with the image capture function may be hardware of the terminal itself, or hardware independent of the terminal and connected to it by a wired or wireless connection.
In some embodiments, software with image capture functionality (such as screen recording software) is provided, through which the terminal captures background images. A background image acquired by the screen recording software is the content displayed on the terminal screen.
In other embodiments, the terminal captures images through the camera and the screen recording software simultaneously, and synthesizes the images captured by the two to obtain the background image.
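By way of illustration only, the following minimal Python sketch shows one way such capture and synthesis might look; OpenCV for the camera, the mss library for screen capture, and the equal-weight blend are assumptions of the sketch, not part of the embodiment.

```python
import cv2
import numpy as np
from mss import mss

def capture_background(use_camera=True, use_screen=True):
    """Capture one background frame from the camera, the screen, or a blend of both."""
    cam_frame = screen_frame = None
    if use_camera:
        cap = cv2.VideoCapture(0)               # default camera
        ok, cam_frame = cap.read()
        cap.release()
        cam_frame = cam_frame if ok else None
    if use_screen:
        with mss() as sct:
            raw = sct.grab(sct.monitors[1])     # primary monitor
            screen_frame = np.array(raw)[:, :, :3]  # drop alpha: BGRA -> BGR
    if cam_frame is not None and screen_frame is not None:
        # One possible synthesis: resize the screen grab to the camera frame
        # and blend the two sources with equal weights.
        screen_frame = cv2.resize(screen_frame,
                                  (cam_frame.shape[1], cam_frame.shape[0]))
        return cv2.addWeighted(cam_frame, 0.5, screen_frame, 0.5, 0)
    return cam_frame if cam_frame is not None else screen_frame
```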
Step 102: obtain a foreground image.
The number of foreground images may be one or more; the number of foreground images is not limited in the embodiments of the present application. In one possible implementation, the number of foreground images is one, that is, every composite image is synthesized using the same foreground image. In another possible implementation, the number of foreground images equals the number of background images, that is, each composite image is synthesized using a different foreground image.
In some embodiments, the foreground image is one frame of a foreground video: the terminal first acquires the foreground video and then extracts the foreground image from it. The terminal may obtain the foreground video from a local preset storage path, or from the network; for example, the terminal may obtain the foreground video from a server corresponding to the recording application.
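For instance, a hedged sketch of frame extraction with OpenCV; the frame index and the ability to pass either a local path or a network URL are assumptions of the sketch:

```python
import cv2

def extract_foreground_frame(video_path, frame_index=0):
    """Read one frame of the foreground video to serve as the foreground image."""
    cap = cv2.VideoCapture(video_path)           # local path or network URL
    cap.set(cv2.CAP_PROP_POS_FRAMES, frame_index)
    ok, frame = cap.read()
    cap.release()
    return frame if ok else None
```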
The order in which the background image and the foreground image are acquired is not limited in the embodiments of the present application. The terminal may capture the background image first and then obtain the foreground image, obtain the foreground image first and then capture the background image, or do both simultaneously.
When the technical solution provided by the embodiments of the present application is applied to a live streaming scenario, the terminal may first acquire the foreground video and extract the foreground image from it, and then capture the background image. When the technical solution is applied to a short-video recording scenario, the terminal may first capture the background image, and then acquire the foreground video and extract the foreground image from it.
Step 103: display the foreground image in a target window in the background image to obtain a composite image.
The target window includes the target object. The target window is a preset region of the background image used to display the foreground image and the target object. The number of composite images may be determined by the number of background images; for example, the two numbers are the same.
The composition of the composite image may be determined by the size of the target window. When the target window is the same size as the display window used to display the background image, the composite image consists of the foreground image and the target object. When the target window is smaller than that display window, the composite image consists of the foreground image, the target object, and the part of the background outside the target window.
In some embodiments, step 103 may be implemented as the following sub-step: replace the display content in the region of the target window other than the target object with the foreground image.

The terminal replaces the display content in the region of the target window other than the target object with the foreground image, thereby obtaining the composite image.
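A minimal sketch of this replacement sub-step, assuming a single-channel mask of the target object (such as the mask produced in step 202 of the next embodiment) and a window given as (x, y, w, h); both are assumptions of the sketch:

```python
import cv2
import numpy as np

def composite_by_replacement(background, foreground, window, mask):
    """Inside the target window, keep the target object's pixels and replace
    everything else with the foreground image; outside the window the
    background is left untouched.

    window: (x, y, w, h) of the target window in background coordinates
    mask:   single-channel float mask, 1.0 on the target object, 0.0 elsewhere
    """
    x, y, w, h = window
    out = background.copy()
    fg = cv2.resize(foreground, (w, h)).astype(np.float32)
    roi = out[y:y + h, x:x + w].astype(np.float32)
    m = mask[y:y + h, x:x + w, None]             # broadcast over color channels
    out[y:y + h, x:x + w] = (m * roi + (1.0 - m) * fg).astype(np.uint8)
    return out
```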
In other embodiments, step 103 may instead be implemented as the following sub-step: display the foreground image overlaid on top of the display content in the target window.

The transparency of the area in which the foreground image overlaps the target object is a preset value, which can be set according to actual requirements. For example, if the preset value is 1, the overlapping area of the foreground image and the target object is completely transparent, and the target object shows through it.
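A corresponding sketch of the overlay variant, under the same assumptions; here `preset` is the transparency of the overlap with the target object (1.0 means fully transparent, so the object shows through):

```python
import cv2
import numpy as np

def composite_by_overlay(background, foreground, window, mask, preset=1.0):
    """Overlay the foreground on the window's display content; in the area
    overlapping the target object the foreground's transparency is `preset`."""
    x, y, w, h = window
    out = background.copy()
    fg = cv2.resize(foreground, (w, h)).astype(np.float32)
    roi = out[y:y + h, x:x + w].astype(np.float32)
    m = mask[y:y + h, x:x + w, None]             # 1.0 on the target object
    fg_alpha = 1.0 - m * preset                  # foreground stays opaque off the object
    out[y:y + h, x:x + w] = (fg_alpha * fg + (1.0 - fg_alpha) * roi).astype(np.uint8)
    return out
```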
Step 104: generate a target video from the composite images.
The terminal arranges the multiple composite images frame by frame to obtain the target video.
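For example, the composite frames might be encoded with OpenCV's VideoWriter; the codec and frame rate are assumptions of the sketch:

```python
import cv2

def write_target_video(frames, out_path, fps=30):
    """Encode the sequence of composite images as the target video."""
    h, w = frames[0].shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for frame in frames:
        writer.write(frame)
    writer.release()
```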
In summary, in the technical solution provided by this embodiment of the present application, a display window is set in the background image, and the foreground image, obtained from the network or locally from the terminal, is displayed in the display window together with the recording object to obtain a composite image; multiple frames of composite images form the recorded video.
Please refer to fig. 2, which shows a flowchart of a video generation method provided by an embodiment of the present application. The method comprises the following steps:
Step 201: capture a background image.
The background image includes a target object.
Referring also to fig. 3, a schematic diagram of an interface for generating a composite image according to an embodiment of the present application is shown. The background image 31 includes a target object 311.
Step 202: mark the target object in the background image.
To avoid occluding the target object with the foreground image when generating the composite image, the target object needs to be marked before the images are combined.
In some embodiments, the terminal marks the target object in the background image through the following sub-steps (an illustrative sketch of both sub-steps is given after the figure reference below):
(1) Perform graying processing on the background image to obtain a grayed background image.

Graying converts the background image into a grayscale image, i.e., an image in which each pixel is represented by an 8-bit gray value (0-255); the gray value represents the shade of the color.

In this embodiment of the present application, the terminal may gray the background image using a method such as the averaging method, the maximum-minimum averaging method, or the weighted averaging method, to obtain the grayed background image.
(2) Perform image segmentation on the grayed background image to obtain a background image with the target object marked.

The background image with the target object marked may be a single-channel mask image in which the alpha value of the target object is 1 (completely opaque), the alpha value of the background regions other than the target object is 0 (completely transparent), and the alpha values at the edges of the target object lie between 0 and 1.

When the target object is a portrait or a virtual character, the grayed background image may be segmented using a human-body segmentation technique. In this embodiment of the present application, the algorithm used to segment the grayed background image may be an image edge segmentation algorithm, an image threshold segmentation algorithm, a region-based segmentation algorithm, a morphological watershed algorithm, or the like, which is not limited in this embodiment of the present application.
Referring also to fig. 3, the terminal marks the target object 311 in the background image 31.
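A combined sketch of sub-steps (1) and (2); Otsu thresholding stands in for whichever of the segmentation algorithms named above is actually used, and the Gaussian blur that softens the object's edges is likewise an assumption of the sketch:

```python
import cv2
import numpy as np

def mark_target_object(background, method="weighted"):
    """Gray the background image, then segment it into a single-channel mask:
    1.0 on the target object, 0.0 elsewhere, soft values on the edges."""
    # Sub-step (1): graying by one of the three named channel combinations.
    b, g, r = (background[..., i].astype(np.float32) for i in range(3))
    if method == "average":
        gray = (b + g + r) / 3.0
    elif method == "max_min":
        gray = (np.maximum(np.maximum(b, g), r) +
                np.minimum(np.minimum(b, g), r)) / 2.0
    else:  # weighted average, using the BT.601 luma coefficients
        gray = 0.114 * b + 0.587 * g + 0.299 * r
    gray = gray.astype(np.uint8)

    # Sub-step (2): threshold segmentation (Otsu) as one illustrative choice.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    mask = binary.astype(np.float32) / 255.0
    # Blurring pushes the edge values between 0 and 1, as described above.
    return cv2.GaussianBlur(mask, (5, 5), 0)
```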
Step 203: obtain a foreground image.

Step 204: set display parameters of the target window.
The display parameters of the target window include at least one of: length, width, and vertex coordinates. The display parameters may be determined according to the position of the target object in the background image and the size of the target object, and may be set by default by the terminal or customized by the user, which is not limited in this embodiment of the present application.

Specifically, the length and width of the target window may be determined by the size of the target object. In some embodiments, the length of the target window is greater than the length of the target object and less than or equal to the length of the background image, and the width of the target window is greater than the width of the target object and less than or equal to the width of the background image. The vertex coordinates of the target window may be determined by the position of the target object in the background image. A sketch of this computation is given after step 205 below.
Step 205: determine the target window according to its display parameters.
The terminal can uniquely determine the target window according to the display parameters of the target window.
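As noted above, here is a hedged sketch of deriving the target window from the marked object; the margin value and the bounding-box approach are illustrative assumptions, and the mask is assumed non-empty:

```python
import numpy as np

def window_from_mask(mask, image_shape, margin=40):
    """Derive the target window (x, y, w, h) from the target object's bounding
    box, padded by `margin` pixels and clipped to the background image."""
    ys, xs = np.nonzero(mask > 0.5)              # pixels belonging to the object
    img_h, img_w = image_shape[:2]
    x = max(0, int(xs.min()) - margin)
    y = max(0, int(ys.min()) - margin)
    w = min(img_w, int(xs.max()) + margin) - x
    h = min(img_h, int(ys.max()) + margin) - y
    return x, y, w, h
```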
Step 206: display the foreground image in the target window in the background image to obtain a composite image.
The target window includes the target object. Referring also to fig. 3, the terminal displays the foreground image 32 and the target object 311 in the target window 312 in the background image 31, resulting in the composite image 33.
Step 207: generate a target video from the composite images.
In summary, in the technical solution provided by this embodiment of the present application, a display window is set in the background image, and the foreground image, obtained from the network or locally from the terminal, is displayed in the display window together with the recording object to obtain a composite image; multiple frames of composite images form the recorded video.
In the following, embodiments of the apparatus of the present application are described, and for portions of the embodiments of the apparatus not described in detail, reference may be made to technical details disclosed in the above-mentioned method embodiments.
Referring to fig. 4, a block diagram of a video generation apparatus according to an exemplary embodiment of the present application is shown. The video generation apparatus may be implemented as all or part of the terminal in software, hardware, or a combination of the two. The apparatus includes: an image acquisition module 401, an image obtaining module 402, an image synthesis module 403, and a video generation module 404.
The image acquisition module 401 is configured to capture a background image.

The image obtaining module 402 is configured to obtain a foreground image.

The image synthesis module 403 is configured to display the foreground image in a target window in the background image to obtain a composite image, where the target window includes the target object.

The video generation module 404 is configured to generate a target video from the composite image.
In summary, in the technical solution provided by this embodiment of the present application, a display window is set in the background image, and the foreground image, obtained from the network or locally from the terminal, is displayed in the display window together with the recording object to obtain a composite image; multiple frames of composite images form the recorded video.
In an alternative embodiment based on the embodiment shown in fig. 4, the image synthesis module 403 is configured to replace the display content in the region of the target window other than the target object with the foreground image.
In an optional embodiment based on the embodiment shown in fig. 4, the image synthesis module 403 is configured to overlay the foreground image on top of the display content in the target window, where the transparency of the area in which the foreground image overlaps the target object is a preset value.
In an optional embodiment based on the embodiment shown in fig. 4, the apparatus further includes an object marking module (not shown in fig. 4), configured to mark the target object in the background image.
Optionally, the object marking module is configured to: perform graying processing on the background image to obtain a grayed background image; and perform image segmentation on the grayed background image to obtain the background image with the target object marked.
In an optional embodiment based on the embodiment shown in fig. 4, the apparatus further includes a window determination module (not shown in fig. 4), configured to: set display parameters of the target window, where the display parameters include at least one of: length, width, and vertex coordinates; and determine the target window according to its display parameters.
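To make the module division concrete, the following hedged sketch wires the illustrative helper functions from the method embodiments above to the four modules; the file names are hypothetical, and the helpers are sketches under the stated assumptions, not the apparatus's actual implementation:

```python
# End-to-end run mirroring modules 401-404, reusing the sketch functions
# defined in the method embodiments above (hypothetical file names).
background = capture_background()                         # image acquisition module 401
mask = mark_target_object(background)                     # object marking module
window = window_from_mask(mask, background.shape)         # window determination module
foreground = extract_foreground_frame("foreground.mp4")   # image obtaining module 402
frame = composite_by_replacement(background, foreground,
                                 window, mask)            # image synthesis module 403
write_target_video([frame], "target.mp4")                 # video generation module 404
```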
It should be noted that, when the apparatus provided in the foregoing embodiments implements its functions, the division into the above functional modules is merely illustrative; in practical applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus embodiments and method embodiments provided above belong to the same concept; their specific implementation processes are described in detail in the method embodiments and are not repeated here.
Fig. 5 shows a block diagram of a terminal 500 according to an exemplary embodiment of the present application. The terminal 500 may be a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 500 may also be referred to by other names, such as user equipment, portable terminal, laptop terminal, or desktop terminal.
In general, the terminal 500 includes a processor 501 and a memory 502.
The processor 501 may include one or more processing cores, for example a 4-core or an 8-core processor. The processor 501 may be implemented in at least one of the following hardware forms: digital signal processor (DSP), field-programmable gate array (FPGA), or programmable logic array (PLA). The processor 501 may also include a main processor and a coprocessor: the main processor is a processor for processing data in the awake state, also called a central processing unit (CPU), while the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 501 may be integrated with a graphics processing unit (GPU), which is responsible for rendering and drawing the content to be displayed on the display screen.
The memory 502 may include one or more computer-readable storage media, which may be non-transitory. The memory 502 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 502 stores a computer program that is executed by the processor 501 to implement the video generation method provided by the method embodiments of the present application.
In some embodiments, the terminal 500 may further optionally include a peripheral interface 503 and at least one peripheral. The processor 501, the memory 502, and the peripheral interface 503 may be connected by buses or signal lines, and each peripheral may be connected to the peripheral interface 503 by a bus, signal line, or circuit board. Specifically, the peripherals include at least one of: a radio frequency circuit 504, a touch display screen 505, a camera assembly 506, an audio circuit 507, a positioning component 508, and a power supply 509.
The peripheral interface 503 may be used to connect at least one input/output (I/O) related peripheral to the processor 501 and the memory 502. In some embodiments, the processor 501, the memory 502, and the peripheral interface 503 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 501, the memory 502, and the peripheral interface 503 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency (RF) circuit 504 is used to receive and transmit RF signals, also called electromagnetic signals. The RF circuit 504 communicates with communication networks and other communication devices via electromagnetic signals: it converts electrical signals into electromagnetic signals for transmission and converts received electromagnetic signals into electrical signals. Optionally, the RF circuit 504 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The RF circuit 504 may communicate with other terminals via at least one wireless communication protocol, including but not limited to: the World Wide Web, metropolitan area networks, intranets, the various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or Wireless Fidelity (WiFi) networks. In some embodiments, the RF circuit 504 may also include circuitry related to Near Field Communication (NFC), which is not limited in the present application.
The display screen 505 is used to display a user interface (UI), which may include graphics, text, icons, video, and any combination thereof. When the display screen 505 is a touch display screen, it also has the ability to capture touch signals on or above its surface; such touch signals may be input to the processor 501 as control signals for processing. In this case, the display screen 505 may also provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 505, arranged on the front panel of the terminal 500; in other embodiments, there may be at least two display screens 505, arranged on different surfaces of the terminal 500 or in a folded design; in still other embodiments, the display screen 505 may be a flexible display arranged on a curved or folded surface of the terminal 500. The display screen 505 may even be arranged in a non-rectangular irregular shape, i.e., a shaped screen. The display screen 505 may be made of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The camera assembly 506 is used to capture images or video. Optionally, the camera assembly 506 includes a front camera and a rear camera. Generally, the front camera is arranged on the front panel of the terminal and the rear camera on its rear surface. In some embodiments, there are at least two rear cameras, each being one of a main camera, a depth-of-field camera, a wide-angle camera, or a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize background blurring, and the main camera and the wide-angle camera can be fused to realize panoramic shooting, virtual reality (VR) shooting, or other fused shooting functions. In some embodiments, the camera assembly 506 may also include a flash, which may be a single-color-temperature flash or a dual-color-temperature flash; a dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuit 507 may include a microphone and a speaker. The microphone collects sound waves from the user and the environment, converts them into electrical signals, and inputs them to the processor 501 for processing or to the RF circuit 504 for voice communication. For stereo collection or noise reduction, multiple microphones may be arranged at different parts of the terminal 500; the microphone may also be an array microphone or an omnidirectional microphone. The speaker converts electrical signals from the processor 501 or the RF circuit 504 into sound waves. The speaker may be a traditional film speaker or a piezoelectric ceramic speaker; a piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans, for purposes such as distance measurement. In some embodiments, the audio circuit 507 may also include a headphone jack.
The positioning component 508 is used to determine the current geographic location of the terminal 500 for navigation or location-based services (LBS). The positioning component 508 may be based on the Global Positioning System (GPS) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
The power supply 509 is used to supply power to the various components in the terminal 500. The power supply 509 may be an alternating current supply, a direct current supply, a disposable battery, or a rechargeable battery. When the power supply 509 includes a rechargeable battery, the battery may be a wired rechargeable battery, charged through a wired line, or a wireless rechargeable battery, charged through a wireless coil. The rechargeable battery may also support fast-charging technology.
In some embodiments, the terminal 500 further includes one or more sensors 510, including but not limited to: an acceleration sensor 511, a gyro sensor 512, a pressure sensor 513, a fingerprint sensor 514, an optical sensor 515, and a proximity sensor 516.
The acceleration sensor 511 may detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with the terminal 500; for example, it may detect the components of gravitational acceleration on the three axes. The processor 501 may control the touch display screen 505 to display the user interface in landscape or portrait view according to the gravity acceleration signal collected by the acceleration sensor 511. The acceleration sensor 511 may also be used to collect motion data of a game or of the user.
The gyro sensor 512 may detect the body direction and rotation angle of the terminal 500 and may cooperate with the acceleration sensor 511 to collect the user's 3D actions on the terminal 500. Based on the data collected by the gyro sensor 512, the processor 501 may implement functions such as motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 513 may be arranged on the side frame of the terminal 500 and/or under the touch display screen 505. When arranged on the side frame, it can detect the user's grip signal on the terminal 500, and the processor 501 performs left/right-hand recognition or shortcut operations according to that grip signal. When arranged under the touch display screen 505, the processor 501 controls the operable controls on the UI according to the user's pressure operations on the touch display screen 505. The operable controls include at least one of a button control, a scroll-bar control, an icon control, and a menu control.
The fingerprint sensor 514 is used to collect the user's fingerprint, and the processor 501 identifies the user's identity from the fingerprint collected by the fingerprint sensor 514, or the fingerprint sensor 514 itself identifies the user's identity from the collected fingerprint. When the identity is recognized as trusted, the processor 501 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and so on. The fingerprint sensor 514 may be arranged on the front, back, or side of the terminal 500; when a physical button or a manufacturer's logo is arranged on the terminal 500, the fingerprint sensor 514 may be integrated with it.
The optical sensor 515 is used to collect ambient light intensity. In one embodiment, the processor 501 may control the display brightness of the touch display screen 505 according to the ambient light intensity collected by the optical sensor 515: when the ambient light intensity is high, the display brightness of the touch display screen 505 is increased; when it is low, the display brightness is decreased. In another embodiment, the processor 501 may also dynamically adjust the shooting parameters of the camera assembly 506 according to the ambient light intensity collected by the optical sensor 515.
The proximity sensor 516, also referred to as a distance sensor, is typically arranged on the front panel of the terminal 500 and is used to measure the distance between the user and the front of the terminal 500. In one embodiment, when the proximity sensor 516 detects that the distance between the user and the front of the terminal 500 gradually decreases, the processor 501 controls the touch display screen 505 to switch from the screen-on state to the screen-off state; when the proximity sensor 516 detects that the distance gradually increases, the processor 501 controls the touch display screen 505 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in fig. 5 does not limit the terminal 500, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
In an exemplary embodiment, there is also provided a computer-readable storage medium having stored therein a computer program, which is loaded and executed by a processor of a terminal to implement the video generation method in the above-described method embodiments.
Alternatively, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided which, when executed, implements the video generation method provided in the above method embodiments.
It should be understood that reference herein to "a plurality" means two or more. "And/or" describes the association relationship of associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates an "or" relationship between the associated objects before and after it. As used herein, the terms "first," "second," and the like do not denote any order, quantity, or importance, but rather are used to distinguish one element from another.
The above-mentioned serial numbers of the embodiments of the present application are merely for description, and do not represent the advantages and disadvantages of the embodiments.
The above description is only exemplary of the application and should not be taken as limiting the application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the application should be included in the protection scope of the application.