Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below, and it should be understood that the described embodiments are only a part of the embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
To explain the present application in more detail, the video rendering method in a video conference, the apparatus, the terminal device, and the computer storage medium provided by the present application are described below with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 shows a schematic diagram of an application scenario of the video rendering method in a video conference provided by an embodiment of the present application. The application scenario includes a terminal device 100 provided by an embodiment of the present application. The terminal device 100 may be any of various electronic devices having a display screen (such as the devices 102, 104, 106, and 108), including but not limited to a smart phone and a computer device, where the computer device may be at least one of a desktop computer, a portable computer, a laptop computer, a tablet computer, and the like. A user may start or join a network conference through an application installed on the terminal device 100 (for example, a conference application). The terminal device 100 may receive the video images of the other participants and, by executing the video rendering method in a video conference according to the present application, display the video pictures of all the participants; for the specific process, refer to the method embodiments below.
The terminal device 100 described above is generally one of a plurality of terminal devices; this embodiment is illustrated with the terminal device 100 only. Those skilled in the art will appreciate that the number of terminal devices may be greater or fewer: there may be only a few, or tens, hundreds, or more. The number and type of terminal devices are not limited in the embodiments of the present application. The terminal device 100 may be configured to execute the video rendering method in a video conference provided in the embodiments of the present application.
Based on this, an embodiment of the present application provides a video rendering method in a video conference. Referring to fig. 2, fig. 2 is a schematic flowchart of a video rendering method in a video conference according to an embodiment of the present application. Taking the method as applied to the terminal device in fig. 1 as an example, the method includes the following steps:
step S110, acquiring multiple paths of video images.
The video conference is a form of live network streaming and can be applied to multi-user online conferences of an enterprise. A video image is a picture image contained in the video stream generated by a participant of the video conference. The number of paths of video images is usually related to the number of participants: each participant generates one corresponding path of video image.
In general, when the terminal device used by a participant receives the video streams of the other participants, it needs to process and render the video streams in order to display the video pictures. The processing of the video streams specifically includes the following steps: acquiring multiple paths of video streams; decoding each path of video stream to obtain each path of video frame data; and performing texture processing on each path of video frame data to obtain each path of video image.
Image rendering is mainly performed in the GPU, and the GPU processes graphics in the form of textures. In addition, a video stream carries a large amount of data and must be encoded for transmission. Therefore, in this embodiment, after acquiring the multiple paths of video streams in the video conference, the terminal device decodes them to form multiple pieces of video frame data, and then converts each piece of video frame data in turn into a texture that the GPU can use, recorded as one video image.
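The stream-processing steps above (acquire, decode, convert to texture) can be sketched as follows. This is an illustrative model only: `decode_stream` and `to_texture` are hypothetical stand-ins for a real hardware decoder and a GPU texture upload, since the actual APIs are platform-specific.

```python
def decode_stream(stream):
    # Hypothetical decode step: produce one frame-data record per encoded packet.
    return [{"participant": stream["participant"], "frame": pkt}
            for pkt in stream["packets"]]

def to_texture(frame_data):
    # Hypothetical texture-processing step: tag the frame data as GPU-usable.
    return {"texture_of": frame_data["participant"], "data": frame_data["frame"]}

def process_streams(streams):
    """Acquire multiple paths of streams, decode each path, then convert
    each piece of video frame data into a texture (a video image)."""
    images = []
    for stream in streams:
        for frame_data in decode_stream(stream):
            images.append(to_texture(frame_data))
    return images

# Three participants, each producing one path of video stream.
streams = [{"participant": i, "packets": ["p0", "p1"]} for i in range(3)]
images = process_streams(streams)  # one video image per decoded frame
```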
Step S120, at least one view container is established.
Wherein the number of view containers is less than the number of paths of the video images.
Step S130, placing the multiple paths of video images in the at least one view container for rendering according to the layout information and the view container identification information of the multiple paths of video images, so as to form multiple video pictures.
Wherein the video images in each view container do not overlap or obscure each other.
In particular, a view container refers to a user interface component used to process or render images, thereby forming a video frame.
In an alternative embodiment, the view container may be a View. A View generally refers to a component that displays, and occupies, a rectangular area on the screen, and is mainly responsible for rendering and event processing. A View can be regarded as a container through which a display device, such as a terminal device, presents picture content; each View itself consumes system resources.
At present, video is rendered in a GPU hardware mode, and one video is generally rendered with one View, so that each video rendering needs a dedicated thread bound to an OpenGL environment: N videos need N threads. Thus, as the number of videos increases, the number of Views and the system resources consumed by the corresponding threads multiply, which easily causes resource waste. Therefore, at least one view container is provided in this embodiment, and the number of view containers is smaller than the number of paths of video streams in the conference. Multiple paths of video images can thus be rendered with a smaller number of view containers; that is, multiple video images can be placed in the same View for rendering, so as to form multiple video pictures. The specific process is as follows: one part of the received multiple paths of video images is rendered by the GPU on the layer where one View is located, and the other part is rendered by the GPU on the layer where another one or more Views are located; finally, the content in each View (i.e. the video pictures) is composited by the system onto the display screen of the terminal device for display. Referring specifically to fig. 3, of the two view containers (i.e. Views) in fig. 3, the box formed by the upper row of small windows (i.e. Video 1 to Video 7) represents the first view container, and the box formed by the large window below (i.e. Video N) represents the second view container. Fig. 3 thus shows N video pictures divided into two rows: the upper row of small windows contains the video pictures obtained by rendering the video images of one part of the participants (i.e. N-1 participants) in the first view container, and the lower large window is the video picture formed by rendering the video image of the remaining participant (i.e. the N-th participant) in the second view container.
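The fig. 3 arrangement can be modeled as follows. This is an illustrative sketch only: the container numbering and the (N-1)/1 split are taken from the figure, while the function name is our own, not the application's actual rendering code.

```python
def assign_to_containers(video_ids):
    """Model of the fig. 3 layout: all but the last video go to view
    container 0 (the upper row of small windows); the last video goes
    to view container 1 (the lower large window)."""
    return {0: video_ids[:-1], 1: video_ids[-1:]}

# Eight videos rendered with only two view containers (instead of 8 Views,
# each needing its own OpenGL-bound thread).
containers = assign_to_containers([f"Video{i}" for i in range(1, 9)])
```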
In an alternative embodiment, the multiple paths of video images can be placed anywhere in any View, as long as the video images do not overlap or occlude one another. In actual use, however, for the regularity and aesthetics of the video picture display, the placement positions and sizes of the video images may be laid out; for example, the video images may be arranged in the View horizontally or vertically, thereby arranging the display positions and sizes of the video pictures.
Further, the placement position and size of a video image in the View can be adjusted, so as to typeset or adjust the size and display position of each video picture; for example, the order and size of any of the video pictures in fig. 3 can be adjusted by sliding left and right.
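A horizontal arrangement like the one described can be computed from the container width. The sketch below assumes an equal-width single-row tiling scheme; the tiling rule is our own illustration, not the application's actual layout algorithm.

```python
def row_layout(n, container_w, tile_h):
    """Return non-overlapping (x, y, w, h) layout info for n video images
    arranged horizontally in a single row of a view container."""
    w = container_w // n  # equal share of the container width per image
    return [(i * w, 0, w, tile_h) for i in range(n)]

tiles = row_layout(4, 800, 120)  # four tiles, each 200 px wide
```

Because each tile starts exactly where the previous one ends, the images cannot overlap or occlude each other, satisfying the constraint stated above.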
It should be noted that when there are multiple view containers, one or several view containers may be selected to render multiple video images to form multiple video frames. The number of video images in each view container is not limited, and the number of video images (i.e., the number of paths) placed in a plurality of view containers may be the same or different.
It should be further noted that the number of view containers may not be fixed, and the number of view containers may be dynamically increased or decreased according to the number of video pictures to be rendered in the image rendering process.
In addition, the layout information refers to information for determining the placement position of the video images in the view container. The layout information includes, but is not limited to, coordinate information and size (e.g., width and height) information. For example, the layout information may be represented by (x, y, w, h), where x, y represent the coordinates of the video image in View; and w, h denote the width and height of the video image. Each path of video image has layout information.
The view container identification information refers to information used to determine or select a view container, by which it can be determined in which view container a certain path of video image is placed for rendering. Wherein each video image has view container identification information.
Then, according to the layout information and the view container identification information, each path of video image can be placed at its corresponding position in the View.
Further, in step S130, the rendering process of placing the multiple paths of video images into at least one view container according to the layout information and the view container identification information of the multiple paths of video images includes: determining a target view container placed by each path of video image according to the view container identification information of each path of video image; determining the target position and the target size of each path of video image in a target view container according to the coordinate information and the size information of each path of video image; and sequentially placing each path of video image at a target position according to the index information of each path of video image, and rendering according to the target size.
Specifically, as shown in fig. 4, assume that there are two view containers (i.e. Views), each represented by a dashed box. Each view container has view container identification information, denoted mViewId; mViewId may take the value 0 or 1, where 0 represents the first View and 1 represents the second View. Each video image also carries an mViewId value, from which the target view container of a given path of video image can be determined, that is, whether it is placed in the first View or the second View. After the target view container (i.e. the target View) of a path of video images is determined, the video image needs to be laid out in the target View.
Before the video images are typeset, the target position and target size of each path of video image in the target view container need to be determined according to the coordinate information and size information of each path of video image. Referring to fig. 5, the rectangular frame formed by the dotted outer frame in the figure is the view container (i.e. View), which may be represented by coordinates with a starting point of (0, 0). The small rectangle named "video" inside the view container represents a certain path of video image; mx and my denote the coordinates of the video image, width denotes its width, and height denotes its height. The target position and target size of the video image in the view container can be determined from mx, my, width, and height, and the video image can then be placed or typeset accordingly. When there are multiple paths of video images, each path is typeset in this way. To avoid confusion when typesetting multiple paths of video images, the video images can be placed in the target view container for rendering in sequence according to their index information.
The index information may be a number, i.e. each video image is numbered, and then each video image is laid out in the target view container in turn according to the numbering order.
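The three sub-steps of step S130 can be sketched as follows. The field names mirror mViewId, mx, my, width, and height from figs. 4 and 5; the `VideoImage` class and `place_images` function are illustrative constructs of ours, not the application's actual code.

```python
from dataclasses import dataclass

@dataclass
class VideoImage:
    index: int    # index information: placement order, to avoid confusion
    mViewId: int  # view container identification information (0 or 1)
    mx: int       # target x coordinate in the view container
    my: int       # target y coordinate in the view container
    width: int    # target width
    height: int   # target height

def place_images(images):
    """Step S130 sketch: group images by target view container (mViewId),
    then place each at (mx, my) with size (width, height), in index order."""
    containers = {}
    for img in sorted(images, key=lambda i: i.index):
        containers.setdefault(img.mViewId, []).append(
            (img.mx, img.my, img.width, img.height))
    return containers

layout = place_images([
    VideoImage(index=1, mViewId=0, mx=160, my=0, width=160, height=90),
    VideoImage(index=0, mViewId=0, mx=0,   my=0, width=160, height=90),
    VideoImage(index=2, mViewId=1, mx=0,   my=0, width=640, height=360),
])
```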
In addition, when one or more paths of video images are too large and exceed the view container, their size can be adjusted so that no part of a video image fails to be displayed completely.
In this way, the video images can be quickly and accurately arranged in the corresponding view containers, so that typesetting is completed quickly and disorder in the video display is avoided.
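One way to implement the size adjustment just described is to scale an oversized image so that it stays inside the container. The aspect-preserving clamp below is an assumed strategy, since the embodiment does not specify how the adjustment is performed.

```python
def fit_to_container(mx, my, width, height, container_w, container_h):
    """Shrink a video image (keeping its aspect ratio) so that it does not
    extend past the view container's right or bottom edge."""
    scale = min(1.0,
                (container_w - mx) / width,
                (container_h - my) / height)
    return mx, my, int(width * scale), int(height * scale)

# A 200x100 image at the origin of a 100x100 container is halved to 100x50.
resized = fit_to_container(0, 0, 200, 100, 100, 100)
```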
According to the video rendering method in a video conference described above, multiple paths of video images are first obtained and at least one view container is established, where the number of view containers is less than the number of paths of video images; then, according to the layout information and view container identification information of the multiple paths of video images, the video images are placed in the at least one view container for rendering, so as to form multiple video pictures, where the video images in each view container do not overlap or occlude each other.
In the video rendering method in a video conference of this embodiment, multiple paths of video images can be laid out in one or more view containers for rendering according to the layout information and view container identification information of each path; that is, multiple paths of video images can be rendered with a relatively small number of view containers to generate multiple video pictures. Since each view container consumes resources, fewer view containers consume fewer resources. The rendering method can therefore greatly reduce resource consumption.
Further, several embodiments of placing multiple video images in at least one view container are provided, which are described in detail below.
The first embodiment:
In one embodiment, the view container includes a first view container and a second view container; and the step of placing the multiple paths of video images in the at least one view container for rendering includes: placing a first preset number of video images in the first view container for rendering; and placing a second preset number of video images in the second view container for rendering.
Specifically, there may be two view containers, denoted a first view container and a second view container respectively. When there are multiple paths of video images, a first preset number of video images can be selected and placed in the first view container for rendering, and a second preset number of video images can be selected and placed in the second view container for rendering.
The first preset number and the second preset number may be preset values, may be any positive integer, and may be set by a user according to actual needs.
In addition, when selecting the first and second preset numbers of video images from the multiple paths and placing them in the first or second view container: if the user has no specific requirement on the display position and display manner of the video pictures, the video images can be selected randomly, or according to the time the participants joined the conference, the speaking order, and so on; if the user does have such requirements, which paths of video images are placed in the first view container and which in the second can be determined from the display position and manner the user requires.
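The selection described in this first embodiment can be sketched as follows. The function name and signature are our own illustration; the ordering key (join time, speaking order, or none, preserving the given order) is passed in by the caller.

```python
def split_by_presets(images, first_count, second_count, order_key=None):
    """Pick the first `first_count` images for the first view container and
    the next `second_count` for the second, after optional ordering
    (e.g. by conference join time or speaking order)."""
    ordered = sorted(images, key=order_key) if order_key else list(images)
    return (ordered[:first_count],
            ordered[first_count:first_count + second_count])

joined = [("C", 3), ("A", 1), ("B", 2)]  # (participant, join time)
first, second = split_by_presets(joined, 2, 1, order_key=lambda p: p[1])
```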
It should be noted that, when there are multiple view containers, the video images in different view containers may be repeated, that is, the video images of the same participant may be repeatedly displayed. In addition, the arrangement or layout of the video images in the different view containers may be different, for example, the video images in the first view container may be arranged in a landscape manner, and the video images in the second view container may be arranged in a portrait manner.
By adopting the method, the display modes of the multi-channel video pictures can be more diversified.
Second embodiment:
In one embodiment, placing the multiple paths of video images in the at least one view container for rendering includes: placing all of the multiple paths of video images in a first view container for rendering; and, in response to a user operation, selecting one or more of the multiple paths of video images and placing them in a second view container for rendering.
Specifically, there may be two view containers, denoted a first view container and a second view container respectively. When there are multiple paths of video images, all of the video images can be placed in the first view container for rendering; then, according to the user's selection, the video images of one or more participants the user is interested in or wants to focus on can be placed in the second view container for rendering, so that those images are enlarged for closer viewing. For example, in fig. 3 there are two view containers (i.e. Views): the first view container renders the video pictures of N-1 participants (only 7, Video 1 to Video 7, are shown in the upper row of small windows), and the second view container displays the video picture of the one participant (i.e. the N-th) that the user is interested in or wants to focus on.
By adopting the method, on one hand, the display modes of the multi-channel video pictures can be more diversified, and on the other hand, the method is convenient for users to pay attention to the video pictures of the important participants.
In addition, after one or more video images are selected from the multiple paths of video images and placed in the second view container for rendering, the method further includes: in response to a user operation, deleting one or more paths of video images from the second view container.
Specifically, when there are multiple paths of video images, the video images of one or more participants the user is interested in or wants to focus on can first be selected, according to the user's choice, and placed in the second view container for enlarged display; afterwards, the user can delete from the second view container the video images of participants that no longer require close attention. For example, in fig. 3, after the user has chosen to render the video pictures of N-1 participants (only 7 are shown in the figure) in the first view container and to display in the second view container the video picture of the N-th participant that the user is interested in or wants to focus on, the N-th video image can be deleted from the second view container by clicking or dragging it; the video picture of the N-th participant will then no longer be displayed in the second view container.
By adopting this method, the user can conveniently operate on the video images he or she wants to focus on, and the display can be updated in time when a video image was selected by mistake or the focus has ended.
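This second embodiment (all images rendered in the first container, user-selected images duplicated into the second, and removable again on a user operation) can be modeled as below. The class and method names are illustrative, not the application's actual API.

```python
class ConferenceLayout:
    """All video images render in the first view container; user-selected
    images are additionally rendered (enlarged) in the second container."""

    def __init__(self, images):
        self.first = list(images)  # every participant's video image
        self.second = []           # images the user focuses on

    def focus(self, image):
        # User operation: also render this image in the second container.
        if image in self.first and image not in self.second:
            self.second.append(image)

    def unfocus(self, image):
        # User operation (e.g. click or drag): remove the image from the
        # second container; it keeps rendering in the first container.
        if image in self.second:
            self.second.remove(image)

layout = ConferenceLayout([f"Video{i}" for i in range(1, 9)])
layout.focus("Video8")    # enlarge the N-th participant, as in fig. 3
layout.unfocus("Video8")  # attention ended: remove from the second container
```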
Further, a specific embodiment of the binding of the view container and the video picture is given, which is described as follows:
in one embodiment, the video rendering method in the video conference further comprises: and establishing a binding relation between the view container and any one or more video images.
Specifically, a certain view container can be bound with any one or several video images, and after the view container is bound with the video images, the view container can automatically acquire the video images bound with the view container, so that the video images can be prevented from being placed in a wrong view container.
Fig. 4 shows a specific binding process. Assume there are two view containers (i.e. Views), where a dashed box in the figure represents a View; mViewId:0 denotes the first View and mViewId:1 denotes the second View. Assume also that there are two video images, denoted mView:A and mView:B for the A video image and the B video image respectively. mViewId:0 can then be bound to mView:A, and mViewId:1 to mView:B, so that the first View is bound to the A video image and the second View to the B video image.
After the binding is finished, the first View can directly acquire the A video image from the multiple paths of video images and render it, and the second View can likewise directly acquire and render the B video image. When rendering is completed or no longer needed, an unbinding operation can be performed. In addition, during rendering, the binding relationship can be changed when the video image needs to be changed.
The specific process is as follows: if the internal layout parameters of A are updated, mViewId:0 and mView:A are assigned first, and then the other layout parameters are assigned; if B is replaced by C, mViewId:1 and mView:C are assigned, followed by the other layout parameters, and the B video image can then be released.
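The binding table of fig. 4 and the rebinding process above can be modeled as below. The `ViewBinder` class is an illustrative construct of ours, not the application's actual API, though the mViewId/mView pairing follows the figure.

```python
class ViewBinder:
    """Binding table: view container id (mViewId) -> bound video image
    (mView). Rebinding, e.g. replacing B with C, reassigns the entry."""

    def __init__(self):
        self.bindings = {}

    def bind(self, view_id, image):
        self.bindings[view_id] = image

    def unbind(self, view_id):
        # Called when rendering is completed or no longer needed.
        self.bindings.pop(view_id, None)

    def image_for(self, view_id):
        # The View directly acquires the video image bound to it.
        return self.bindings.get(view_id)

binder = ViewBinder()
binder.bind(0, "A")  # first View <-> A video image
binder.bind(1, "B")  # second View <-> B video image
binder.bind(1, "C")  # replace B with C; B can then be released
```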
Next, an embodiment of adjusting or updating a video picture is further provided, and the following is described in detail:
In one embodiment, the video rendering method in a video conference further includes: adjusting the target position and/or target size of one or more paths of video images in the view container according to the layout information; or updating one or more paths of video images in the view container based on the view container identification information.
Specifically, in the process of displaying or playing the video picture, the display position and size of the video picture can be adjusted. In this embodiment, the position of the video image in View or the size of the video image can be changed by modifying the layout information (i.e., the coordinate information and/or the width and height information) of the video image, so as to change the display position of the video frame and the size of the frame.
In addition, the video images in the view container may also be updated by modifying the view container identification information of the video images, for example, removing the a video images in the first view container, or adding the a video images in the first view container, or updating the a video images in the first view container to the B video images, so as to finally change the video picture display manner.
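The two kinds of adjustment just described (changing layout information in place, or changing mViewId to move an image between containers) can be sketched with a plain dictionary. The keys mirror the document's mx/my/width/height/mViewId fields; the `adjust` helper is our own illustration.

```python
def adjust(image, **changes):
    """Update a video image's layout information (mx, my, width, height)
    and/or its view container identification (mViewId)."""
    allowed = {"mx", "my", "width", "height", "mViewId"}
    unknown = set(changes) - allowed
    if unknown:
        raise ValueError(f"not layout fields: {unknown}")
    image.update(changes)
    return image

img = {"mViewId": 0, "mx": 0, "my": 0, "width": 160, "height": 90}
adjust(img, width=320, height=180)  # enlarge the displayed picture
adjust(img, mViewId=1)              # move it to the second view container
```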
In this way, the display position and size of the video pictures can be adjusted conveniently and flexibly, improving the display effect of the video pictures.
It should be understood that, although the steps in the flowchart of fig. 2 are shown in order as indicated by the arrows, the steps are not necessarily performed in order as indicated by the arrows. The steps are not limited to being performed in the exact order illustrated and, unless explicitly stated herein, may be performed in other orders. Moreover, at least a portion of the steps in fig. 2 may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performance of the sub-steps or stages is not necessarily sequential, but may be performed in turn or alternately with other steps or at least a portion of the sub-steps or stages of other steps.
The embodiment disclosed in the present application describes a method for rendering a video in a video conference in detail, and the method disclosed in the present application can be implemented by devices in various forms, so that the present application also discloses a device for rendering a video in a video conference corresponding to the method, and a detailed description is given below with respect to a specific embodiment.
Referring to fig. 6, a video rendering apparatus in a video conference disclosed in an embodiment of the present application mainly includes:
an image obtaining module 610, configured to obtain multiple paths of video images;
a container building module 620, configured to establish at least one view container, where the number of view containers is less than the number of paths of the video images; and
an image rendering module 630, configured to place the multiple paths of video images in the at least one view container for rendering according to the layout information and the view container identification information of the multiple paths of video images, so as to form multiple video pictures, where the video images in each view container do not overlap or occlude each other.
In one embodiment, the image obtaining module 610 is configured to acquire multiple paths of video streams in a video conference; decode each path of video stream to obtain each path of video frame data; and perform texture processing on each path of video frame data to obtain each path of video image.
In one embodiment, the view container includes a first view container and a second view container; and the image rendering module 630 is configured to place a first preset number of video images in the first view container for rendering, and place a second preset number of video images in the second view container for rendering.
In one embodiment, the image rendering module 630 is configured to place all of the multiple paths of video images in the first view container for rendering; and, in response to a user operation, select one or more of the multiple paths of video images and place them in the second view container for rendering.
In one embodiment, the apparatus further includes an image deleting module, configured to delete, in response to a user operation, one or more paths of video images from the second view container.
In one embodiment, the layout information includes coordinate information and size information; and the image rendering module 630 is configured to determine, according to the view container identification information of each path of video image, the target view container in which that video image is placed; determine the target position and target size of each path of video image in the target view container according to its coordinate information and size information; and place each path of video image at its target position in sequence according to its index information, rendering it at the target size.
In one embodiment, the apparatus further includes a binding module, configured to establish a binding relationship between a view container and any one or more paths of video images.
In one embodiment, the apparatus further includes an adjusting module, configured to adjust the target position and/or target size of one or more paths of video images in the view container according to the layout information; or an updating module, configured to update one or more paths of video images in the view container based on the view container identification information.
For specific limitations of the video rendering apparatus in the video conference, the above limitations on the method may be referred to, and are not described herein again. The various modules in the above-described apparatus may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent of a processor in the terminal device, and can also be stored in a memory in the terminal device in a software form, so that the processor can call and execute operations corresponding to the modules.
Referring to fig. 7, fig. 7 is a block diagram illustrating the structure of a terminal device according to an embodiment of the present application. The terminal device 70 may be a computer device. The terminal device 70 in the present application may include one or more of the following components: a processor 72, a memory 74, and one or more applications, where the one or more applications may be stored in the memory 74 and configured to be executed by the one or more processors 72, the one or more applications being configured to perform the methods described in the video rendering method embodiments above.
The processor 72 may include one or more processing cores. The processor 72 uses various interfaces and lines to connect the parts of the terminal device 70, and performs the various functions of the terminal device 70 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 74 and by calling data stored in the memory 74. Alternatively, the processor 72 may be implemented in hardware in at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA) form. The processor 72 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like, where the CPU mainly handles the operating system, the user interface, application programs, and so on; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communication. It will be understood that the modem may also be implemented as a separate communication chip rather than being integrated into the processor 72.
The memory 74 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 74 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 74 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the method embodiments described above, and the like. The data storage area may store data created by the terminal device 70 during use, and the like.
Those skilled in the art will appreciate that the structure shown in fig. 7 is a block diagram of only a portion of the structure relevant to the present disclosure, and does not constitute a limitation on the terminal device to which the present disclosure applies, and that a particular terminal device may include more or less components than those shown in the drawings, or combine certain components, or have a different arrangement of components.
In summary, the terminal device provided in the embodiment of the present application is used to implement the video rendering method in the video conference corresponding to the foregoing method embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Referring to fig. 8, a block diagram of a computer-readable storage medium according to an embodiment of the present disclosure is shown. The computer-readable storage medium 80 has stored therein program code that can be invoked by a processor to perform the methods described in the video rendering method embodiments of the video conference.
The computer-readable storage medium 80 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read-Only Memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 80 includes a non-transitory computer-readable storage medium. The computer-readable storage medium 80 has storage space for program code 82 that performs any of the method steps described above. The program code can be read from or written to one or more computer program products. The program code 82 may, for example, be compressed in a suitable form.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.