CN111935532B

Movatterモバイル変換

Info

Publication number: CN111935532B
Application number: CN202010816324.8A
Authority: CN
Inventors: 吴丰诚
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-08-14
Filing date: 2020-08-14
Publication date: 2024-03-01
Anticipated expiration: 2040-08-14
Also published as: CN111935532A

Abstract

The application provides a video interaction method, a video interaction device, electronic equipment and a computer readable storage medium; the method comprises the following steps: playing the video in the playing interface; and responding to the scaling operation for the video, determining a scaling position set by the scaling operation in the video, and locking the scaling position in the video to a fixed position of the playing interface so as to continuously play the scaled video obtained according to scaling in the playing interface. According to the method and the device, the user can be supported to individually and flexibly zoom the video picture in the video watching process.

Description

Video interaction method and device, electronic equipment and storage medium

Technical Field

The present disclosure relates to internet technology, and in particular, to a video interaction method, apparatus, electronic device, and computer readable storage medium.

Background

Video is widely used as an information transmission medium, and with the continuous development of internet technology and the continuous updating of digital video acquisition equipment, the resolution of the video is higher and higher, and a video picture comprises more detailed contents than before.

In the video playing process, there is often a need to view details of a certain partial area in a video frame, but the related technology generally only has insufficient support for scaling of a video playing interface, for example, the video playing needs to be paused to perform a screen capturing operation so as to perform scaling, or a fixed position in the video frame is used as an origin to perform scaling, which cannot meet the requirement of a user for viewing details of diversified contents in the video.

As can be seen, there is no effective solution for achieving flexible scaling of pictures in video.

Disclosure of Invention

The embodiment of the application provides a video interaction method, a video interaction device, electronic equipment and a computer readable storage medium, which can support a user to individually and flexibly scale a video picture in a video watching process.

The technical scheme of the embodiment of the application is realized as follows:

the embodiment of the application provides a video interaction method, which comprises the following steps:

playing the video in the playing interface;

in response to a zoom operation for the video, determining a zoom position set by the zoom operation in the video, and

and locking the zoom position in the video to a fixed position of the playing interface so as to continuously play the zoomed video obtained after the zoom is adjusted in the playing interface.

The embodiment of the application provides an interaction device for video, which comprises:

the playing module is used for playing the video in the playing interface;

a scaling module for determining a scaling position set in the video by a scaling operation in response to the scaling operation for the video;

the scaling module is further configured to lock the scaling position in the video to a fixed position of the playing interface, so as to continue playing the scaled video obtained after scaling adjustment in the playing interface.

In the above aspect, the zoom module is further configured to determine, when the zoom operation is a multi-finger pinch operation for zooming out or a multi-finger expand operation for zooming in, an intermediate position of an initial plurality of contacts of the zoom operation in a real-time screen of the video, and determine the intermediate position as a zoom position set by the zoom operation in the video.

In the above solution, the scaling module is further configured to determine, when the scaling operation is a click operation, a position where the scaling operation clicks in a real-time picture of the video as a scaling position set by the scaling operation in the video.

In the above solution, the scaling module is further configured to present a button corresponding to at least one candidate scaling position in a real-time frame of the video; and responding to the button triggering operation, and determining the candidate zoom position corresponding to the triggered button as the zoom position set by the zoom operation in the video.

In the above aspect, the scaling module is further configured to take, as the candidate scaling position, a position determined by at least one of: determining the position of a falling point of a viewing line of sight in the real-time picture; collecting audio data, performing voice recognition on the audio data to obtain a speaking text, and determining the position of content matched with the speaking text in the real-time picture; determining a fixed position in the real-time picture, wherein the fixed position comprises a center position and an edge position; identifying characters in the real-time picture, and determining the position of the characters with the character size smaller than a character size threshold value; determining the central position of an area where a target object is located in the real-time picture; wherein the types of the target objects include: interactive objects and scaled objects in the historical video; and acquiring the set historical zoom position in the real-time picture.

In the above scheme, the zoom module is further configured to lock the zoom-in fixed point position in the subsequent frame of the video to the fixed position of the playing interface, and play the zoom-in area in the subsequent frame after being zoomed in by the zoom-in scale; the amplifying region is a region taking the amplifying fixed point position as a center in a real-time picture of the video; wherein the subsequent picture is a real-time picture to be displayed in the video after receiving the zoom operation.

In the above aspect, the scaling module is further configured to perform at least one of the following operations, where a zoom-in area centered on the zoom-in fixed point position is determined in a real-time picture of the video: determining a plurality of candidate areas taking the amplifying fixed point position as a center in the real-time picture, and determining the candidate areas with the difference value of the content density between the candidate areas and the adjacent candidate areas being larger than a density difference value threshold value as the amplifying areas; determining an area which takes the amplifying fixed point position as a center and comprises target content in the real-time picture as the amplifying area, wherein the type of the target content is the same as that of the amplified content in the historical video, or the word size of words contained in the target content is smaller than a word size threshold; determining the amplification area according to a set boundary in response to a boundary setting operation centering on the amplification fixed point position; and when the zoom-in fixed point position is the position of a landing point of a viewing line in the real-time screen, determining a field of view region centered on the landing point position as the zoom-in region.

In the above solution, the scaling module is further configured to determine a scaling ratio corresponding to a parameter of the scaling operation; alternatively, a fixed scale is determined based on each received trigger operation.

In the above aspect, the scaling module is further configured to determine a scaling ratio that is positively related to a length of a track of the scaling operation when the scaling operation is a multi-finger pinch-out operation or a multi-finger expansion operation for a real-time picture of the video; determining a scaling factor positively correlated to a number of clicks of the scaling operation when the scaling operation is a click operation for a real-time picture of the video; when the scaling operation is a long press operation for a real-time picture of the video, a scaling ratio positively correlated to a duration of the long press operation is determined.

In the above solution, when continuing to play the scaled video obtained after the scaling adjustment in the play interface, the video interaction device further includes: a moving module, configured to determine a new zoom position corresponding to a movement operation for a subsequent frame of the zoomed video in response to the movement operation, and lock the new zoom position in the new subsequent frame of the video to a fixed position of the playing interface, so as to continue playing the zoomed video obtained after adjustment according to the zoom scale in the playing interface; the subsequent picture is a real-time picture to be displayed in the video after the scaling operation is received, and the new subsequent picture is a real-time picture to be displayed in the video after the moving operation is received.

In the above scheme, the moving module is further configured to synchronously move the zoom position according to a distance and a direction of movement of the moving operation, and use a position obtained after the movement as a new zoom position.

In the above scheme, the playing interface is a video view in a video view layer, and an identifier view layer is arranged on the top layer of the video view layer; the scaling module is used for identifying the scaling operation through the identifier view layer so as to determine a scaling position set by the scaling operation in the video; according to the zoom position and the zoom scale, carrying out coordinate transformation processing on the video view in the video view layer to obtain a subsequent picture of the zoomed video after adjustment according to the zoom scale; the subsequent picture is a real-time picture to be displayed in the video after the scaling operation is received.

An embodiment of the present application provides an electronic device, including:

a memory for storing computer executable instructions;

and the processor is used for realizing the video interaction method provided by the embodiment of the application when executing the computer executable instructions stored in the memory.

The embodiment of the application provides a computer readable storage medium, which stores computer executable instructions for implementing the video interaction method provided by the embodiment of the application when the computer executable instructions are executed by a processor.

The embodiment of the application has the following beneficial effects:

in the process of playing the video, the method supports the user to zoom any position of the video picture, and compared with the prior art which can only zoom the center position of the video picture, the method has higher zoom form diversity and can meet the personalized requirements of the user; and the scaling position determined by the scaling operation is locked to the fixed position of the playing interface to scale the video, so that the scaling content in the video appointed by the user can be always presented at the fixed position of the playing interface, the scaling effectiveness is improved, and further the waste of scaling resources of the user terminal can be avoided.

Drawings

Fig. 1A and 1B are schematic views of an application scenario provided by the related art;

Fig. 2 is a schematic structural diagram of an interactive system 100 for video according to an embodiment of the present application;

fig. 3 is a schematic structural diagram of an electronic device 500 according to an embodiment of the present application;

fig. 4A is a flowchart of an interaction method of video provided in an embodiment of the present application;

fig. 4B is an application scenario schematic diagram of an interaction method of video provided in an embodiment of the present application;

fig. 4C is an application scenario schematic diagram of an interaction method of video provided in an embodiment of the present application;

fig. 5 is a flow chart of an interaction method of video provided in an embodiment of the present application;

fig. 6 is an application scenario schematic diagram of an interaction method of video provided in an embodiment of the present application;

fig. 7A, fig. 7B, fig. 7C, and fig. 7D are schematic application scenarios of the video interaction method provided in the embodiments of the present application;

fig. 8 is a schematic diagram of an interaction method of video according to an embodiment of the present application;

fig. 9 is a schematic diagram of an interaction method of video provided in an embodiment of the present application;

fig. 10A and fig. 10B are application scene diagrams of an interaction method of video provided in an embodiment of the present application.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail with reference to the accompanying drawings, and the described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without making any inventive effort are within the scope of the present application.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.

In the following description, the terms "first/second/third" are used merely to distinguish between similar objects and do not represent a particular ordering of the objects, it being understood that the "first/second/third" may be interchanged with a particular order or precedence where allowed, to enable embodiments of the present application described herein to be implemented in other than those illustrated or described herein.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.

Before further describing embodiments of the present application in detail, the terms and expressions that are referred to in the embodiments of the present application are described, and are suitable for the following explanation.

1) In response to a condition or state that is used to represent the condition or state upon which the performed operation depends, the performed operation or operations may be in real-time or with a set delay when the condition or state upon which it depends is satisfied; without being specifically described, there is no limitation in the execution sequence of the plurality of operations performed.

2) A client, an application running in the terminal for providing various services, such as a video client, a live client, or a short video client, etc.

3) Scaling, that is, zooming in or out, is to uniformly zoom in or out on a video picture by a scaling scale (or scaling factor) based on a zoom position.

4) The zoom position, i.e. the origin of zooming a picture of the video, will be locked to a fixed position of the playing interface of the screen (e.g. the screen center position) for display at the time of the video zooming.

5) Scaling, i.e., the scaling of a picture in a video to be reduced or enlarged. The scaling scale includes positive and negative scales, the positive scale representing enlargement of the original picture, for example, +5% representing enlargement of the original picture by 5%, i.e., enlargement of the original picture by 105% before; a negative scale indicates that the original picture is reduced, for example, -5% indicates that the original picture is reduced by 5%, i.e., the original picture is reduced to 95% before.

Taking an application scene of online education as an example, in the related technology, recording and broadcasting course video supports center point double-finger zoom, and single-finger mobile viewing content is not supported for the amplified video. For example, referring to fig. 1A and 1B, fig. 1A and 1B are schematic views of application scenarios provided by the related art, for example, fig. 1A is an effect of enlarging a video frame based on a center point 101 of the video frame; fig. 1B shows an effect of reducing a video frame, for example, reducing an original video frame 102 to present a reduced video frame 103 in a playback interface.

In the embodiment of the present application, the following technical problems are found in the related art: 1) Students can only amplify video content by a video center when learning and recording courses; 2) The amplified video content does not support single-finger sliding viewing; 3) In the process of scaling video, the double fingers move while scaling, so that video offset is caused, and the situation that the double fingers cannot be seen occurs.

Aiming at the technical problems, the embodiment of the application provides a video interaction method which can support a user to individually scale a video in a video watching process. Referring to fig. 2, fig. 2 is a schematic structural diagram of an interactive system 100 for video according to an embodiment of the present application. Wherein the interactive system 100 of video comprises: the server 200, the network 300, and the terminal 400 will be described separately.

Server 200 is a background server of client 410 for transmitting corresponding video to client 410 in response to a video acquisition request of client 410.

The network 300 may be a wide area network or a local area network, or a combination of both, for mediating communication between the server 200 and the terminal 400.

The terminal 400 is configured to run the client 410, where the client 410 is a client with a video playing function. The client 410 is configured to send a video acquisition request to the server 200 in response to a video playing operation of a user, so as to receive a video sent by the server 200, and play the video in a playing interface; and the method is also used for responding to the scaling operation of the user on the video, locking the scaling position set by the scaling operation in the video to the fixed position of the playing interface so as to continuously play the scaled video obtained after the scaling adjustment in the playing interface.

In some embodiments, the terminal 400 implements the video interaction method provided in the embodiments of the present application by running a computer program, for example, the computer program may be a native program or a software module in an operating system; it may be a local (Native) Application (APP), i.e. a program that needs to be installed in an operating system to run, such as a video APP or a live APP; the method can also be an applet, namely a program which can be run only by being downloaded into a browser environment; but also a video applet or a live applet that can be embedded in any APP. In general, the computer programs described above may be any form of application, module or plug-in.

The embodiment of the application can be realized by means of Cloud Technology (Cloud Technology), wherein the Cloud Technology refers to a hosting Technology for integrating serial resources such as hardware, software, network and the like in a wide area network or a local area network to realize calculation, storage, processing and sharing of data.

The cloud technology is a generic term of network technology, information technology, integration technology, management platform technology, application technology and the like based on cloud computing business model application, can form a resource pool, and is flexible and convenient as required. Cloud computing technology will become an important support. Background services of technical network systems require a large amount of computing and storage resources.

As an example, the server 200 may be a stand-alone physical server, a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and basic cloud computing services such as big data and artificial intelligence platforms. The terminal 400 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal 400 and the server 200 may be directly or indirectly connected through wired or wireless communication, which is not limited in the embodiment of the present application.

Next, the structure of the electronic device provided in the embodiment of the present application will be described, where the electronic device may be the terminal 400 shown in fig. 2, referring to fig. 3, and fig. 3 is a schematic structural diagram of the electronic device 500 provided in the embodiment of the present application, and the electronic device 500 shown in fig. 3 includes: at least one processor 510, a memory 550, at least one network interface 520, and a user interface 530. The various components in electronic device 500 are coupled together by bus system 540. It is appreciated that the bus system 540 is used to enable connected communications between these components. The bus system 540 includes a power bus, a control bus, and a status signal bus in addition to the data bus. The various buses are labeled as bus system 540 in fig. 3 for clarity of illustration.

The processor 510 may be an integrated circuit chip with signal processing capabilities such as a general purpose processor, such as a microprocessor or any conventional processor, or the like, a digital signal processor (DSP, digital Signal Processor), or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.

The user interface 530 includes one or more output devices 531 that enable presentation of media content, including one or more speakers and/or one or more visual displays. The user interface 530 also includes one or more input devices 532, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.

The memory 550 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 550 may optionally include one or more storage devices physically located remote from processor 510.

Memory 550 includes volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The non-volatile memory may be read only memory (ROM, read Only Me mory) and the volatile memory may be random access memory (RAM, random Access Memor y). The memory 550 described in embodiments herein is intended to comprise any suitable type of memory.

In some embodiments, memory 550 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.

An operating system 551 including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;

network communication module 552 is used to reach other computing devices via one or more (wired or wireless) network interfaces 520, exemplary network interfaces 520 include: bluetooth, wireless compatibility authentication (WiFi), and universal serial bus (USB, universal Serial Bus), etc.;

A presentation module 553 for enabling presentation of information (e.g., a user interface for operating a peripheral device and displaying content and information) via one or more output devices 531 (e.g., a display screen, speakers, etc.) associated with the user interface 530;

the input processing module 554 is configured to detect one or more user inputs or interactions from one of the one or more input devices 532 and translate the detected inputs or interactions.

In some embodiments, the video interaction device provided in the embodiments of the present application may be implemented in software, and fig. 3 shows the video interaction device 555 of the video stored in the memory 550, which may be software in the form of a computer program, a plug-in, or the like, for example, a video client, a live client, or a short video client. The interaction device 555 of the video comprises the following software modules: a play module 5551 and a zoom module 5552, which are logical, and thus may be arbitrarily combined or further split depending on the functions implemented. The functions of the respective modules will be described hereinafter.

The video interaction method provided in the embodiment of the present application may be executed by the terminal 400 in fig. 2 alone, or may be executed by the terminal 400 and the server 200 in fig. 2 cooperatively.

In the following, an example of a method of interaction of video provided in the embodiment of the present application is described by the terminal 400 in fig. 2 alone. Referring to fig. 4A, fig. 4A is a schematic flow chart of an interaction method of video provided in an embodiment of the present application, and will be described with reference to the steps shown in fig. 4A.

It should be noted that the method shown in fig. 4A may be executed by various computer programs executed by the terminal 400, and is not limited to the above-described client 410, such as the operating system 551, software modules, and scripts described above, and thus the client should not be considered as limiting the embodiments of the present application.

In step S101, a video is played in a play interface.

Here, the video may be a video stored locally in the terminal or may be a video requested to be acquired from a server.

In some embodiments, in response to a video play operation, sending a video acquisition request to a server to cause the server to send a corresponding video in response to the video acquisition request; and receiving the video and playing the real-time picture of the video in the playing interface.

In step S102, in response to a zoom operation for a video, a zoom position set in the video by the zoom operation is determined.

In some embodiments, in response to a zoom operation for a video, a zoom position for the zoom operation set in a real-time picture in the video is determined.

Here, the zoom position refers to the origin of the zoom. The zoom operation includes a zoom-out operation and a zoom-in operation; the scaling operation may be various forms of operation preset by the operating system and having no conflict with the registered operation; or may be a user-defined operation of various forms that is non-conflicting with the registered operation. The scaling operation includes at least one of: click operations (e.g., single click operations, multiple click operations, or multiple click operations, etc.); a sliding operation according to a specific trajectory or direction; voice operation; motion sensing operation (e.g., up-and-down shaking operation or curve motion operation). Thus, the operation experience of the user can be improved.

In some embodiments, when the zoom operation is a multi-finger pinch operation for zooming out or a multi-finger expand operation for zooming in, an intermediate position of the first plurality of contacts of the zoom operation in a real-time picture of the video is determined, and the intermediate position is determined as a zoom position set by the zoom operation in the video.

Taking an example in which the zoom operation is an operation for zooming in (i.e., a zoom operation), the first position is determined as a zoom position in response to an operation of taking the first position of the real-time screen as a start position and simultaneously sliding from the start position to the second position and the third position of the real-time screen.

For example, in fig. 7A, the upper left corner region 702 is supported to slide in the direction of the double-finger extension arrow 701; in fig. 7B, the middle region 704 is enlarged in the direction of the double-fingered extension arrow 703.

Taking an example in which the zoom operation is an operation for zooming out (i.e., a zoom operation), the first position is determined as a zoom position in response to an operation of taking the second position and the third position of the real-time screen as the start positions and simultaneously sliding from the start positions to the first position of the real-time screen.

According to the method and the device for playing the video content, the user is supported to zoom in and zoom out any content in the video picture through simple operation, the technical problem that students can only zoom in the video content through a video center when learning and recording courses in the related technology can be solved, the zoom form diversity is higher, and the personalized requirements of the user can be met.

In other embodiments, when the zoom operation is a click operation, a position at which the zoom operation clicks in a real-time screen of the video is determined as a zoom position set by the zoom operation in the video. In this way, the content in the vicinity of the position clicked by the user can be enlarged or reduced.

In still other embodiments, a button corresponding to at least one candidate zoom position is presented in a real-time picture of the video; in response to a button triggering operation, a candidate zoom position corresponding to the triggered button is determined as a zoom position set in the video by the zoom operation.

Here, before the button corresponding to the at least one candidate zoom position is presented in the real-time screen of the video, the operation may be turned on in response to the zoom mode for the video, and thus the at least one button can be presented in the real-time screen to facilitate the user to zoom in on the content of the corresponding button.

As an example, a location determined by at least one of the following means is taken as a candidate zoom location:

in one mode, a position of a point of fall of a viewing line of sight in a real-time screen is determined.

As an example, the process of determining the landing point of the viewing line of sight in the real-time screen is: invoking a camera device (such as a camera) of the terminal to collect the positions of the pupils of the viewer and the reflection bright spots on the outer surface of the cornea of the eyeball; and determining a falling point corresponding to the sight line of the viewer in the real-time picture according to the positions of the pupils of the viewer and the reflecting bright spots on the outer surface of the cornea of the eyeball.

Here, the reflection bright spot on the outer surface of the cornea of the eyeball refers to Purkinje Image (Purkinje Image), which is a bright spot on the cornea of the eyeball, and is generated by the reflection (CR, cornea l Reflection) of the light entering the pupil on the outer surface of the cornea.

The principle of determining the landing point corresponding to the line of sight of the viewer in the real-time picture according to the positions of the reflective bright spots on the pupils and the outer surfaces of the cornea of the eyeballs of the viewer is as follows: because the position of the terminal camera is fixed, the position of the terminal screen light source is also fixed, the center position of the eyeball is unchanged, and the absolute position of the purkinje is not changed along with the rotation of the eyeball. But its position relative to the pupil and the eyeball is constantly changing, for example, the purkinje spot is located between the pupils of the viewer when the viewer stares at the camera; and when the viewer lifts his head, the purkinje spot is just below the viewer's pupil. Thus, the eye direction of the viewer can be estimated by using the geometric model by locating the pupil and the purkinje positions on the eye image in real time and calculating the cornea reflection vector. Based on the relationship between the eye characteristics of the viewer and the content presented on the terminal screen, which is established in the pre-scaling process (i.e. letting the viewer look at a specific point on the terminal screen), the falling point corresponding to the line of sight of the viewer can be determined in the real-time picture.

For example, the cornea reflection vector of the viewer is determined according to the positions of the pupil of the viewer and the reflection bright spots on the outer surface of the cornea of the eyeball; determining the line of sight direction of the viewer when the viewer views the real-time picture according to the cornea reflection vector of the viewer; and determining a falling point in the real-time picture according to the sight direction of a viewer when watching the video.

Therefore, the current watching position can be determined accurately in real time according to the sight line of the viewer, so that the content near the watching position can be scaled, the viewer can be helped to see the content near the watching position clearly, and the learning efficiency of the user is improved.

And in a second mode, collecting audio data, performing voice recognition on the audio data to obtain a speaking text, and determining the position of the content matched with the speaking text in a real-time picture.

As an example, call a microphone to collect audio data of a speaker for real-time picture explanation in real-time; performing voice recognition on the audio data to obtain a speaking text corresponding to the audio data; the spoken text is compared with content included in the real-time frame, and the location of the content matching the spoken text is determined in the real-time frame.

Taking online education as an example, a real-time picture comprises a first chapter content and a second chapter content, a client acquires audio data which is taught by a teacher aiming at the real-time picture in real time, and carries out voice recognition on the audio data to obtain that a speaking text corresponding to the audio data is the second chapter content, so that the client determines the position of the second chapter content displayed in the real-time picture as a candidate zoom position.

Therefore, the current explanation position can be determined in real time and accurately according to the voice of the speaker, so that the content near the explanation position is scaled, the viewer can be helped to see the content near the explanation position clearly, and the learning efficiency of the user is improved.

Mode three, a fixed position in the real-time picture is determined.

Here, the fixed positions include a center position and an edge position (e.g., a corner position in a real-time screen).

And fourthly, identifying the characters in the real-time picture, and determining the position of the characters with the character size smaller than the character size threshold value.

Here, the word size threshold may be a default value, a value set by the user, or a value determined based on the word sizes of all the words included in the real-time screen, for example, an average value of the word sizes of all the words included in the real-time screen is used as the word size threshold.

Therefore, the part with smaller word size in the real-time picture can be prompted to the user, so that the user can enlarge the part, and the user can be helped to see the content in the video clearly.

And fifthly, determining the central position of the area where the target object is located in the real-time picture.

As one example, the target object is an interactive object. An interactive object refers to an object that is interacted with (e.g., praise, comment, forward or back-step on, etc.) during video viewing. The interactive object is an object of interest to the user, that is, the area including the object has a high possibility of being enlarged, so that the central position of the area where the interactive object is located is determined as a candidate zoom position, the operation of the user can be saved, and the zoom efficiency can be improved.

As another example, the target object is a scaled object in the historical video. Therefore, the scaled object in the historical video can be inherited, so that the operation of a user is saved, and the scaling efficiency is improved.

In a sixth aspect, a history scaling position set in a real-time screen is acquired.

Here, the history zoom position is a zoom position set when viewing the history video. Therefore, the historical scaling position can be inherited, so that the operation of a user is saved, and the scaling efficiency is improved.

And seventhly, obtaining the zoom position set by the user watching the same video in the real-time picture, so that the zoom position set by the history can be inherited, thereby saving the operation of the user and improving the zoom efficiency.

Here, the zoom position may be a historical zoom position or a real-time zoom position. Taking live as an example, since the content watched by the live audience is consistent, the zoom position acquired at this time is a real-time zoom position.

And eighth mode, obtaining a zoom position set by the user with social relationship with the user.

As an example, users with social relationships to users may be users with similar hobbies for videos, or may be classmates. Therefore, the zoom position corresponding to the user with similar preference for the video or the zoom position set by the classmates can be inherited, so that calculation is not needed again, and the calculation resource is saved.

In step S103, the zoom position in the video is locked to a fixed position of the playback interface.

Here, the fixed position may be a central position of the playing interface, or may be a non-central position of the playing interface, for example, a position of the playing interface in which the center is right, or the like, which is not limited in this application.

In some embodiments, a zoom position in a subsequent picture of the video is locked to a fixed position of the playback interface, wherein the subsequent picture is a real-time picture to be displayed in the video after receiving the zoom operation.

Here, locking the zoom position to a fixed position of the playback interface means setting the zoom position as the origin of zooming to uniformly zoom the picture of the video at a later time. Therefore, the zoom position selected by the user can be locked to the fixed position of the playing interface, so that the zoom content in the video designated by the user can be always presented at the fixed position of the playing interface, and the zoom effectiveness is improved.

Taking the example that the zoom position is the zoom-in fixed point position and the fixed position is the center position of the playback interface, in fig. 7B, the zoom-in fixed point position 705 is locked to the fixed position 706 (i.e., the center position) of the playback interface to present an effect of zooming in the region 704 in the playback interface, where the fixed position 706 of the zoomed-in content presented in the playback interface is the zoom-in fixed point position 705.

Taking the example that the zoom position is a zoom-in fixed point position and the fixed position is a non-center position (i.e., a center-right position) of the play interface, in fig. 7C, the zoom-in fixed point position 705 is locked to a fixed position 707 (i.e., a center-right position) of the play interface to present an effect of zooming in the region 704 in the play interface, wherein the fixed position 707 of the zoomed-in content presented in the play interface is the zoom-in fixed point position 705.

In step S104, the scaled video obtained after the scaling adjustment is continuously played in the playing interface.

Here, the original video playback interface and the scaled video playback interface may be switched to be displayed: and when the scaling operation is received, switching the original video playing interface to the scaling video playing interface.

Thus, step S101 may be: playing the video in the original video playing interface; step S104 may be: and continuously playing the scaled video obtained after the scaling adjustment in the scaled video playing interface. So that the non-scaled original video and/or the scaled video can be variously presented in the human-computer interaction interface.

In some embodiments, when the zoom position is a zoom-in fixed-point position and the zoom scale is a zoom-in scale, step S103 and step S104 may include: and locking the amplifying fixed point position in the subsequent picture of the video to the fixed position of the playing interface, and playing the amplifying region amplified by the amplifying proportion in the subsequent picture.

Here, the enlarged region is a region centered on the enlargement fixed point position in the real-time picture of the video; the subsequent screen is a real-time screen to be displayed in the video after receiving the zoom-in operation.

As an example, before playing the enlarged region enlarged by the enlargement scale in the subsequent screen, it may further include: a zoom-in area centered on the zoom-in setpoint position will be determined in a real-time picture of the video by at least one of:

In one aspect, a plurality of candidate regions centering on the zoom-in fixed point position in the real-time screen are determined, and a candidate region having a difference in content density from the adjacent candidate region larger than a density difference threshold is determined as a zoom-in region.

Here, the density difference threshold may be a default value, a value set by the user, or a value determined based on a difference in content density between each candidate region and an adjacent candidate region, for example, an average value of the difference in content density between all candidate regions and the adjacent candidate region is taken as the density difference threshold.

As an example, a plurality of circles which are centered on the enlarged position and radiate outward with an equal difference radius are determined as candidate areas; determining a difference in content density between any two adjacent candidate regions; determining two adjacent candidate areas with the difference value of the content density larger than a density difference value threshold value as candidate amplifying areas; among the candidate enlarged regions, a region in which the content density is large is determined as an enlarged region. In this way, when the user suddenly shifts to an area where the viewing content is sparse to an area where the viewing content is compact, the user largely does not see the content in the area where the content is compact due to the consistency of the vision, and thus the area needs to be enlarged more likely.

In the second mode, an area including the target content centered on the zoom-in fixed point position in the real-time screen is determined as a zoom-in area.

As an example, the target content is the same as the type of the content amplified in the history video, so that the historically set amplifying region can be inherited, thereby saving the operation of the user and improving the scaling efficiency.

As another example, the word size of the words included in the target content is less than the word size threshold, and thus, the content with the smaller word size can be enlarged to the user, thereby being able to help the user to see the content in the video.

In a third aspect, in response to a boundary setting operation centering on the zoom-in fixed point position, a zoom-in area is determined from the set boundary.

Here, the boundary setting operation may be various forms of operation that are preset by the operating system and that do not conflict with registered operations; or may be a user-defined operation of various forms that is non-conflicting with the registered operation. The scaling operation includes at least one of: sliding operation; and (5) voice operation. Thus, the operation experience of the user can be improved.

Taking the boundary setting operation as an example of the sliding operation, the locus of the sliding operation is determined as the boundary of the enlarged region, and thus the enlarged region can be determined. The enlarged area may be a regular area, for example, a circle or a rectangle, or may be an irregular area, which is not limited in the embodiment of the present application.

In a fourth aspect, the global area of the real-time screen is determined as the enlarged area, and thus, the content in the real-time screen can be globally enlarged.

In a fifth aspect, when the enlarged fixed point position is a position of a landing point of the viewing line on the real-time screen, the field of view region centered on the landing point position is determined as the enlarged region.

Here, the position of the falling point of the viewing line in the real-time screen is determined to be the same as the above example, and a detailed description thereof will be omitted.

As an example, when the enlarged fixed point position is a position of a landing point of the viewing line of sight in the real-time screen, a viewing angle of the viewing line of sight is determined; according to the viewing angle and the position of the landing point, a field of view area is determined in the real-time picture, and the field of view area is determined as an enlarged area.

Therefore, the current viewing area can be determined in real time and accurately according to the sight of the viewer, so that the content in the viewing area is amplified, the viewer can be helped to see the content in the viewing area clearly, and the learning efficiency of the user is improved.

In some embodiments, when the zoom position is a zoom-out fixed-point position and the zoom scale is a zoom-out scale, step S103 and step S104 may include: and locking the reduced fixed point position in the subsequent picture of the video to the fixed position of the playing interface, and playing the reduced area which is reduced by the reduced scale in the subsequent picture.

Here, the reduced area is an area centered on the reduced fixed point position in the real-time screen of the video; the subsequent screen is a real-time screen to be displayed in the video after receiving the zoom-out operation.

For example, referring to fig. 4B, fig. 4B is an application scenario schematic diagram of an interaction method of video provided in an embodiment of the present application. In fig. 4B, the area 402 in the global screen, i.e., the above-mentioned reduced area, is displayed by supporting the double-finger sliding along the arrow 401 to reduce.

For example, referring to fig. 4C, fig. 4C is an application scenario schematic diagram of an interaction method of video provided in an embodiment of the present application. In fig. 4C, the reduced area 405 (i.e., the global picture) is reduced by sliding along the direction of the double-finger arrow 404, so as to present the reduced global picture and the background area 403 in the playing interface, where the background area 403 may present interactive information related to the video being played, and may also support the user to record (e.g., make notes, etc.) in the form of a tablet.

In some embodiments, before step S104, further comprises: the scaling will be determined by at least one of the following:

in one aspect, a scaling factor corresponding to a parameter of a scaling operation is determined.

As one example, when the scaling operation is a multi-finger pinch-out operation or a multi-finger expand operation for a real-time picture of a video, a scaling ratio that is positively correlated with the length of the locus of the scaling operation is determined.

For example, the locus of the zoom operation may be a straight line or a curved line.

When the scaling operation is a multi-finger expanding operation which takes the first position of the real-time picture as a starting position and simultaneously slides to the second position and the third position of the real-time picture from the starting position, the scaling is determined according to the distance between the second position and the third position, wherein the scaling is in direct proportion to the distance. The multi-finger expansion operation may be an operation for amplification.

When the scaling operation is a multi-finger kneading operation in which the second position and the third position of the real-time screen are set as the start positions and the second position and the third position are simultaneously slid from the start positions to the first position of the real-time screen, a scaling ratio is determined according to a distance between the second position and the third position, wherein the scaling ratio is proportional to the distance. The multi-finger kneading operation may be an operation for downsizing.

According to the method and the device for determining the scaling degree, the scaling degree can be determined according to the length of the track of the sliding of the finger, and the operation cost of a user can be saved, so that the scaling efficiency and the operation experience of the user are improved.

As another example, when the zoom operation is a click operation for a real-time picture of the video, a scale that is positively correlated to the number of clicks of the zoom operation is determined.

As yet another example, when the zoom operation is a long press operation for a real-time picture of the video, a scale that is positively correlated to a duration of the long press operation is determined.

And in a second mode, a fixed scaling is determined according to each received triggering operation.

As an example, when a user triggers a real-time picture of a video each time, the real-time picture can be scaled, wherein the scale of each scaling is fixed.

For example, the fixed scale may be a default value or a value set by the user.

Taking the fixed scaling ratio of +5% as an example, when a user triggers a real-time picture of a video, expanding the picture of the video to 105% of an original picture; when the user triggers the real-time picture of the video again, the picture of the video is enlarged to 110% of the original picture, and so on. Thus, each time of zooming operation can zoom the real-time picture with the same zoom degree.

In a third mode, when the zoom operation is a zoom operation, the zoom-in scale is determined according to the area of the zoom-in area and the area of the playback interface.

For example, when the zoom-in area is zoomed in until the playback interface is fully rendered, the zoom-in ratio is the ratio of the area of the zoom-in area to the area of the playback interface. In this way, the content in the enlarged area can be completely presented in the playback interface.

In the fourth mode, when the zoom operation is a zoom-out operation, the zoom-out scale is determined according to the area of the zoom-out area and the area of the playback interface.

For example, when the reduced area that is originally completely presented in the playback interface is enlarged to the global content that is completely presented in the playback interface as a real-time picture, the reduced scale is the ratio of the area of the reduced area to the area of the playback interface. Thus, the global content of the real-time picture can be completely presented in the playing interface.

In some embodiments, the playback interface is a video view in a video view layer, with the top layer of the video view layer (from the perspective of the viewer looking at the screen) setting the identifier view layer; step S102 may include: the zoom operation is identified by the identifier view layer to determine a zoom position for the zoom operation to set in the video. Step S104 may include: and carrying out coordinate transformation processing on the video view in the video view layer according to the zoom position and the zoom scale to obtain a subsequent picture of the zoomed video after adjustment according to the zoom scale.

As an example, a zoom operation is recognized by a recognizer view layer and sent to a view controller, and a zoom position set in a video corresponding to the zoom operation is determined by the view controller; sending the zoom position to the video view layer through the view controller; and carrying out coordinate transformation processing on the video view in the video view layer according to the zoom position and the zoom scale by the video view layer to obtain a subsequent picture of the zoomed video after adjustment according to the zoom scale.

According to the embodiment of the application, the video can be scaled at any position on the premise that the video playing progress is not affected by simple data transmission and coordinate transformation between view layers, so that the technical problem that in the related art, video offset is caused when scaling is performed while moving due to scaling of double fingers in the process of scaling the video can be solved, and the situation that the video cannot be seen is caused can be avoided, and excessive computing resources can be avoided when the user terminal performs scaling.

When the scaled video obtained after scaling adjustment is continuously played in the playing interface, referring to fig. 5, fig. 5 is a schematic flow chart of an interaction method of the video provided in the embodiment of the present application, and based on fig. 4A, step S105 and step S106 may be included after step S104.

In step S105, in response to a movement operation for a subsequent screen of the zoom video, a new zoom position corresponding to the movement operation is determined.

Here, the mobile operation may be various forms of operation that are preset by the operating system and that do not conflict with registered operations; or may be a user-defined operation of various forms that is non-conflicting with the registered operation. The moving operation includes at least one of: a sliding operation according to a specific trajectory or direction; voice operation; somatosensory operation (for example, up-and-down shaking operation, curvilinear motion operation, etc.); line-of-sight operations (e.g., operations to move subsequent pictures based on the line-of-sight of the viewer). Thus, the operation experience of the user can be improved.

In some embodiments, the zoom positions are synchronously moved according to the distance and direction of movement of the movement operation in the subsequent screen, and the positions obtained after the movement are taken as new zoom positions.

As an example, in response to an operation of taking the fourth position of the subsequent screen as the start position and sliding toward the fifth position of the subsequent screen, a relative positional relationship between the fourth position and the fifth position is determined; in the new subsequent screen, a position having a relative positional relationship with the center position of the screen before movement is determined as a new zoom position.

Here, the subsequent picture is a real-time picture to be displayed in the video after receiving the zoom operation, and the new subsequent picture is a real-time picture to be displayed in the video after receiving the move operation. That is, the subsequent screen is different from the new subsequent screen in that the subsequent screen is a screen displayed in the playback interface before the movement operation is received, and the new subsequent screen is a screen displayed in the playback interface after the movement operation is received. As another way of description, a subsequent picture may be referred to as a first picture and a new subsequent picture may be referred to as a second picture.

In step S106, the new zoom position in the new subsequent frame of the video is locked to the fixed position of the playing interface, so as to continue playing the zoomed video adjusted according to the zoom scale in the playing interface.

Taking the example of an enlarged video presented in the play interface prior to receiving the move operation, in fig. 7C, a single finger is supported to slide in the direction of arrow 705 to the upper left corner area of the display. Thus, the user can view the enlarged partial contents through a single-finger sliding operation.

Taking the example that the reduced video is displayed in the playing interface before the mobile operation is received as an example, referring to fig. 6, fig. 6 is an application scene schematic diagram of the video interaction method provided in the embodiment of the present application. In fig. 6, the reduced area 501 is slid to the lower right corner of the playback interface in the direction of the single-finger extension arrow 502. Therefore, the user can adjust the position of the reduced area in the playing interface through single-finger sliding operation, namely the background area can be updated, so that interaction information related to the played video is presented in the background area.

The embodiment of the application supports the movement of the amplified or reduced video content, and in the movement process, the video playing progress cannot be influenced, so that the technical problem that the amplified video content does not support single-finger sliding viewing in the related technology can be solved.

The video interaction method provided by the embodiment of the application is described below by taking the online education as an application scene.

The embodiment of the application supports a video zooming function, and a user can zoom a video designated area in a double-finger operation mode to view local content; the user can also view the local content through a single finger sliding operation. Referring to fig. 7A, fig. 7B, fig. 7C, and fig. 7D, fig. 7A, fig. 7B, fig. 7C, and fig. 7D are schematic application scenarios of the video interaction method provided in the embodiments of the present application.

In fig. 7A, the upper left corner region 702 is supported to slide in the direction of the double-finger arrow 701. In fig. 7B and 7C, the middle region 704 is enlarged in the direction of the double-finger extension arrow 703. In fig. 7C, the support single finger slides in the direction of arrow 705 to the upper left corner region of the display.

Next, a specific implementation manner of the video interaction method provided by the embodiment is described with reference to fig. 8 and fig. 9, and fig. 8 and fig. 9 are schematic diagrams of the video interaction method provided by the embodiment of the present application.

In fig. 8, a gesture recognizer view layer (or gesture view layer, namely the recognizer view layer) is newly added to the player control layer (namely the view controller); after the gesture recognition of the user double-finger touch screen is successful, when the zoom gesture changes, a video zoom point (namely, the initial touch position of the finger) and a zoom scale (namely, the scale determined according to the sliding distance of the finger) are acquired at the moment. And then, carrying out coordinate axis transformation on the video view (or called video picture), updating the transformed result to a video view layer, and achieving the effect of fixed point amplification, namely, determining the position of a corresponding scaling point in the video view as a central coordinate point, and scaling the video view according to the central coordinate point. For single-finger sliding viewing of the video, according to the moving point of the moving gesture, updating the central coordinate point of the video view to achieve the moving effect of the video.

For example, the starting point of the movement gesture is a (x 1, y 1), the ending point is B (x 2, y 2), the relative position of a and the original center coordinate point (i.e. origin (0, 0)) in the coordinate system is kept unchanged, and when a is moved to B, the coordinate point corresponding to the position after the original center coordinate point is moved is the updated center coordinate point.

The gesture view layer receives gesture events, acquires coordinate information according to the recognition state of the gestures, obtains final coordinates through a series of coordinate transformations, and then updates the coordinates of the video picture view to realize the effects of zooming and moving viewing.

Compared with the related art, the method and the device have the advantages that the moving processing of the fixed point coordinates and the restoration processing of the scaled result coordinates are added. And opening a movement gesture to the zoomed video, and supporting movement viewing. Specifically, the scaling and movement effect principle is shown in fig. 9. In fig. 9, the video frame can achieve the effect of fixed point amplification through the first step, the second step and the third step, and can achieve the moving effect through the movement.

The embodiment of the application can realize the following technical effects:

1) The video scaling function supports live broadcasting and recorded broadcasting courses, and students' teaching scenes meeting different types of courses are realized.

2) Through supporting the video amplification function under the horizontal and vertical screen, the student can be better look over the content that the teacher input, promotes experience of lessng.

Referring to fig. 10A and 10B, fig. 10A and 10B are schematic application scenarios of the video interaction method provided in the embodiments of the present application. FIG. 10A is a vertical screen zoom effect, with the content in the zoom-in area 110 being zoomed in on FIG. 10A; fig. 10B is a landscape zoom effect, and the content in the zoom-in area 120 is zoomed in fig. 10B.

(1) The dilemma that students who see recorded lessons cannot see contents input by the clear instructors or contents of questions of students and cannot be explained on line is solved.

(2) The method solves the problem that a live-broadcast lesson teacher forgets to amplify the content, students remind in the chat area, and the teacher does not see the embarrassing scene of the chat area.

3) The amplified content is flexibly checked through the single-finger mobile video, and the playing progress of the video course is not influenced.

Embodiments of the present application are not limited to being provided as methods and hardware, but may be implemented in a variety of ways, such as being provided as a computer readable storage medium (storing instructions for performing the methods of interaction of video provided by embodiments of the present application), as exemplified below.

Mobile terminal application and module: the embodiment of the application can be provided as a software module designed by using programming languages such as C/C++, java and the like, and is embedded into various mobile terminal Apps (such as Tencentration class and the like) based on Android, iOS and the like (stored in a storage medium of a mobile terminal by executable instructions and executed by a processor of the mobile terminal), so that related tasks such as video playing, video zooming and the like can be completed directly by using the computing resources of the mobile terminal, and the results of the video zooming and the like can be transmitted to a remote server in a periodic or aperiodic manner through various network communication modes or can be stored locally at the mobile terminal.

An exemplary structure of the video interaction device 555 implemented as a software module provided in the embodiments of the present application is described below in conjunction with fig. 3, and in some embodiments, as shown in fig. 3, the software module stored in the video interaction device 555 of the memory 550 may include:

a play module 5551, configured to play a video in a play interface;

a scaling module 5552 for determining a scaling position set in the video by a scaling operation in response to the scaling operation for the video;

the scaling module 5552 is further configured to lock the scaling position in the video to a fixed position of the playing interface, so as to continue playing the scaled video obtained after scaling adjustment in the playing interface.

In the above aspect, the scaling module 5552 is further configured to determine, when the scaling operation is a multi-finger pinch operation for zooming out or a multi-finger expand operation for zooming in, an intermediate position of an initial plurality of contacts of the scaling operation in a real-time screen of the video, and determine the intermediate position as a scaling position set by the scaling operation in the video.

In the above aspect, the scaling module 5552 is further configured to determine, when the scaling operation is a click operation, a position where the scaling operation clicks in a real-time frame of the video as a scaling position set by the scaling operation in the video.

In the above aspect, the scaling module 5552 is further configured to present a button corresponding to at least one candidate scaling position in a real-time frame of the video; and responding to the button triggering operation, and determining the candidate zoom position corresponding to the triggered button as the zoom position set by the zoom operation in the video.

In the above aspect, the scaling module 5552 is further configured to use a location determined by at least one of the following manners as the candidate scaling location: determining the position of a falling point of a viewing line of sight in the real-time picture; collecting audio data, performing voice recognition on the audio data to obtain a speaking text, and determining the position of content matched with the speaking text in the real-time picture; determining a fixed position in the real-time picture, wherein the fixed position comprises a center position and an edge position; identifying characters in the real-time picture, and determining the position of the characters with the character size smaller than a character size threshold value; determining the central position of an area where a target object is located in the real-time picture; wherein the types of the target objects include: interactive objects and scaled objects in the historical video; and acquiring the set historical zoom position in the real-time picture.

In the above solution, the scaling module 5552 is further configured to lock the zoom-in fixed position in the subsequent frame of the video to the fixed position of the playing interface, and play the zoom-in area in the subsequent frame after being zoomed in by the zoom-in scale; the amplifying region is a region taking the amplifying fixed point position as a center in a real-time picture of the video; wherein the subsequent picture is a real-time picture to be displayed in the video after receiving the zoom operation.

In the above aspect, the scaling module 5552 is further configured to perform at least one of the following operations, where a zoom-in area centered on the zoom-in fixed point position is determined in a real-time frame of the video: determining a plurality of candidate areas taking the amplifying fixed point position as a center in the real-time picture, and determining the candidate areas with the difference value of the content density between the candidate areas and the adjacent candidate areas being larger than a density difference value threshold value as the amplifying areas; determining an area which takes the amplifying fixed point position as a center and comprises target content in the real-time picture as the amplifying area, wherein the type of the target content is the same as that of the amplified content in the historical video, or the word size of words contained in the target content is smaller than a word size threshold; determining the amplification area according to a set boundary in response to a boundary setting operation centering on the amplification fixed point position; and when the zoom-in fixed point position is the position of a landing point of a viewing line in the real-time screen, determining a field of view region centered on the landing point position as the zoom-in region.

In the above aspect, the scaling module 5552 is further configured to determine a scaling ratio corresponding to a parameter of the scaling operation; alternatively, a fixed scale is determined based on each received trigger operation.

In the above aspect, the scaling module 5552 is further configured to determine, when the scaling operation is a multi-finger pinch-out operation or a multi-finger expansion operation for a real-time picture of the video, a scaling ratio that is positively related to a length of a track of the scaling operation; determining a scaling factor positively correlated to a number of clicks of the scaling operation when the scaling operation is a click operation for a real-time picture of the video; when the scaling operation is a long press operation for a real-time picture of the video, a scaling ratio positively correlated to a duration of the long press operation is determined.

In the above solution, when continuing to play the scaled video obtained after the scaling adjustment in the play interface, the video interaction device 555 further includes: a moving module, configured to determine a new zoom position corresponding to a movement operation for a subsequent frame of the zoomed video in response to the movement operation, and lock the new zoom position in the new subsequent frame of the video to a fixed position of the playing interface, so as to continue playing the zoomed video obtained after adjustment according to the zoom scale in the playing interface; the subsequent picture is a real-time picture to be displayed in the video after the scaling operation is received, and the new subsequent picture is a real-time picture to be displayed in the video after the moving operation is received.

In the above scheme, the playing interface is a video view in a video view layer, and an identifier view layer is arranged on the top layer of the video view layer; the scaling module 5552 is configured to identify, by the identifier view layer, the scaling operation to determine a scaling position set by the scaling operation in the video; according to the zoom position and the zoom scale, carrying out coordinate transformation processing on the video view in the video view layer to obtain a subsequent picture of the zoomed video after adjustment according to the zoom scale; the subsequent picture is a real-time picture to be displayed in the video after the scaling operation is received.

Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the computer device executes the video interaction method according to the embodiment of the present application.

The embodiments of the present application provide a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are stored, which when executed by a processor, cause the processor to perform the video interaction method provided by the embodiments of the present application, for example, the video interaction method shown in fig. 4A and 5, and the computer includes various computing devices including an intelligent terminal and a server.

In some embodiments, the computer readable storage medium may be FRAM, ROM, PROM, EP ROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; but may be a variety of devices including one or any combination of the above memories.

In some embodiments, computer-executable instructions may be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, in the form of programs, software modules, scripts, or code, and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.

As an example, computer-executable instructions may, but need not, correspond to files in a file system, may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a hypertext markup language document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).

As an example, computer-executable instructions may be deployed to be executed on one computing device or on multiple computing devices located at one site or, alternatively, distributed across multiple sites and interconnected by a communication network.

In summary, the embodiment of the application has the following beneficial effects:

(1) The method supports users to zoom out and zoom in any content in the video pictures through simple operation, not only can solve the technical problem that students can only zoom in video content by a video center when learning and recording courses in the related technology, but also has higher zoom form diversity and can meet the personalized demands of users.

(2) The current watching position can be determined in real time and accurately according to the sight of the viewer, so that the content near the watching position can be scaled, the viewer can be helped to see the content near the watching position clearly, and the learning efficiency of the user is improved.

(3) The method and the device can determine the current explanation position in real time and accurately according to the voice of the speaker so as to scale the content near the explanation position, thereby helping the viewer to see the content near the explanation position clearly and further improving the learning efficiency of the user.

(4) The part with smaller word size in the real-time picture can be prompted to the user so that the user can enlarge the part, and therefore the user can be helped to see the content in the video clearly.

(5) By simple data transmission and coordinate transformation between view layers, the video can be scaled at any position on the premise of not influencing the video playing progress, so that the technical problems that in the related art, the video is offset due to scaling and moving in the process of scaling the video, and the situation that the video cannot be seen occurs can be solved, and excessive computing resources are consumed by a user terminal during scaling can be avoided.

(6) The method supports the movement of the amplified or reduced video content, does not influence the video playing progress in the movement process, and can solve the technical problem that the amplified video content does not support single-finger sliding viewing in the related technology.

The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application. Any modifications, equivalent substitutions, improvements, etc. that are within the spirit and scope of the present application are intended to be included within the scope of the present application.

Claims

1. A method of video interaction, the method comprising:

playing the video in the playing interface;

locking the zoom position in the video to a fixed position of the playing interface so as to continuously play the zoomed video obtained after adjustment according to the zoom scale in the playing interface;

in response to a move operation for a subsequent frame of the scaled video, determining a new scaling position corresponding to the move operation, and

locking the new zoom position in the new subsequent picture of the video to the fixed position of the playing interface so as to continuously play the zoomed video obtained after adjustment according to the zoom scale in the playing interface;

The subsequent picture is a real-time picture to be displayed in the video after the zoom operation is received, and the new subsequent picture is a real-time picture to be displayed in the video after the movement operation is received.

2. The method of claim 1, wherein the determining a zoom position for the zoom operation set in the video comprises:

when the zoom operation is a multi-finger pinch operation for zooming out or a multi-finger expand operation for zooming in, an intermediate position of an initial plurality of contacts of the zoom operation in a real-time screen of the video is determined, and the intermediate position is determined as a zoom position set by the zoom operation in the video.

3. The method of claim 1, wherein the determining a zoom position for the zoom operation set in the video comprises:

and when the scaling operation is a clicking operation, determining the position of the scaling operation clicked in the real-time picture of the video as the scaling position set by the scaling operation in the video.

4. The method of claim 1, wherein the determining a zoom position for the zoom operation set in the video comprises:

Presenting a button corresponding to at least one candidate zoom position in a real-time picture of the video;

and responding to the button triggering operation, and determining the candidate zoom position corresponding to the triggered button as the zoom position set by the zoom operation in the video.

5. The method of claim 4, wherein prior to presenting a button corresponding to at least one candidate zoom position in a real-time picture of the video, the method further comprises:

taking as the candidate zoom position a position determined by at least one of:

determining the position of a falling point of a viewing line of sight in the real-time picture;

collecting audio data, performing voice recognition on the audio data to obtain a speaking text, and determining the position of content matched with the speaking text in the real-time picture;

determining a fixed position in the real-time picture, wherein the fixed position comprises a center position and an edge position;

identifying characters in the real-time picture, and determining the position of the characters with the character size smaller than a character size threshold value;

determining the central position of an area where a target object is located in the real-time picture, wherein the types of the target object comprise: interactive objects and scaled objects in the historical video;

And acquiring the set historical zoom position in the real-time picture.

6. The method of claim 1, wherein the method comprises the steps of,

when the zoom position is a zoom-in fixed point position and the zoom scale is a zoom-in scale, the step of locking the zoom position in the video to a fixed position of the playing interface so as to continue playing the zoomed video obtained after the zoom scale adjustment in the playing interface comprises the following steps:

locking the amplifying fixed point position in the subsequent picture of the video to the fixed position of the playing interface, and playing the amplifying region amplified by the amplifying proportion in the subsequent picture;

the amplifying region is a region taking the amplifying fixed point position as a center in a real-time picture of the video;

wherein the subsequent picture is a real-time picture to be displayed in the video after receiving the zoom operation.

7. The method of claim 6, wherein prior to said playing the enlarged region of the subsequent picture enlarged by the enlargement scale, the method further comprises:

a zoom-in area centered on the zoom-in setpoint position is to be determined in a real-time picture of the video by at least one of:

Determining a plurality of candidate areas taking the amplifying fixed point position as a center in the real-time picture, and determining the candidate areas with the difference value of the content density between the candidate areas and the adjacent candidate areas being larger than a density difference value threshold value as the amplifying areas;

determining an area which takes the amplifying fixed point position as a center and comprises target content in the real-time picture as the amplifying area, wherein the type of the target content is the same as that of the amplified content in the historical video, or the word size of words contained in the target content is smaller than a word size threshold;

determining the amplification area according to a set boundary in response to a boundary setting operation centering on the amplification fixed point position;

and when the zoom-in fixed point position is the position of a landing point of a viewing line in the real-time screen, determining a field of view region centered on the landing point position as the zoom-in region.

8. The method of claim 1, wherein before continuing to play the scaled video that is scaled in the play interface, the method further comprises:

determining a scaling scale corresponding to a parameter of the scaling operation; or,

A fixed scale is determined based on each received trigger operation.

9. The method of claim 8, wherein the determining a scaling ratio corresponding to the parameter of the scaling operation comprises:

determining a scaling factor positively correlated to a length of a locus of the scaling operation when the scaling operation is a multi-finger pinch operation or a multi-finger expand operation for a real-time picture of the video;

determining a scaling factor positively correlated to a number of clicks of the scaling operation when the scaling operation is a click operation for a real-time picture of the video;

when the scaling operation is a long press operation for a real-time picture of the video, a scaling ratio positively correlated to a duration of the long press operation is determined.

10. The method of claim 1, wherein the step of determining the position of the substrate comprises,

the playing interface comprises an original video playing interface for displaying the video which is not zoomed and a zoomed video playing interface for displaying the zoomed video;

the relationship between the original video playing interface and the scaled video playing interface comprises:

the original video playing interface and the scaled video playing interface are displayed in a split screen mode; the scaled video playing interface floating layer floats on the top layer of the original video playing interface; and the original video playing interface and the scaled video playing interface are switched to be displayed.

11. The method of claim 1, wherein the determining a new zoom position corresponding to the move operation comprises:

and synchronously moving the zoom position according to the moving distance and direction of the moving operation, and determining the position obtained after the movement as the new zoom position.

12. The method of claim 1, wherein the step of determining the position of the substrate comprises,

the playing interface is a video view in a video view layer, and an identifier view layer is arranged on the top layer of the video view layer;

the determining a zoom position set in the video by the zoom operation includes:

identifying, by the identifier view layer, the zoom operation to determine a zoom position set by the zoom operation in the video;

and continuing to play the scaled video obtained after scaling adjustment in the play interface, wherein the method comprises the following steps:

according to the zoom position and the zoom scale, carrying out coordinate transformation processing on the video view in the video view layer to obtain a subsequent picture of the zoomed video after adjustment according to the zoom scale;

the subsequent picture is a real-time picture to be displayed in the video after the scaling operation is received.

13. A video interactive apparatus, comprising:

the playing module is used for playing the video in the playing interface;

the scaling module is further configured to lock the scaling position in the video to a fixed position of the playing interface, so as to continue playing the scaled video obtained after scaling adjustment in the playing interface;

a moving module, configured to determine a new zoom position corresponding to a movement operation for a subsequent frame of the zoomed video in response to the movement operation, and lock the new zoom position in the new subsequent frame of the video to a fixed position of the playing interface, so as to continue playing the zoomed video obtained after adjustment according to the zoom scale in the playing interface; the subsequent picture is a real-time picture to be displayed in the video after the zoom operation is received, and the new subsequent picture is a real-time picture to be displayed in the video after the movement operation is received.

14. An electronic device, comprising:

A memory for storing computer executable instructions;

a processor for implementing the video interaction method of any one of claims 1 to 12 when executing computer-executable instructions stored in said memory.

15. A computer readable storage medium storing computer executable instructions which when executed are adapted to implement the method of interaction of video according to any one of claims 1 to 12.