Disclosure of Invention
In a first aspect, the present application provides a rendering instruction processing method. The method is applied to a terminal device that includes a graphics processing unit (GPU), where the GPU supports a target extension of OpenGL ES (OpenGL for Embedded Systems), and the method includes:
obtaining a first rendering instruction that is not implemented based on the target extension, where the first rendering instruction is used to implement a rendering task; and obtaining a second rendering instruction according to the first rendering instruction, where the second rendering instruction is a rendering instruction implemented based on the target extension and is used to implement the same rendering task;
Taking a game as an example of an application that requires graphics rendering: when the game is started, a rendering instruction for a rendering task in the game can be obtained, and the rendering task corresponding to that instruction is then executed. In the embodiment of the application, the terminal device may be preconfigured with a plurality of rules that define preset conditions an instruction should meet; if the first rendering instruction meets the preset conditions, this indicates that the current rendering scene needs to be optimized;
That is, the current GPU supports the target extension on which the second rendering instruction depends, but the application does not use the second rendering instruction for the rendering task and instead uses the first rendering instruction, which does not depend on the target extension. In this case, the second rendering instruction can be obtained as the instruction subsequently used to execute the rendering task;
in an alternative implementation, the device power consumption when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than that when the GPU is triggered to execute the rendering task according to the second rendering instruction; and/or the amount of memory data copied is greater under the first rendering instruction than under the second rendering instruction; and/or the GPU load is greater under the first rendering instruction than under the second rendering instruction; and/or the load of the central processing unit (CPU) is greater under the first rendering instruction than under the second rendering instruction.
That is, for the same rendering task, the rendering efficiency of the second rendering instruction is better than that of the first rendering instruction;
triggering the GPU to execute the rendering task according to the second rendering instruction;
in an alternative implementation, the CPU obtains the second rendering instruction and then passes it to the GPU driver. For example, the GPU driver may compile the second rendering instruction into object code or machine code executable by the GPU; after receiving the compiled second rendering instruction, the GPU may execute it and thereby perform the corresponding rendering task.
In the embodiment of the application, without any modification to the game's code, instructions implemented based on a GPU extension are executed in place of the original ones, which reduces the terminal performance loss caused by frequent memory mapping and similar operations, or reduces the CPU/GPU load. Meanwhile, the embodiment of the application is a general extension optimization framework based on GPU extensions: the performance of a mobile phone that supports the extension can be improved, while the optimization is not enabled on a mobile phone that does not support the extension, so compatibility is better.
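The compatibility behaviour described above (optimize only when the GPU advertises the extension) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the application: the function name `optimization_enabled` and the sample extension strings are hypothetical, and on a real device the string would typically be obtained by querying the OpenGL ES extension list (for example with `glGetString(GL_EXTENSIONS)`).

```python
def optimization_enabled(extensions_string: str, target_extension: str) -> bool:
    """Return True only if the reported extension list names the target extension."""
    # Extension lists are space-separated names. Comparing whole tokens avoids
    # a false match on a longer name that merely contains the target name.
    return target_extension in extensions_string.split()


# Hypothetical extension strings for two devices (assumed for illustration).
device_with_extension = "GL_OES_texture_npot GL_EXT_buffer_storage"
device_without_extension = "GL_OES_texture_npot"

print(optimization_enabled(device_with_extension, "GL_EXT_buffer_storage"))
print(optimization_enabled(device_without_extension, "GL_EXT_buffer_storage"))
```

On a device that does not report the extension the check fails, and the original (first) rendering instruction would simply be left untouched.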
In an optional implementation, the obtaining a second rendering instruction according to the first rendering instruction includes:
obtaining the second rendering instruction corresponding to the first rendering instruction based on a mapping relationship, where the mapping relationship includes a preset correspondence between the first rendering instruction and the second rendering instruction.
For example, the CPU may obtain the second rendering instruction corresponding to the first rendering instruction based on the mapping relationship. The mapping relationship contains preset correspondences between rendering instructions, and within it the first rendering instruction corresponds to the second rendering instruction.
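A mapping relationship of this kind can be pictured as a simple lookup table. The sketch below is only an illustration: the table name `INSTRUCTION_MAP` and the function `lookup_second_instruction` are hypothetical, and the single entry reuses the glBufferData / glBufferStorageEXT pair that this application gives as an example elsewhere.

```python
from typing import Optional

# Hypothetical preset mapping: an instruction that is not based on the target
# extension maps to an instruction that is based on the target extension.
INSTRUCTION_MAP = {
    "glBufferData": "glBufferStorageEXT",
}


def lookup_second_instruction(first_instruction: str) -> Optional[str]:
    """Return the mapped second instruction, or None if no mapping is preset."""
    return INSTRUCTION_MAP.get(first_instruction)


print(lookup_second_instruction("glBufferData"))
print(lookup_second_instruction("glDrawArrays"))
```

Returning `None` for an unmapped instruction is a design choice of this sketch; it lets the caller fall back to executing the original instruction unchanged.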
In an optional implementation, the obtaining the second rendering instruction according to the first rendering instruction includes obtaining the second rendering instruction according to the first rendering instruction when the first rendering instruction meets a preset condition, where the preset condition includes that the first rendering instruction belongs to a preset instruction set.
The terminal device may designate certain instructions in advance. These instructions are not implemented based on the target extension and have poor rendering efficiency, while the current GPU supports an extension based on which another instruction capable of implementing the same rendering task may exist. In this case, the terminal device may designate such instructions in advance as candidates for optimization (for example, the first rendering instruction in this embodiment) and, correspondingly, designate in advance the instructions that replace them (for example, the second rendering instruction in this embodiment).
It should be noted that the first rendering instruction may be a set of multiple instructions. In this case, the preset instruction set may specify multiple instructions and the order between them, and the statement that the first rendering instruction belongs to the preset instruction set may be understood as meaning that both the names of the instructions included in the first rendering instruction and their execution order match the preset instruction set.
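The idea of matching both instruction names and their order against a preset instruction set can be sketched as below. This is an assumption-laden illustration: the function name `contains_in_order`, the choice of a contiguous match, and the sample call sequences are all hypothetical and are not specified by the application.

```python
from typing import Sequence


def contains_in_order(recorded: Sequence[str], preset: Sequence[str]) -> bool:
    """Return True if `preset` occurs as a contiguous run inside `recorded`."""
    n = len(preset)
    if n == 0:
        return False  # an empty preset never matches in this sketch
    # Slide a window of length n over the recorded calls and compare names
    # and order at the same time.
    return any(
        list(recorded[i:i + n]) == list(preset)
        for i in range(len(recorded) - n + 1)
    )


# Hypothetical recorded call stream and preset sequence.
recorded_calls = ["glBindBuffer", "glBufferData", "glDrawElements"]
preset_sequence = ["glBindBuffer", "glBufferData"]

print(contains_in_order(recorded_calls, preset_sequence))
print(contains_in_order(recorded_calls, list(reversed(preset_sequence))))
```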
In an optional implementation, the first rendering instruction is configured to operate a target cache, and the preset condition includes at least one of:
the cache type of the target cache meets a preset condition;
history information about operations performed on the target cache by the first rendering instruction meets a preset condition; or
context information of the target cache meets a preset condition.
It should be noted that, for the cache type of the target cache, it may be determined whether the type of data stored in the target cache meets the condition, or whether certain attributes of the target cache meet the condition. For example, when the first rendering instruction is a glBufferData instruction, it may be determined whether the target cache is used to store vertex or index data and whether its usage is dynamic update; if so, the cache type of the target cache is determined to meet the preset condition.
It should be noted that the history information may be, for example, the number of times the first rendering instruction has operated on the target cache, and it may be determined whether this historical operation count is greater than a preset value. For example, when the first rendering instruction is a glBlitFramebuffer instruction, it may be determined whether the number of times (frequency) that the read framebuffer bound to it has been called within a certain past period is greater than a preset value.
It should be noted that the context information may represent a rendering state. For example, when the first rendering instruction is a glBlitFramebuffer instruction, the color and depth/stencil attachments of the currently bound read framebuffer may be examined, and it may be determined whether their sample count is greater than 1, and so on.
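The three kinds of preset condition described above (cache type, history information, context information) can be sketched as three small predicates. This is a hedged illustration only: the `BufferCall` record, the function names, and the threshold value are hypothetical, while the numeric constants are the standard OpenGL ES enum values for the named tokens.

```python
from dataclasses import dataclass

# Standard OpenGL ES enum values for the tokens named below.
GL_ARRAY_BUFFER = 0x8892          # vertex data
GL_ELEMENT_ARRAY_BUFFER = 0x8893  # index data
GL_STATIC_DRAW = 0x88E4
GL_DYNAMIC_DRAW = 0x88E8


@dataclass
class BufferCall:
    """Hypothetical record of an intercepted glBufferData call."""
    target: int
    usage: int


def cache_type_matches(call: BufferCall) -> bool:
    # Cache type: holds vertex or index data and is meant for dynamic update.
    holds_vertex_or_index = call.target in (GL_ARRAY_BUFFER, GL_ELEMENT_ARRAY_BUFFER)
    return holds_vertex_or_index and call.usage == GL_DYNAMIC_DRAW


def history_matches(operation_count: int, preset_value: int) -> bool:
    # History: the cache has been operated on more often than a preset value.
    return operation_count > preset_value


def context_matches(sample_count: int) -> bool:
    # Context: the bound attachment is multisampled (sample count above 1).
    return sample_count > 1


call = BufferCall(target=GL_ARRAY_BUFFER, usage=GL_DYNAMIC_DRAW)
print(cache_type_matches(call), history_matches(12, 10), context_matches(4))
```

Because the preset condition includes at least one of these checks, a caller could combine the predicates with `or` or `and` depending on the configured rule; the combination is left open here on purpose.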
In an optional implementation, the first rendering instruction is configured to operate on a texture object, and the preset condition includes that context information of the texture object meets a preset condition.
In an alternative implementation, the second rendering instruction is configured to invoke an interface of the target extension.
Illustratively, the first rendering instruction may be a glBufferData instruction and the second rendering instruction may be a glBufferStorageEXT instruction implemented based on the EXT_buffer_storage extension. It should be noted that the GL_MAP_PERSISTENT_BIT_EXT, GL_MAP_COHERENT_BIT_EXT, and GL_DYNAMIC_STORAGE_BIT_EXT attributes may additionally be specified in the glBufferStorageEXT instruction to create a buffer that supports persistent address mapping.
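The replacement of a glBufferData call by a glBufferStorageEXT call can be sketched as a rewrite of the recorded call and its arguments. The sketch below does not call OpenGL ES; it only builds the argument tuple. The function name `rewrite_buffer_data` and the tuple representation are hypothetical. The flag values are the enum values defined for these tokens, and the inclusion of GL_MAP_WRITE_BIT is an assumption of this sketch (a persistent mapping flag has to be accompanied by a read or write map flag), not something stated in the application.

```python
# Enum values of the flag tokens used with glBufferStorageEXT.
GL_MAP_WRITE_BIT = 0x0002            # assumption: added so the buffer can be mapped for writing
GL_MAP_PERSISTENT_BIT_EXT = 0x0040
GL_MAP_COHERENT_BIT_EXT = 0x0080
GL_DYNAMIC_STORAGE_BIT_EXT = 0x0100

GL_ARRAY_BUFFER = 0x8892
GL_DYNAMIC_DRAW = 0x88E8


def rewrite_buffer_data(target, size, data, usage):
    """Map glBufferData(target, size, data, usage) to a glBufferStorageEXT call.

    The `usage` hint has no counterpart in glBufferStorageEXT; it is replaced
    by a bitfield of storage flags.
    """
    flags = (
        GL_MAP_WRITE_BIT
        | GL_MAP_PERSISTENT_BIT_EXT
        | GL_MAP_COHERENT_BIT_EXT
        | GL_DYNAMIC_STORAGE_BIT_EXT
    )
    return ("glBufferStorageEXT", (target, size, data, flags))


name, args = rewrite_buffer_data(GL_ARRAY_BUFFER, 4096, None, GL_DYNAMIC_DRAW)
print(name, hex(args[3]))
```

Storage created with glBufferStorageEXT is immutable, so a real interception layer would presumably also have to handle later calls that try to respecify the same buffer; that handling is outside this sketch.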
In an optional implementation, the first rendering instruction is located in a target function table and includes a plurality of first sub-instructions; correspondingly, the first rendering instruction meeting the preset condition includes:
at least one of the plurality of first sub-instructions included in the first rendering instruction meeting the preset condition.
It should be noted that, for the preset condition met by the at least one first sub-instruction, reference may be made to the foregoing description of the preset condition met by the first rendering instruction.
In an alternative implementation, the second rendering instruction includes a plurality of second sub-instructions, each corresponding to a first sub-instruction, and the executing the second rendering instruction in response to the first rendering instruction includes:
executing, in response to each first sub-instruction, the second sub-instruction corresponding to that first sub-instruction.
In an alternative implementation, at least one of the plurality of second sub-instructions is configured to invoke an interface of the target extension.
In an alternative implementation, at least one of the plurality of second sub-instructions is configured to invoke an interface of the target extension and is an instruction related to implementing a function of the target extension.
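The one-to-one correspondence between first and second sub-instructions can be sketched as a per-element translation. The names `SUB_INSTRUCTION_MAP` and `translate_sub_instructions` are hypothetical, and passing an unmapped sub-instruction through unchanged is an assumption of this sketch (it is one way in which only some of the second sub-instructions would invoke the target extension).

```python
from typing import List

# Hypothetical mapping from first sub-instructions to second sub-instructions.
SUB_INSTRUCTION_MAP = {
    "glBufferData": "glBufferStorageEXT",
}


def translate_sub_instructions(first_subs: List[str]) -> List[str]:
    """Return one second sub-instruction per first sub-instruction, in order."""
    # Unmapped names are passed through unchanged (assumption of this sketch).
    return [SUB_INSTRUCTION_MAP.get(name, name) for name in first_subs]


print(translate_sub_instructions(["glBindBuffer", "glBufferData"]))
```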
In an optional implementation, the first rendering instruction includes a first target rendering instruction and a second target rendering instruction, where the first target rendering instruction is a rendering instruction corresponding to a first frame image, the second target rendering instruction is a rendering instruction corresponding to a second frame image, and the first frame image is an image frame before the second frame image; correspondingly, the first rendering instruction meeting the preset condition includes:
the first target rendering instruction meeting the preset condition, and the second target rendering instruction meeting the preset condition.
Specifically, before each frame (the first frame image) ends, the scene is identified by analyzing the first target rendering instruction, and it is determined whether to enter optimization. For the intercepted first target rendering instruction, its specific parameters, number of calls, current OpenGL ES state information, recorded history information, current context characteristics, and the like are analyzed and then matched against the preset scenes in the extension optimization algorithm library (for details, refer to the foregoing description of determining whether the first rendering instruction meets the preset condition). If the first target rendering instruction meets the preset condition, the second target rendering instruction intercepted in the second frame image is analyzed in the same way: its specific parameters, number of calls, current OpenGL ES state information, recorded history information, current context characteristics, and the like are matched against the preset scenes in the extension optimization algorithm library. If the second target rendering instruction also meets the preset condition, the second rendering instruction is obtained.
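The two-frame confirmation described above can be sketched as a small state holder that enables optimization only when the condition is met in two consecutive frames. The class name `TwoFrameGate`, its method name, and the trivial condition used in the example are hypothetical; a real condition would analyze parameters, call counts, state and context as described above.

```python
from typing import Callable


class TwoFrameGate:
    """Enable optimization only after a match in two consecutive frames."""

    def __init__(self, meets_condition: Callable[[str], bool]):
        self._meets_condition = meets_condition
        self._previous_frame_matched = False

    def end_of_frame(self, intercepted_instruction: str) -> bool:
        """Analyze the instruction intercepted in the frame that is ending."""
        matched = self._meets_condition(intercepted_instruction)
        # Optimization starts only if the previous frame matched as well.
        enable = matched and self._previous_frame_matched
        self._previous_frame_matched = matched
        return enable


# Hypothetical condition: the instruction is a glBufferData call.
gate = TwoFrameGate(lambda name: name == "glBufferData")
print(gate.end_of_frame("glBufferData"))  # first frame image
print(gate.end_of_frame("glBufferData"))  # second frame image
```

Requiring two consecutive matches is a simple way to avoid switching on a one-off call pattern; a mismatch in any frame resets the gate.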
In a second aspect, the present application provides a rendering instruction processing apparatus. The apparatus is applied to a terminal device that includes a graphics processing unit (GPU), where the GPU supports a target extension of OpenGL ES (OpenGL for Embedded Systems), and the apparatus includes:
an instruction acquisition module, configured to obtain a first rendering instruction that is not implemented based on the target extension, where the first rendering instruction is used to implement a rendering task,
and to obtain a second rendering instruction according to the first rendering instruction, where the second rendering instruction is a rendering instruction implemented based on the target extension and is used to implement the rendering task; and
an instruction execution module, configured to trigger the GPU to execute the rendering task according to the second rendering instruction.
In an alternative implementation, the instruction acquisition module is configured to:
obtain the second rendering instruction corresponding to the first rendering instruction based on a mapping relationship, where the mapping relationship includes a preset correspondence between the first rendering instruction and the second rendering instruction.
In an alternative implementation, the device power consumption when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the device power consumption when the GPU is triggered to execute the rendering task according to the second rendering instruction; and/or
the amount of memory data copied when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the amount of memory data copied when the GPU is triggered to execute the rendering task according to the second rendering instruction; and/or
the GPU load when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the GPU load when the GPU is triggered to execute the rendering task according to the second rendering instruction; and/or
the load of the central processing unit (CPU) when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the CPU load when the GPU is triggered to execute the rendering task according to the second rendering instruction.
In an alternative implementation, the preset condition includes that the first rendering instruction belongs to a preset instruction set.
In an optional implementation, the first rendering instruction is configured to operate a target cache, and the preset condition includes at least one of:
the cache type of the target cache meets a preset condition;
history information about operations performed on the target cache by the first rendering instruction meets a preset condition; or
context information of the target cache meets a preset condition.
In an optional implementation, the first rendering instruction is configured to operate on a texture object, and the preset condition includes that context information of the texture object meets a preset condition.
In an alternative implementation, the second rendering instruction is configured to invoke an interface of the target extension.
In an optional implementation, the first rendering instruction includes a plurality of first sub-instructions, and correspondingly, the instruction obtaining module is configured to obtain the second rendering instruction if at least one of the plurality of first sub-instructions included in the first rendering instruction meets a preset condition.
In an alternative implementation, the second rendering instruction includes a plurality of second sub-instructions, each corresponding to one of the first sub-instructions, and the instruction execution module is configured to:
execute, in response to each first sub-instruction, the second sub-instruction corresponding to that first sub-instruction.
In an alternative implementation, at least one of the plurality of second sub-instructions is configured to invoke an interface of the target extension.
In an alternative implementation, at least one of the plurality of second sub-instructions is configured to invoke an interface of the target extension and is an instruction related to implementing a function of the target extension.
In an optional implementation, the first rendering instruction includes a first target rendering instruction and a second target rendering instruction, where the first target rendering instruction is a rendering instruction corresponding to a first frame image, the second target rendering instruction is a rendering instruction corresponding to a second frame image, and the first frame image is an image frame before the second frame image; correspondingly, the first rendering instruction meeting the preset condition includes:
the first target rendering instruction meeting the preset condition, and the second target rendering instruction meeting the preset condition.
In a third aspect, the present application provides a terminal device comprising a processor and a memory, where the processor retrieves code stored in the memory to perform the method of the first aspect or any of its optional implementations.
In a fourth aspect, the present application provides a non-transitory computer-readable storage medium containing computer instructions for performing the rendering instruction processing method of the first aspect or any of its optional implementations.
In a fifth aspect, the present application also provides a computer program product comprising computer instructions that, when executed by a processor of a host device, perform the operations performed by the processor in any one of the possible implementations of this embodiment.
The embodiment of the application provides a rendering instruction processing method applied to a terminal device. The terminal device includes a graphics processing unit (GPU), and the GPU supports a target extension of OpenGL ES (OpenGL for Embedded Systems). The method includes: obtaining a first rendering instruction that is not implemented based on the target extension, where the first rendering instruction is used to implement a rendering task; obtaining a second rendering instruction according to the first rendering instruction, where the second rendering instruction is a rendering instruction implemented based on the target extension and is used to implement the rendering task; and triggering the GPU to execute the rendering task according to the second rendering instruction. In this way, for a specific scene, a rendering instruction based on a GPU extension replaces the original rendering instruction that is not based on the GPU extension, so that the performance loss of the terminal device can be reduced.
Detailed Description
Embodiments of the present application will now be described with reference to the accompanying drawings. Evidently, the embodiments described are only some, but not all, of the embodiments of the present application. As one of ordinary skill in the art will appreciate, with the development of technology and the emergence of new scenarios, the technical solutions provided by the embodiments of the application are also applicable to similar technical problems.
The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules that are expressly listed or inherent to such process, method, article, or apparatus. The naming or numbering of the steps in the present application does not mean that the steps in the method flow must be executed according to the time/logic sequence indicated by the naming or numbering, and the execution sequence of the steps in the flow that are named or numbered may be changed according to the technical purpose to be achieved, so long as the same or similar technical effects can be achieved.
OpenGL (Open Graphics Library) is a professional graphics programming interface that defines a cross-language, cross-platform programming interface specification. It is applied in fields such as content creation, energy, entertainment, game development, manufacturing, pharmaceuticals, and virtual reality, and helps programmers develop high-performance, visually expressive graphics processing software on hardware devices such as personal computers (PCs), workstations, and supercomputers.
OpenGL ES (OpenGL for Embedded Systems) is a subset of the OpenGL three-dimensional graphics application programming interface (API), designed for embedded devices such as mobile phones and game consoles. OpenGL ES is based on OpenGL, with many non-essential features removed, such as glBegin/glEnd and complex primitives such as quadrilaterals (GL_QUADS) and polygons (GL_POLYGON). After years of development, there are two main versions: OpenGL ES 1.x for fixed-pipeline hardware and OpenGL ES 2.x for programmable-pipeline hardware. OpenGL ES 1.0 is based on the OpenGL 1.3 specification, OpenGL ES 1.1 is based on the OpenGL 1.5 specification, and OpenGL ES 2.0 is defined with reference to the OpenGL 2.0 specification.
The application scenarios of OpenGL ES include, but are not limited to, picture processing such as tone conversion and beautification, camera preview effect processing such as in beauty cameras, video processing, and 3D games.
In one implementation, OpenGL is implemented as a client-server system, where the application acts as the client and OpenGL acts as the server. As shown in FIG. 1, when the client program calls an OpenGL interface to implement 3D rendering, the OpenGL commands and data are buffered in random access memory (RAM). Under certain conditions, these commands and data are sent to video random access memory (VRAM) through the CPU clock; under the control of the GPU, graphics are rendered using the data and commands in the VRAM, and the result is stored in a frame buffer. The frames in the frame buffer are finally sent to a display, which displays the result. Modern graphics hardware systems also support sending data directly from RAM to VRAM, or from the frame buffer to RAM, without going through the CPU clock (for example, VBO and PBO in OpenGL).
In some OpenGL implementations, such as those associated with the X Window System, the client and the server execute on separate machines that are connected via a network. In this case, the client issues OpenGL commands, which are converted into a window-system-specific protocol and then sent to the server over the shared network.
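The buffering behaviour in the client-server description above (commands accumulate on the client side and are submitted in batches under some condition) can be sketched as below. This is a toy model only: the class `CommandBuffer`, the use of a fixed count as the flush condition, and the command strings are all hypothetical and do not describe how any particular OpenGL implementation works.

```python
from typing import List


class CommandBuffer:
    """Toy model: commands wait in a pending list until a flush submits them."""

    def __init__(self, flush_threshold: int):
        self.flush_threshold = flush_threshold  # assumed flush condition
        self.pending: List[str] = []            # stands in for commands held in RAM
        self.submitted: List[str] = []          # stands in for commands sent onward

    def record(self, command: str) -> None:
        self.pending.append(command)
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Submit everything recorded so far, preserving order.
        self.submitted.extend(self.pending)
        self.pending.clear()


buffer = CommandBuffer(flush_threshold=2)
buffer.record("glBindBuffer")
buffer.record("glBufferData")
buffer.record("glDrawArrays")
print(buffer.submitted, buffer.pending)
```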
In the field of games, frame drops and power consumption have always been pain points for terminal manufacturers. Correspondingly, graphics rendering technology keeps advancing: OpenGL ES has evolved from 1.0 to 3.2, supports more features, and continues to improve in rendering efficiency. GPU manufacturers are actively promoting various technologies that reduce load and improve image quality, and are proposing more efficient OpenGL ES extensions. A new OpenGL ES extension can improve rendering efficiency, but an extension has a long way to go before it enters the standard. For example, a GPU manufacturer proposes a new extension that improves rendering efficiency in a specific scene, promotes it to other GPU manufacturers so that it becomes a generic extension, then seeks approval from the Khronos organization, and finally pushes for its incorporation into the standard.
However, before an extension becomes part of the standard or is approved by the standards organization, game engines generally do not adapt quickly to use it. That is, even if the GPU of the terminal device currently running the game supports a certain non-standard OpenGL ES extension, because the extension is not standard, the rendering instructions in the game are not based on the extension; the same rendering task is instead implemented with other instructions, and the final rendering efficiency is reduced (for example, a poorer display effect after rendering, or higher power consumption of the terminal device). To solve the above problems, the present application proposes a rendering instruction processing method.
A system architecture diagram of an embodiment of the present application is described next.
FIG. 2 is a block diagram illustrating a computing device 30 that may implement the techniques described in this disclosure. The computing device 30 may be the rendering instruction processing device in an embodiment of the application. Examples of computing device 30 include, but are not limited to, wireless devices, mobile or cellular telephones (including so-called smartphones), personal digital assistants (PDAs), video game consoles that include video displays, mobile video game devices, mobile video conferencing units, laptop computers, desktop computers, television set-top boxes, tablet computing devices, electronic book readers, fixed or mobile media players, and the like.
In the example of fig. 2, computing device 30 includes a central processing unit (central processing unit, CPU) 32 having a CPU memory 34, a graphics processing unit (graphics processing unit, GPU) 36 having a GPU memory 38 and one or more shading units 40, a display unit 42, a display buffer unit 44, a user interface unit 46, and a storage unit 48. In addition, storage unit 48 may store GPU driver 50 with compiler 54, GPU program 52, and locally compiled GPU program 56.
Examples of CPU 32 include, but are not limited to, a digital signal processor (DSP), a general purpose microprocessor, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or other equivalent integrated or discrete logic circuit. Although CPU 32 and GPU 36 are illustrated as separate units in the example of fig. 2, in some examples CPU 32 and GPU 36 may be integrated into a single unit. The CPU 32 may execute one or more application programs. Examples of applications may include web browsers, email applications, spreadsheets, video games, audio and/or video capturing, playback, or editing applications, or other applications that initiate the generation of image data to be presented via display unit 42.
In the example shown in FIG. 2, CPU 32 includes CPU memory 34. CPU memory 34 may represent an on-chip storage device or memory used in executing machine or object code. CPU memory 34 may include hardware memory registers, each capable of storing a fixed number of digital bits. The CPU 32 may be capable of reading values from the local CPU memory 34 or writing values to the local CPU memory 34 more quickly than reading values from the storage unit 48 (which may be accessed, for example, via a system bus) or writing values to the storage unit 48.
GPU 36 represents one or more specialized processors for performing graphics operations. That is, for example, GPU 36 may be a dedicated hardware unit having fixed functionality and programmable components for rendering graphics and executing GPU applications. GPU 36 may also include a DSP, general purpose microprocessor, ASIC, FPGA, or other equivalent integrated or discrete logic circuitry.
GPU 36 also includes GPU memory 38, which may represent on-chip storage or memory used in executing machine or object code. GPU memory 38 may include hardware memory registers, each capable of storing a fixed number of digital bits. GPU 36 may be capable of reading values from local GPU memory 38 or writing values to local GPU memory 38 more quickly than reading values from storage unit 48 (which may be accessed, for example, via a system bus) or writing values to storage unit 48.
GPU 36 also includes a shading unit 40. As described in more detail below, shading unit 40 may be configured as a programmable pipeline of processing components. In some examples, shading unit 40 may be referred to as a "shader processor" or "unified shader" and may perform geometry, vertex, pixel, or other shading operations to render graphics. Shading unit 40 may include one or more components not specifically shown in fig. 2 for clarity, such as components for fetching and decoding instructions, one or more Arithmetic Logic Units (ALUs) for performing arithmetic computations, and one or more memories, caches, or registers.
Display unit 42 represents a unit capable of displaying video data, images, text, or any other type of data. The display unit 42 may include a liquid crystal display (LCD), a light-emitting diode (LED) display, an organic light-emitting diode (OLED) display, an active-matrix organic light-emitting diode (AMOLED) display, and the like.
Display buffer unit 44 represents a memory or storage device dedicated to storing data for display unit 42 for presentation of images (e.g., photographs or video frames). Display buffer unit 44 may represent a two-dimensional buffer containing a plurality of storage locations. The number of storage locations within display buffer unit 44 may be substantially similar to the number of pixels to be displayed on display unit 42. For example, if display unit 42 is configured to include 640x480 pixels, display buffer unit 44 may include 640x480 storage locations. Display buffer unit 44 may store the final pixel value for each of the pixels processed by GPU 36. Display unit 42 may retrieve the final pixel values from display buffer unit 44 and display the final image based on the pixel values stored in display buffer unit 44.
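The 640x480 example above works out to 307,200 storage locations, one per pixel. The short sketch below shows the arithmetic and one possible in-memory layout; the function name `make_display_buffer` and the use of a list of rows with an initial pixel value of 0 are assumptions made only for illustration.

```python
from typing import List


def make_display_buffer(width: int, height: int) -> List[List[int]]:
    """Build a two-dimensional buffer with one storage location per pixel."""
    # Each row is created separately so that rows do not share storage.
    return [[0 for _ in range(width)] for _ in range(height)]


display_buffer = make_display_buffer(640, 480)
location_count = sum(len(row) for row in display_buffer)
print(location_count)  # 640 * 480 = 307200
```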
User interface unit 46 represents a unit that a user may use to interact with, or otherwise communicate with, other units of computing device 30 (e.g., CPU 32). Examples of user interface unit 46 include, but are not limited to, a trackball, a mouse, a keyboard, and other types of input devices. The user interface unit 46 may also be a touch screen and may be incorporated as part of the display unit 42.
The storage unit 48 may include one or more computer-readable storage media. Examples of storage unit 48 include, but are not limited to, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer or processor.
In some example implementations, the storage unit 48 may contain instructions that cause the CPU 32 and/or GPU 36 to perform the functions attributed to the CPU 32 and GPU 36 in the present invention. In some examples, the storage unit 48 may be considered a non-transitory storage medium. The term "non-transitory" may indicate that the storage medium is not embodied in a carrier wave or propagated signal. However, the term "non-transitory" should not be construed to mean that the storage unit 48 is not removable. As one example, the storage unit 48 may be removed from the computing device 30 and moved to another device. As another example, a storage unit substantially similar to storage unit 48 may be inserted into computing device 30. In some examples, a non-transitory storage medium may store data (e.g., in RAM) that may change over time.
Storage unit 48 stores GPU driver 50, compiler 54, GPU program 52, and native compiled GPU program 56. GPU driver 50 represents a computer program or executable code that provides an interface to access GPU 36. CPU 32 executes GPU driver 50, or portions thereof, to interface with GPU 36, and for this reason GPU driver 50 is shown in the example of fig. 2 as a dashed box labeled "GPU driver 50" within CPU 32. GPU driver 50 may access programs or other executable files executed by CPU 32, including GPU program 52.
GPU program 52 may include code written in a High Level (HL) programming language (e.g., using an Application Programming Interface (API)). Examples of APIs include open graphics library (OpenGL). In general, an API contains a predetermined standardized set of commands that are executed by associated hardware. The API commands allow a user to instruct the hardware components of the GPU to execute the commands without the user knowing the specifics of the hardware components.
GPU program 52 may call or otherwise include one or more functions provided by GPU driver 50. CPU 32 generally executes the program in which GPU program 52 is embedded and, upon encountering GPU program 52, passes GPU program 52 to GPU driver 50 (e.g., in the form of a command stream). CPU 32 executes GPU driver 50 in this context to process GPU program 52. For example, GPU driver 50 may process GPU program 52 by compiling it into object code or machine code executable by GPU 36. This object code is shown in the example of fig. 2 as native compiled GPU program 56.
In some examples, compiler 54 may operate in real-time or near real-time to compile GPU program 52 during execution of the program in which GPU program 52 is embedded. Compiler 54 generally represents a module that reduces HL instructions defined in accordance with the HL programming language to LL instructions of a low-level (LL) programming language. After compilation, these LL instructions can be executed by a particular type of processor or other type of hardware, such as an FPGA or ASIC (including, e.g., CPU 32 and GPU 36).
In the example of FIG. 2, compiler 54 may receive GPU program 52 from CPU 32 when executing HL code comprising GPU program 52. Compiler 54 may compile GPU program 52 into a native compiled GPU program 56 that conforms to the LL programming language. Compiler 54 then outputs a native compiled GPU program 56 that includes LL instructions.
GPU 36 generally receives native compiled GPU program 56 (as shown by "native compiled GPU program 56" marked by a dashed box within GPU 36), after which, in some examples, GPU 36 renders the image and outputs the rendered portion of the image to display buffer unit 44. For example, GPU 36 may generate a plurality of primitives to be displayed at display unit 42. Primitives may include one or more lines (including curves, splines, etc.), points, circles, ellipses, polygons (where a polygon is generally defined as a set of one or more triangles), or any other two-dimensional (2D) primitive. The term "primitive" may also refer to three-dimensional (3D) primitives such as cubes, cylinders, spheres, cones, pyramids, tori, and the like. In general, the term "primitive" refers to any geometric shape or element rendered by GPU 36 for display as an image (or frame in the context of video data) via display unit 42.
GPU 36 may transform the primitives, or other state data of the primitives (e.g., data that defines texture, brightness, camera configuration, or other aspects of the primitives), into a so-called "world space" by applying one or more model transforms (which may also be specified in the state data). Once transformed, GPU 36 may apply a view transform of the active camera (which may also be specified in the state data defining the camera) to transform the coordinates of the primitives and lights into the camera or eye space. GPU 36 may also perform vertex shading to render the appearance of the primitives in view of any active lights. GPU 36 may perform vertex shading in one or more of the above model, world, or view spaces (although vertex shading is typically performed in world space).
Once the primitives are shaded, GPU 36 may perform a projection to project the image into (as one example) a unit cube having extreme points at (-1, -1, -1) and (1, 1, 1). This unit cube is commonly referred to as a canonical view volume. After transforming the model from eye space to the canonical view volume, GPU 36 may perform clipping to remove any primitives that do not reside at least partially in the view volume. In other words, GPU 36 may remove any primitives that are not within the camera frame. GPU 36 may then map the coordinates of the primitives from the view volume to screen space, effectively reducing the 3D coordinates of the primitives to the 2D coordinates of the screen.
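The projection, clipping test, and screen mapping described above can be illustrated with a minimal sketch. It assumes an OpenGL-style perspective matrix and tests points after the perspective divide; the function names and the 640x480 screen size are illustrative only, and a real GPU clips in clip space before the divide.

```python
import math

def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style perspective matrix (row-major) mapping the frustum to the [-1, 1] cube."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def to_ndc(matrix, point):
    """Project an eye-space point and apply the perspective divide (w must be non-zero)."""
    v = (point[0], point[1], point[2], 1.0)
    clip = [sum(matrix[r][c] * v[c] for c in range(4)) for r in range(4)]
    return tuple(component / clip[3] for component in clip[:3])

def inside_canonical_volume(ndc):
    """True if the point lies inside the cube spanning (-1, -1, -1) to (1, 1, 1)."""
    return all(-1.0 <= component <= 1.0 for component in ndc)

def to_screen(ndc, width, height):
    """Map normalized x/y to pixel coordinates, dropping the depth component."""
    return ((ndc[0] + 1.0) * 0.5 * width, (ndc[1] + 1.0) * 0.5 * height)

m = perspective(90.0, 640 / 480, 1.0, 100.0)
center = to_ndc(m, (0.0, 0.0, -50.0))
print(inside_canonical_volume(center), to_screen(center, 640, 480))
```

A point on the camera axis stays inside the volume and lands at the screen center, while a point far off to the side falls outside the cube and would be removed by clipping.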
Given the transformed and projected vertices of the primitives defined with their associated shading data, GPU 36 may then rasterize the primitives. For example, GPU 36 may calculate and set the color of the pixels of the screen covered by the primitive. During rasterization, GPU 36 may apply any texture associated with the primitive (where the texture may include state data). GPU 36 may also perform a Z-buffer algorithm (also referred to as a depth test) during rasterization to determine if any primitives and/or objects are obscured by any other objects. The Z-buffer algorithm orders the primitives according to their depth so that GPU 36 knows the order in which each primitive is drawn onto the screen. GPU 36 outputs the rendered pixels to display buffer unit 44.
The display buffer unit 44 may temporarily store the rendered pixels of the rendered image until the entire image is rendered. In this context, the display buffer unit 44 may be regarded as an image frame buffer. The display buffer unit 44 may then transmit the rendered image to be displayed on the display unit 42. In some alternative examples, rather than temporarily storing the image in display buffer unit 44, GPU 36 may output the rendered portion of the image directly to display unit 42 for display. Otherwise, the display unit 42 may then display the image stored in the display buffer unit 44.
In the embodiment of the present application, when a program that requires graphics rendering runs, the CPU 32 may acquire the rendering instructions for each frame of image from the central processing unit memory and execute them to implement graphics rendering. In particular, the CPU 32 may obtain a function table (hook table) for each frame image from the central processing unit memory, where the function table may include a plurality of rendering instructions, and the CPU may implement graphics rendering by executing each rendering instruction in the function table (the specific execution flow may refer to the description in the corresponding embodiment of fig. 2, which is not repeated here).
In particular, a GPU may support multiple extensions in OpenGL ES; however, application developers may not write their programs to use the more efficient extensions for graphics rendering, for example because those extensions have not been incorporated into the standard.
In order to solve the above-mentioned problems, a detailed description is given next of a rendering instruction processing method provided in an embodiment of the present application.
Referring to fig. 3, fig. 3 is a flowchart of a rendering instruction processing method according to an embodiment of the present application, where, as shown in fig. 3, the rendering instruction processing method according to the embodiment of the present application includes:
301. Acquire a first rendering instruction that is not implemented based on the target extension, where the first rendering instruction is used to implement a rendering task.
Taking a game as an example of an application program that requires graphics rendering, when the game is started, the rendering instructions for the rendering tasks in the game can be acquired, and the rendering tasks corresponding to those rendering instructions are then executed. The first rendering instruction is one of the acquired rendering instructions.
In the embodiment of the application, the first rendering instruction is not realized based on the target extension.
302. Acquire a second rendering instruction according to the first rendering instruction, where the second rendering instruction is a rendering instruction implemented based on the target extension and is used to implement the same rendering task.
In the embodiment of the application, if the first rendering instruction meets the preset condition, a second rendering instruction is acquired according to the first rendering instruction, wherein the second rendering instruction is a rendering instruction realized based on the target extension.
In the embodiment of the application, in the process of replacing the instruction which does not adopt the target extension in the application program code with the instruction based on extension implementation, firstly, the instruction which needs to be converted needs to be identified.
In the embodiment of the application, after the first rendering instruction is acquired, the first rendering instruction is analyzed to determine whether the current scene meets the preset requirement, and if the first rendering instruction meets the preset condition, the first rendering instruction is determined to be required to be converted.
In the embodiment of the application, the terminal device may be preconfigured with a plurality of rules that define the preset conditions an instruction should meet. The first rendering instruction is used to operate a target cache, and the preset conditions include at least one of the following: the first rendering instruction belongs to a preset instruction set; the cache type of the target cache meets a preset condition; the number of historical operations performed on the target cache by the first rendering instruction is greater than a preset value; or the context information of the target cache meets a preset condition.
The first rendering instruction may include a plurality of first sub-instructions; accordingly, it is determined whether at least one of the first sub-instructions included in the first rendering instruction meets the preset condition.
Specifically, for the preset condition that the first rendering instruction belongs to the preset instruction set: some instructions are not implemented based on the target extension and have poor rendering efficiency, while the current GPU supports an extension on which another instruction that implements the same rendering task can be based. In this case, the terminal device may designate such instructions in advance as candidates for optimization (e.g., the first rendering instruction in the present embodiment) and, correspondingly, designate in advance the optimized instructions that replace them (e.g., the second rendering instruction in the present embodiment).
In one implementation, the first rendering instruction may be a separate instruction.
For example, when the glMapBufferRange instruction and the glBufferData instruction in OpenGL ES are executed in a scene where vertex and index data are frequently updated, a large number of map operations occur, which causes high GPU load and CPU load. As another example, when the glBlitFramebuffer instruction is executed, a large amount of data is copied in DDR memory. The preset instruction set in the embodiment of the present application may be an instruction set including the above instructions.
It should be noted that, whether the instruction belongs to the preset instruction set may be directly determined by identifying the instruction name.
In one implementation, the first rendering instruction may be an instruction set including a plurality of instructions, and the preset instruction set may specify a plurality of instructions together with the order in which they occur. In this case, the first rendering instruction belonging to the preset instruction set can be understood to mean that both the names of the instructions included in the first rendering instruction and their order of execution match the preset instruction set. For example, the preset instruction set includes the instruction subset { glBindBuffer; glBufferData; glMapBufferRange; memcpy; glUnmapBuffer }.
It should be noted that belonging to the preset instruction set may be only one of the conditions under which the first rendering instruction is considered to meet the preset condition.
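Matching both the instruction names and their order of execution can be sketched as a search for a contiguous subsequence in the intercepted call stream. This is only an illustration: the function name is hypothetical, and requiring the calls to be strictly contiguous is an assumption, since the embodiment only states that names and order must match.

```python
# Ordered instruction subset taken from the example in this embodiment.
PRESET_SEQUENCE = ("glBindBuffer", "glBufferData", "glMapBufferRange",
                   "memcpy", "glUnmapBuffer")

def find_preset_sequence(calls, preset=PRESET_SEQUENCE):
    """Return the index where the preset sequence first occurs in order, or -1."""
    length = len(preset)
    for start in range(len(calls) - length + 1):
        if tuple(calls[start:start + length]) == preset:
            return start
    return -1

frame_calls = ["glUseProgram", "glBindBuffer", "glBufferData",
               "glMapBufferRange", "memcpy", "glUnmapBuffer", "glDrawElements"]
print(find_preset_sequence(frame_calls))
```

A frame that contains only some of the instructions, or contains them in a different order, yields -1 and is not treated as a candidate for conversion.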
In one implementation, the first rendering instruction is configured to operate a target cache, where the target cache may be one of a framebuffer, a buffer, and a renderbuffer.
Specifically, the cache type of the target cache meeting a preset condition may refer to whether the type of data stored in the target cache meets the condition, or whether certain attributes of the target cache meet the condition. For example, when the first rendering instruction is a glBufferData instruction, it may be determined whether the target cache is used to store vertex or index data and whether its usage is dynamic update; if so, the cache type of the target cache is determined to meet the preset condition.
Specifically, for the condition that the number of historical operations performed on the target cache by the first rendering instruction is greater than a preset value, that number may be obtained and compared with the preset value. For example, when the first rendering instruction is a glBlitFramebuffer instruction, it may be determined whether the number of times (frequency) the read framebuffer to which it is bound has been called within a certain period of history is greater than a preset value.
The context information of the target cache may also be obtained, where the context information can represent a rendering state. For example, when the first rendering instruction is a glBlitFramebuffer instruction, it may be determined whether the sample count of the color and depth-stencil attachments of the currently bound read framebuffer is greater than 1, and so on.
In the embodiment of the application, whether the current scene meets the condition can be judged based on at least one of the following: whether the first rendering instruction belongs to the preset instruction set, whether the cache type of the target cache meets the preset condition, whether the number of historical operations performed on the target cache by the first rendering instruction is greater than the preset value, and whether the context information of the target cache meets the preset condition. On this basis, it is determined whether the first rendering instruction meets the preset condition.
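The four kinds of checks can be sketched as independent predicates over a record describing an intercepted call. Everything below is a hypothetical illustration: the record fields, the threshold of 30, and the specific targets and usage value are assumptions, not values given by the embodiment.

```python
from dataclasses import dataclass

@dataclass
class InterceptedCall:
    name: str            # instruction name, e.g. "glBufferData"
    target: str          # binding target of the cache being operated on
    usage: str           # usage hint of the cache
    history_count: int   # recorded number of past operations on the cache
    sample_count: int    # sample count of the bound attachments

PRESET_INSTRUCTIONS = {"glBufferData", "glMapBufferRange", "glBlitFramebuffer"}
HISTORY_THRESHOLD = 30   # hypothetical preset value

def in_preset_set(call):
    return call.name in PRESET_INSTRUCTIONS

def cache_type_ok(call):
    # Vertex or index data whose usage is dynamic update (assumed encoding).
    return (call.target in {"GL_ARRAY_BUFFER", "GL_ELEMENT_ARRAY_BUFFER"}
            and call.usage == "GL_DYNAMIC_DRAW")

def history_ok(call):
    return call.history_count > HISTORY_THRESHOLD

def context_ok(call):
    return call.sample_count > 1

RULES = (in_preset_set, cache_type_ok, history_ok, context_ok)

def meets_preset_condition(call, rules=RULES, require_all=False):
    """Evaluate the configured rules; by default a single satisfied rule is enough."""
    results = [rule(call) for rule in rules]
    return all(results) if require_all else any(results)

call = InterceptedCall("glBufferData", "GL_ARRAY_BUFFER", "GL_DYNAMIC_DRAW", 120, 1)
print(meets_preset_condition(call))
```

The default follows the "at least one" wording of the embodiment; the `require_all` flag is an added assumption showing how a stricter rule set could be configured per algorithm.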
In the embodiment of the application, if the first rendering instruction meets the preset condition, a second rendering instruction can be acquired, wherein the second rendering instruction is configured to be realized based on the target extension.
Specifically, a mapping relationship between a first rendering instruction and a second rendering instruction can be preconfigured, wherein the mapping relationship comprises a preset corresponding relationship between the first rendering instruction and the second rendering instruction, and the second rendering instruction corresponding to the first rendering instruction can be obtained from the mapping relationship after the first rendering instruction is determined to meet a preset condition.
In one implementation, the second rendering instruction is configured to invoke an interface of the target extension.
Illustratively, the first rendering instruction may be a glBufferData instruction and the second rendering instruction may be a glBufferStorageEXT instruction implemented based on the EXT_buffer_storage extension. It should be noted that the GL_MAP_PERSISTENT_BIT_EXT, GL_MAP_COHERENT_BIT_EXT and GL_DYNAMIC_STORAGE_BIT_EXT attributes may be additionally specified in the glBufferStorageEXT instruction to create buffers that support persistent address mapping.
In one implementation, the second rendering instruction includes a plurality of second sub-instructions, each corresponding to one of the first sub-instructions.
In one implementation, the first rendering instruction is configured to operate a texture object, and the preset condition includes that context information of the texture object satisfies a preset condition.
By way of example, referring to FIG. 4, FIG. 4 is a schematic illustration of rendering instruction replacement in the present embodiment. As shown in FIG. 4, on the first glMapBufferRange call, the first rendering instruction may be the instruction set { glBindBuffer; glBufferData; glMapBufferRange; memcpy; glUnmapBuffer } and, correspondingly, the second rendering instruction may be the instruction set { glBindBuffer; glBufferStorageEXT; glMapBufferRange; memcpy; return }, where glBindBuffer corresponds to glBindBuffer, glBufferData corresponds to glBufferStorageEXT, glMapBufferRange corresponds to glMapBufferRange, memcpy corresponds to memcpy, and glUnmapBuffer corresponds to return. It can be seen that at least one of the plurality of second sub-instructions is configured to invoke an interface of the target extension and is an instruction related to implementing the function of the target extension.
On subsequent glMapBufferRange calls, the first rendering instruction may be the instruction set { glBindBuffer; glBufferData; glMapBufferRange; memcpy; glUnmapBuffer }, and the corresponding second rendering instruction may be the instruction set { glBindBuffer; return; return address; memcpy; return }, where glBindBuffer corresponds to glBindBuffer, glBufferData corresponds to return, glMapBufferRange corresponds to return address, memcpy corresponds to memcpy, and glUnmapBuffer corresponds to return.
For another example, the first rendering instruction may be a glBlitFramebuffer instruction and the second rendering instruction may be a glRenderbufferStorageMultisampleEXT instruction implemented based on the EXT_multisampled_render_to_texture extension.
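The preconfigured mapping relationship can be pictured as a lookup table keyed by the first rendering instruction. The sketch below uses the two instruction pairs given as examples in this embodiment; the table layout, the function name, and the fallback to the original instruction are assumptions made for illustration.

```python
# first instruction -> (second instruction, extension string it depends on)
REPLACEMENTS = {
    "glBufferData": ("glBufferStorageEXT", "GL_EXT_buffer_storage"),
    "glBlitFramebuffer": ("glRenderbufferStorageMultisampleEXT",
                          "GL_EXT_multisampled_render_to_texture"),
}

def select_instruction(first, meets_condition, supported_extensions):
    """Return the replacement when one is mapped, the condition holds, and the
    GPU supports the required extension; otherwise keep the original instruction."""
    entry = REPLACEMENTS.get(first)
    if entry is None or not meets_condition:
        return first
    second, required_extension = entry
    return second if required_extension in supported_extensions else first

supported = {"GL_EXT_buffer_storage"}
print(select_instruction("glBufferData", True, supported))
print(select_instruction("glBlitFramebuffer", True, supported))
```

Falling back to the first instruction whenever any requirement is missing keeps behavior unchanged on devices whose GPU does not support the extension.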
In the embodiment of the application, at least one of the following holds when the GPU is triggered to execute the rendering task according to the first rendering instruction, as compared with triggering the GPU to execute the rendering task according to the second rendering instruction: the device power consumption is larger; the corresponding amount of memory data copied is larger; the corresponding GPU load is larger; and/or the corresponding CPU load is larger.
In the embodiment of the application, the first rendering instruction includes a first target rendering instruction and a second target rendering instruction, where the first target rendering instruction is a rendering instruction corresponding to a first frame image, the second target rendering instruction is a rendering instruction corresponding to a second frame image, and the first frame image is an image frame before the second frame image. The second rendering instruction is acquired if the first target rendering instruction meets the preset condition and the second target rendering instruction also meets the preset condition.
Specifically, before each frame (the first frame image) ends, the scene is identified by analyzing the first target rendering instruction, and whether to enter optimization is decided. For the intercepted first target rendering instruction, its specific parameters, call count, current OpenGL ES state information, recorded history information, current context characteristics, and the like are analyzed and then matched against the preset scenes in the extension optimization algorithm library (for details, refer to the part of the above embodiment that determines whether the first rendering instruction meets the preset condition). If the first target rendering instruction meets the preset condition, the second target rendering instruction intercepted in the second frame image is analyzed in the same way and matched against the preset scenes in the extension optimization algorithm library. If the second target rendering instruction also meets the preset condition, the second rendering instruction is acquired.
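The two-frame decision can be sketched as a small gate that is armed at the end of a frame in which the scene matched and converts matching instructions only in the frame that follows. The class and method names are hypothetical, and treating one matching instruction per frame as sufficient to arm the gate is an assumption.

```python
class FrameGate:
    """Allows conversion only when the previous frame and the current instruction both match."""

    def __init__(self):
        self.previous_frame_matched = False
        self.current_frame_matched = False

    def on_instruction(self, matches_scene):
        """Record the match result and report whether this instruction should be converted."""
        if matches_scene:
            self.current_frame_matched = True
        return self.previous_frame_matched and matches_scene

    def on_frame_end(self):
        """Called before each frame ends; carries the decision into the next frame."""
        self.previous_frame_matched = self.current_frame_matched
        self.current_frame_matched = False

gate = FrameGate()
first_frame = gate.on_instruction(True)    # first frame image: scene matches
gate.on_frame_end()
second_frame = gate.on_instruction(True)   # second frame image: scene matches again
print(first_frame, second_frame)
```

In this sketch the first frame is never converted, and a frame without a match disarms the gate so that an occasional scene does not trigger conversion.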
303. Trigger the GPU to execute the rendering task according to the second rendering instruction.
In the embodiment of the application, after the CPU acquires the second rendering instruction, the second rendering instruction is transmitted to the GPU driver. For example, the GPU driver may compile the second rendering instructions into GPU-executable objects or machine code, after which these second rendering instructions can be executed by the GPU, which may perform the corresponding rendering tasks after receiving the compiled second rendering instructions.
In an alternative implementation, the code corresponding to the second rendering instruction may be executed with the first rendering instruction as a pointer.
In the embodiment of the application, the second rendering instruction can comprise a plurality of second sub-instructions, each second sub-instruction corresponds to one first sub-instruction, and the second sub-instruction corresponding to each first sub-instruction can be executed in response to each first sub-instruction. That is, after the first rendering instruction from the application program is acquired, the code corresponding to the first rendering instruction is not executed, but the code corresponding to the second rendering instruction is executed with the first rendering instruction as a pointer. The executing of the second rendering instruction may be understood as sending the second rendering instruction to the corresponding driver, and executing, based on the GPU, a rendering task corresponding to the second rendering instruction.
Taking a game rendering scene as an example, in the embodiment of the application, the game can, without modifying its code, reduce the terminal performance loss caused by frequent memory mapping and the like, or reduce the CPU/GPU load, by executing instructions implemented based on GPU extensions. Meanwhile, the embodiment of the application is a general extension optimization framework based on GPU extensions: performance is improved on a mobile phone that supports the extension, the optimization is simply not enabled on a mobile phone that does not support the extension, and compatibility is therefore better.
The embodiment of the application provides a rendering instruction processing method applied to a terminal device, where the terminal device includes a graphics processor (GPU) and the GPU supports a target extension in the open graphics library for embedded systems (OpenGL ES). The method includes: obtaining a first rendering instruction; obtaining a second rendering instruction if the first rendering instruction meets a preset condition, where the second rendering instruction is configured to be implemented based on the target extension; and executing the second rendering instruction in response to the first rendering instruction. In this way, for a specific scene, a rendering instruction based on a GPU extension replaces the rendering instruction that was not originally based on the GPU extension, so that the performance loss of the terminal device can be reduced.
Next, taking a game-rendered scene as an example, an embodiment is given that includes more details than the corresponding embodiment of fig. 3.
Referring to fig. 5, fig. 5 is a schematic diagram of a rendering instruction processing method 500 provided by an embodiment of the present application. When a game starts 501, an initialization process may be performed 502. The initialization process may include traversing the directory where the extension optimization algorithms are located and loading all installed dynamic library files. Each extension optimization algorithm in the embodiment of the present application may include the following functions: CheckExtEnabled, which checks whether the extension optimization algorithm is supported; onEnterHookMode, which is called after the extension optimization algorithm is enabled and replaces the hook table of OpenGL ES; onExitHookMode, which is called after the extension optimization algorithm is turned off and restores the hook table of OpenGL ES; and onInterceptEndCommand, which is called before the end of each frame to complete scene recognition and determine whether subsequent frames enter optimization.
The function pointers of the external interfaces of all extension optimization algorithms are obtained through GetProcAddress 503, and the enable-detection function CheckExtEnabled of each extension optimization algorithm is called to determine whether to enable the algorithm. Specifically, all OpenGL ES extensions supported by the GPU of the current device may be obtained, and it is checked whether the extensions on which the current extension optimization algorithm depends are supported; if so, onEnterHookMode is enabled 505.
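The support check can be sketched as a set comparison over the extension names reported by the driver. In OpenGL ES the list is available as a space-separated string from glGetString(GL_EXTENSIONS); the sketch takes that string as a plain argument, and the function signatures are assumptions rather than the actual interface of CheckExtEnabled.

```python
def parse_extensions(extension_string):
    """Split a space-separated extension string into a set of exact names."""
    return set(extension_string.split())

def check_ext_enabled(extension_string, required_extensions):
    """True only if every extension the algorithm depends on is reported by the GPU."""
    return set(required_extensions) <= parse_extensions(extension_string)

reported = "GL_OES_depth24 GL_EXT_buffer_storage GL_OES_texture_npot"
print(check_ext_enabled(reported, ["GL_EXT_buffer_storage"]))
```

Comparing whole tokens rather than searching for a substring avoids false positives when one extension name is a prefix of another.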
Before the end of each frame, the onInterceptEndCommand function is called 506 to identify the scene and decide whether to enter optimization. All OpenGL ES and EGL function calls are intercepted and routed to the instruction analysis module of the extension optimization framework. For each intercepted instruction, its specific parameters, call count, current OpenGL ES state information, recorded historical information and current context characteristics are analyzed and matched against the preset scenes 507 in the extension optimization algorithm library.
Each algorithm presets the scenes it supports. A scene is a set of rules, which may include, for example:
1. Instruction names: each scene is a combination of a series of instructions, and the approximate scene can be inferred from the instructions;
2. Instruction parameters: the scene range is further narrowed based on the specific parameter information of the instructions;
3. Instruction call counts: the number of instruction calls within a period of time is analyzed, so that only common scenes are optimized and scenes that appear only occasionally are not;
4. Context state: the current OpenGL ES context state information is obtained, and only scenes within the expected range are optimized;
5. Historical information: instruction calls over a period of time and similar information are recorded and used as a reference for matching.
The intercepted instructions are matched against the preset scenes in the extension optimization algorithm library. If a scene matches, instruction conversion optimization begins. The algorithm enabling function OnEnterHookMode of each extension optimization algorithm is invoked to replace the GL function table of the current rendering thread.
In the next frame, if optimization can be entered, instruction conversion 508 is performed and the next frame is rendered 509.
Thereafter, when the optimization is turned off, the function table is restored 510.
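Replacing and restoring the function table can be sketched with a dictionary of callables standing in for the GL function table of the rendering thread. The class below is a simplified model under that assumption; its method names only mirror onEnterHookMode and onExitHookMode and do not reproduce their real signatures.

```python
class HookTable:
    """Dictionary-based stand-in for the per-thread GL function table."""

    def __init__(self, functions):
        self.active = dict(functions)
        self._saved = None

    def on_enter_hook_mode(self, replacements):
        """Save the original entries once, then install the converting functions."""
        if self._saved is None:
            self._saved = dict(self.active)
        self.active.update(replacements)

    def on_exit_hook_mode(self):
        """Restore the table that was active before optimization was enabled."""
        if self._saved is not None:
            self.active = self._saved
            self._saved = None

    def call(self, name, *args):
        return self.active[name](*args)

table = HookTable({"glBufferData": lambda: "original glBufferData"})
table.on_enter_hook_mode({"glBufferData": lambda: "converted to glBufferStorageEXT"})
print(table.call("glBufferData"))
table.on_exit_hook_mode()
print(table.call("glBufferData"))
```

Because the application keeps calling the same entry name, enabling or disabling the optimization requires no change to the application code.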
An embodiment comprising more details than the corresponding embodiment of fig. 3 is given next.
Referring to fig. 6a, fig. 6a is a schematic diagram of rendering instruction processing provided in an embodiment of the present application. As shown in fig. 6a, conversion of a rendering instruction may be implemented by the extension optimization framework in this embodiment. Unlike the embodiment of fig. 3, fig. 6a shows a flow in which multiple optimization decisions and optimization executions may be performed.
As shown in fig. 6a, the first rendering instruction (glFunction) acquired by the extension optimization framework may undergo one optimization based on scene matching, conversion enabling (which may be described with reference to the corresponding embodiment of fig. 5) and instruction conversion, and then undergo another optimization based on scene matching, conversion enabling and instruction conversion. The two optimizations differ in that, based on different scene identifications, the instruction is replaced with rendering instructions that rely on different extensions.
Referring more specifically to fig. 6b, fig. 6b is a schematic illustration of rendering instruction processing provided by an embodiment of the present application. As shown in fig. 6b, for glFunction1, based on scene matching, conversion enabling (which may be specifically described with reference to the corresponding embodiment of fig. 5), and instruction conversion, only the optimization corresponding to algorithm 2 is implemented (algorithm 1 may not apply because scene matching failed). For glFunction2, based on scene matching, conversion enabling and instruction conversion, the optimizations corresponding to both algorithm 1 and algorithm 2 are implemented. For glFunction3, based on scene matching, conversion enabling, and instruction conversion, no optimization is implemented (algorithm 1 and algorithm 2 may not apply because scene matching failed). For glFunction4, based on scene matching, conversion enabling, and instruction conversion, only the optimization corresponding to algorithm 1 is implemented (algorithm 2 may not apply because scene matching failed).
An embodiment comprising more details than the corresponding embodiment of fig. 3 is given next. It is directed to scenes where vertex and index data are frequently updated in game rendering.
In this embodiment, the glMapBufferRange instruction may be identified, and it is checked whether the currently bound buffer holds vertex or index data, or is dynamically writable. If so, and if calls to the map instruction reach a certain proportion, the scene is matched and extension conversion is then enabled.
In this embodiment, a buffer dictionary may be maintained to record the information of all buffers, including the target, size, and usage of each buffer.
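One possible shape for such a dictionary is sketched below in Python. The class and field names are hypothetical; only the recorded items (target, size, usage) come from the description, and the converted flag is an added assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BufferInfo:
    target: str               # e.g. "GL_ARRAY_BUFFER"
    size: int                 # size in bytes
    usage: str                # e.g. "GL_DYNAMIC_DRAW"
    converted: bool = False   # assumed flag: set once recreated through the extension


class BufferDictionary:
    """Maps buffer object IDs to the last known information about each buffer."""

    def __init__(self) -> None:
        self._by_id: Dict[int, BufferInfo] = {}

    def on_buffer_data(self, buffer_id: int, target: str, size: int, usage: str) -> BufferInfo:
        """Record or overwrite the entry when the buffer's data store is specified."""
        info = BufferInfo(target, size, usage)
        self._by_id[buffer_id] = info
        return info

    def lookup(self, buffer_id: int) -> Optional[BufferInfo]:
        """Return the entry for a buffer, or None if it was never recorded."""
        return self._by_id.get(buffer_id)

    def on_delete(self, buffer_id: int) -> None:
        """Forget a buffer when the application deletes it."""
        self._by_id.pop(buffer_id, None)


buffers = BufferDictionary()
buffers.on_buffer_data(7, "GL_ARRAY_BUFFER", 4096, "GL_DYNAMIC_DRAW")
```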
Referring to FIG. 7, when a glBufferData instruction 701 is acquired, it may be checked whether the target buffer it operates on holds vertex or index data, whether its usage is dynamic update, and whether its size is within the limit 702. If these conditions are met, the glBufferStorageEXT interface 703 is called with the GL_MAP_PERSISTENT_BIT_EXT, GL_MAP_COHERENT_BIT_EXT, and GL_DYNAMIC_STORAGE_BIT_EXT attributes additionally specified, creating a buffer that supports persistent address mapping. If the buffer has already been converted and its size and attributes have not changed, the call returns directly. On the first glMapBufferRange call 704, the GL_MAP_PERSISTENT_BIT_EXT and GL_MAP_COHERENT_BIT_EXT attributes are added to the access parameter to obtain the persistent memory address 705. On later glMapBufferRange calls on the same buffer 704, the previously obtained address 706 is returned directly. A glUnmapBuffer call on such a buffer returns directly.
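The flow of FIG. 7 can be simulated with the following Python sketch. It makes no real OpenGL ES calls: FakeDriver is a hypothetical stand-in that only logs what would reach the GPU driver, PersistentMapInterceptor and MAX_SIZE are invented for illustration, and the size limit of 1 MiB is an assumption because the text gives no value.

```python
from typing import Dict, List, Tuple

VERTEX_OR_INDEX = ("GL_ARRAY_BUFFER", "GL_ELEMENT_ARRAY_BUFFER")
MAX_SIZE = 1 << 20   # assumed size limit; not specified in the text
PERSISTENT_FLAGS = ("GL_MAP_PERSISTENT_BIT_EXT", "GL_MAP_COHERENT_BIT_EXT")


class FakeDriver:
    """Hypothetical stand-in for the driver; it only logs the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[str] = []
        self._next_address = 0x1000

    def buffer_data(self, buffer_id: int, size: int, usage: str) -> None:
        self.calls.append("glBufferData")

    def buffer_storage_ext(self, buffer_id: int, size: int, flags: Tuple[str, ...]) -> None:
        self.calls.append("glBufferStorageEXT")

    def map_buffer_range(self, buffer_id: int, access: Tuple[str, ...]) -> int:
        self.calls.append("glMapBufferRange")
        address = self._next_address
        self._next_address += 0x1000
        return address

    def unmap_buffer(self, buffer_id: int) -> None:
        self.calls.append("glUnmapBuffer")


class PersistentMapInterceptor:
    """Converts eligible buffers to persistent mapping and caches their address."""

    def __init__(self, driver: FakeDriver) -> None:
        self.driver = driver
        self.converted: Dict[int, Tuple[int, str]] = {}   # id -> (size, usage)
        self.addresses: Dict[int, int] = {}               # id -> mapped address

    def buffer_data(self, buffer_id: int, target: str, size: int, usage: str) -> None:
        eligible = (target in VERTEX_OR_INDEX and usage == "GL_DYNAMIC_DRAW"
                    and size <= MAX_SIZE)
        if not eligible:
            self.driver.buffer_data(buffer_id, size, usage)   # pass through unchanged
            return
        if self.converted.get(buffer_id) == (size, usage):
            return                                            # already converted
        flags = PERSISTENT_FLAGS + ("GL_DYNAMIC_STORAGE_BIT_EXT",)
        self.driver.buffer_storage_ext(buffer_id, size, flags)
        self.converted[buffer_id] = (size, usage)
        self.addresses.pop(buffer_id, None)                   # old address is stale

    def map_buffer_range(self, buffer_id: int, access: Tuple[str, ...]) -> int:
        if buffer_id not in self.converted:
            return self.driver.map_buffer_range(buffer_id, access)
        if buffer_id not in self.addresses:                   # first map: go to driver
            self.addresses[buffer_id] = self.driver.map_buffer_range(
                buffer_id, access + PERSISTENT_FLAGS)
        return self.addresses[buffer_id]                      # later maps: cached

    def unmap_buffer(self, buffer_id: int) -> None:
        if buffer_id not in self.converted:
            self.driver.unmap_buffer(buffer_id)               # converted buffers stay mapped


driver = FakeDriver()
gl = PersistentMapInterceptor(driver)
for _ in range(3):   # three frames of the same update pattern
    gl.buffer_data(1, "GL_ARRAY_BUFFER", 4096, "GL_DYNAMIC_DRAW")
    address = gl.map_buffer_range(1, ("GL_MAP_WRITE_BIT",))
    gl.unmap_buffer(1)
```

In this simulated run, nine application calls over three frames reach the fake driver as only two calls. A real implementation would also need to handle a change in size, since storage created by glBufferStorageEXT is immutable and cannot simply be respecified on the same buffer object; the sketch does not model that.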
In this way, in scenes with a large number of map operations on vertex and index data, the proportion of glBufferData, glMapBufferRange, and glUnmapBuffer calls is reduced, which can effectively reduce the CPU and GPU load.
Another embodiment more detailed than the one corresponding to FIG. 3 is described next. It targets antialiasing scenes in the rendering process. As shown in FIG. 8, a glBlitFramebuffer instruction 801 may be identified, and it is checked whether the color and depth-stencil attachments of the currently bound read framebuffer have a sample count greater than 1. If so, and if calls to the Blit instruction reach a certain proportion, the scene is considered matched 802, and extension conversion is then enabled.
A new framebuffer 803 may be created first, together with a new single-sample color texture 804, for which space is allocated 805, and a new depth-stencil attachment 806. The interfaces provided by the extension are then used to bind the single-sample color and depth-stencil attachments: space is allocated for the renderbuffer with glRenderbufferStorageMultisampleEXT 807, the texture is bound to the framebuffer with glFramebufferTexture2DMultisampleEXT 808, the renderbuffer object (RBO) is bound with glFramebufferRenderbuffer 809, and rendering goes directly to the new framebuffer in the next frame 810.
When the multisample framebuffer is bound, its ID is replaced with the ID of the newly created framebuffer, so that the multisampled result is output to the single-sample attachment. When the framebuffer targeted by the Blit is bound, both the bind and the Blit operation are skipped. When a later framebuffer is bound, it would have used the single-sample texture output by the earlier Blit, so the bound texture is replaced with the texture of the newly created framebuffer.
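The three substitution rules above can be sketched in Python as a small redirector that records which calls would still be issued. This is a simplified simulation under stated assumptions: it ignores the distinction between read and draw binding targets, all object IDs are invented, and FramebufferRedirector is a hypothetical name.

```python
from typing import List, Tuple


class FramebufferRedirector:
    """Rewrites framebuffer and texture binds once the antialiasing scene matched.

    msaa_fbo:        multisample framebuffer the game renders into
    resolve_fbo:     framebuffer the game blits the multisample result into
    new_fbo:         framebuffer created by the framework (single-sample attachments)
    resolve_texture: texture the game samples after its blit
    new_texture:     color texture attached to new_fbo
    """

    def __init__(self, msaa_fbo: int, resolve_fbo: int, new_fbo: int,
                 resolve_texture: int, new_texture: int) -> None:
        self.msaa_fbo = msaa_fbo
        self.resolve_fbo = resolve_fbo
        self.new_fbo = new_fbo
        self.resolve_texture = resolve_texture
        self.new_texture = new_texture
        self.skip_blit = False
        self.issued: List[Tuple] = []   # calls that would reach the driver

    def bind_framebuffer(self, fbo: int) -> None:
        self.skip_blit = False
        if fbo == self.msaa_fbo:
            fbo = self.new_fbo          # render into the new framebuffer instead
        elif fbo == self.resolve_fbo:
            self.skip_blit = True       # skip this bind and the blit that follows
            return
        self.issued.append(("glBindFramebuffer", fbo))

    def blit_framebuffer(self) -> None:
        if not self.skip_blit:
            self.issued.append(("glBlitFramebuffer",))

    def bind_texture(self, texture: int) -> None:
        if texture == self.resolve_texture:
            texture = self.new_texture  # sample the new framebuffer's texture
        self.issued.append(("glBindTexture", texture))


redirector = FramebufferRedirector(msaa_fbo=3, resolve_fbo=4, new_fbo=9,
                                   resolve_texture=14, new_texture=19)
redirector.bind_framebuffer(3)   # game binds its multisample framebuffer
redirector.bind_framebuffer(4)   # game binds the blit destination
redirector.blit_framebuffer()    # game resolves the multisample result
redirector.bind_framebuffer(0)   # a later pass binds another framebuffer
redirector.bind_texture(14)      # and samples the resolved texture
```

In this sequence the blit and its destination bind never reach the driver, which is where the saving in data copies would come from.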
In this embodiment, for the antialiasing scene, the game can benefit from the GPU extension without any change to its code, which reduces DDR data copies by 60% and lowers the power consumption of the mobile phone.
Referring to FIG. 9, FIG. 9 is a schematic structural diagram of a rendering instruction processing apparatus 900. The apparatus is applied to a terminal device, the terminal device includes a graphics processor (GPU), the GPU supports a target extension of the open graphics library for embedded systems (OpenGL ES), and the apparatus includes:
The instruction acquisition module 901 is configured to acquire a first rendering instruction that is not implemented based on the target extension, where the first rendering instruction is used to implement a rendering task, and to acquire a second rendering instruction according to the first rendering instruction, where the second rendering instruction is a rendering instruction implemented based on the target extension and is used to implement the rendering task;
The instruction execution module 902 is configured to trigger the GPU to execute the rendering task according to the second rendering instruction.
In an alternative implementation, the instruction acquisition module is configured to:
acquire the second rendering instruction corresponding to the first rendering instruction based on a mapping relation, where the mapping relation includes a preset correspondence between the first rendering instruction and the second rendering instruction.
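A preset correspondence of this kind could be held in a simple lookup table, as in the Python sketch below. The pair glBufferData to glBufferStorageEXT follows the embodiment of FIG. 7; the second pair is only an assumed example, and INSTRUCTION_MAP and second_instruction_for are hypothetical names.

```python
from typing import Dict, Optional

# Preset correspondence between first and second rendering instructions.
INSTRUCTION_MAP: Dict[str, str] = {
    "glBufferData": "glBufferStorageEXT",                              # from FIG. 7
    "glFramebufferTexture2D": "glFramebufferTexture2DMultisampleEXT",  # assumed pair
}


def second_instruction_for(first_instruction: str) -> Optional[str]:
    """Return the mapped second instruction, or None when no mapping is preset."""
    return INSTRUCTION_MAP.get(first_instruction)


mapped = second_instruction_for("glBufferData")
unmapped = second_instruction_for("glDrawElements")
```

Returning None for an unmapped instruction lets the caller fall back to executing the first rendering instruction unchanged.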
In an alternative implementation, the device power consumption when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the device power consumption when the GPU is triggered to execute the rendering task according to the second rendering instruction; and/or,
the amount of memory data copied when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the amount of memory data copied when the GPU is triggered to execute the rendering task according to the second rendering instruction; and/or,
the GPU load when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the GPU load when the GPU is triggered to execute the rendering task according to the second rendering instruction; and/or,
the load of the central processing unit (CPU) when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the CPU load when the GPU is triggered to execute the rendering task according to the second rendering instruction.
In an alternative implementation, the preset condition includes that the first rendering instruction belongs to a preset instruction set.
In an optional implementation, the first rendering instruction is used to operate on a target cache, and the preset condition includes at least one of the following:
the cache type of the target cache meets a preset condition;
the history information of operations performed on the target cache by the first rendering instruction meets a preset condition; or,
the context information of the target cache meets a preset condition.
In an optional implementation, the first rendering instruction is configured to operate on a texture object, and the preset condition includes that context information of the texture object meets a preset condition.
In an alternative implementation, the second rendering instruction is configured to invoke an interface of the target extension.
In an optional implementation, the first rendering instruction includes a plurality of first sub-instructions, and correspondingly, the instruction obtaining module is configured to obtain the second rendering instruction if at least one of the plurality of first sub-instructions included in the first rendering instruction meets a preset condition.
In an alternative implementation, the second rendering instruction includes a plurality of second sub-instructions, each corresponding to one of the first sub-instructions, and the instruction execution module is configured to:
And executing the second sub-instruction corresponding to each first sub-instruction in response to each first sub-instruction.
In an alternative implementation, at least one of the plurality of second sub-instructions is configured to invoke an interface of the target extension.
In an alternative implementation, at least one of the plurality of second sub-instructions is configured to invoke an interface of the target extension and is an instruction related to implementing a function of the target extension.
In an optional implementation, the first rendering instruction includes a first target rendering instruction and a second target rendering instruction, where the first target rendering instruction corresponds to a first frame image, the second target rendering instruction corresponds to a second frame image, and the first frame image precedes the second frame image. The first rendering instruction meeting a preset condition includes:
the first target rendering instruction meets the preset condition, and the second target rendering instruction meets the preset condition.
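The two-frame requirement can be sketched in Python as a tracker that remembers whether the earlier frame matched. This sketch assumes the two frames are consecutive, which is stricter than the text (it only requires the first frame to precede the second); FrameMatchTracker is a hypothetical name.

```python
class FrameMatchTracker:
    """Enables conversion only when two consecutive frames both met the condition."""

    def __init__(self) -> None:
        self.previous_matched = False

    def end_frame(self, frame_matched: bool) -> bool:
        """Report whether this frame and the one before it both matched."""
        enabled = self.previous_matched and frame_matched
        self.previous_matched = frame_matched
        return enabled


tracker = FrameMatchTracker()
decisions = [tracker.end_frame(matched) for matched in (True, False, True, True)]
```

Requiring a match in more than one frame guards against enabling the conversion on a pattern that appears only once.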
Referring to FIG. 10, FIG. 10 is a schematic structural diagram of a terminal device 1000 provided by the present application. As shown in FIG. 10, the terminal device includes a processor 1001 and a memory 1003, where the processor includes a central processing unit (CPU) and a graphics processor (GPU), the GPU supports a target extension of the open graphics library for embedded systems (OpenGL ES), and the CPU is configured to fetch code from the memory to execute:
acquiring a first rendering instruction that is not implemented based on the target extension, where the first rendering instruction is used to implement a rendering task, and acquiring a second rendering instruction according to the first rendering instruction, where the second rendering instruction is a rendering instruction implemented based on the target extension and is used to implement the rendering task;
triggering the GPU to execute the rendering task according to the second rendering instruction.
In the embodiment of the application, a game can, without modifying its code, execute instructions implemented based on a GPU extension, which reduces the terminal performance loss caused by frequent memory mapping and similar operations, or reduces the CPU/GPU load. In addition, the embodiment of the application is a general extension optimization framework based on GPU extensions: it can improve the performance of mobile phones that support the extension, and the optimization is simply not enabled on mobile phones that do not support it, which gives better compatibility.
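The compatibility behavior can be sketched in Python as a check against the extension string that OpenGL ES reports as a space-separated list. The function names and the sample strings are hypothetical; GL_EXT_buffer_storage and GL_EXT_multisampled_render_to_texture are the extensions whose interfaces appear in the embodiments of FIG. 7 and FIG. 8.

```python
from typing import Iterable


def supports_extension(extension_string: str, name: str) -> bool:
    """Check for an exact token match in a space-separated extension string."""
    return name in extension_string.split()


def optimization_enabled(extension_string: str, required: Iterable[str]) -> bool:
    """Enable the conversion only when every required extension is reported."""
    return all(supports_extension(extension_string, name) for name in required)


capable_gpu = "GL_OES_depth24 GL_EXT_buffer_storage GL_EXT_multisampled_render_to_texture"
limited_gpu = "GL_OES_depth24 GL_EXT_buffer_storage2"

enabled_on_capable = optimization_enabled(capable_gpu, ["GL_EXT_buffer_storage"])
enabled_on_limited = optimization_enabled(limited_gpu, ["GL_EXT_buffer_storage"])
```

Comparing whole tokens rather than substrings avoids treating a longer, differently named extension as support for the required one.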
In an optional implementation, the obtaining a second rendering instruction according to the first rendering instruction includes:
and acquiring the second rendering instruction corresponding to the first rendering instruction based on a mapping relation, wherein the mapping relation comprises a preset corresponding relation between the first rendering instruction and the second rendering instruction.
In an optional implementation, the obtaining the second rendering instruction according to the first rendering instruction includes obtaining the second rendering instruction according to the first rendering instruction when the first rendering instruction meets a preset condition, where the preset condition includes that the first rendering instruction belongs to a preset instruction set.
In an optional implementation, the first rendering instruction is used to operate on a target cache, and the preset condition includes at least one of the following:
the cache type of the target cache meets a preset condition;
the history information of operations performed on the target cache by the first rendering instruction meets a preset condition; or,
the context information of the target cache meets a preset condition.
In an optional implementation, the first rendering instruction is configured to operate on a texture object, and the preset condition includes that context information of the texture object meets a preset condition.
In an alternative implementation, the device power consumption when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the device power consumption when the GPU is triggered to execute the rendering task according to the second rendering instruction; and/or,
the amount of memory data copied when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the amount of memory data copied when the GPU is triggered to execute the rendering task according to the second rendering instruction; and/or,
the GPU load when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the GPU load when the GPU is triggered to execute the rendering task according to the second rendering instruction; and/or,
the load of the central processing unit (CPU) when the GPU is triggered to execute the rendering task according to the first rendering instruction is greater than the CPU load when the GPU is triggered to execute the rendering task according to the second rendering instruction.
In an alternative implementation, the second rendering instruction is configured to invoke an interface of the target extension.
In an optional implementation, the first rendering instruction is located in a target function table and includes a plurality of first sub-instructions. Correspondingly, the first rendering instruction meeting a preset condition includes:
at least one of the plurality of first sub-instructions included in the first rendering instruction meets the preset condition.
It should be noted that, the preset condition satisfied by the at least one first sub-instruction may refer to the preset condition satisfied by the first rendering instruction.
In an alternative implementation, the second rendering instruction includes a plurality of second sub-instructions, each corresponding to a first sub-instruction, and the executing the second rendering instruction in response to the first rendering instruction includes:
And executing the second sub-instruction corresponding to each first sub-instruction in response to each first sub-instruction.
In an alternative implementation, at least one of the plurality of second sub-instructions is configured to invoke an interface of the target extension.
In an alternative implementation, at least one of the plurality of second sub-instructions is configured to invoke an interface of the target extension and is an instruction related to implementing a function of the target extension.
In an optional implementation, the first rendering instruction includes a first target rendering instruction and a second target rendering instruction, where the first target rendering instruction corresponds to a first frame image, the second target rendering instruction corresponds to a second frame image, and the first frame image precedes the second frame image. The first rendering instruction meeting a preset condition includes:
the first target rendering instruction meets the preset condition, and the second target rendering instruction meets the preset condition.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or another network device) to perform all or part of the steps of the method according to the embodiment of FIG. 2 of the present application. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
While the application has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that the foregoing embodiments may be modified or equivalents may be substituted for some of the features thereof, and that the modifications or substitutions do not depart from the spirit of the embodiments.