CN118708364A


Info

Publication number: CN118708364A
Application number: CN202411195295.2A
Authority: CN
Inventors: 张波
Original assignee: Alibaba Cloud Computing Ltd
Current assignee: Alibaba Cloud Computing Ltd
Priority date: 2024-08-28
Filing date: 2024-08-28
Publication date: 2024-09-27
Anticipated expiration: 2044-08-28
Also published as: CN118708364B

Abstract

Translated from Chinese

The present application provides a NUMA adaptation method, a NUMA optimizer, and a chip. The NUMA adaptation method includes: obtaining the static configuration of a server and the dynamic configuration of an operating system in a NUMA architecture; determining an available target configuration according to the static configuration and the dynamic configuration; and, according to a target mode selected by a user, determining designated resources for a target application within the target configuration, encapsulating the target application, and starting the encapsulated application under the designated resources, where the target mode is one or more of the NUMA adaptation modes. The application addresses the technical problem in the related art that performing NUMA adaptation separately for every product, tool, and script is time-consuming, labor-intensive, and impractical, and achieves the technical effect of making NUMA adaptation more general.

Description


A NUMA adaptation method, NUMA optimizer, and chip

Technical Field

The present application relates to the field of chip technology, and in particular to a NUMA adaptation method, a NUMA optimizer, and a chip.

Background Art

At present, in Xinchuang (information technology application innovation) projects, the vast majority of central processing unit (CPU) chips use a multi-node non-uniform memory access (NUMA) architecture, and Intel also plans to adopt a multi-NUMA architecture in its latest chips. NUMA is a computer memory design for multiprocessor architectures in which each processor has its own local memory and shares access to global memory with the other processors over a memory bus. The purpose of NUMA is to improve the scalability of memory access in multiprocessor computer systems. However, traditional applications are not NUMA-aware. Performing NUMA adaptation separately for every product, tool, and script, as the traditional approach requires, is time-consuming, labor-intensive, and impractical, so a NUMA adaptation solution with good generality is needed.

Summary of the Invention

The embodiments of the present application provide a NUMA adaptation method, a NUMA optimizer, and a chip to solve one or more of the above technical problems.

In a first aspect, an embodiment of the present application provides a NUMA adaptation method, including:

obtaining the static configuration of a server and the dynamic configuration of an operating system in a non-uniform memory access (NUMA) architecture;

determining an available target configuration according to the static configuration and the dynamic configuration;

according to a target mode selected by a user, determining designated resources for a target application within the target configuration, encapsulating the target application, and starting the encapsulated application under the designated resources, wherein the target mode is one or more of the NUMA adaptation modes.

In a second aspect, an embodiment of the present application provides a NUMA optimizer, arranged between an application and an operating system (OS), including:

an acquisition module, configured to obtain the static configuration of a server and the dynamic configuration of an operating system in a non-uniform memory access (NUMA) architecture;

a determination module, configured to determine an available target configuration according to the static configuration and the dynamic configuration;

a processing module, configured to determine, according to a target mode selected by a user, designated resources for a target application within the target configuration, encapsulate the target application, and start the encapsulated application under the designated resources, wherein the target mode is one or more of the NUMA adaptation modes.

In a third aspect, an embodiment of the present application provides a chip that uses a multi-NUMA architecture and applies the above NUMA adaptation method when performing NUMA adaptation.

In a fourth aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory, wherein the processor implements any of the above methods when executing the computer program.

In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, wherein any of the above methods is implemented when the computer program is executed by a processor.

In a sixth aspect, an embodiment of the present application provides a computer program product, including computer instructions, wherein any of the above methods is implemented when the computer instructions are executed by a processor.

Compared with the related art, the present application has the following advantages:

According to the embodiments of the present application, the static configuration of a server and the dynamic configuration of an operating system in a non-uniform memory access (NUMA) architecture are obtained; an available target configuration is determined according to the static configuration and the dynamic configuration; and, according to a target mode selected by a user, designated resources are determined for a target application within the target configuration, the target application is encapsulated, and the encapsulated application is started under the designated resources, wherein the target mode is one or more of the NUMA adaptation modes. In other words, the present application proposes a general NUMA adaptation scheme, so that individual applications do not each have to implement NUMA optimization on their own. Specifically, designated resources are determined for the target application within the target configuration according to the target mode selected by the user, and the target application is encapsulated and started under those resources. Applications at the OS layer are thus adapted non-invasively, regardless of whether they are implemented in C++, Java, or a scripting language. This solves the technical problem in the related art that performing NUMA adaptation separately for every product, tool, and script is time-consuming, labor-intensive, and impractical, and achieves the technical effect of improving adaptation efficiency and performance.

The above description is only an overview of the technical solution of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the contents of the specification, and so that the above and other objects, features, and advantages of the present application are easier to understand, specific embodiments of the present application are set out below.

Brief Description of the Drawings

In the accompanying drawings, unless otherwise specified, the same reference numerals denote the same or similar parts or elements throughout the drawings. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments of the present application and should not be regarded as limiting its scope.

FIG. 1 is a schematic diagram of an application scenario of the NUMA adaptation method provided in an embodiment of the present application;

FIG. 2 is a flow chart of the NUMA adaptation method provided in an embodiment of the present application;

FIG. 3 is a structural block diagram of the NUMA optimizer provided in an embodiment of the present application;

FIG. 4 is a schematic flow diagram of NUMA adaptation performed by the NUMA optimizer provided in an embodiment of the present application;

FIG. 5 is a schematic diagram of a chip provided in an embodiment of the present application being adapted using the NUMA adaptation method;

FIG. 6 is a block diagram of an electronic device in an embodiment of the present application.

Detailed Description

In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various ways without departing from the concept or scope of the present application. Accordingly, the drawings and description are to be regarded as exemplary in nature and not restrictive.

To facilitate understanding of the technical solutions of the embodiments of the present application, the related technologies are described below. As optional solutions, the following related technologies may be combined in any way with the technical solutions of the embodiments of the present application, and all such combinations fall within the protection scope of the embodiments of the present application.

The main NUMA resource optimization solutions currently fall into several categories: the physical machine level, the virtual machine level, the Docker level, and Kubernetes. The implementation details at these levels and technology stacks are as follows:

At the physical machine level, the operating system (OS) provides the numactl and taskset commands, which let users confine a given process to specified CPU cores and NUMA memory nodes. This is a core capability of Linux, but it requires considerable expertise and behaves differently across hardware and OS versions, which makes it difficult to apply.

At the virtual machine level, hypervisors such as Hyper-V and VMware use an approach similar to CPU pinning, implementing NUMA optimization specifically when the mapping between virtual machines and physical machines is established.

For NUMA optimization in Kubernetes, the Topology Manager was introduced starting from version 1.18; it helps place workloads effectively on NUMA nodes to optimize performance. (1) Enable the Topology Manager: make sure the Topology Manager is enabled on the cluster nodes and an appropriate policy is set. Several policies are available: none, best-effort, restricted, and single-numa-node. For NUMA optimization, the single-numa-node policy is usually recommended. (2) Set the CPU Manager policy: the CPU Manager can allocate dedicated CPU cores to containers, and setting a CPU Manager policy such as static can improve performance on NUMA architectures. (3) Use CPU and memory requests and limits: set CPU and memory requests and limits in the container specification so that the Kubernetes scheduler can take the resource usage of each NUMA node into account. (4) Use topology spread constraints: by configuring topology spread constraints, it is possible to specify how container replicas are distributed across different NUMA nodes.

Docker level (implemented with numactl or cgroups): with container runtimes such as Docker, numactl or similar tools can be used to manage NUMA affinity directly. For example, NUMA node affinity can be specified before the container starts. If clearly necessary, numactl can also be used inside the container to launch the container's processes, allowing finer-grained control over affinity settings.

However, doing NUMA adaptation well at the traditional OS layer requires numactl, and product developers must explicitly supply several parameters, including cpunodebind, membind, and localalloc, which describe which CPUs and memory the process is to be bound to. The main drawbacks are as follows. Many factors must be considered during adaptation, such as whether NUMA is enabled, the number of sockets and NUMA nodes, the hardware deployment topology, and kernel parameter tuning, so the adaptation cost for each product is high. The way adaptation quality is verified also differs from product to product, and the adaptation problems encountered are similar but not identical, which leads to uneven adaptation quality and excessive cost. A new round of adaptation work is required whenever new hardware is introduced.

Kubernetes (k8s) provides a relatively reasonable policy-based NUMA adaptation solution and currently supports four policies: none, best-effort, restricted, and single-numa-node. It has several drawbacks: the CPU is treated as the only consideration, whereas in practice the CPU is not the only factor that affects application performance. For example, placing the network card and the application on different NUMA nodes is unacceptable for some applications, but no richer adaptation policies are currently available. The Docker level and the virtual machine level have similar limitations.

In view of this, an embodiment of the present application provides a NUMA adaptation method to solve the above technical problems in whole or in part. The main application scenarios of the method include applications that are insensitive to the implementation of the proprietary cloud base components, where the base components include but are not limited to: computing components, storage components, network components, security components, management and automation components, backup and disaster recovery components, service portals, and application programming interfaces (APIs). The overall architecture is shown in FIG. 1.

In the above application scenario, an embodiment of the present application provides the NUMA adaptation method shown in FIG. 2, which includes:

S202, obtaining the static configuration of a server and the dynamic configuration of an operating system in a non-uniform memory access (NUMA) architecture.

It should be noted that the static configuration includes a predefined NUMA configuration and/or hardware configuration, and the dynamic configuration includes the resource usage of the operating system at run time. The NUMA configuration may include the NUMA configuration of the server and the NUMA-related settings among the kernel parameters. The NUMA configuration of the server includes information such as whether NUMA is enabled, the number of sockets and NUMA nodes, the latency differences across sockets and NUMA nodes, and the current frequency of cross-node memory access in the system. The NUMA-related kernel settings include NUMA balancing, scheduler (sched) parameters, the memory reclaim mechanism, and so on. The hardware configuration includes hardware driver information, the location and configuration of devices (including the node on which the network card resides), memory speed, the mapping between memory and NUMA nodes, and the locations of disks and peripherals.

In addition, it should be noted that the dynamic configuration may include how busy the CPU resources on the different NUMA nodes are, as well as memory allocation and cross-socket access.

In a possible implementation, the static configuration of the server in the NUMA architecture may be obtained as follows. The lscpu command provides detailed information about the CPU architecture, including its NUMA layout: it shows the number of NUMA nodes and the CPU cores associated with each node. numactl is a set of tools for managing NUMA policy; its numactl --hardware command shows the memory and CPU distribution of the NUMA nodes. dmidecode is a tool for obtaining hardware information; it displays hardware details, including memory modules and CPU information, by parsing the system BIOS information. The Linux sys file system contains a large amount of system and hardware information, including the NUMA structure, and NUMA-related information can also be obtained from the proc file system.
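As an illustration of reading the static configuration from the sys file system, the following is a minimal Python sketch, not the implementation of the present application. It assumes the standard Linux sysfs layout (`devices/system/node/node*/cpulist` and `class/net/<iface>/device/numa_node`); the function names and the `sysfs_root` parameter are hypothetical and exist only so that the sketch can be exercised against a test directory.

```python
from pathlib import Path
from typing import Dict, List, Optional


def parse_cpulist(text: str) -> List[int]:
    """Expand a kernel cpulist string such as "0-3,8-11" into CPU ids."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_numa_topology(sysfs_root: str = "/sys") -> Dict[int, List[int]]:
    """Map each NUMA node id to its CPU ids using the node*/cpulist files."""
    node_dir = Path(sysfs_root) / "devices" / "system" / "node"
    topology: Dict[int, List[int]] = {}
    if not node_dir.is_dir():
        return topology  # NUMA information is not exposed on this system
    for entry in sorted(node_dir.glob("node[0-9]*")):
        cpulist = entry / "cpulist"
        if cpulist.is_file():
            topology[int(entry.name[4:])] = parse_cpulist(cpulist.read_text())
    return topology


def nic_numa_node(iface: str, sysfs_root: str = "/sys") -> Optional[int]:
    """Return the NUMA node of a network interface, or None if unknown."""
    path = Path(sysfs_root) / "class" / "net" / iface / "device" / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None
    return node if node >= 0 else None  # the kernel reports -1 when unknown
```

On a real host, `read_numa_topology()` and `nic_numa_node("eth0")` would be called with the default root; the interface name here is only an example.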

S204, determining an available target configuration according to the static configuration and the dynamic configuration.

S206, according to a target mode selected by a user, determining designated resources for a target application within the target configuration, encapsulating the target application, and starting the encapsulated application under the designated resources, wherein the target mode is one or more of the NUMA adaptation modes.

It should be noted that the designated resources include at least one of the following: CPU and memory. The target application includes but is not limited to traditional applications that predate modern cloud computing, microservices, and containerization technologies (for example, enterprise resource planning (ERP) systems, customer relationship management (CRM) software, database management systems (DBMS), email systems, financial accounting software, desktop office software, and file storage systems).

In addition, it should be noted that the NUMA adaptation modes may be adaptation modes based on different strategies, such as general, network guarantee, node guarantee, and socket guarantee.

Through steps S202 to S206 above, the static configuration of a server and the dynamic configuration of an operating system in a non-uniform memory access (NUMA) architecture are obtained; an available target configuration is determined according to the static configuration and the dynamic configuration; and, according to a target mode selected by a user, designated resources are determined for a target application within the target configuration, the target application is encapsulated, and the encapsulated application is started under the designated resources, wherein the target mode is one or more of the NUMA adaptation modes. In other words, the present application proposes a general NUMA adaptation scheme, so that individual applications do not each have to implement NUMA optimization on their own. Specifically, designated resources are determined for the target application within the target configuration according to the target mode selected by the user, and the target application is encapsulated and started under those resources. Applications at the OS layer are thus adapted non-invasively, regardless of whether they are implemented in C++, Java, or a scripting language. This solves the technical problem in the related art that performing NUMA adaptation separately for every product, tool, and script is time-consuming, labor-intensive, and impractical, and achieves the technical effect of improving adaptation efficiency and performance.
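The specification does not say how step S204 combines the two configurations. One simple reading is that the available target configuration is the per-node capacity from the static configuration minus the per-node usage from the dynamic configuration. The Python sketch below illustrates that reading only; the `NodeCapacity` and `NodeUsage` structures and the `available_config` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class NodeCapacity:
    """Static view of one NUMA node (hypothetical structure)."""
    cpus: int
    mem_mb: int


@dataclass(frozen=True)
class NodeUsage:
    """Dynamic view of one NUMA node (hypothetical structure)."""
    busy_cpus: float
    used_mem_mb: int


def available_config(static: Dict[int, NodeCapacity],
                     dynamic: Dict[int, NodeUsage]) -> Dict[int, NodeCapacity]:
    """Return the free CPUs and memory per NUMA node.

    Assumption: "available" means capacity minus current usage, floored at
    zero; a node with no usage record is treated as idle.
    """
    free: Dict[int, NodeCapacity] = {}
    for node, cap in static.items():
        use = dynamic.get(node, NodeUsage(0.0, 0))
        free_cpus = max(0, int(cap.cpus - use.busy_cpus))
        free_mem = max(0, cap.mem_mb - use.used_mem_mb)
        free[node] = NodeCapacity(free_cpus, free_mem)
    return free
```

Rounding the free CPU count down is a conservative choice, so that a partially busy core is not offered to a new application.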

The user's selection of the target mode was mentioned above. On this basis, an embodiment of the present application further proposes: S11, providing the NUMA adaptation modes to the user, from which the user selects the target mode. Optionally, the NUMA adaptation modes include a first adaptation mode and/or a second adaptation mode, where the first adaptation mode starts the application within the scope of a socket and the second adaptation mode starts the application within the scope of a NUMA node. The first adaptation mode includes but is not limited to at least one of the following:

Mode 1: Without distinguishing between NUMA nodes, try to start the application on a single socket; if that socket has insufficient resources, start the application across sockets. This mode may be the default mode. It can use the membind and cpunodebind options of numactl to bind memory and CPU, which gives the widest possible resource distribution.

Mode 2: Without distinguishing between NUMA nodes, guarantee that the application starts on a single socket; if that socket has insufficient resources, return an error. This mode suits scenarios with extremely high performance requirements in which the overhead of cross-socket communication is to be avoided.

Mode 3: Select an optimal NUMA node for the application from the NUMA configuration; if this is not possible, fall back to starting the application within the scope of a socket. In this mode an optimal NUMA node is selected for the application on a best-effort basis, taking into account how to minimize memory access latency and maximize data throughput.

The second adaptation mode includes but is not limited to at least one of the following:

Mode 4: Guarantee that the application starts on a single NUMA node; if that NUMA node has insufficient resources, return an error. Several Linux tools and commands can be used for control and management; they allow an application and its memory accesses to be restricted to a specific NUMA node. This mode suits applications that require high performance and low latency, and improves overall system efficiency and user experience.

Mode 5: Automatically start the application on the NUMA node closest to the network card. On Linux, the lstopo command can be used to visualize and find the NUMA node to which each device (including the network card) is attached; once the network card's NUMA node has been determined, numactl is used to bind the application to that node and start it. This mode can significantly improve the performance of latency-sensitive applications.

Mode 6: Start the application on a NUMA node defined by the user; if that NUMA node has insufficient resources, return an error. That is, a user with a deep understanding of the hardware and the OS can customize the adaptation mode.
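The six modes above can be read as a small decision procedure over the available resources. The following Python sketch is one possible interpretation, not the claimed implementation. The `Mode` enum, `select_nodes`, and `InsufficientResources` are hypothetical names, and several behaviours are assumptions made for the sketch: the "optimal" node is simply the fitting node with the most free CPUs, mode 3 falls back to the behaviour of mode 1, and mode 5 returns an error when the network card's node cannot host the application.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple

Free = Dict[int, Tuple[int, int]]  # node id -> (free CPUs, free memory in MB)


class Mode(Enum):
    SOCKET_PREFERRED = 1  # mode 1: one socket, otherwise span sockets
    SOCKET_STRICT = 2     # mode 2: one socket, otherwise error
    NODE_PREFERRED = 3    # mode 3: best node, otherwise socket scope
    NODE_STRICT = 4       # mode 4: one node, otherwise error
    NIC_NEAREST = 5       # mode 5: node closest to the network card
    USER_DEFINED = 6      # mode 6: user-chosen node, otherwise error


class InsufficientResources(RuntimeError):
    """Raised when a strict mode cannot be satisfied."""


def _fits(nodes: List[int], free: Free, cpus: int, mem_mb: int) -> bool:
    return (sum(free[n][0] for n in nodes) >= cpus
            and sum(free[n][1] for n in nodes) >= mem_mb)


def _one_socket(free: Free, socket_of: Dict[int, int],
                cpus: int, mem_mb: int) -> Optional[List[int]]:
    """Return the nodes of the first socket that can host the request."""
    sockets: Dict[int, List[int]] = {}
    for node in sorted(free):
        sockets.setdefault(socket_of[node], []).append(node)
    for nodes in sockets.values():
        if _fits(nodes, free, cpus, mem_mb):
            return nodes
    return None


def select_nodes(mode: Mode, free: Free, socket_of: Dict[int, int],
                 cpus: int, mem_mb: int,
                 nic_node: Optional[int] = None,
                 user_node: Optional[int] = None) -> List[int]:
    """Pick the NUMA nodes on which to start the application."""
    if mode in (Mode.NIC_NEAREST, Mode.USER_DEFINED):
        node = nic_node if mode is Mode.NIC_NEAREST else user_node
        if node is not None and node in free and _fits([node], free, cpus, mem_mb):
            return [node]
        raise InsufficientResources(f"node {node} cannot host the application")
    if mode in (Mode.NODE_PREFERRED, Mode.NODE_STRICT):
        fitting = [n for n in sorted(free) if _fits([n], free, cpus, mem_mb)]
        if fitting:
            # Placeholder heuristic: most free CPUs, then most free memory.
            return [max(fitting, key=lambda n: free[n])]
        if mode is Mode.NODE_STRICT:
            raise InsufficientResources("no single NUMA node has enough resources")
    nodes = _one_socket(free, socket_of, cpus, mem_mb)
    if nodes is not None:
        return nodes
    if mode is Mode.SOCKET_STRICT:
        raise InsufficientResources("no single socket has enough resources")
    return sorted(free)  # modes 1 and 3: span all sockets
```

A real selector would also weigh latency and throughput, as mode 3 describes; the heuristic here only keeps the control flow of the six modes visible.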

In a possible implementation, encapsulating the target application and starting the encapsulated application under the designated resources may include: S21, encapsulating the startup logic of the target application using NUMA native commands; S22, starting the encapsulated application under the designated resources.

It should be noted that the NUMA native commands generally include numactl, numastat, lstopo, taskset, numa_maps, numad, numactl --hardware, and numactl --interleave. numactl is the main tool for managing NUMA policy; it controls the NUMA policy of a process, including memory allocation and CPU binding. numastat displays memory consumption statistics for the NUMA nodes and is suitable for monitoring the NUMA behaviour of the system. lstopo (from the hwloc toolkit) provides a hardware topology map of the system, including views of NUMA nodes, CPUs, and memory. taskset is not a dedicated NUMA command, but it is often used together with numactl to set CPU affinity (that is, the CPU cores on which a process may run), which matters in NUMA optimization because it keeps a process close to the resources it needs. numa_maps (exposed per process as the file /proc/<pid>/numa_maps rather than as a standalone command) shows the NUMA memory mapping of a specific process and helps advanced users analyze memory usage. numad is a daemon that manages NUMA policy automatically to improve system performance; it dynamically monitors load and resources to decide the best placement on the NUMA architecture. numactl --hardware displays the detailed configuration of all NUMA nodes, including the CPUs and the amount of memory on each node. numactl --interleave spreads memory allocations evenly across the NUMA nodes, which helps in certain scenarios, such as when high-throughput memory access is required.

Through S21 to S22 above, applications, especially those that are unaware of the OS and hardware, do not each have to implement NUMA optimization on their own, and the adaptation cost falls from months to days. The corresponding testing cost also changes from testing every application separately to fully testing one application and having the other applications test only their own startup logic, which reduces adaptation complexity by orders of magnitude.
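Steps S21 and S22 can be illustrated by prefixing the application's original command line with numactl, so that the application itself is left unchanged. The Python sketch below uses the numactl options `--cpunodebind`, `--membind`, and `--localalloc` named earlier in this description; the function names and the `strict` parameter are hypothetical and introduced only for illustration.

```python
import subprocess
from typing import List, Sequence


def build_numactl_command(app_argv: Sequence[str],
                          nodes: Sequence[int],
                          strict: bool = True) -> List[str]:
    """Wrap an application's argv so that it starts on the given NUMA nodes.

    strict=True binds memory to the same nodes with --membind;
    strict=False uses --localalloc, which prefers the node the thread runs on.
    """
    if not app_argv:
        raise ValueError("app_argv must contain the program to run")
    if not nodes:
        raise ValueError("at least one NUMA node is required")
    node_list = ",".join(str(n) for n in sorted(set(nodes)))
    cmd = ["numactl", f"--cpunodebind={node_list}"]
    cmd.append(f"--membind={node_list}" if strict else "--localalloc")
    cmd.extend(app_argv)  # the application's own startup command, untouched
    return cmd


def launch(app_argv: Sequence[str], nodes: Sequence[int],
           strict: bool = True) -> subprocess.Popen:
    """Start the wrapped application (requires numactl to be installed)."""
    return subprocess.Popen(build_numactl_command(app_argv, nodes, strict))
```

Because only the command line is wrapped, the same helper applies whether the target is a C++ binary, a Java process, or a script.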

Acquiring the dynamic configuration of the operating system in the NUMA architecture may include: S31, analyzing real-time information about the current operation of the operating system; S32, acquiring the dynamic configuration according to the real-time information. Optionally, S32 may include: S321, using a specified command in the operating system to acquire first information, wherein the first information includes at least one of the following: the central processing unit (CPU) usage of the application processes running in the operating system, and the memory usage of the application processes running in the operating system; S322, acquiring second information from the control group (cgroup) runtime statistics of the application processes, wherein the second information includes the CPU, memory, input/output (I/O), and network constraints of the application processes running in the operating system; S323, acquiring third information, wherein the third information includes the specific locations of the CPUs and memory used by the application processes in the operating system; S324, generating a single-machine process resource-usage distribution heat map according to the first information, the second information, and the third information; S325, determining the dynamic configuration according to the resource usage reflected by the heat map. Optionally, the heat map may be generated with a visualization library. Through the above steps, real-time information about the current operation of the operating system is obtained, for example CPU usage, memory usage, the CPU, memory, I/O, and network constraints, and the specific CPU and memory locations; combined with the static configuration, this information can be used to generate the single-machine process resource-usage distribution heat map. The heat map gives an intuitive view of resource usage and helps identify performance bottlenecks and resource hotspots. The dynamic configuration uses this feedback to adjust the system's resource allocation strategy in real time, which improves resource utilization and system performance and prepares for resource selection when new processes are launched later.
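
Steps S321 to S324 can be illustrated with a minimal sketch. It assumes the three kinds of information have already been collected into one record per process; the `ProcSample` record, its field names, and the per-node capacity maps are hypothetical and do not come from the application. The "heat map" here is only the underlying per-node load table, before any visualization library is applied.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ProcSample:
    """One running process (hypothetical record layout)."""
    pid: int
    cpu_percent: float        # first information: measured CPU usage
    mem_mb: float             # first information: measured memory usage
    cpu_limit_percent: float  # second information: cgroup CPU constraint
    node: int                 # third information: NUMA node the process runs on


def build_heatmap(samples, node_cpu_capacity, node_mem_capacity_mb):
    """Aggregate per-process samples into per-node load fractions (S324)."""
    cpu_used = defaultdict(float)
    cpu_committed = defaultdict(float)
    mem_used = defaultdict(float)
    for s in samples:
        cpu_used[s.node] += s.cpu_percent
        cpu_committed[s.node] += s.cpu_limit_percent
        mem_used[s.node] += s.mem_mb
    return {
        node: {
            "cpu_used": cpu_used[node] / node_cpu_capacity[node],
            "cpu_committed": cpu_committed[node] / node_cpu_capacity[node],
            "mem_used": mem_used[node] / node_mem_capacity_mb[node],
        }
        for node in node_cpu_capacity
    }


samples = [
    ProcSample(pid=1, cpu_percent=100.0, mem_mb=1024.0, cpu_limit_percent=200.0, node=0),
    ProcSample(pid=2, cpu_percent=50.0, mem_mb=512.0, cpu_limit_percent=100.0, node=1),
]
heat = build_heatmap(samples, {0: 400.0, 1: 400.0}, {0: 4096.0, 1: 4096.0})
```

Keeping measured usage and cgroup-committed capacity as separate columns is a design choice of this sketch: a node can look idle by usage while its capacity is already promised to constrained processes.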

For scenarios that use huge pages (a memory management technique that uses memory pages larger than regular pages to improve memory access efficiency and reduce memory management overhead; by default the operating system usually uses 4 KB pages, whereas a huge page may be 2 MB or 1 GB, for example), the embodiments of the present application further propose that, after the designated resources are determined for the target application in the target configuration according to the target mode selected by the user, the method further includes: S41, acquiring the operating system kernel parameters, wherein the kernel parameters include at least one of the following: network card queue length, interrupt core-binding policy, and memory switching policy; S42, adjusting the designated resources according to the operating system kernel parameters. Optionally, S42 may include: S421, determining reserved resources from the operating system kernel parameters; S422, adjusting the designated resources according to the reserved resources. These steps take into account the resources that the operating system reserves for huge pages. After the designated resources are determined, the reserved resources can be removed from them so that the resources allocated to the target application are more reasonable, which in turn ensures that the application can start normally.
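
As an illustration of S421 and S422, the following sketch subtracts a per-node huge page reservation from the memory that was designated for the target application. The function name, the dictionary layout, and the idea of expressing the reservation as a page count per NUMA node are assumptions of this sketch, not details given in the application. On Linux such counts could, for instance, be read from the per-node `nr_hugepages` files under sysfs.

```python
def adjust_for_reserved(designated_mem_mb, reserved_hugepages, hugepage_size_kb=2048):
    """Remove huge-page reservations from the designated memory (S422).

    designated_mem_mb:  {node_id: memory in MB selected for the application}
    reserved_hugepages: {node_id: number of huge pages reserved on that node}
    """
    adjusted = {}
    for node, mem_mb in designated_mem_mb.items():
        # S421: reserved resource on this node, converted from pages to MB.
        reserved_mb = reserved_hugepages.get(node, 0) * hugepage_size_kb // 1024
        usable = mem_mb - reserved_mb
        if usable <= 0:
            raise ValueError(f"node {node}: nothing left after huge page reservation")
        adjusted[node] = usable
    return adjusted


adjusted = adjust_for_reserved({0: 8192, 1: 4096}, {0: 1024, 1: 0})
```

Raising an error when nothing is left, rather than silently returning zero, mirrors the stated goal that the application should only be launched where it can actually start.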

When several resources are available, the optimal shortest path can be selected according to a built-in strategy. For example, if an application is not network-sensitive, a location far from the network card is preferred, which leaves space for other, potentially network-bound applications. If it is determined that no available resource configuration has been obtained, the embodiments of the present application further propose: S51, live-migrating the resources of existing processes to a target location to obtain target resources; S52, re-determining the available resource configuration. Optionally, in the embodiments of the present application, step S52 may be implemented by re-executing the process of determining the designated resources described above. That is, when no resources are available but some can be freed, the resources of other existing processes are live-migrated elsewhere, and the resource selection process is retried once the resources have been freed.
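
The selection policy and the migrate-then-retry fallback (S51, S52) could look roughly like the sketch below. The application does not define the built-in strategy in detail, so the ranking by distance to the network card, the function names, and the `migrate` callback (a stand-in for whatever mechanism actually moves existing processes) are all illustrative assumptions.

```python
def pick_node(free_cpus, nic_distance, need_cpus, network_sensitive):
    """Pick a NUMA node with enough free CPUs, ranked by distance to the NIC."""
    candidates = [n for n, free in free_cpus.items() if free >= need_cpus]
    if not candidates:
        return None
    by_distance = lambda n: nic_distance[n]
    if network_sensitive:
        return min(candidates, key=by_distance)  # stay close to the network card
    return max(candidates, key=by_distance)      # leave near-NIC nodes for others


def pick_with_migration(free_cpus, nic_distance, need_cpus, network_sensitive, migrate):
    """Retry the selection once after freeing resources (S51 then S52)."""
    node = pick_node(free_cpus, nic_distance, need_cpus, network_sensitive)
    if node is not None:
        return node
    # S51: hypothetical callback that relocates existing processes and
    # returns the updated free-CPU map.
    free_after = migrate(free_cpus)
    # S52: re-run the same selection on the updated view.
    return pick_node(free_after, nic_distance, need_cpus, network_sensitive)


free = {0: 2, 1: 8, 2: 8}
dist = {0: 10, 1: 20, 2: 30}
far = pick_node(free, dist, 4, network_sensitive=False)
near = pick_node(free, dist, 4, network_sensitive=True)
retried = pick_with_migration({0: 2, 1: 2}, {0: 10, 1: 20}, 4, False,
                              migrate=lambda f: {0: 0, 1: 4})
```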

In summary, the NUMA adaptation scheme provided by the present application is compatible with upper-layer applications and reduces the cost of adapting applications to NUMA, including development and testing costs, which ultimately improves the performance and reliability of the software as a whole. By first determining the available target resources and then wrapping the application's startup logic with native NUMA commands, the scheme adapts OS-level applications non-invasively, regardless of whether they are implemented in C++, Java, or a scripting language. With this scheme, every application is integrated once: adapting the launch parameters a single time keeps the application usable permanently, across different chip architectures such as x86, ARM, and SW, and across both existing and newly added hardware. This is similar to the Java JVM's effect of adapting once and running everywhere. In addition, as chips are upgraded and new versions of the adaptation scheme are released, applications can continue to obtain the best available performance.

It should be noted that the user information (including but not limited to user device information and user personal information) and data (including but not limited to data used for analysis, stored data, and displayed data) involved in the present application are all information and data authorized by the user or fully authorized by all parties. The collection, use, and processing of such data must comply with the relevant laws, regulations, and standards of the relevant countries and regions, and corresponding operation entries are provided so that users can choose to grant or refuse authorization.

The technical solution of the present application, and how it solves the technical problems described above, are explained in detail below through specific embodiments. The specific embodiments listed may be combined with one another, and identical or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present application are described in detail below with reference to the accompanying drawings.

Corresponding to the application scenario and the method provided in the embodiments of the present application, the embodiments of the present application further provide a NUMA optimizer. FIG. 3 is a structural block diagram of a NUMA optimizer according to an embodiment of the present application. The NUMA optimizer may include:

a first acquisition module 32, configured to acquire the static configuration of the server and the dynamic configuration of the operating system in a non-uniform memory access (NUMA) architecture;

a first determination module 34, configured to determine an available target configuration according to the static configuration and the dynamic configuration;

a processing module 36, configured to determine designated resources for the target application in the target configuration according to the target mode selected by the user, wrap the target application, and launch the wrapped application under the designated resources, wherein the target mode is one or more of the NUMA adaptation modes.

With the apparatus shown in FIG. 3, the static configuration of the server and the dynamic configuration of the operating system in the non-uniform memory access (NUMA) architecture are acquired; an available target configuration is determined according to the static configuration and the dynamic configuration; and, according to the target mode selected by the user, designated resources are determined for the target application in the target configuration, the target application is wrapped, and the wrapped application is launched under the designated resources, wherein the target mode is one or more of the NUMA adaptation modes. In other words, the present application proposes a general-purpose NUMA adaptation scheme, so that individual applications do not each have to implement NUMA optimization independently. Because the application is wrapped and launched under the designated resources, OS-level applications are adapted non-invasively, regardless of whether they are implemented in C++, Java, or a scripting language. This solves the technical problem in the related art that performing NUMA adaptation separately for every product, tool, and script is time-consuming, labor-intensive, and impractical, and achieves the technical effect of improving adaptation efficiency and performance.

In a possible implementation, the NUMA adaptation modes include a first adaptation mode and/or a second adaptation mode, wherein the first adaptation mode launches the application at socket scope and the second adaptation mode launches the application at NUMA node scope. The first adaptation mode includes at least one of the following: without distinguishing between NUMA nodes, attempting to launch the application on a single socket and, if that socket has insufficient resources, launching the application across sockets; without distinguishing between NUMA nodes, guaranteeing that the application is launched on a single socket and returning an error if that socket has insufficient resources; selecting an optimal NUMA node for the application from the NUMA configuration and, if this is not possible, falling back to launching the application at socket scope. The second adaptation mode includes at least one of the following: guaranteeing that the application is launched on a single NUMA node and returning an error if that node has insufficient resources; automatically launching the application on the NUMA node closest to the network card; launching on a NUMA node specified by the user and returning an error if that node has insufficient resources.
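
The six behaviours listed above can be sketched as a single resolution function that maps a mode to a set of NUMA nodes or raises an error. The mode names, the `InsufficientResources` exception, and the use of free CPU counts as the only resource measure are assumptions made for this sketch. The application does not say what happens when the node nearest the network card is full, so raising an error in that case is also an assumption.

```python
from enum import Enum


class Mode(Enum):
    SOCKET_TRY = "socket-try"          # one socket, else spread across sockets
    SOCKET_STRICT = "socket-strict"    # one socket, else error
    NODE_PREFERRED = "node-preferred"  # best node, else fall back to socket scope
    NODE_STRICT = "node-strict"        # one node, else error
    NODE_NEAR_NIC = "node-near-nic"    # node closest to the network card
    NODE_USER = "node-user"            # node chosen by the user, else error


class InsufficientResources(RuntimeError):
    pass


def resolve_nodes(mode, sockets, free_cpus, need, nic_node=None, user_node=None):
    """sockets: {socket_id: [node_id, ...]}; free_cpus: {node_id: free CPUs}."""
    def fitting_socket():
        for nodes in sockets.values():
            if sum(free_cpus[n] for n in nodes) >= need:
                return list(nodes)
        return None

    def fitting_node():
        fits = [n for n, free in free_cpus.items() if free >= need]
        return [max(fits, key=lambda n: free_cpus[n])] if fits else None

    if mode in (Mode.NODE_NEAR_NIC, Mode.NODE_USER):
        node = nic_node if mode is Mode.NODE_NEAR_NIC else user_node
        if free_cpus.get(node, 0) < need:
            raise InsufficientResources(f"node {node} cannot fit {need} CPUs")
        return [node]
    if mode is Mode.NODE_STRICT:
        chosen = fitting_node()
    elif mode is Mode.NODE_PREFERRED:
        chosen = fitting_node() or fitting_socket()
    else:
        chosen = fitting_socket()
    if chosen is None and mode is Mode.SOCKET_TRY:
        return sorted(free_cpus)  # cross-socket launch: use every node
    if chosen is None:
        raise InsufficientResources(f"{mode.value}: cannot fit {need} CPUs")
    return chosen


sockets = {0: [0, 1], 1: [2, 3]}
free = {0: 1, 1: 1, 2: 4, 3: 2}
strict_socket = resolve_nodes(Mode.SOCKET_STRICT, sockets, free, need=3)
fallback = resolve_nodes(Mode.NODE_PREFERRED, sockets, free, need=5)
spread = resolve_nodes(Mode.SOCKET_TRY, sockets, free, need=7)
```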

The processing module 36 includes: a wrapping unit, configured to wrap the startup logic of the target application using native NUMA commands; and a launch unit, configured to launch the wrapped application under the designated resources.
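
One way to picture the wrapping unit is a function that prefixes the application's original command line with a NUMA binding command. The application does not name the native command it uses; the sketch below assumes the Linux `numactl` utility and its `--cpunodebind` and `--membind` options, and the function name `wrap_with_numactl` is hypothetical. The application's own arguments are left untouched, which is what makes the approach independent of the implementation language.

```python
def wrap_with_numactl(app_argv, nodes):
    """Return a command line that runs app_argv bound to the given NUMA nodes."""
    if not nodes:
        raise ValueError("at least one NUMA node is required")
    node_list = ",".join(str(n) for n in nodes)
    return [
        "numactl",
        f"--cpunodebind={node_list}",  # run only on CPUs of these nodes
        f"--membind={node_list}",      # allocate memory only from these nodes
        "--",                          # everything after this is the application
        *app_argv,
    ]


wrapped = wrap_with_numactl(["java", "-jar", "app.jar"], [2, 3])
```

The resulting list could then be handed to a process launcher such as `subprocess.run(wrapped)`; the sketch stops short of that so it has no side effects.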

The first acquisition module 32 includes: an analysis unit, configured to analyze real-time information about the current operation of the operating system; and an acquisition unit, configured to acquire the dynamic configuration according to the real-time information. Optionally, the acquisition unit includes: a first acquisition subunit, configured to use a specified command in the operating system to acquire first information, wherein the first information includes at least one of the following: the CPU usage of the application processes running in the operating system, and the memory usage of the application processes running in the operating system; a second acquisition subunit, configured to acquire second information from the control group (cgroup) runtime statistics of the application processes, wherein the second information includes the CPU, memory, input/output (I/O), and network constraints of the application processes running in the operating system; a third acquisition subunit, configured to acquire third information, wherein the third information includes the specific locations of the CPUs and memory used by the application processes in the operating system; a generation subunit, configured to generate a single-machine process resource-usage distribution heat map according to the first information, the second information, and the third information; and a determination subunit, configured to determine the dynamic configuration according to the resource usage reflected by the heat map.

Optionally, the apparatus further includes: a second acquisition module, configured to acquire the operating system kernel parameters after the designated resources are determined for the target application in the target configuration according to the target mode selected by the user, wherein the kernel parameters include at least one of the following: network card queue length, interrupt core-binding policy, and memory switching policy; and an adjustment module, configured to adjust the designated resources according to the operating system kernel parameters. The adjustment module further includes: a determination unit, configured to determine reserved resources from the operating system kernel parameters; and an adjustment unit, configured to adjust the designated resources according to the reserved resources.

If it is determined that no available resource configuration has been obtained, the apparatus further includes: a migration module, configured to live-migrate the resources of existing processes to a target location to obtain target resources; and a second determination module, configured to re-determine the available resource configuration.

Optionally, in the embodiments of the present application, the designated resources include at least one of the following: CPU and memory. The static configuration includes a predefined NUMA configuration and/or the hardware configuration, and the dynamic configuration includes the runtime resource usage of the operating system.

The embodiments of the present application are illustrated below with a specific example.

As shown in FIG. 4, the example includes the following steps. S401: the NUMA optimizer acquires the NUMA configuration of the server, including whether NUMA is enabled, the number of sockets and NUMA nodes, the latency differences across sockets and nodes, and the current frequency of cross-node memory access in the system. S402: the NUMA optimizer acquires the NUMA-related settings among the kernel parameters, for example NUMA balancing, scheduler (sched) parameters, and the memory reclamation mechanism. S403: the NUMA optimizer analyzes the hardware configuration, including hardware driver information, the location and configuration of devices such as the node on which the network card resides, memory speed, the mapping between memory and NUMA nodes, and the locations of disks and peripherals. S404: the NUMA optimizer analyzes the real-time state of the running operating system, including how busy the CPU resources are on each node, memory allocation, and cross-socket access, and builds a topology of actual current resource usage. Specifically, this may involve: using OS system commands to obtain the actual CPU and memory usage of the processes currently running in the system; obtaining the CPU, memory, I/O, and network constraints of those processes from runtime statistics such as the processes' control groups (cgroups); using open-source and internal tools to collect the specific locations of the CPUs and memory used by the processes; and, based on the information collected in these three steps, generating a single-machine process resource-usage distribution heat map, which prepares for resource selection when new processes are launched later. S405: according to the input parameters, the NUMA optimizer helps the application find the best configuration and wraps the application's native command to launch it. Specifically, according to the mode selected by the user, relatively idle available resources are first selected as candidates based on the current heat map obtained in the preceding steps. If no resources are available but some can be freed, the resources of other existing processes are live-migrated elsewhere, and the resource selection process is retried once the resources have been freed. If several resources are available, the optimal shortest path is selected according to the built-in strategy; for example, if an application is not network-sensitive, a location far from the network card is preferred, which leaves space for other, potentially network-bound applications. The system kernel parameters, such as network card queue length, interrupt core-binding policy, and memory switching policy, are then analyzed, and the existing policy is either reused or adjusted to a more reasonable configuration. Once the resources, parameters, and other details for the process have been determined, the application process is finally launched.
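
Part of the static information gathered in S401 could be read on Linux from the per-node entries under `/sys/devices/system/node`, which expose a CPU list and a distance vector for each NUMA node. The application does not state which interface the optimizer reads, so the choice of sysfs, the helper names, and the returned dictionary layout are assumptions of this sketch. The two parsers are pure functions so they can be exercised without NUMA hardware.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel CPU list such as '0-3,8-11' into individual CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        cpus.extend(range(int(low), int(high or low) + 1))
    return cpus


def parse_distance(text):
    """Parse one node's distance vector, e.g. '10 21' for a two-node system."""
    return [int(value) for value in text.split()]


def read_static_config(sysfs=Path("/sys/devices/system/node")):
    """Collect per-node CPUs and distances (returns {} if no nodes are found)."""
    config = {}
    for node_dir in sorted(sysfs.glob("node[0-9]*")):
        node_id = int(node_dir.name[len("node"):])
        config[node_id] = {
            "cpus": parse_cpulist((node_dir / "cpulist").read_text()),
            "distance": parse_distance((node_dir / "distance").read_text()),
        }
    return config


cpus = parse_cpulist("0-3,8-11\n")
distances = parse_distance("10 21\n")
```

A larger distance value indicates a more expensive remote access, which is the kind of cross-node latency difference that S401 refers to.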

Corresponding to the application scenario and the method provided in the embodiments of the present application, the embodiments of the present application further provide a chip. As shown in FIG. 5, the chip uses a multi-NUMA architecture and applies the NUMA adaptation method described above when adapting to NUMA. When the chip applies the NUMA adaptation method, the static configuration of the server and the dynamic configuration of the operating system in the non-uniform memory access (NUMA) architecture are acquired; an available target configuration is determined according to the static configuration and the dynamic configuration; and, according to the target mode selected by the user, designated resources are determined for the target application in the target configuration, the target application is wrapped, and the wrapped application is launched under the designated resources, wherein the target mode is one or more of the NUMA adaptation modes. In other words, the present application proposes a general-purpose NUMA adaptation scheme, so that individual applications do not each have to implement NUMA optimization independently. Because the application is wrapped and launched under the designated resources, OS-level applications are adapted non-invasively, regardless of whether they are implemented in C++, Java, or a scripting language. This solves the technical problem in the related art that performing NUMA adaptation separately for every product, tool, and script is time-consuming, labor-intensive, and impractical, and achieves the technical effect of improving adaptation efficiency and performance. For the functions of the modules in the apparatuses of the embodiments of the present application, reference may be made to the corresponding descriptions in the method above; the modules have the corresponding beneficial effects, which are not repeated here.

FIG. 6 is a block diagram of an electronic device for implementing the embodiments of the present application. As shown in FIG. 6, the electronic device includes a memory 601 and a processor 602, wherein the memory 601 stores a computer program that can run on the processor 602. When the processor 602 executes the computer program, the method in the above embodiments is implemented. There may be one or more memories 601 and one or more processors 602.

The electronic device further includes:

a communication interface 603, configured to communicate with external devices and exchange data.

If the memory 601, the processor 602, and the communication interface 603 are implemented independently, they may be connected to one another through a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is represented by a single thick line in FIG. 6, but this does not mean that there is only one bus or only one type of bus.

Optionally, in a specific implementation, if the memory 601, the processor 602, and the communication interface 603 are integrated on a single chip, they may communicate with one another through internal interfaces.

The embodiments of the present application provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method provided in the embodiments of the present application.

The embodiments of the present application further provide a chip that includes a processor, configured to call and run instructions stored in a memory so that a communication device in which the chip is installed executes the method provided in the embodiments of the present application.

The embodiments of the present application further provide a chip that includes an input interface, an output interface, a processor, and a memory, which are connected to one another through an internal connection path. The processor is configured to execute code in the memory, and when the code is executed, the processor performs the method provided in the embodiments of the present application.

It should be understood that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. It is worth noting that the processor may be a processor that supports the Advanced RISC Machines (ARM) architecture.

Further, optionally, the memory may include read-only memory and random access memory. The memory may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory. The volatile memory may include random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another.

In this specification, descriptions that refer to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" mean that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present application. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.

In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality of" means two or more, unless otherwise expressly and specifically defined.

Any process or method described in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present application also includes other implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order, depending on the functions involved.

The logic and/or steps described in a flowchart or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them).

It should be understood that the parts of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. All or some of the steps of the above method embodiments may be carried out by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.

In addition, the functional units in the embodiments of the present application may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.

The above are merely exemplary embodiments of the present application, but the protection scope of the present application is not limited thereto. Any variation or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be determined by the protection scope of the claims.