CN114385436A


Info

Publication number: CN114385436A
Application number: CN202111481222.6A
Authority: CN
Inventors: 刘卓龙
Original assignee: Wangsu Science and Technology Co Ltd
Current assignee: Wangsu Science and Technology Co Ltd
Priority date: 2021-12-06
Filing date: 2021-12-06
Publication date: 2022-04-22

Abstract

The application relates to the technical field of Internet communication and discloses a server grouping method, a server grouping apparatus, an electronic device, and a storage medium. The method comprises: acquiring at least one item of asset information on each server; for each server, determining a first feature vector of the server according to the acquired at least one item of asset information; and, according to a preset clustering algorithm, treating first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, obtaining a clustering result for the first feature vectors, and determining the grouping of the servers according to the clustering result. By collecting asset information from the servers, the method generates first feature vectors that represent the service features of the servers, and determines the grouping of the servers from the clustering result obtained according to vector similarity. Using the first feature vectors and a clustering algorithm, the servers are clustered and grouped accurately and efficiently as an alternative to manual grouping, which improves grouping efficiency while reducing grouping difficulty and cost.

Description

Translated from Chinese

Server grouping method, apparatus, electronic device, and storage medium

Technical Field

The embodiments of the present application relate to the field of Internet communication technologies, and in particular to a server grouping method, apparatus, electronic device, and storage medium.

Background

With the continuous development of communication technology and the Internet, servers have gradually become one of the most important pieces of modern infrastructure for meeting user needs. Enterprises in both traditional and high-tech industries need large numbers of servers for tasks such as data storage, computing, and providing website services. Grouping servers according to the services deployed on them therefore helps enterprises manage their servers. In the security industry, server grouping also makes it possible to use the grouping information for fine-grained protection.

Traditional server grouping relies mainly on manual work: operations and maintenance personnel usually label a server with grouping information when it is racked. This approach causes no notable problems when the number of servers is small. Today, however, the number of servers and the complexity of the services on them keep increasing, and the services themselves change, so manual grouping faces many drawbacks, such as high labor cost and difficulty of real-time maintenance.

Therefore, how to complete server grouping simply and efficiently, and thereby manage servers precisely, is a technical problem that urgently needs to be solved.

Summary of the Invention

The main purpose of the embodiments of the present application is to propose a server grouping method, apparatus, electronic device, and storage medium that complete server grouping simply and efficiently, reduce the difficulty and cost of server grouping, and enable precise management of servers.

To achieve the above purpose, an embodiment of the present application provides a server grouping method, including: acquiring at least one item of asset information on each server; for each server, determining a first feature vector of the server according to the acquired at least one item of asset information; and, according to a preset clustering algorithm, treating first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, obtaining a clustering result for the first feature vectors, and determining the grouping of the servers according to the clustering result.

To achieve the above purpose, an embodiment of the present application further provides a server grouping apparatus, including: an acquisition module, configured to acquire at least one item of asset information on each server; a determination module, configured to determine, for each server, a first feature vector of the server according to the acquired at least one item of asset information; and a grouping module, configured to treat, according to a preset clustering algorithm, first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, obtain a clustering result for the first feature vectors, and determine the grouping of the servers according to the clustering result.

To achieve the above purpose, an embodiment of the present application further provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the server grouping method described above.

To achieve the above purpose, an embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the server grouping method described above.

In the server grouping method provided by the embodiments of the present application, at least one item of asset information is acquired on each server, and the first feature vector of each server is determined from the acquired asset information. A clustering algorithm then treats first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, which yields a clustering result for the first feature vectors, and the grouping of the servers is determined from that clustering result. Because the first feature vector of a server is generated from the asset information collected on it, the vector accurately characterizes the service features of the server. The clustering result, and from it the grouping result, follows from the relationship between the similarity of the first feature vectors and the preset threshold. Using the first feature vectors and a clustering algorithm, the servers are clustered and grouped accurately and efficiently as an alternative to manual grouping, which improves grouping efficiency while reducing grouping difficulty and cost.

Description of the Drawings

One or more embodiments are illustrated by the figures in the corresponding drawings; these illustrations do not limit the embodiments.

FIG. 1 is a flowchart of a server grouping method in an embodiment of the present application;

FIG. 2 is a schematic structural diagram of a server grouping apparatus in another embodiment of the present application;

FIG. 3 is a schematic structural diagram of an electronic device in another embodiment of the present application.

Detailed Description

As the background shows, current server grouping methods face many drawbacks as the number of servers and the complexity of the services on them keep increasing and as services change, such as high labor cost and difficulty of maintaining the grouping in real time. Therefore, how to implement server grouping simply and efficiently, and thereby manage servers precisely, is a technical problem that urgently needs to be solved.

To solve the above problem, some embodiments of the present application provide a server grouping method, including: acquiring at least one item of asset information on each server; for each server, determining a first feature vector of the server according to the acquired at least one item of asset information; and, according to a preset clustering algorithm, treating first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, obtaining a clustering result for the first feature vectors, and determining the grouping of the servers according to the clustering result.

To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that many technical details are given in the embodiments to help the reader understand the present application. The technical solutions claimed in the present application can nevertheless be realized without these technical details and with various changes and modifications based on the following embodiments. The division into embodiments below is for convenience of description and does not limit the specific implementation of the present application; the embodiments may be combined with and refer to one another where they do not contradict each other.

The implementation details of the server grouping method described in this application are explained below with reference to specific embodiments. These details are provided only to aid understanding and are not required for implementing the solution.

A first aspect of the embodiments of the present application provides a server grouping method, whose flow is shown in FIG. 1. In some embodiments, the method is applied to a terminal with communication and data analysis capabilities, such as a computer, a management server, or another electronic device. This embodiment takes a computer as an example. The server grouping method includes the following steps:

Step 101: Acquire at least one item of asset information on each server.

Specifically, when grouping multiple servers that are already online or about to go online, the computer uses an asset collection tool, for example a host-based intrusion detection system (HIDS), to collect at least one item of asset information from each server to be grouped, and stores the collected asset information separately, tagged with an identifier.

In one example, the server asset information that the computer obtains through the asset collection tool includes one or any combination of the following: processes, port binding information, system users, group users, scheduled tasks, startup items, and environment variables. When collecting asset information, the computer can collect one or more specific kinds of asset information from the servers according to actual needs and the specific classification requirements, for example the four kinds processes, port binding information, system users, and scheduled tasks. The acquired asset information is then used to group the servers accurately. This embodiment does not limit which kinds of asset information are selected or how many.

It should be noted that the asset collection tool the computer uses may be a host-based intrusion detection system or another asset collection tool, for example Goby. In a specific application, the tool can be chosen according to factors such as computer performance and collection requirements. This embodiment does not limit the specific asset collection tool used to collect server asset information.

Step 102: For each server, determine the first feature vector of the server according to the acquired at least one item of asset information.

Specifically, after asset information has been collected from the servers to be grouped, service features are extracted for each server from its collected asset information to facilitate the subsequent grouping, and the first feature vector of the server is determined from that asset information. Because the first feature vector is determined jointly from the results of feature extraction on the at least one item of asset information, it can accurately characterize the service features of each server.

In one example, determining the first feature vector of the server according to the acquired at least one item of asset information includes, when N items of asset information are acquired, where N is an integer greater than 1: classifying all asset information of the servers to generate N classes of asset information sets, and determining the keyword set corresponding to each class of asset information set; obtaining the target keyword set corresponding to each item of asset information of the server, where the target keyword set is the keyword set corresponding to the asset information set to which that item belongs; generating a second feature vector for each item of asset information according to the number of times each keyword in the corresponding target keyword set occurs in that item; and determining the first feature vector according to the N second feature vectors of the server.

Specifically, when generating the first feature vector of each server from the collected asset information, the N items of asset information collected from each server are classified by asset information type to generate N classes of asset information sets. Keywords are then extracted from and aggregated over each asset information set to generate the keyword set corresponding to that set. To generate the first feature vector of one server, the target keyword set of each item of asset information is obtained according to the asset information set to which the item belongs. Keyword detection is performed on each item of asset information, and the order and number of occurrences of each keyword of the target keyword set in that item are counted. From these counts and this order, a second feature vector is generated for each item of asset information, so that one second feature vector is obtained for each of the collected items. The first feature vector of the server is then determined from these second feature vectors. Because each second feature vector is built from keyword detection on one item of asset information, and the first feature vector combines all items collected on the server, the first feature vector accurately characterizes the features of the services deployed on the server.

For example, a previously obtained bag-of-words model is invoked. From the items of asset information collected on each server, a text with sequence information is first generated for each item. Keyword detection is then performed on each such text, and information such as the number of occurrences of each keyword in the text is counted to generate the second feature vector corresponding to each item of asset information.

Suppose the collected asset information includes process information and user information, and the servers to be grouped are server A and server B. The process information of server A is ['mysql', 'python', 'python'] and its user information is ['root', 'test']. The process information of server B is ['redis', 'java', 'tomcat'] and its user information is ['root', 'watch'].

After the bag-of-words model performs keyword detection on the process information, the keyword set corresponding to the process information contains 5 keywords, ordered as 'mysql', 'python', 'redis', 'java', and 'tomcat'. From this keyword set and the process information of server A, the second feature vector for the process information of server A is [1,2,0,0,0]. From the same keyword set and the process information of server B, the second feature vector for the process information of server B is [0,0,1,1,1]. Each value in these vectors is the number of times the corresponding keyword, in the order 'mysql', 'python', 'redis', 'java', 'tomcat', appears in the process information of the server.

After the bag-of-words model performs keyword detection on the user information, the keyword set corresponding to the user information contains 3 keywords, ordered as 'root', 'test', and 'watch'. From this keyword set and the user information of server A, the second feature vector for the user information of server A is [1,1,0]. From the same keyword set and the user information of server B, the second feature vector for the user information of server B is [1,0,1]. Each value in these vectors is the number of times the corresponding keyword, in the order 'root', 'test', 'watch', appears in the user information of the server.

The asset information of the remaining servers to be grouped is processed in the same way to obtain the second feature vector corresponding to each item of asset information of each server. For any server to be grouped, its first feature vector is then determined from the second feature vectors of its items of asset information, for example by using each second feature vector as a new vector element of the first feature vector. Using the bag-of-words model for feature extraction and vectorization of the asset information, and generating the first feature vector from the second feature vectors, lets the first feature vector accurately characterize the service features of the server, which improves the accuracy of the grouping result when the servers are later grouped according to their first feature vectors.
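The counting step in the example above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the helper names `build_keyword_set` and `count_vector` are hypothetical, and keywords are ordered by first appearance, which is only one of the orderings the description allows.

```python
from collections import Counter

def build_keyword_set(documents):
    """Collect distinct keywords, ordered by first appearance (an assumed ordering)."""
    keywords = []
    seen = set()
    for doc in documents:
        for token in doc:
            if token not in seen:
                seen.add(token)
                keywords.append(token)
    return keywords

def count_vector(doc, keywords):
    """Second feature vector: occurrences of each keyword in one item of asset information."""
    counts = Counter(doc)
    return [counts[k] for k in keywords]

# Asset information from the example in the description.
process_info = {
    "A": ["mysql", "python", "python"],
    "B": ["redis", "java", "tomcat"],
}
user_info = {
    "A": ["root", "test"],
    "B": ["root", "watch"],
}

# One keyword set per class of asset information.
process_keywords = build_keyword_set(process_info.values())
user_keywords = build_keyword_set(user_info.values())

second_vectors = {
    server: {
        "process": count_vector(process_info[server], process_keywords),
        "user": count_vector(user_info[server], user_keywords),
    }
    for server in process_info
}

print(second_vectors["A"]["process"])  # [1, 2, 0, 0, 0]
print(second_vectors["B"]["user"])     # [1, 0, 1]
```

Keeping one keyword set per class of asset information means that the process vectors and the user vectors have independent dimensions, matching the 5-element and 3-element vectors in the example.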

It is worth mentioning that, when a second feature vector is generated for an item of asset information, the keywords may be ordered by time of appearance, by preset priority, or by other criteria; this embodiment does not limit the ordering.

Further, determining the first feature vector according to the N second feature vectors of the server includes: for each second feature vector, obtaining the weight of each element in the second feature vector; generating a third feature vector corresponding to the second feature vector according to those weights; and, for each server, concatenating the N generated third feature vectors in the same order to obtain the first feature vector.

Specifically, so that the feature vector reflects the service features of the server more directly, the second feature vectors, which count how often different keywords occur, can be converted into third feature vectors, which reflect how important the different keywords are. The weight of each element of a second feature vector is calculated from information such as the number of times the keyword corresponding to the element appears in the corresponding asset information, the total number of keywords in the asset information set to which that asset information belongs, and the total number of times each element appears in the corresponding asset information. Each element of the second feature vector is then replaced by its calculated weight to generate the corresponding third feature vector.

After the third feature vectors corresponding to the N second feature vectors of each server have been obtained, the N third feature vectors of each server are concatenated in the same order for every server to obtain the first feature vector of that server. The concatenation order may be determined by factors such as the importance of the asset information, the order of acquisition, or the time at which the feature vectors were generated; this embodiment does not limit it. Because every element is weighted and the third feature vectors of different servers are concatenated in the same order, the resulting first feature vector characterizes the service features of the server in terms of the importance of its different services. The subsequent grouping is therefore based on the importance of the different services on each server, which further improves the accuracy of the grouping result.

Still further, obtaining the weight of each element in the second feature vector includes, for each element in the second feature vector: determining the term frequency of the corresponding keyword according to the number of times the keyword appears in the asset information corresponding to the second feature vector and the total number of keywords in the keyword set of the asset information set to which that asset information belongs; determining the inverse document frequency of the corresponding keyword according to the number of items of asset information that contain the keyword and the total number of items of asset information of the servers; and determining the weight of the element according to the term frequency and inverse document frequency of the corresponding keyword.

Specifically, when obtaining the weights, the computer obtains the asset information corresponding to the second feature vector and the asset information set to which that asset information belongs. For each element of the second feature vector, the term frequency of the corresponding keyword is determined from the number of times the keyword appears in that asset information and the total number of keywords in the keyword set of that asset information set. The asset information collected on the servers is then counted and keyword detection is performed on each item; the inverse document frequency of the keyword is determined from the number of items of asset information that contain the keyword and the total number of items of asset information of the servers. Finally, the weight of the element within an item of asset information, that is, its importance, is determined from the term frequency and the inverse document frequency of the keyword. Obtaining the weight of each element from the term frequency and inverse document frequency of its keyword makes it possible to group the servers accurately later, using first feature vectors that represent the importance of the different services.
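A minimal sketch of this weighting is given below. The description states which quantities enter the term frequency and the inverse document frequency but does not give the exact formulas, so two assumptions are made here and marked in the code: the inverse document frequency uses the smoothed form ln((1 + n) / (1 + df)) + 1, and the "documents" are the items of asset information of the same class across servers. The function name `tf_idf_weights` is hypothetical.

```python
import math

def tf_idf_weights(doc, keywords, all_docs):
    """Weight each keyword of one item of asset information by TF * IDF.

    doc      -- tokens of the item being weighted
    keywords -- ordered keyword set of the asset information class
    all_docs -- all items of the same class across servers (assumption)
    """
    n_docs = len(all_docs)
    weights = []
    for kw in keywords:
        # Term frequency: occurrences in this item divided by the size of
        # the keyword set, as described in the text.
        tf = doc.count(kw) / len(keywords)
        # Document frequency: number of items that contain the keyword.
        df = sum(1 for d in all_docs if kw in d)
        # ASSUMPTION: smoothed inverse document frequency; the description
        # does not specify the exact formula.
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        weights.append(tf * idf)
    return weights

user_docs = [["root", "test"], ["root", "watch"]]  # servers A and B
user_keywords = ["root", "test", "watch"]

weights_a = tf_idf_weights(user_docs[0], user_keywords, user_docs)
print(weights_a)
```

With this choice, 'root', which appears on both servers, receives a lower weight than 'test', which appears only on server A, so keywords that distinguish servers dominate the vector.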

Still further, after the weight of each element in the second feature vector is obtained, the method further includes normalizing the weight of each element according to the following formula:

$$\omega_{\mathrm{norm},i} = \frac{\omega_i}{\sqrt{\sum_{j=1}^{m} \omega_j^{2}}}$$

where ω_norm,i is the normalized weight of the i-th element in the second feature vector, ω_i is the weight of the i-th element, ω_j is the weight of the j-th element in the second feature vector, and m is the total number of elements in the second feature vector. The computer generating the third feature vector corresponding to the second feature vector according to the weight of each element in the second feature vector includes: generating the third feature vector according to the normalized weights of the elements of the second feature vector.
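The normalization step can be sketched as follows, assuming it is a Euclidean (L2) normalization, which is consistent with the worked values in this description. The function name `l2_normalize` is hypothetical; the input is the unnormalized weight vector given for the process information of server A.

```python
import math

def l2_normalize(weights):
    """Divide each weight by the Euclidean norm of the whole vector."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0:
        # An all-zero vector has no direction; return it unchanged.
        return list(weights)
    return [w / norm for w in weights]

# Unnormalized process weights of server A from the worked example.
normalized = l2_normalize([0.3920, 0.7840, 0, 0, 0])
print([round(w, 4) for w in normalized])  # [0.4472, 0.8944, 0.0, 0.0, 0.0]
```

After this step every non-zero vector has unit length, so vectors from servers with many and with few keyword occurrences become directly comparable.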

Specifically, after the weights of the elements in a second feature vector are obtained, the initial weights are normalized so that each weight represents more accurately how important the corresponding service feature is to the server. Continuing the earlier example, the second feature vector for the process information of server A is [1, 2, 0, 0, 0] and that of server B is [0, 0, 1, 1, 1]; the second feature vector for the user information of server A is [1, 1, 0] and that of server B is [1, 0, 1]. After each second feature vector is processed with the term frequency-inverse document frequency (TF-IDF) algorithm, the weight matrix for the process information of server A is [0.3920, 0.7840, 0, 0, 0] and that of server B is [0, 0, 0.3920, 0.3920, 0.3920]. After normalization according to the above formula, the weight matrix for the process information of server A becomes [0.4472, 0.8944, 0, 0, 0] and that of server B becomes [0, 0, 0.5774, 0.5774, 0.5774]. The second feature vectors for the user information of servers A and B are normalized in the same way. From the normalized weights, the third feature vector generated for the process information of server A is [0.4472, 0.8944, 0, 0, 0] and that for its user information is [0.5797, 0.8148, 0]; the third feature vector generated for the process information of server B is [0, 0, 0.5774, 0.5774, 0.5774] and that for its user information is [0.5797, 0, 0.8148]. Each element of a third feature vector represents the importance of the corresponding keyword to the server.

After the weights of the elements in the second feature vectors have been normalized and the N third feature vectors of each server obtained, the third feature vectors of each server are concatenated in the same preset order to obtain the first feature vector of that server. For example, if the first feature vector is formed by the third feature vector of the process information followed by the third feature vector of the user information, the first feature vector of server A is [0.4472, 0.8944, 0, 0, 0, 0.5797, 0.8148, 0] and that of server B is [0, 0, 0.5774, 0.5774, 0.5774, 0.5797, 0, 0.8148].
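The normalization and concatenation in the example above can be sketched as follows. The normalization formula is not reproduced in this section; the sketch assumes L2 normalization (dividing each weight by the square root of the sum of squared weights), which is consistent with the figures in the example. The helper names `l2_normalize` and `concat` are illustrative only, and the user-information vectors are copied from the example in their already normalized form because their TF-IDF weights are not listed.

```python
import math

def l2_normalize(weights):
    """Scale a weight vector to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0.0:
        return [0.0 for _ in weights]  # an all-zero vector stays all-zero
    return [w / norm for w in weights]

def concat(vectors):
    """Concatenate third feature vectors in the given fixed order."""
    return [x for vec in vectors for x in vec]

# TF-IDF weights of the process information, taken from the example above.
proc_a = l2_normalize([0.3920, 0.7840, 0, 0, 0])
proc_b = l2_normalize([0, 0, 0.3920, 0.3920, 0.3920])

# Normalized user-information vectors, copied from the example above.
user_a = [0.5797, 0.8148, 0]
user_b = [0.5797, 0, 0.8148]

# Process information first, user information second, for every server.
first_a = [round(x, 4) for x in concat([proc_a, user_a])]
first_b = [round(x, 4) for x in concat([proc_b, user_b])]

print(first_a)
print(first_b)
```

Using the same concatenation order for every server matters: position k of every first feature vector must refer to the same keyword, otherwise distances between servers are meaningless.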

It is worth noting that the asset information included in the first feature vector of a server, as well as the concatenation order and acquisition method of the third feature vectors, may be selected and changed according to actual needs; this embodiment places no limitation on them.

Step 103: obtain the clustering result of the first feature vectors, and determine the group of each server according to the clustering result.

Specifically, after the corresponding first feature vectors have been generated for the servers to be grouped, first feature vectors whose similarity exceeds a preset threshold are treated as one class of feature vectors according to a preset clustering algorithm, which yields the clustering result of the first feature vectors of the servers. The groups of the servers to be grouped are then determined from the servers corresponding to the feature vectors in each class of the clustering result. Clustering the first feature vectors, which represent the service features of the different servers, and determining the server groups from the clustering result improves grouping efficiency and reduces grouping difficulty while preserving grouping accuracy as far as possible.

In one example, treating first feature vectors whose similarity exceeds a preset threshold as one class of feature vectors according to a preset clustering algorithm, and obtaining the clustering result of the first feature vectors, comprises: obtaining target hyperparameters of a preset density clustering algorithm, where the target hyperparameters include an interval threshold between feature vectors of one class and a minimum number of vectors in one class; and performing clustering iterations on the first feature vectors based on the target hyperparameters to obtain the clustering result. Specifically, after the first feature vectors of the servers to be grouped are obtained, the target hyperparameters of the preset density clustering algorithm are obtained; they may be preset, or entered by an administrator in real time in response to a prompt. The target hyperparameters include the interval threshold between feature vectors of one class and the minimum number of vectors in one class used during clustering. The interval between vectors characterizes the similarity between first feature vectors, and the interval threshold is the preset similarity threshold: when the interval between two first feature vectors is smaller than the interval threshold, the two are judged to belong to the same class. The minimum number of vectors specifies how many feature vectors a class must contain at least: when a candidate group of feature vectors contains fewer than this minimum, the feature vectors in it are judged not to form a class. After the hyperparameters of the invoked preset density clustering algorithm are set to the target hyperparameters, the algorithm performs clustering iterations on the first feature vectors of the servers to obtain the clustering result of the first feature vectors.
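The role of the two target hyperparameters can be illustrated with a minimal sketch. It assumes that the interval between two vectors is their Euclidean distance, which the text does not specify; the class `DensityHyperparams`, the function `within_interval`, and the threshold values are hypothetical. The two vectors are the first feature vectors of servers A and B from the earlier example.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class DensityHyperparams:
    eps: float        # interval threshold between feature vectors of one class
    min_samples: int  # minimum number of feature vectors a class must contain

def within_interval(u, v, params):
    """True when the interval (Euclidean distance, an assumption) is below the threshold."""
    return math.dist(u, v) < params.eps

# First feature vectors of servers A and B from the earlier example.
first_a = [0.4472, 0.8944, 0, 0, 0, 0.5797, 0.8148, 0]
first_b = [0, 0, 0.5774, 0.5774, 0.5774, 0.5797, 0, 0.8148]

params = DensityHyperparams(eps=1.0, min_samples=2)  # hypothetical values
distance = math.dist(first_a, first_b)
close_enough = within_interval(first_a, first_b, params)

print(round(distance, 3), close_enough)
```

Because each third feature vector has unit length under the assumed normalization, distances between first feature vectors stay within a small, bounded range, which is worth keeping in mind when choosing the interval threshold.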

For example, the DBSCAN clustering algorithm is used to cluster the first feature vectors of 10 servers to be grouped, and the obtained target hyperparameters are the distance threshold between feature vectors, eps = 5, and the minimum number of vectors in one class, min-samples = 5. After the clustering iterations, category A contains 5 first feature vectors, category B contains 4 first feature vectors, and one first feature vector I lies at a distance greater than 5 from every other first feature vector. The clustering result output by DBSCAN is then that the 5 first feature vectors in category A form one class of feature vectors, and the remaining 5 feature vectors are all discrete feature vectors. According to this clustering result, the servers corresponding to the 5 first feature vectors in category A are placed in one group, and each of the remaining 5 servers is treated as a standalone discrete server, which gives the grouping result for the ten servers to be grouped.
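The outcome described above can be reproduced with a small, simplified DBSCAN written in plain Python. This is a sketch under stated assumptions, not the implementation used in the embodiment: the ten points are hypothetical 2-D stand-ins for first feature vectors, Euclidean distance is assumed, and a point counts itself among its neighbors. In practice a library implementation such as scikit-learn's `DBSCAN(eps=..., min_samples=...)` is a common choice; the source names only the algorithm, not a library.

```python
import math

NOISE = -1

def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: returns one label per point; -1 marks a discrete (noise) point."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i, the point itself included.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # not dense enough to start a class
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
        cluster += 1
    return labels

# Hypothetical data: 5 close points (category A), 4 close points (category B), 1 outlier.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2),
          (100, 100), (101, 100), (100, 101), (101, 101),
          (500, 500)]

labels = dbscan(points, eps=5, min_samples=5)
print(labels)
```

Category B is dense but has only four members, so none of its points reaches the minimum of five and all four are reported as discrete, together with the outlier.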

It is worth noting that the servers to be grouped may consist only of servers that are about to go online, only of servers that are already online, or of both. Regrouping the online servers together with the servers waiting to go online not only groups the servers accurately, but also allows the grouping of all servers to be maintained dynamically, flexibly, and efficiently, reducing the difficulty and cost of group maintenance.

In another example, after the groups of the servers are determined according to the clustering result, the method further comprises: for each group, obtaining the key services of each server in the group; and labeling the group with a server service according to the key services common to the servers. Specifically, after the grouping result of the servers to be grouped is obtained, key-service detection is performed on each server in each group to determine the key services of each server in the same group. The one or more key services shared by all servers in the group are then identified, and the core service of the group is labeled accordingly, so that the core service of the servers in the group can be read off directly and the servers can be managed and maintained accurately.
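Finding the key services shared by every server in a group amounts to a set intersection. The following is a minimal sketch; the function name `label_groups`, the server names, and the service names are hypothetical, and how key services are detected on a server is outside its scope.

```python
def label_groups(groups, key_services):
    """Label each group with the key services common to all of its servers."""
    labels = {}
    for group_name, servers in groups.items():
        service_sets = [set(key_services[s]) for s in servers]
        common = set.intersection(*service_sets) if service_sets else set()
        labels[group_name] = sorted(common)
    return labels

# Hypothetical grouping result and detected key services.
groups = {"group-1": ["srv-a", "srv-b", "srv-c"]}
key_services = {
    "srv-a": ["nginx", "redis"],
    "srv-b": ["nginx", "mysql"],
    "srv-c": ["nginx", "redis", "cron"],
}

group_labels = label_groups(groups, key_services)
print(group_labels)
```

A group whose servers share no key service receives an empty label, which can serve as a cue for the manual verification of the grouping result.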

In addition, after the grouping result of the servers to be grouped is obtained, an administrator may be prompted to verify it, and the grouping result may be further adjusted according to adjustment instructions entered by the administrator, so that it better matches actual needs.

Furthermore, it should be understood that the division of the above methods into steps is only for clarity of description. In implementation, steps may be merged into one step, or a step may be split into several steps; as long as the same logical relationship is included, such variations fall within the scope of protection of this patent. Adding insignificant modifications to the algorithm or process, or introducing insignificant designs, without changing the core design of the algorithm and process, also falls within the scope of protection of this patent.

Another aspect of the embodiments of the present application provides a server grouping apparatus which, referring to FIG. 2, includes:

Acquiring module 201, configured to acquire at least one item of asset information on each server.

Determining module 202, configured to determine, for each server, a first feature vector of the server according to the acquired at least one item of asset information.

Grouping module 203, configured to treat a plurality of first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors according to a preset clustering algorithm, obtain the clustering result of the first feature vectors, and determine the group of each server according to the clustering result.
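How the three modules above cooperate can be sketched as a small pipeline. This is only an illustration of the data flow: the class name `ServerGroupingApparatus`, the stand-in callables, and the inventory data are hypothetical, and the grouping stand-in simply groups identical vectors instead of running a clustering algorithm.

```python
class ServerGroupingApparatus:
    """Wires three logical modules together; each is passed in as a callable."""

    def __init__(self, acquire, determine, group):
        self.acquire = acquire      # module 201: collects asset information of a server
        self.determine = determine  # module 202: turns asset information into a vector
        self.group = group          # module 203: derives groups from all vectors

    def run(self, servers):
        assets = {s: self.acquire(s) for s in servers}
        vectors = {s: self.determine(assets[s]) for s in servers}
        return self.group(vectors)

# Hypothetical stand-ins for the three modules.
inventory = {"srv-a": ["nginx"], "srv-b": ["nginx"], "srv-c": ["mysql"]}

def group_identical(vectors):
    # Placeholder for clustering: servers with identical vectors share a group.
    groups = {}
    for server, vec in vectors.items():
        groups.setdefault(tuple(vec), []).append(server)
    return list(groups.values())

apparatus = ServerGroupingApparatus(
    acquire=lambda server: inventory[server],
    determine=lambda assets: [float("nginx" in assets), float("mysql" in assets)],
    group=group_identical,
)
result = apparatus.run(["srv-a", "srv-b", "srv-c"])
print(result)
```

Passing the modules in as callables mirrors the point that they are logical modules: each one can be replaced without touching the other two.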

It is readily seen that this embodiment is an apparatus embodiment corresponding to the method embodiment, and the two can be implemented in cooperation. The technical details described in the method embodiment remain valid in this embodiment and are not repeated here, to avoid redundancy. Conversely, the technical details described in this embodiment also apply to the method embodiment.

It is worth noting that all modules in this embodiment are logical modules. In practice, a logical unit may be a physical unit, part of a physical unit, or a combination of several physical units. In addition, to highlight the innovative part of the present invention, units not closely related to solving the technical problem addressed by the invention are not introduced in this embodiment; this does not mean that no other units exist in this embodiment.

Another aspect of the embodiments of the present application provides an electronic device which, referring to FIG. 3, includes: at least one processor 301; and a memory 302 communicatively connected to the at least one processor 301; wherein the memory 302 stores instructions executable by the at least one processor 301, and the instructions are executed by the at least one processor 301 so that the at least one processor 301 can perform the server grouping method described in any of the above method embodiments.

The memory 302 and the processor 301 are connected by a bus. The bus may include any number of interconnected buses and bridges, and it connects the various circuits of the one or more processors 301 and the memory 302. The bus may also connect various other circuits, such as peripheral devices, voltage regulators, and power management circuits; these are well known in the art and are not described further here. A bus interface provides the interface between the bus and a transceiver. The transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor 301 is transmitted over a wireless medium through an antenna; the antenna also receives data and passes it to the processor 301.

The processor 301 is responsible for managing the bus and general processing, and may also provide various functions, including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory 302 may be used to store data used by the processor 301 when performing operations.

Another aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the above method embodiments.

That is, those skilled in the art will understand that all or some of the steps of the methods in the above embodiments can be carried out by a program instructing the relevant hardware. The program is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or some of the steps of the methods of the embodiments of the present application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Those of ordinary skill in the art will understand that the above embodiments are specific embodiments for implementing the present application, and that in practice various changes in form and detail may be made to them without departing from the spirit and scope of the present application.

Claims


1. A server grouping method, characterized in that it comprises:

acquiring at least one item of asset information on each server;

for each server, determining a first feature vector of the server according to the acquired at least one item of asset information;

according to a preset clustering algorithm, treating a plurality of the first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, obtaining a clustering result of the first feature vectors, and determining the group of each server according to the clustering result.

2. The server grouping method according to claim 1, characterized in that determining the first feature vector of the server according to the acquired at least one item of asset information comprises:

in the case where N items of asset information are acquired, N being an integer greater than 1:

classifying all the asset information of the servers to generate N classes of asset information sets, and determining a keyword set corresponding to each class of asset information set;

obtaining a target keyword set corresponding to each item of asset information of the server, wherein the target keyword set is the keyword set corresponding to the asset information set to which that item of asset information belongs;

generating a second feature vector for each item of asset information according to the number of occurrences, in that item of asset information, of each keyword in the corresponding target keyword set;

determining the first feature vector according to the N second feature vectors of the server.

3. The server grouping method according to claim 2, characterized in that determining the first feature vector according to the N second feature vectors of the server comprises:

for each second feature vector, obtaining the weight of each element in the second feature vector;

generating a third feature vector corresponding to the second feature vector according to the weights of the elements in the second feature vector;

for each server, concatenating the N generated third feature vectors in the same order to obtain the first feature vector.

4. The server grouping method according to claim 3, characterized in that obtaining the weight of each element in the second feature vector comprises:

for each element in the second feature vector, determining the term frequency of the corresponding keyword according to the number of times the corresponding keyword appears in the asset information corresponding to the second feature vector, and the total number of keywords in the keyword set corresponding to the asset information set to which that asset information belongs;

determining the inverse document frequency of the corresponding keyword according to the number of items of asset information containing the corresponding keyword and the total number of items of asset information of the servers;

determining the weight of the element according to the term frequency and the inverse document frequency of the corresponding keyword.

5. The server grouping method according to claim 3, characterized in that, after obtaining the weight of each element in the second feature vector, the method further comprises: normalizing the weight of each element according to the following formula:

where ω_norm,i is the normalized weight of the i-th element of the second feature vector, ω_i is the weight of the i-th element, ω_j is the weight of the j-th element of the second feature vector, and m is the total number of elements in the second feature vector;

generating the third feature vector corresponding to the second feature vector according to the weights of the elements in the second feature vector comprises:

generating the third feature vector corresponding to the second feature vector according to the normalized weights of the elements of the second feature vector.

6. The server grouping method according to claim 1, characterized in that treating a plurality of the first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors according to a preset clustering algorithm, and obtaining the clustering result of the first feature vectors, comprises:

obtaining target hyperparameters of a preset density clustering algorithm, wherein the target hyperparameters include an interval threshold between feature vectors of one class and a minimum number of vectors in one class of feature vectors;

performing clustering iterations on the first feature vectors based on the target hyperparameters to obtain the clustering result.

7. The server grouping method according to any one of claims 1 to 6, characterized in that the asset information comprises one or any combination of the following: processes, port binding information, system users, group users, scheduled tasks, startup items, and environment variables.

8. The server grouping method according to any one of claims 1 to 6, characterized in that, after determining the group of each server according to the clustering result, the method further comprises:

for each group, obtaining the key services of each server in the group;

labeling the group with a server service according to the key services common to the servers.

9. A server grouping apparatus, characterized in that it comprises:

an acquiring module, configured to acquire at least one item of asset information on each server;

a determining module, configured to determine, for each server, a first feature vector of the server according to the acquired at least one item of asset information;

a grouping module, configured to treat a plurality of the first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors according to a preset clustering algorithm, obtain a clustering result of the first feature vectors, and determine the group of each server according to the clustering result.

10. An electronic device, characterized in that it comprises:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the server grouping method according to any one of claims 1 to 8.

11. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the server grouping method according to any one of claims 1 to 8.