Disclosure of Invention
It is an object of the embodiments of the present specification to provide a multi-model federated learning method, system, and storage medium.
To solve the above technical problem, the embodiments of the present application are implemented as follows:
in a first aspect, the present application provides a multi-model federated learning method, the method comprising:
a server acquires a model set to be trained, wherein the model set to be trained comprises a plurality of models to be trained;
the server generates an optimal allocation matrix by using a multi-model optimal allocation method and allocates the models to be trained to different clients according to the optimal allocation matrix, so that each client downloads its corresponding models to be trained as indicated by the optimal allocation matrix generated by the server, completes the current round of model training, and uploads the trained model parameters to the server;
the server receives the model parameters uploaded by the clients within a preset time and aggregates the model parameters;
and the server determines, according to the aggregated model parameters, the precision of each model to be trained and the total number of training rounds of the model, ends training for the models to be trained that meet the precision requirement or whose number of training rounds exceeds the round-number threshold, and lets the other models to be trained enter the next round of training.
In one embodiment, a multi-model optimal allocation method is adopted to generate an optimal allocation matrix, which comprises the following steps:
randomly distributing the model to be trained to each client to obtain an initial distribution matrix;
calculating a corresponding initial objective function value according to the initial allocation matrix and the model precision function table;
Constructing indexes of all possible model allocation attempts to be trained based on the initial objective function values to obtain an index set;
And determining an optimal allocation matrix according to the index set.
In one embodiment, the structure of the model precision function table comprises model types, initial model precision, the number of clients used for model training and end model precision, and the creation of the model precision function table comprises the following steps:
Obtaining a mathematical model;
calculating the end model precision obtained by one round of training when different types of models have different given initial model precision and different numbers of clients are used based on the mathematical model;
And all the corresponding relations of the model types, the initial model precision, the number of used clients and the ending model precision form an initial model precision function table.
In one embodiment, the updating of the model precision function table includes:
after each round of model training, obtaining the type of the model trained in the round, the initial training model precision at the beginning of the round, the training number of clients used in the round, and the ending training model precision at the end of the round;
and updating the corresponding model precision function table according to the training type, the initial training model precision, the training number of clients, and the ending training model precision.
In one embodiment, updating the corresponding model precision function table according to the training type, the initial training model precision, the training number of clients, and the ending training model precision includes:
searching, among the records of the corresponding training type in the model precision function table, for the data record closest to the initial training model precision and the training number of clients, and updating the end model precision in that data record to the ending training model precision.
In one embodiment, calculating the corresponding initial objective function value according to the initial allocation matrix and the model precision function table includes:
acquiring, for any model, the initial training precision at the beginning of the current round of training;
under the initial allocation matrix, determining the expected inference precision according to the model precision function table and the initial training precision;
And determining an initial objective function value according to the initial training precision and the expected inference precision.
In one embodiment, determining an optimal allocation matrix from the set of indices includes:
Judging whether the index set is empty or not;
if the index set is empty, outputting the current allocation matrix as an optimized allocation matrix;
If the index set is not empty, each possible allocation attempt recorded in the index set is tried to allocate the client to each model according to the allocation attempt under the current allocation matrix, and meanwhile, one or more allocated models on the client are removed so that the client can complete model training within a preset time, the allocation matrix of the trial model is recorded, and the trial objective function value corresponding to the allocation matrix of the trial model is calculated;
Selecting an allocation attempt corresponding to the largest try objective function value as an optimal allocation attempt according to all possible allocation attempts, corresponding try model allocation matrixes and try objective function values recorded in the index set, wherein the try objective function value corresponding to the optimal allocation attempt is the largest try objective function value;
if the maximum trial objective function value is smaller than or equal to the current objective function value, outputting the current allocation matrix as an optimized allocation matrix;
If the maximum trial objective function value is greater than the current objective function value, updating the current allocation matrix to be the optimal allocation matrix corresponding to the optimal allocation trial, updating the variable corresponding to the current objective function value to be the variable corresponding to the maximum trial objective function value, deleting the optimal allocation trial from the index set, and returning to judge whether the index set is empty to continue execution.
In one embodiment, the set of models to be trained includes the newly injected model to be trained and/or the model remaining to be trained after the previous round of training.
In a second aspect, the present application provides a multi-model federated learning system, the system comprising:
a server, configured to acquire a model set to be trained, wherein the model set to be trained comprises a plurality of models to be trained, to generate an optimal allocation matrix by using a multi-model optimal allocation method, and to allocate the models to be trained to different clients according to the optimal allocation matrix;
a plurality of clients, configured to download their respective models to be trained as indicated by the optimal allocation matrix, complete the current round of model training, and upload the trained model parameters to the server;
wherein the server is further configured to receive the model parameters uploaded by the clients within a preset time and aggregate them, to determine the precision of each model to be trained and the total number of training rounds of the model according to the aggregated model parameters, to end training for the models to be trained that meet the precision requirement or whose number of training rounds exceeds the round-number threshold, and to let the other models to be trained enter the next round of training.
In a third aspect, the present application provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the multi-model federated learning method of the first aspect.
As can be seen from the technical solutions provided by the embodiments of the present specification, the allocation of training tasks from multiple models to multiple clients is optimized based on the differences in client resources, with the aim of maximizing the overall training efficiency of the multiple models, so that client resources can be fully utilized and training efficiency during multi-model training is markedly improved.
Detailed Description
In order to make the technical solutions in the present specification better understood by those skilled in the art, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be apparent to those skilled in the art that various modifications and variations can be made in the specific embodiments of the application described herein without departing from the scope or spirit of the application. Other embodiments will be apparent to those skilled in the art from consideration of the specification of the present application. The specification and examples of the present application are exemplary only.
As used herein, the terms "comprising," "including," "having," "containing," and the like are open-ended and mean "including but not limited to."
In the related art, existing federated learning systems focus on single-model training; that is, the system is responsible for training only one model at a time. In such a system, the server manages the federated training of the single model, and training proceeds iteratively. In each iteration, the server selects a certain number of clients to participate in training the model, each client trains the model once, and the server must wait for the slowest client to finish before the round is complete. Existing single-model federated learning systems therefore waste a large amount of the resources of powerful clients, whose capabilities are underutilized because only one model can be trained per round.
On this basis, the application provides a multi-model federated learning method that trains a plurality of models in parallel on an efficient multi-model federated learning system. By taking the differences in client resources into account, the system optimizes the allocation of training tasks from multiple models to multiple clients, with the aim of maximizing the overall training efficiency of the models, so that client resources are fully utilized and performance during multi-model training is markedly improved.
The invention is described in further detail below with reference to the drawings and examples.
The method provided by the embodiment of the application is applied to a multi-model federated learning system, that is, a system that implements the multi-model federated learning method. FIG. 1 illustrates a schematic diagram of a multi-model federated learning system provided by an embodiment of the present application. As shown in FIG. 1, the system comprises a server 11 and several clients 12 that can participate in model training. Each client 12 owns its own user data (i.e., private data) as a participant in federated learning. The server 11 communicates with the clients 12 via the Internet (including backbone networks and various types of access networks) to implement federated learning.
The client 12 may be a PC client, a mobile client, a smart vehicle client, or the like.
The server 11 manages the training of a plurality of AI models and is responsible for the parallel federated learning of multiple models in a cyclic, iterative manner until each model reaches its preset inference accuracy or the total number of training iterations of the model reaches a preset maximum threshold. In each iteration, the server executes a multi-model optimal allocation method, allocates a plurality of models to a plurality of randomly selected clients for parallel training, and aggregates the model parameters after client training, so as to improve model inference accuracy and realize high-performance multi-model federated learning.
To realize the optimal allocation of multiple models to multiple clients, the server communicates with each client periodically to obtain the client's dynamic, client-specific information, including its bandwidth for downloading model data, the models that its private data can support training, its computing capability for model training, and the like. From this information the server dynamically estimates the total time (denoted $\delta_{ij}$) that a client $i$ needs to train a model $j$ once, including the time required for the client to download and upload the model parameters and the time required to train the model once on its private data, and then performs the optimal allocation calculation of multiple models to multiple clients accordingly.
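As a rough illustration of how the server might turn the collected client information into the estimate $\delta_{ij}$, the sketch below adds download, local training, and upload time. The `ClientProfile` fields, the function name `estimate_delta`, and the assumption that transfer time equals model size divided by bandwidth are all hypothetical; the specification does not state how the estimate is computed.

```python
from dataclasses import dataclass, field


@dataclass
class ClientProfile:
    # Hypothetical fields: the specification only says that bandwidth,
    # supported models and computing capability are collected.
    download_mbps: float  # download bandwidth, megabits per second
    upload_mbps: float    # upload bandwidth, megabits per second
    # model id -> estimated time (seconds) to train that model once locally;
    # a model that the client's private data cannot train is simply absent.
    train_seconds: dict = field(default_factory=dict)


def estimate_delta(client: ClientProfile, model_id: str, model_size_mbit: float) -> float:
    """Estimate delta_ij in seconds: download + local training + upload.

    Returns infinity when the client cannot train the model at all, so the
    pair is never considered feasible by an allocation routine.
    """
    if model_id not in client.train_seconds:
        return float("inf")
    download = model_size_mbit / client.download_mbps
    upload = model_size_mbit / client.upload_mbps
    return download + client.train_seconds[model_id] + upload
```

Returning infinity for an unsupported model is one simple way to keep such pairs out of any later allocation step without a separate eligibility table.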
Referring to FIG. 2, a flowchart of a multi-model federated learning method applicable to embodiments of the present application is shown.
As shown in FIG. 2, the multi-model federated learning method may include:
S210, acquiring a model set to be trained, wherein the model set to be trained comprises a plurality of models to be trained. The model set to be trained may comprise newly injected models to be trained and/or models that still require training after the previous round of training.
S220, the server generates an optimal allocation matrix by using a multi-model optimal allocation method and allocates the models to be trained to different clients according to the optimal allocation matrix, so that each client downloads its corresponding models to be trained as indicated by the optimal allocation matrix generated by the server, completes the current round of model training, and uploads the trained model parameters to the server;
S230, the server receives the model parameters uploaded by the clients within a preset time and aggregates the model parameters;
S240, the server determines the precision of each model to be trained and the total number of training rounds of the model according to the aggregated model parameters, ends training for the models to be trained that meet the precision requirement or whose number of training rounds exceeds the round-number threshold, and lets the other models to be trained enter the next round of training.
It can be understood that model training in the multi-model federated learning method provided by the embodiment of the present application is carried out round by round. Each round of training is shown in FIG. 3 and mainly includes the steps of preparing the models to be trained and the clients, preparing the model precision function table, computing the multi-model optimal allocation, training the models in parallel, and collecting and aggregating the model parameters. The specific process is as follows:
Firstly, the server needs to prepare the models to be trained of the round in advance, and all models to be trained are marked as M. The server also needs to randomly select a certain number of clients, record the client set as N, and start the training round. Wherein, M comprises the model which is left after the previous training and needs to be further trained and the newly injected model to be trained.
Then, the server prepares the model precision function table for the round of multi-model allocation. The preparation process involves two operations, one is to initialize the model precision function table (i.e., creation of the model precision function table) according to the mathematical model, and the other is to update the model precision function table (i.e., update of the model precision function table) according to the actual training result of the previous round. The specific steps of creating and updating the model precision function table are described in the following embodiments.
Then, the server uses the multi-model optimal allocation method to generate an optimal allocation matrix $\pi$, so as to allocate the $|M|$ models to be trained in this round to the clients in the set $N$. Under the optimal allocation matrix $\pi$, each client should be able to complete the model training tasks it undertakes within a preset time $T$ (which may be set according to actual requirements). In this allocation calculation, the server computes the allocation of multiple models to multiple clients according to the model precision function table and the estimated time $\delta_{ij}$ required for any client $i \in N$ to complete one round of training of any model $j \in M$, and generates an optimal allocation matrix $\pi = \{\pi_{ij} \mid i \in N, j \in M\}$ with $|N|$ rows and $|M|$ columns, where the element $\pi_{ij} = 1$ indicates that model $j$ is allocated to client $i$ and $\pi_{ij} = 0$ indicates that it is not.
Next, each client downloads the models to be trained for which it is responsible, as indicated by the optimal allocation matrix $\pi$ generated by the server, completes the current round of training, and uploads the trained model parameters to the server.
The server then receives the model parameters uploaded by the clients within the time $T$ required by the training round (i.e., the preset time) and aggregates the model parameters.
Finally, the server determines the precision of each model to be trained and the total number of training rounds of the model according to the aggregated model parameters. Training ends for models that have reached the precision requirement or whose number of training rounds exceeds a certain number (i.e., the round-number threshold, which may be set according to actual requirements), and the other models are kept for the next round of training.
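The server-side bookkeeping of one round described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the allocation matrix is a list of 0/1 rows (one row per client), aggregation is a plain element-wise average of the uploaded parameter vectors (the specification does not name an aggregation rule), and the helper names `is_feasible`, `aggregate`, and `split_finished` are hypothetical.

```python
def is_feasible(pi, delta, T):
    """True if every client i can finish all models assigned to it within T.

    pi[i][j] is 1 when model j is allocated to client i, else 0;
    delta[i][j] is the estimated time for client i to train model j once.
    """
    return all(
        sum(d for d, assigned in zip(delta[i], pi[i]) if assigned) <= T
        for i in range(len(pi))
    )


def aggregate(updates):
    """Element-wise mean of the parameter vectors uploaded for one model.

    Assumption: unweighted averaging; other aggregation rules are possible.
    """
    n = len(updates)
    return [sum(values) / n for values in zip(*updates)]


def split_finished(models, precision, rounds_done, target_precision, max_rounds):
    """Separate models whose training ends from those entering the next round."""
    finished, remaining = [], []
    for m in models:
        # A model stops when it meets its precision requirement or has
        # used up its round budget (threshold semantics are an assumption).
        if precision[m] >= target_precision[m] or rounds_done[m] >= max_rounds:
            finished.append(m)
        else:
            remaining.append(m)
    return finished, remaining
```

The models returned in `remaining`, together with any newly injected models, would form the set $M$ of the next round.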
The structure of the model precision function table includes model type (or model class), initial model precision, the number of clients used for model training, and end model precision.
In one embodiment, the creation of the model precision function table may include:
Obtaining a mathematical model;
calculating, based on the mathematical model, the end model precision obtained by one round of training for different types of models, given different initial model precisions and different numbers of clients used;
and all the corresponding relations of model types, initial model precision, the number of used clients and end model precision form an initial model precision function table.
In one embodiment, the updating of the model precision function table includes:
after each round of model training, obtaining the type of the model trained in the round, the initial training model precision at the beginning of the round, the training number of clients used in the round, and the ending training model precision at the end of the round;
and updating the corresponding model precision function table according to the training category, the initial training model precision, the training quantity of the client and the final training model precision.
Specifically, the multi-model allocation method of the application assumes that, in one federated learning round, the inference accuracy $a_j^{\mathrm{end}}$ of a model $j$ after training is generally a function of the inference accuracy $a_j^{\mathrm{start}}$ of model $j$ at the beginning of the round and of the number of clients $n_j$ used in this round of training. This function is called the model precision function and can be expressed as

$$a_j^{\mathrm{end}} = f_j\left(a_j^{\mathrm{start}},\, n_j\right) \tag{1}$$

For convenient use during the multi-model allocation calculation, the model precision function is stored in the memory of the server in the form of a function table (it will be understood that the model precision function may also be stored in another storage medium or on another server, as long as the server can obtain it when needed), so that the expected model precision can be obtained by looking up the table with the given parameters. In the following, the lookup procedure is still expressed in the form of formula (1). The structure, creation, and update of the model precision function table are described below.
Structurally, the model precision function table comprises the following basic fields: the model type, the current precision of the model (corresponding to $a_j^{\mathrm{start}}$), the number of clients that will be used for model training (corresponding to $n_j$), and the model precision after training (corresponding to $a_j^{\mathrm{end}}$).
When the model precision function table is used, the program performs a fuzzy search of the table according to the given model type, the current precision $a_j^{\mathrm{start}}$ of the model, and the number $n_j$ of clients participating in training, finds the record of the corresponding model type that is closest to the required $a_j^{\mathrm{start}}$ and $n_j$, and reads its trained model precision (corresponding to $a_j^{\mathrm{end}}$) as the expected precision. As above, the present application still uses the notation $f_j\left(a_j^{\mathrm{start}}, n_j\right)$ to represent this fuzzy search of the model precision function table.
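The fuzzy search can be illustrated with a minimal sketch. The specification does not define what "closest" means when two fields are compared at once, so the distance used here (absolute precision difference plus client-count difference normalised by a maximum client count `n_max`) is purely an assumption, as are the `Record` type and the `lookup` name.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    model_type: str
    start_acc: float  # current precision of the model
    n_clients: int    # number of clients used for training
    end_acc: float    # model precision after one round of training


def lookup(table: List[Record], model_type: str, start_acc: float,
           n_clients: int, n_max: int) -> float:
    """Return the trained precision of the closest record of the given type."""
    candidates = [r for r in table if r.model_type == model_type]
    if not candidates:
        raise KeyError(f"no records for model type {model_type!r}")

    def distance(r: Record) -> float:
        # Assumed metric: both terms lie roughly in [0, 1].
        return abs(r.start_acc - start_acc) + abs(r.n_clients - n_clients) / n_max

    return min(candidates, key=distance).end_acc
```

A linear scan is enough for a small table; a grid-indexed table would make the lookup constant-time if the quantization levels are fixed.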
The creation and updating of the model precision function table will be based on the mathematical model and the actual model training results, respectively. The table is created based on the following mathematical model derived from actual observations.
In formula (2), $j$ denotes the model; $\alpha_j$ and $\beta_j$ are model parameters that can be measured in advance from experimental data; $n_j$ is the number of clients used for model training; and the value given by formula (2) is the model inference accuracy obtained by training the original, untrained model $j$ using these clients. Given the mathematical model (2), the model accuracy $a_j^{\mathrm{end}}$ that models $j$ of different types are expected to reach through one round of training can be calculated for different given initial model accuracies $a_j^{\mathrm{start}}$ and different numbers of clients $n_j$, namely formula (3).
Further, based on formula (3), the model accuracy that model $j$ obtains through one round of training can be calculated for different given initial model accuracies and different numbers of clients, and these data can be stored in an in-memory table to form the initial model precision function table. The values of the current-model-precision field in the table may correspond to a number of quantization levels between 0 and 1, for example 0.1, 0.2, ..., 0.9, and the values of the number-of-clients field may range from 1 to $|N|$ at a fixed integer interval, for example 1, 2, ..., $|N|$.
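Building the initial table over such a grid might look like the sketch below. Because the closed form of the mathematical model is not reproduced here, it is passed in as a caller-supplied function `predict_end_acc(model_type, start_acc, n_clients)`; that parameter, the function name `build_table`, and the tuple layout of the records are assumptions made for illustration only.

```python
def build_table(model_types, predict_end_acc, n_total, acc_levels=None, step=1):
    """Enumerate (type, start precision, client count) and store predictions.

    predict_end_acc stands in for the mathematical model; it must return the
    precision expected after one round of training.
    """
    if acc_levels is None:
        # Quantization levels 0.1, 0.2, ..., 0.9 as in the example above.
        acc_levels = [k / 10 for k in range(1, 10)]
    table = []
    for model_type in model_types:
        for start_acc in acc_levels:
            # Client counts 1, 1 + step, ... up to n_total (= |N|).
            for n_clients in range(1, n_total + 1, step):
                end_acc = predict_end_acc(model_type, start_acc, n_clients)
                table.append((model_type, start_acc, n_clients, end_acc))
    return table
```

A coarser `step` trades table size against how far the fuzzy search may have to round a requested client count.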
The model precision function table is continuously updated with the actual results of multi-model training, so that it reflects the actual relationship among the parameters more accurately and improves the accuracy of the multi-model allocation method. Each time a round of models is actually trained, the server records the type of each model $j$ trained in the round, the model accuracy $a_j^{\mathrm{start}}$ at the beginning of the round, the number of clients $n_j$ used in the round, and the model accuracy $a_j^{\mathrm{end}}$ at the end of the round, and updates the corresponding record in the model precision function table with these data. Specifically, among the records in the table for the type of model $j$, the one closest to $a_j^{\mathrm{start}}$ and $n_j$ is found, and its trained model accuracy value is updated to $a_j^{\mathrm{end}}$, thereby completing the record update of the model precision function table.
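A minimal sketch of this update, assuming the table is a list of mutable records and reusing a simple normalised distance to pick the closest record (the distance, the `Record` class, and `update_table` are illustrative assumptions, not part of the specification):

```python
from dataclasses import dataclass


@dataclass
class Record:
    model_type: str
    start_acc: float
    n_clients: int
    end_acc: float


def update_table(table, model_type, start_acc, n_clients, observed_end_acc, n_max):
    """Overwrite the end precision of the closest record with the observed value.

    Returns the updated record, or None if the table has no record of this type.
    """
    candidates = [r for r in table if r.model_type == model_type]
    if not candidates:
        return None
    closest = min(
        candidates,
        key=lambda r: abs(r.start_acc - start_acc) + abs(r.n_clients - n_clients) / n_max,
    )
    closest.end_acc = observed_end_acc  # in-place update of the table entry
    return closest
```

Overwriting keeps the table the same size; blending the old and observed values would be an alternative if single rounds are noisy.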
In one embodiment, a multi-model optimal allocation method is adopted to generate an optimal allocation matrix, which comprises the following steps:
randomly distributing the model to be trained to each client to obtain an initial distribution matrix;
calculating a corresponding initial objective function value according to the initial allocation matrix and the model precision function table;
Constructing indexes of all possible model allocation attempts to be trained based on the initial objective function values to obtain an index set;
And determining an optimal allocation matrix according to the index set.
Wherein, according to the initial allocation matrix and the model precision function table, calculating the corresponding initial objective function value may include:
acquiring, for any model, the initial training precision at the beginning of the current round of training;
under the initial allocation matrix, determining the expected inference precision according to the model precision function table and the initial training precision;
And determining an initial objective function value according to the initial training precision and the expected inference precision.
Wherein determining an optimal allocation matrix from the index set comprises:
Judging whether the index set is empty or not;
if the index set is empty, outputting the current allocation matrix as an optimized allocation matrix;
If the index set is not empty, each possible allocation attempt recorded in the index set is tried to allocate the client to each model according to the allocation attempt under the current allocation matrix, and meanwhile, one or more allocated models on the client are removed so that the client can complete model training within a preset time, the allocation matrix of the trial model is recorded, and the trial objective function value corresponding to the allocation matrix of the trial model is calculated;
Selecting an allocation attempt corresponding to the largest try objective function value as an optimal allocation attempt according to all possible allocation attempts, corresponding try model allocation matrixes and try objective function values recorded in the index set, wherein the try objective function value corresponding to the optimal allocation attempt is the largest try objective function value;
if the maximum trial objective function value is smaller than or equal to the current objective function value, outputting the current allocation matrix as an optimized allocation matrix;
If the maximum trial objective function value is greater than the current objective function value, updating the current allocation matrix to be the optimal allocation matrix corresponding to the optimal allocation trial, updating the variable corresponding to the current objective function value to be the variable corresponding to the maximum trial objective function value, deleting the optimal allocation trial from the index set, and returning to judge whether the index set is empty to continue execution.
Specifically, the training efficiency of a model to be trained (hereinafter simply "model") in one round is defined as the improvement of its inference accuracy, namely $a_j^{\mathrm{end}} - a_j^{\mathrm{start}}$. On this basis, the application provides a multi-model optimal allocation method that solves for an optimal allocation matrix $\pi$ from multiple models to multiple clients so as to maximize the overall weighted training efficiency of all models in each round of training, namely formula (4), thereby optimizing the overall training efficiency of the multi-model federated learning system.

$$\max_{\pi} F(\pi) = \sum_{j \in M} w_j \left( a_j^{\mathrm{end}}(\pi) - a_j^{\mathrm{start}} \right) \tag{4}$$

Here $w_j$ is the importance factor of model $j$, which can be set manually (in particular, $w_j \equiv 1$ for all models indicates that all models have the same priority); $a_j^{\mathrm{start}}$ is the accuracy of model $j$ at the beginning of the round (i.e., the initial training precision); and $a_j^{\mathrm{end}}(\pi)$ is the expected inference accuracy under the allocation matrix $\pi$, which can be obtained by querying the model precision function table, i.e., $a_j^{\mathrm{end}}(\pi) = f_j\left(a_j^{\mathrm{start}}, n_j\right)$, where $f_j$ denotes the lookup of the model precision function table and $n_j = \sum_{i \in N} \pi_{ij}$. $M$ and $N$ are the set of models and the set of clients, respectively, of the current round of training.
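The objective of formula (4) can be evaluated directly from an allocation matrix, as in the following sketch. The table lookup is abstracted as a caller-supplied function `expected_acc(j, start_acc, n_j)`; that signature and the name `objective` are assumptions for illustration, and the function is expected to return the unchanged starting precision when `n_j` is 0 so that an unassigned model contributes no gain.

```python
def objective(pi, weights, start_acc, expected_acc):
    """Weighted sum of expected precision gains over all models.

    pi           -- allocation matrix, pi[i][j] == 1 if model j is on client i
    weights      -- importance factor w_j per model
    start_acc    -- precision of each model at the start of the round
    expected_acc -- stand-in for the model precision function table lookup
    """
    total = 0.0
    for j, w_j in enumerate(weights):
        n_j = sum(row[j] for row in pi)  # clients assigned to model j
        total += w_j * (expected_acc(j, start_acc[j], n_j) - start_acc[j])
    return total
```

For example, with two clients both training model 0 and one of them also training model 1, `n_j` is 2 for model 0 and 1 for model 1, and each model's gain is scaled by its weight before summing.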
In general, the multi-model optimal allocation method provided by the application finds an optimal allocation matrix from multiple models to multiple clients through the following calculation process. First, the models to be trained are randomly allocated to the clients to form an initial allocation matrix, and the corresponding objective function value $F(\pi)$ is calculated. Note that this process must ensure that each client is able to complete its own model training within the specified time $T$. The optimal model allocation is then sought iteratively in a greedy manner: each iteration starts from the current $\pi$ and selects, from all possible model allocation attempts, the attempt that maximizes the objective function value of the current round of training, until no attempt can further increase the objective function value. A model allocation attempt means assigning a model to a client and, if necessary, simultaneously rejecting one or more models already assigned to that client so that the client can complete its model training within time $T$. Rejection is itself an iterative process that removes, one at a time, the model whose removal causes the smallest loss of objective function value, until the remaining models can be trained within $T$.
FIG. 4 shows a specific flow of the multi-model optimal allocation method. The input parameters required by the method comprise the time upper limit $T$ of each round (i.e., the preset time), the set $N$ of clients participating in the round of training, the set $M$ of models to be trained, the total time $\delta_{ij}$ required for any client $i \in N$ to train any model $j \in M$, the weight parameter $w_j$ of each model, and an initial allocation matrix $\pi$ whose elements are all 0. The method generates the final optimal allocation matrix by performing the following steps.
Step 1: Initialize the model allocation and record the result in the initial allocation matrix $\pi$. The models are randomly assigned to the clients until no client has spare time to train a further model; that is, for any client $i \in N$, $\sum_{j \in M} \pi_{ij}\,\delta_{ij} \le T$, and adding any model not yet assigned to client $i$ would exceed $T$.
Step 2: Calculate the value of $F(\pi)$ according to formula (4) from the initial allocation matrix $\pi$ and store it in a variable $u$; construct the indexes $(i, j)$ of all possible model allocation attempts and store them in a set $\Gamma$ for the subsequent iterations, where each element $(i, j)$ of $\Gamma$ denotes a model $j$ that can be allocated to client $i$ but is not currently allocated to it.
Step 3: Determine whether the index set $\Gamma$ is empty. If it is empty, output the allocation matrix $\pi$ as the final allocation result, i.e., the optimal allocation matrix, and end the method; if it is not empty, execute the next step and enter the loop iteration.
Step 4, attempting to assign model j to client i under the current assignment matrix pi for each possible assignment attempt (i, j) recorded in index set Γ, while (if necessary) rejecting one or more assigned models on client j to enable client j to complete training of all models it undertakes within time T, wherein rejecting is also an iterative process of rejecting one assigned model from the client at a time and minimizing loss of objective function values until the remaining models can be trained within T, recording the model assignment matrix resulting from the above operationsCalculating objective function valueAnd stores the variable u(i,j).
Step 5: From all possible allocation attempts (i, j) recorded in the index set Γ, together with the trial model allocation matrices and objective function values u(i,j) obtained in step 4, select the allocation attempt with the largest objective function value as the best allocation attempt (i*, j*), i.e., (i*, j*) = argmax over (i, j) ∈ Γ of u(i,j).
Step 6: Determine whether the objective function value u(i*,j*) corresponding to the selected best allocation attempt (i*, j*) is larger than the objective function value u corresponding to the current allocation matrix π. If u(i*,j*) ≤ u, terminate the loop, output the current allocation matrix π as the final result, and end the method; otherwise, execute the next step.
Step 7: Update π to the trial model allocation matrix corresponding to the best allocation attempt (i*, j*), update the objective function value variable u to u(i*,j*), and delete the selected (i*, j*) from Γ. Then return to step 3 and continue.
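Steps 1 to 7 can be illustrated with the following minimal Python sketch. It is not the implementation of the embodiment: formula (4) is not reproduced in this passage, so the objective F is passed in as a callable, and `weighted_sqrt_objective` is a purely hypothetical stand-in used only to make the example runnable. The function names, the dictionary-based representation of π, and the tie-breaking by sorted order are likewise assumptions.

```python
import math
import random
from typing import Callable, Dict, List, Set, Tuple

Allocation = Dict[int, Set[int]]     # client i -> set of models j assigned to i
Delta = Dict[int, Dict[int, float]]  # delta[i][j]: time client i needs for model j
Objective = Callable[[Allocation], float]


def client_load(pi: Allocation, delta: Delta, i: int) -> float:
    """Total training time of the models currently assigned to client i."""
    return sum(delta[i][j] for j in pi[i])


def initial_allocation(clients: List[int], models: List[int], delta: Delta,
                       T: float, rng: random.Random) -> Allocation:
    """Step 1: fill each client in random order until nothing more fits in T."""
    pi: Allocation = {i: set() for i in clients}
    for i in clients:
        order = list(models)
        rng.shuffle(order)
        for j in order:
            if client_load(pi, delta, i) + delta[i][j] <= T:
                pi[i].add(j)
    return pi


def try_assign(pi: Allocation, i: int, j: int, delta: Delta, T: float,
               objective: Objective) -> Allocation:
    """Step 4: add model j to client i, evicting other models until i fits in T."""
    trial = {c: set(ms) for c, ms in pi.items()}
    trial[i].add(j)

    def value_without(k: int) -> float:
        trial[i].remove(k)
        value = objective(trial)
        trial[i].add(k)
        return value

    while client_load(trial, delta, i) > T:
        # Evict the model whose removal loses the least objective value.
        victim = max(sorted(trial[i] - {j}), key=value_without)
        trial[i].remove(victim)
    return trial


def optimize_allocation(clients: List[int], models: List[int], delta: Delta,
                        T: float, objective: Objective,
                        seed: int = 0) -> Tuple[Allocation, float]:
    pi = initial_allocation(clients, models, delta, T, random.Random(seed))
    u = objective(pi)                                      # step 2
    gamma = {(i, j) for i in clients for j in models
             if j not in pi[i] and delta[i][j] <= T}
    while gamma:                                           # step 3
        trials = {a: try_assign(pi, a[0], a[1], delta, T, objective)
                  for a in gamma}                          # step 4
        values = {a: objective(t) for a, t in trials.items()}
        best = max(sorted(values), key=values.get)         # step 5
        if values[best] <= u:                              # step 6
            break
        pi, u = trials[best], values[best]                 # step 7
        gamma.discard(best)
    return pi, u


def weighted_sqrt_objective(weights: Dict[int, float]) -> Objective:
    """Hypothetical objective (NOT formula (4)): sum_j w_j * sqrt(#clients on j)."""
    def f(pi: Allocation) -> float:
        return sum(w * math.sqrt(sum(1 for ms in pi.values() if j in ms))
                   for j, w in weights.items())
    return f
```

With a single client that has time for only one of two models, the sketch ends with the higher-weight model assigned regardless of the random initial choice, because the trial in step 4 evicts the lower-weight model when that increases the objective.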
The multi-model federated learning method provided by the embodiment of the application trains a plurality of models in parallel and optimizes the allocation of training tasks from the models to the clients based on the differences in client resources. It is intended to maximize the overall training efficiency across the models, make full use of client resources, and improve training efficiency in multi-model training.
With continued reference to FIG. 1, a schematic diagram of a multi-model federated learning system according to one embodiment of the present application is shown.
As shown in FIG. 1, the multi-model federated learning system may include:
The server 11 is used for acquiring a model set to be trained, wherein the model set to be trained comprises a plurality of models to be trained, generating an optimal allocation matrix by adopting a multi-model optimal allocation method, and allocating the models to be trained to different clients according to the optimal allocation matrix;
The clients 12 are used for receiving the optimal allocation matrix sent by the server, downloading respective corresponding models to be trained according to the indication of the optimal allocation matrix, completing the model training of the round, and uploading the trained model parameters to the server, wherein each client is required to complete respective model training tasks within preset time according to the optimal allocation matrix;
The server 11 is further configured to receive, within a preset time, the model parameters uploaded by the clients and aggregate them; determine the precision of each model to be trained and the total number of training rounds according to the aggregated model parameters; end training for any model to be trained that meets the precision requirement or whose number of training rounds exceeds the round threshold; and enter the other models to be trained into the next round of training.
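The end-of-round decision made by the server can be sketched as follows. This is only an illustration under assumed names: `ModelState`, `split_finished`, and the per-model `target_precision` mapping are hypothetical, and how the precision is measured after aggregation is outside the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ModelState:
    precision: float   # precision determined from the aggregated parameters
    rounds_done: int   # total number of training rounds completed so far


def split_finished(states: Dict[str, ModelState],
                   target_precision: Dict[str, float],
                   max_rounds: int) -> Tuple[List[str], List[str]]:
    """Return (finished, remaining) model names after one round."""
    finished: List[str] = []
    remaining: List[str] = []
    for name, state in states.items():
        meets_precision = state.precision >= target_precision[name]
        over_threshold = state.rounds_done > max_rounds
        if meets_precision or over_threshold:
            finished.append(name)    # training ends for this model
        else:
            remaining.append(name)   # model enters the next round
    return finished, remaining
```

The `remaining` list, possibly together with newly injected models, would form the model set to be trained in the next round.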
Optionally, the server 11 is further configured to:
randomly distributing the model to be trained to each client to obtain an initial distribution matrix;
calculating a corresponding initial objective function value according to the initial allocation matrix and the model precision function table;
Constructing indexes of all possible model allocation attempts to be trained based on the initial objective function values to obtain an index set;
And determining an optimal allocation matrix according to the index set.
Optionally, the model precision function table comprises a model type, an initial model precision, the number of clients used for model training and an end model precision, and the server 11 is further configured to create the model precision function table, including:
Obtaining a mathematical model;
calculating, based on the mathematical model, the end model precision obtained after one round of training for each model type, each given initial model precision, and each number of clients used;
and forming an initial model precision function table from all the correspondences among model type, initial model precision, number of clients used, and end model precision.
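As a rough illustration of how such a table could be built, the sketch below enumerates every combination of model type, initial precision, and client count and stores the end precision predicted by a mathematical model. The passage does not specify the mathematical model, so `toy_model`, its model-type names, and its rate constants are purely hypothetical placeholders.

```python
import math
from itertools import product
from typing import Callable, Dict, Iterable, Tuple

# (model type, initial model precision, number of clients used)
Key = Tuple[str, float, int]


def toy_model(model_type: str, initial_precision: float, num_clients: int) -> float:
    """Hypothetical stand-in for the mathematical model: saturating gain."""
    rate = {"cnn": 0.30, "lstm": 0.20}[model_type]   # assumed per-type rates
    gain = rate * (1.0 - math.exp(-0.5 * num_clients))
    return initial_precision + (1.0 - initial_precision) * gain


def build_precision_table(model_types: Iterable[str],
                          initial_precisions: Iterable[float],
                          client_counts: Iterable[int],
                          model: Callable[[str, float, int], float]) -> Dict[Key, float]:
    """Map every (type, initial precision, client count) to an end precision."""
    return {
        (t, p, n): model(t, p, n)
        for t, p, n in product(list(model_types), list(initial_precisions),
                               list(client_counts))
    }
```

A dictionary keyed by the three inputs keeps lookups simple; a real table might instead discretize the initial precision on a fixed grid.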
Optionally, the server 11 is further configured to update a model precision function table, including:
after each round of model training, obtaining the type of the model trained in the round, the initial training model precision at the start of the round, the number of clients used for training in the round, and the end training model precision at the end of the round;
and updating the corresponding model precision function table according to the training type, the initial training model precision, the number of clients used for training, and the end training model precision.
Optionally, the server 11 is further configured to:
searching, among the records of the corresponding training type in the model precision function table, for the data record closest to the initial training model precision and the number of clients used for training, and updating the end model precision in that data record to the end training model precision.
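The update can be sketched as a nearest-record search followed by an in-place overwrite. The passage does not define "closest", so the distance below (absolute precision difference plus the client-count difference divided by an assumed `client_scale`) is an assumption, as are the `Record` type and the function name.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Record:
    initial_precision: float
    num_clients: int
    end_precision: float


def update_closest_record(table: Dict[str, List[Record]], model_type: str,
                          initial_precision: float, num_clients: int,
                          end_precision: float,
                          client_scale: float = 10.0) -> Record:
    """Overwrite the end precision of the record nearest to the observation."""
    def distance(r: Record) -> float:
        # Assumed metric: precision gap plus scaled client-count gap.
        return (abs(r.initial_precision - initial_precision)
                + abs(r.num_clients - num_clients) / client_scale)

    closest = min(table[model_type], key=distance)
    closest.end_precision = end_precision
    return closest
```

For example, with records at (0.5, 2 clients), (0.5, 8 clients), and (0.8, 2 clients), an observation of initial precision 0.52 with 7 clients updates the second record.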
Optionally, the server 11 is further configured to:
acquiring the initial training precision of each model at the start of the current round of training;
determining, under the initial allocation matrix, the expected inference precision according to the model precision function table and the initial training precision;
And determining an initial objective function value according to the initial training precision and the expected inference precision.
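One way to picture this computation is sketched below. Formula (4) is not reproduced in this passage, so the objective used here, a weighted sum of expected precision gains, is an assumption rather than the formula of the embodiment. The `expected_precision` callable stands in for the lookup in the model precision function table, and all names are hypothetical.

```python
from typing import Callable, Dict, Set


def objective_value(pi: Dict[int, Set[str]],
                    initial_precision: Dict[str, float],
                    weights: Dict[str, float],
                    expected_precision: Callable[[str, float, int], float]) -> float:
    """Assumed objective: sum over models of w_j * (expected - initial precision)."""
    total = 0.0
    for model, p0 in initial_precision.items():
        # Number of clients that the allocation pi assigns to this model.
        n_clients = sum(1 for assigned in pi.values() if model in assigned)
        # A model with no clients is not trained, so its precision is unchanged.
        p1 = expected_precision(model, p0, n_clients) if n_clients else p0
        total += weights[model] * (p1 - p0)
    return total
```

Because the value depends only on the allocation, the initial precisions, and the table lookup, the same function could evaluate both the initial allocation matrix and any trial allocation matrix.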
Optionally, the server 11 is further configured to:
Judging whether the index set is empty or not;
if the index set is empty, outputting the current allocation matrix as an optimized allocation matrix;
if the index set is not empty, for each possible allocation attempt recorded in the index set, attempting to allocate the model to the client according to that allocation attempt under the current allocation matrix while, if necessary, removing one or more already-allocated models from that client so that the client can complete model training within the preset time, recording the resulting trial model allocation matrix, and calculating the trial objective function value corresponding to the trial model allocation matrix;
selecting, from all possible allocation attempts recorded in the index set and their corresponding trial model allocation matrices and trial objective function values, the allocation attempt with the largest trial objective function value as the optimal allocation attempt, the trial objective function value corresponding to the optimal allocation attempt being the maximum trial objective function value;
if the maximum trial objective function value is smaller than or equal to the current objective function value, outputting the current allocation matrix as the optimized allocation matrix;
if the maximum trial objective function value is greater than the current objective function value, updating the current allocation matrix to the trial model allocation matrix corresponding to the optimal allocation attempt, updating the current objective function value to the maximum trial objective function value, deleting the optimal allocation attempt from the index set, and returning to the step of judging whether the index set is empty to continue execution.
Optionally, the model set to be trained includes a newly injected model to be trained and/or a model to be trained remaining after the previous training round.
The multi-model federated learning system provided in this embodiment can implement the foregoing method embodiments; its implementation principle and technical effects are similar and are not described herein again.
In another aspect, the present application also provides a storage medium, which may be the storage medium included in the system of the foregoing embodiment, or may be a storage medium that exists alone and is not assembled into a device. The storage medium stores one or more programs that are used by one or more processors to perform the multi-model federated learning method described in the present application.
Storage media, including both permanent and non-permanent, removable and non-removable media, may be implemented in any method or technology for storage of information. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be referred to across embodiments, and each embodiment mainly describes its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the corresponding description of the method embodiments.