Technical Field
Embodiments of the present disclosure relate to the field of computer technology, specifically to artificial intelligence, deep learning, and image processing, and in particular to a method and an apparatus for searching for a model structure.
Background
Deep learning has achieved great success in many areas. In deep learning, the quality of the model structure (that is, the structure of the neural network) has a major influence on the performance of the final model. However, designing a model structure manually requires extensive experience and a search over many combinations; because the numerous network parameters produce a combinatorial explosion, conventional random search is almost infeasible. Neural architecture search (NAS) has therefore become a research hotspot in recent years: it uses algorithms in place of tedious manual work to search automatically for the best model structure.
Model structures found by existing NAS-based automatic search methods suffer from low accuracy.
Summary
Provided are a method, an apparatus, an electronic device, and a computer-readable storage medium for searching for a model structure.
According to a first aspect, a method for searching for a model structure is provided, the method including:
obtaining a classification threshold of a model structure to be replaced at at least one preset recall rate, where the classification threshold includes a threshold value used by the model structure to be replaced to map features of data to be classified to a corresponding category; and
determining a search space of model structures, initializing a model structure generator, and searching out a target model structure through multiple rounds of an iterative operation. The iterative operation includes: using the model structure generator to search out a candidate model structure in the search space, training the candidate model structure, and obtaining the classification threshold of the trained candidate model structure at each preset recall rate; generating feedback information according to the difference between the classification thresholds of the trained candidate model structure and of the model structure to be replaced at the same preset recall rate, where the model structure generator is updated based on the feedback information before the next iterative operation is performed; and, in response to determining that the model structure generator has reached a preset convergence condition, determining the candidate model structure of the current iterative operation as the target model structure.
According to a second aspect, an apparatus for searching for a model structure is provided, the apparatus including:
an acquisition unit configured to obtain a classification threshold of a model structure to be replaced at at least one preset recall rate, where the classification threshold includes a threshold value used by the model structure to be replaced to map features of data to be classified to a corresponding category; and a search unit configured to determine a search space of model structures, initialize a model structure generator, and search out a target model structure through multiple rounds of an iterative operation. The search unit includes: a computing unit configured to perform the following step of the iterative operation: using the model structure generator to search out a candidate model structure in the search space, training the candidate model structure, and obtaining the classification threshold of the trained candidate model structure at each preset recall rate; a generation unit configured to perform the following step of the iterative operation: generating feedback information according to the difference between the classification thresholds of the trained candidate model structure and of the model structure to be replaced at the same preset recall rate, where the model structure generator is updated based on the feedback information before the next iterative operation is performed; and a determining unit configured to perform the following step of the iterative operation: in response to determining that the model structure generator has reached a preset convergence condition, determining the candidate model structure of the current iterative operation as the target model structure.
According to a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; and a storage device for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for searching for a model structure provided in the first aspect.
According to a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing a computer program, where the program, when executed by a processor, implements the method for searching for a model structure provided in the first aspect.
The method and apparatus for searching for a model structure provided by the present disclosure generate feedback information according to the difference between the classification thresholds of the candidate model structure and of the model structure to be replaced at the same recall rate, and iteratively update the parameters of the model structure generator according to the feedback information, so that the performance of the target model structure finally found meets expectations, which improves the accuracy of the model structure search.
The technology of the present application solves the problem that model structures found by automatic model structure search methods have low accuracy.
It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.
Brief Description of the Drawings
The accompanying drawings are provided for a better understanding of the present solution and do not limit the present application. In the drawings:
FIG. 1 is a diagram of an exemplary system architecture to which embodiments of the present application may be applied;
FIG. 2 is a flowchart of one embodiment of the method for searching for a model structure according to the present application;
FIG. 3 is a flowchart of another embodiment of the method for searching for a model structure according to the present application;
FIG. 4 is a schematic structural diagram of one embodiment of the apparatus for searching for a model structure according to the present application;
FIG. 5 is a block diagram of an electronic device for implementing the method for searching for a model structure according to embodiments of the present application.
Detailed Description
Exemplary embodiments of the present application are described below with reference to the accompanying drawings. They include various details of the embodiments to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.
FIG. 1 shows an exemplary system architecture 100 to which embodiments of the method or the apparatus for searching for a model structure of the present application may be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104, for example to receive or send messages. Various client applications may be installed on the terminal devices 101, 102, 103, such as image classification applications, information classification applications, search applications, shopping applications, and financial applications.
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support receiving server messages, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, and desktop computers.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, multiple software modules used to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server that provides background services for applications running on the terminal devices 101, 102, 103, or a server that provides support for neural network models running on the terminal devices 101, 102, 103. The server 105 may obtain data to be processed from the terminal devices 101, 102, 103, process the data using a neural network model, and return the processing results to the terminal devices 101, 102, 103. The server 105 may also use media data such as image data, voice data, and text data obtained from the terminal devices 101, 102, 103 or from a database to train neural network models that perform various deep learning tasks (such as image processing, speech recognition, and text translation), and send the trained neural network models to the terminal devices 101, 102, 103. Alternatively, the server 105 may automatically search out a well-performing neural network model structure based on the deep learning task to be performed, and train the found neural network model structure on the media data.
It should be noted that the method for searching for a model structure provided by embodiments of the present disclosure is generally executed by the server 105; accordingly, the apparatus for searching for a model structure is generally provided in the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation.
Continuing with reference to FIG. 2, a flow 200 of one embodiment of the method for searching for a model structure according to the present disclosure is shown. The method for searching for a model structure includes the following steps:
Step 201: obtain the classification threshold of the model structure to be replaced at at least one preset recall rate.
Here, the classification threshold includes the threshold value used by the model structure to be replaced to map features of data to be classified to a corresponding category.
In this embodiment, the execution body of the method for searching for a model structure (for example, the server shown in FIG. 1) may first obtain the model structure to be replaced. The model structure to be replaced may be a model structure used to implement functions such as target classification, target recognition, or target verification. In practice, the model structure to be replaced may be a neural network model running online, for example a neural network model that supports a specific function in a terminal application, such as an image classification model or a speech recognition model.
The execution body may also obtain the classification threshold of the model structure to be replaced at at least one preset recall rate. The recall rate (Recall Ratio) is the ratio of the amount of relevant information retrieved from a database to the total amount of relevant information, and is an indicator of how successfully a retrieval system retrieves relevant data from a data set. In a classification system with a positive category and a negative category, the recall rate is the proportion of all positive samples that are correctly identified as positive. The classification threshold is the threshold value used by the model structure to be replaced when it maps features of data to be classified to a corresponding category in order to classify the data. The classification threshold may be preset and is used to determine the category to which an object belongs based on features extracted from that object. While running, the model structure to be replaced classifies objects such as images, speech, and text. Specifically, the classification threshold may be a probability threshold for judging, from the extracted features, that the object to be classified belongs to a certain category, and the classification threshold is associated with the recall rate.
In this embodiment, the server may obtain the classification thresholds of the model structure to be replaced at multiple different recall rates, so that the network parameters of the model structure generator (that is, the neural network used to search for model structures) can be optimized according to these thresholds, thereby improving the search accuracy of that neural network.
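The relationship between a preset recall rate and its classification threshold can be illustrated with a minimal sketch. It assumes a binary classifier that outputs one score per sample and predicts the positive category when the score is at least the threshold; the function names `threshold_at_recall` and `thresholds_at_recalls` are hypothetical and do not come from the disclosure.

```python
def threshold_at_recall(scores, labels, target_recall):
    """Return the highest score threshold whose recall reaches target_recall.

    Assumption: a sample is predicted positive when score >= threshold.
    """
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    # Scores of the true positive samples, highest first.
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("at least one positive sample is required")
    n = len(positives)
    # Smallest number of positives that must be accepted to reach the recall.
    needed = next(k for k in range(1, n + 1) if k / n >= target_recall)
    return positives[needed - 1]


def thresholds_at_recalls(scores, labels, recalls):
    """Map each preset recall rate to its classification threshold."""
    return {r: threshold_at_recall(scores, labels, r) for r in recalls}
```

Under these assumptions, calling `thresholds_at_recalls` once for the model structure to be replaced and once for each trained candidate yields threshold tables that can be compared recall rate by recall rate.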
Step 202: determine the search space of model structures, initialize the model structure generator, and search out the target model structure through multiple rounds of the iterative operation.
In this embodiment, a search space of model structures that can be used to perform the classification task is determined. The search space includes multiple candidate model structures, each of which may be composed of basic building units. The model structure generator may sample basic building units from the search space and form a complete candidate model structure by stacking and connecting them. The model structure generator generates model structures based on the search space and external feedback information, and may be implemented as a recurrent neural network, a convolutional neural network, a reinforcement learning algorithm, an evolutionary algorithm, a simulated annealing algorithm, and so on. The network parameters or the structure sampling strategy of the model structure generator may be initialized, where the network parameters are the connection weights between neurons at different layers of the neural network, and the structure sampling strategy is the method of sampling model structures or basic building units from the search space. Multiple rounds of the iterative operation are then performed until the iteration ends and the target model structure has been found.
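As an illustration of sampling basic building units from a search space, the following is a minimal sketch that uses uniform random sampling as a stand-in for the structure sampling strategy. The contents of `SEARCH_SPACE` and the name `sample_structure` are assumptions made for this example; the disclosure does not specify them.

```python
import random

# Hypothetical search space: the choices available for each building unit.
SEARCH_SPACE = {
    "op": ["conv3x3", "conv5x5", "max_pool", "identity"],
    "channels": [16, 32, 64],
}


def sample_structure(rng, num_layers=4):
    """Stack num_layers building units, each sampled uniformly at random."""
    return [
        {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}
        for _ in range(num_layers)
    ]


candidate = sample_structure(random.Random(0), num_layers=3)
```

A learned generator (for example a recurrent network) would replace the uniform `rng.choice` calls with a distribution that is adjusted by the feedback information.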
The iterative operation includes step 2021 and step 2022:
Step 2021: use the model structure generator to search out a candidate model structure in the search space, train the candidate model structure, and obtain the classification threshold of the trained candidate model structure at each preset recall rate.
In each iterative operation, the current model structure generator may be used to search for a candidate model structure in the search space. Specifically, model structure encoding rules may be predefined, and the model structure generator may generate a sequence that represents a candidate model structure. For example, when the model structure generator is implemented as a recurrent neural network, an encoding of the search space may be used as the input of the recurrent neural network, and the sequence output by the recurrent neural network may be decoded according to the model structure encoding rules to obtain the candidate model structure. Alternatively, when the model structure generator is implemented as a reinforcement learning algorithm, it may generate a state sequence, which is decoded according to the model structure encoding rules to obtain the candidate model structure.
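Decoding a generated sequence into a candidate model structure can be sketched as follows. The encoding rule used here (pairs of integer indices into two lookup tables) and the names `OPS`, `CHANNELS`, and `decode` are hypothetical; the disclosure only states that encoding rules are predefined.

```python
# Hypothetical encoding rule: the sequence is a flat list of index pairs,
# (operation index, channel index), one pair per layer.
OPS = ["conv3x3", "conv5x5", "max_pool", "identity"]
CHANNELS = [16, 32, 64]


def decode(sequence):
    """Turn a generator output sequence into a list of layer descriptions."""
    if len(sequence) % 2 != 0:
        raise ValueError("sequence must contain (op, channels) index pairs")
    layers = []
    for i in range(0, len(sequence), 2):
        layers.append({"op": OPS[sequence[i]], "channels": CHANNELS[sequence[i + 1]]})
    return layers
```

For example, under this rule the sequence `[0, 1, 2, 0]` describes a two-layer structure: a 3x3 convolution with 32 channels followed by max pooling with 16 channels.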
The candidate model structure found may then be trained on training data until it converges. The classification threshold of the converged candidate model structure at the preset recall rate may then be obtained, where this recall rate has the same value as the recall rate used when computing the classification threshold of the model structure to be replaced.
The training data may be a sample data set obtained by the server through a terminal device, a training data set obtained by the server from local storage or a knowledge base, or a training data set obtained through the Internet or other channels. In this embodiment, the training data may be training data for a classification task, such as image data for a target recognition or identity authentication task. Whether the candidate model structure has converged may be judged by whether a preset performance indicator reaches a preset convergence threshold. For example, if classification accuracy is the convergence indicator and 90% is the convergence threshold, the candidate model structure is judged to have converged when its classification accuracy reaches 90%.
In this embodiment, whether training of the candidate model has finished may also be judged by training time. That is, the model structure generator is used to search for a candidate model structure in the search space, the candidate model structure is trained on the training data until the training time reaches a preset training duration, and the classification threshold of the trained candidate model structure at the preset recall rate is then obtained.
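The two stopping criteria described above (a performance indicator reaching its convergence threshold, or the training time reaching a preset duration) can be sketched as a single loop. The callables `train_step` and `evaluate`, the function name `train_until_done`, and the injectable `clock` are assumptions for illustration only.

```python
import time


def train_until_done(train_step, evaluate, target_accuracy=0.9,
                     max_seconds=None, clock=time.monotonic):
    """Train until accuracy reaches target_accuracy or the time budget is spent.

    train_step() runs one unit of training; evaluate() returns the current
    classification accuracy. Returns (reason, number_of_steps).
    Note: with max_seconds=None the loop only ends on convergence.
    """
    start = clock()
    steps = 0
    while True:
        train_step()
        steps += 1
        if evaluate() >= target_accuracy:
            return "converged", steps
        if max_seconds is not None and clock() - start >= max_seconds:
            return "time_budget", steps
```

Passing `clock` as a parameter is a design choice that keeps the time-based criterion testable without waiting for real time to elapse.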
Step 2022: generate feedback information according to the difference between the classification thresholds of the trained candidate model structure and of the model structure to be replaced at the same preset recall rate, where the model structure generator is updated based on the feedback information before the next iterative operation is performed.
In this embodiment, feedback information is generated according to the difference between the classification thresholds of the trained candidate model structure and of the model structure to be replaced at the same preset recall rate, and the parameters of the model structure generator are updated based on this feedback information before the next iterative operation is performed. The difference between the two classification thresholds at the same preset recall rate may be the result of a mathematical calculation on the two thresholds, for example their difference or their quotient. Alternatively, a difference threshold may be preset, and a binary symbol may indicate whether the two classification thresholds differ, depending on whether the calculated result is below the difference threshold. The difference may also be any other machine-readable expression that can represent it. The model structure generator may be updated based on the feedback information; specifically, it may update its parameters or its structure sampling strategy. For example, when the model structure generator is a recurrent neural network, backpropagation may be performed based on the feedback information and the parameters of the model structure generator updated by gradient descent; when the model structure generator is a reinforcement learning algorithm, the feedback information serves as the feedback value (reward), and the model structure generator regenerates the state sequence based on that value.
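One possible way to turn the threshold difference into feedback information is sketched below, assuming the thresholds are stored as dictionaries keyed by recall rate. The choice of the mean absolute difference, the convention that a larger value is better, and the name `threshold_feedback` are assumptions for illustration; the disclosure also allows other measures such as a quotient.

```python
def threshold_feedback(candidate_thresholds, reference_thresholds, tolerance=None):
    """Feedback from the threshold gap at each shared preset recall rate.

    Both arguments map recall rate -> classification threshold.
    Without a tolerance, returns the negative mean absolute gap (0.0 is best).
    With a tolerance, returns a binary flag: 1.0 when the mean gap is below
    the tolerance (no meaningful difference), otherwise 0.0.
    """
    gaps = [
        abs(candidate_thresholds[r] - reference_thresholds[r])
        for r in reference_thresholds
    ]
    mean_gap = sum(gaps) / len(gaps)
    if tolerance is not None:
        return 1.0 if mean_gap < tolerance else 0.0
    return -mean_gap
```

Used as a reward, this value is largest when the candidate's thresholds coincide with those of the model structure to be replaced.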
Step 2023: in response to determining that the model structure generator has reached the preset convergence condition, determine the candidate model structure of the current iterative operation as the target model structure.
In this embodiment, when the model structure generator reaches the preset convergence condition, the iterative operation ends, and the candidate model structure found in the last round of the iterative operation is determined as the target model structure. The preset convergence condition may be that the number of iterations of the model structure generator reaches a preset number, that the iteration time of the model structure generator reaches a preset duration, or that the performance of the candidate model structure found by the model structure generator reaches the expected performance.
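Steps 2021 to 2023 can be tied together in a short loop. This is only a sketch under stated assumptions: `sample`, `train_and_get_thresholds`, and `update` are hypothetical callables standing in for the generator and the training pipeline, the reward is the negative mean threshold gap, and two of the possible convergence conditions (an iteration limit and a reward that is good enough) are shown.

```python
def search(sample, train_and_get_thresholds, update, reference_thresholds,
           max_iterations=20, good_enough=-0.01):
    """Iterate until a convergence condition holds; return the last candidate.

    sample() proposes a candidate structure, train_and_get_thresholds(c)
    returns {recall: threshold} for the trained candidate, and
    update(c, reward) adjusts the generator before the next iteration.
    """
    candidate = None
    for _ in range(max_iterations):
        candidate = sample()
        thresholds = train_and_get_thresholds(candidate)
        gaps = [abs(thresholds[r] - reference_thresholds[r])
                for r in reference_thresholds]
        reward = -sum(gaps) / len(gaps)
        update(candidate, reward)
        if reward >= good_enough:
            break  # candidate thresholds are close enough to the reference
    return candidate
```

The candidate of the final iteration is returned in either case, matching the statement that the candidate model structure of the current iterative operation becomes the target model structure once the convergence condition is reached.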
The method for searching for a model structure provided by this embodiment generates feedback information according to the difference between the classification thresholds of the candidate model structure and of the model structure to be replaced at the same recall rate, and iteratively updates the network parameters of the model structure generator according to the feedback information. In this way, the difference between the features that the model structure to be replaced and the candidate model structure extract from the objects to be classified is fed back to the model structure generator, so that the generator produces candidate model structures whose extracted features are consistent with those of the model structure to be replaced, and the threshold of the target model structure finally found at the preset recall rate is close to the threshold of the model structure to be replaced at the same recall rate. As a result, when the found target model structure replaces the model structure to be replaced, the classification threshold does not need to be redesigned or changed, which reduces the complexity of replacing an online model structure. In addition, because the feedback information is generated from the difference between the classification thresholds at the same recall rate and the network parameters of the model structure generator are iteratively updated according to it, the model structure to be replaced does not need to be retrained, which reduces the hardware computing resources and memory consumed by training and saves hardware device resources.
The method of this embodiment can determine a target model structure for building a neural network model that performs image processing tasks. Because image data is usually converted into matrix data when processed by a neural network model, a large number of matrix operations are involved, and training a target model structure for such a neural network model incurs substantial time and hardware costs. This embodiment generates feedback information from the difference between the classification thresholds of the candidate model structure and of the model structure to be replaced at the same recall rate, and iteratively updates the network parameters of the model structure generator according to the feedback information. This saves the time and hardware resources needed to retrain the target model structure, thereby improving the efficiency of building neural network models that perform image processing tasks.
With further reference to FIG. 3, a flow 300 of yet another embodiment of the method for searching for a model structure is shown. The flow 300 of the method for searching for a model structure includes the following steps:
Step 301: obtain the classification threshold of the model structure to be replaced at at least one preset recall rate, where the classification threshold includes the threshold value used by the model structure to be replaced to map features of data to be classified to a corresponding category.
Step 302: determine the search space of model structures, initialize the model structure generator, and search out the target model structure through multiple rounds of the iterative operation.
The iterative operation includes step 3021 and step 3022:
Step 3021: use the model structure generator to search out a candidate model structure in the search space, train the candidate model structure, and obtain the classification threshold of the trained candidate model structure at each preset recall rate.
Step 3022: determine performance information of the trained candidate model structure, and generate feedback information according to both the difference between the classification thresholds of the trained candidate model structure and of the model structure to be replaced at the same preset recall rate and the performance information of the trained candidate model structure, where the model structure generator is updated based on the feedback information before the next iterative operation is performed.
In this embodiment, the performance information of the trained candidate model structure is obtained first, together with the difference between the classification thresholds of the trained candidate model structure and the model structure to be replaced at the same preset recall rate; this difference may be the numerical difference between the two thresholds. Feedback information is then generated from that threshold difference and from the performance information of the trained candidate model structure, and the network parameters of the model structure generator are updated according to the feedback information before the next iterative operation is performed. Here, the feedback information is a performance indicator used to adjust the network parameters of a neural network; by iteratively adjusting those parameters, the performance of the neural network after the iterations end can meet expectations.
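The threshold comparison described above can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not state: a binary classifier that predicts the positive class when its score is at least the threshold, a threshold chosen as the largest value that still reaches the preset recall rate, and a mean absolute difference used to combine several recall rates. The function names `threshold_at_recall` and `threshold_gap` are hypothetical.

```python
import math
from typing import Mapping, Sequence


def threshold_at_recall(scores: Sequence[float], labels: Sequence[int], recall: float) -> float:
    """Largest threshold whose recall is at least `recall`.

    Assumption: a sample is predicted positive when score >= threshold.
    """
    if not 0.0 < recall <= 1.0:
        raise ValueError("recall must be in (0, 1]")
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one positive sample")
    # Number of positives that must score at or above the threshold.
    # The small epsilon guards against floating-point noise in recall * n.
    k = max(1, math.ceil(recall * len(positives) - 1e-9))
    return positives[k - 1]


def threshold_gap(candidate: Mapping[float, float], baseline: Mapping[float, float]) -> float:
    """Mean absolute threshold difference over the shared preset recall rates.

    Averaging is an assumption; the text only says the difference may be
    the numerical difference between the two thresholds.
    """
    shared = sorted(set(candidate) & set(baseline))
    if not shared:
        raise ValueError("no preset recall rate in common")
    return sum(abs(candidate[r] - baseline[r]) for r in shared) / len(shared)


# Toy validation data: four positive samples and two negative samples.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.2]
labels = [1, 1, 1, 1, 0, 0]
print(threshold_at_recall(scores, labels, 0.5))  # 0.8
print(threshold_at_recall(scores, labels, 1.0))  # 0.4
```

A small gap suggests that the candidate can reuse the thresholds already deployed for the model structure to be replaced, which is the property the feedback information is meant to reward.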
This embodiment generates the feedback information used to adjust the parameters of the neural network from two factors: the difference between the classification thresholds of the trained candidate model structure and the model structure to be replaced at the same preset recall rate, and the performance information of the trained candidate model structure. As a result, the target model structure found by the search can replace the model structure to be replaced in the classification task without any change to the classification strategy, and its performance can meet expectations, which improves the accuracy of the search.
In some optional implementations of this embodiment, the feedback information may also be generated as follows: generate a first feedback value from the difference between the classification thresholds of the trained candidate model structure and the model structure to be replaced at the same preset recall rate; generate a second feedback value from the performance information of the trained candidate model structure; and use the weighted sum of the first feedback value and the second feedback value as the feedback information.
In this embodiment, the first feedback value is generated from the difference between the classification thresholds of the trained candidate model structure and the model structure to be replaced at the same preset recall rate, the second feedback value is generated from the performance information of the trained candidate model structure, and the weighted sum of the two is then used as the feedback information. For example, if the first feedback value is 4 with a feedback weight of 0.9, and the second feedback value is 12 with a feedback weight of 0.1, the weighted sum 0.9 × 4 + 0.1 × 12 = 4.8 is used as the feedback information. By introducing weights that adjust how strongly the first and second feedback values influence the feedback information, this embodiment lets the final target model structure emphasize either performance or strategy similarity to the model structure to be replaced, so that the search result better matches user needs and the search becomes more accurate.
Optionally, in this embodiment, the weight of the first feedback value is smaller than the weight of the second feedback value. In that case, the search strategy of the model structure generator places more emphasis on finding a model structure with good performance, which can improve the performance of the target model structure that is finally found.
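The weighted sum above can be written out as a short Python sketch. The function name `combined_feedback` and the non-negativity check on the weights are assumptions made for illustration; the text only specifies that the feedback information is the weighted sum of the two feedback values.

```python
def combined_feedback(first_feedback: float, second_feedback: float,
                      first_weight: float, second_weight: float) -> float:
    """Weighted sum of the threshold-based and performance-based feedback values."""
    if first_weight < 0 or second_weight < 0:
        # Assumption: negative weights are treated as a configuration error.
        raise ValueError("feedback weights must be non-negative")
    return first_weight * first_feedback + second_weight * second_feedback


# Worked example from the text: 0.9 * 4 + 0.1 * 12, about 4.8 up to rounding.
print(combined_feedback(4, 12, first_weight=0.9, second_weight=0.1))

# Optional variant: a smaller first weight shifts the emphasis to performance.
print(combined_feedback(4, 12, first_weight=0.1, second_weight=0.9))
```

With the weights swapped, the same two feedback values give about 11.2, which shows how the weights steer the generator toward performance rather than threshold agreement.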
Step 3023: in response to determining that the model structure generator has reached a preset convergence condition, determine the candidate model structure in the current iterative operation as the target model structure.
Steps 301, 302, 3021, and 3023 of this embodiment are consistent with steps 201, 202, 2021, and 2023 of the foregoing embodiment, respectively. For the specific implementation of steps 301, 302, 3021, and 3023, refer to the description of the corresponding steps in the foregoing embodiment; the details are not repeated here.
In some optional implementations of the embodiments described above with reference to FIG. 2 and FIG. 3, the method for searching a model structure further includes: replacing the model structure to be replaced with the target model structure; and classifying the data to be classified using the target model structure.
The method for searching a model structure provided by this embodiment generates feedback information from the difference between the classification thresholds of the candidate model structure and the model structure to be replaced at the same recall rate, and iteratively updates the parameters of the model structure generator according to that feedback information, so that the performance of the target model structure finally found meets expectations and the accuracy of the model structure search is improved.
In some optional implementations of the above embodiments, the iterative operation may further include: in response to determining that the model structure generator has not reached the preset convergence condition, performing the next iterative operation based on the updated model structure generator. By performing the iterative operation multiple times, the model structure generator is optimized step by step, so that it produces a target model structure with a smaller feature loss relative to the model structure to be replaced, or finds a target model structure that both has a smaller feature loss relative to the model structure to be replaced and performs better, thereby achieving automatic optimization of the model structure.
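The iterative operation as a whole can be sketched as the toy Python loop below. Everything concrete in it is an assumption for illustration: the four structure names, the precomputed table that stands in for training and evaluating a candidate, the softmax preference table used as the model structure generator, the running-mean baseline, and the convergence condition (one structure reaching a sampling probability of 0.9, or a round limit). The text does not specify the generator or its update rule.

```python
import math
import random

# Hypothetical stand-in for "train the candidate and measure it": each structure
# maps to (classification threshold at one preset recall rate, accuracy).
EVALUATED = {
    "conv3-narrow": (0.42, 0.88),
    "conv3-wide": (0.61, 0.93),
    "conv5-narrow": (0.70, 0.95),
    "conv5-wide": (0.55, 0.94),
}
BASELINE_THRESHOLD = 0.60  # threshold of the model structure to be replaced


def feedback(structure, gap_weight=0.5, perf_weight=0.5):
    """Weighted sum of a threshold-agreement term and a performance term."""
    threshold, accuracy = EVALUATED[structure]
    first = 1.0 - abs(threshold - BASELINE_THRESHOLD)  # closer thresholds score higher
    second = accuracy
    return gap_weight * first + perf_weight * second


def search(max_rounds=500, lr=2.0, stop_prob=0.9, seed=0):
    """Run the iterative operation; return (target structure, rounds used)."""
    rng = random.Random(seed)
    names = list(EVALUATED)
    prefs = {name: 0.0 for name in names}  # generator parameters
    baseline = 0.0  # running mean of the feedback seen so far
    candidate = names[0]
    for round_index in range(1, max_rounds + 1):
        # Softmax over preferences, shifted by the maximum for numerical stability.
        peak = max(prefs.values())
        weights = [math.exp(prefs[n] - peak) for n in names]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Step 3021 (toy version): the generator proposes a candidate structure.
        candidate = rng.choices(names, weights=probs, k=1)[0]
        # Step 3022 (toy version): compute the feedback for that candidate.
        reward = feedback(candidate)
        baseline += (reward - baseline) / round_index
        # Step 3023 (toy version): stop once the generator has converged.
        if max(probs) >= stop_prob:
            return candidate, round_index
        # Otherwise update the generator before the next iterative operation.
        prefs[candidate] += lr * (reward - baseline)
    return candidate, max_rounds


target, rounds = search()
print(target, rounds)
```

Because the candidate is sampled, the structure returned at convergence is the one drawn in that round, mirroring the text's choice of the candidate in the current iterative operation; a practical system might instead return the structure with the highest recorded feedback.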
Referring further to FIG. 4, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for searching a model structure. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied in various electronic devices.
As shown in FIG. 4, the apparatus 400 for searching a model structure of this embodiment includes an acquisition unit 401 and a search unit 402, where the search unit 402 includes a calculation unit 4021, a generation unit 4022, and a determination unit 4023. The acquisition unit 401 is configured to obtain the classification threshold of the model structure to be replaced at at least one preset recall rate, where the classification threshold includes the threshold value that the model structure to be replaced uses to map features of the data to be classified to the corresponding category. The search unit 402 is configured to determine the search space of the model structure, initialize the model structure generator, and search for the target model structure through multiple rounds of an iterative operation. The calculation unit 4021 is configured to perform the following step of the iterative operation: use the model structure generator to search the search space for a candidate model structure, train the candidate model structure, and obtain the classification threshold of the trained candidate model structure at each preset recall rate. The generation unit 4022 is configured to perform the following step of the iterative operation: generate feedback information from the difference between the classification thresholds of the trained candidate model structure and the model structure to be replaced at the same preset recall rate, where the model structure generator is updated based on the feedback information before the next iterative operation is performed. The determination unit 4023 is configured to perform the following step of the iterative operation: in response to determining that the model structure generator has reached a preset convergence condition, determine the candidate model structure in the current iterative operation as the target model structure.
The apparatus for searching a model structure provided by this embodiment generates feedback information from the difference between the classification thresholds of the candidate model structure and the model structure to be replaced at the same recall rate, and iteratively updates the parameters of the model structure generator according to that feedback information, so that the performance of the target model structure finally found meets expectations and the accuracy of the model structure search is improved.
In some embodiments, the search unit 402 further includes a testing unit configured to perform the following step of the iterative operation: determine the performance information of the trained candidate model structure. The generation unit 4022 includes an information generation module configured to generate the feedback information from the difference between the classification thresholds of the trained candidate model structure and the model structure to be replaced at the same preset recall rate and from the performance information of the trained candidate model structure.
In some embodiments, the information generation module includes: a first generation module configured to generate a first feedback value from the difference between the classification thresholds of the trained candidate model structure and the model structure to be replaced at the same preset recall rate; a second generation module configured to generate a second feedback value from the performance information of the trained candidate model structure; and a fusion module configured to use the weighted sum of the first feedback value and the second feedback value as the feedback information.
In some embodiments, the weight of the first feedback value is smaller than the weight of the second feedback value.
In some embodiments, the apparatus for searching a model structure further includes: a replacement unit configured to replace the model structure to be replaced with the target model structure; and a processing unit configured to classify the data to be classified using the target model structure.
The units of the apparatus 400 correspond to the steps of the methods described with reference to FIG. 2 and FIG. 3. The operations, features, and achievable technical effects described above for the method for searching a model structure therefore apply equally to the apparatus 400 and the units it contains, and are not repeated here.
According to embodiments of the present application, the present application also provides an electronic device and a readable storage medium.
FIG. 5 is a block diagram of an electronic device for the method for searching a model structure according to an embodiment of the present application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present application described and/or claimed herein.
As shown in FIG. 5, the electronic device includes one or more processors 501, a memory 502, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected by different buses and may be mounted on a common motherboard or in other ways as needed. The processor can process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to an interface). In other implementations, multiple processors and/or multiple buses may be used together with multiple memories, if desired. Likewise, multiple electronic devices may be connected, with each device providing part of the necessary operations (for example, as a server array, a group of blade servers, or a multiprocessor system). FIG. 5 takes one processor 501 as an example.
The memory 502 is the non-transitory computer-readable storage medium provided by the present application. The memory stores instructions executable by at least one processor, so that the at least one processor performs the method for searching a model structure provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions that cause a computer to perform the method for searching a model structure provided by the present application.
As a non-transitory computer-readable storage medium, the memory 502 can store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for searching a model structure in the embodiments of the present application (for example, the acquisition unit 401, the search unit 402, the calculation unit 4021, the generation unit 4022, and the determination unit 4023 shown in FIG. 4). By running the non-transitory software programs, instructions, and modules stored in the memory 502, the processor 501 executes the various functional applications and data processing of the server, that is, implements the method for searching a model structure in the above method embodiments.
The memory 502 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created through use of the electronic device for searching a model structure. In addition, the memory 502 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 502 optionally includes memories located remotely from the processor 501, and these remote memories may be connected over a network to the electronic device for searching a model structure. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for the method for searching a model structure may further include an input device 503, an output device 504, and a bus 505. The processor 501, the memory 502, the input device 503, and the output device 504 may be connected by the bus 505 or in other ways; FIG. 5 takes connection by the bus 505 as an example.
The input device 503 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for searching a model structure. Examples of such input devices include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, and a joystick. The output device 504 may include a display device, an auxiliary lighting device (for example, an LED), a haptic feedback device (for example, a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described herein may be realized in digital electronic circuitry, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
These computer programs (also called programs, software, software applications, or code) include machine instructions for a programmable processor and may be implemented in high-level procedural and/or object-oriented programming languages, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, device, and/or apparatus (for example, a magnetic disk, an optical disc, a memory, or a programmable logic device (PLD)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide interaction with a user, the systems and techniques described herein may be implemented on a computer that has a display device for displaying information to the user (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) and a keyboard and pointing device (for example, a mouse or trackball) through which the user can provide input to the computer. Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual, auditory, or tactile feedback), and input from the user may be received in any form (including acoustic input, voice input, or tactile input).
The systems and techniques described herein may be implemented in a computing system that includes back-end components (for example, a data server), or a computing system that includes middleware components (for example, an application server), or a computing system that includes front-end components (for example, a user computer with a graphical user interface or web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.
A computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other.
It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution disclosed in the present application can be achieved; no limitation is imposed herein.
The above specific implementations do not limit the scope of protection of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.
| Application Number | Status | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN202010503202.3A | Active | 2020-06-05 | 2020-06-05 | Method and apparatus for searching model structure |

| Publication Number | Publication Date |
|---|---|
| CN111667056A | 2020-09-15 |
| CN111667056B | 2023-09-26 |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112100468A (en)* | 2020-09-25 | 2020-12-18 | 北京百度网讯科技有限公司 | Method and device for generating search space, electronic device and storage medium |
| CN112836801A (en)* | 2021-02-03 | 2021-05-25 | 上海商汤智能科技有限公司 | Deep learning network determination method and device, electronic equipment and storage medium |
| CN112989361B (en)* | 2021-04-14 | 2023-10-20 | 华南理工大学 | Model security detection method based on generation countermeasure network |
| CN113076903A (en)* | 2021-04-14 | 2021-07-06 | 上海云从企业发展有限公司 | Target behavior detection method and system, computer equipment and machine readable medium |
| CN113657468A (en)* | 2021-07-29 | 2021-11-16 | 北京百度网讯科技有限公司 | Pre-training model generation method and device, electronic equipment and storage medium |
| CN113806519A (en)* | 2021-09-24 | 2021-12-17 | 金蝶软件(中国)有限公司 | A search and recall method, device and medium |
| CN116341634B (en)* | 2022-11-18 | 2024-07-09 | 上海玄戒技术有限公司 | Training method and device for neural structure search model and electronic equipment |
| CN116188834B (en)* | 2022-12-08 | 2023-10-20 | 赛维森(广州)医疗科技服务有限公司 | Full-slice image classification method and device based on self-adaptive training model |
| CN117725979B (en)* | 2023-09-27 | 2024-09-20 | 行吟信息科技(上海)有限公司 | Model training method and device, electronic equipment and computer readable storage medium |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107463704A (en)* | 2017-08-16 | 2017-12-12 | 北京百度网讯科技有限公司 | Searching method and device based on artificial intelligence |
| CN108959552A (en)* | 2018-06-29 | 2018-12-07 | 北京百度网讯科技有限公司 | Recognition methods, device, equipment and the storage medium of question and answer class query statement |
| CN109816116A (en)* | 2019-01-17 | 2019-05-28 | 腾讯科技(深圳)有限公司 | The optimization method and device of hyper parameter in machine learning model |
| CN110543944A (en)* | 2019-09-11 | 2019-12-06 | 北京百度网讯科技有限公司 | Neural network structure search method, device, electronic device and medium |
| CN110674326A (en)* | 2019-08-06 | 2020-01-10 | 厦门大学 | Neural network structure retrieval method based on polynomial distribution learning |
| CN110766142A (en)* | 2019-10-30 | 2020-02-07 | 北京百度网讯科技有限公司 | Model generation method and device |
| CN110782015A (en)* | 2019-10-25 | 2020-02-11 | 腾讯科技(深圳)有限公司 | Training method, device and storage medium for network structure optimizer of neural network |
| CN110852321A (en)* | 2019-11-11 | 2020-02-28 | 北京百度网讯科技有限公司 | Candidate frame filtering method, device and electronic device |
| CN110909877A (en)* | 2019-11-29 | 2020-03-24 | 百度在线网络技术(北京)有限公司 | Neural network model structure searching method and device, electronic equipment and storage medium |
| CN110956260A (en)* | 2018-09-27 | 2020-04-03 | 瑞士电信公司 | System and method for neural architecture search |
| CN111160448A (en)* | 2019-12-26 | 2020-05-15 | 北京达佳互联信息技术有限公司 | An image classification model training method and device |
| CN111178546A (en)* | 2019-12-31 | 2020-05-19 | 华为技术有限公司 | Search method for machine learning model and related devices and equipment |
| WO2022063247A1 (en)* | 2020-09-28 | 2022-03-31 | 华为技术有限公司 | Neural architecture search method and apparatus |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190286984A1 (en)* | 2018-03-13 | 2019-09-19 | Google Llc | Neural architecture search by proxy |
| CA3061717A1 (en)* | 2018-11-16 | 2020-05-16 | Royal Bank Of Canada | System and method for a convolutional neural network for multi-label classification with partial annotations |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107463704A (en)* | 2017-08-16 | 2017-12-12 | 北京百度网讯科技有限公司 | Searching method and device based on artificial intelligence |
| CN108959552A (en)* | 2018-06-29 | 2018-12-07 | 北京百度网讯科技有限公司 | Recognition methods, device, equipment and the storage medium of question and answer class query statement |
| CN110956260A (en)* | 2018-09-27 | 2020-04-03 | 瑞士电信公司 | System and method for neural architecture search |
| CN109816116A (en)* | 2019-01-17 | 2019-05-28 | 腾讯科技(深圳)有限公司 | The optimization method and device of hyper parameter in machine learning model |
| CN110674326A (en)* | 2019-08-06 | 2020-01-10 | 厦门大学 | Neural network structure retrieval method based on polynomial distribution learning |
| CN110543944A (en)* | 2019-09-11 | 2019-12-06 | 北京百度网讯科技有限公司 | Neural network structure search method, device, electronic device and medium |
| CN110782015A (en)* | 2019-10-25 | 2020-02-11 | 腾讯科技(深圳)有限公司 | Training method, device and storage medium for network structure optimizer of neural network |
| CN110766142A (en)* | 2019-10-30 | 2020-02-07 | 北京百度网讯科技有限公司 | Model generation method and device |
| CN110852321A (en)* | 2019-11-11 | 2020-02-28 | 北京百度网讯科技有限公司 | Candidate frame filtering method, device and electronic device |
| CN110909877A (en)* | 2019-11-29 | 2020-03-24 | 百度在线网络技术(北京)有限公司 | Neural network model structure searching method and device, electronic equipment and storage medium |
| CN111160448A (en)* | 2019-12-26 | 2020-05-15 | 北京达佳互联信息技术有限公司 | An image classification model training method and device |
| CN111178546A (en)* | 2019-12-31 | 2020-05-19 | 华为技术有限公司 | Search method for machine learning model and related devices and equipment |
| WO2022063247A1 (en)* | 2020-09-28 | 2022-03-31 | 华为技术有限公司 | Neural architecture search method and apparatus |

| Publication number | Publication date |
|---|---|
| CN111667056A (en) | 2020-09-15 |

| Publication | Publication Date | Title |
|---|---|---|
| CN111667056B (en) | | Method and apparatus for searching model structure |
| CN112560912B (en) | | Classification model training methods, devices, electronic equipment and storage media |
| CN111639710B (en) | | Image recognition model training method, device, equipment and storage medium |
| CN111667057B (en) | | Method and apparatus for searching model structures |
| CN111708876B (en) | | Method and apparatus for generating information |
| CN112036509A (en) | | Method and apparatus for training image recognition models |
| CN111639753B (en) | | Methods, devices, equipment and storage media for training image processing supernetworks |
| CN111582479B (en) | | Distillation method and device for neural network model |
| CN111753914A (en) | | Model optimization method and device, electronic device and storage medium |
| CN111831813B (en) | | Dialog generation method, dialog generation device, electronic equipment and medium |
| CN111209977A (en) | | Methods, apparatus, equipment and media for training and using classification models |
| CN111563593B (en) | | Training method and device for neural network model |
| CN111737954B (en) | | Text similarity determination method, device, equipment and medium |
| CN111680517A (en) | | Method, apparatus, device and storage medium for training a model |
| CN112000763B (en) | | Method, device, equipment and medium for determining competition relationship of interest points |
| CN111859953B (en) | | Training data mining method and device, electronic equipment and storage medium |
| CN111522944B (en) | | Method, apparatus, device and storage medium for outputting information |
| CN113053367A (en) | | Speech recognition method, model training method and device for speech recognition |
| CN111539209B (en) | | Method and apparatus for entity classification |
| CN110675954A (en) | | Information processing method and device, electronic equipment and storage medium |
| CN115035890B (en) | | Speech recognition model training methods, devices, electronic equipment and storage media |
| CN111241225B (en) | | Method, device, equipment and storage medium for judging change of resident area |
| CN114386503A (en) | | Method and apparatus for training a model |
| CN111679829A (en) | | Determination method and device for user interface design |
| CN112232089B (en) | | Pre-training method, device and storage medium for semantic representation model |

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |