CN116523051B - A model mixed precision reasoning method, device, equipment and storage medium - Google Patents

A model mixed precision reasoning method, device, equipment and storage medium

Info

Publication number
CN116523051B
Authority
CN
China
Prior art keywords
precision
segment
model
computing
computing node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310524663.2A
Other languages
Chinese (zh)
Other versions
CN116523051A (en)
Inventor
田宏泽
程伟
孙清阁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Suiyuan Intelligent Technology Co ltd
Original Assignee
Beijing Suiyuan Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Suiyuan Intelligent Technology Co ltd
Priority to CN202310524663.2A
Publication of CN116523051A
Application granted
Publication of CN116523051B
Active
Anticipated expiration

Abstract

Translated from Chinese


The present invention discloses a model mixed precision reasoning method, device, equipment and storage medium, including: inputting input samples into a deep learning model in a chip, calculating the input samples through the computing nodes in the chip, and obtaining a target result of type float32; obtaining a segmentation list of the model, and adjusting the precision selection parameters of each segment according to the mixed precision results and target results of the model under preset precision selection parameters for each segment; inputting the target precision selection parameters of each computing node in each segment as a control signal into the control node, selecting a matching precision calculation branch through the control node in the chip, and completing mixed precision reasoning through the computing node according to the precision calculation branch. The technical solution of the embodiment of the present invention can effectively obtain a mixed precision reasoning scheme that meets the model precision requirements and improve the mixed precision reasoning efficiency of the model.

Description

Model mixed-precision reasoning method, device, equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for model mixed-precision reasoning.
Background
Deep learning mixed-precision reasoning accelerates deep neural network inference by mixing the float16 and float32 data types, reducing memory usage and memory access so that larger neural networks can be run.
For mixed-precision reasoning, the computational frameworks of existing chips (such as TensorFlow or PyTorch) generally offer two modes: in the first, the user specifies the reasoning precision (such as float32 or float16) used by each compute node in the model; in the second, precision is selected according to a black/white list defined by the framework.
The first mode requires the user to have strong theoretical knowledge of model computation, and the second mode does not necessarily find a mixed-precision reasoning scheme that meets the precision requirement. Moreover, existing inference frameworks (such as TensorRT) must search for an effective mixed-precision scheme iteratively when constructing one, and each iteration requires the chip to compile the model, so the compile time grows in proportion to the number of iterations.
Disclosure of Invention
The invention provides a method, a device, equipment and a storage medium for model mixed-precision reasoning, which can effectively acquire a mixed-precision reasoning scheme meeting the model precision requirement, improve the mixed-precision reasoning efficiency of a model and save the computing resources of a chip in the process of model mixed-precision reasoning.
According to an aspect of the present invention, there is provided a model mixed-precision reasoning method, including:
Inputting an input sample into a deep learning model in a chip, and calculating the input sample through a plurality of calculation nodes corresponding to the deep learning model in the chip to obtain a float32 type target result;
Obtaining a segmentation list corresponding to the model, and adjusting precision selection parameters corresponding to the segments according to a precision mixing result of the model under preset precision selection parameters and the target result, wherein each segment comprises at least one computing node;
The method comprises the steps that target precision selection parameters corresponding to all computing nodes in each segment are input into control nodes corresponding to all computing nodes as control signals, precision computing branches matched with the computing nodes are selected through the control nodes in the chip according to the control signals, and mixed precision reasoning is completed through the computing nodes according to the precision computing branches;
Wherein each computing node corresponds to a float32 precision computing branch and a float16 precision computing branch in advance.
Optionally, before inputting the input sample into the on-chip deep learning model, the method further comprises:
ordering a plurality of topologies included in the model;
And adding float16 precision calculation branches to the calculation nodes corresponding to each topological structure according to the topological sequencing result.
Optionally, after obtaining the segment list corresponding to the model, the method further includes:
Presetting precision selection parameters corresponding to the segments according to the segment types corresponding to the segments;
And inputting the input sample into the deep learning model, and processing the input sample through the model according to the preset precision selection parameters corresponding to each segment to obtain a mixed-precision result.
Optionally, the precision selection parameter corresponding to each segment is preset according to the segment type corresponding to each segment, including:
acquiring the longest segment, the known data type segment and the unknown data type segment from the segment list;
Setting the precision selection parameter corresponding to the longest segment as false;
Setting a precision selection parameter corresponding to the known data type segment as true or false according to the target data type corresponding to the known data type segment;
and setting the precision selection parameter corresponding to the unknown data type segment as true.
Optionally, according to the mixing result of the model under the preset precision selection parameters for each segment and the target result, adjusting the precision selection parameters corresponding to each segment, including:
constructing an evaluation standard according to the mixing result and the target result;
judging whether the mixing result is qualified or not according to the evaluation standard;
If yes, setting the precision selection parameter corresponding to each computing node in the longest segment as false, removing the longest segment in a segment list, and then returning to execute the operation of acquiring the longest segment in the segment list.
Optionally, after judging whether the mixed-precision result is qualified according to the evaluation standard, the method further includes:
If not, judging whether the longest segment has a fission condition or not;
If yes, the longest segment is split into a first segment and a second segment, the first segment and the second segment are added into a segment list, and then the operation of acquiring the longest segment in the segment list is carried out in a returning mode.
Optionally, before the target precision selection parameter corresponding to each computing node in each segment is input as the control signal to the control node corresponding to each computing node, the method further includes:
Judging whether the segmentation list is empty or not;
If yes, acquiring a current precision selection parameter corresponding to each computing node in each segment, and taking the current precision selection parameter as a target precision selection parameter.
According to another aspect of the present invention, there is provided a model mixed-precision reasoning apparatus, including:
The target result generation module is used for inputting an input sample into a deep learning model in a chip, and calculating the input sample through a plurality of calculation nodes corresponding to the deep learning model in the chip to obtain a target result of a float32 type, wherein the initial precision selection parameter of each calculation node is true;
the parameter adjustment module is used for acquiring a segmentation list corresponding to the model, and adjusting the precision selection parameters corresponding to the segments according to the precision mixing result of the model under the preset precision selection parameters and the target result, wherein each segment comprises at least one calculation node;
the branch selection module is used for inputting target precision selection parameters corresponding to each calculation node in each segment as control signals into control nodes corresponding to each calculation node, selecting precision calculation branches matched with the calculation nodes according to the control signals through the control nodes in the chip, and completing mixed precision reasoning according to the precision calculation branches through the calculation nodes;
Wherein each computing node corresponds to a float32 precision computing branch and a float16 precision computing branch in advance.
According to another aspect of the present invention, there is provided an electronic device, the device comprising:
At least one processor, and
A memory communicatively coupled to the at least one processor, wherein,
The memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the model mixed-precision reasoning method described in any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions that, when executed, cause a processor to implement the model mixed-precision reasoning method according to any embodiment of the present invention.
According to the technical solution provided by the embodiments of the present invention, an input sample is fed into a deep learning model in a chip and computed by the model's compute nodes to obtain a float32-type target result. The segment list corresponding to the model is then obtained, and the precision selection parameters of each segment are adjusted according to the target result and the mixed-precision results produced under preset precision selection parameters. Finally, the target precision selection parameter of each compute node in each segment is input as a control signal to the corresponding control node, which selects the precision computing branch matched with the compute node, and the compute nodes complete mixed-precision reasoning along the selected branches. In this way, a mixed-precision reasoning scheme that meets the model's precision requirement can be obtained effectively, the model's mixed-precision reasoning efficiency is improved, and the chip's computing resources are saved during mixed-precision reasoning.
The technical solution provided by the embodiments of the present invention can be applied to fields such as text detection and image recognition. When a deep learning model reasons under fp16 precision and overflow or underflow causes reasoning errors, the overflowing part can be corrected automatically by the scheme of this embodiment, with the overflowing nodes reasoning at fp32 precision instead.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a model mixed-precision reasoning method provided according to an embodiment of the present invention;
FIG. 2 is a flowchart of another model mixed-precision reasoning method provided according to an embodiment of the present invention;
FIG. 3 is a flowchart of another model mixed-precision reasoning method provided according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a model mixed-precision reasoning apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of an electronic device for implementing the model mixed-precision reasoning method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Fig. 1 is a flowchart of a model mixed-precision reasoning method provided in an embodiment of the present invention. This embodiment is suitable for performing mixed-precision reasoning on a deep learning model. The method may be performed by a model mixed-precision reasoning device, which may be implemented in hardware and/or software and configured in an electronic apparatus. As shown in Fig. 1, the method includes:
Step 110, inputting the input sample into a deep learning model in the chip, and calculating the input sample through a plurality of calculation nodes corresponding to the deep learning model in the chip to obtain a float32 type target result.
In this embodiment, the input sample may be a learning sample corresponding to a deep learning model selected by a user. After the input sample is obtained, the input sample can be input into a model, and the input sample is calculated according to initial precision selection parameters through each calculation node in the chip, so that a float32 type target result is obtained. Specifically, the precision selection parameter is used for representing the calculation precision adopted by the calculation node when processing data.
In a particular embodiment, each compute node pre-corresponds to a float32 precision compute branch and a float16 precision compute branch. The initial precision selection parameter for each compute node may be true, i.e., in this step, each compute node may compute the input samples using float32 precision computation branches.
Step 120, obtaining a segment list corresponding to the model, and adjusting the precision selection parameters corresponding to the segments according to the precision mixing result of the model under the preset precision selection parameters and the target result.
In this embodiment, the plurality of computing nodes in the model may form a plurality of segments, the plurality of segments forming the segment list. Wherein each segment may include at least one compute node therein.
In this step, after the segment list corresponding to the model is obtained, the precision selection parameters of the compute nodes in each segment may be preset (for example, set to true or false); the compute nodes then use the float16 and float32 precision computing branches in a mixed manner under the preset parameters to process the input sample and obtain a mixed-precision result.
In a specific embodiment, optionally, after the mixing result is obtained, whether the mixing result meets a preset standard may be determined according to the target result, if not, the precision selection parameters corresponding to each segment are adjusted, and a new mixing result is obtained again until the mixing result meets the preset standard.
And 130, inputting target precision selection parameters corresponding to the calculation nodes in each segment as control signals into the control nodes corresponding to the calculation nodes, selecting precision calculation branches matched with the calculation nodes according to the control signals through the control nodes in the chip, and completing mixed precision reasoning through the precision calculation branches.
In this embodiment, the adjusted precision selection parameter of each segment may be used as the target precision selection parameter corresponding to the computing node in each segment. After the target precision selection parameter corresponding to the computing node is obtained, the target precision selection parameter can be used as a control signal to be input into a control node corresponding to the computing node.
In a specific embodiment, if the target precision selection parameter is "true", the control node may select a float32 precision calculation branch matched with the calculation node and process data through the calculation branch, whereas if the target precision selection parameter is "false", the control node may select a float16 precision calculation branch matched with the calculation node and process data through the calculation branch, thereby completing the mixed precision reasoning.
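This dispatch can be sketched as follows — a minimal NumPy stand-in for the on-chip control node, not the actual hardware implementation; the element-wise squaring op is a placeholder for a real compute node:

```python
import numpy as np

def control_node(x, precision_flag):
    """Dispatch a compute node's input to the matching precision branch.

    precision_flag is the node's precision selection parameter:
    True -> float32 branch, False -> float16 branch.
    The float16 branch converts in, computes, and converts back,
    mirroring the precision-conversion nodes described in the patent.
    """
    if precision_flag:
        return np.square(x.astype(np.float32))                  # float32 branch
    return np.square(x.astype(np.float16)).astype(np.float32)   # float16 branch

x = np.array([1.5, 2.0, 3.0])
hi = control_node(x, True)    # computed at float32 precision
lo = control_node(x, False)   # computed at float16 precision
```

Both branches return float32, so downstream nodes see a uniform dtype; only the internal computation precision differs.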
In this embodiment, the precision computing branches matched with all compute nodes in the model are determined before mixed-precision reasoning begins, so the model needs to be compiled only once during the process, avoiding the time cost of the chip compiling the model many times. Furthermore, by adjusting the precision selection parameters of each segment according to the target result and the mixed-precision result, a mixed-precision reasoning scheme that meets the model's precision requirement can be obtained effectively, without requiring the user to have strong theoretical knowledge of model computation.
In this method, an input sample is fed into a deep learning model in a chip and computed by the model's compute nodes to obtain a float32-type target result. The segment list corresponding to the model is obtained, and the precision selection parameters of each segment are adjusted according to the target result and the mixed-precision results produced under preset precision selection parameters. The target precision selection parameter of each compute node in each segment is then input as a control signal to the corresponding control node, which selects the precision computing branch matched with the compute node, and the compute nodes complete mixed-precision reasoning along the selected branches.
Fig. 2 is a flowchart of a model mixed-precision reasoning method provided in the second embodiment of the present invention; this embodiment further refines the foregoing embodiment. As shown in Fig. 2, the method includes:
Step 210, sorting a plurality of topological structures included in the model, and adding float16 precision computing branches to the computing nodes corresponding to the topological structures according to the topological sorting result.
In this step, a plurality of topologies included in the model may be acquired first, then the plurality of topologies may be ordered, and a plurality of computing nodes corresponding to each topology may be acquired in turn according to the topology ordering result. Each computing node corresponds to a float32 precision node in advance, and the computing node and the float32 precision node form a float32 precision computing branch.
After multiple compute nodes are acquired, one precision conversion node (i.e., f32/f16 data type conversion node) may be added to each compute node, thereby forming a float16 precision computation branch of the compute node.
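The construction described above can be sketched as a graph pass that pairs each op with a float16 branch built from f32/f16 conversion nodes. The representation below is a hypothetical stand-in for the chip's actual graph IR; the function names are illustrative:

```python
import numpy as np

def make_branches(op):
    """Return (float32_branch, float16_branch) for one compute op.

    The float16 branch wraps the op between an f32->f16 conversion
    node and an f16->f32 conversion node, as described in the patent.
    """
    def f32_branch(x):
        return op(x.astype(np.float32))
    def f16_branch(x):
        return op(x.astype(np.float16)).astype(np.float32)
    return f32_branch, f16_branch

def build_dual_precision_graph(topo_sorted_ops):
    """Attach both precision branches to every node, in topological order."""
    return [make_branches(op) for op in topo_sorted_ops]

# Toy two-node "model" using NumPy ufuncs as placeholder compute nodes.
graph = build_dual_precision_graph([np.exp, np.tanh])
f32_b, f16_b = graph[0]
out = f16_b(np.array([0.0, 1.0]))
```

Because the conversion nodes are added once per compute node before any search begins, the search over precision assignments never requires rebuilding or recompiling the graph.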
In this embodiment, each compute node in the chip may correspond to a control node. The control node can be implemented by fusing the node's high-precision and low-precision computing branches into a single operator, avoiding waste of the chip's computing power and storage resources.
Step 220, inputting an input sample into a deep learning model in a chip, and calculating the input sample through a plurality of calculation nodes corresponding to the deep learning model in the chip to obtain a float32 type target result, wherein the initial precision selection parameter of each calculation node is true.
Step 230, obtaining a segment list corresponding to the model, and presetting precision selection parameters corresponding to the segments according to the segment types corresponding to the segments.
In this step, optionally, different precision selection parameters may be preset for each segment according to different segment types, so that each computing node uses float16 and float32 precision computing branches in a mixed manner under the preset precision selection parameters, and processes an input sample to obtain a mixed precision result.
Step 240, inputting the input sample to a deep learning model, and processing the input sample to obtain a mixed precision result through the model according to preset precision selection parameters corresponding to each segment.
Step 250, adjusting the precision selection parameters corresponding to the segments according to the precision mixing result of the segments under the preset precision selection parameters and the target result.
And 260, inputting target precision selection parameters corresponding to each calculation node in each segment as control signals into control nodes corresponding to each calculation node, selecting precision calculation branches matched with the calculation nodes according to the control signals through control nodes in a chip, and completing mixed precision reasoning according to the precision calculation branches through the calculation nodes.
According to the technical solution provided by this embodiment, the topological structures included in the model are sorted, and a float16 precision computing branch is added to the compute nodes of each topology according to the topological sorting result. A preset input sample is fed into the model and computed by the compute nodes corresponding to the deep learning model in the chip to obtain a float32-type target result. The segment list of the model is obtained, the precision selection parameters of each segment are preset according to the segment types, and the input sample is processed by the model under the preset parameters to obtain a mixed-precision result. The precision selection parameters of each segment are then adjusted according to the mixed-precision result and the target result, the target precision selection parameters of the compute nodes are input as control signals to the control nodes, and the control nodes in the chip select the precision computing branches along which mixed-precision reasoning is completed. In this way, a mixed-precision reasoning scheme that meets the model's precision requirement can be obtained effectively, the model's mixed-precision reasoning efficiency is improved, and the chip's computing resources are saved during mixed-precision reasoning.
Fig. 3 is a flowchart of another model mixed-precision reasoning method according to the third embodiment of the present invention, which further refines the foregoing embodiment. As shown in Fig. 3, the method includes:
Step 310, inputting an input sample into a deep learning model in a chip, and calculating the input sample through a plurality of calculation nodes corresponding to the deep learning model in the chip to obtain a float32 type target result, wherein the initial precision selection parameter of each calculation node is true.
Step 320, obtaining a segment list corresponding to the model, and obtaining the longest segment, the known data type segment and the unknown data type segment in the segment list.
In this embodiment, the known data type segment may be a segment that has been determined to use a particular calculation accuracy. The unknown data type segments may be segments for which the calculation accuracy is unknown and for which an accuracy selection parameter adjustment is to be made.
Step 330, setting the precision selection parameter corresponding to the longest segment as false, setting the precision selection parameter corresponding to the known data type segment as true or false according to the target data type corresponding to the known data type segment, and setting the precision selection parameter corresponding to the unknown data type segment as true.
In this step, the precision selection parameter of the longest segment may be set to false, that is, each computing node in the longest segment uses float16 precision computing branches for computation.
If the float16 precision is determined to be used in the known data type segment, the corresponding precision selection parameter may be set to false, whereas if the float32 precision is determined to be used, the corresponding precision selection parameter may be set to true.
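Steps 320 and 330 can be sketched as follows; the segment representation and field names are assumptions made for illustration, not the patent's data structures:

```python
def preset_parameters(segments):
    """Preset each segment's precision selection parameter by segment type.

    segments: dict mapping segment id -> {'nodes': [...],
              'type': 'known' | 'unknown', 'target_dtype': ...}
    True selects the float32 branch, False the float16 branch.
    """
    # The longest segment is the one with the most compute nodes.
    longest = max(segments, key=lambda s: len(segments[s]['nodes']))
    params = {}
    for sid, seg in segments.items():
        if sid == longest:
            params[sid] = False                         # longest: try float16
        elif seg['type'] == 'known':
            params[sid] = seg['target_dtype'] == 'float32'
        else:
            params[sid] = True                          # unknown: keep float32
    return params

segs = {
    'a': {'nodes': [1, 2, 3], 'type': 'unknown', 'target_dtype': None},
    'b': {'nodes': [4], 'type': 'known', 'target_dtype': 'float16'},
    'c': {'nodes': [5], 'type': 'known', 'target_dtype': 'float32'},
}
params = preset_parameters(segs)
```

Trying float16 on the longest segment first targets the largest potential speedup, while unknown segments stay at float32 until the search proves float16 safe for them.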
Step 340, inputting the input sample into the deep learning model, and processing the input sample through the model according to the preset precision selection parameters corresponding to each segment to obtain a mixed-precision result.
Step 350, constructing an evaluation standard according to the mixed-precision result and the target result, and judging whether the mixed-precision result is qualified according to the evaluation standard; if so, executing step 360, and if not, executing step 370.
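The patent does not fix a particular evaluation standard. One plausible choice, sketched here purely as an assumption, is to bound the largest relative error of the mixed-precision result against the float32 target result:

```python
import numpy as np

def mixed_result_qualified(mixed, target, rel_tol=1e-2):
    """Illustrative evaluation standard (the patent does not specify one):
    the mixed-precision result qualifies if its largest element-wise
    relative error against the float32 target result stays below rel_tol.
    The small constant guards against division by zero."""
    err = np.max(np.abs(mixed - target) / (np.abs(target) + 1e-12))
    return bool(err < rel_tol)
```

Other metrics (mean error, cosine similarity of logits, task accuracy on a validation set) would slot into the same qualified/not-qualified role in the search loop.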
Step 360, setting the precision selection parameter of each compute node in the longest segment as false, removing the longest segment from the segment list, and returning to the operation of acquiring the longest segment from the segment list in step 320.
In this step, if the above-mentioned mixing result is qualified, the longest segment may be removed, and then the longest segment is re-acquired in the segment list, so as to obtain a new mixing result.
Step 370, if the longest segment can be split, splitting it into a first segment and a second segment, adding the first segment and the second segment to the segment list, and then returning to the operation of acquiring the longest segment in the segment list in step 320.
In one implementation of this embodiment, the longest segment may be considered splittable if its segment length (i.e., the number of compute nodes in the segment) is greater than 1. In this case, the longest segment may be split in half by bisection into a first segment and a second segment, which are then added to the segment list; the longest segment is acquired again from the segment list to obtain a new mixed-precision result, and this repeats until the result is qualified.
In one implementation of this embodiment, if the segment length of the longest segment is equal to 1, then the longest segment may be considered to be free of fission conditions. In this case, the precision selection parameter of the longest segment may be set to true, and the longest segment may be removed in the segment list.
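Putting steps 320–370 together, the greedy search with bisection splitting can be sketched as below. The `evaluate` callback stands in for running the model under the current parameters and applying the evaluation standard; all names are illustrative:

```python
def search_mixed_precision(segment_list, params, evaluate):
    """Greedy mixed-precision search over the segment list (sketch).

    segment_list: list of segments, each a list of compute-node ids.
    params: dict node_id -> bool (True = float32 branch, False = float16).
    evaluate(params) -> True if the mixed-precision result qualifies.
    """
    while segment_list:
        longest = max(segment_list, key=len)   # acquire the longest segment
        for n in longest:
            params[n] = False                  # try float16 on this segment
        segment_list.remove(longest)
        if not evaluate(params):
            for n in longest:
                params[n] = True               # revert the failed attempt
            if len(longest) > 1:               # splittable: bisect and retry
                mid = len(longest) // 2
                segment_list += [longest[:mid], longest[mid:]]
            # a single unqualified node stays at float32 and is dropped
    return params

# Toy example: node 3 only qualifies at float32 precision.
result = search_mixed_precision(
    [[1, 2, 3, 4]],
    {1: True, 2: True, 3: True, 4: True},
    lambda p: p[3],
)
```

Because every failed attempt is reverted before splitting, only segments that actually pass the evaluation standard are kept at float16; in the toy example the search isolates node 3 and leaves it at float32 while nodes 1, 2, and 4 drop to float16.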
Step 380, if the segment list is empty, inputting the target precision selection parameter of each compute node in each segment as a control signal to the corresponding control node; the control node selects the precision computing branch matched with the compute node according to the control signal, and mixed-precision reasoning is completed along the selected branches.
In this embodiment, if each segment is processed, it may be determined whether the segment list is empty, and if yes, the current precision selection parameter corresponding to each computing node in each segment may be obtained, and the current precision selection parameter is used as the target precision selection parameter.
According to the technical solution provided by this embodiment, an input sample is fed into the model to obtain a float32-type target result. The precision selection parameter of the longest segment is set to false, that of known-data-type segments is set to true or false according to their target data type, and that of unknown-data-type segments is set to true; the input sample is then processed by the model under these preset parameters to obtain a mixed-precision result. If the result is qualified, the precision selection parameter of each compute node in the longest segment is fixed as false, the longest segment is removed, and the operation of acquiring the longest segment is repeated. If the result is not qualified, it is judged whether the longest segment can be split; if so, it is split into a first segment and a second segment, which are added to the segment list before the operation of acquiring the longest segment is repeated. Once the segment list is empty, the target precision selection parameter of each compute node is input as a control signal to its control node, which selects the precision computing branch along which mixed-precision reasoning is completed. This effectively obtains a mixed-precision reasoning scheme that meets the model's precision requirement and improves the model's mixed-precision reasoning efficiency.
Fig. 4 is a schematic structural diagram of a model mixed precision reasoning device according to a fourth embodiment of the present invention, where the device is applied to an electronic apparatus. As shown in fig. 4, the apparatus includes a target result generation module 410, a parameter adjustment module 420, and a branch selection module 430.
The target result generation module 410 is configured to input an input sample into a deep learning model in a chip, and calculate the input sample through a plurality of calculation nodes corresponding to the deep learning model in the chip to obtain a target result of float32 type;
The parameter adjustment module 420 is configured to obtain a segment list corresponding to the model, and adjust the precision selection parameter corresponding to each segment according to the precision mixing result of each segment under the preset precision selection parameter of the model and the target result, where each segment includes at least one computing node;
the branch selection module 430 is configured to input, as a control signal, a target precision selection parameter corresponding to each computing node in each segment to a control node corresponding to each computing node, select, by using a control node in the chip, a precision computing branch matched with the computing node according to the control signal, and complete mixed precision reasoning according to the precision computing branch by using the computing node;
Wherein each computing node corresponds to a float32 precision computing branch and a float16 precision computing branch in advance.
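As a rough illustration of this dual-branch arrangement, the following sketch pairs each computing node with a float32 branch and a float16 branch and lets a control node route between them. The class names, the numpy-based kernels, and the matmul payload are assumptions for demonstration, not the chip-level implementation in the patent.

```python
# Illustrative sketch (not the chip implementation): each computing node holds
# a float32 branch and a float16 branch, and a control node routes the input
# to one of them according to the control signal.

import numpy as np


class ComputeNode:
    def __init__(self, weight):
        self.w32 = np.asarray(weight, dtype=np.float32)  # float32 branch weights
        self.w16 = self.w32.astype(np.float16)           # float16 branch weights

    def branch_fp32(self, x):
        # full-precision computing branch
        return np.asarray(x, dtype=np.float32) @ self.w32

    def branch_fp16(self, x):
        # half-precision computing branch
        return np.asarray(x, dtype=np.float16) @ self.w16


class ControlNode:
    """Selects the precision computing branch matching the control signal."""

    def __init__(self, node: ComputeNode):
        self.node = node

    def __call__(self, x, select_fp32: bool):
        # precision selection parameter true -> float32, false -> float16
        return self.node.branch_fp32(x) if select_fp32 else self.node.branch_fp16(x)
```

With an identity weight matrix, the float16 branch returns the same values as the float32 branch but with dtype `float16`; it is this lower-precision output that the mixed precision search compares against the float32 target result.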
According to the technical scheme provided by this embodiment of the invention, the input samples are input into the deep learning model in the chip and computed by a plurality of computing nodes corresponding to the deep learning model to obtain a target result of float32 type. The segment list corresponding to the model is obtained, and the precision selection parameters corresponding to the segments are adjusted according to the mixed precision result of each segment under the preset precision selection parameters and the target result. The target precision selection parameters corresponding to the computing nodes in each segment are then input as control signals into the corresponding control nodes; each control node in the chip selects the precision computing branch matched with its computing node according to the control signal, and the computing nodes complete mixed precision reasoning according to the selected branches. In this way, a mixed precision reasoning scheme satisfying the precision requirement of the model can be obtained effectively, the mixed precision reasoning efficiency of the model is improved, and the computing resources of the chip during model mixed precision reasoning are saved.
On the basis of the above embodiment, the apparatus further includes:
the topology ordering module is used for ordering a plurality of topological structures included in the model;
And the branch adding module is used for adding float16 precision calculation branches to the calculation nodes in each topological structure according to the topological sequencing result.
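The two modules above can be illustrated with Python's standard `graphlib`. The dict-based graph encoding (node mapped to its predecessors) and the tuple representation of a branch are assumptions made for demonstration only.

```python
# Sketch: order the model's structures topologically, then give each node a
# float16 branch alongside its existing float32 one. Node names and the
# (op, precision) tuple encoding of a branch are illustrative assumptions.

from graphlib import TopologicalSorter


def add_float16_branches(graph, fp32_branches):
    """graph maps each node to its predecessors; fp32_branches maps each node
    to its existing float32 kernel. Nodes are visited in topological order and
    each receives a second, float16 entry."""
    order = list(TopologicalSorter(graph).static_order())
    node_branches = {}
    for node in order:
        fp32 = fp32_branches[node]
        fp16 = (fp32[0], "float16")  # same op, lower-precision variant
        node_branches[node] = {"float32": fp32, "float16": fp16}
    return order, node_branches
```

Visiting nodes in topological order guarantees that, when a branch is added to a node, all of its upstream producers have already been processed.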
The parameter adjustment module 420 includes:
the parameter presetting unit is used for presetting the precision selection parameters corresponding to the segments according to the segment types corresponding to the segments;
The sample input unit is used for inputting the input samples into the deep learning model, and processing the input samples through the model according to the preset precision selection parameters corresponding to each segment to obtain a mixed precision result;
A segment obtaining unit, configured to obtain a longest segment, a known data type segment, and an unknown data type segment in the segment list;
A segment parameter setting unit, configured to set the precision selection parameter corresponding to the longest segment to false, set the precision selection parameter corresponding to each known data type segment to true or false according to its target data type, and set the precision selection parameter corresponding to each unknown data type segment to true;
the evaluation standard construction unit is used for constructing an evaluation standard according to the mixed precision result and the target result;
the mixed precision result judging unit is used for judging whether the mixed precision result is qualified according to the evaluation standard;
The segment removing unit is used for setting the precision selection parameter corresponding to each computing node in the longest segment to false when the mixed precision result is qualified, removing the longest segment from the segment list, and then returning to execute the operation of acquiring the longest segment from the segment list;
a segment judging unit, configured to judge whether the longest segment meets a fission condition;
The fission unit is used for splitting the longest segment into a first segment and a second segment when the longest segment meets the fission condition, adding the first segment and the second segment to the segment list, and then returning to execute the operation of acquiring the longest segment from the segment list;
And the segmentation list judging unit is used for judging whether the segmentation list is empty or not, if so, acquiring the current precision selection parameters corresponding to the calculation nodes in each segment, and taking the current precision selection parameters as target precision selection parameters.
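One plausible form of the evaluation standard built by these units is an element-wise tolerance test of the mixed precision result against the float32 target result. The concrete metric and tolerance values below are assumptions; the embodiments leave the standard open.

```python
# Sketch of an evaluation standard: the mixed precision output is judged
# against the float32 target result. The metric (element-wise error bounded
# by an absolute plus relative tolerance) is an assumption for demonstration.

import numpy as np


def build_standard(target, rel_tol=1e-2, abs_tol=1e-3):
    """Return a qualification test derived from the float32 target result."""
    target = np.asarray(target, dtype=np.float32)

    def is_qualified(mixed):
        mixed = np.asarray(mixed, dtype=np.float32)
        err = np.abs(mixed - target)
        # qualified iff every element stays within the combined tolerance
        return bool(np.all(err <= abs_tol + rel_tol * np.abs(target)))

    return is_qualified
```

A test built this way plugs directly into the precision search: when the mixed precision result passes, the longest segment may stay in float16; when it fails, the segment is split and retried.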
The device can execute the method provided by all the embodiments of the invention, and has the corresponding functional modules and beneficial effects of executing the method. Technical details not described in detail in the embodiments of the present invention can be found in the methods provided in all the foregoing embodiments of the present invention.
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic equipment may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including an input unit 16, such as a keyboard, mouse, etc., an output unit 17, such as various types of displays, speakers, etc., a storage unit 18, such as a magnetic disk, optical disk, etc., and a communication unit 19, such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the model mixed precision reasoning method.
In some embodiments, the model mixed precision reasoning method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the model mixed precision reasoning method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the model mixed precision reasoning method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special or general purpose programmable processor, operable to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user, for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a Local Area Network (LAN), a Wide Area Network (WAN), a blockchain network, and the Internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of high management difficulty and weak service expansibility in traditional physical hosts and VPS services.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

Translated from Chinese

1. A model mixed precision reasoning method, characterized in that the method comprises:
inputting input samples into a deep learning model in a chip, and computing the input samples through a plurality of computing nodes in the chip corresponding to the deep learning model to obtain a target result of float32 type, wherein the initial precision selection parameter of each computing node is true;
obtaining a segment list corresponding to the model, and adjusting the precision selection parameter corresponding to each segment according to the mixed precision result of the model for each segment under preset precision selection parameters and the target result, wherein each segment comprises at least one computing node;
inputting the target precision selection parameter corresponding to each computing node in each segment as a control signal into the control node corresponding to the computing node, selecting, by the control node in the chip, the precision computing branch matched with the computing node according to the control signal, and completing mixed precision reasoning by the computing node according to the selected precision computing branch;
wherein each computing node corresponds in advance to a float32 precision computing branch and a float16 precision computing branch.
2. The method according to claim 1, characterized in that before inputting the input samples into the deep learning model in the chip, the method further comprises:
sorting a plurality of topological structures included in the model;
adding a float16 precision computing branch to the computing nodes corresponding to each topological structure according to the topological sorting result.
3. The method according to claim 1, characterized in that after obtaining the segment list corresponding to the model, the method further comprises:
presetting the precision selection parameter corresponding to each segment according to the segment type corresponding to the segment;
inputting the input samples into the deep learning model, and processing the input samples through the model according to the preset precision selection parameters corresponding to each segment to obtain a mixed precision result.
4. The method according to claim 3, characterized in that presetting the precision selection parameter corresponding to each segment according to the segment type corresponding to the segment comprises:
obtaining the longest segment, known data type segments, and unknown data type segments from the segment list;
setting the precision selection parameter corresponding to the longest segment to false;
setting the precision selection parameter corresponding to each known data type segment to true or false according to the target data type corresponding to the segment;
setting the precision selection parameter corresponding to each unknown data type segment to true.
5. The method according to claim 4, characterized in that adjusting the precision selection parameter corresponding to each segment according to the mixed precision result of the model for each segment under the preset precision selection parameters and the target result comprises:
constructing an evaluation standard according to the mixed precision result and the target result;
judging whether the mixed precision result is qualified according to the evaluation standard;
if so, setting the precision selection parameter corresponding to each computing node in the longest segment to false, removing the longest segment from the segment list, and returning to execute the operation of obtaining the longest segment from the segment list.
6. The method according to claim 5, characterized in that after judging whether the mixed precision result is qualified according to the evaluation standard, the method further comprises:
if not, judging whether the longest segment meets a fission condition;
if so, splitting the longest segment into a first segment and a second segment, adding the first segment and the second segment to the segment list, and returning to execute the operation of obtaining the longest segment from the segment list.
7. The method according to claim 1, characterized in that before inputting the target precision selection parameter corresponding to each computing node in each segment as a control signal into the control node corresponding to the computing node, the method further comprises:
judging whether the segment list is empty;
if so, obtaining the current precision selection parameter corresponding to each computing node in each segment, and taking the current precision selection parameter as the target precision selection parameter.
8. A model mixed precision reasoning apparatus, characterized in that the apparatus comprises:
a target result generation module, configured to input input samples into a deep learning model in a chip, and compute the input samples through a plurality of computing nodes in the chip corresponding to the deep learning model to obtain a target result of float32 type, wherein the initial precision selection parameter of each computing node is true;
a parameter adjustment module, configured to obtain a segment list corresponding to the model, and adjust the precision selection parameter corresponding to each segment according to the mixed precision result of the model for each segment under preset precision selection parameters and the target result, wherein each segment comprises at least one computing node;
a branch selection module, configured to input the target precision selection parameter corresponding to each computing node in each segment as a control signal into the control node corresponding to the computing node, select, by the control node in the chip, the precision computing branch matched with the computing node according to the control signal, and complete mixed precision reasoning by the computing node according to the selected precision computing branch;
wherein each computing node corresponds in advance to a float32 precision computing branch and a float16 precision computing branch.
9. An electronic device, characterized in that the device comprises:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor to enable the at least one processor to perform the model mixed precision reasoning method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions, and the computer instructions are used to cause a processor, when executing them, to implement the model mixed precision reasoning method according to any one of claims 1 to 7.
CN202310524663.2A | 2023-05-10 | 2023-05-10 | A model mixed precision reasoning method, device, equipment and storage medium | Active | CN116523051B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202310524663.2A | CN116523051B (en) | 2023-05-10 | 2023-05-10 | A model mixed precision reasoning method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202310524663.2A | CN116523051B (en) | 2023-05-10 | 2023-05-10 | A model mixed precision reasoning method, device, equipment and storage medium

Publications (2)

Publication Number | Publication Date
CN116523051A (en) | 2023-08-01
CN116523051B | 2025-10-03

Family

ID=87408020

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202310524663.2A | Active | CN116523051B (en) | 2023-05-10 | 2023-05-10 | A model mixed precision reasoning method, device, equipment and storage medium

Country Status (1)

Country | Link
CN (1) | CN116523051B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110543332A (en) * | 2017-04-24 | 2019-12-06 | Intel Corporation | Inference using a mix of low and high precision
CN115329939A (en) * | 2022-08-24 | 2022-11-11 | Wuxi Jiangnan Institute of Computing Technology | Method and device for realizing pulse array hardware supporting various different-precision operations

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11586883B2 (en) * | 2018-12-14 | 2023-02-21 | Microsoft Technology Licensing, LLC | Residual quantization for neural networks
CN112598078B (en) * | 2020-12-28 | 2024-04-19 | Beijing Dajia Internet Information Technology Co., Ltd. | Hybrid precision training method and device, electronic equipment and storage medium
CN114969446B (en) * | 2022-06-02 | 2023-05-05 | PLA Strategic Support Force Information Engineering University | Grouping hybrid precision configuration scheme searching method based on sensitivity model


Also Published As

Publication number | Publication date
CN116523051A (en) | 2023-08-01

Similar Documents

Publication | Publication Date | Title
CN115271218B (en) | Carbon emission prediction method, device, equipment and medium based on electric carbon factor
CN117827710B (en) | DMA bandwidth determining method, device, equipment and medium based on AI chip
CN116992150A (en) | Research and development component recommendation method, device, equipment and storage medium
CN119046124B (en) | Cost evaluation method, device, equipment, medium and product of distributed system
CN118981355B (en) | Topological structure generation method, device, electronic device and storage medium
CN114861039B (en) | Parameter configuration method, device, equipment and storage medium of search engine
CN118656221B (en) | Microservice merging method, device, equipment and storage medium
CN118033461B (en) | Method and device for evaluating battery health state and electronic equipment
CN116523051B (en) | A model mixed precision reasoning method, device, equipment and storage medium
CN118300997A (en) | DMA bandwidth determining method and medium based on deep neural learning model
CN116881280A (en) | Optimized database statement determination method, device, equipment and storage medium
CN115293083B (en) | Integrated circuit time sequence prediction method and device, electronic equipment and storage medium
CN116932348A (en) | Intelligent algorithm performance analysis method and device, electronic equipment and medium
CN116382658A (en) | Compiling method and device of AI model, computer equipment and storage medium
CN115511047B (en) | Quantification method, device, equipment and medium of Softmax model
CN117608589B (en) | Code generation method, device, electronic equipment and storage medium
CN116629810B (en) | Operation recommendation method, device, equipment and medium based on building office system
CN116108589B (en) | Method, device, equipment and medium for constructing core model
CN118468059B (en) | Power station characteristic determining method and device for electric power system, electronic equipment and medium
CN117271113B (en) | Task execution method, device, electronic device and storage medium
CN119829120A (en) | Optimization method, device, equipment and medium for configuration parameters of operating system
CN119443192A (en) | Model training and information query method, device, equipment, medium and product
CN120782601A (en) | Case complaint rate determining method and device, electronic equipment and storage medium
CN116205279A (en) | Hardware scheduling execution method, device, equipment and medium of deep learning model
CN119025359A (en) | A performance testing method, device, electronic device and medium for heterogeneous chips

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
