CN119066223B - Video question-answering method and device, electronic equipment and storage medium - Google Patents

Video question-answering method and device, electronic equipment and storage medium

Info

Publication number
CN119066223B
CN119066223B (application number CN202411570790.7A)
Authority
CN
China
Prior art keywords
target
search data
question
video
knowledge graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411570790.7A
Other languages
Chinese (zh)
Other versions
CN119066223A (en)
Inventor
徐聪
周永哲
吴忠人
黄鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN202411570790.7A
Publication of CN119066223A
Application granted
Publication of CN119066223B
Legal status: Active (current)
Anticipated expiration


Abstract

Translated from Chinese

The present application relates to the field of artificial intelligence technology, and in particular to a video question-answering method and device, an electronic device, and a storage medium, for improving the accuracy of answers in video question answering. The method includes: obtaining a question input by an input object, and extracting a target entity from the question; querying, based on the target entity, the scene knowledge graph corresponding to the scene to which the target video belongs to obtain first search data, and querying, based on the target entity, the time sequence knowledge graph corresponding to the target video to obtain second search data, wherein the time sequence knowledge graph includes the interaction relationship between each candidate entity in the target video in each time period; and obtaining the answer to the question through a large language model LLM based on the first search data and the second search data. In this way, the LLM considers scene knowledge and time sequence knowledge when answering video questions, thereby improving the accuracy of the answers.

Description

Video question-answering method and device, electronic equipment and storage medium
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a video question-answering method, a video question-answering device, electronic equipment and a storage medium.
Background
Video question answering refers to generating a natural-language answer to a given question by understanding the relevant video content. Video question answering is widely used in many scenarios, such as multimedia information retrieval, intelligent assistants, business services, and the like.
In the prior art, a Contrastive Language-Image Pre-training (CLIP) model is generally used for video question answering.
However, the CLIP model only supports question answering over a single still frame of the video, so the referenced information is one-sided, i.e., the available information is limited, resulting in poor accuracy of the obtained answers.
Disclosure of Invention
The embodiments of the present application provide a video question-answering method and device, an electronic device, and a storage medium, which improve the accuracy of answers.
The specific technical scheme provided by the embodiment of the application is as follows:
in a first aspect, a video question-answering method is provided, the method including:
acquiring a question input by an input object, and extracting a target entity from the question;
querying a scene knowledge graph corresponding to a scene to which a target video belongs based on the target entity to obtain first search data, and querying a time sequence knowledge graph corresponding to the target video based on the target entity to obtain second search data, wherein the time sequence knowledge graph comprises interaction relations among candidate entities in the target video in each time period;
Based on the first search data and the second search data, answers to the questions are obtained through a large language model LLM.
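For illustration only (not part of the claimed subject matter), the sketch below shows how the three steps of the first aspect could fit together in Python. The helper callables extract_entities, query_graph, and llm_answer are hypothetical stand-ins for the entity-extraction, graph-query, and LLM components; the patent does not prescribe these names or interfaces.

```python
# A minimal sketch of the first-aspect method; all helpers are assumed.
def video_qa(question, scene_kg, temporal_kg,
             extract_entities, query_graph, llm_answer):
    # Step 1: extract the target entities mentioned in the question.
    targets = extract_entities(question)

    # Step 2: query the scene knowledge graph (first search data) and the
    # time sequence knowledge graph (second search data) with those entities.
    first_search_data = [hit for e in targets for hit in query_graph(scene_kg, e)]
    second_search_data = [hit for e in targets for hit in query_graph(temporal_kg, e)]

    # Step 3: let the LLM answer, conditioned on both kinds of search data.
    return llm_answer(question, first_search_data, second_search_data)
```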
In a possible embodiment, before querying the scene knowledge graph corresponding to the scene to which the target video belongs based on the target entity to obtain the first search data, and querying the time sequence knowledge graph corresponding to the target video based on the target entity to obtain the second search data, the method further includes:
Extracting a direct relation among first candidate entities in a target document, and setting weights for the direct relation, wherein the target document contains related texts of scenes to which a target video belongs;
Extracting context proximity relations among first candidate entities in the target document, and setting weights for the context proximity relations;
And merging the direct relation and the context proximity relation among the first candidate entities to obtain the scene knowledge graph.
By this method, documents related to the scene are compiled into a scene knowledge graph, so that the LLM considers scene knowledge when answering questions; this reduces the chance of the LLM answering off the question and improves the accuracy of the answers.
In a possible embodiment, before querying the scene knowledge graph corresponding to the scene to which the target video belongs based on the target entity to obtain the first search data, and querying the time sequence knowledge graph corresponding to the target video based on the target entity to obtain the second search data, the method further includes:
Editing the picture information of the target video into time text information, wherein the time text information comprises occurrence time of each event and information of a second candidate entity corresponding to the occurrence time of each event;
and converting the time text information into a time sequence diagram according to the occurrence time of each event, and obtaining a time sequence knowledge graph.
By the method, the picture information of the target video is stored as the time sequence knowledge graph according to the event time, so that longer videos can be understood, more information in the videos can be obtained, and related video clips can be found more quickly.
In one possible embodiment, editing the picture information of the target video into time text information includes:
performing target detection on the target video to obtain attribute information of each second candidate entity;
performing target tracking on each second candidate entity in the target video to obtain tracking information of each second candidate entity;
based on the attribute information and the tracking information, time text information of the target video is generated.
In one possible embodiment, obtaining an answer to a question through a large language model LLM based on the first search data and the second search data includes:
screening, through the LLM, target search data from the first search data and the second search data that satisfies a correlation condition with the question;
taking the target search data as part of the prompt of the LLM, and inputting the prompt into the LLM to obtain an answer to the question.
By the method, the target search data is provided for the LLM, so that the accuracy of video answer can be improved.
In a second aspect, a video question answering apparatus is provided, the apparatus comprising:
The acquisition module is used for acquiring the question input by the input object and extracting a target entity from the question;
The query module is used for querying a scene knowledge graph corresponding to a scene to which the target video belongs based on the target entity to obtain first search data, and querying a time sequence knowledge graph corresponding to the target video based on the target entity to obtain second search data, wherein the time sequence knowledge graph comprises interaction relations among candidate entities in the target video in each time period;
and the processing module is used for obtaining answers to the questions through a large language model LLM based on the first search data and the second search data.
In a possible embodiment, before querying the scene knowledge graph corresponding to the scene to which the target video belongs based on the target entity to obtain the first search data, and querying the time sequence knowledge graph corresponding to the target video based on the target entity to obtain the second search data, the device further comprises a first generation module, the first generation module being used for:
Extracting a direct relation among first candidate entities in a target document, and setting weights for the direct relation, wherein the target document contains related texts of scenes to which a target video belongs;
Extracting context proximity relations among first candidate entities in the target document, and setting weights for the context proximity relations;
And merging the direct relation and the context proximity relation among the first candidate entities to obtain the scene knowledge graph.
In a possible embodiment, before querying the scene knowledge graph corresponding to the scene to which the target video belongs based on the target entity to obtain the first search data, and querying the time sequence knowledge graph corresponding to the target video based on the target entity to obtain the second search data, the device further comprises a second generation module, where the second generation module is configured to:
Editing the picture information of the target video into time text information, wherein the time text information comprises occurrence time of each event and information of a second candidate entity corresponding to the occurrence time of each event;
and converting the time text information into a time sequence diagram according to the occurrence time of each event, and obtaining a time sequence knowledge graph.
In one possible embodiment, when editing the picture information of the target video into the time text information, the second generating module is further configured to:
performing target detection on the target video to obtain attribute information of each second candidate entity;
performing target tracking on each second candidate entity in the target video to obtain tracking information of each second candidate entity;
based on the attribute information and the tracking information, time text information of the target video is generated.
In a possible embodiment, when obtaining an answer to a question by the large language model LLM based on the first search data and the second search data, the processing module is further configured to:
screening, through the LLM, target search data from the first search data and the second search data that satisfies a correlation condition with the question;
taking the target search data as part of the prompt of the LLM, and inputting the prompt into the LLM to obtain an answer to the question.
In a third aspect, the present application provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method of any one of the first aspects when the program is executed.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of any of the first aspects above.
In a fifth aspect, the present application provides a computer program product comprising computer program code which, when run on a computer, causes the computer to perform the method of any of the first aspects.
In the embodiment of the application, the question input by the input object is acquired, the target entity in the question is extracted, then the scene knowledge graph corresponding to the scene to which the target video belongs is queried based on the target entity to obtain the first search data, the time sequence knowledge graph corresponding to the target video is queried based on the target entity to obtain the second search data, and finally the answer to the question is obtained through the large language model LLM based on the first search data and the second search data. In this way, the scene knowledge graph and the time sequence knowledge graph are combined with the LLM, so that the LLM considers scene knowledge and time sequence knowledge when answering video questions, hallucination and incoherent answers are reduced, the LLM is given the capability of understanding ultra-long videos, and the accuracy of video answers is improved.
Drawings
Fig. 1 is a schematic diagram of a possible application scenario in an embodiment of the present application;
FIG. 2 is a flowchart of a video question-answering method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of scene knowledge graph construction in an embodiment of the application;
FIG. 4 is a diagram of a time text message according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a time sequence diagram in an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a video question answering device according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application. Embodiments of the application and features of the embodiments may be combined with one another arbitrarily without conflict. Also, while a logical order of illustration is depicted in the flowchart, in some cases the steps shown or described may be performed in a different order than presented.
The terms "first" and "second" in the description and claims of the application and in the above-mentioned figures are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the term "include" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus. The term "plurality" in the present application may mean at least two, for example, two, three or more, and embodiments of the present application are not limited.
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness. It should be noted that the embodiments of the present application may mention existing industry solutions such as software, components, and models; these should be regarded as exemplary, serving only to illustrate the feasibility of the technical solution of the present application, and do not mean that the applicant has used or must use those solutions.
In the technical scheme of the application, the acquisition, transmission, storage, use and the like of the data all meet the requirements of national relevant laws and regulations.
The following briefly describes the design concept of the embodiment of the present application:
Traditional video understanding methods (such as optical flow methods and time-series hidden Markov models) can handle tasks such as target tracking and motion state analysis.
With the development of deep learning, most tasks now use deep learning models for understanding and analyzing videos, such as Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and three-dimensional convolutional neural networks (3D CNN).
Recently, large language models (Large Language Model, LLM) have evolved rapidly. The advent of large language models pre-trained on large-scale datasets has introduced a new in-context learning capability, which allows them to handle various tasks through prompts without requiring fine-tuning. The Chat Generative Pre-trained Transformer (ChatGPT) was the first breakthrough application based on this capability, including the ability to generate code and call other model tools or APIs. Many studies are exploring how to invoke the Application Programming Interface (API) of visual models using large language models like ChatGPT to solve problems in the computer vision field, including Visual-ChatGPT. The advent of instruction tuning has further enhanced the ability of these models to respond effectively to user requests and perform specific tasks. Large language models incorporating video understanding capabilities provide more sophisticated multimodal understanding, enabling them to handle and interpret complex interactions between visual and textual data. As with their impact in the field of Natural Language Processing (NLP), these models act as more general task solvers, benefiting from the massive knowledge base and context understanding obtained from large amounts of multimodal data, and handling a wider range of tasks. This not only enables them to understand visual content, but also to reason about it in a way that more closely approximates human understanding. Much work is also exploring the use of large language models for video understanding tasks (Vid-LLMs).
Video question answering refers to generating a natural-language answer to a given question by understanding the relevant video content. Video question answering is widely used in many scenarios, such as multimedia information retrieval, intelligent assistants, business services, and the like.
In the prior art, a Contrastive Language-Image Pre-training (CLIP) model is generally used for video question answering, or an LLM is used for video question answering.
However, the CLIP model only supports question answering over a single still frame of the video, so the referenced information is one-sided, i.e., the available information is limited, resulting in poor accuracy of the obtained answers. Meanwhile, when an LLM is used alone, because of the uniqueness of the usage scenario, the LLM may answer off the question, resulting in poor accuracy of the obtained answers.
In view of this, embodiments of the present application provide a video question-answering method, device, electronic device, and storage medium. A question input by an input object is obtained, and a target entity in the question is extracted; then a scene knowledge graph corresponding to the scene to which a target video belongs is queried based on the target entity to obtain first search data, and a time sequence knowledge graph corresponding to the target video is queried based on the target entity to obtain second search data, wherein the time sequence knowledge graph includes the interaction relationship between candidate entities in the target video in each time period; finally, an answer to the question is obtained through a large language model LLM based on the first search data and the second search data. In this way, scene knowledge and time sequence knowledge are combined with the LLM, so that the LLM considers both when answering video questions, entities specific to the scene can be identified, hallucination and incoherent answers are reduced, the LLM is given the capability of understanding ultra-long videos, and the accuracy of video answers is improved.
The preferred embodiments of the present application will be described below with reference to the accompanying drawings of the specification, it being understood that the preferred embodiments described herein are for illustration and explanation only, and not for limitation of the present application, and that the embodiments of the present application and the features of the embodiments may be combined with each other without conflict.
Fig. 1 is a schematic diagram of a possible application scenario in an embodiment of the present application. The application scenario includes a server 110 and terminal devices 120 (including terminal device 1201, terminal device 1202, and so on).
The server 110 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms. The terminal device 120 and the server 110 may be directly or indirectly connected through wired or wireless communication, which is not limited herein.
The terminal device 120 includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a desktop computer, an electronic book reader, an intelligent voice interaction device, an intelligent home appliance, a vehicle-mounted terminal, and the like, and various software such as an application program, an applet, and the like can be installed on the terminal device.
It should be noted that, the number of the terminal devices 120 and the servers 110 shown in fig. 1 is merely illustrative, and the number is not limited in practice, and the embodiment of the present application is not limited in detail.
It should be noted that, the video question-answering method in the embodiment of the present application may be deployed in a computing device, where the computing device may be a server or a terminal device, where the server may be the server 110 shown in fig. 1, or may be another server than the server 110 shown in fig. 1, and the terminal device may be the terminal device 120 shown in fig. 1, or may be another terminal device than the terminal device 120 shown in fig. 1. That is, the method may be executed by the server or the terminal device alone or may be executed by both the server and the terminal device together.
The video question answering method provided by the exemplary embodiments of the present application will be described below with reference to the accompanying drawings in conjunction with the application scenarios described above, and it should be noted that the application scenarios described above are only shown for the convenience of understanding the spirit and principles of the present application, and embodiments of the present application are not limited in this respect.
Referring to fig. 2, a flowchart of implementing a video question-answering method according to an embodiment of the present application is shown, where the specific implementation flow of the method is as follows:
Step 20: acquire the question input by the input object, and extract the target entity from the question.
In the embodiment of the application, an input interface is displayed on a terminal device. When the input object enters a question in the question-answer area of the input interface, a video question-answer request is triggered; the server responds to the video question-answer request, obtains the question input by the input object, performs information extraction on the question, and obtains the target entity corresponding to the question.
For example, when the input object wants to ask a question about a video, it enters the question text in the input box of the question-answering interface of the terminal device. For instance, to ask "Which people entered the yard this morning?", the input object clicks the input box, types the question, and then clicks a confirmation key to trigger a video question-answering request. Information extraction is performed on "Which people entered the yard this morning?" to obtain the target entity corresponding to the question, namely "people".
In addition, the input object can click a voice recognition button to input a question.
In this way, an input interface is presented for the input object to enter questions, which provides convenience and improves the user experience. Voice input is also supported, which saves the input object's text-entry time and improves the efficiency of intelligent question answering.
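As a rough illustration of step 20, one way to extract target entities is to delegate the extraction to an LLM. The prompt wording and the llm callable below are assumptions made for this sketch; the embodiment does not fix a particular extractor.

```python
# A hedged sketch of question entity extraction; the llm callable is assumed
# to take a prompt string and return a completion string.
ENTITY_PROMPT = (
    "List the entities mentioned in the following question, "
    "one per line, with no extra text.\n"
    "Question: {question}"
)

def extract_target_entities(question: str, llm) -> list[str]:
    reply = llm(ENTITY_PROMPT.format(question=question))
    return [line.strip() for line in reply.splitlines() if line.strip()]

# e.g. extract_target_entities("Which people entered the yard this morning?", llm)
# might return ["people", "yard"].
```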
In the embodiment of the application, the scene knowledge graph and the time sequence knowledge graph are constructed in advance.
Optionally, in the embodiment of the present application, a possible embodiment is provided for constructing a scene knowledge graph, and specifically the following operations are performed:
Step A1: extract the direct relations among the first candidate entities in the target document, and set weights for the direct relations.
The target document comprises relevant texts of scenes to which the target video belongs.
For example, the scene to which the target video belongs is a corridor, and the target document is a corridor-related document.
In the embodiment of the application, the target document is divided into text blocks, and the following operations are performed for each text block: extract the first candidate entities contained in the text block and the direct relations among them, and set weights for the direct relations.
Wherein the weights characterize the importance of the corresponding relationships.
For example, referring to fig. 3, a schematic diagram of scene knowledge graph construction in the embodiment of the application: the target document is split into text blocks, the direct relations among the first candidate entities contained in each text block are extracted, and weights are set for the direct relations. The direct relation between the cameras and the corridor is that cameras are deployed in the corridor, with a weight of 2; the direct relation between the corridor and the lamps is that the corridor is equipped with lamps, with a weight of 1.
Step A2: extract the context proximity relations among the first candidate entities in the target document, and set weights for the context proximity relations.
In the embodiment of the application, a context proximity relation is added between the first candidate entities, and the weight of the context proximity relation is set.
For example, the context proximity relations between the first candidate entities are extracted as shown in fig. 3: the context proximity relation between the cameras and the corridor is that a plurality of cameras are installed in the corridor, with a weight of 4; the context proximity relation between the corridor and the lamps is that ceiling lamps and spotlights are installed in the corridor, with a weight of 2.
Step A3: merge the direct relations and the context proximity relations among the first candidate entities to obtain the scene knowledge graph.
In the embodiment of the application, the direct relation and the context proximity relation between the first candidate entities are combined, and the weight of the direct relation and the weight of the context proximity relation are combined to obtain the scene knowledge graph.
The weights of the direct relationships and the weights of the context proximity relationships may be added to obtain the combined weights, which is not limited in the embodiment of the present application.
After merging, the relation between the cameras and the corridor is that cameras are deployed in the corridor and a plurality of cameras are installed in the corridor, with a weight of 7; the relation between the corridor and the lamps is that ceiling lamps and spotlights are installed in the corridor, with a weight of 3.
In addition, it should be noted that in the embodiment of the present application, the manner and details of constructing the scene knowledge graph may be adjusted for different scenes, which is not limited in the embodiment of the present application.
In this way, documents related to the scene are compiled into a knowledge graph and stored in a database, and knowledge of the related scene is matched when the LLM is queried, so that the LLM considers scene knowledge when answering questions; this reduces the chance of the LLM answering off the question and improves the accuracy of the answers.
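A possible shape for steps A1 to A3 is sketched below with networkx. The relation extractors are hypothetical placeholders (they could be rule-based or LLM-based); only the weight-summing merge of step A3 follows the text above.

```python
# A sketch of scene knowledge graph construction (steps A1-A3), assuming
# extractor callables that yield (head, relation, tail, weight) tuples.
import networkx as nx

def _merge_edge(graph, head, tail, relation, weight):
    # Step A3: coinciding edges are merged and their weights summed.
    if graph.has_edge(head, tail):
        graph[head][tail]["weight"] += weight
        graph[head][tail]["relations"].append(relation)
    else:
        graph.add_edge(head, tail, weight=weight, relations=[relation])

def build_scene_kg(text_blocks, extract_direct, extract_proximity):
    graph = nx.Graph()
    for block in text_blocks:
        # Step A1: direct relations, e.g. "cameras are deployed in the corridor".
        for head, rel, tail, w in extract_direct(block):
            _merge_edge(graph, head, tail, rel, w)
        # Step A2: context proximity relations between co-occurring entities.
        for head, rel, tail, w in extract_proximity(block):
            _merge_edge(graph, head, tail, rel, w)
    return graph
```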
Optionally, in the embodiment of the present application, a possible embodiment is provided for constructing a time sequence knowledge graph, and specifically the following operations are performed:
Step B1: edit the picture information of the target video into time text information.
The time text information comprises occurrence time of each event and information of a second candidate entity corresponding to the occurrence time of each event, wherein the information of the second candidate entity comprises attribute information and tracking information.
In the embodiment of the application, the picture information of each video frame of the target video is edited into time text information.
Referring to fig. 4, a schematic diagram of the time text information in an embodiment of the present application: the information of the second candidate entity corresponding to the event time 12:10:05 on September 1, 2026 includes: the entity type is person, the top type is short sleeve, the top color is white, the trousers type is shorts, the trousers color is black, the state is static, the position is [(10, 15), (40, 60)], and the entity id is 20. The information of the second candidate entity corresponding to the event time 12:13:25 on September 1, 2026 includes: the entity type is car, the car color is red, the car type is type 1, the state is moving, the position is [(20, 14), (30, 90)], and the entity id is 123.
Optionally, in the embodiment of the present application, a possible embodiment is provided for obtaining the time text information, and specifically the following operations are performed:
Step B10: perform target detection on the target video to obtain attribute information of each second candidate entity.
In the embodiment of the application, a target detection algorithm is adopted to carry out target detection on the target video, so as to obtain the attribute information of each second candidate entity.
The target detection algorithm may be a YOLO algorithm, and the attribute information of each second candidate entity includes, but is not limited to, basic attribute information such as entity type, color, entity id, etc., which is not limited in the embodiment of the present application.
Step B11: perform target tracking on each second candidate entity in the target video to obtain tracking information of each second candidate entity.
In the embodiment of the application, a target tracking algorithm is adopted to track the target video, so as to obtain the tracking information of each second candidate entity.
Wherein the tracking information of each second candidate entity includes, but is not limited to, status, location, etc.
Step B12: generate the time text information of the target video based on the attribute information and the tracking information.
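The sketch below shows one plausible reading of steps B10 to B12. The detector and tracker back ends are assumptions (the text only mentions YOLO as one option for detection), and the field names on the track object are invented for illustration; the record layout loosely mirrors the fig. 4 example.

```python
# A sketch of time text generation (steps B10-B12) under assumed
# detector/tracker interfaces.
def build_time_text(timestamped_frames, detector, tracker):
    records = []
    for timestamp, frame in timestamped_frames:
        detections = detector(frame)         # step B10: attribute information
        tracks = tracker.update(detections)  # step B11: tracking information
        for track in tracks:
            # Step B12: one time text record per entity per event time.
            records.append({
                "time": timestamp,
                "entity_id": track.entity_id,
                "entity_type": track.entity_type,
                "attributes": track.attributes,  # e.g. clothing or car color
                "state": track.state,            # e.g. "static" or "moving"
                "position": track.box,           # bounding-box corners
            })
    return records
```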
Step B2: convert the time text information into a time sequence diagram according to the occurrence time of each event to obtain the time sequence knowledge graph.
The time sequence diagram comprises the interaction relations among the second candidate entities in each time period, such as being static, moving away, approaching, entering, or leaving; the interaction relations are not limited in the embodiment of the application.
In the embodiment of the application, time text information is recorded into a graph form according to the occurrence time of each event.
For example, referring to fig. 5, a schematic diagram of a time sequence diagram in an embodiment of the present application: entity No. 20 (person) is relatively stationary with respect to entity No. 123 (car) from 12:10:05 to 12:12:08 on September 1, 2026; entity No. 123 (car) moves away from entity No. 20 (person) from 12:20:05 to 12:21:08 on September 1, 2026, and entity No. 20 (person) enters entity No. 6 (building).
In this way, the picture information of the target video is stored as a knowledge graph according to event time, taking the temporal information of the video into account. Compared with other video understanding methods, longer videos can be understood, more information in the video is obtained, and relevant video clips can be found more quickly during video queries.
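Continuing the same assumptions, step B2 could store the interactions as time-stamped edges of a multigraph, as sketched below. The infer_interactions helper, which derives (time period, entity, entity, interaction) tuples from the time text records, is hypothetical.

```python
# A sketch of step B2: turning time text records into a time sequence
# knowledge graph, one labelled edge per interaction per time period.
import networkx as nx

def build_temporal_kg(time_text_records, infer_interactions):
    graph = nx.MultiDiGraph()
    for (start, end), entity_a, entity_b, interaction in infer_interactions(time_text_records):
        # e.g. entity 20 "relatively stationary with respect to" entity 123
        # from 12:10:05 to 12:12:08, as in the fig. 5 example.
        graph.add_edge(entity_a, entity_b,
                       interaction=interaction, start=start, end=end)
    return graph
```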
Step 21: query the scene knowledge graph corresponding to the scene to which the target video belongs based on the target entity to obtain first search data, and query the time sequence knowledge graph corresponding to the target video based on the target entity to obtain second search data.
The time sequence knowledge graph comprises interaction relations among candidate entities in the target video in each time period, and the scene knowledge graph comprises relations among related candidate entities of the scene of the target video.
Step 22: obtain an answer to the question through the large language model LLM based on the first search data and the second search data.
In the embodiment of the application, after the first search data and the second search data corresponding to the target entity are obtained, the answer to the question is obtained through the large language model LLM based on the first search data and the second search data.
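For step 21, a simple neighbourhood lookup over the graphs built above would already yield usable search data; this is only one possible query mechanism, since the embodiment leaves the retrieval details open.

```python
# A sketch of step 21: fetch every edge touching the target entity, with its
# attributes (relation text and weight for the scene graph; interaction and
# time span for the time sequence graph).
def query_kg(graph, target_entity):
    if target_entity not in graph:
        return []
    return list(graph.edges(target_entity, data=True))

# first_search_data = query_kg(scene_kg, "person")
# second_search_data = query_kg(temporal_kg, "person")
```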
Optionally, in the embodiment of the present application, a possible embodiment is provided for obtaining an answer to a question through LLM, specifically performing the following operations:
Step 220: through the LLM, screen out target search data from the first search data and the second search data that satisfies the correlation condition with the question.
In the embodiment of the application, through the LLM, data whose correlation with the question is greater than a correlation threshold is screened out from the first search data and the second search data and used as the target search data.
Step 221: take the target search data as part of the prompt of the LLM, and input the prompt into the LLM to obtain an answer to the question.
In the embodiment of the application, the target search data is used as part of the prompt of the LLM, and the prompt and the question are input into the LLM to obtain the answer to the question.
For example, assume the question is "Which people entered the yard this morning?"; the answer would be "Three people entered the courtyard this morning: A, B, and C."
In this way, providing the LLM with target search data that is useful and comprehensive for answering the question can improve the accuracy of video answers.
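A hedged sketch of steps 220 and 221 follows. The scoring prompt, the 0.5 threshold, and the llm callable are assumptions; the embodiment only requires that the LLM screen the search data for relevance and that the surviving data become part of the prompt.

```python
# A sketch of relevance screening (step 220) and prompt assembly (step 221).
RELEVANCE_PROMPT = (
    "On a scale of 0 to 1, how relevant is this fact to the question?\n"
    "Question: {question}\nFact: {fact}\nAnswer with a number only."
)
ANSWER_PROMPT = (
    "Answer the question using only the facts below.\n"
    "Facts:\n{facts}\nQuestion: {question}"
)

def answer_with_llm(question, first_search_data, second_search_data, llm,
                    threshold=0.5):
    # Step 220: keep only search data whose correlation exceeds the threshold.
    target_search_data = [
        fact for fact in list(first_search_data) + list(second_search_data)
        if float(llm(RELEVANCE_PROMPT.format(question=question, fact=fact))) > threshold
    ]
    # Step 221: fold the surviving facts into the prompt and ask for the answer.
    facts = "\n".join(str(fact) for fact in target_search_data)
    return llm(ANSWER_PROMPT.format(facts=facts, question=question))
```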
Based on the same inventive concept, the embodiment of the present application further provides a video question-answering device, referring to fig. 6, which is a schematic structural diagram of the video question-answering device in the embodiment of the present application, and specifically includes:
the acquiring module 601 is configured to acquire a question input by an input object, and extract a target entity in the question;
The query module 602 is configured to query, based on a target entity, a scene knowledge graph corresponding to a scene to which the target video belongs, obtain first search data, and query, based on the target entity, a time sequence knowledge graph corresponding to the target video, obtain second search data, where the time sequence knowledge graph includes an interaction relationship between candidate entities in the target video in each time period;
The processing module 603 is configured to obtain an answer to the question through the large language model LLM based on the first search data and the second search data.
In one possible embodiment, before querying the scene knowledge graph corresponding to the scene to which the target video belongs based on the target entity to obtain the first search data, and querying the time sequence knowledge graph corresponding to the target video based on the target entity to obtain the second search data, the device further includes a first generating module 604, where the first generating module 604 is configured to:
Extracting a direct relation among first candidate entities in a target document, and setting weights for the direct relation, wherein the target document contains related texts of scenes to which a target video belongs;
Extracting context proximity relations among first candidate entities in the target document, and setting weights for the context proximity relations;
And merging the direct relation and the context proximity relation among the first candidate entities to obtain the scene knowledge graph.
In one possible embodiment, before querying the scene knowledge graph corresponding to the scene to which the target video belongs based on the target entity to obtain the first search data, and querying the time sequence knowledge graph corresponding to the target video based on the target entity to obtain the second search data, the device further includes a second generating module 605, where the second generating module 605 is configured to:
Editing the picture information of the target video into time text information, wherein the time text information comprises occurrence time of each event and information of a second candidate entity corresponding to the occurrence time of each event;
and converting the time text information into a time sequence diagram according to the occurrence time of each event, and obtaining a time sequence knowledge graph.
In one possible embodiment, when editing the picture information of the target video into the time text information, the second generating module 605 is further configured to:
performing target detection on the target video to obtain attribute information of each second candidate entity;
performing target tracking on each second candidate entity in the target video to obtain tracking information of each second candidate entity;
based on the attribute information and the tracking information, time text information of the target video is generated.
In a possible embodiment, when obtaining an answer to a question by the large language model LLM based on the first search data and the second search data, the processing module 603 is further configured to:
screening, through the LLM, target search data from the first search data and the second search data that satisfies a correlation condition with the question;
taking the target search data as part of the prompt of the LLM, and inputting the prompt into the LLM to obtain an answer to the question.
Based on the above embodiments, referring to fig. 7, a schematic structural diagram of an electronic device according to an embodiment of the present application is shown.
Embodiments of the present application provide an electronic device that may include a processor 710 (Central Processing Unit, CPU), a memory 720, an input device 730, an output device 740, and the like. The input device 730 may include a keyboard, a mouse, a touch screen, and the like, and the output device 740 may include a display device, such as a liquid crystal display (LCD), a cathode ray tube (CRT), and the like.
Memory 720 may include Read Only Memory (ROM) and Random Access Memory (RAM) and provides processor 710 with program instructions and data stored in memory 720. In an embodiment of the present application, the memory 720 may be used to store a program of any of the video question-answering methods in the embodiment of the present application.
The processor 710 is configured to call the program instructions stored in the memory 720 and execute any of the video question-answering methods in the embodiments of the present application in accordance with the obtained program instructions.
Based on the above embodiments, in the embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the video question-answering method in any of the above method embodiments.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (8)

Translated from Chinese

1. A video question-answering method, comprising:
obtaining a question input by an input object, and extracting a target entity from the question;
extracting direct relations between first candidate entities in a target document and setting weights for the direct relations, wherein the target document contains text related to a scene to which a target video belongs;
extracting context proximity relations between the first candidate entities in the target document, and setting weights for the context proximity relations;
merging the direct relations and the context proximity relations between the first candidate entities to obtain a scene knowledge graph corresponding to the scene to which the target video belongs;
querying the scene knowledge graph based on the target entity to obtain first search data, and querying a time sequence knowledge graph corresponding to the target video based on the target entity to obtain second search data, wherein the time sequence knowledge graph comprises the interaction relations between candidate entities in the target video in each time period;
obtaining an answer to the question through a large language model LLM based on the first search data and the second search data.

2. The method according to claim 1, wherein before querying the scene knowledge graph corresponding to the scene to which the target video belongs based on the target entity to obtain the first search data, and querying the time sequence knowledge graph corresponding to the target video based on the target entity to obtain the second search data, the method further comprises:
editing the picture information of the target video into time text information, the time text information comprising the occurrence time of each event and the information of the second candidate entity corresponding to the occurrence time of each event;
converting the time text information into a time sequence diagram according to the occurrence time of each event to obtain the time sequence knowledge graph.

3. The method according to claim 2, wherein editing the picture information of the target video into time text information comprises:
performing target detection on the target video to obtain attribute information of each second candidate entity;
performing target tracking on each second candidate entity in the target video to obtain tracking information of each second candidate entity;
generating the time text information of the target video based on the attribute information and the tracking information.

4. The method according to claim 1, wherein obtaining the answer to the question through a large language model LLM based on the first search data and the second search data comprises:
screening, through the LLM, target search data from the first search data and the second search data that satisfies a correlation condition with the question;
taking the target search data as part of the prompt of the LLM, and inputting the prompt into the LLM to obtain the answer to the question.

5. A video question-answering device, comprising:
an acquisition module, configured to acquire a question input by an input object and extract a target entity from the question;
a first generation module, configured to extract direct relations between first candidate entities in a target document and set weights for the direct relations, wherein the target document contains text related to a scene to which a target video belongs; extract context proximity relations between the first candidate entities in the target document and set weights for the context proximity relations; and merge the direct relations and the context proximity relations between the first candidate entities to obtain a scene knowledge graph corresponding to the scene to which the target video belongs;
a query module, configured to query the scene knowledge graph based on the target entity to obtain first search data, and query a time sequence knowledge graph corresponding to the target video based on the target entity to obtain second search data, wherein the time sequence knowledge graph comprises the interaction relations between candidate entities in the target video in each time period;
a processing module, configured to obtain an answer to the question through a large language model LLM based on the first search data and the second search data.

6. The device according to claim 5, wherein before querying the scene knowledge graph corresponding to the scene to which the target video belongs based on the target entity to obtain the first search data, and querying the time sequence knowledge graph corresponding to the target video based on the target entity to obtain the second search data, the device further comprises a second generation module, the second generation module being configured to:
edit the picture information of the target video into time text information, the time text information comprising the occurrence time of each event and the information of the second candidate entity corresponding to the occurrence time of each event;
convert the time text information into a time sequence diagram according to the occurrence time of each event to obtain the time sequence knowledge graph.

7. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 4 when executing the program.

8. A computer-readable storage medium having a computer program stored thereon, wherein the computer program implements the steps of the method according to any one of claims 1 to 4 when executed by a processor.
CN202411570790.7A | Priority date 2024-11-05 | Filing date 2024-11-05 | Video question-answering method and device, electronic equipment and storage medium | Active | CN119066223B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202411570790.7A (CN119066223B) | 2024-11-05 | 2024-11-05 | Video question-answering method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202411570790.7A (CN119066223B) | 2024-11-05 | 2024-11-05 | Video question-answering method and device, electronic equipment and storage medium

Publications (2)

Publication Number | Publication Date
CN119066223A (en) | 2024-12-03
CN119066223B (en) | 2025-01-17

Family

Family ID: 93639879

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202411570790.7A (Active; CN119066223B) | Video question-answering method and device, electronic equipment and storage medium | 2024-11-05 | 2024-11-05

Country Status (1)

Country | Link
CN | CN119066223B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN119760151B (en)* | 2024-12-10 | 2025-10-03 | Beijing Baidu Netcom Science and Technology Co Ltd | Knowledge graph generation method, question answering method and device, intelligent agent, equipment, medium and product

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN116541490A (en)* | 2023-03-27 | 2023-08-04 | Shandong University | Complex scene video question-answering method and system for cloud service robot
CN118093956A (en)* | 2023-12-16 | 2024-05-28 | South China University of Technology | Question answering method for multi-granularity time sequence knowledge graph

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112836057B (en)* | 2019-11-22 | 2024-03-26 | Huawei Technologies Co Ltd | Knowledge graph generation method, device, terminal and storage medium
CN114120166B (en)* | 2021-10-14 | 2023-09-22 | Beijing Baidu Netcom Science and Technology Co Ltd | Video question and answer method, device, electronic equipment and storage medium
CN114328947A (en)* | 2021-11-23 | 2022-04-12 | Taikang Insurance Group Co Ltd | Knowledge graph-based question and answer method and device
CN115391511B (en)* | 2022-08-29 | 2025-05-27 | BOE Technology Group Co Ltd | Video question-answering method, device, system and storage medium
CN117370608A (en)* | 2023-09-20 | 2024-01-09 | Sun Yat-sen University | Video question-answering method and system with interpretability and knowledge inspiring capability
CN117648423A (en)* | 2023-12-12 | 2024-03-05 | China Telecom Corp Ltd | Question and answer method and device based on time sequence knowledge graph
CN118865196B (en)* | 2024-06-25 | 2025-08-29 | Zhejiang University | A video understanding method and system based on large language model

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN116541490A (en)* | 2023-03-27 | 2023-08-04 | Shandong University | Complex scene video question-answering method and system for cloud service robot
CN118093956A (en)* | 2023-12-16 | 2024-05-28 | South China University of Technology | Question answering method for multi-granularity time sequence knowledge graph

Also Published As

Publication Number | Publication Date
CN119066223A (en) | 2024-12-03

Similar Documents

Publication | Title
US11899681B2 (en) | Knowledge graph building method, electronic apparatus and non-transitory computer readable storage medium
KR20190139751A (en) | Method and apparatus for processing video
CN113569037A (en) | A message processing method, device and readable storage medium
US20100100371A1 (en) | Method, System, and Apparatus for Message Generation
US20210049354A1 (en) | Human object recognition method, device, electronic apparatus and storage medium
CN119066223B (en) | Video question-answering method and device, electronic equipment and storage medium
KR102122918B1 (en) | Interactive question-answering apparatus and method thereof
CN117932022A (en) | Intelligent question-answering method and device, electronic equipment and storage medium
CN114064943A (en) | Conference management method, conference management device, storage medium and electronic equipment
CN117349515A (en) | Search processing methods, electronic devices and storage media
CN118277588A (en) | Query request processing method, electronic device and storage medium
KR20250044145A (en) | Application prediction based on a visual search determination
CN117093600A (en) | Search prompt word generation method and device, electronic equipment and storage medium
CN112507139A (en) | Knowledge graph-based question-answering method, system, equipment and storage medium
CN111223014B (en) | Method and system for online generation of subdivision scene teaching courses from a large number of subdivision teaching contents
CN118760759B (en) | Document-oriented question-answering method, device, electronic device, storage medium and product
US20180293299A1 (en) | Query processing
CN117093695A (en) | Knowledge graph-based community intelligent question and answer method and device and electronic equipment
CN116303975B (en) | Training method of recall model, recall method and related equipment
CN114301886A (en) | Multimedia resource identification method, device, equipment and storage medium
CN112765447A (en) | Data searching method and device and electronic equipment
CN119781888B (en) | Information processing method, apparatus, device and storage medium
US20250165529A1 (en) | Interactive real-time video search based on knowledge graph
CN120493947B (en) | A method, device and equipment for constructing multimodal cross-domain question-answering data
CN120216660A (en) | Information retrieval method and device

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
