CN118965279A - A financial content risk control method and system based on a large model - Google Patents

A financial content risk control method and system based on a large model
Download PDF

Info

Publication number
CN118965279A
CN118965279A
Authority
CN
China
Prior art keywords
semantic
text
image
features
mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202411435306.XA
Other languages
Chinese (zh)
Other versions
CN118965279B (en)
Inventor
莫倩
蔡锦森
贾承斌
孟维勋
朱若曦
张晓玲
艾青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wiseweb Technology Group Co ltd
Beijing Wiseweb Big Data Technology Co ltd
Original Assignee
Wiseweb Technology Group Co ltd
Beijing Wiseweb Big Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wiseweb Technology Group Co ltd, Beijing Wiseweb Big Data Technology Co ltd
Priority to CN202411435306.XA
Publication of CN118965279A
Application granted
Publication of CN118965279B
Status: Active
Anticipated expiration

Links

Classifications

Landscapes

Abstract

Translated from Chinese


The present application provides a financial content risk control method and system based on a large model. First, multimodal semantic features are extracted from the target financial document to obtain a text semantic space and an image semantic space; the semantic association graph of the text modality is determined according to the relational semantics of each text semantic feature in the text semantic space in the context and the syntactic structure relationship of the text semantic features; the semantic structure relationship of the image semantic space is then converted into a semantic association graph of the image modality through the overall semantic features of the image and the syntactic features of the visual objects in the image; the semantic association graph of the text modality and the semantic association graph of the image modality are subjected to inter-modal fine-grained structural fusion to obtain a multimodal fusion graph; based on the multimodal fusion graph, the extracted multimodal features are subjected to abnormal classification to determine whether the target financial document has risks. The scheme of the present application can realize fine-grained multimodal feature fusion of financial documents, thereby improving the accuracy of risk control detection.

Description

A financial content risk control method and system based on a large model
Technical Field
The present application relates to the technical field of financial content risk control, and in particular to a financial content risk control method and system based on a large model.
Background
Financial content risk control refers to monitoring and analyzing content published and disseminated in the financial domain (such as financial reports, news, and social media information) to identify potential false information, misleading statements, or illegal content, thereby preventing financial risks. Its core aim is to ensure the authenticity, accuracy, and compliance of financial information through technical means and data analysis, so accurate false-information detection on financial documents has become a key task.
In the prior art, conventional risk control detection techniques (for example, false-information detection) have various limitations when processing financial documents. Most conventional methods rely on a single modality, text or image, for information analysis and cannot fully exploit the collaborative information of multiple modalities (such as text and images). Single-modality processing struggles to achieve fine-grained detection in complex financial scenarios: text descriptions and image content in financial reports or news often have close semantic associations, and analyzing text or images in isolation can miss important cross-modal information, so that false information is not identified accurately. Conventional natural language processing techniques can identify semantic anomalies to some extent, but their ability to understand complex grammatical structures and implicit semantic relations is limited; in financial documents in particular, the large number of terms, industry-specific expressions, and implicit semantic relations make conventional methods inadequate. In addition, image information in financial documents is usually processed coarsely, often limited to basic semantic detection or object recognition, ignoring fine-grained information in the images and its correlation with the text, which further limits the accuracy of risk control detection of financial content.
Disclosure of Invention
The present application provides a financial content risk control method and system based on a large model, which can realize fine-grained multimodal feature fusion of financial documents, thereby improving the accuracy of risk control detection.
In a first aspect, the present application provides a financial content risk control method based on a large model, including the steps of:
collecting a target financial document to be subjected to risk control detection from a financial content platform;
performing multimodal semantic feature extraction on the target financial document based on a pre-trained large semantic model to obtain a text semantic space of the text modality and an image semantic space of the image modality;
constructing a semantic association tree of text elements according to the contextual relational semantics of each text semantic feature in the text semantic space, and further determining a semantic association graph of the text modality according to the semantic association tree and the syntactic structure relations of the text semantic features in the text semantic space;
converting the semantic structure relations among the image semantic features in the image semantic space into a semantic association graph of the image modality through the overall semantic features of the images in the target financial document and the syntactic features of the visual objects within the images;
performing inter-modal fine-grained structural fusion on the semantic association graph of the text modality and the semantic association graph of the image modality to obtain a multimodal fusion graph;
and extracting multimodal features from the multimodal fusion graph, performing anomaly classification on the extracted multimodal features, and determining that the target financial document carries risk if the classification result is abnormal.
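The six claimed steps above can be wired together as a toy pipeline. This is only an illustrative sketch: every function name, the dict-based graph representation, and the flagged-feature classification rule are assumptions for exposition, not the patent's method.

```python
# Hypothetical end-to-end sketch of the claimed six-step pipeline.
def extract_semantic_spaces(document):
    # Steps 1-2: split the multimodal document into text and image feature sets.
    return document["text_features"], document["image_features"]

def build_modality_graph(feature_space):
    # Steps 3-4 (toy stand-in): nodes are features; edges link adjacent
    # features as a placeholder for contextual/syntactic relations.
    nodes = list(feature_space)
    return {"nodes": nodes, "edges": [(a, b) for a, b in zip(nodes, nodes[1:])]}

def fuse_graphs(text_graph, image_graph):
    # Step 5: inter-modal fusion as a node/edge union plus one cross-modal
    # edge between the first node of each modality.
    nodes = text_graph["nodes"] + image_graph["nodes"]
    edges = text_graph["edges"] + image_graph["edges"]
    if text_graph["nodes"] and image_graph["nodes"]:
        edges.append((text_graph["nodes"][0], image_graph["nodes"][0]))
    return {"nodes": nodes, "edges": edges}

def classify(fusion_graph, flagged=frozenset({"guaranteed_return"})):
    # Step 6: toy anomaly rule -- risky if any node carries a flagged label.
    return any(n in flagged for n in fusion_graph["nodes"])

doc = {"text_features": ["revenue_up", "guaranteed_return"],
       "image_features": ["chart_trend"]}
text_g = build_modality_graph(doc["text_features"])
image_g = build_modality_graph(doc["image_features"])
risky = classify(fuse_graphs(text_g, image_g))
```

Here the flagged text feature marks the document as risky; a real system would replace each stand-in with the learned components described in the embodiments below.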
Preferably, performing multimodal semantic feature extraction on the target financial document based on the pre-trained large semantic model to obtain the text semantic space of the text modality and the image semantic space of the image modality specifically includes:
extracting all text semantic features and image semantic features in the target financial document through the pre-trained large semantic model;
taking the set formed by all text semantic features as the text semantic space of the text modality;
and taking the set formed by all image semantic features as the image semantic space of the image modality.
Preferably, constructing the semantic association tree of text elements according to the contextual relational semantics of each text semantic feature in the text semantic space specifically includes:
determining the contextual relational semantics of each text semantic feature in the text semantic space through a sliding context-window analysis technique;
performing hierarchical clustering on the text semantic features based on all the relational semantics to obtain a plurality of semantic association clusters;
and organizing each text semantic feature in the text semantic space into a tree structure by semantic relation according to all the semantic association clusters, obtaining the semantic association tree of the text elements.
Preferably, determining the semantic association graph of the text modality according to the semantic association tree and the syntactic structure relations of the text semantic features in the text semantic space specifically includes:
extracting the syntactic structures of different language segments in the text of the target financial document;
determining the syntactic structure relations of the text semantic features in the text semantic space through all the syntactic structures, and generating a syntax tree based on those relations;
and performing structural semantic association between the semantic association tree and the syntax tree to generate the semantic association graph of the text modality.
Preferably, converting the semantic structure relations among the image semantic features in the image semantic space into the semantic association graph of the image modality through the overall semantic features of the images in the target financial document and the syntactic features of the visual objects within the images specifically comprises:
extracting the overall semantic features of the images in the target financial document;
extracting the syntactic features of the visual objects in the images of the target financial document;
constructing a semantic structure relation model based on the overall semantic features and the syntactic features;
and converting the semantic structure relations among the image semantic features in the image semantic space into the semantic association graph of the image modality using the semantic structure relation model.
Preferably, extracting the multimodal features from the multimodal fusion graph and performing anomaly classification on the extracted multimodal features to determine whether the target financial document contains false information specifically includes:
extracting node features and edge features from the multimodal fusion graph;
and performing anomaly classification on the node features and edge features, and determining that the target financial document carries risk if the classification result is abnormal.
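As a rough illustration of this node/edge classification step, the following sketch mean-pools hypothetical node and edge features from the fusion graph and applies a fixed linear anomaly score; the feature values, weights, and threshold are invented stand-ins for a trained classifier.

```python
# Toy anomaly classification over pooled node and edge features.
def mean_pool(vectors):
    # element-wise average of a list of equal-length feature vectors
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def anomaly_score(node_feats, edge_feats, w_node, w_edge, bias=0.0):
    # concatenate pooled node and edge features, then apply a linear score
    pooled = mean_pool(node_feats) + mean_pool(edge_feats)
    weights = w_node + w_edge
    return sum(p * w for p, w in zip(pooled, weights)) + bias

node_feats = [[0.9, 0.1], [0.8, 0.3]]   # per-node features (assumed values)
edge_feats = [[0.7], [0.5]]             # per-edge features (assumed values)
score = anomaly_score(node_feats, edge_feats, w_node=[1.0, -0.5], w_edge=[2.0])
is_risky = score > 1.0                  # classification threshold (assumed)
```

In practice the linear score would be replaced by the trained classifier implied by the claims; the pooling step is one common way to reduce a variable-size graph to a fixed-size feature vector.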
Preferably, the target financial document includes text data and image data.
In a second aspect, the present application provides a financial content risk control system based on a large model, comprising:
an acquisition module for collecting a target financial document to be subjected to risk control detection from a financial content platform;
a processing module for performing multimodal semantic feature extraction on the target financial document based on a pre-trained large semantic model to obtain a text semantic space of the text modality and an image semantic space of the image modality;
the processing module being further configured to construct a semantic association tree of text elements according to the contextual relational semantics of the text semantic features in the text semantic space, and to determine a semantic association graph of the text modality according to the semantic association tree and the syntactic structure relations of the text semantic features in the text semantic space;
the processing module being further configured to convert the semantic structure relations among the image semantic features in the image semantic space into a semantic association graph of the image modality through the overall semantic features of the images in the target financial document and the syntactic features of the visual objects within the images;
the processing module being further configured to perform inter-modal fine-grained structural fusion on the semantic association graph of the text modality and the semantic association graph of the image modality to obtain a multimodal fusion graph;
and an execution module for extracting multimodal features from the multimodal fusion graph, performing anomaly classification on the extracted multimodal features, and determining that the target financial document carries risk if the classification result is abnormal.
In a third aspect, the present application provides a computer device comprising a memory storing code and a processor configured to obtain the code and perform the above large-model-based financial content risk control method.
In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above large-model-based financial content risk control method.
The technical solution provided by the embodiments of the present application has the following beneficial effects:
In the embodiments of the present application, a target financial document to be subjected to risk control detection is collected from a financial content platform; multimodal semantic feature extraction is performed on the target financial document based on a pre-trained large semantic model to obtain a text semantic space of the text modality and an image semantic space of the image modality; a semantic association tree of text elements is constructed according to the contextual relational semantics of each text semantic feature in the text semantic space, and a semantic association graph of the text modality is further determined according to the semantic association tree and the syntactic structure relations of the text semantic features; the semantic structure relations among the image semantic features in the image semantic space are converted into a semantic association graph of the image modality through the overall semantic features of the images in the target financial document and the syntactic features of the visual objects within the images; inter-modal fine-grained structural fusion is performed on the semantic association graph of the text modality and the semantic association graph of the image modality to obtain a multimodal fusion graph; and multimodal features are extracted from the multimodal fusion graph and subjected to anomaly classification, the target financial document being determined to carry risk if the classification result is abnormal.
Thus, the present application performs inter-modal fine-grained structural fusion on the semantic association graph of the text modality and the semantic association graph of the image modality to obtain a multimodal fusion graph, extracts multimodal features from that graph, performs anomaly classification on them, and judges whether the target financial document carries risk.
First, the semantic association graph of the text modality is determined from the semantic association tree of text elements and the syntactic structure relations of the text semantic features. Constructing the semantic association tree helps in deeply understanding the structure and logical relations of the text content; this process can identify the key elements in the text and their interactions, so that potential false information is captured more accurately in complex financial contexts. Combining the semantic association tree with the syntactic structure relations then generates a richer semantic association graph of the text modality, whose representation makes the relation between each semantic feature and the other features clearer and improves the system's understanding of the text information, helping to detect false information in the text and its possible propagation paths.
Second, according to the overall semantic features of the images and the syntactic features of the visual objects within them, the semantic structure relations among the image semantic features in the image semantic space are converted into the semantic association graph of the image modality. Converting the semantic structure relations through the overall semantic features and the syntactic features of the visual objects allows the deep semantic information in the images to be fully mined and presents clearly the association between the visual content and the text content, avoiding the coarse analysis of image content typical of traditional image processing methods; the semantic association graph of the image modality also helps in understanding more comprehensively the synergy between images and text in the financial document, further improving the detection accuracy for false information.
Then, fusing the fine-grained structures of the text modality and the image modality generates the multimodal fusion graph, realizing deep information integration across modalities and strengthening the system's overall understanding of the financial document; the fused graph not only contains the information of both text and images but also captures the complex relations between them, greatly improving the ability to identify false information.
Finally, based on the generated multimodal fusion graph, the extracted multimodal features are subjected to anomaly classification to judge whether the target financial document carries risk, so that potential abnormal information in the financial document can be effectively identified and risk control detection of false information in the target financial document is completed.
In conclusion, the scheme of the present application can realize fine-grained multimodal feature fusion of financial documents, thereby improving the accuracy of risk control detection.
Drawings
FIG. 1 is an exemplary flowchart of a large-model-based financial content risk control method according to some embodiments of the present application;
FIG. 2 is a schematic flow diagram of determining a semantic association tree according to some embodiments of the present application;
FIG. 3 is a schematic diagram of a semantic association tree according to some embodiments of the present application;
FIG. 4 is a schematic diagram of exemplary software modules of a large-model-based financial content risk control system according to some embodiments of the present application;
FIG. 5 is a schematic diagram of a computer device implementing a large-model-based financial content risk control method according to some embodiments of the present application.
Detailed Description
The core of the present application is as follows: multimodal semantic feature extraction is performed on a target financial document to obtain a text semantic space and an image semantic space; a semantic association graph of the text modality is determined according to the contextual relational semantics of each text semantic feature in the text semantic space and the syntactic structure relations of the text semantic features; the semantic structure relations of the image semantic space are then converted into a semantic association graph of the image modality through the overall semantic features of the images and the syntactic features of the visual objects within them; inter-modal fine-grained structural fusion is performed on the semantic association graph of the text modality and the semantic association graph of the image modality to obtain a multimodal fusion graph; and based on the multimodal fusion graph, the extracted multimodal features are subjected to anomaly classification to judge whether the target financial document carries risk. The scheme of the present application can realize fine-grained multimodal feature fusion of financial documents, thereby improving the accuracy of risk control detection.
To better understand the above technical solution, a detailed description follows with reference to the accompanying drawings and specific embodiments. Referring to FIG. 1, an exemplary flowchart of a large-model-based financial content risk control method 100 according to some embodiments of the present application, the method 100 mainly includes the following steps:
In step 101, a target financial document to be subjected to risk control detection is collected from a financial content platform.
It should be noted that a financial content platform in the present application refers to an online platform providing financial information, data, and related services. Such platforms typically include news, research reports, market data, analysis tools, and the like; their main function is to provide timely, accurate financial information for users to support investment decisions and investment management.
In a specific implementation, collecting the target financial document to be subjected to risk control detection from the financial content platform can be realized as follows: the target financial document can be obtained from a database of the financial content platform, where the target financial document is a multimodal data document comprising text data and image data.
In step 102, multimodal semantic feature extraction is performed on the target financial document based on the pre-trained large semantic model to obtain the text semantic space of the text modality and the image semantic space of the image modality.
In some embodiments, the multimodal semantic feature extraction of the target financial document based on the pre-trained large semantic model, yielding the text semantic space of the text modality and the image semantic space of the image modality, is obtained by the following steps:
extracting all text semantic features and image semantic features in the target financial document through the pre-trained large semantic model;
taking the set formed by all text semantic features as the text semantic space of the text modality;
and taking the set formed by all image semantic features as the image semantic space of the image modality.
It should be noted that the large semantic model in the present application is a machine learning model pre-trained on a large amount of financial document data; its architecture adopts a Transformer, though other architectures may be adopted in other embodiments, which is not limited here. It should be further noted that the principle of extracting all text semantic features in the target financial document through the pre-trained large semantic model is as follows: each word of the text in the financial document is converted into a high-dimensional vector using a word embedding technique (such as Word2Vec), and the semantic relations among the words are captured in those high-dimensional vectors, thereby obtaining the text semantic features. The principle of extracting all image semantic features in the target financial document through the pre-trained large semantic model is as follows: the images in the financial document undergo convolutional recognition through a convolutional neural network, and the obtained recognition features are taken as the image semantic features.
In a specific implementation, first, a pre-trained large semantic model (such as a Generative Pre-trained Transformer, GPT) can be used to process the text in the target financial document: the text is segmented into words, each word is represented as a vector by the large semantic model, and all resulting vectors are taken as text semantic features. Second, a convolutional neural network (CNN) is used to extract image features from the images in the financial document; these features are expressed in vector form, a plurality of semantic vectors are generated by capturing the semantic information of the images, and all generated semantic vectors are taken as image semantic features. Then, all extracted text semantic features are integrated into a set, which can be represented as a high-dimensional vector space with each vector corresponding to one text semantic feature, forming the text semantic space of the text modality. Finally, all extracted image semantic features are integrated into an image feature set and represented as a high-dimensional vector space in the same way as the text modality, obtaining the image semantic space of the image modality.
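The two extraction paths above can be illustrated with a minimal sketch, in which a toy embedding table stands in for Word2Vec/GPT word vectors and a 2x2 averaging window stands in for one CNN convolution; all names and values are illustrative assumptions.

```python
# Toy construction of the text and image semantic spaces.
EMBED = {"profit": [0.9, 0.1], "rose": [0.4, 0.6], "sharply": [0.2, 0.8]}

def text_semantic_space(text):
    # one embedding vector per word; the collection is the text-modality space
    return [EMBED[w] for w in text.lower().split() if w in EMBED]

def image_semantic_space(pixels):
    # slide a 2x2 window and average, mimicking one convolutional feature map
    feats = []
    for r in range(len(pixels) - 1):
        for c in range(len(pixels[0]) - 1):
            patch = (pixels[r][c] + pixels[r][c + 1] +
                     pixels[r + 1][c] + pixels[r + 1][c + 1]) / 4.0
            feats.append(patch)
    return feats

t_space = text_semantic_space("Profit rose sharply")
i_space = image_semantic_space([[1.0, 0.0], [0.0, 1.0]])
```

A real implementation would obtain the vectors from the pre-trained large semantic model and a trained CNN; the point of the sketch is only that each modality ends up as a set of feature vectors forming its semantic space.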
It should be noted that text semantic features in the present application refer to features capable of expressing text content and semantic information, commonly used to understand and analyze the meaning, sentiment, and topic of the text; image semantic features refer to features capable of expressing image content and semantic information, commonly used to understand and analyze the content, scene, and objects of an image.
In step 103, a semantic association tree of text elements is constructed according to the contextual relational semantics of the text semantic features in the text semantic space, and a semantic association graph of the text modality is then determined according to the semantic association tree and the syntactic structure relations of the text semantic features in the text semantic space.
In some embodiments, referring to FIG. 2, a schematic flow diagram of determining a semantic association tree in some embodiments of the present application, constructing the semantic association tree of text elements according to the contextual relational semantics of each text semantic feature in the text semantic space may be implemented by the following steps:
In step 1031, the contextual relational semantics of each text semantic feature in the text semantic space are determined through a sliding context-window analysis technique;
In step 1032, hierarchical clustering is performed on the text semantic features based on all the relational semantics to obtain a plurality of semantic association clusters;
In step 1033, according to all the semantic association clusters, each text semantic feature in the text semantic space is organized into a tree structure by semantic relation, obtaining the semantic association tree of the text elements.
It should be noted that the semantic association tree in the present application is a way of structuring and representing the semantics of text elements; it displays the relations and hierarchy among the semantic features in the text, showing clearly, in tree form, how semantic features (such as words, phrases, or sentences) are associated with one another and where they sit in the semantic space. It should also be noted that relational semantics here refers to the semantic relations between words, phrases, or sentences in a text; such relations involve not only direct relations between words but also their changes of meaning and logical relations in a specific context.
In a specific implementation, first, a fixed-size context window (for example, 5 words) is set in the text using a sliding-window technique; the window is moved step by step, semantic analysis is performed on the words in each window, and the similarity between words within the window is computed (cosine similarity can be used here), yielding a relation matrix for the words in the context window that reflects the contextual relational semantics of each text semantic feature. Then, taking all extracted relation matrices as base data, a hierarchical clustering algorithm (such as agglomerative hierarchical clustering) clusters them, grouping highly similar features into the same class and forming a plurality of semantic association clusters. Finally, according to the formed semantic association clusters, the text semantic features are organized into a tree structure by semantic relation: the root node can be a topic or central concept, and the child nodes are related semantic features. As shown in FIG. 3, a schematic diagram of the semantic association tree according to some embodiments of the present application, with 'finance' and 'market' as root nodes and 'investor' and 'confidence' as child nodes beneath them, a semantic association tree is formed.
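Steps 1031-1033 can be sketched under simplifying assumptions: a width-2 sliding window, cosine similarity as the relational semantics, a single threshold-based merge in place of full agglomerative clustering, and a one-level tree under a chosen root concept. The vocabulary, vectors, and threshold are all invented for illustration.

```python
from math import sqrt

def cosine(a, b):
    # cosine similarity between two feature vectors
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def window_similarities(words, vecs, width=2):
    # step 1031: relational semantics of words co-occurring in a sliding window
    rels = {}
    for i in range(len(words) - width + 1):
        for j in range(i + 1, i + width):
            rels[(words[i], words[j])] = cosine(vecs[words[i]], vecs[words[j]])
    return rels

def cluster(rels, threshold=0.8):
    # step 1032 (toy single-link merge): group word pairs clearing the threshold
    clusters = []
    for (a, b), sim in rels.items():
        if sim >= threshold:
            for c in clusters:
                if a in c or b in c:
                    c.update({a, b})
                    break
            else:
                clusters.append({a, b})
    return clusters

def association_tree(root, clusters):
    # step 1033: one-level tree -- root concept over each cluster's members
    return {root: [sorted(c) for c in clusters]}

vecs = {"market": [1.0, 0.0], "investor": [0.9, 0.1], "confidence": [0.0, 1.0]}
words = ["market", "investor", "confidence"]
tree = association_tree("finance", cluster(window_similarities(words, vecs)))
```

Here 'market' and 'investor' are nearly parallel vectors and fall into one cluster under the root 'finance', while 'confidence' does not clear the threshold, giving a tiny version of the tree pictured in FIG. 3.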
Through sliding context-window analysis, hierarchical clustering, and semantic association tree construction, the semantic features in the text can be extracted and organized into a structured semantic graph. The semantic association tree not only helps in understanding the deep hierarchical semantics of the text but also, in false-information detection, effectively reveals hidden semantic contradictions or unreasonable text organization, thereby improving the accuracy of the risk control detection system.
In some embodiments, determining the semantic association graph of the text modality from the semantic association tree and the syntactic structure relations of the text semantic features in the text semantic space may be implemented by:
extracting the syntactic structures of different language segments in the text of the target financial document;
determining the syntactic structure relations of the text semantic features in the text semantic space through all the syntactic structures, and generating a syntax tree based on those relations;
and performing structural semantic association between the semantic association tree and the syntax tree to generate the semantic association graph of the text modality.
It should be noted that the syntax tree in the present application is a way of structurally representing sentence structure and grammatical relations; by showing the hierarchy and dependency relations among the components of a sentence in tree form, it helps in understanding the sentence's grammatical structure and semantic meaning. It should be further noted that the semantic association graph of the text modality in the present application is a graphical data structure representing the semantic features in the text and their interrelations; it shows the semantic associations between the words and phrases of the text in the form of nodes and edges, enabling a deeper understanding of the text content and its logical structure.
In specific implementation, firstly, a syntactic analysis tool (such as Stanford Parser) may be used to parse the target financial document and extract the syntactic structure of each object segment, including phrase structures and their dependency relationships; that is, components such as the subject, predicate and object of each segment are identified, and the dependency relationships among them are marked. For example, for the sentence "the investor increases confidence in the financial market", syntactic analysis identifies "investor" as the subject, "increases" as the predicate and "confidence" as the object, and marks the dependency relationships among them. Then, a syntax tree is constructed from the extracted syntactic structure, in which each node represents a word or phrase and each edge represents a syntactic relationship between words; the dependency relationships in the syntactic structure are used to determine the syntactic structure relationships of the semantic features in the text semantic space. When the syntax tree is constructed, "investor" may serve as the root node, connected to the predicate "increases", which in turn connects to the object "confidence". Finally, the semantic association tree is compared with the syntax tree to identify their similarity and to check whether the semantic features match the syntactic structure; on this basis, nodes in the semantic association tree are matched with the corresponding nodes in the syntax tree and connected by edges that represent both the semantic and the syntactic relationships, for example connecting "investor" in the semantic association tree with its corresponding node in the syntax tree, thereby generating the semantic association graph of the text modality.
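The matching of the semantic association tree against the syntax tree described above can be sketched in a few lines. The sentence, relation labels and edge sets below are invented for illustration; a real implementation would take the syntax edges from a parser such as Stanford Parser.

```python
# Illustrative sketch (hypothetical data): merge a semantic association tree
# with a dependency-parse structure into a text-modality semantic association
# graph. Edges found in both structures are marked "semantic+syntax",
# signalling that a semantic feature matches the syntactic structure.

def build_text_semantic_graph(semantic_edges, syntax_edges):
    """Union the two edge sets, labelling each edge with its origin.

    Both inputs are iterables of (head, dependent, relation) triples.
    For edges present in both structures, the syntactic relation label
    overwrites the semantic one, but the origin records the match.
    """
    sem = {(h, d) for h, d, _ in semantic_edges}
    syn = {(h, d) for h, d, _ in syntax_edges}
    graph = {}
    for h, d, rel in list(semantic_edges) + list(syntax_edges):
        key = (h, d)
        origin = ("semantic+syntax" if key in sem and key in syn
                  else "semantic" if key in sem else "syntax")
        graph[key] = {"relation": rel, "origin": origin}
    return graph

# Toy sentence: "the investor increases confidence in the financial market"
syntax = [("increases", "investor", "nsubj"),
          ("increases", "confidence", "dobj"),
          ("confidence", "market", "nmod")]
semantic = [("increases", "investor", "agent"),
            ("increases", "confidence", "theme")]

g = build_text_semantic_graph(semantic, syntax)
```

The shared edges come out tagged "semantic+syntax", i.e. semantic features that agree with the parse, while "market" enters the graph through syntax alone.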
It should be noted that the semantic association graph of the text modality in the present application clearly shows the relationship between each semantic feature and the syntactic structure of the text, and provides rich semantic context features for the subsequent risk control detection.
In step 104, the semantic structure relationship between the semantic features of each image in the image semantic space is converted into a semantic association graph of the image modality through the overall semantic features of the image in the target financial document and the syntactic features of the visual object in the image.
In some embodiments, the conversion of the semantic structural relationship between the semantic features of each image in the image semantic space into the semantic association graph of the image modality by the overall semantic features of the image in the target financial document and the syntactic features of the visual object within the image may be implemented by the following steps:
extracting the whole semantic features of the images in the target financial document;
extracting syntactic characteristics of a visual object in an image in a target financial document;
Constructing a semantic structure relation model based on the integral semantic features and the syntactic features;
and converting the semantic structure relation among the semantic features of each image in the image semantic space into a semantic association graph of an image mode by using the semantic structure relation model.
It should be noted that the semantic association graph of the image modality in the present application is a graphical data structure representing the semantic relationships among different visual features in an image: it organizes the key visual objects, the overall semantic features and the associations among them into a graph of nodes and edges that describes the semantic associations among the elements of the image.
In specific implementation, firstly, a pre-trained deep learning model (such as ResNet) may be used to process the images in the target financial document, and the final layer of the model (usually a fully connected layer) outputs the overall semantic features of each image in vector form; these features represent the semantic information of the image. For example, when a picture containing a financial data chart is processed, the output feature vector can reflect semantic information such as the trend, colour and form of the chart. Secondly, a target detection model (such as YOLO) may be used to identify specific visual objects in the image, such as text frames, chart elements and graphics; the position information (bounding-box coordinates), category labels and mutual relationships (such as "contains" or "adjacent to") of each visual object are then extracted, and the feature vectors formed from the position information, category labels and mutual relationships serve as the syntactic features of the visual objects. Then, nodes and edges are defined: the visual objects extracted from the overall semantic features and the syntactic features are regarded as the nodes of the graph, and an adjacency matrix is constructed from the syntactic relationships between the visual objects, each element of the matrix representing the strength of the relationship between two nodes; a directed graph is then created with a graph library (such as NetworkX), and the visual-object nodes and the relationship edges between them are integrated into the graph, completing the construction of the semantic structure relationship model. Finally, the constructed semantic structure relationship model is used to recognize the semantic relationships among the semantic features of each image in the image semantic space; the semantic features of each image serve as the feature nodes of the graph, the semantic relationships among them serve as the relationship edges, and all feature nodes are integrated with their corresponding relationship edges to form the complete semantic association graph of the image modality.
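As a rough illustration of the graph construction above, the following sketch derives a "contains" relation geometrically from hypothetical bounding boxes standing in for detector output, and attaches a global-image node carrying a toy overall feature vector. All boxes, labels and vectors are invented for demonstration; a real pipeline would obtain them from a detector such as YOLO and a CNN backbone such as ResNet.

```python
# Minimal sketch (hypothetical data): build an image-modality semantic
# association graph from detector-style output. Boxes are (x1, y1, x2, y2).

def contains(outer, inner):
    """True if box `outer` fully encloses box `inner`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

def build_image_semantic_graph(objects, global_feature):
    """objects: dict name -> bounding box. Returns (nodes, edges).

    A synthetic "IMAGE" node holds the overall semantic feature vector;
    each detected object hangs off it via a "has_object" edge, and
    geometric containment yields "contains" edges between objects.
    """
    nodes = {"IMAGE": global_feature}
    nodes.update(objects)
    edges = [("IMAGE", name, "has_object") for name in objects]
    names = list(objects)
    for a in names:
        for b in names:
            if a != b and contains(objects[a], objects[b]):
                edges.append((a, b, "contains"))
    return nodes, edges

detections = {"chart": (10, 10, 200, 150),       # a chart region
              "axis_label": (20, 120, 80, 140),  # label inside the chart
              "text_box": (210, 10, 300, 50)}    # caption outside it
nodes, edges = build_image_semantic_graph(detections, [0.3, 0.8, 0.1])
```

In the toy example the axis label falls inside the chart's box, so the graph gains a ("chart", "axis_label", "contains") edge alongside the three global "has_object" edges.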
It should be noted that, in the present application, by extracting the overall semantic features of the image and the syntactic features of the visual objects and combining them with the semantic structure relationship model, the semantic feature relationships within the image can be represented as a graph structure; the semantic association graph of the image modality not only provides clear semantic structure support for understanding the image, but also lays a foundation for the subsequent multi-modal fusion and risk control detection.
In step 105, inter-mode fine granularity structure fusion is performed on the semantic association graph of the text mode and the semantic association graph of the image mode, so as to obtain a multi-mode fusion graph.
In some embodiments, the inter-mode fine-granularity structure fusion of the semantic association graph of the text mode and the semantic association graph of the image mode to obtain the multi-mode fusion graph may be implemented by the following steps:
Extracting the structure granularity of the text from the semantic association graph of the text mode;
extracting the structure granularity of the image from the semantic association graph of the image mode;
And carrying out modal fusion on the semantic association graph of the text mode and the semantic association graph of the image mode based on the structure granularity of the text and the structure granularity of the image to obtain a multi-modal fusion graph.
It should be noted that the structure granularity of the text in the present application is the degree of fineness with which the semantic structure information in the text is analysed: language units such as words, phrases and sentences are analysed and organized according to their semantic, syntactic and context-dependence relationships, and the hierarchical structure and internal semantic associations of the text are revealed by deeply analysing the relationships between the semantic units in the text. Similarly, the structure granularity of the image in the present application is the degree of fineness with which the semantic structure information in the image is analysed.
In specific implementation, firstly, a pre-trained convolutional neural network model may be used to convolve the semantic association graph of the text modality, the fine granularity of the semantic structure in the graph being identified through the convolution and taken as the structure granularity of the text. Then, the fine granularity of the semantic structure in the semantic association graph of the image modality is extracted with a pre-trained Faster R-CNN model, and the extracted result is taken as the structure granularity of the image. Finally, the semantic association graph of the text and that of the image may be aligned through a cross-modal attention mechanism (such as multi-head self-attention), and the two graphs are then mapped into the same feature space through two fully connected layers so as to bridge the difference between the representations of the different modalities and capture the association information between them. Each node in the semantic association graph of the text modality is connected to each node in the semantic association graph of the image modality; establishing connecting edges between nodes of different modalities in this way, rather than combining the modalities by addition or concatenation, helps the model learn the associations between the objects of different modalities at a finer granularity. A graph neural network (Graph Neural Network, GNN) is then used to fuse the fine-grained features of the image and the text, reducing the semantic feature differences between the modalities, and the output of the graph neural network is taken as the multi-modal fusion graph.
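The alignment and fusion step can be illustrated with a bare-bones, dependency-free sketch: text nodes attend over image nodes by dot-product similarity (a stand-in for the multi-head cross-modal attention), and one round of message passing mixes features across the resulting cross-modal edges. The feature vectors are toy values assumed to already lie in the shared space produced by the projection layers; a real system would use learned embeddings and a trained graph neural network.

```python
# Hedged sketch of cross-modal fusion: attention-weighted edges from each
# text node to every image node, followed by one message-passing update.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fuse(text_nodes, image_nodes):
    """Return cross-modal attention weights and updated text features.

    text_nodes / image_nodes: dict name -> feature vector of equal length
    (i.e. already projected into the shared feature space).
    """
    weights, fused = {}, {}
    img_names = list(image_nodes)
    for t, tv in text_nodes.items():
        scores = softmax([dot(tv, image_nodes[i]) for i in img_names])
        weights[t] = dict(zip(img_names, scores))
        # One message-passing step: average the text feature with the
        # attention-weighted sum of the image features it attends to.
        msg = [sum(w * image_nodes[i][k] for i, w in weights[t].items())
               for k in range(len(tv))]
        fused[t] = [(a + b) / 2 for a, b in zip(tv, msg)]
    return weights, fused

text = {"confidence": [1.0, 0.0], "market": [0.0, 1.0]}
image = {"chart": [0.9, 0.1], "text_box": [0.1, 0.9]}
w, fused = fuse(text, image)
```

As expected, "confidence" attends more strongly to the similar "chart" node than to "text_box", so the cross-modal edges carry graded rather than uniform weights.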
In step 106, multi-modal features are extracted from the multi-modal fusion graph and subjected to abnormal classification; if the classification result is abnormal, it is determined that the target financial document is at risk.
In some embodiments, extracting the multi-modal features from the multi-modal fusion graph, performing abnormal classification on the extracted multi-modal features and, if the classification result is abnormal, determining that the target financial document is at risk may be implemented by the following steps:
Extracting node characteristics and edge characteristics from the multi-mode fusion graph;
And carrying out abnormal classification on the node characteristics and the edge characteristics, and if the classification result is abnormal, determining that the target financial document has risks.
In specific implementation, firstly, a feature encoder of the prior art may be used to read the intra-modal and inter-modal node features and edge features from the multi-modal fusion graph. Then, the node features and edge features may be mapped into an association feature space, which is converted through a convolution layer into sentence-word-level association features and semantic-level association features between the modalities; these features are fed into a pre-trained classifier, which classifies the extracted multi-modal features, and whether the target financial document is at risk is judged from the classification result: when the output of the classifier is abnormal, the target financial document is judged to be at risk.
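The final classification step might look as follows in miniature: simple hand-crafted node and edge statistics are read off a toy fusion graph, and an invented linear decision rule stands in for the trained classifier (for example a support vector machine). Thresholds, weights and the example graph are all hypothetical.

```python
# Illustrative sketch (invented thresholds): read node and edge features
# off a multi-modal fusion graph and apply a stand-in linear scorer.

def graph_features(nodes, edges):
    """Simple hand-crafted features: node count, edge count, and the
    fraction of cross-modal edges (edges whose endpoints differ in mode)."""
    cross = sum(1 for a, b, _ in edges
                if nodes[a]["mode"] != nodes[b]["mode"])
    return [len(nodes), len(edges), cross / max(len(edges), 1)]

def is_risky(features, weights=(0.0, 0.1, 2.0), bias=-1.0):
    """Linear decision rule standing in for a trained classifier:
    a score above zero classifies the document as risky."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return score > 0

nodes = {"t1": {"mode": "text"}, "t2": {"mode": "text"},
         "i1": {"mode": "image"}}
edges = [("t1", "t2", "syntax"), ("t1", "i1", "cross"),
         ("t2", "i1", "cross")]
feats = graph_features(nodes, edges)
```

In practice the hand-crafted statistics would be replaced by the encoder-derived sentence-word-level and semantic-level association features, and the linear rule by the trained classifier.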
It should be noted that, the classifier in the present application may use a support vector machine, and in other embodiments, a classifier of another type may also be used, which is not limited herein.
In another aspect, in some embodiments, the present application provides a large-model-based financial content risk control system. Referring to fig. 4, which is a schematic diagram of exemplary software modules of a large-model-based financial content risk control system according to some embodiments of the present application, the large-model-based financial content risk control system 400 comprises an acquisition module 401, a processing module 402 and an execution module 403, which are respectively described as follows:
the acquisition module 401 is mainly used for acquiring, from a financial content platform, the target financial document to be subjected to risk control detection;
The processing module 402 is used for extracting multi-mode semantic features of the target financial document based on the pre-trained semantic big model to obtain a text semantic space of a text mode and an image semantic space of an image mode;
The processing module 402 is further configured to construct a semantic association tree of text elements according to the relationship semantics of the context of each text semantic feature in the text semantic space, and further determine a semantic association graph of a text modality according to the semantic association tree and the syntactic structure relationship of the text semantic feature in the text semantic space;
The processing module 402 is further configured to convert semantic structural relationships between semantic features of each image in the image semantic space into a semantic association graph of an image modality through integral semantic features of the image in the target financial document and syntactic features of the visual object in the image;
The processing module 402 is further configured to perform inter-mode fine granularity structure fusion on the semantic association graph of the text mode and the semantic association graph of the image mode to obtain a multi-mode fusion graph;
the execution module 403 is mainly configured to extract multi-modal features from the multi-modal fusion graph, perform abnormal classification on the extracted multi-modal features, and determine that the target financial document has risk if the classification result is abnormal.
In addition, the present application also provides a computer device comprising a memory and a processor, wherein the memory stores code and the processor is configured to acquire the code and execute the above large-model-based financial content risk control method.
In some embodiments, reference is made to FIG. 5, which is a schematic diagram of a computer device implementing a large-model-based financial content risk control method according to some embodiments of the present application. The large-model-based financial content risk control method in the above embodiments may be implemented by the computer device shown in fig. 5, where the computer device 500 includes at least one processor 501, a communication bus 502, a memory 503, and at least one communication interface 504.
The processor 501 may be a general purpose central processing unit (central processing unit, CPU) or an application-specific integrated circuit (ASIC).
Communication bus 502 may be used to transfer information between the above-described components.
The memory 503 may be, but is not limited to, a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk or other magnetic storage device, or any other medium that can carry or store the desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 503 may be separate and coupled to the processor 501 via the communication bus 502, or may be integrated with the processor 501.
The memory 503 is used to store the program code for executing the solution of the present application, and its execution is controlled by the processor 501. The processor 501 is configured to execute the program code stored in the memory 503, which may include one or more software modules. The large-model-based financial content risk control method in the above embodiments may be implemented by the processor 501 and one or more software modules of the program code in the memory 503.
Communication interface 504, using any transceiver-like device, is used to communicate with other devices or communication networks, such as ethernet, radio access network (radio access network, RAN), wireless local area network (wireless local area networks, WLAN), etc.
In a specific implementation, as an embodiment, a computer device may include a plurality of processors, where each of the processors may be a single-core (single-CPU) processor or may be a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
The computer device may be a general-purpose computer device or a special-purpose computer device. In a specific implementation, the computer device may be a desktop computer, a laptop computer, a web server, a personal digital assistant (PDA), a mobile handset, a tablet, a wireless terminal device, a communication device, or an embedded device. Embodiments of the present application do not limit the type of the computer device.
In addition, the present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above large-model-based financial content risk control method.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

CN202411435306.XA2024-10-152024-10-15 A financial content risk control method and system based on a large modelActiveCN118965279B (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN202411435306.XACN118965279B (en)2024-10-152024-10-15 A financial content risk control method and system based on a large model

Publications (2)

Publication NumberPublication Date
CN118965279Atrue CN118965279A (en)2024-11-15
CN118965279B CN118965279B (en)2024-12-13

Family

ID=93407753

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN202411435306.XAActiveCN118965279B (en)2024-10-152024-10-15 A financial content risk control method and system based on a large model

Country Status (1)

CountryLink
CN (1)CN118965279B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN120145402A (en)*2025-05-142025-06-13北京科技大学 A large model defense method based on multi-view image-text conversion

Citations (5)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN115100664A (en)*2022-06-202022-09-23济南大学Multi-mode false news identification method and system based on correlation information expansion
CN116737979A (en)*2023-06-192023-09-12山东财经大学Context-guided multi-modal-associated image text retrieval method and system
US20230386238A1 (en)*2021-09-222023-11-30Tencent Technology (Shenzhen) Company LimitedData processing method and apparatus, computer device, and storage medium
CN118133839A (en)*2024-03-012024-06-04齐鲁工业大学(山东省科学院) Image-text retrieval method and system based on semantic information reasoning and cross-modal interaction
CN118312922A (en)*2024-06-052024-07-09陕西淘丁实业集团有限公司Multi-mode network content security intelligent auditing system and method thereof


Also Published As

Publication numberPublication date
CN118965279B (en)2024-12-13


Legal Events

DateCodeTitleDescription
PB01Publication
SE01Entry into force of request for substantive examination
GR01Patent grant
