CN119967457A - Signaling analysis method, system, electronic device and storage medium


Info

Publication number
CN119967457A
Authority
CN
China
Prior art keywords
information
signaling
analysis
text
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411935906.2A
Other languages
Chinese (zh)
Inventor
陈鑫远
赵元哲
左绘
兰卓睿
陈洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi IoT Technology Co Ltd
Original Assignee
Tianyi IoT Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi IoT Technology Co Ltd
Priority to CN202411935906.2A
Publication of CN119967457A
Legal status: Pending


Abstract

Translated from Chinese

The present application discloses a signaling analysis method, system, electronic device and storage medium. The method comprises: obtaining signaling text information; loading a retrieval-augmented generation (RAG) database, fusing a multi-layer structure knowledge graph, and performing retrieval analysis on the signaling text information to obtain prompt words and retrieval results, wherein the multi-layer structure knowledge graph is obtained by using a large language model to construct a multi-layer data set from domain knowledge; and inputting the prompt words and the retrieval results into the large language model to obtain an analysis conclusion. By combining retrieval-augmented generation with a large language model, the embodiments of the present application help improve the quality and efficiency of signaling analysis. The present application can be widely used in the field of Internet of Things technology.

Description

Signaling analysis method, system, electronic equipment and storage medium
Technical Field
The present application relates to the field of internet of things, and in particular, to a signaling analysis method, a system, an electronic device, and a storage medium.
Background
Signaling analysis is the basis of communication network operation and maintenance. Most operators use the Wireshark tool to perform in-depth analysis of code streams and diagnose network problems, a method that depends heavily on the professional skill of technicians and is inefficient. The main difficulties are as follows:
Massive data screening is difficult: when a large amount of network traffic is captured, it is hard to locate a specific signaling flow accurately within the huge volume of data; screening and filtering take considerable time and effort, and key signaling is easily missed.
Protocol parsing is complex: using Wireshark efficiently for signaling analysis requires a deep understanding of the various network protocols, including their structure, operating principles and common signaling procedures. Even for professional engineers, end-to-end signaling analysis in a 5G private network environment, spanning the wireless, transport and core networks through to the user MEC, requires correlation analysis across multiple specialties.
Disclosure of Invention
The embodiments of the present application mainly aim to provide an efficient signaling analysis method, system, electronic device and storage medium.
To achieve the above object, one aspect of the embodiments of the present application provides a signaling analysis method, comprising: obtaining signaling text information; loading a retrieval-augmented generation (RAG) database, fusing a multi-layer structure knowledge graph, and performing retrieval analysis on the signaling text information to obtain prompt words and retrieval results, wherein the multi-layer structure knowledge graph is obtained by using a large language model to construct a multi-layer data set from domain knowledge; and inputting the prompt words and the retrieval results into the large language model to obtain an analysis conclusion. By combining retrieval-augmented generation with a large language model, the embodiments of the present application help improve the quality and efficiency of signaling analysis.
In some embodiments of the method, the multi-layer structure knowledge graph is built by the following steps:
acquiring first text information from a signaling domain knowledge base;
performing segmentation and element extraction on the first text information to obtain vector information;
linking, according to the hierarchical relationship constructed for the knowledge base, the entities corresponding to each level in the vector information to obtain graph information;
and merging the graph information to obtain the multi-layer structure knowledge graph.
In some embodiments of the method, linking the entities corresponding to each level in the vector information according to the hierarchical relationship constructed for the knowledge base to obtain the graph information includes:
taking the directly captured information in the vector information as a first entity of a bottom layer;
determining the relevance of the vector information through the large language model, taking the complex paraphrase information in the vector information as a second entity of a second layer, and linking the second entity with the first entity;
and taking the clearly defined information in the vector information as a third entity of a third layer, and linking the third entity with the second entity to obtain the graph information.
In some embodiments of the method, performing segmentation and element extraction on the first text information to obtain the vector information includes:
performing text segmentation on the first text information to obtain text blocks;
extracting elements from the text blocks through a BERT model to obtain text vectors;
and performing mean pooling and text similarity processing on all the text vectors to obtain the vector information.
In some embodiments of the method, loading the retrieval-augmented generation (RAG) database, fusing the multi-layer structure knowledge graph, and performing retrieval analysis on the signaling text information to obtain the prompt words and the retrieval results includes:
performing association analysis on the signaling text information based on the multi-layer structure knowledge graph to determine keywords;
performing a hierarchical search based on the keywords through a matching process in the multi-layer structure knowledge graph, and performing entity activation and knowledge collection based on the search hits to obtain primary information;
merging the primary information with the signaling text information, and searching through a hybrid retriever to obtain secondary information;
screening the secondary information using the clearly defined information in the knowledge base to obtain tertiary information;
and fusing the primary information, the secondary information and the tertiary information to obtain the prompt words and the retrieval results.
In some embodiments of the method, the hybrid retriever determines the secondary information by:
determining a score function from the ranking of the documents to be retrieved;
determining a relevance function from the score function and the relevance scores of the documents to be retrieved;
determining a retrieval score from the relevance function of the primary information and the relevance function of the signaling text information;
and determining the secondary information according to the retrieval score.
In some embodiments, the method further includes:
constructing an architecture for signaling analysis, the architecture comprising:
a computing network resource layer, used for providing the resources for constructing the knowledge base;
a technical layer, used for establishing the large language model;
an agent layer, used for implementing signaling analysis based on expert orchestration and task scheduling;
and a small model layer, used for signaling analysis in specific domains.
To achieve the above object, another aspect of the embodiments of the present application provides a signaling analysis system, including:
a first module, used for acquiring signaling text information;
a second module, used for loading a retrieval-augmented generation (RAG) database, fusing a multi-layer structure knowledge graph, and performing retrieval analysis on the signaling text information to obtain prompt words and retrieval results;
and a third module, used for inputting the prompt words and the retrieval results into a large language model to obtain an analysis conclusion.
To achieve the above object, another aspect of the embodiments of the present application provides an electronic device, which includes a memory storing a computer program and a processor implementing the above method when executing the computer program.
To achieve the above object, another aspect of the embodiments of the present application proposes a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-mentioned method.
The embodiments of the present application provide at least the following beneficial effects: signaling text information is obtained; a retrieval-augmented generation (RAG) database is loaded, a multi-layer structure knowledge graph is fused, and retrieval analysis is performed on the signaling text information to obtain prompt words and retrieval results, the multi-layer structure knowledge graph being obtained by using a large language model to construct a multi-layer data set from domain knowledge; and the prompt words and the retrieval results are input into the large language model to obtain an analysis conclusion. By combining retrieval-augmented generation with a large language model, the embodiments of the present application help improve the quality and efficiency of signaling analysis.
Drawings
FIG. 1 is an interface diagram of a Wireshark-based analysis process in the related art;
FIG. 2 is an application scenario diagram of a DPI-based analysis process in the related art;
FIG. 3 is a flow chart of one embodiment of a signaling analysis method provided by the present application;
FIG. 4 is a flow chart of another embodiment of a signaling analysis method provided by the present application;
FIG. 5 is a flow chart of one embodiment of a retrieval enhancement based retrieval process provided by the present application;
FIG. 6 is a flow chart of one embodiment of a signaling analysis architecture provided by the present application;
FIG. 7 is an interface diagram of one embodiment of a signaling analysis process provided by the present application;
FIG. 8 is an interface diagram of another embodiment of a signaling analysis process provided by the present application;
FIG. 9 is a schematic structural diagram of a signaling analysis system according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with embodiments of the application, but are merely examples of apparatuses and methods consistent with aspects of embodiments of the application as detailed in the accompanying claims.
It is to be understood that the terms "first," "second," and the like, as used herein, may be used to describe various concepts, but the concepts are not limited by these terms unless otherwise specified. These terms are only used to distinguish one concept from another. For example, the first information may also be referred to as second information, and similarly, the second information may also be referred to as first information, without departing from the scope of embodiments of the present application. The word "if", as used herein, may be interpreted as "when" or "in response to a determination", depending on the context.
As used herein, "at least one" includes one, two or more; "a plurality" includes two or more; "each" refers to every one of the corresponding plurality; and "any" refers to any one of the plurality.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
Before the embodiments of the present application are described in detail, some of the terms and expressions involved in the embodiments are explained first; the following explanations apply to these terms and expressions.
Signaling analysis is the technical process of capturing, parsing, monitoring and analyzing control signaling messages in a communication system. Signaling generally refers to control information communicated between network nodes (e.g., base stations, switches, servers, etc.) that is responsible for managing and coordinating call setup, maintenance and termination, as well as other management and configuration tasks of the network. Analyzing the signaling helps to understand the operation of the communication system, discover potential problems, evaluate quality of service, support troubleshooting, and optimize network configuration.
Deep Packet Inspection (DPI) is an application-layer-based traffic detection and control technique that not only examines the header information of packets, but also recognizes different application-layer protocols, user behavior, traffic types, etc., and uses it to monitor network traffic, track user behavior, recognize and block network attacks, etc.
An evolved packet core network (Evolved Packet Core, EPC), i.e. a 4G core network, comprises a Mobility Management Entity (MME), a serving gateway (S-GW), a packet data network gateway (P-GW) and a Home Subscriber Server (HSS).
The 5G core network (5G Core Network, 5GC) comprises network elements for control plane functions (AMF, SMF, PCF), the user plane function (UPF), and network elements for data storage and management functions (UDM), among others.
An IP multimedia subsystem (IP Multimedia Subsystem, IMS) is an IP-based network architecture for providing multimedia communication services.
The large language model (Large Language Model, LLM) is an artificial intelligence algorithm trained using deep learning techniques and massive data, and is mainly used for processing and understanding human language.
A Knowledge Graph/Vault (KG), also known as a scientific knowledge graph, uses various visualization technologies to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and the interrelationships among knowledge items.
The Resource Description Framework (RDF) is a data model that can be represented using XML syntax. RDF describes the properties of resources and the relationships between resources in the form of triples. One storage approach writes the triple data line by line as text; another stores the knowledge graph in a relational database, where the graph data is represented as triples and each triple is recorded as a row in a table.
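The relational storage approach described above can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3` module, not the storage scheme of the application; the table name, the predicate names (`is_part_of`, `type`) and the helper functions are assumptions made for the example, while the network elements are those listed in the term explanations above.

```python
import sqlite3

# Hypothetical triples for illustration; predicate names are assumptions.
TRIPLES = [
    ("AMF", "is_part_of", "5GC"),
    ("SMF", "is_part_of", "5GC"),
    ("UPF", "is_part_of", "5GC"),
    ("MME", "is_part_of", "EPC"),
    ("AMF", "type", "control_plane_function"),
]


def build_triple_store(triples):
    """Store each (subject, predicate, object) triple as one row of a table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    conn.commit()
    return conn


def subjects_of(conn, predicate, obj):
    """Return all subjects linked to `obj` by `predicate`, sorted for stable output."""
    rows = conn.execute(
        "SELECT subject FROM triples WHERE predicate = ? AND object = ? ORDER BY subject",
        (predicate, obj),
    )
    return [row[0] for row in rows]


conn = build_triple_store(TRIPLES)
members = subjects_of(conn, "is_part_of", "5GC")
print(members)
```

Because every triple is a row, a relationship query reduces to an ordinary `SELECT` with a filter on the predicate and object columns.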
Retrieval-augmented generation (Retrieval-Augmented Generation, RAG) is an artificial intelligence technique that combines information retrieval with language generation. Its main aim is to improve the accuracy, reliability and information richness of the output of a generative artificial intelligence model by using information retrieved from external sources.
Signaling analysis is the basis of communication network operation and maintenance. Most operators use the Wireshark tool to perform in-depth analysis of code streams and diagnose network problems, a method that depends heavily on the professional skill of technicians and is inefficient. The main difficulties are as follows:
Massive data screening is difficult: when a large amount of network traffic is captured, it is hard to locate a specific signaling flow accurately within the huge volume of data; screening and filtering take considerable time and effort, and key signaling is easily missed.
Protocol parsing is complex: using Wireshark efficiently for signaling analysis requires a deep understanding of the various network protocols, including their structure, operating principles and common signaling procedures. Even for professional engineers, end-to-end signaling analysis in a 5G private network environment, spanning the wireless, transport and core networks through to the user MEC, requires correlation analysis across multiple specialties.
Analysis reports are ambiguous: although a large model can enrich the knowledge content of the analysis conclusion, it is prone to hallucination and may produce erroneous judgments and inferences.
Visualization and ease of use are lacking: the Wireshark interface is relatively complex, with functional options and parameter settings that make it hard to use. Although Wireshark can display the details of signaling data, it offers little visualization; complex signaling flows have no intuitive graphical presentation, which makes understanding and analysis more difficult.
As shown in FIG. 1, a typical Wireshark analysis requires examining the interaction flow and content of the signaling line by line and frame by frame in order to identify and confirm problems in the interaction.
In 5G private networks sunk to the customer premises, signaling analysis software mainly exists in the form of DPI sensing probes. Private network operation and maintenance also lacks language and interface presentation that end users can read and understand, which is needed to explain the specialized analysis results of the DPI sensing probes to customers and to support in-depth self-operation, self-maintenance and self-diagnosis of the private network by the customer. As shown in FIG. 2, the DPI mainly performs collection around the edge UPF; the collection interfaces include N3, N4 and N9, among others, and XDR code stream files are generated.
In view of this, the embodiments of the present application provide a signaling analysis method that aims to improve the quality and efficiency of signaling analysis. The application can be used in the fields of software technology and network communication.
The embodiments of the present application provide a signaling analysis method and relate to the technical field of the Internet of Things. The signaling analysis method can be applied to a terminal, to a server, or to software running in the terminal or the server. In some embodiments, the terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted terminal, etc. The server may be configured as an independent physical server, as a server cluster or distributed system formed by a plurality of physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, big data and artificial intelligence platforms; the server may also be a node server in a blockchain network. The software may be an application implementing the signaling analysis method, etc., but is not limited to the above forms.
The application is operational with numerous general purpose or special purpose computer system environments or configurations. Such as a personal computer, a server computer, a hand-held or portable device, a tablet device, a multiprocessor system, a microprocessor-based system, a set top box, a programmable consumer electronics, a network PC, a minicomputer, a mainframe computer, a distributed computing environment that includes any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
It should be noted that, in each specific embodiment of the present application, when related processing is required according to user information, user behavior data, user history data, user location information, and other data related to user identity or characteristics, permission or consent of the user is obtained first, and the collection, use, processing, and the like of the data comply with related laws and regulations and standards. In addition, when the embodiment of the application needs to acquire the sensitive personal information of the user, the independent permission or independent consent of the user is acquired through popup or jump to a confirmation page and the like, and after the independent permission or independent consent of the user is definitely acquired, the necessary relevant data of the user for enabling the embodiment of the application to normally operate is acquired.
FIG. 3 is an optional flowchart of a signaling analysis method according to an embodiment of the present application. The method in FIG. 3 may include, but is not limited to, steps S100 to S300.
Step S100, obtaining signaling text information;
Step S200, loading a retrieval-augmented generation (RAG) database, fusing a multi-layer structure knowledge graph, and performing retrieval analysis on the signaling text information to obtain prompt words and retrieval results;
Step S300, inputting the prompt words and the retrieval results into a large language model to obtain an analysis conclusion.
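The overall flow of steps S100 to S300 can be sketched as below. This is only an illustrative skeleton under stated assumptions: the knowledge entries, the function names (`retrieve`, `analyze`, `echo_llm`) and the prompt layout are hypothetical, the retrieval step is reduced to a substring lookup, and `echo_llm` is a stand-in that does not call any real model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical knowledge entries; a real system would query the RAG database and graph.
KNOWLEDGE: Dict[str, str] = {
    "Registration Reject": "the network refused the registration request of the UE",
    "PDU Session Establishment Reject": "the network refused to set up the requested session",
}


def retrieve(signaling_text: str, knowledge: Dict[str, str]) -> Tuple[str, List[str]]:
    """Step S200 (simplified): return prompt words and the matching knowledge entries."""
    hits = [f"{term}: {note}" for term, note in knowledge.items() if term in signaling_text]
    prompt = "Analyze the following signaling trace using the retrieved context."
    return prompt, hits


def analyze(signaling_text: str, knowledge: Dict[str, str], llm: Callable[[str], str]) -> str:
    """Steps S100 to S300: take the text, retrieve context, then call the model."""
    prompt, hits = retrieve(signaling_text, knowledge)  # S200
    context = "\n".join(hits)
    full_prompt = f"{prompt}\n\nContext:\n{context}\n\nSignaling:\n{signaling_text}"
    return llm(full_prompt)  # S300


def echo_llm(prompt: str) -> str:
    """Stand-in for a large language model: reports how many context lines it received."""
    context = prompt.split("Context:\n", 1)[1].split("\n\nSignaling:", 1)[0]
    count = len([line for line in context.split("\n") if line])
    return f"conclusion based on {count} retrieved item(s)"


trace = "UE -> AMF: Registration Request; AMF -> UE: Registration Reject"  # S100
conclusion = analyze(trace, KNOWLEDGE, echo_llm)
print(conclusion)
```

Passing the model as a callable keeps the retrieval logic testable without a model endpoint; any function that maps a prompt string to a reply string could be substituted.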
The embodiments of the present application provide a method for constructing an intelligent signaling analysis agent based on an AI large model. Combined with the signaling analysis workflow, it intelligently upgrades traditional signaling analysis means and integrates wireless/core network signaling data analysis and processing capability with the language understanding and content generation capability of the large model. It realizes automatic understanding, analysis and reasoning over network signaling, as well as retrieval-based question answering over signaling knowledge, and improves the quality and efficiency of complaint analysis and network optimization analysis. Referring to FIG. 4, the application realizes highly automated, intelligent signaling analysis based on a large model, and enhances the reliability of signaling analysis through the fusion of RAG and the knowledge graph.
In some embodiments of the method, the multi-layer structure knowledge graph is built by the following steps:
acquiring first text information from a signaling domain knowledge base;
performing segmentation and element extraction on the first text information to obtain vector information;
linking, according to the hierarchical relationship constructed for the knowledge base, the entities corresponding to each level in the vector information to obtain graph information;
and merging the graph information to obtain the multi-layer structure knowledge graph.
In some possible implementations, entity relationships in the signaling domain are extracted through the LLM to construct a three-layer data set of domain knowledge; the LLM is also used to segment the text into blocks and extract tag information, so that the graph is constructed automatically.
In some embodiments of the method, linking the entities corresponding to each level in the vector information according to the hierarchical relationship constructed for the knowledge base to obtain the graph information includes:
taking the directly captured information in the vector information as a first entity of the bottom layer;
determining the relevance of the vector information through the large language model, taking the complex paraphrase information in the vector information as a second entity of a second layer, and linking the second entity with the first entity;
and taking the clearly defined information in the vector information as a third entity of a third layer, and linking the third entity with the second entity to obtain the graph information.
In some possible implementations, the present application provides hierarchical linking: communication signaling is a specialized field that uses a precise terminology system and builds on many well-defined definitions and relationships. The application also provides graph construction: the LLM is used to identify specific entity objects and their relationships in the text of the signaling domain knowledge base, convert them into a structured knowledge representation, and present that representation as a graph. In some embodiments, the application constructs a three-layer linked knowledge database for communication signaling to form a comprehensive signaling analysis knowledge graph. The bottom layer consists of basic information entities; the second layer is a simple knowledge graph layer constructed after the LLM links it with the bottom-layer entities; and the third layer is a signaling display and analysis layer formed after the LLM links it with the second-layer entities.
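The three-layer linking described above can be pictured with a small data structure. The sketch below is an assumption-laden illustration rather than the application's implementation: the class names (`Entity`, `LayeredGraph`), the entity names and texts are hypothetical, and the links are added by hand, whereas the application determines relevance through the LLM.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Entity:
    name: str
    layer: int  # 1 = bottom (directly captured), 2 = paraphrase, 3 = clear definition
    text: str
    links: Set[str] = field(default_factory=set)  # names of entities one layer below


class LayeredGraph:
    """Hypothetical container for a three-layer linked knowledge graph."""

    def __init__(self) -> None:
        self.entities: Dict[str, Entity] = {}

    def add(self, name: str, layer: int, text: str) -> None:
        self.entities[name] = Entity(name, layer, text)

    def link(self, upper: str, lower: str) -> None:
        """Link an entity to one in the layer directly below it."""
        if self.entities[upper].layer != self.entities[lower].layer + 1:
            raise ValueError("links must connect adjacent layers")
        self.entities[upper].links.add(lower)

    def descend(self, name: str) -> List[str]:
        """Collect entity names reachable by following links down to the bottom layer."""
        found, stack = [], [name]
        while stack:
            current = stack.pop()
            found.append(current)
            stack.extend(sorted(self.entities[current].links, reverse=True))
        return found


graph = LayeredGraph()
graph.add("msg:Registration Reject", 1, "AMF -> UE: Registration Reject")
graph.add("note:registration failure", 2, "The network refused the registration attempt.")
graph.add("def:Registration procedure", 3, "Procedure by which a UE registers with the 5G core.")
graph.link("note:registration failure", "msg:Registration Reject")
graph.link("def:Registration procedure", "note:registration failure")
path = graph.descend("def:Registration procedure")
print(path)
```

Restricting links to adjacent layers mirrors the first-to-second and second-to-third linking in the steps above, and makes a top-down traversal terminate at captured signaling.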
In some embodiments of the method, performing segmentation and element extraction on the first text information to obtain the vector information includes:
performing text segmentation on the first text information to obtain text blocks;
extracting elements from the text blocks through the BERT model to obtain text vectors;
and performing mean pooling and text similarity processing on all the text vectors to obtain the vector information.
In some possible implementations, there are many element extraction methods, and those skilled in the art may perform element extraction by other methods; the application does not specifically limit this. It can be understood that the application segments the signaling text into semantically coherent blocks according to the preceding and following context, using a sliding window that processes a fixed number of paragraphs at a time and is adjusted continuously, so that the coherence of the signaling information is preserved. The LLM is used to identify and extract graph node instances from the source text blocks, and text vectorization with a BERT model provides example guidance for LLM training.
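The sliding-window segmentation, mean pooling and similarity steps can be sketched as follows. To stay self-contained, the sketch replaces the BERT model with a deterministic hash-based placeholder (`toy_token_vector`), which carries no semantic meaning; the window size, step and all function names are assumptions chosen for illustration.

```python
import hashlib
import math
from typing import List


def sliding_windows(paragraphs: List[str], size: int, step: int) -> List[List[str]]:
    """Return overlapping windows of `size` paragraphs, advancing by `step`."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    last_start = max(len(paragraphs) - size, 0)
    return [paragraphs[start:start + size] for start in range(0, last_start + 1, step)]


def toy_token_vector(token: str, dim: int = 8) -> List[float]:
    """Deterministic placeholder for a BERT token embedding (NOT a real model)."""
    digest = hashlib.md5(token.lower().encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def mean_pool(vectors: List[List[float]]) -> List[float]:
    """Average token vectors component by component into one block vector."""
    dim = len(vectors[0])
    return [sum(vector[i] for vector in vectors) / len(vectors) for i in range(dim)]


def embed(text: str) -> List[float]:
    tokens = text.split()
    if not tokens:
        raise ValueError("cannot embed empty text")
    return mean_pool([toy_token_vector(token) for token in tokens])


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two block vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


paragraphs = [
    "Registration Request sent",
    "Authentication Request received",
    "Authentication Response sent",
    "Registration Reject received",
    "Connection released",
]
windows = sliding_windows(paragraphs, size=3, step=2)
chunk_vectors = [embed(" ".join(window)) for window in windows]
similarity = cosine(chunk_vectors[0], chunk_vectors[1])
```

With a real encoder, the similarity between adjacent windows could be used to decide where to move or resize the window; here it only demonstrates the arithmetic.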
In some embodiments, referring to FIG. 5, loading the retrieval-augmented generation (RAG) database, fusing the multi-layer structure knowledge graph, and performing retrieval analysis on the signaling text information to obtain the prompt words and the retrieval results includes:
Step 210, performing association analysis on the signaling text information based on the multi-layer structure knowledge graph to determine keywords;
Step 220, performing a hierarchical search based on the keywords through a matching process in the multi-layer structure knowledge graph, and performing entity activation and knowledge collection based on the search hits to obtain primary information;
Step 230, merging the primary information with the signaling text information, and searching through a hybrid retriever to obtain secondary information;
Step 240, screening the secondary information using the clearly defined information in the knowledge base to obtain tertiary information;
Step 250, fusing the primary information, the secondary information and the tertiary information to obtain the prompt words and the retrieval results.
In some possible implementations, the input signaling is analyzed automatically using retrieval-augmented generation (RAG). It can be understood that the application adopts an efficient strategy retrieval method based on first-pass STM-R as the processing procedure of the RAG technique; this pass outputs the entity content and the related content within range, which serve as the input to the secondary LLM-enhanced retrieval.
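Steps 210 to 250 can be walked through on toy data as below. Everything in this sketch is a simplifying assumption: the miniature graph and documents are hypothetical, keyword detection is a substring match, the hybrid retriever of step 230 is replaced by a shared-word count, and the screening of step 240 keeps only documents that mention a third-layer entity name.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical miniature graph: entity name -> (layer, text, linked lower-layer entities).
GRAPH: Dict[str, Tuple[int, str, List[str]]] = {
    "registration": (3, "Registration is the procedure by which a UE registers with the 5G core.", ["registration reject"]),
    "registration reject": (2, "A reject means the network refused the registration attempt.", ["amf"]),
    "amf": (1, "The AMF is the access and mobility management function.", []),
}
DOCUMENTS = [
    "Troubleshooting note: check subscription data when a registration reject occurs.",
    "Release note: UPF throughput counters were renamed.",
    "Guide: the AMF selects an SMF for session management.",
]


def words(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_keywords(text: str, graph) -> List[str]:
    """Step 210: keywords are graph entity names that occur in the signaling text."""
    lowered = text.lower()
    return sorted(name for name in graph if name in lowered)


def collect_primary(keywords: List[str], graph) -> List[str]:
    """Step 220: activate matched entities, follow links downward, collect their texts."""
    seen, texts, stack = set(), [], list(reversed(keywords))
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        _layer, text, links = graph[name]
        texts.append(text)
        stack.extend(reversed(links))
    return texts


def search_documents(query: str, documents: List[str]) -> List[str]:
    """Step 230 (stand-in for the hybrid retriever): rank documents by shared words."""
    query_words = words(query)
    scored = [(len(query_words & words(doc)), doc) for doc in documents]
    return [doc for score, doc in sorted(scored, key=lambda pair: -pair[0]) if score > 0]


def retrieve(text: str, graph, documents: List[str]) -> Tuple[str, List[str]]:
    keywords = find_keywords(text, graph)
    primary = collect_primary(keywords, graph)
    secondary = search_documents(" ".join(primary + [text]), documents)
    definitions = [name for name, (layer, _t, _l) in graph.items() if layer == 3]
    tertiary = [doc for doc in secondary if any(name in doc.lower() for name in definitions)]  # step 240
    results = list(dict.fromkeys(primary + secondary + tertiary))  # step 250: fuse, drop duplicates
    prompt = "Explain this signaling trace. Keywords: " + ", ".join(keywords)
    return prompt, results


TRACE = "UE -> AMF: Registration Request; AMF -> UE: Registration Reject"
prompt, results = retrieve(TRACE, GRAPH, DOCUMENTS)
```

The unrelated release note shares no words with the merged query and therefore never enters the fused results, which is the effect the staged retrieval is meant to achieve.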
In some embodiments of the method, the hybrid retriever determines the secondary information by:
determining a score function from the ranking of the documents to be retrieved;
determining a relevance function from the score function and the relevance scores of the documents to be retrieved;
determining a retrieval score from the relevance function of the primary information and the relevance function of the signaling text information;
and determining the secondary information according to the retrieval score.
In some possible implementations, secondary LLM-enhanced retrieval strengthens the model's understanding of a specific problem by combining the LLM with the second-layer sentence meanings of the data set associated with the prompt words. The application combines the original query with the additional prompt words generated after the additional context information is merged and processed, and fuses the document retrieval results in a hybrid retrieval mode, thereby enhancing the model's understanding of the specific problem.
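The text does not give formulas for the score function, the relevance function or the retrieval score, so the following sketch fills them in with assumed forms borrowed from reciprocal rank fusion: a rank-based score 1 / (k + rank), a relevance value equal to that score weighted by the document's relevance score, and a retrieval score equal to the sum of the relevance values obtained for the primary information and for the signaling text. The constant k = 60, the document identifiers and the input scores are all hypothetical.

```python
from typing import Dict, List


def rank_score(rank: int, k: int = 60) -> float:
    """Assumed score function derived from a document's rank (1 = best)."""
    return 1.0 / (k + rank)


def relevance(scores: Dict[str, float], k: int = 60) -> Dict[str, float]:
    """Assumed relevance function: rank-based score weighted by the relevance score."""
    ranked = sorted(scores, key=lambda doc: (-scores[doc], doc))
    return {doc: rank_score(position + 1, k) * scores[doc] for position, doc in enumerate(ranked)}


def hybrid_retrieve(
    primary_scores: Dict[str, float],
    text_scores: Dict[str, float],
    top_n: int = 2,
    k: int = 60,
) -> List[str]:
    """Sum both relevance values per document and return the top-scoring documents."""
    rel_primary = relevance(primary_scores, k)
    rel_text = relevance(text_scores, k)
    documents = set(rel_primary) | set(rel_text)
    total = {doc: rel_primary.get(doc, 0.0) + rel_text.get(doc, 0.0) for doc in documents}
    return sorted(total, key=lambda doc: (-total[doc], doc))[:top_n]


# Hypothetical relevance scores of candidate documents for each of the two queries.
primary_scores = {"doc_a": 0.9, "doc_b": 0.4, "doc_c": 0.1}
text_scores = {"doc_b": 0.8, "doc_c": 0.7, "doc_d": 0.2}
secondary = hybrid_retrieve(primary_scores, text_scores)
print(secondary)
```

In this toy input, `doc_b` ranks first because it scores reasonably for both queries, even though `doc_a` is the single best match for the primary information alone.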
In some embodiments, the method provided by the embodiment of the present application further includes:
constructing a signaling analysis architecture, the architecture comprising:
a computing-network resource layer, which provides the resources for constructing the knowledge base;
a technical layer, for building the large language model;
an agent layer, for implementing signaling analysis based on expert orchestration and task scheduling;
a small-model layer, for signaling analysis in specific domains.
An embodiment of the application provides a technical architecture, specifically the architecture of a signaling analysis agent based on a large AI model. Referring to fig. 6, the architecture includes:
Computing-network resources: provide the base resources on which the large AI model runs, supporting computing resource allocation and network configuration optimization.
Technical layer: the specialized large-model base and the core algorithms.
AI Agent: the intelligent agent, which performs intelligent computation through expert orchestration, task scheduling and an AI executor.
Small models: domain models that enhance the intelligent computation with signaling-domain capabilities such as predictive early warning, perception evaluation, failure-code localization, network-element health evaluation, fault localization, impact analysis and cross-domain association analysis.
Instructions: decide the underlying core control instructions according to the agent's evaluation results.
Public components: provide the basic domain-knowledge vector knowledge base, prompt construction, evaluation tools, AI embedding technology and the like.
Application scenarios: provide core functions such as signaling analysis report services, signaling tracing, signaling interpretation and knowledge question answering to front-line complaint-handling staff, network optimization personnel and others.
Input learning: alarms, knowledge questions and answers, and data queries are taken as input, and the AI agent is trained on this input.
The scheme of the embodiment of the present application is described in detail below with a specific application example. The signaling analysis process provided by the present application includes the following steps:
Step 0, knowledge base construction: an initial signaling knowledge database is built from signaling protocol standards (including the interpretation of signaling messages and information-element parameters), classic cases, technical documents and expert experience for multi-domain service scenarios such as EPC, 5GC and IMS. The data is divided into three layers:
Bottom layer: files captured directly by developers from signaling packet captures, together with their context associations and error-report data links.
Second layer: trusted specifications, technical documents, textbooks and signaling analysis papers from the communications field, which contain complex definitions and associated definitions.
Third layer: protocol source files that define the meaning of each signaling element, i.e. direct explanations.
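The three-layer division above can be pictured as a small data structure. The following is only a sketch under stated assumptions: the class names `Layer`, `KnowledgeItem` and `KnowledgeBase` and their fields are hypothetical and do not come from the application, which describes the layers but not how they are stored.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Layer(IntEnum):
    """Hypothetical labels for the three data layers described in step 0."""
    CAPTURE = 1     # bottom layer: packet-capture files, context, error reports
    REFERENCE = 2   # second layer: specifications, technical documents, textbooks, papers
    DEFINITION = 3  # third layer: protocol source files that define signaling meanings


@dataclass
class KnowledgeItem:
    text: str
    source: str
    layer: Layer


@dataclass
class KnowledgeBase:
    items: list = field(default_factory=list)

    def add(self, text, source, layer):
        """Store one item; `layer` may be a Layer member or its integer value."""
        self.items.append(KnowledgeItem(text, source, Layer(layer)))

    def by_layer(self, layer):
        """Return all items that belong to the given layer."""
        return [item for item in self.items if item.layer == layer]
```

Keeping the layer as an explicit field makes it possible to treat sources differently later, for example when definition-layer items are used to screen retrieved knowledge.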
Step 1: the entity relations of the signaling domain are extracted with an LLM to build the three-layer data set of domain knowledge (a person skilled in the art may of course adjust the number of layers of the data set and of the knowledge graph according to actual requirements). The LLM is used to segment and chunk the text and to extract label information, so that the graph is built automatically (this is the construction process of the multi-layer structure knowledge graph in the application).
Sub-step 1.1, graph construction: the LLM identifies specific entity objects and their relations in the texts of the signaling-domain knowledge base, converts them into a structured knowledge representation, and presents that representation as a graph.
Sub-step 1.1.1, original text segmentation: the data text is segmented first, using a hybrid of segmentation by signaling words and sentences and segmentation by structural units. Specifically, line breaks are used to separate the paragraphs of a signaling document, and the separators before and after each signaling message are then applied for semantic segmentation, which includes proposition transfer and sub-block derivation. A sliding window processes five paragraphs at a time; the window is adjusted continuously by removing the previous signaling block and adding the next one, so that the continuity of the signaling information is preserved.
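The sliding-window part of this sub-step can be sketched as follows. This is a minimal illustration, not the application's implementation: it assumes that the window advances one paragraph at a time and that the block removed at each step is the oldest one in the window. The function names are hypothetical, and the semantic segmentation by signaling separators is not modeled.

```python
from collections import deque


def split_paragraphs(document):
    """Split a document on line breaks and drop empty lines."""
    return [line.strip() for line in document.splitlines() if line.strip()]


def sliding_windows(paragraphs, size=5):
    """Yield overlapping windows of `size` paragraphs, advancing one at a time."""
    window = deque(maxlen=size)
    for paragraph in paragraphs:
        window.append(paragraph)  # a full deque drops its oldest entry automatically
        if len(window) == size:
            yield list(window)
    if 0 < len(window) < size:
        yield list(window)        # document shorter than one window: emit what there is
```

Each window shares four paragraphs with its neighbor, which is what keeps consecutive signaling blocks in the same context.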
Sub-step 1.1.2, element extraction: graph node instances are identified and extracted from each source text block by an LLM that is intended to identify all relevant entities in the text.
Specifically, the LLM uses a pre-trained language model, BERT (Bidirectional Encoder Representations from Transformers), a neural network built on the Transformer mechanism. During entity discovery it is prompted to output a name, a type and a description for each entity. The name may be the exact text in the document or a paraphrase of a sentence in the signaling technical document, chosen carefully so that it carries accurate information suitable for subsequent processing. The type is selected by the LLM from a predefined table, and the description is an explanation of the entity generated by the LLM from the context in the document. To keep the model's output valid, examples are prepared to guide the LLM toward the required output.
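The name, type and description triple can be checked mechanically once the model has produced it. The sketch below assumes the model is prompted to answer with a JSON array of objects; that output format, the contents of `ENTITY_TYPES` and the function name `parse_entities` are all assumptions made for illustration, since the application only states that the type comes from a predefined table.

```python
import json
from dataclasses import dataclass

# Hypothetical type table; the application does not list the predefined types.
ENTITY_TYPES = {"message", "information_element", "cause_value",
                "network_element", "procedure"}


@dataclass(frozen=True)
class Entity:
    name: str
    type: str
    description: str


def parse_entities(llm_output):
    """Parse a JSON array of entity records and keep only well-formed ones."""
    entities = []
    for record in json.loads(llm_output):
        name = str(record.get("name", "")).strip()
        etype = str(record.get("type", "")).strip()
        description = str(record.get("description", "")).strip()
        # Reject records with a missing field or a type outside the table.
        if name and description and etype in ENTITY_TYPES:
            entities.append(Entity(name, etype, description))
    return entities
```

Rejecting records whose type is not in the table is one simple way to keep free-form model output from introducing node types the graph does not know.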
To retrieve valid entities from the database more reliably, the BERT model used here outputs entities by vectorizing the text, generating triplets, and mapping the vectors to a lower-dimensional space with a three-fold (triplet) network with adaptive boundaries.
In practice, an advanced text vectorization method from the field is used, with the BERT model as the text encoder. After BERT reads any text sequence xi containing N words, the embedding matrix M and the weights of the neural network are trained so that the objective function reaches its maximum, as expressed by the following formula (not reproduced in this text):
where x ranges over all words in the lexicon. To obtain a vector for an entire text, the word vectors are mean-pooled. Writing v_i for the vector of the i-th text:
$$v_i = \frac{1}{J_i} \sum_{j=1}^{J_i} V_{i,j}$$
where V_{i,j} is the j-th word vector in the i-th text and J_i is the total number of words in the entity text. Once the text vectors have been computed, the similarity of two entity texts a and b is computed as the cosine similarity:
$$\mathrm{sim}(v_a, v_b) = \frac{\sum_{d=1}^{D} v_{a,d}\, v_{b,d}}{\sqrt{\sum_{d=1}^{D} v_{a,d}^{2}}\,\sqrt{\sum_{d=1}^{D} v_{b,d}^{2}}}$$
where D is the vector dimension.
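Mean pooling and cosine similarity, as named in the text above, can be sketched in a few lines. The word vectors are plain lists of floats here; in practice they would come from the BERT encoder, which is not modeled. The function names are hypothetical, and returning 0.0 for a zero vector is a convention chosen for this sketch.

```python
import math


def mean_pool(word_vectors):
    """Average J word vectors (each of dimension D) into one text vector."""
    if not word_vectors:
        raise ValueError("at least one word vector is required")
    count = len(word_vectors)
    return [sum(column) / count for column in zip(*word_vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two D-dimensional vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)
```

For example, pooling the two word vectors `[1.0, 2.0]` and `[3.0, 4.0]` gives `[2.0, 3.0]`, and two orthogonal text vectors have similarity 0.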
The output of the final hidden layer is taken as an initial representation hi of xi:
[h1,h2,…,hn]=BERTφ([x1,x2,…,xn]),
where φ denotes the parameters of BERT. Then, for each predefined entity class cj, its initial prototype vector hcj is constructed by averaging the embedded representations of the words labeled cj.
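The prototype construction described above, averaging the representations of all words labeled with a class, can be sketched as follows. The hidden states are plain lists of floats standing in for the BERT outputs hi, and the function name `class_prototypes` and the use of `None` for unlabeled words are assumptions of this sketch.

```python
from collections import defaultdict


def class_prototypes(hidden_states, labels):
    """Average the hidden vectors of the words labeled with each class.

    hidden_states: list of vectors h_i, one per word
    labels:        list of class names c_j (or None for unlabeled words)
    """
    sums = {}
    counts = defaultdict(int)
    for vector, label in zip(hidden_states, labels):
        if label is None:
            continue  # unlabeled words do not contribute to any prototype
        if label not in sums:
            sums[label] = [0.0] * len(vector)
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}
```

A class with a single labeled word simply takes that word's vector as its prototype.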
Sub-step 1.1.3, hierarchical linking (i.e. the determination of profile information in the present application): communication signaling is a specialized field that uses a precise terminology system and is built on many well-defined definitions and relations, such as the meaning of a given signaling message or of a reported error value. In this field the LLM must not distort, modify or add creative or random elements, so a dedicated hierarchical linking model is prepared for communication signaling in order to form a comprehensive knowledge graph for signaling analysis. A knowledge database with a three-layer structure has already been prepared. The bottom-layer information is turned into entities according to the steps above. For the second-layer information, a graph is first built with the LLM using a simple knowledge-graph construction method, and the second-layer entities are linked to the first-layer entities according to the correlations detected by the LLM. The signaling definitions of the third layer are then linked to the second-layer entities. For each entity, the embedding of its name text is compared with the vocabulary of the communication protocol.
Sub-step 1.1.4, tag generation and graph merging: after the metagraphs are formed, each data block is scanned to build a global graph that connects all metagraphs. The merged metagraph nodes are linked to one another using the hierarchical linking rules described above.
After these steps, a knowledge graph with a three-layer structure is formed, in which nodes represent entities and edges represent the relations between entities. Building the knowledge graph also allows the extracted entity relations to be presented visually.
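Merging metagraphs into one global graph can be sketched as a union of nodes and edges. The representation below (a dict with `nodes` and `edges`, edges as `(source, relation, target)` tuples, nodes identified by entity name) is an assumption made for illustration; the application does not specify a storage format, and the hierarchical linking rules themselves are not modeled.

```python
def merge_metagraphs(metagraphs):
    """Union nodes by entity name and union the edge sets of all metagraphs."""
    nodes = {}
    edges = set()
    for graph in metagraphs:
        for name, attrs in graph["nodes"].items():
            merged = nodes.setdefault(name, {"tags": set()})
            merged["tags"] |= set(attrs.get("tags", ()))  # combine tags of same-name nodes
        edges |= {tuple(edge) for edge in graph["edges"]}  # duplicates collapse in the set
    return {"nodes": nodes, "edges": edges}
```

Because nodes are keyed by name, an entity that appears in several data blocks ends up as a single node carrying the tags from all of them.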
Step 2: the input signaling is analyzed automatically using retrieval-augmented generation (RAG) (i.e. steps S210 to S250 in the application). In this process, a strategy referred to in this patent as STM (Signaling Transfer Matching)-Retrieve, or STM-R, is used so that the LLM retrieves information efficiently when responding. The output text sequence Hn is:
Hn=STM-Rφ([x1,x2,…,xn,Y])
where xi is the input signaling text, Y is the database, and φ denotes the preset parameters of the retrieval strategy, including the priority, the initial size of the hierarchical retrieval iteration range, and so on.
Key information is then selected as retrieval keywords according to the analysis scenario, and the signaling entity relations, the RAG retrieval information and the prompt words are combined to reason logically over the signaling flow, which improves the accuracy and reliability of the signaling analysis.
Sub-step 2.1, input: the code stream file is input (i.e. the signaling text information in the present application).
Sub-step 2.2, parsing: the original code stream is parsed according to the protocol.
Sub-step 2.3, database loading: the RAG knowledge base is loaded; it contains analysis cases, domain knowledge, field interpretations, failure-code rules and the like. The original query is converted into a vector representation using text embedding, vector search and related techniques.
Sub-step 2.4, first-pass STM (Signaling Transfer Matching)-Retrieve strategy retrieval:
Sub-step 2.4.1, keyword extraction: using the three-layer structure knowledge graph already built, association analysis is performed in vector form on the original code stream; the feature information of the signaling plaintext is extracted in a single pass, and an abstract tag description is generated in keyword form.
Sub-step 2.4.2, hierarchical search: using the extracted keywords, the most relevant metagraph units are identified through a top-down matching process in the three-layer knowledge graph. The process starts from the larger graph, delimits a region by keywords, then indexes step by step into the smaller graphs it contains, and repeats.
Sub-step 2.4.3, entity activation: after the hierarchical search has narrowed the range step by step, the search reaches the metagraph layer and locks onto a subset of entities; the retrieved entities are collected, which completes the activation of those entities.
Sub-step 2.4.4, knowledge collection: finally, for the activated entities, the entity content and all related content within their scope are collected, including related protocol knowledge, technical document knowledge, relevance and relations to other entities, the content of any linked entities, and so on. This completes the retrieval.
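Sub-steps 2.4.2 to 2.4.4 can be sketched as a top-down descent followed by activation and collection. This is an illustration under several assumptions, not the STM-R strategy itself: the graph is a nested dict with `tags`, `children` and `entities`; matching is plain keyword and tag overlap rather than vector similarity; and all function and field names are hypothetical.

```python
def overlap(keywords, tags):
    """Number of keywords that also appear among the tags."""
    return len(set(keywords) & set(tags))


def hierarchical_search(graph, keywords):
    """Descend from the largest graph to the best-matching metagraph."""
    node = graph
    while node.get("children"):
        best = max(node["children"],
                   key=lambda child: overlap(keywords, child["tags"]))
        if overlap(keywords, best["tags"]) == 0:
            break  # nothing below matches; stay at the current level
        node = best
    return node


def activate_and_collect(metagraph, keywords):
    """Activate entities whose tags match, then collect them and their linked entities."""
    entities = metagraph.get("entities", {})
    active = [name for name, entity in entities.items()
              if overlap(keywords, entity["tags"]) > 0]
    collected = {}
    for name in active:
        collected[name] = entities[name]["content"]
        for linked in entities[name].get("links", ()):
            if linked in entities:
                collected[linked] = entities[linked]["content"]
    return collected
```

Linked entities are collected even when their own tags do not match the keywords, which mirrors the statement that the content of any linked entity is gathered along with the activated one.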
Sub-step 2.5, secondary LLM-enhanced retrieval: the retrieved information is input as additional context and combined with the original query to generate additional prompt words. Documents are retrieved with an advanced hybrid retriever, which fuses the results of a sparse retrieval model and a dense retrieval model by convex linear combination (convex linear combination fusion retriever, Convex-fusion), as shown in the formula below. Here Sconvex(q, d) is the final relevance score of the document, q is the query, d is the document, and α takes values in (0, 1); S′ is the normalized relevance score, Smax is the highest and Smin the lowest relevance score among the current candidate documents, rank is the document ranking, and k is an adjustable parameter.
Sconvex(q,d)=(1-α)S′sparse(q,d)+αS′dense(q,d)
Based on this retrieval method and in combination with the LLM, the prompt words are associated with the second-layer sentence semantics of the data set, which strengthens the model's understanding of the specific problem.
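The convex combination above can be sketched directly. One assumption is made loudly here: the normalization is taken to be min-max scaling, S′ = (S − Smin) / (Smax − Smin), which matches the description of Smax and Smin but is not spelled out in this section. The rank-based score with the parameter k is not modeled because its form is not given. The function names and the treatment of a document missing from one retriever (score 0) are choices of this sketch.

```python
def min_max(scores):
    """Normalize {doc_id: score} to [0, 1]; all-equal scores map to 0.0."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - low) / (high - low) for doc, s in scores.items()}


def convex_fusion(sparse, dense, alpha=0.5):
    """Rank documents by (1 - alpha) * S'_sparse + alpha * S'_dense."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    s_norm, d_norm = min_max(sparse), min_max(dense)
    docs = set(s_norm) | set(d_norm)
    # A document returned by only one retriever gets 0.0 from the other.
    fused = {doc: (1 - alpha) * s_norm.get(doc, 0.0) + alpha * d_norm.get(doc, 0.0)
             for doc in docs}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

Normalizing before mixing matters because sparse scores (for example BM25-style values) and dense scores (for example cosine similarities) live on different scales; without it, α would not weight the two retrievers as intended.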
Sub-step 2.6, screening: the knowledge generated by the secondary enhanced retrieval is screened against the rigorous signaling definition information at the base of the three-layer data structure. This imposes stricter constraints on the generated knowledge and deletes information that clearly deviates or is wrong, which reduces large-model hallucination, guessing and uncertainty.
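The screening idea can be sketched as a filter that drops any retrieved statement contradicting an authoritative definition. The tuple layout, the case-insensitive string comparison and the function name `screen` are assumptions for illustration; the application does not say how a contradiction is detected, and a real system would likely need a semantic comparison instead of string equality.

```python
def screen(candidates, definitions):
    """Keep candidates that do not contradict an authoritative definition.

    candidates:  list of (term, claimed_meaning, text) tuples from the secondary retrieval
    definitions: {term: authoritative_meaning} taken from the definition data
    """
    kept = []
    for term, meaning, text in candidates:
        official = definitions.get(term)
        # Terms without a definition cannot be checked and are passed through.
        if official is None or official.strip().lower() == meaning.strip().lower():
            kept.append((term, meaning, text))
    return kept
```

Passing through terms that have no definition is a deliberate choice in this sketch; a stricter variant could drop them instead.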
Sub-step 2.7, knowledge fusion: all the knowledge vectors obtained and the entities in the graph are converted into a unified vector representation with a common embedding model, so that they can be computed on and analyzed in the same vector space. The information in the knowledge vectors can then be integrated into the final conclusion, and the structural information of the graph can be used to optimize the representation and computation of the knowledge vectors.
Step 3: the signaling knowledge vector knowledge base and the graph data from the previous steps are merged, and the prompt words and the retrieval results are combined and input into the large model again for logical reasoning, producing the analysis conclusion and report.
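Combining the prompt words and the retrieval results into a single model input can be sketched as plain string assembly. The prompt wording, the `max_items` cut-off and the function name `build_prompt` are hypothetical; the application does not give a prompt template, and the call to the large model itself is left out.

```python
def build_prompt(signaling_text, prompt_words, retrieved, max_items=5):
    """Assemble one prompt string from the query, the prompt words and the top results."""
    lines = [
        "Task: analyze the signaling flow below and state the likely cause.",
        "Keywords: " + ", ".join(prompt_words),
        "Reference knowledge:",
    ]
    for index, item in enumerate(retrieved[:max_items], start=1):
        lines.append("[%d] %s" % (index, item))  # numbered so the answer can cite items
    lines.append("Signaling text:")
    lines.append(signaling_text)
    lines.append("Answer only from the reference knowledge; say so if it is insufficient.")
    return "\n".join(lines)
```

Limiting the number of reference items keeps the prompt within the model's context window, and the closing instruction is one common way to discourage answers that go beyond the retrieved knowledge.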
Step 4: the domain knowledge base, including cases and expert experience, is updated, and steps 0 to 3 are repeated.
The application effect is described below with a specific example:
Based on parsing of the code stream content, an analysis scheme is recommended automatically, and the user can select the specific data type for signaling analysis as needed, as shown in fig. 7. The application also supports multi-domain code stream analysis and outputs analysis reports that help analysts locate problems quickly, as shown in fig. 8.
The application proposes the concept of a signaling analysis agent based on a large AI model and the composition of its technical architecture. This includes the core business flow and the specific execution steps of the large-model-based signaling analysis agent, in particular the processing steps that improve the reliability of signaling analysis by fusing RAG with a knowledge graph, including the three-layer knowledge structure model and the hybrid retrieval strategy.
The application introduces into the field of signaling analysis a new BERT-based text vectorization method that includes an adaptive boundary rule and a multiple mapping mechanism. By passing the data through a three-fold (triplet) network with adaptive boundaries and mapping it to a lower dimension, reliable entities can be obtained from the database more effectively.
The application uses a database model with a three-layer structure and performs hierarchical linking under preset rules. The analysis database that feeds the large model is classified in advance by credibility, and the hierarchical linking rules follow the credibility rating of the different data sources in the database.
Combining the two points above and targeting the characteristics of signaling analysis, the application proposes a dedicated signaling analysis strategy based on the three-layer knowledge structure model and hybrid retrieval. Directional iteration during retrieval, entity activation at the metagraph layer and priority rules for the secondary enhanced retrieval address the hallucination that can arise when a general-purpose large model performs analysis, and thereby reduce errors.
Compared with the prior art, the invention has the following beneficial effects:
First, a visualization tool for signaling analysis based on large-model interaction is implemented. Signaling data is presented through text question answering and graphics, so that complex signaling interaction processes can be understood more intuitively.
Second, a large model and a knowledge base are built to form a knowledge brain for the signaling field, with multi-format corpus parsing and semantic vector index construction. The process uses a structured database and a dedicated retrieval strategy to reduce the hallucinations that may occur during large-model inference. It supports work-order status queries, knowledge question answering in the operation and maintenance field, and instant question answering over documents; it outputs conclusions, promotes knowledge sharing, enables intelligent operation and maintenance, and improves the efficiency of operation and maintenance staff.
Third, upper-layer applications can be extended quickly by combining domain knowledge. For example, a core-network configuration data auditing agent can be built on the platform: it integrates natural language processing and large-model capabilities, cleans, retrieves and structures the original document data to audit grammar and specifications, and uses entity recognition to extract key instructions and their attributes and convert them into code, thereby automating the audit.
Possible future application scenarios of the present application include:
Extending upper-layer applications quickly by combining domain knowledge: a core-network configuration data auditing agent can be built on the platform, integrating natural language processing and large-model capabilities and cleaning, retrieving and structuring the original document data to audit grammar and specifications;
Classifying alarms based on performance work-order data: key work-order information is extracted, the corresponding capability API is called using Agent technology, and models cooperate to delimit and locate faults, improving alarm handling efficiency.
The application helps telecom professionals work more efficiently in network operation and maintenance, increases the depth and breadth of service-perception probe applications on the 5G private network platform, builds customer-facing professional service capabilities, and further improves product competitiveness.
Referring to fig. 9, an embodiment of the present application further provides a signaling analysis system, which may implement the signaling analysis method, where the system includes:
A first module 810, configured to obtain signaling text information;
A second module 820, configured to load the retrieval-augmented generation (RAG) database, fuse the multi-layer structure knowledge graph, and perform retrieval analysis on the signaling text information to obtain the prompt words and the retrieval results;
and a third module 830, configured to input the prompt word and the search result into a large language model, so as to obtain an analysis conclusion.
It can be understood that the content in the above method embodiment is applicable to the system embodiment, and the functions specifically implemented by the system embodiment are the same as those of the above method embodiment, and the achieved beneficial effects are the same as those of the above method embodiment.
The embodiment of the application also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the signaling analysis method when executing the computer program. The electronic equipment can be any intelligent terminal including a tablet personal computer, a vehicle-mounted computer and the like.
It can be understood that the content in the above method embodiment is applicable to the embodiment of the present apparatus, and the specific functions implemented by the embodiment of the present apparatus are the same as those of the embodiment of the above method, and the achieved beneficial effects are the same as those of the embodiment of the above method.
Referring to fig. 10, fig. 10 illustrates a hardware structure of an electronic device according to another embodiment, the electronic device includes:
The processor 901 may be implemented as a general-purpose CPU (central processing unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and executes the relevant programs to implement the technical solutions provided by the embodiments of the present application;
The memory 902 may be implemented as read-only memory (Read Only Memory, ROM), static storage, dynamic storage, random access memory (Random Access Memory, RAM), or the like. The memory 902 may store an operating system and other application programs. When the technical solutions provided in the embodiments of the present disclosure are implemented in software or firmware, the relevant program code is stored in the memory 902 and invoked by the processor 901 to execute the signaling analysis method of the embodiments of the present disclosure;
an input/output interface 903 for inputting and outputting information;
The communication interface 904 is configured to implement communication interaction between the device and other devices, and may implement communication in a wired manner (e.g. USB, network cable, etc.), or may implement communication in a wireless manner (e.g. mobile network, WIFI, bluetooth, etc.);
A bus 905 that transfers information between the various components of the device (e.g., the processor 901, the memory 902, the input/output interface 903, and the communication interface 904);
wherein the processor 901, the memory 902, the input/output interface 903 and the communication interface 904 are communicatively coupled to each other within the device via a bus 905.
The embodiment of the application also provides a computer readable storage medium, which stores a computer program, and the computer program realizes the signaling analysis method when being executed by a processor.
It can be understood that the content of the above method embodiment is applicable to the present storage medium embodiment, and the functions of the present storage medium embodiment are the same as those of the above method embodiment, and the achieved beneficial effects are the same as those of the above method embodiment.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments described in the embodiments of the present application are for more clearly describing the technical solutions of the embodiments of the present application, and do not constitute a limitation on the technical solutions provided by the embodiments of the present application, and those skilled in the art can know that, with the evolution of technology and the appearance of new application scenarios, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
It will be appreciated by persons skilled in the art that the embodiments of the application are not limited by the illustrations, and that more or fewer steps than those shown may be included, or certain steps may be combined, or different steps may be included.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate three cases: only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of" and similar expressions mean any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b or c may represent a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may each be single or plural.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the above-described division of units is merely a logical function division, and there may be another division manner in actual implementation, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes multiple instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the various embodiments of the present application. The storage medium includes various media capable of storing programs, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, and are not thereby limiting the scope of the claims of the embodiments of the present application. Any modifications, equivalent substitutions and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present application shall fall within the scope of the claims of the embodiments of the present application.

Claims (10)

Application CN202411935906.2A | Priority date 2024-12-26 | Filing date 2024-12-26 | Signaling analysis method, system, electronic device and storage medium | Status: Pending | Publication CN119967457A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202411935906.2A; CN119967457A (en) | 2024-12-26 | 2024-12-26 | Signaling analysis method, system, electronic device and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202411935906.2A; CN119967457A (en) | 2024-12-26 | 2024-12-26 | Signaling analysis method, system, electronic device and storage medium

Publications (1)

Publication Number | Publication Date
CN119967457A; CN119967457A (en) | 2025-05-09

Family

ID=95597724

Family Applications (1)

Application Number | Status | Title | Priority Date | Filing Date
CN202411935906.2A; CN119967457A (en) | Pending | Signaling analysis method, system, electronic device and storage medium | 2024-12-26 | 2024-12-26

Country Status (1)

Country | Link
CN (1) | CN119967457A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN120264329A (en)* | 2025-06-03 | 2025-07-04 | 亚信科技(中国)有限公司 | 5G private network operation and maintenance method and related device based on large model adaptive capability
CN120264329B (en)* | 2025-06-03 | 2025-09-16 | 亚信科技(中国)有限公司 | 5G private network operation and maintenance method and related device based on large model adaptive capability

Similar Documents

Publication | Title
CN111026842B (en) | Natural language processing method, natural language processing device and intelligent question-answering system
US20240086731A1 (en) | Knowledge-graph extrapolating method and system based on multi-layer perception
CN117271767A (en) | Operation and maintenance knowledge base establishing method based on multiple intelligent agents
Ghahremanlou et al. | Geotagging Twitter messages in crisis management
CN113177164B (en) | Multi-platform collaborative new media content monitoring and management system based on big data
CN119967457A (en) | Signaling analysis method, system, electronic device and storage medium
US20240232912A9 (en) | Knowledge graph implementation
CN120277223A (en) | Dynamic vector knowledge base construction and retrieval method based on multi-mode large model
CN118886489A (en) | A method for building a compliance risk control AI knowledge base using LLM
CN114443904A (en) | Video query method, video query device, computer equipment and computer readable storage medium
CN116467459A (en) | Internet of things equipment fault reporting method and device, computer equipment and storage medium
KR101864401B1 (en) | Digital timeline output system for support of fusion of traditional culture
CN119903159A (en) | A knowledge question-answering fast processing system based on artificial intelligence
CN115034317B (en) | Training method and device for insurance policy recognition model, insurance policy recognition method and device
CN120216703A (en) | A method and system for constructing electric power engineering knowledge base based on large model
CN119474328A (en) | An intelligent question-answering method and system in the field of building construction
CN113158082B (en) | Artificial intelligence-based media content reality degree analysis method
CN117933237A (en) | Conference analysis method, conference analysis device and storage medium
CN117290143A (en) | Fault locating method, system, electronic equipment and computer readable storage medium
Cho | Designing smart cities: Security issues
CN119293239B (en) | Data classification method and work order classification method
CN119155197B (en) | Information processing method, network recommendation method and model training method
CN118964593B (en) | Document searching method and device, electronic equipment and storage medium
CN119515584B (en) | Cross-platform user identity alignment processing method, device, equipment and medium
CN118484665B (en) | Intelligent extraction method and system of text topics based on NLP technology

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
