CN115526165B - Flow platform monitoring method and system based on word frequency weight - Google Patents

Flow platform monitoring method and system based on word frequency weight

Info

Publication number
CN115526165B
Authority
CN
China
Prior art keywords
word
meanings
sentences
model
feature vectors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111557458.3A
Other languages
Chinese (zh)
Other versions
CN115526165A (en)
Inventor
苏长君
曾祥禄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guorui Digital Intelligence Technology Co ltd
Beijing Zhimei Internet Technology Co ltd
Original Assignee
Beijing Guorui Digital Intelligence Technology Co ltd
Beijing Zhimei Internet Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Guorui Digital Intelligence Technology Co ltd, Beijing Zhimei Internet Technology Co ltd
Priority to CN202111557458.3A
Publication of CN115526165A
Application granted
Publication of CN115526165B
Active
Anticipated expiration

Abstract

The invention provides a traffic platform monitoring method and system based on word frequency weights. A cloud computing platform is built, Internet data streams are acquired, feature vectors are processed by syntactic and semantic analysis, weight values are assigned according to how frequently word components occur, cosine values are calculated to obtain the centroid vector of the relevant comments, and alarm judgment is performed on the centroid vector. This makes compliance easier to judge and greatly improves protection efficiency.

Description

Flow platform monitoring method and system based on word frequency weight
Technical Field
The application relates to the field of network multimedia, and in particular to a method and a system for monitoring a flow platform based on word frequency weights.
Background
Existing flow platforms face fragmented vocabulary from which key terms are difficult to extract. Although filtering methods based on the centroid vector exist in the prior art, they struggle to achieve the expected effect when the occurrence frequency of the vocabulary is irregular.
Therefore, a targeted flow platform monitoring method and system based on word frequency weights is urgently needed.
Disclosure of Invention
The invention aims to provide a flow platform monitoring method and system based on word frequency weights. A cloud computing platform is built and an Internet data stream is acquired; feature vectors are processed by syntactic analysis and semantic analysis; weight values are assigned according to how frequently word components occur; cosine values are calculated to obtain a centroid vector of the related comments; and alarm judgment is performed on the centroid vector. This makes compliance easier to judge and greatly improves protection efficiency.
In a first aspect, the present application provides a flow platform monitoring method based on word frequency weights, the method comprising:
building a cloud computing platform on a server, and building a syntactic model and a semantic analysis model, wherein the syntactic model and the semantic analysis model are located on different core entities of the cloud computing platform, each core entity being a physical server at a central position in the cloud computing platform;
acquiring a data stream of an Internet platform according to an acquisition strategy, inputting the feature vectors in the data stream into the syntactic model for sentence segmentation, and removing emoticons to obtain word components;
counting the number of occurrences of each word component per unit time, and assigning a corresponding weight value according to that number;
inputting the word components into the semantic analysis model and outputting word meanings, a word meaning being a concise expression with a single meaning from which broad-category words have been removed; recombining the word meanings into new sentences, inserting the weight values into the new sentences, and vectorizing them to obtain second feature vectors;
wherein each second feature vector comprises a plurality of weight values corresponding to different word meanings;
calculating the cosine of the angle between second feature vectors, and forming a centroid vector from the second feature vectors whose cosine values are higher than a threshold;
calculating an accumulated value of the weight values of the centroid vector, the accumulated value serving as a measure of the relevance of the comments;
filtering out word meanings whose centroid vector values are lower than a second threshold, and judging whether the remaining word meanings include designated keywords; if so, further judging whether the sentences containing those word meanings form a designated meaning; if they do, confirming that the corresponding second feature vectors require an alarm and sending alarm information; if they do not, confirming that the corresponding second feature vectors are compliant.
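The word-component and weighting steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a regular expression stands in for the neural syntactic model, the semantic analysis model is skipped, text is assumed to be whitespace-delimited, and the weight is taken to be the relative frequency within one time window (the source only says weights are assigned "according to" the counts). All names (`EMOTICON_RE`, `word_components`, `frequency_weights`, `second_feature_vector`) are hypothetical.

```python
import re
from collections import Counter

# Assumption: emoticons are emoji in common code-point ranges or simple ASCII smileys.
EMOTICON_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]|[:;]-?[)(DP]")


def word_components(comment: str) -> list[str]:
    """Remove emoticons, then split the text into lowercase word components."""
    cleaned = EMOTICON_RE.sub(" ", comment)
    return re.findall(r"\w+", cleaned.lower())


def frequency_weights(window: list[str]) -> dict[str, float]:
    """Weight each word component by its relative frequency in one time window."""
    counts = Counter(w for comment in window for w in word_components(comment))
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()} if total else {}


def second_feature_vector(comment: str, weights: dict[str, float],
                          vocabulary: list[str]) -> list[float]:
    """Place the weight of every word present in the comment at its vocabulary slot."""
    present = set(word_components(comment))
    return [weights.get(w, 0.0) if w in present else 0.0 for w in vocabulary]
```

Each comment collected in the same unit of time would be passed through `second_feature_vector` with a shared `vocabulary`, so that the resulting vectors are comparable by cosine similarity.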
With reference to the first aspect, in a first possible implementation manner of the first aspect, the method further includes risk assessment, attack association analysis, and situation awareness.
With reference to the first aspect, in a second possible implementation manner of the first aspect, the acquiring the data stream of the internet platform includes encoding and decoding the data stream.
With reference to the first aspect, in a third possible implementation manner of the first aspect, the kernels of the semantic analysis model and the syntactic model both use neural network models.
In a second aspect, the present application provides a flow platform monitoring system based on word frequency weights, the system comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform, according to instructions in the program code, the method of any one of the four possibilities of the first aspect.
In a third aspect, the present application provides a computer readable storage medium for storing program code for performing the method of any one of the four possibilities of the first aspect.
The invention provides a flow platform monitoring method and system based on word frequency weights. A cloud computing platform is built and an Internet data stream is acquired; feature vectors are processed by syntactic analysis and semantic analysis; weight values are assigned according to how frequently word components occur; cosine values are calculated to obtain a centroid vector of the related comments; and alarm judgment is performed on the centroid vector. This makes compliance easier to judge and greatly improves protection efficiency.
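As a hedged illustration of the quantities named here, the cosine of the angle between two second feature vectors, a centroid vector, and its accumulated weight value can be written as below. The symbols $u$, $v$, $M$, $c$, $n$ and $S$ are notation introduced for this sketch, where $M$ denotes the set of second feature vectors whose cosine values exceed the threshold. Taking the centroid as the arithmetic mean of $M$ is an assumption; the source does not state how the centroid is formed.

```latex
\cos\theta(u, v) = \frac{\sum_{k=1}^{n} u_k v_k}
                        {\sqrt{\sum_{k=1}^{n} u_k^{2}}\,\sqrt{\sum_{k=1}^{n} v_k^{2}}},
\qquad
c = \frac{1}{|M|} \sum_{v \in M} v,
\qquad
S = \sum_{k=1}^{n} c_k
```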
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that those skilled in the art can more easily understand the advantages and features of the invention and so that its scope of protection is clearly defined.
Fig. 1 is a flowchart of the flow platform monitoring method based on word frequency weights provided by the application. The method comprises the following steps:
building a cloud computing platform on a server, and building a syntactic model and a semantic analysis model, wherein the syntactic model and the semantic analysis model are located on different core entities of the cloud computing platform, each core entity being a physical server at a central position in the cloud computing platform;
acquiring a data stream of an Internet platform according to an acquisition strategy, inputting the feature vectors in the data stream into the syntactic model for sentence segmentation, and removing emoticons to obtain word components;
counting the number of occurrences of each word component per unit time, and assigning a corresponding weight value according to that number;
inputting the word components into the semantic analysis model and outputting word meanings, a word meaning being a concise expression with a single meaning from which broad-category words have been removed; recombining the word meanings into new sentences, inserting the weight values into the new sentences, and vectorizing them to obtain second feature vectors;
wherein each second feature vector comprises a plurality of weight values corresponding to different word meanings;
calculating the cosine of the angle between second feature vectors, and forming a centroid vector from the second feature vectors whose cosine values are higher than a threshold;
calculating an accumulated value of the weight values of the centroid vector, the accumulated value serving as a measure of the relevance of the comments;
filtering out word meanings whose centroid vector values are lower than a second threshold, and judging whether the remaining word meanings include designated keywords; if so, further judging whether the sentences containing those word meanings form a designated meaning; if they do, confirming that the corresponding second feature vectors require an alarm and sending alarm information; if they do not, confirming that the corresponding second feature vectors are compliant.
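The cosine, centroid, and alarm steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the centroid is taken as the arithmetic mean of the vectors that are similar to at least one other vector, "filtering" is read as discarding components below the second threshold, and the judgment of whether a sentence forms a designated meaning is delegated to a caller-supplied function standing in for the semantic analysis model. The names `cosine`, `centroid_of_similar`, `needs_alarm`, and `forms_designated_meaning` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid_of_similar(vectors, threshold):
    """Average the vectors whose cosine with at least one other vector exceeds threshold."""
    keep = [
        u for i, u in enumerate(vectors)
        if any(cosine(u, v) > threshold for j, v in enumerate(vectors) if i != j)
    ]
    if not keep:
        return []
    return [sum(column) / len(keep) for column in zip(*keep)]


def needs_alarm(centroid, vocabulary, second_threshold, keywords,
                forms_designated_meaning):
    """Return (accumulated weight, alarm flag) for one centroid vector."""
    accumulated = sum(centroid)  # measure of comment relevance
    # Discard word meanings whose centroid component is below the second threshold.
    retained = [w for w, x in zip(vocabulary, centroid) if x >= second_threshold]
    has_keyword = any(w in keywords for w in retained)
    # Alarm only if a designated keyword is present AND the sentence meaning matches.
    alarm = has_keyword and bool(forms_designated_meaning(retained))
    return accumulated, alarm
```

A caller would pass the second feature vectors of one time window to `centroid_of_similar`, then hand the result to `needs_alarm` together with the shared vocabulary; a `False` flag corresponds to the compliant branch.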
The cloud computing platform further includes physical servers at edge positions. An edge server is invoked to trace the corresponding word components and cluster structure and to send the suspected track and suspected source point to the physical server at the central position. The central server calls on the computing capacity of the cloud computing platform to determine the source point of the corresponding data stream, and notifies the edge server to block that source point.
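The edge-to-center hand-off described here could look like the sketch below. The source does not specify how a source point is determined, so everything in this block is an assumption made for illustration: each edge server counts flagged comments per source identifier, the central server merges those counts, and a source is blocked once its merged count reaches a hypothetical `min_reports` value. The function names and the use of IP-like strings as source identifiers are likewise hypothetical.

```python
from collections import Counter


def suspected_sources(flagged_events):
    """Edge side: count flagged comments per source from (source, word) pairs."""
    return Counter(source for source, _word in flagged_events)


def determine_sources_to_block(edge_reports, min_reports):
    """Central side: merge edge reports and return the sources to be blocked."""
    merged = Counter()
    for report in edge_reports:
        merged.update(report)
    return sorted(s for s, n in merged.items() if n >= min_reports)
```

In this reading, the list returned by `determine_sources_to_block` is what the central server would send back to the edge servers as the set of source points to shield.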
In some preferred embodiments, the method further comprises risk assessment, attack association analysis, and situational awareness.
In some preferred embodiments, the acquiring the data stream of the internet platform includes encoding and decoding the data stream.
In some preferred embodiments, the kernels of the semantic analysis model and the syntactic model both use neural network models.
The application provides a flow platform monitoring system based on word frequency weight, which comprises a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method according to any of the embodiments of the first aspect according to instructions in the program code.
The present application provides a computer readable storage medium for storing program code for performing the method of any one of the embodiments of the first aspect.
In a specific implementation, the present invention also provides a computer storage medium. The computer storage medium may store a program which, when executed, may carry out some or all of the steps of the various embodiments of the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
It will be apparent to those skilled in the art that the techniques of the embodiments of the present invention may be implemented in software together with a necessary general-purpose hardware platform. On this understanding, the technical solutions of the embodiments, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method described in the embodiments, or in some parts of the embodiments, of the present invention.
For identical or similar parts of the embodiments in this description, the embodiments may be referred to one another. In particular, because the other embodiments are substantially similar to the method embodiments, they are described only briefly; see the description of the method embodiments for the relevant details.
The embodiments of the present invention described above do not limit the scope of the present invention.

Claims (6)

CN202111557458.3A | 2021-12-19 | 2021-12-19 | Flow platform monitoring method and system based on word frequency weight | Active | CN115526165B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111557458.3A | 2021-12-19 | 2021-12-19 | Flow platform monitoring method and system based on word frequency weight (CN115526165B (en))

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202111557458.3A | 2021-12-19 | 2021-12-19 | Flow platform monitoring method and system based on word frequency weight (CN115526165B (en))

Publications (2)

Publication Number | Publication Date
CN115526165A (en) | 2022-12-27
CN115526165B (en) | 2025-08-15

Family

ID=84694458

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202111557458.3A (Active, CN115526165B (en)) | Flow platform monitoring method and system based on word frequency weight | 2021-12-19 | 2021-12-19

Country Status (1)

Country | Link
CN (1) | CN115526165B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105787097A (en) * | 2016-03-16 | 2016-07-20 | Sun Yat-sen University (中山大学) | Distributed index establishment method and system based on text clustering

Also Published As

Publication number | Publication date
CN115526165A (en) | 2022-12-27

Similar Documents

Publication | Title
CN112560496B (en) | Training method and device of semantic analysis model, electronic equipment and storage medium
CN112163681B (en) | Equipment fault cause determining method, storage medium and electronic equipment
CN110175851B (en) | Cheating behavior detection method and device
CN113903361B (en) | Voice quality inspection method, device, equipment and storage medium based on artificial intelligence
CN111221960A (en) | Text detection method, similarity calculation method, model training method and device
CN117312562A (en) | Training method, device, equipment and storage medium of content auditing model
CN115526165B (en) | Flow platform monitoring method and system based on word frequency weight
CN118394945B (en) | Short message content analysis method and system based on artificial intelligence
CN112287663B (en) | Text parsing method, equipment, terminal and storage medium
CN115526178B (en) | Improved flow platform monitoring method and system
CN114201955B (en) | Internet flow platform monitoring method and system
CN112416754A (en) | Model evaluation method, terminal, system and storage medium
CN117574103A (en) | Evaluation method, device, electronic equipment and storage medium for a question and answer system
CN117455306A (en) | Complaint early warning method and device, storage medium and electronic equipment
CN113836292B (en) | Structuring method, system, device and medium for biomedical literature abstract
CN116384370A (en) | A big data security analysis method and system for online business session interaction
CN115114627A (en) | Malware detection method and device
CN110929501B (en) | Text analysis method and device
CN116132103A (en) | A network security situation monitoring method, device, electronic equipment and storage medium
CN112632229A (en) | Text clustering method and device
CN114168731B (en) | Internet media flow safety protection method and system
CN114201956B (en) | Security protection method and system for industrial Internet
CN114363061B (en) | Abnormal flow detection method, system, storage medium and terminal
RU2832692C1 | Method for automated generation of instructions for elimination of information security incidents and generation on their basis of machine scenarios for setting up information protection system
CN119357981B (en) | A POS machine intelligent early warning method and device

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
CB02 | Change of applicant information
  Address after: 607a, 6/F, No. 31, Fuchengmenwai Street, Xicheng District, Beijing 100037
  Applicant after: Beijing Guorui Digital Intelligence Technology Co.,Ltd.
  Address before: 607a, 6/F, No. 31, Fuchengmenwai Street, Xicheng District, Beijing 100037
  Applicant before: Beijing Zhimei Internet Technology Co.,Ltd.
GR01 | Patent grant