CN110609969A - Information processing method and device - Google Patents

Information processing method and device

Publication number
CN110609969A
Authority
CN
China
Prior art keywords
risk
publisher
event
text
web
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910731000.1A
Other languages
Chinese (zh)
Inventor
龚黎明
蒋增辉
林川
易灿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201910731000.1A
Publication of CN110609969A
Legal status: Pending

Abstract

The disclosure provides a method and device for analyzing online public opinion data. Specifically, the present disclosure provides an information processing method, including: acquiring a web-published text; analyzing the web-published text to determine a publisher identifier of the text and a risk event type associated with the text; determining an influence metric for the publisher from the publisher identifier; acquiring the number of historical public opinions related to the risk event type; determining a propagation risk value according to the influence metric of the publisher and the number of historical public opinions; determining an event risk value according to the risk event type; and determining a risk level of the web-published text according to the propagation risk value and the event risk value.

Description

Information processing method and device
Technical Field
The application relates to the field of internet in general, and in particular relates to analysis and processing of network public opinion data.
Background
With the rapid development of the internet, users post opinions and comments about events on various online platforms, and some media outlets and websites publish opinions intended to steer discussion. This content spreads and intensifies online, giving rise to online public opinion that can have a profound influence on enterprises and individuals. As the internet has grown worldwide, online media has come to be recognized as the "fourth medium" after newspapers, radio, and television, and the network has become one of the main carriers of public opinion.
Online public opinion monitoring extracts events of interest from the large volume of information on the internet, then analyzes and predicts how the public opinion around those events will develop, so that effective measures can be taken in advance to avoid negative social effects.
Disclosure of Invention
In order to solve the above technical problem, the present disclosure provides an information processing method, including:
acquiring a web-published text;
analyzing the web-published text to determine a publisher identifier of the text and a risk event type associated with the text;
determining an influence metric for the publisher from the publisher identifier;
acquiring the number of historical public opinions related to the risk event type;
determining a propagation risk value according to the influence metric of the publisher and the number of historical public opinions;
determining an event risk value according to the risk event type; and
determining a risk level of the web-posted text based on the propagation risk value and the event risk value.
Optionally, the determining a propagation risk value comprises:
performing a weighted summation of the influence metric of the publisher and the number of historical public opinions to determine the propagation risk value.
Optionally, the method further comprises,
determining an influence metric for the publisher based on one or more publisher characteristics of the publisher.
Optionally, the influence metric is determined by inputting the one or more publisher characteristics into an influence model, and the influence model is trained using publisher characteristics of a plurality of publishers and risk ratings of historical web-published texts of the plurality of publishers.
Optionally, the one or more publisher characteristics include one or more of occupation, territory, age group, and income of the publisher.
Optionally, the determining the risk level of the web-posted text comprises:
weighted summing the propagation risk value and the event risk value to determine the risk level.
Optionally, the method further comprises:
obtaining the number of historical public opinions related to the event type from a counter related to the risk event type; and
incrementing a value of a counter associated with the risk event type.
Optionally, the method further comprises displaying one or more of: the one or more publisher characteristics of the publisher, the risk event type, the number of historical public opinions, the propagation risk value, the event risk value, and information related to the web-published text.
Optionally, the event risk value is determined according to a risk rating of a plurality of web-published texts associated with the risk event type.
Optionally, the event risk value is a weighted sum of the risk ratings of the plurality of web-published texts associated with the risk event type.
Optionally, the method further comprises:
updating an event risk value for the risk event type using the risk rating of the web-posting text.
Optionally, the risk event type is determined by extracting keywords in the web-published text.
Optionally, the web-published text includes microblog posts, web documents, and web forum posts.
Another aspect of the present disclosure provides an apparatus for information processing, including:
means for acquiring a web-published text;
means for analyzing the web-published text to determine a publisher identifier for the text, and a risk event type associated with the text;
means for determining an influence metric for the publisher from the publisher identifier;
means for obtaining the number of historical public opinions related to the risk event type;
means for determining a propagation risk value as a function of the publisher's measure of influence and the number of historical public opinions;
means for determining an event risk value as a function of the risk event type; and
means for determining a risk level for the web-posted text as a function of the propagation risk value and the event risk value.
Optionally, the means for determining a propagation risk value comprises:
means for performing a weighted summation of the publisher's influence metric and the number of historical public opinions to determine the propagation risk value.
Optionally, the apparatus further comprises,
means for determining an influence metric for the publisher based on one or more publisher characteristics of the publisher.
Optionally, the influence metric is determined by inputting the one or more publisher characteristics into an influence model, and the influence model is trained using publisher characteristics of a plurality of publishers and risk ratings of historical web-published texts of the plurality of publishers.
Optionally, the one or more publisher characteristics include one or more of occupation, territory, age group, and income of the publisher.
Optionally, the means for determining the risk level of the web-posted text comprises:
means for weighted summing the propagation risk value and the event risk value to determine the risk level.
Optionally, the apparatus further comprises:
means for obtaining the number of historical public opinions related to the event type from a counter related to the risk event type; and
means for incrementing a value of a counter associated with the risk event type.
Optionally, the apparatus further comprises means for displaying one or more of: the one or more publisher characteristics of the publisher, the risk event type, the number of historical public opinions, the propagation risk value, the event risk value, and information related to the web-published text.
Optionally, the event risk value is determined according to a risk rating of a plurality of web-published texts associated with the risk event type.
Optionally, the event risk value is a weighted sum of the risk ratings of the plurality of web-published texts associated with the risk event type.
Optionally, the apparatus further comprises:
means for updating an event risk value for the risk event type using the risk rating of the web-published text.
Optionally, the risk event type is determined by extracting keywords in the web-published text.
Optionally, the web-published text includes microblog posts, web documents, and web forum posts.
Yet another aspect of the invention provides an apparatus comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
acquiring a web-published text;
analyzing the web-published text to determine a publisher identifier of the text and a risk event type associated with the text;
determining an influence metric for the publisher from the publisher identifier;
acquiring the number of historical public opinions related to the risk event type;
determining a propagation risk value according to the influence metric of the publisher and the number of historical public opinions;
determining an event risk value according to the risk event type; and
determining a risk level of the web-posted text based on the propagation risk value and the event risk value.
In contrast to the prior art, the present disclosure uses publisher characteristics (e.g., occupation, region, age, etc.) to determine the influence of a publisher when determining the risk level of a web-published text. The present disclosure also considers how many times public opinion about the same type of event has occurred in the past (referred to herein as the number of historical public opinions), and determines the propagation risk value of the event by combining the publisher's influence with that number.
In addition, in the technical solution of the present invention, each risk event type may have an event risk value, and the event risk value may be related to the risk levels of a plurality of historical public opinions (e.g., web-published texts) of that event type. For example, the risk levels of the historical web-published texts of each event type can be used to train an event risk value model for that event type, and the event risk value of the event type can then be determined using the model. Alternatively, the risk level determined each time for a public opinion of an event type may be used to update the event risk value for that event type.
The risk level of the web-published text may then be determined from the propagation risk value and the event risk value.
Further, the risk level of the web-published text may be used to update the event risk value of the risk event type to which the text relates. Through this feedback mechanism, the event risk value of each risk event type can be updated in real time, so that it reflects the risk probability of the event type more accurately and improves the accuracy of the risk level determined for web-published texts.
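The feedback mechanism described above can be sketched as follows. This is a minimal illustration only: the disclosure does not specify an update rule, so the exponential moving average, the function name update_event_risk, and the smoothing factor alpha are all assumptions.

```python
def update_event_risk(current_value: float, new_risk_level: float,
                      alpha: float = 0.1) -> float:
    """Blend a newly determined risk level into the stored event risk value.

    Assumption: an exponential moving average stands in for the unspecified
    update rule. alpha controls how strongly the newest text counts.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1.0 - alpha) * current_value + alpha * new_risk_level
```

With alpha close to 0 the stored value changes slowly; with alpha close to 1 it tracks the most recent text almost exclusively.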
Drawings
Fig. 1 is a diagram of a system for internet public opinion data analysis according to aspects of the present disclosure.
Fig. 2 is a diagram of an apparatus for risk event classification counting according to aspects of the present disclosure.
Fig. 3 is a block diagram of an apparatus for internet public opinion data analysis according to aspects of the present disclosure.
Fig. 4 is a flowchart of a method for cyber public opinion data analysis according to aspects of the present disclosure.
Fig. 5 is a flowchart of a method for cyber public opinion data analysis according to aspects of the present disclosure.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described herein, and thus the present invention is not limited to the specific embodiments disclosed below.
In this context, the term "risk event type" is a generic term for a class of potentially harmful events, such as lost funds, leaked identity information, lost courier goods, received counterfeit goods for shopping, and the like. The term "risk event" refers to a specifically occurring risk event, e.g., an event that a particular user encounters at a certain time that may cause a loss.
Fig. 1 is a diagram of a system for internet public opinion data analysis according to aspects of the present disclosure.
As shown in fig. 1, a system 100 for cyber public opinion data analysis may include one or more network platforms 101-1 through 101-N, a server 102, a memory 103, and an optional display 104.
The network platform 101 may include web forums, microblogs, news websites, chat software, and the like. A user may publish text on the network platform 101, referred to herein as web-published text, e.g., an article posted on a website, a forum post, a microblog post, etc.
The web-published text may include the user's description of a risk event that he or she experienced. A risk event is an event that may have a significant impact on the image, brand, and operation of an enterprise. For example, if a user experiences some loss (e.g., loss of funds, leakage of identity information, loss of courier goods, receipt of counterfeit goods from online shopping, etc.) while using services provided by a business, the event may be referred to as a risk event.
The server 102 may obtain (e.g., with a web crawler) a publisher identifier (ID) (e.g., a publisher account number or a cell phone number), a risk event description, a forwarding count, a read count, a number of comments, and other information from a web-published text published on the network platform 101. The author of the text is referred to herein as the publisher.
The server 102 may obtain the risk event type (e.g., loss of funds, leakage of identity information, loss of courier goods, receipt of counterfeit goods for shopping, etc.) and other information related to the risk event (e.g., amount of lost funds, etc.) from the risk event description.
The server 102 can further obtain the corresponding one or more publisher characteristics based on the publisher ID. Publisher characteristics may include the publisher's profession, region, age group, number of network supporters (e.g., number of microblog fans), number of forwards of historical published text of the publisher, amount of reading, number of comments, and so forth.
The server 102 can use the one or more publisher characteristics to determine an influence metric for the publisher. For example, an influence model may be trained using the publisher characteristics of one or more publishers and the influence (e.g., risk level) of the texts they previously published on the network (also referred to as historical web-published texts); the publisher characteristics of a publisher to be predicted are then input into the trained influence model to determine that publisher's influence.
For example, if the publisher's occupation is journalist or celebrity, the text may spread more widely or have a greater impact, so the publisher may be given a higher influence metric. If the publisher is located in a first-tier city, the published text may be forwarded many times, so the publisher may likewise be given a higher influence metric.
The present disclosure considers not only the publisher's network characteristics (such as the number of fans, the degree of network attention, and the read count, forwarding count, and comment count of historical published texts) but also the publisher's individual characteristics (such as occupation and region), so that the publisher's influence can be predicted more comprehensively and reasonably.
The server 102 may also obtain the number of historical opinions of the risk event types to which the web post text relates (e.g., describes). The historical public opinion number may be a public opinion number collected over a period (e.g., two weeks, one month, etc.) for the type of risk event.
For example, if the risk event type acquired by the server 102 is identity information leakage, the server 102 may acquire the number of public opinions (e.g., articles published on a network platform, blog posts, customer service feedback, etc.) collected over a period for the identity information leakage event type.
The server 102 may collect various types of risk events through various channels (platforms), and classify and count the collected risk events. These channels may include various network platforms, customer complaint calls, complaint platforms, television broadcast reports, newspapers, and so forth.
Fig. 2 is a diagram of an apparatus for classified counting of risk events for public opinion according to aspects of the present disclosure.
As shown in fig. 2, public opinion information about various risk events may be collected through a plurality of platform collection modules 201. These platform collection modules 201 may include a web platform collection module, a customer service platform collection module, a news media platform collection module, and the like.
The network platform collecting module can collect public opinion information from network platforms such as network forums, websites, chatting software and the like, and extracts relevant information of risk events from the public opinion information.
The customer service platform collection module may collect customer complaints and feedback regarding the encountered risk events. For example, a user may reflect a risk event (e.g., stolen funds, identity compromised, etc.) encountered through an incoming call or complaint platform. The customer service information collection module may record the received feedback regarding the risk event.
The news media platform collection module can collect public opinion information of risk events from news reports such as television stations, radio, newspapers and the like.
The plurality of platform collection modules may provide the collected risk event related information (or public opinion information) to the public opinion classification module 202.
The public opinion classification module 202 may classify the public opinion according to the related information of the risk event related to the public opinion, for example, a capital loss event type, an identity information disclosure event type, a cargo delivery loss event type, a counterfeit goods receipt event type in online shopping, and the like. For example, the public opinion information data about the characteristics of different event types can be constructed in advance, the characteristics of the sample data are labeled, then machine learning is carried out by using the labeled data, a public opinion classification model is trained, and finally the trained public opinion classification model is called to classify the events of the public opinion information.
The public opinion classification module 202 may assign an event type identifier to each public opinion information according to the classification result. For example, a public opinion regarding a capital loss event type may be assigned an event type identifier 01, a public opinion regarding an identity information leakage event type may be assigned an event type identifier 02, a public opinion regarding a cargo delivery loss event type may be assigned an event type identifier 03, a public opinion regarding a shopping receipt fraud event type may be assigned an event type identifier 04, and the like.
The public opinion classification module 202 may provide the event type identifier of each public opinion to the counter module 203. The counter module 203 may include a plurality of event type counters that respectively correspond to different event types. Each time an event type identifier is received, the corresponding event type counter may be incremented, e.g., by 1.
The value of each counter may be stored in a memory (e.g., memory 103) for subsequent use by server 102.
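As a minimal sketch of the counter module 203, the following keeps one counter per event type identifier. The identifiers 01 to 04 follow the example above; the class and method names are hypothetical, and an in-memory Counter stands in for the memory 103.

```python
from collections import Counter


class EventTypeCounters:
    """One counter per event type identifier (e.g., "01" for capital loss)."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, event_type_id: str) -> int:
        """Increment the counter for this event type by 1 and return its new value."""
        self._counts[event_type_id] += 1
        return self._counts[event_type_id]

    def count(self, event_type_id: str) -> int:
        """Return the number of public opinions recorded for this event type."""
        return self._counts[event_type_id]


counters = EventTypeCounters()
for type_id in ("01", "02", "01"):
    counters.record(type_id)
```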
Returning to fig. 1, the server 102 can determine a propagation risk value for a published risk event based on the publisher's measure of influence and the number of historical public sentiments associated with the risk event type. For example, the propagation risk value may be a weighted sum of the publisher influence metric and the number of historical public opinions. The propagation risk value may characterize the size of the potential propagation range of the web-published text.
The propagation risk value of the disclosure considers the individual characteristics of the publishers of the network published texts and the historical public opinion number of the related risk event types, and predicts the potential propagation degree of the network published texts more comprehensively and effectively.
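The weighted sum can be sketched as below. The weights and the cap used to bring the historical count onto the same 0 to 1 scale as the influence metric are illustrative assumptions; the disclosure does not give values or a normalization scheme.

```python
def propagation_risk(influence: float, historical_count: int,
                     w_influence: float = 0.6, w_history: float = 0.4,
                     count_cap: int = 1000) -> float:
    """Weighted sum of the publisher influence metric and the historical count.

    Assumptions: influence lies in [0, 1], and the count is scaled to [0, 1]
    by clipping at count_cap so that neither term dominates by magnitude.
    """
    if count_cap <= 0:
        raise ValueError("count_cap must be positive")
    scaled_count = min(max(historical_count, 0), count_cap) / count_cap
    return w_influence * influence + w_history * scaled_count
```

Without some scaling of this kind, a raw count in the hundreds would swamp an influence metric that lies between 0 and 1.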
Server 102 may also determine an event risk value for the risk event type described by the web post text.
The event risk value for each event type may be determined and updated based on historical empirical values.
In one example, a risk level may be determined for the historical web-published texts associated with each risk event type (e.g., a risk level for multiple web-published texts associated with that type of event), and the event risk value for the event type may be determined and/or updated based on the determined risk levels. For example, if the risk level of the historical web-published texts associated with a first risk event type (e.g., an average of the risk levels of the plurality of historical texts) is high, a greater event risk value may be assigned to the first risk event type. If the risk level of the historical web-published texts associated with a second risk event type is lower, a smaller event risk value may be assigned to the second risk event type.
In another example, the degree of propagation of the historical web post text associated with each risk event type may be determined. The degree of dissemination may be expressed using the amount of forwarding, the amount of reading, and/or the number of comments (e.g., a weighted average of the average amount of forwarding, the average amount of reading, and/or the average number of comments) of the historical web-posted text for a plurality of events belonging to the event type, and so on. An event risk value for the type of event may then be determined based on the degree of propagation of the historical web post text. For example, if the number of historical public opinion forwards, reads, and/or reviews of the first risk event type is high, a greater event risk value may be assigned to the first risk event. A lower event risk value may be assigned to the second risk event if the plurality of historical public opinion forwardings, reading, and/or number of reviews of the second risk event type is lower.
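The degree of propagation described above can be sketched as a weighted average of the per-type averages. The PostStats record and the weight values are hypothetical, chosen only to make the arithmetic concrete.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence, Tuple


@dataclass
class PostStats:
    forwards: int
    reads: int
    comments: int


def propagation_degree(posts: Sequence[PostStats],
                       weights: Tuple[float, float, float] = (0.5, 0.2, 0.3)) -> float:
    """Weighted average of average forwards, reads, and comments.

    Assumption: the weights are illustrative; the source does not fix them.
    """
    if not posts:
        raise ValueError("at least one historical post is required")
    averages = (
        mean(p.forwards for p in posts),
        mean(p.reads for p in posts),
        mean(p.comments for p in posts),
    )
    return sum(w * a for w, a in zip(weights, averages)) / sum(weights)


history = [PostStats(10, 100, 4), PostStats(30, 300, 6)]
degree = propagation_degree(history)
```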
In yet another example, the audit results of historical web-published text associated with each risk event may be used to determine an event risk value for that event type. After a risk event occurs, it may be determined through auditing whether the risk event is real. For example, for the user complaint of the fund stealing event, whether the fund stealing event really occurs can be determined by means of rechecking a bill and the like, if so, the audit result is true, and otherwise, the audit result is false. The event risk value for an event type may be proportional to the number of historical risk events (historical web post text) for which the audit result is true.
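A minimal sketch of the audit-based variant follows. The proportionality constant risk_per_confirmed_event is an assumption introduced for illustration.

```python
from typing import Iterable


def event_risk_from_audits(audit_results: Iterable[bool],
                           risk_per_confirmed_event: float = 0.01) -> float:
    """Event risk value proportional to the number of events audited as true.

    Assumption: the constant of proportionality is hypothetical.
    """
    confirmed = sum(1 for result in audit_results if result)
    return risk_per_confirmed_event * confirmed
```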
The historical experience values may also include an impact metric of a publisher of historical web published text related to the risk event type, a propagation risk value for web published text, and the like.
Server 102 may determine a risk level for the web-published text based on the propagation risk value and the event risk value.
The risk level of the web-published text may be obtained by a weighted sum of the propagation risk value and the event risk value.
Server 102 may determine, based on the risk level, whether to trigger an alert for the web-published text. If the text is highly likely to have a major impact on the enterprise, the event can be judged high-risk and should be handled as soon as possible; conversely, if that probability is very low, the event can be judged low-risk and handling can be deferred.
For example, the determined risk level may be compared to a predetermined threshold, and if it is above the threshold, it is determined that an early warning signal needs to be generated. For example, the risk control department may be notified to further analyze and handle the risk event to which the web-published text relates.
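The risk level computation and the threshold comparison can be sketched together as follows. The equal weights and the threshold of 0.7 are placeholder assumptions, as are the function names.

```python
def risk_level(propagation_risk_value: float, event_risk_value: float,
               w_propagation: float = 0.5, w_event: float = 0.5) -> float:
    """Weighted sum of the propagation risk value and the event risk value."""
    return w_propagation * propagation_risk_value + w_event * event_risk_value


def needs_alert(level: float, threshold: float = 0.7) -> bool:
    """An early warning is generated only when the level is above the threshold."""
    return level > threshold
```

A text with a propagation risk value of 0.8 and an event risk value of 0.6 would score about 0.7 under these weights, which sits at the threshold rather than above it and so would not trigger an alert.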
Optionally, server 102 may also find the source text of the web-published text by backtracking forwarding links (e.g., by searching for a forwarded URL in a web page). In particular, a web-published text may be a forward of another web-published text, and its source (referred to herein as the source web-published text) may be found by following the forwarding links back.
Further, multiple source web-published texts about one risk event may appear on the network. In this case, the source texts may be sorted by publication time, and the earliest may be treated as the initial web-published text. The initial web-published text can be analyzed (e.g., to obtain the identity information of its publisher), and effective measures can be taken to control the development of public opinion.
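Backtracking and selection of the initial text can be sketched as below, assuming the forwarding relationships have already been extracted into a mapping from each post ID to the ID of the post it forwards. The mapping, the post IDs, and the function names are hypothetical.

```python
from datetime import datetime
from typing import Dict


def trace_source(post_id: str, forwarded_from: Dict[str, str]) -> str:
    """Follow forwarding links back until a post that forwards nothing is reached.

    The seen set guards against accidental cycles in the extracted links.
    """
    seen = set()
    while post_id in forwarded_from and post_id not in seen:
        seen.add(post_id)
        post_id = forwarded_from[post_id]
    return post_id


def initial_text(publish_times: Dict[str, datetime]) -> str:
    """Among several source texts, return the ID of the earliest published one."""
    if not publish_times:
        raise ValueError("at least one source text is required")
    return min(publish_times, key=publish_times.get)


links = {"c": "b", "b": "a"}  # c forwards b, b forwards a
sources = {"a": datetime(2019, 8, 2), "x": datetime(2019, 8, 1)}
```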
The system 100 may optionally include a display 104. The display 104 may display information related to the web-posted text, such as risk event types, publisher characteristics (occupation, region, age group, fan count, network support, etc.), propagation risk values of the web-posted text, event risk values, risk levels, text reading amount, forwarding amount, comment amount, and the like, related to the web-posted text. For example, where the type of risk event is a fund theft, the amount of the theft may also be displayed. Other related information may also be displayed on the display 104.
In the prior art, usually only an indication of whether an alarm is required is displayed. In the present technical solution, more detailed information about the web-published text, such as the risk event type, the publisher characteristics (occupation, region, age group, fan count, etc.), the propagation risk value, the event risk value, and the risk level, is shown on the display 104. This helps the relevant organization (such as a risk control department) understand the details of the risk event related to the web-published text more comprehensively and intuitively, so that the development of public opinion can be controlled more efficiently.
The memory 103 may store publisher characteristics, influence values, historical public opinion numbers for each type of risk event (e.g., values of counters), risk values for various events, risk ratings for web-published text, etc., as described above.
Fig. 3 is a block diagram of an apparatus for internet public opinion data analysis according to aspects of the present disclosure.
The apparatus 300 for internet public opinion data analysis may be located in the server 102 of fig. 1.
The public opinion information collecting module 301 may collect public opinion information from a web publishing text. The web-published text may be obtained from various web platforms.
For example, the related public opinion information may be collected by searching keywords of web-published text. The collected public opinion information may include relevant descriptions of risk events (e.g., event type, event occurrence time, location, and other information), media name of published text, publisher ID (e.g., account information), forwarding amount, reading amount, number of comments, and so forth.
In an aspect, a publisher influence metric for text may be determined from public opinion information by the publisher ID extraction module 302, the publisher feature acquisition module 303, and the publisher influence determination module 304.
The publisher ID extraction module 302 may extract a publisher ID, for example, an account number, a cell phone number, etc. of the publisher from public opinion information.
The publisher characteristic acquisition module 303 may acquire one or more publisher characteristics of the publisher based on the publisher ID.
Publisher characteristics may include individual characteristics of the publisher, including publisher occupation (e.g., student, lawyer, teacher, media practitioner, celebrity, etc.), publisher region (e.g., first-tier city, second-tier city), age group, publisher income, and so forth.
The publisher characteristics may also include network characteristics of the publisher, including the publisher's network supporter population (e.g., fans), the publisher's read volume of historical published text, forward volume, number of reviews, and so forth.
The publisher influence determination module 304 can determine an influence metric for a publisher based on one or more publisher characteristics. The influence metric may represent (or predict) the influence of a publisher's speech (e.g., published web text) on the network. For example, the influence metric may be related to the number of forwards, reads, and/or comments that the web-published text receives.
The influence metric may be derived by inputting one or more publisher characteristics into the influence model.
For example, publisher characteristics of a plurality of publishers may be used as input samples for training the influence model, and the influence reflected by the forwarding count and read count of each publisher's historical web-published texts, determined empirically, may be used as output samples; the model is trained from the input samples and the output samples. One or more publisher characteristics of the publisher to be predicted may then be input into the trained influence model to obtain an influence metric for the publisher.
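The train-then-predict flow can be illustrated with a deliberately simple stand-in for the influence model. This is not the model of the disclosure, which leaves the model type open: here each (feature, value) pair is scored by the mean historical influence of the publishers that share it, and a prediction averages those scores. The feature names and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

Features = Dict[str, str]
Model = Tuple[Dict[Tuple[str, str], float], float]


def train_influence_model(samples: List[Tuple[Features, float]]) -> Model:
    """Learn the mean historical influence for each (feature, value) pair."""
    if not samples:
        raise ValueError("at least one training sample is required")
    scores = defaultdict(list)
    for features, influence in samples:
        for item in features.items():
            scores[item].append(influence)
    table = {item: mean(values) for item, values in scores.items()}
    global_mean = mean(influence for _, influence in samples)
    return table, global_mean


def predict_influence(model: Model, features: Features) -> float:
    """Average the learned scores; unseen values fall back to the global mean."""
    table, global_mean = model
    scores = [table.get(item, global_mean) for item in features.items()]
    return mean(scores) if scores else global_mean


training = [
    ({"occupation": "journalist", "region": "tier1"}, 0.9),
    ({"occupation": "student", "region": "tier1"}, 0.5),
    ({"occupation": "student", "region": "tier2"}, 0.1),
]
model = train_influence_model(training)
```

In practice a regression or tree-based model would likely replace this lookup table, but the inputs (publisher characteristics) and outputs (an influence metric) would have the same shape.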
On the other hand, the number of historical public opinions of the event to which the text relates may be determined by the event type determination module 305 and the historical public opinion number acquisition module 306.
The event type determination module 305 may extract the risk event type involved from the public opinion information. In determining the risk event type, risk event keywords can be extracted from the public opinion information through a word segmentation extraction model. For example, the model can be trained using samples of previously acquired public opinion information together with the event types that the information describes. Newly acquired public opinion information may then be input into the trained model to obtain the event type it describes.
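As a rough illustration of keyword-based event typing, the sketch below replaces the trained word segmentation model with a fixed keyword table and a regular-expression tokenizer. The event type names and keyword sets are hypothetical and are not taken from the disclosure.

```python
import re
from collections import Counter
from typing import Optional

# Hypothetical keyword table; a trained model would learn these associations.
EVENT_KEYWORDS = {
    "money_theft": {"stolen", "theft", "unauthorized"},
    "identity_leak": {"leak", "leaked", "identity"},
    "counterfeit_goods": {"counterfeit", "fake"},
}


def determine_event_type(text: str) -> Optional[str]:
    """Return the event type whose keywords occur most often, or None."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = Counter()
    for event_type, keywords in EVENT_KEYWORDS.items():
        hits[event_type] = sum(1 for token in tokens if token in keywords)
    event_type, count = hits.most_common(1)[0]
    return event_type if count > 0 else None


event_type = determine_event_type(
    "My money was stolen through an unauthorized transfer"
)
```

Text in a language without whitespace word boundaries would need a real word segmenter in place of the regular expression.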
The historical public opinion number acquisition module 306 can obtain, according to the determined risk event type, the number of public opinions of that event type within a predetermined time period. For example, for the money theft event type, the number of public opinions on money theft (e.g., articles on money theft appearing on websites, complaint calls received about money theft, news reports on money theft, etc.) over a predetermined period of time (e.g., two weeks, one month, etc.) may be obtained.
The number of historical public opinions can be determined by the risk event classification counting method described with reference to Fig. 2.
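Counting public opinions per event type within a predetermined period could look like the following sketch. The record layout (an event type paired with a timestamp) and the function name are assumptions made for illustration only.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def count_historical_opinions(
    records: Iterable[Tuple[str, datetime]],
    now: datetime,
    window: timedelta = timedelta(days=14),
) -> Counter:
    """Count records per event type whose timestamp falls inside the window."""
    cutoff = now - window
    return Counter(
        event_type for event_type, seen_at in records if cutoff <= seen_at <= now
    )


now = datetime(2019, 8, 8)
records = [
    ("money_theft", datetime(2019, 8, 1)),
    ("money_theft", datetime(2019, 7, 30)),
    ("money_theft", datetime(2019, 6, 1)),  # outside the two-week window
    ("identity_leak", datetime(2019, 8, 7)),
]
counts = count_historical_opinions(records, now)
```

The resulting counter can be stored and later looked up by risk event type.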
The propagation risk value determination module 307 may determine a propagation risk value of the web-published text according to the publisher influence metric and the number of historical public opinions.
For example, the propagation risk value may be obtained by a weighted sum of the publisher influence metric and the number of historical public opinions.
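A weighted sum of this kind might be written as below. The weights, the saturation cap used to bring the raw count onto the same 0 to 1 scale as the influence metric, and the function name are all assumptions; the disclosure does not specify them.

```python
def propagation_risk(
    influence: float,
    opinion_count: int,
    w_influence: float = 0.6,
    w_history: float = 0.4,
    count_cap: int = 1000,
) -> float:
    """Weighted sum of the influence metric and a normalized opinion count."""
    # Assumed normalization: counts at or above count_cap map to 1.0.
    history_score = min(opinion_count, count_cap) / count_cap
    return w_influence * influence + w_history * history_score


risk = propagation_risk(influence=0.5, opinion_count=500)
```

Without some normalization, a raw count in the hundreds would dominate an influence metric that lies between 0 and 1.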
The event risk value determination module 308 may determine its corresponding event risk value based on the event type determined by the event type determination module 305.
Each risk event type has a corresponding event risk value. The event risk value for each risk event type may be determined and/or updated based on the risk ratings of a plurality of historical risk events associated with the risk event type. For example, an event risk value model may be trained using risk ratings for web-published text that have been determined to be associated with a plurality of historical risk events of a risk event type and a label for the risk event type. The trained model may then be used to determine a risk value for the type of risk event to be predicted. As a simple example, the event risk value for each risk event type may be a weighted sum of the risk ratings of the web-published texts associated with a plurality of historical risk events for that risk event type.
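The simple weighted-sum example at the end of the paragraph above can be sketched as follows. Normalizing by the total weight, so that the result stays on the same scale as the ratings, is an added assumption, as are the function name and the sample values.

```python
from typing import Optional, Sequence


def event_risk_value(
    ratings: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Normalized weighted sum of historical risk ratings for one event type."""
    if not ratings:
        raise ValueError("at least one historical risk rating is required")
    if weights is None:
        weights = [1.0] * len(ratings)  # equal weighting by default
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, ratings)) / total


# Hypothetical ratings of two historical texts; the second is weighted higher.
value = event_risk_value([0.2, 0.8], weights=[1.0, 3.0])
```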
The risk level determination module 309 can determine a risk level for the web-published text based on the propagation risk value and the event risk value. For example, the propagation risk value and the event risk value may be weighted and summed to determine a risk level for the web-published text.
Preferably, the determined risk level may be used to update the event risk value for the risk event in question, as described above.
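One way such a feedback update could work is an exponential moving average, sketched below. The disclosure does not prescribe an update rule, so the rule itself, the smoothing factor `alpha`, and the function name are assumptions.

```python
def update_event_risk(current: float, new_risk_level: float, alpha: float = 0.1) -> float:
    """Blend a newly determined risk level into the stored event risk value."""
    return (1.0 - alpha) * current + alpha * new_risk_level


# A stored value of 0.5 moves toward a newly observed risk level of 1.0.
updated = update_event_risk(0.5, 1.0, alpha=0.2)
```

A small `alpha` keeps the event risk value stable against a single outlier, while a larger one tracks recent texts more closely.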
Fig. 4 is a flowchart of a method for cyber public opinion data analysis according to aspects of the present disclosure.
At step 402, web-published text can be obtained.
For example, the web-published text may include a user's description of a risk event experienced by the user or by others. The web-published text can be obtained from a network forum, a microblog, a news website, chat software, and the like.
At step 404, a propagation risk value for the text may be determined.
In an aspect, a publisher's ID (e.g., account number, phone number, etc.) may be extracted from the web-published text, one or more characteristics of the publisher (e.g., occupation, region, age group, etc.) may be obtained from the publisher ID, and the one or more characteristics may then be used to determine an influence metric for the publisher.
In another aspect, the type of risk event involved in the web-published text may be determined, for example, a loss of funds, a leak of identity information, a loss of courier goods, a receipt of counterfeit goods from a purchase, and so forth. The type of risk event to which the web-published text relates can be determined by extracting keywords. The number of historical public opinions for the risk event type may then be obtained, which may be the number of public opinions collected for the risk event type over a period (e.g., two weeks, one month, etc.).
The propagation risk value of the text can be determined according to the publisher influence metric and the number of historical public opinions.
At step 406, an event risk value for the web-published text may be determined.
In particular, an event risk value for a risk event type to which the web post text relates may be determined.
Each risk event type may have an event risk value, which may be determined and updated based on historical empirical values.
For example, the risk level, the degree of propagation (e.g., the forwarding count, the read count, and/or the comment count), the audit results, etc. of the historical web-published text associated with each type of risk event may be used to determine and update the corresponding event risk value.
At step 408, a risk level for the web-published text may be determined based on the propagation risk value and the event risk value.
The risk level of the web-published text may be obtained by a weighted sum of the propagation risk value and the event risk value.
At step 410, whether to generate a public opinion warning signal may be determined according to the risk level.
For example, the risk level may be compared with a predetermined threshold, and if the risk level is higher than the predetermined threshold, it is determined that a public opinion warning signal is to be generated. The public opinion warning signal can prompt the relevant institutions to take further action to control the public opinion.
Optionally, the public opinion warning signal may include information related to the web-published text, such as the risk event type, publisher characteristics (occupation, region, age group, fan count, network support, etc.), the propagation risk value of the web-published text, the event risk value, the risk level, the read count, the forwarding count, the comment count, and other related information. Such information can be transmitted to a terminal (e.g., a display) of the relevant institution, which helps the relevant institution (e.g., a risk control department) understand the details of the risk event related to the web-published text more comprehensively and intuitively, so that the development of public opinion can be controlled more efficiently.
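The threshold comparison and the optional warning payload might be combined as in this sketch. The field names, the default threshold of 0.7, and the class and function names are hypothetical; a real signal could carry any of the items listed above.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PublicOpinionWarning:
    """Hypothetical payload sent to the terminal of the relevant institution."""
    event_type: str
    publisher_id: str
    propagation_risk: float
    event_risk: float
    risk_level: float


def maybe_generate_warning(
    event_type: str,
    publisher_id: str,
    propagation_risk: float,
    event_risk: float,
    risk_level: float,
    threshold: float = 0.7,
) -> Optional[PublicOpinionWarning]:
    """Return a warning only when the risk level exceeds the threshold."""
    if risk_level <= threshold:
        return None
    return PublicOpinionWarning(
        event_type, publisher_id, propagation_risk, event_risk, risk_level
    )


warning = maybe_generate_warning("money_theft", "acct-001", 0.86, 0.9, 0.88)
payload = asdict(warning) if warning is not None else None
```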
Fig. 5 is a flowchart of a method for cyber public opinion data analysis according to aspects of the present disclosure.
At step 502, a web post text may be obtained.
At step 504, the web post text may be analyzed to determine a publisher identifier for the text, and a risk event type associated with the text.
Relevant information can be gathered by searching for keywords in the web-published text. The collected information may include a description of the risk event (e.g., event type, time of occurrence, location, and other information), the name of the medium on which the text was published, a publisher ID (e.g., account information, cell phone number), a forwarding count, a read count, a comment count, and so forth.
At step 506, a publisher influence metric may be determined from the publisher ID.
In particular, one or more publisher characteristics of a publisher may be obtained from the publisher ID.
The publisher characteristics for each publisher may be pre-stored with the publisher ID in a memory (e.g., memory 103 of Fig. 1). The publisher characteristics may be obtained in advance from various information channels.
Publisher characteristics may include individual characteristics of the publisher, such as the publisher's occupation (e.g., student, lawyer, teacher, media practitioner, celebrity), region (e.g., first-tier city, second-tier city), age group, and income.
The publisher characteristics may also include network characteristics of the publisher, such as the number of the publisher's online followers and the read count, forwarding count, and comment count of the publisher's historical published texts.
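Before such characteristics can be fed to a model, they generally need a numeric encoding. The sketch below shows one possible encoding, assuming one-hot vectors for the categorical individual characteristics and a logarithmic scale for the network counts; the category lists and dictionary keys are hypothetical.

```python
import math
from typing import Dict, List

# Hypothetical category vocabularies for the individual characteristics.
OCCUPATIONS = ["student", "lawyer", "teacher", "media_practitioner", "celebrity"]
CITY_TIERS = ["first_tier", "second_tier", "other"]


def encode_publisher(features: Dict[str, object]) -> List[float]:
    """Turn stored publisher characteristics into a numeric feature vector."""
    vec = [1.0 if features.get("occupation") == o else 0.0 for o in OCCUPATIONS]
    vec += [1.0 if features.get("city_tier") == t else 0.0 for t in CITY_TIERS]
    # Network characteristics span orders of magnitude, so compress them.
    vec.append(math.log1p(float(features.get("followers", 0))))
    vec.append(math.log1p(float(features.get("avg_forwards", 0))))
    return vec


vector = encode_publisher(
    {"occupation": "media_practitioner", "city_tier": "first_tier", "followers": 0}
)
```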
An influence metric for the publisher can then be determined from the one or more publisher characteristics.
The influence metric for a publisher may be determined by inputting one or more publisher characteristics of the publisher into an influence model, where the influence model may be trained using the publisher characteristics of a plurality of publishers and the risk levels of the historical web-published texts of the plurality of publishers.
At step 508, the number of historical public sentiments associated with the type of risk event may be obtained.
The number of historical public opinions of each event type may be determined by classifying and counting the events in public opinion information, as illustrated in Fig. 2. The number of historical public opinions for each event type may then be stored in memory for subsequent use.
When new public opinion information is obtained, the corresponding number of historical public opinions can be looked up in the memory by the risk event type involved.
At step 510, a propagation risk value may be determined based on the one or more publisher characteristics and the number of historical public opinions.
In an aspect, an influence metric for a publisher can be determined from the one or more publisher characteristics obtained in step 506. The influence metric may represent (or predict) the degree of dissemination of a publisher's speech (e.g., published web text) over the network. For example, the influence metric may be related to the number of forwards, reads, and/or comments that the web-published text receives.
An influence model may be trained using the publisher features of multiple publishers and the influence of their historical web-published texts; one or more publisher features of a publisher to be predicted are then input into the trained influence model to determine an influence metric for the publisher.
For example, if the publisher's occupation is media practitioner or celebrity, the text may spread more widely or have a greater impact, and the publisher may therefore be given a higher influence metric. If the publisher's region is a first-tier city, the published text may be forwarded many times, and the publisher may likewise be given a higher influence metric.
An advantage of the present disclosure is that, when determining the propagation risk value, not only the network characteristics of the publisher (such as fan count and network attention) but also the individual characteristics of the publisher (such as occupation, region, and age group) are considered, so that the influence of the publisher can be predicted more comprehensively and reasonably.
On the other hand, the number of historical public opinions may be the number of public opinions collected in a predetermined period (e.g., two weeks, one month, etc.) about the type of risk event.
The propagation risk value obtained according to the present disclosure is obtained by combining individual characteristics of the publisher (e.g., occupation, age group, income, etc. of the publisher) and the number of historical public opinions of the risk event, so that the potential degree of propagation of the publisher and the risk event can be fully embodied.
At step 512, an event risk value may be determined based on the risk event type.
The event risk value for each event type may be determined and updated based on historical empirical values.
In an aspect, risk levels of historical web-published texts associated with each risk event type (e.g., the risk levels of multiple web-published texts associated with that event type) may be determined, and the event risk value for the event type may be determined and/or updated based on the determined risk levels.
In another aspect, the event risk value for each event type may also be determined based on the forwarding count, read count, and/or comment count of the historical web-published texts associated with that risk event type.
In yet another aspect, historical audit results of risk events (whether or not the risk event is true) may be used to determine an event risk value for the event type.
The above lists only a few examples of determining an event risk value, but other means of determining an event risk value based on historical empirical values are also contemplated by the present disclosure.
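As one illustration of the audit-based aspect above, the event risk value could be taken as the smoothed fraction of historical events of that type that were confirmed true. The smoothing prior, its weight, and the function name are assumptions and are not specified in the disclosure.

```python
from typing import Sequence


def event_risk_from_audits(
    audit_results: Sequence[bool],
    prior: float = 0.5,
    prior_weight: float = 2.0,
) -> float:
    """Smoothed share of historical risk events that audits confirmed as true."""
    confirmed = sum(1 for is_true in audit_results if is_true)
    return (confirmed + prior * prior_weight) / (len(audit_results) + prior_weight)


# Hypothetical history: three of four audited events turned out to be true.
audited_value = event_risk_from_audits([True, True, False, True])
```

The prior keeps the value defined, and moderate, for event types with few or no audited events.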
At step 514, a risk level for the web-published text may be determined based on the propagation risk value and the event risk value.
The risk level of the web-published text may be obtained by a weighted sum of the propagation risk value and the event risk value.
Further, whether to trigger an alert for the web-published text may be determined based on the risk level. For example, the determined risk level may be compared to a predetermined threshold, and if it is above the predetermined threshold, it is determined that an early warning signal needs to be generated. For example, the risk control department is notified and prompted to further analyze and process the risk event to which the web-published text relates.
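Steps 506 through 514, together with the threshold check, might be chained as in the following sketch. It assumes that the influence metrics, historical counts, and event risk values have already been computed and stored in dictionaries; every name, weight, and threshold here is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Assessment:
    propagation_risk: float
    event_risk: float
    risk_level: float
    alert: bool


def assess_post(
    publisher_id: str,
    event_type: str,
    influence_by_publisher: Dict[str, float],
    history_counts: Dict[str, int],
    event_risk_by_type: Dict[str, float],
    count_cap: int = 1000,
    threshold: float = 0.7,
) -> Assessment:
    """Combine stored values into a risk level for one web-published text."""
    influence = influence_by_publisher.get(publisher_id, 0.0)  # step 506
    count = history_counts.get(event_type, 0)                  # step 508
    # Step 510: weighted sum of influence and normalized historical count.
    propagation = 0.6 * influence + 0.4 * min(count, count_cap) / count_cap
    event_risk = event_risk_by_type.get(event_type, 0.0)       # step 512
    # Step 514: weighted sum of propagation risk and event risk.
    level = 0.5 * propagation + 0.5 * event_risk
    return Assessment(propagation, event_risk, level, level > threshold)


result = assess_post(
    "acct-001",
    "money_theft",
    influence_by_publisher={"acct-001": 0.9},
    history_counts={"money_theft": 800},
    event_risk_by_type={"money_theft": 0.9},
)
```

Unknown publishers and event types fall back to zero in this sketch; a production system would more likely route them to a default or to manual review.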
Optionally, information related to the web-published text may be sent to a terminal (e.g., a display) for subsequent processing. Such information may include, for example, the risk event type and publisher characteristics (occupation, region, age group, fan count, network support, etc.) related to the web-published text, the propagation risk value, the event risk value, the risk level, the read count, the forwarding count, the comment count, and the like.
Optionally, the source published text of the web-published text, as well as the initial web-published text, may be found by backtracking through forwarding links. Useful information can be obtained by analyzing the source published text and/or the initial web-published text, so that effective measures can be taken to control the development of public opinion. For example, the publisher who originally published the text on the web can be found, thereby effectively controlling the development of public opinion.
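Backtracking through forwarding links can be sketched as a walk over a parent map, assuming each forwarded post records the identifier of the post it was forwarded from. The map layout, post identifiers, and function name are hypothetical.

```python
from typing import Dict, List


def trace_to_source(post_id: str, forwarded_from: Dict[str, str]) -> List[str]:
    """Follow forwarding links back from a post to the initial post."""
    chain = [post_id]
    seen = {post_id}
    while chain[-1] in forwarded_from:
        parent = forwarded_from[chain[-1]]
        if parent in seen:  # guard against malformed, cyclic link data
            break
        chain.append(parent)
        seen.add(parent)
    return chain


# Post "c" was forwarded from "b", which was forwarded from the initial post "a".
links = {"c": "b", "b": "a"}
chain = trace_to_source("c", links)
source_post = chain[1] if len(chain) > 1 else chain[0]  # direct source
initial_post = chain[-1]                                # original publication
```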
The illustrations set forth herein in connection with the figures describe example configurations and are not intended to represent all examples that may be implemented or fall within the scope of the claims. The term "exemplary" as used herein means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous over other examples." The detailed description includes specific details to provide an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described examples.
In the drawings, similar components or features may have the same reference numerals. Further, various components of the same type may be distinguished by following the reference label by a dash and a second label that distinguishes among the similar components. If only the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label irrespective of the second reference label.
The various illustrative blocks and modules described in connection with the disclosure herein may be implemented or performed with a general purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration).
The functions described herein may be implemented in hardware, software executed by a processor, firmware, or any combination thereof. If implemented in software executed by a processor, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Other examples and implementations are within the scope of the disclosure and the following claims. For example, due to the nature of software, the functions described above may be implemented using software executed by a processor, hardware, firmware, hardwiring, or any combination thereof. Features that implement functions may also be physically located at various locations, including being distributed such that portions of functions are implemented at different physical locations. In addition, as used herein, including in the claims, "or" as used in a list of items (e.g., a list of items accompanied by a phrase such as "at least one of" or "one or more of") indicates an inclusive list, such that, for example, a list of at least one of A, B, or C means A or B or C or AB or AC or BC or ABC (i.e., A and B and C). Also, as used herein, the phrase "based on" should not be read as referring to a closed set of conditions. For example, an exemplary step described as "based on condition A" may be based on both condition A and condition B without departing from the scope of the present disclosure. In other words, the phrase "based on," as used herein, should be interpreted in the same manner as the phrase "based, at least in part, on."
Computer-readable media includes both non-transitory computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. Non-transitory storage media may be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, non-transitory computer-readable media can comprise RAM, ROM, electrically erasable programmable read-only memory (EEPROM), Compact Disk (CD) ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other non-transitory medium that can be used to carry or store desired program code means in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a web site, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk (disk) and disc (disc), as used herein, includes CD, laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above are also included within the scope of computer-readable media.
The description herein is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (29)

CN201910731000.1A | 2019-08-08 | 2019-08-08 | Information processing method and device | Pending | CN110609969A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910731000.1A, CN110609969A (en) | 2019-08-08 | 2019-08-08 | Information processing method and device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910731000.1A, CN110609969A (en) | 2019-08-08 | 2019-08-08 | Information processing method and device

Publications (1)

Publication Number | Publication Date
CN110609969A (en) | 2019-12-24

Family

ID=68890079

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201910731000.1A (Pending), CN110609969A (en) | Information processing method and device | 2019-08-08 | 2019-08-08

Country Status (1)

Country | Link
CN (1) | CN110609969A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112000866A (en)* | 2020-08-05 | 2020-11-27 | 杭州安恒信息技术股份有限公司 | Internet data analysis method, device, electronic device and medium
CN117196293A (en)* | 2023-08-16 | 2023-12-08 | 平安科技(深圳)有限公司 | Public opinion risk determination method, device, server and medium based on artificial intelligence
CN117390602A (en)* | 2023-12-11 | 2024-01-12 | 深圳市瑞迅通信息技术有限公司 | Information security risk evaluation method and system
CN119313170A (en)* | 2024-12-17 | 2025-01-14 | 戎行技术有限公司 | A risk prediction method for news network data based on large models

Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101763401A (en)* | 2009-12-30 | 2010-06-30 | 暨南大学 | Network public sentiment hotspot prediction and analysis method
CN104820629A (en)* | 2015-05-14 | 2015-08-05 | 中国电子科技集团公司第五十四研究所 | Intelligent system and method for emergently processing public sentiment emergency
US9870546B1 (en)* | 2013-09-23 | 2018-01-16 | Turner Industries Group, L.L.C. | System and method for industrial project cost estimation risk analysis
CN108108902A (en)* | 2017-12-26 | 2018-06-01 | 阿里巴巴集团控股有限公司 | A kind of risk case alarm method and device

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112000866A (en)* | 2020-08-05 | 2020-11-27 | 杭州安恒信息技术股份有限公司 | Internet data analysis method, device, electronic device and medium
CN112000866B (en)* | 2020-08-05 | 2024-03-26 | 杭州安恒信息技术股份有限公司 | Internet data analysis methods, devices, electronic devices and media
CN117196293A (en)* | 2023-08-16 | 2023-12-08 | 平安科技(深圳)有限公司 | Public opinion risk determination method, device, server and medium based on artificial intelligence
CN117196293B (en)* | 2023-08-16 | 2024-09-10 | 平安科技(深圳)有限公司 | Public opinion risk determination method, device, server and medium based on artificial intelligence
CN117390602A (en)* | 2023-12-11 | 2024-01-12 | 深圳市瑞迅通信息技术有限公司 | Information security risk evaluation method and system
CN117390602B (en)* | 2023-12-11 | 2024-03-29 | 深圳市瑞迅通信息技术有限公司 | Information security risk evaluation method and system
CN119313170A (en)* | 2024-12-17 | 2025-01-14 | 戎行技术有限公司 | A risk prediction method for news network data based on large models

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination
TA01Transfer of patent application right
TA01Transfer of patent application right

Effective date of registration:20200927

Address after:Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant after:Innovative advanced technology Co.,Ltd.

Address before:Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant before:Advanced innovation technology Co.,Ltd.

Effective date of registration:20200927

Address after:Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant after:Advanced innovation technology Co.,Ltd.

Address before:A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before:Alibaba Group Holding Ltd.

RJ01Rejection of invention patent application after publication
RJ01Rejection of invention patent application after publication

Application publication date:20191224

