CN111737407B - Event unique ID construction method based on event disambiguation - Google Patents

Event unique ID construction method based on event disambiguation

Publication number
CN111737407B
Authority
CN
China
Prior art keywords
event
text
data
index
disambiguation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010860468.3A
Other languages
Chinese (zh)
Other versions
CN111737407A (en)
Inventor
车雨蒙
周凡吟
吴桐
曾途
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Business Big Data Technology Co Ltd
Original Assignee
Chengdu Business Big Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Business Big Data Technology Co Ltd
Priority to CN202010860468.3A
Publication of CN111737407A
Application granted
Publication of CN111737407B
Legal status: Active
Anticipated expiration

Abstract

The invention relates to an event unique ID construction method based on event disambiguation, which comprises the following steps: collecting a plurality of text data and performing preliminary disambiguation on the collected text data; analyzing the text body of each preliminarily disambiguated text data and outputting the corresponding text basic data; and performing deep disambiguation on each output text basic data and outputting text cache data in which each event is unique, wherein the text cache data comprises a unique index ID and an event ID corresponding to the index ID. By disambiguating text data to form a unique ID system for events, the invention addresses the control of data quality, data standards, and data sources in the data acquisition, data fusion, and data analysis steps of building an enterprise data warehouse.

Description

Event unique ID construction method based on event disambiguation
Technical Field
The invention relates to the technical field of data processing and fusion, in particular to an event unique ID construction method based on event disambiguation.
Background
As the big data era deepens, the value of data as a production factor is increasingly evident, and the capabilities of data acquisition, data fusion, and data analysis have gradually become key to an enterprise's data-driven transformation. When an enterprise builds a data warehouse, particularly when massive text data is acquired and analyzed, a standard and uniform ID system for the data brings great convenience to data processing and data fusion.
Disclosure of Invention
The invention aims to construct a unique ID of an event after disambiguation of the event, and provides a unique ID construction method of the event based on event disambiguation.
In order to achieve the above object, the embodiments of the present invention provide the following technical solutions:
The method for constructing an event unique ID based on event disambiguation comprises the following steps:
collecting a plurality of text data, and performing preliminary disambiguation on the collected text data;
analyzing the text body of each preliminarily disambiguated text data, and outputting the corresponding text basic data;
performing deep disambiguation on each output text basic data, and outputting text cache data in which each event is unique, wherein the text cache data comprises a unique index ID and an event ID corresponding to the index ID.
The step of collecting a plurality of text data and preliminarily disambiguating the collected text data comprises the following steps:
extracting the text title, the text source website name, and the text release date from each text data; if several text data have the same text title, text source website name, and text release date, removing the duplicates and retaining only one;
encrypting the text title, the text source website name, and the text release date of each retained text data with the MD5 encryption algorithm to generate a related information unique ID.
The step of respectively carrying out text body analysis on the plurality of text data after the preliminary disambiguation and outputting text basic data comprises the following steps of:
analyzing the text body of each text data by using an NLP (natural language processing) method, and extracting the event type, event subject, start time, end time, and event object from the text body;
linking the extracted event subject and event object to their standard names by entity linking, so that both carry standard names, then matching a subject ID to the event subject and an object ID to the event object;
establishing an index ID according to the extracted event type, event subject, start time, end time, and event object, thereby completing the output of the text basic data.
The step of establishing an index ID according to the extracted event type, event subject, start time, end time and event object includes:
if only one of each of the event type, event subject, start time, end time, and event object is extracted, establishing a single index ID corresponding to them;
if more than one event type, event subject, start time, end time, or event object is extracted, establishing a separate index ID for each combination of event type, event subject, start time, end time, and event object, and appending a distinguishing suffix to each index ID.
The step of performing deep disambiguation on each output text basic data and outputting text cache data comprising an event ID and an index ID comprises:
if the text basic data contains only one index ID, generating an event ID for the index ID by using the MD5 encryption algorithm, and outputting text cache data comprising one event ID-index ID pair;
if the text basic data contains a plurality of index IDs, judging whether the event types, event subjects, start times, end times, and event objects corresponding to the index IDs are the same; if so, removing the duplicate index IDs, generating event IDs for the retained index IDs by using the MD5 encryption algorithm, and outputting text cache data comprising one or more event ID-index ID pairs.
Compared with the prior art, the invention has the beneficial effects that:
By disambiguating text data to form a unique ID system for events, the invention addresses the control of data quality, data standards, and data sources in the data acquisition, data fusion, and data analysis steps of building an enterprise data warehouse.
In scenarios such as multi-source heterogeneous data fusion, standard unified data ID systems, and unstructured data processing, corresponding IDs are matched during disambiguation, which effectively guarantees the uniqueness and traceability of the data and provides a standard basic framework for subsequent data analysis and data mining.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a flowchart of a method for constructing an event unique ID according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method for constructing an event unique ID according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Embodiment:
The invention is realized by the following technical scheme. As shown in FIG. 1 and FIG. 2, the invention provides an event unique ID construction method based on event disambiguation, which comprises the following steps:
Step S100: collecting a plurality of text data, and performing preliminary disambiguation on the collected text data.
The text data is collected by a crawler or other means. The collected text data can be of various kinds, such as news public opinion, basic business information, judicial litigation records, and administrative penalties; this embodiment uses news public opinion as the example text data. For example, if 10 news public opinion items are collected, those 10 items are preliminarily disambiguated.
A plurality of fields, such as text title, text source website name, text release date, version number, keyword, and body, can be extracted from each news public opinion item. The 3 fields text title, text source website name, and text release date are extracted from each item and compared across the 10 items. If all 3 fields of two or more items are identical, those items are regarded as the same news public opinion; the duplicates are removed and only one is kept, which completes the preliminary disambiguation of the 10 collected items.
For example, the 3 fields of the text title, the name of the text source website and the text release date of 4 news opinions are shown in table 1:
Chinese field | English field | News public opinion 1 | News public opinion 2 | News public opinion 3 | News public opinion 4
Text title | news_title | A and B cooperate | A and B cooperate | Marriage of C and D | E acquisition F
Text source website name | news_site | Data view | Data view | Self-service net | Data view
Text release date | pubdate | 2020-01-01 | 2020-01-01 | 2020-01-01 | 2020-01-02
TABLE 1
As can be seen from table 1, the text title, text source website name, and text release date of news public opinion 1 and news public opinion 2 are all identical, which indicates that the two items are the same; the duplicate is removed and only one of them is retained.
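The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `preliminary_disambiguation` and the dictionary layout are hypothetical, and the field names are borrowed from Table 1.

```python
from typing import Dict, Iterable, List, Set, Tuple


def preliminary_disambiguation(items: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first item seen for each (news_title, news_site, pubdate) triple."""
    seen: Set[Tuple[str, str, str]] = set()
    kept: List[Dict[str, str]] = []
    for item in items:
        key = (item["news_title"], item["news_site"], item["pubdate"])
        if key in seen:
            continue  # all three fields match an earlier item: treat as a duplicate
        seen.add(key)
        kept.append(item)
    return kept


# The four items of Table 1; items 1 and 2 share all three fields.
table_1 = [
    {"news_title": "A and B cooperate", "news_site": "Data view", "pubdate": "2020-01-01"},
    {"news_title": "A and B cooperate", "news_site": "Data view", "pubdate": "2020-01-01"},
    {"news_title": "Marriage of C and D", "news_site": "Self-service net", "pubdate": "2020-01-01"},
    {"news_title": "E acquisition F", "news_site": "Data view", "pubdate": "2020-01-02"},
]
deduplicated = preliminary_disambiguation(table_1)
```

Keeping the first occurrence is a design choice of this sketch; the description only requires that one of the identical items be retained.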
After preliminary disambiguation by comparing the 3 fields, the retained text title, text source website name, and text release date of each news public opinion item are encrypted with the MD5 encryption method to generate the related information unique ID. For example, after the related information unique ID is generated for news public opinion 1, it can be combined with the other extractable fields mentioned above to form the field description shown in table 2:
English field | Chinese field
bbd_xgxx_id | Related information unique ID
bbd_table | Table name
bbd_type | Table type
uptime | Timestamp
do_time | Crawl date
version | Version number
bbd_seed | Keyword
bbd_url | Crawled URL
news_title | Text title
pubdate | Text release date
news_site | Text source website name
main | Body text
TABLE 2
For example, after adding the actual value field of news opinion 1 to table 2, a field description as shown in table 3 can be formed:
English field | Value
bbd_xgxx_id | f28a1d555831272ad0a2b7b0922ca564
bbd_table | qyxg_yuqing
bbd_type | sic_chinacoal
uptime | 1557924294
do_time | 2019-05-15
version | 1
bbd_seed | Shulian Mingpin cooperation
bbd_url | http://www.cbdio.com/BigData/2018-12/26/content_5966169.htm
news_title | Shulian Mingpin enters strategic cooperation with leading information service company Yiborui
pubdate | 2018-12-26 10:24:23
news_site | Data view
main | On the morning of December 25, Chengdu Shulian Mingpin Technology Co., Ltd. (BBD for short) and Yiborui Information Technology (Beijing) Co., Ltd. signed a strategic cooperation agreement in Shanghai. Based on their respective technical and market resource advantages, the two parties will jointly explore practical applications of big data in inclusive finance, retail risk management, credit service innovation, and other fields. On August 29, 2018, China Shipbuilding Heavy Industry Group Co., Ltd., the Heilongjiang Provincial People's Government, and the Harbin Municipal People's Government signed an agreement on deepening strategic cooperation in Harbin.
TABLE 3
It should be noted that at least the 3 fields text title, text source website name, and text release date are extracted from each text data; other fields can be defined and extracted as the situation requires, and the English field names corresponding to the Chinese fields can likewise be chosen as needed. Tables 1, 2, and 3 are only examples for ease of understanding.
After the preliminary disambiguation and the field encryption, the text data of each news public opinion item can be output for further analysis.
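A minimal sketch of generating the related information unique ID from the three retained fields is given below. MD5 is strictly a hash function rather than encryption, so the sketch uses Python's `hashlib`. The `|` separator, the UTF-8 encoding, and the function name `related_info_id` are assumptions made for illustration; the description does not state how the three fields are concatenated, so this sketch is not expected to reproduce the value shown in Table 3.

```python
import hashlib


def related_info_id(news_title: str, news_site: str, pubdate: str) -> str:
    """Return a 32-character hex MD5 digest over the three de-duplication fields."""
    # Assumption: fields are joined with "|" and encoded as UTF-8.
    raw = "|".join([news_title, news_site, pubdate])
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


bbd_xgxx_id = related_info_id("A and B cooperate", "Data view", "2020-01-01")
```

Because the digest depends only on the three fields, two items that survive de-duplication always receive different inputs, while re-crawling the same item reproduces the same ID.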
Step S200: performing text body analysis on each of the preliminarily disambiguated text data, and outputting the corresponding text basic data.
The text body of each news public opinion item is analyzed with an NLP (natural language processing) method, and 5 fields are extracted from it: event type, event subject, start time, end time, and event object. The event type, event subject, and start time must be present, while the end time and event object may be empty. For example, a news public opinion item may contain only a subject (event subject), what the subject does (event type), and a start time, with no counterpart object (event object) and no end time.
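The required and optional fields can be captured in a small record type. The class below is a hypothetical container for the output of the extraction step, not the NLP extraction itself; the class name `ExtractedEvent` is an assumption, and the attribute names follow Table 4.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExtractedEvent:
    event_type: str                      # required
    event_subject: str                   # required
    start_time: str                      # required
    end_time: Optional[str] = None       # may be empty
    event_object: Optional[str] = None   # may be empty

    def __post_init__(self) -> None:
        # Reject records that lack any of the three mandatory fields.
        for name in ("event_type", "event_subject", "start_time"):
            if not getattr(self, name):
                raise ValueError(f"{name} is required")


# An event with a subject, a type, and a start time, but no object or end time.
minimal_event = ExtractedEvent("Enterprise collaboration", "BBD", "2018-12-26")
```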
If an event object exists in the news public opinion item, the extracted event subject and event object are linked to their standard names by entity linking, so that both carry standard names. For example, if the event subject extracted from the item is "Shulian Mingpin" or "BBD", it needs to be linked to the standard name "Shulian Mingpin Technology Co., Ltd.", that is, the full name. After the event subject and the event object are linked to their standard names, a subject ID is matched to the event subject and an object ID is matched to the event object. If the extracted event object is an empty field, no standard name linking is needed for the event object; likewise, if the extracted event subject and event object are already standard names, no standard name linking is needed.
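As a rough sketch of this linking step, a dictionary lookup can stand in for a real entity-linking component. Both lookup tables below are hypothetical: the standard names are illustrative placeholders, and the IDs are borrowed from the sample tables only to make the example concrete. A production system would query an entity registry instead.

```python
from typing import Dict, Optional, Tuple

# Hypothetical alias table: short mention -> standard (full) name.
ALIAS_TO_STANDARD: Dict[str, str] = {
    "BBD": "BBD Technology Co., Ltd.",
    "Yiborui": "Yiborui Information Technology Co., Ltd.",
}
# Hypothetical registry: standard name -> entity ID.
ENTITY_IDS: Dict[str, str] = {
    "BBD Technology Co., Ltd.": "17988de145c14f808fd2ffa0dc1399d7",
    "Yiborui Information Technology Co., Ltd.": "88988de145c14f808fd2ffa0dc1399d7",
}


def link_entity(mention: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (standard_name, entity_id) for an extracted subject or object."""
    if not mention:
        return None, None  # empty event object: nothing to link
    # A mention that is already a standard name passes through unchanged.
    standard = ALIAS_TO_STANDARD.get(mention, mention)
    # Entities missing from the registry get no ID.
    return standard, ENTITY_IDS.get(standard)


subject_name, subject_id = link_entity("BBD")
object_name, object_id = link_entity(None)
```

Returning `None` for an unregistered entity mirrors the `null` object IDs that appear in the sample tables, though how the original system handles unknown entities is not stated.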
Since a news public opinion item may include a plurality of event subjects, event objects, or event types, the preliminarily disambiguated text data is analyzed before the text basic data is output.
The number of event types, event subjects, start times, end times, and event objects extracted from the item is determined. If only one of each is extracted, a single index ID corresponding to the event type, event subject, start time, end time, and event object is established directly. If the event object or end time field does not exist, the index ID is established from the fields that do exist. Assume that the text basic data formed after preliminary disambiguation has the fields shown in table 4:
English field | Chinese field
bbd_xgxx_id | Related information unique ID
search_id | Index ID
event_type | Event type
event_subject | Event subject
subject_id | Subject ID
pubdate | Release date
start_time | Start time
end_time | End time
event_object | Event object
object_id | Object ID
TABLE 4
Suppose instead that more than one event type, event subject, start time, end time, or event object is extracted from the text data. For example, the body of a preliminarily disambiguated news public opinion item is shown in table 5:
main | On the morning of December 25, Chengdu Shulian Mingpin Technology Co., Ltd. and Yiborui Information Technology Co., Ltd. signed a strategic cooperation agreement in Shanghai. Based on their respective technical and market resource advantages, the two parties will jointly explore practical applications of big data in inclusive finance, retail risk management, credit service innovation, and other fields. On August 29, 2018, China Shipbuilding Heavy Industry Group Co., Ltd., the Heilongjiang Provincial People's Government, and the Harbin Municipal People's Government signed an agreement on deepening strategic cooperation in Harbin.
TABLE 5
The event type that can be extracted from the body of this item is "enterprise collaboration". The event subjects are "Chengdu Shulian Mingpin Technology Co., Ltd." (hereinafter "Shulian Mingpin") and "China Shipbuilding Heavy Industry Group Co., Ltd." (hereinafter "China Shipbuilding"). The event objects are "Yiborui Information Technology Co., Ltd." (hereinafter "Yiborui"), "Heilongjiang Provincial People's Government", and "Harbin Municipal People's Government".
The event type, event subject, start time, end time, and event object extracted in this way form the text basic data shown in table 6:
English field | Value 1 | Value 2 | Value 3
bbd_xgxx_id | f28a1d555831272ad0a2b7b0922ca564 | f28a1d555831272ad0a2b7b0922ca564 | f28a1d555831272ad0a2b7b0922ca564
search_id | f28a1d555831272ad0a2b7b0922ca564_1 | f28a1d555831272ad0a2b7b0922ca564_2_1 | f28a1d555831272ad0a2b7b0922ca564_2_2
event_type | Enterprise collaboration | Enterprise collaboration | Enterprise collaboration
event_subject | Shulian Mingpin | China Shipbuilding Heavy Industry Group Co.,Ltd. | China Shipbuilding Heavy Industry Group Co.,Ltd.
subject_id | 17988de145c14f808fd2ffa0dc1399d7 | 89988de145c14f808fd2ffa0dc1399d7 | 89988de145c14f808fd2ffa0dc1399d7
pubdate | 2015-06-15 12:15:00 | 2018-09-03 00:00:00 | 2018-09-03 00:00:00
start_time | 2018-12-26 | 2018-08-29 | 2018-08-29
end_time | | |
event_object | Yiborui | Government of Heilongjiang province | Harbin city government
object_id | 88988de145c14f808fd2ffa0dc1399d7 | null | null
TABLE 6
As can be seen from table 6, the text body contains three events; each has its own event subject, event object, and start time, and none has an end time. An index ID is established for each of the three combinations of event type, event subject, start time, and event object, and a distinguishing suffix is appended to each index ID.
For example, the three events involve two event subjects, "Shulian Mingpin" and "China Shipbuilding". A first-level distinguishing suffix _1 is appended to the index ID corresponding to "Shulian Mingpin", and a first-level distinguishing suffix _2 is appended to the index ID corresponding to "China Shipbuilding". "China Shipbuilding" has two event objects, "Heilongjiang Provincial People's Government" and "Harbin Municipal People's Government", so a second-level distinguishing suffix _1 is appended to the index ID corresponding to "Heilongjiang Provincial People's Government", giving _2_1, and a second-level distinguishing suffix _2 is appended to the index ID corresponding to "Harbin Municipal People's Government", giving _2_2. These suffixes distinguish the index IDs from one another.
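The suffix scheme can be sketched as follows. The sketch assumes that the index ID is the related information unique ID plus suffixes, as the sample `search_id` values suggest, that the first-level number follows the order in which subjects first appear, and that a second-level number is added only when one subject has several events. Returning the bare ID for a single event is also an assumption, and the function name `build_index_ids` is hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def build_index_ids(base_id: str, events: List[Tuple[str, Optional[str]]]) -> List[str]:
    """events holds (event_subject, event_object) pairs in extraction order."""
    if len(events) == 1:
        return [base_id]  # assumption: a single event needs no distinguishing suffix
    # Group event positions by subject, preserving first-appearance order.
    groups: Dict[str, List[int]] = {}
    for position, (subject, _event_object) in enumerate(events):
        groups.setdefault(subject, []).append(position)
    index_ids = [""] * len(events)
    for subject_no, positions in enumerate(groups.values(), start=1):
        for object_no, position in enumerate(positions, start=1):
            suffix = f"_{subject_no}"          # first-level suffix: the subject
            if len(positions) > 1:
                suffix += f"_{object_no}"      # second-level suffix: the object
            index_ids[position] = base_id + suffix
    return index_ids


base = "f28a1d555831272ad0a2b7b0922ca564"
table_6_ids = build_index_ids(base, [
    ("BBD", "Yiborui"),
    ("China Shipbuilding", "Government of Heilongjiang province"),
    ("China Shipbuilding", "Harbin city government"),
])
```

With the three events of the example, this yields the suffixes `_1`, `_2_1`, and `_2_2`.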
Step S300: performing deep disambiguation on each output text basic data, and outputting text cache data, wherein the text cache data comprises a unique index ID and an event ID corresponding to the index ID.
If the text basic data contains only one index ID, an event ID is generated for the index ID with the MD5 encryption algorithm, and text cache data is output; the text cache data comprises one unique index ID and the event ID corresponding to it, that is, one "event ID-index ID" pair. If the event object or end time field corresponding to the index ID is empty, the empty event object is still matched with an empty object ID, so that at least the four fields event type, subject ID, object ID, and start time are encrypted together with MD5; the event ID cannot be formed by encrypting directly without the object ID field.
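One possible reading of this rule is sketched below: the event ID is the MD5 digest of the four fields, and the object ID always occupies its slot even when it is empty. The field order, the `|` separator, the UTF-8 encoding, and the function name `make_event_id` are assumptions; the description says only that at least these four fields are hashed together.

```python
import hashlib
from typing import Optional


def make_event_id(event_type: str, subject_id: str,
                  object_id: Optional[str], start_time: str) -> str:
    """Hash the four fields; an empty object ID still occupies its position."""
    # Assumption: a missing object ID is represented by an empty string.
    fields = [event_type, subject_id, object_id or "", start_time]
    return hashlib.md5("|".join(fields).encode("utf-8")).hexdigest()


with_object = make_event_id("Enterprise collaboration",
                            "17988de145c14f808fd2ffa0dc1399d7",
                            "88988de145c14f808fd2ffa0dc1399d7",
                            "2018-12-26")
without_object = make_event_id("Enterprise collaboration",
                               "89988de145c14f808fd2ffa0dc1399d7",
                               None,
                               "2018-08-29")
```

Keeping the empty slot means the hashed string always has the same shape, so an event without an object cannot collide with one whose fields merely shift position.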
If the text basic data contains a plurality of index IDs, it is determined whether the "event type-event subject-start time-end time-event object" combinations corresponding to those index IDs are the same. If several index IDs correspond to the same combination, the duplicate index IDs are removed, and an event ID is generated for each retained index ID with the MD5 encryption algorithm. This completes the deep disambiguation of the text basic data and forms text cache data in which each event is unique; the text cache data comprises one or more unique index IDs and the event IDs corresponding to them, that is, one or more "event ID-index ID" pairs.
For example, the "event type-event subject-start time-end time-event object" combinations corresponding to three index IDs extracted from the text basic data of a news public opinion item are shown in table 7:
English field | Value 1 | Value 2 | Value 3
bbd_xgxx_id | f28a1d555831272ad0a2b7b0922ca564 | f28a1d555831272ad0a2b7b0922ca564 | f28a1d555831272ad0a2b7b0922ca564
search_id | f28a1d555831272ad0a2b7b0922ca564_1_1 | f28a1d555831272ad0a2b7b0922ca564_1_2 | f28a1d555831272ad0a2b7b0922ca564_2
event_type | Enterprise collaboration | Enterprise collaboration | Enterprise collaboration
event_subject | Shulian Mingpin | Shulian Mingpin | China Shipbuilding Heavy Industry Group Co.,Ltd.
subject_id | 17988de145c14f808fd2ffa0dc1399d7 | 89988de145c14f808fd2ffa0dc1399d7 | 89988de145c14f808fd2ffa0dc1399d7
pubdate | 2015-06-15 12:15:00 | 2015-06-15 12:15:00 | 2018-09-03 00:00:00
start_time | 2018-12-26 | 2018-12-26 | 2018-08-29
end_time | 2018-12-28 | 2018-12-28 |
event_object | Yiborui | Yiborui | Harbin city government
object_id | 88988de145c14f808fd2ffa0dc1399d7 | 88988de145c14f808fd2ffa0dc1399d7 | null
TABLE 7
As can be seen from table 7, the "event type-event subject-start time-end time-event object" combination of the first index ID is identical to that of the second index ID, which indicates that the two index IDs correspond to the same event. The duplicate is therefore removed and only one is kept, which completes the deep disambiguation of the text basic data and yields the event-unique text basic data shown in table 8:
English field | Value 1 | Value 2
bbd_xgxx_id | f28a1d555831272ad0a2b7b0922ca564 | f28a1d555831272ad0a2b7b0922ca564
search_id | f28a1d555831272ad0a2b7b0922ca564_1_1 | f28a1d555831272ad0a2b7b0922ca564_2
event_type | Enterprise collaboration | Enterprise collaboration
event_subject | Shulian Mingpin | China Shipbuilding Heavy Industry Group Co.,Ltd.
subject_id | 17988de145c14f808fd2ffa0dc1399d7 | 89988de145c14f808fd2ffa0dc1399d7
pubdate | 2015-06-15 12:15:00 | 2018-09-03 00:00:00
start_time | 2018-12-26 | 2018-08-29
end_time | 2018-12-28 |
event_object | Yiborui | Harbin city government
object_id | 88988de145c14f808fd2ffa0dc1399d7 | null
TABLE 8
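The step from Table 7 to Table 8 can be sketched as a de-duplication keyed on the five-field combination. The function name `deep_disambiguation` and the row layout are hypothetical, and the rows below are abbreviated from Table 7, with "BBD" standing in for the first event subject.

```python
from typing import Dict, List, Optional, Set, Tuple

Row = Dict[str, Optional[str]]


def deep_disambiguation(rows: List[Row]) -> List[Row]:
    """Keep the first row for each (type, subject, start, end, object) combination."""
    seen: Set[Tuple[str, str, str, str, str]] = set()
    kept: List[Row] = []
    for row in rows:
        key = (
            row["event_type"] or "",
            row["event_subject"] or "",
            row["start_time"] or "",
            row.get("end_time") or "",      # empty end time compares as ""
            row.get("event_object") or "",  # empty event object compares as ""
        )
        if key in seen:
            continue  # same event as an earlier index ID: drop this index ID
        seen.add(key)
        kept.append(row)
    return kept


base = "f28a1d555831272ad0a2b7b0922ca564"
table_7 = [
    {"search_id": base + "_1_1", "event_type": "Enterprise collaboration", "event_subject": "BBD",
     "start_time": "2018-12-26", "end_time": "2018-12-28", "event_object": "Yiborui"},
    {"search_id": base + "_1_2", "event_type": "Enterprise collaboration", "event_subject": "BBD",
     "start_time": "2018-12-26", "end_time": "2018-12-28", "event_object": "Yiborui"},
    {"search_id": base + "_2", "event_type": "Enterprise collaboration", "event_subject": "China Shipbuilding",
     "start_time": "2018-08-29", "end_time": None, "event_object": "Harbin city government"},
]
table_8 = deep_disambiguation(table_7)
```

As in Table 8, the index IDs ending in `_1_1` and `_2` survive, and the duplicate ending in `_1_2` is dropped.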
For each index ID retained in the deeply disambiguated text basic data, a unique event ID corresponding to that index ID is generated with the MD5 encryption algorithm, so that the data can be traced back to its source; that is, the text cache data shown in table 9 is stored in a standard library for use by subsequent services:
English field | Chinese field
event_id | Event ID
search_id | Index ID
TABLE 9
After disambiguation processing and ID matching are carried out on every news public opinion item, the event-unique text cache data corresponding to each text data can finally be output.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

CN202010860468.3A | 2020-08-25 | 2020-08-25 | Event unique ID construction method based on event disambiguation | Active | CN111737407B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010860468.3A (CN111737407B (en)) | 2020-08-25 | 2020-08-25 | Event unique ID construction method based on event disambiguation

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202010860468.3A (CN111737407B (en)) | 2020-08-25 | 2020-08-25 | Event unique ID construction method based on event disambiguation

Publications (2)

Publication Number | Publication Date
CN111737407A (en) | 2020-10-02
CN111737407B (en) | 2020-11-10

Family

ID=72658835

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202010860468.3A (Active, CN111737407B (en)) | Event unique ID construction method based on event disambiguation | 2020-08-25 | 2020-08-25

Country Status (1)

Country | Link
CN (1) | CN111737407B (en)

Also Published As

Publication number | Publication date
CN111737407A (en) | 2020-10-02

Legal Events

Date | Code | Title | Description
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | GR01 | Patent grant |
 | PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: Event unique ID construction method based on event disambiguation; Effective date of registration: 20210305; Granted publication date: 20201110; Pledgee: Agricultural Bank of China Limited Chengdu Shudu sub branch; Pledgor: CHENGDU BUSINESS BIG DATA TECHNOLOGY Co.,Ltd.; Registration number: Y2021980001476
 | PP01 | Preservation of patent right | Effective date of registration: 20240428; Granted publication date: 20201110
