CN114328481A - A data analysis method for use and construction of digital resources - Google Patents

A data analysis method for use and construction of digital resources
Info

Publication number
CN114328481A
Authority
CN
China
Prior art keywords
data
index table
sensitive
sensitive factors
factors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111496809.4A
Other languages
Chinese (zh)
Other versions
CN114328481B (en)
Inventor
刘金梅
曲秋莳
李军
王小娟
张荐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Vocational College Of Transportation
Original Assignee
Beijing Vocational College Of Transportation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Vocational College Of Transportation
Priority to CN202111496809.4A
Publication of CN114328481A
Application granted
Publication of CN114328481B
Status: Active
Anticipated expiration

Abstract

The invention discloses a data analysis method for the use and construction of digital resources, comprising the following steps: A. classifying the data to be used according to content, then cleaning each class of data; B. building an index table for the data processed in step A, and integrating it with the index table of the resource library by adding foreign keys; C. running a simulation on the integrated resource library, and updating the resource library index table integrated in step B according to the result. The invention remedies the deficiencies of the prior art and increases the data update speed of the digital resource library.

Description

Translated from Chinese

A data analysis method for use and construction of digital resources

Technical Field

The invention relates to the technical field of databases, and in particular to a data analysis method for the use and construction of digital resources.

Background

A digital resource library is a type of database widely used across industries. To keep its data current, the library must be updated regularly. Because each update involves a large volume of data, updates are slow, which makes the library less convenient to use.

Summary of the Invention

The technical problem to be solved by the present invention is to provide a data analysis method for the use and construction of digital resources that overcomes the deficiencies of the prior art and increases the data update speed of the digital resource library.

To solve the above technical problem, the present invention adopts the following technical solution.

A data analysis method for the use and construction of digital resources, comprising the following steps:

A. Classify the data to be used according to content, then clean each class of data;

B. Build an index table for the data processed in step A, and integrate it with the resource library index table by adding foreign keys;

C. Run a simulation on the integrated resource library, and update the resource library index table integrated in step B according to the result.
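The foreign-key integration in step B can be illustrated with a small relational sketch. The patent does not give a schema, so SQLite and the table and column names here (`repo_index`, `new_index`, `repo_id`, `factor_set`) are assumptions chosen only for illustration.

```python
import sqlite3

def integrate_indexes(conn):
    """Create a repository index and a new index table linked to it by a foreign key."""
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE repo_index (id INTEGER PRIMARY KEY, label TEXT)")
    conn.execute(
        "CREATE TABLE new_index ("
        " id INTEGER PRIMARY KEY,"
        " factor_set TEXT,"
        " repo_id INTEGER REFERENCES repo_index(id))"  # hypothetical foreign key
    )

conn = sqlite3.connect(":memory:")
integrate_indexes(conn)
conn.execute("INSERT INTO repo_index (id, label) VALUES (1, 'courseware')")
conn.execute("INSERT INTO new_index (factor_set, repo_id) VALUES ('a,b', 1)")

# Rows of the new index can now be resolved against the repository index.
rows = conn.execute(
    "SELECT n.factor_set, r.label FROM new_index n "
    "JOIN repo_index r ON n.repo_id = r.id"
).fetchall()
```

With the key in place, a row in the new index cannot point at a repository entry that does not exist, which is one plausible reading of what "integrating" the two tables buys.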

Preferably, in step A, data cleaning includes the following steps:

A1. Extract the sensitive factors of each class of data;

A2. Cluster the sensitive factors into groups by similarity, then assign all sensitive factors in the same group one shared priority, determined by the number of sensitive factors in that group;

A3. Delete data that contains no sensitive factors;

A4. Group the data that contains sensitive factors according to the highest-priority sensitive factor each item contains;

A5. Delete duplicate data within each group;

A6. Run a simulation on the remaining data, then exchange the non-highest-priority sensitive factors in the remaining data and run the simulation again; compare the two results, and merge the data for which the deviation between the results before and after the exchange is below a set threshold;

A7. Repeat step A6 until no data meets the merging condition, then stop.
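Steps A2 to A5 can be sketched as follows. This is a minimal illustration under several assumptions that the patent does not state: sensitive factors are strings, a record "contains" a factor when the factor is a substring of it, similarity is measured with `difflib.SequenceMatcher`, and larger groups receive higher priority. The function names are hypothetical.

```python
from difflib import SequenceMatcher

def cluster_factors(factors, threshold=0.6):
    """A2 (first half): greedy grouping against each group's first member."""
    groups = []
    for factor in factors:
        for group in groups:
            if SequenceMatcher(None, factor, group[0]).ratio() >= threshold:
                group.append(factor)
                break
        else:
            groups.append([factor])
    return groups

def assign_priorities(groups):
    """A2 (second half): one priority per group; 0 is highest.

    Ranking larger groups first is an assumption; the patent only says the
    priority depends on the number of factors in the group.
    """
    ranked = sorted(groups, key=len, reverse=True)
    return {factor: rank for rank, group in enumerate(ranked) for factor in group}

def clean(records, priority):
    """A3 to A5: drop factor-free records, group by top factor, de-duplicate."""
    grouped = {}
    for record in records:
        present = [f for f in priority if f in record]
        if not present:
            continue  # A3: no sensitive factor, so the record is deleted
        top = min(present, key=priority.get)  # A4: highest-priority factor
        bucket = grouped.setdefault(top, [])
        if record not in bucket:  # A5: skip exact duplicates
            bucket.append(record)
    return grouped

priority = assign_priorities(cluster_factors(["index", "indexes", "queue"]))
cleaned = clean(["index table", "index table", "queue depth", "plain text"], priority)
```

Here `"index"` and `"indexes"` fall into one group and share priority 0, while `"queue"` forms its own lower-priority group.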

Preferably, in step A1, extracting the sensitive factors of each class of data includes the following steps:

A11. Mark the data content, placing at least two marks on each data item;

A12. Randomly replace the content at the marked positions, run a test function on the data before and after the replacement, and compute the deviation between the two results;

A13. Repeat step A12, moving the marks to new positions before each repetition, until the deviation exceeds a preset threshold or the number of repetitions reaches a preset limit; then end the test and select the marked content with the largest deviation as the sensitive factor.
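The perturbation loop of steps A11 to A13 might look like the sketch below. It assumes a data item is a list of tokens, that the test function returns a number, and that replacements are drawn from a caller-supplied `pool`; none of these details come from the patent. For reproducibility the marks are shifted in a fixed rotation rather than at random.

```python
import random

def extract_sensitive_factor(tokens, test_fn, pool,
                             threshold=0.5, max_rounds=20, seed=0):
    """Return (token, deviation) for the marked token whose replacement moves test_fn most."""
    rng = random.Random(seed)
    baseline = test_fn(tokens)
    best_token, best_dev = None, -1.0
    n = len(tokens)  # assumed to be at least 2, matching the two-mark rule in A11
    for round_no in range(max_rounds):
        # A11/A13: two marks per round, moved to new positions each round.
        marked = {round_no % n, (round_no + 1) % n}
        for pos in sorted(marked):
            mutated = list(tokens)
            mutated[pos] = rng.choice(pool)  # A12: random replacement
            deviation = abs(test_fn(mutated) - baseline)
            if deviation > best_dev:
                best_token, best_dev = tokens[pos], deviation
        if best_dev > threshold:
            break  # A13: stop early once the deviation exceeds the threshold
    return best_token, best_dev

# Toy test function: the result depends only on whether "KEY" is present.
factor, deviation = extract_sensitive_factor(
    ["a", "b", "KEY", "c"],
    test_fn=lambda t: 1.0 if "KEY" in t else 0.0,
    pool=["x", "y"],
)
```

In this toy case replacing any token other than `"KEY"` leaves the result unchanged, so `"KEY"` is selected as the sensitive factor.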

Preferably, in step B, building the index table for the data processed in step A includes the following steps:

B11. For each data item, build the set of sensitive factors it contains, and establish an association function between the sensitive factor set and the data item;

B12. Build a two-level index table. The first-level index table holds the association functions, grouped by similarity and stored by group; the second-level index table holds the sensitive factor sets, stored as a queue;

B13. To retrieve data, first search the second-level index table for sensitive factor sets identical and/or similar to the sensitive factor set of the target data; then use the first-level index table to find the association functions related to the sensitive factors found in the second-level index table; finally, locate the target data through the association functions in the group that contains the association functions found.
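A compact sketch of the two-level structure in steps B11 to B13 follows. The patent does not define the association function or the similarity measure, so this sketch models an association as a `(factor_set, record)` pair and uses Jaccard similarity; the class and method names are hypothetical.

```python
from collections import deque

def jaccard(a, b):
    """Similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

class TwoLevelIndex:
    def __init__(self, min_similarity=0.5):
        self.min_similarity = min_similarity
        self.factor_sets = deque()  # second level: factor sets kept in a queue
        self.groups = []            # first level: groups of (factor_set, record)

    def add(self, factors, record):
        """B11/B12: register the factor set and file the association in a group."""
        fs = frozenset(factors)
        if fs not in self.factor_sets:
            self.factor_sets.append(fs)
        for group in self.groups:
            # Join the first group whose founding association is similar enough.
            if jaccard(fs, group[0][0]) >= self.min_similarity:
                group.append((fs, record))
                return
        self.groups.append([(fs, record)])

    def search(self, factors):
        """B13: second level first, then first level, then the whole group."""
        target = frozenset(factors)
        hits = {fs for fs in self.factor_sets
                if jaccard(fs, target) >= self.min_similarity}
        found = []
        for group in self.groups:
            if any(fs in hits for fs, _ in group):
                found.extend(record for _, record in group)
        return found

index = TwoLevelIndex()
index.add({"a", "b"}, "doc1")
index.add({"a", "b", "c"}, "doc2")
index.add({"x"}, "doc3")
```

A query for `{"a", "b"}` matches two factor sets at the second level, both of which sit in the same first-level group, so the search returns every record in that group.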

Preferably, in step C, updating the resource library index table integrated in step B includes the following steps:

C1. Update the sensitive factor sets according to the simulation result;

C2. Update the second-level index table according to the updated sensitive factor sets.

The beneficial effects of the above technical solution are as follows. By extracting sensitive factors and using them as the constraining parameters for data cleaning, the invention reduces the amount of verification computation that cleaning requires while improving cleaning accuracy. In addition, building a two-level index structure comprising association functions and sensitive factors improves data retrieval efficiency. Furthermore, each index update only needs to touch the second-level index table, which holds the sensitive factor sets, so updates require less computation.

Description of Drawings

FIG. 1 is a flow chart of a specific embodiment of the present invention.

Detailed Description

Referring to FIG. 1, a specific embodiment of the present invention includes the following steps:

A. Classify the data to be used according to content, then clean each class of data;

B. Build an index table for the data processed in step A, and integrate it with the resource library index table by adding foreign keys;

C. Run a simulation on the integrated resource library, and update the resource library index table integrated in step B according to the result.

In step A, data cleaning includes the following steps:

A1. Extract the sensitive factors of each class of data;

A2. Cluster the sensitive factors into groups by similarity, then assign all sensitive factors in the same group one shared priority, determined by the number of sensitive factors in that group;

A3. Delete data that contains no sensitive factors;

A4. Group the data that contains sensitive factors according to the highest-priority sensitive factor each item contains;

A5. Delete duplicate data within each group;

A6. Run a simulation on the remaining data, then exchange the non-highest-priority sensitive factors in the remaining data and run the simulation again; compare the two results, and merge the data for which the deviation between the results before and after the exchange is below a set threshold;

A7. Repeat step A6 until no data meets the merging condition, then stop.
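The merge loop of steps A6 and A7 leaves several details open, so the following is only one possible reading. It assumes each record is a dict with a highest-priority factor (`top`), a set of lower-priority factors (`rest`), and a numeric `value`, and that "exchanging" means swapping the `rest` sets of two records in the same group. The `simulate` callable and all field names are hypothetical.

```python
def merge_pass(records, simulate, threshold):
    """A6: merge the first same-group pair whose results barely move after the swap."""
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            a, b = records[i], records[j]
            if a["top"] != b["top"]:
                continue  # only records sharing the top factor are compared
            swapped_a = dict(a, rest=b["rest"])
            swapped_b = dict(b, rest=a["rest"])
            drift = max(abs(simulate(a) - simulate(swapped_a)),
                        abs(simulate(b) - simulate(swapped_b)))
            if drift < threshold:
                merged = dict(a, rest=a["rest"] | b["rest"])
                remaining = records[:i] + [merged] + records[i + 1:j] + records[j + 1:]
                return remaining, True
    return records, False

def merge_until_stable(records, simulate, threshold):
    """A7: repeat the pass until no pair meets the merging condition."""
    changed = True
    while changed:
        records, changed = merge_pass(records, simulate, threshold)
    return records

# Toy simulation: the result grows slightly with each secondary factor.
simulate = lambda r: r["value"] + 0.1 * len(r["rest"])
data = [
    {"top": "k", "rest": frozenset({"p"}), "value": 1.0},
    {"top": "k", "rest": frozenset({"q"}), "value": 1.0},
    {"top": "k", "rest": frozenset({"p", "q", "r", "s"}), "value": 5.0},
    {"top": "m", "rest": frozenset({"p"}), "value": 1.0},
]
result = merge_until_stable(data, simulate, threshold=0.05)
```

Each successful pass removes one record, so the loop always terminates. In the toy data only the first two records are interchangeable, and they collapse into one.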

In step A1, extracting the sensitive factors of each class of data includes the following steps:

A11. Mark the data content, placing at least two marks on each data item;

A12. Randomly replace the content at the marked positions, run a test function on the data before and after the replacement, and compute the deviation between the two results;

A13. Repeat step A12, moving the marks to new positions before each repetition, until the deviation exceeds a preset threshold or the number of repetitions reaches a preset limit; then end the test and select the marked content with the largest deviation as the sensitive factor.

In step B, building the index table for the data processed in step A includes the following steps:

B11. For each data item, build the set of sensitive factors it contains, and establish an association function between the sensitive factor set and the data item;

B12. Build a two-level index table. The first-level index table holds the association functions, grouped by similarity and stored by group; the second-level index table holds the sensitive factor sets, stored as a queue;

B13. To retrieve data, first search the second-level index table for sensitive factor sets identical and/or similar to the sensitive factor set of the target data; then use the first-level index table to find the association functions related to the sensitive factors found in the second-level index table; finally, locate the target data through the association functions in the group that contains the association functions found.

In step C, updating the resource library index table integrated in step B includes the following steps:

C1. Update the sensitive factor sets according to the simulation result;

C2. Update the second-level index table according to the updated sensitive factor sets.
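Steps C1 and C2 confine an update to the second-level table. A minimal sketch, assuming the second level is a queue of `(record_id, factor_set)` pairs and the first level refers to records only by ID (both are assumptions, as is the function name):

```python
from collections import deque

def update_second_level(second_level, record_id, new_factors):
    """Replace the queued factor set for record_id; return True if anything changed."""
    new_set = frozenset(new_factors)
    for pos, (rid, factors) in enumerate(second_level):
        if rid == record_id:
            if factors == new_set:
                return False  # simulation produced no change for this record
            second_level[pos] = (rid, new_set)  # C2: in-place update
            return True
    second_level.append((record_id, new_set))  # unseen record: enqueue it
    return True

second_level = deque([("doc1", frozenset({"a"})), ("doc2", frozenset({"b"}))])
first_level = {"group0": ["doc1", "doc2"]}  # never touched by the update

changed = update_second_level(second_level, "doc1", {"a", "c"})
```

Because the first-level groups in this sketch hold only record IDs, they stay valid when a factor set changes, which is consistent with the patent's claim that only the second-level table needs updating.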

By improving the data cleaning process, the invention effectively increases the data update speed of the digital resource library.

In the description of the present invention, it should be understood that orientation or position terms such as "longitudinal", "transverse", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" are based on the orientations or positions shown in the drawings. They are used only for convenience of description and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they therefore should not be construed as limiting the invention.

The basic principles, main features, and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the invention is not limited by the above embodiment; the embodiment and the description only illustrate the principles of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection is defined by the appended claims and their equivalents.

Claims (5)

Translated from Chinese

1. A data analysis method for the use and construction of digital resources, characterized by comprising the following steps:
A. Classify the data to be used according to content, then clean each class of data;
B. Build an index table for the data processed in step A, and integrate it with the resource library index table by adding foreign keys;
C. Run a simulation on the integrated resource library, and update the resource library index table integrated in step B according to the result.

2. The data analysis method for the use and construction of digital resources according to claim 1, characterized in that, in step A, data cleaning includes the following steps:
A1. Extract the sensitive factors of each class of data;
A2. Cluster the sensitive factors into groups by similarity, then assign all sensitive factors in the same group one shared priority, determined by the number of sensitive factors in that group;
A3. Delete data that contains no sensitive factors;
A4. Group the data that contains sensitive factors according to the highest-priority sensitive factor each item contains;
A5. Delete duplicate data within each group;
A6. Run a simulation on the remaining data, then exchange the non-highest-priority sensitive factors in the remaining data and run the simulation again; compare the two results, and merge the data for which the deviation between the results before and after the exchange is below a set threshold;
A7. Repeat step A6 until no data meets the merging condition, then stop.

3. The data analysis method for the use and construction of digital resources according to claim 2, characterized in that, in step A1, extracting the sensitive factors of each class of data includes the following steps:
A11. Mark the data content, placing at least two marks on each data item;
A12. Randomly replace the content at the marked positions, run a test function on the data before and after the replacement, and compute the deviation between the two results;
A13. Repeat step A12, moving the marks to new positions before each repetition, until the deviation exceeds a preset threshold or the number of repetitions reaches a preset limit; then end the test and select the marked content with the largest deviation as the sensitive factor.

4. The data analysis method for the use and construction of digital resources according to claim 2, characterized in that, in step B, building the index table for the data processed in step A includes the following steps:
B11. For each data item, build the set of sensitive factors it contains, and establish an association function between the sensitive factor set and the data item;
B12. Build a two-level index table. The first-level index table holds the association functions, grouped by similarity and stored by group; the second-level index table holds the sensitive factor sets, stored as a queue;
B13. To retrieve data, first search the second-level index table for sensitive factor sets identical and/or similar to the sensitive factor set of the target data; then use the first-level index table to find the association functions related to the sensitive factors found in the second-level index table; finally, locate the target data through the association functions in the group that contains the association functions found.

5. The data analysis method for the use and construction of digital resources according to claim 4, characterized in that, in step C, updating the resource library index table integrated in step B includes the following steps:
C1. Update the sensitive factor sets according to the simulation result;
C2. Update the second-level index table according to the updated sensitive factor sets.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111496809.4A, CN114328481B (en) | 2021-12-08 | 2021-12-08 | A data analysis method for digital resource usage construction

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202111496809.4A, CN114328481B (en) | 2021-12-08 | 2021-12-08 | A data analysis method for digital resource usage construction

Publications (2)

Publication Number | Publication Date
CN114328481A (en) | 2022-04-12
CN114328481B (en) | 2025-09-23

Family

ID=81050099

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status
CN202111496809.4A, CN114328481B (en) | A data analysis method for digital resource usage construction | 2021-12-08 | 2021-12-08 | Active

Country Status (1)

Country | Link
CN (1) | CN114328481B (en)

Citations (8)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN104714252A (en)* | 2014-03-04 | 2015-06-17 | 中国石油化工股份有限公司 | Method for analyzing fluid factor sensibility
CN108287835A (en)* | 2017-01-09 | 2018-07-17 | 腾讯科技(深圳)有限公司 | A kind of data clearing method and device
CN109063222A (en)* | 2018-11-04 | 2018-12-21 | 吉铁磊 | A kind of self-adapting data searching method based on big data
CN109241023A (en)* | 2018-09-21 | 2019-01-18 | 郑州云海信息技术有限公司 | Distributed memory system date storage method, device, system and storage medium
CN110427655A (en)* | 2019-07-09 | 2019-11-08 | 中国地质大学(武汉) | A kind of extracting method for the sensitiveness that comes down
CN111048190A (en)* | 2019-11-29 | 2020-04-21 | 挂号网(杭州)科技有限公司 | DRG grouping method based on artificial intelligence
US20200184028A1 (en)* | 2018-12-10 | 2020-06-11 | Institute For Information Industry | Optimization method and module thereof based on feature extraction and machine learning
WO2021143016A1 (en)* | 2020-01-15 | 2021-07-22 | 平安科技(深圳)有限公司 | Approximate data processing method and apparatus, medium and electronic device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐松等: "改进的敏感性分析方法在桩端注浆效果评价中的应用", 施工技术, no. 10, 25 May 2015 (2015-05-25)*

Also Published As

Publication number | Publication date
CN114328481B (en) | 2025-09-23

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
