CN120196616A - A data collection method and system for device service management platform - Google Patents

A data collection method and system for device service management platform

Info

Publication number
CN120196616A
Authority
CN
China
Prior art keywords
data
screening
similarity
paragraph
frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510324745.1A
Other languages
Chinese (zh)
Inventor
应建群
苏国伟
沈秀利
梁霄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Shuode Software Co ltd
Original Assignee
Hangzhou Shuode Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Shuode Software Co ltd
Priority to CN202510324745.1A
Publication of CN120196616A
Legal status: Pending

Abstract

Translated from Chinese

The invention discloses a data collection method and system for a device service management platform, relates to the technical field of data collection, and proposes intelligent screening and sorting to address low data collection efficiency and wasted computing resources. The data call frequency, data fluctuation value, data validity period duration and data area coverage value are collected, a data analysis model is established, and a logistic regression calculation is performed to obtain a screening evaluation coefficient, which is compared with a preset screening threshold. According to the comparison result, the device data paragraphs for priority screening are obtained, and the data screening paragraphs are then determined according to the data similarity of those paragraphs. For the data screening paragraphs, a group of fuzzy rules is formulated based on the call frequency of the data screening paragraph, the real-time average upload speed of the user end, and how frequently the data of the data screening paragraph is distributed across different regions; fuzzy reasoning is then performed to determine the call sorting result of the data screening paragraphs.

Description

Data acquisition method and system for equipment service management platform
Technical Field
The invention relates to the technical field of data acquisition, in particular to a data acquisition method and system for a device service management platform.
Background
With the rapid development of Internet of Things technology, demand for intelligent equipment across industries is increasing year by year. The equipment service management platform, as a core tool for remote equipment monitoring, operation and maintenance management and intelligent decision making, is widely applied in industrial manufacturing, energy management, smart cities and other fields. Its core is grasping the running state of distributed equipment in real time, which depends on an efficient and accurate data acquisition function.
The prior art has the following defects:
At present, data acquisition refers to acquiring operating parameters and state information from distributed equipment and uploading them to a management platform to support subsequent analysis, storage and decision making. In practical applications, however, the data acquired from each piece of equipment is not subjected to effective intelligent screening, which reduces data acquisition efficiency. In addition, stale data retained by some equipment consumes extra computing resources when it is included in comprehensive calculation, and it lowers the accuracy of the platform's data processing. Therefore, a data acquisition method and system for an equipment service management platform are provided.
The above information disclosed in the background section is only for enhancement of understanding of the background of the disclosure and therefore it may include information that does not form the prior art that is already known to a person of ordinary skill in the art.
Disclosure of Invention
In order to overcome the above-mentioned drawbacks of the prior art, embodiments of the present invention provide a data collection method and system for an equipment service management platform, which solve the problems set forth in the above-mentioned background art by applying different product inspection methods.
In order to achieve the above object, the present invention provides a data acquisition method for an equipment service management platform, including:
S1, collecting data platform characteristic information and data service characteristic information from the equipment update log database, and performing data processing to obtain the data calling frequency, data fluctuation value, data valid period duration and data area coverage value;
S2, acquiring data calling frequency, data fluctuation value, data valid period duration and data area coverage value, establishing a data analysis model, and performing logistic regression calculation to obtain screening evaluation coefficients;
S3, acquiring the screening evaluation coefficient, comparing it with a preset screening threshold value, obtaining the preferentially screened equipment data paragraphs from the equipment data paragraphs marked as priority paragraphs in the comparison result, and determining the data screening paragraphs according to the data similarity of each preferentially screened equipment data paragraph;
S4, acquiring the data screening paragraphs and, for each data screening paragraph, acquiring the real-time average uploading speed of the user side calling the data and the data distribution frequency of the data screening paragraph in different areas;
S5, determining the data screening paragraph calling sequencing result by using fuzzy logic according to the real-time average uploading speed of the user side calling the data screening paragraph and the data distribution frequency of the data screening paragraphs in different areas.
In a preferred embodiment, the data platform characteristic information comprises data calling frequency and data fluctuation value, and the data service characteristic information comprises data valid period duration and data area coverage value;
Determining the unit time to be analyzed, recording the event of each data call, counting the total number of data calls occurring in the set unit time, and calculating the ratio of that total to the unit time length to obtain the data call frequency of the g-th data tag in the i-th unit time;
acquiring operation parameter time sequence data of different data labels from equipment operation data records, and calculating the standard deviation of the operation parameter values relative to their mean value to obtain the data fluctuation value;
Acquiring the valid period ending time of the data and the current time from the data information database, and subtracting the current time from the valid period ending time to obtain the data valid period duration;
Using the longitude and latitude coordinates of the data of each data tag point to obtain the geographic position distribution coordinate area of the data tag, and calculating the ratio of the geographic position distribution coordinate area to the coverage area in the time range to obtain the data area coverage value.
In a preferred embodiment, the data call frequency, the data fluctuation value, the data validity period duration and the data area coverage value are substituted into the logistic regression calculation, expressed by the following formula:
$$S = \frac{1}{1 + e^{-y}};$$
In the formula, $S$ is the logistic regression calculation result, i.e. the screening evaluation coefficient, $e$ is the natural base, and $y$ is the linear combination term of the logistic regression model, which can be set as follows:
$$y = \beta_0 + \beta_1 F_{i,g} + \beta_2 \sigma_{i,g} + \beta_3 T_{i,g} + \beta_4 C_{i,g};$$
In the formula, $\beta_0$ is the bias term, and $\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$ are the regression coefficients of the data call frequency $F_{i,g}$, the data fluctuation value $\sigma_{i,g}$, the data validity period duration $T_{i,g}$ and the data area coverage value $C_{i,g}$, respectively.
In a preferred embodiment, after the screening evaluation coefficients are obtained, the screening evaluation coefficients are compared with a continuously iterated screening threshold value for analysis;
if the screening evaluation coefficient is greater than or equal to the screening threshold value, marking the current equipment data paragraph as a priority paragraph, and generating a screening signal;
If the screening evaluation coefficient is smaller than the screening threshold, marking the current equipment data paragraph as a screened paragraph, and generating an ending signal.
In a preferred embodiment, the current device data paragraph marked as a priority paragraph is marked as a priority screened device data paragraph;
and carrying out vectorization representation on three dimensions of numerical values, distribution and structures in the preferentially screened equipment data paragraphs, and calculating to obtain numerical value similarity, time sequence similarity and feature vector similarity through a cosine similarity formula.
In a preferred embodiment, substituting the numerical similarity, the time sequence similarity and the feature vector similarity into a weighted formula to calculate the data similarity of each device data segment which is preferentially screened;
Comparing the data similarity of the preferentially screened equipment data paragraphs with a preset similarity threshold; if the data similarity is larger than the similarity threshold, screening out the equipment data paragraph corresponding to that similarity, otherwise retaining it, until all comparisons are completed;
and collecting the equipment data paragraphs reserved by the comparison result to obtain data screening paragraphs.
In a preferred embodiment, the data filtering section comprises a plurality of data tags, and the data tags comprise a plurality of data;
The method comprises the steps of accumulating the real-time uploading speeds of all calling users by acquiring the number of calling users in a current data screening paragraph, and calculating the ratio of the accumulated real-time uploading speeds to the number of calling users in the current data screening paragraph to obtain the real-time average uploading speed of the calling data user side of the data screening paragraph;
and extracting geographic position data corresponding to the data screening paragraphs from the records of the data source, calculating the ratio of the data quantity corresponding to different positions to the total data quantity of the data screening paragraphs, and counting the data distribution frequency in each region to accumulate so as to obtain the data distribution frequency of the data screening paragraphs in different regions.
In a preferred embodiment, the real-time average uploading speed of the user side calling the data screening paragraph and the data distribution frequency of the data screening paragraph in different areas are each divided into different fuzzy sets;
defining the data screening paragraph calling sequencing result as the output variable, and dividing the output variable into fuzzy sets;
Formulating fuzzy rules that describe how the real-time average uploading speed of the user side calling the data screening paragraph and the data distribution frequency of the data screening paragraph in different areas influence the data screening paragraph calling sequencing result;
and carrying out fuzzy reasoning according to the fuzzy rule, and determining a data screening paragraph calling sequencing result.
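The fuzzy-inference step can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: both inputs are taken to be pre-normalized to [0, 1], the three triangular fuzzy sets (low, medium, high), the nine rules and their output levels are hypothetical, and a zero-order Sugeno-style weighted average stands in for whatever defuzzification the method actually uses.

```python
def low(x):
    """Membership in the 'low' set: 1 at x = 0, falling to 0 at x = 0.5."""
    return max(0.0, min(1.0, 1.0 - 2.0 * x))

def medium(x):
    """Membership in the 'medium' set: triangular, peak at x = 0.5."""
    return max(0.0, 1.0 - 2.0 * abs(x - 0.5))

def high(x):
    """Membership in the 'high' set: 0 at x = 0.5, rising to 1 at x = 1."""
    return max(0.0, min(1.0, 2.0 * x - 1.0))

# Hypothetical rule base: (speed set, distribution set, output priority level).
RULES = [
    (low, low, 0.1), (low, medium, 0.3), (low, high, 0.5),
    (medium, low, 0.3), (medium, medium, 0.5), (medium, high, 0.7),
    (high, low, 0.5), (high, medium, 0.7), (high, high, 0.9),
]

def priority_score(speed, distribution):
    """Fire every rule with min() as AND, then take the weighted average."""
    speed = min(1.0, max(0.0, speed))
    distribution = min(1.0, max(0.0, distribution))
    numerator = denominator = 0.0
    for speed_set, dist_set, level in RULES:
        strength = min(speed_set(speed), dist_set(distribution))
        numerator += strength * level
        denominator += strength
    return numerator / denominator if denominator else 0.0

# Hypothetical paragraphs: (normalized upload speed, normalized distribution frequency).
paragraphs = {"s1": (0.9, 0.8), "s2": (0.2, 0.1), "s3": (0.5, 0.5)}
ranking = sorted(paragraphs, key=lambda k: priority_score(*paragraphs[k]), reverse=True)
```

Sorting by the resulting score gives one possible calling order; a different rule base would give a different order.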
The data acquisition system facing to the equipment service management platform comprises a data acquisition module, a data processing module, a screening analysis module and a paragraph ordering module;
The data acquisition module is used for collecting data platform characteristic information and data service characteristic information from the equipment update log database, performing data processing to obtain the data calling frequency, data fluctuation value, data valid period duration and data area coverage value, and sending them to the data processing module;
The data processing module is used for acquiring data calling frequency, data fluctuation value, data valid period duration and data area coverage range value, establishing a data analysis model, performing logistic regression calculation to obtain screening evaluation coefficients, and sending the screening evaluation coefficients to the screening analysis module;
The screening analysis module is used for acquiring a screening evaluation coefficient, comparing the screening evaluation coefficient with a preset screening threshold value, obtaining equipment data paragraphs with priority screening according to equipment data paragraph results marked as priority paragraphs in the comparison result, determining data screening paragraphs according to data similarity of each piece of equipment data with priority screening, and sending the data screening paragraphs to the paragraph sorting module;
the paragraph sorting module is used for acquiring the data screening paragraphs, acquiring, for each data screening paragraph, the real-time average uploading speed of the user side calling the data and the data distribution frequency of the data screening paragraph in different areas, and determining the data screening paragraph calling sequencing result by using fuzzy logic.
The invention has the technical effects and advantages that:
1. According to the method, a data analysis model is built through collecting data calling frequency, data fluctuation values, data valid period duration and data area coverage values, logistic regression calculation is conducted to obtain screening evaluation coefficients, the screening evaluation coefficients are compared with preset screening thresholds, equipment data paragraphs which are screened preferentially are obtained according to comparison results, then the data screening paragraphs are determined according to the data similarity of the equipment data paragraphs which are screened preferentially, multidimensional comprehensive analysis is conducted, screening accuracy is improved, data collection efficiency is improved, and waste of calculation resources is avoided.
2. According to the method, a group of fuzzy rules is formulated for fuzzy reasoning based on the data screening paragraphs, the real-time average uploading speed of the user side calling each data screening paragraph, and the data distribution frequency of the data screening paragraphs in different areas. The data screening paragraph calling sequencing result is thereby determined, which reduces the calculation pressure of subsequent data analysis, improves service efficiency and enhances the user experience.
Drawings
Fig. 1 is a flow chart of a method for collecting data for a device service management platform according to the present invention.
Fig. 2 is a schematic block diagram of a data acquisition system facing to a device service management platform according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
According to the method, equipment data characteristic information and equipment service characteristic information are collected from the equipment update log database and analyzed in terms of data quality and service period characteristics; logistic regression calculation is carried out through a data analysis model to output a screening evaluation coefficient; data screening paragraphs are determined through the screening evaluation coefficient; and the data collection result is determined through a fuzzy Bayesian model according to the data importance scores and data selection frequencies in the data screening paragraphs;
The equipment service period refers to the warranty period after a user comes to own the equipment. During the warranty period, data acquisition is carried out when a customer scans a code and uploads a service request, and the acquired data needs to be analyzed in advance from the database;
Example 1
Referring to fig. 1, a data collection method for an equipment service management platform includes the following specific operation procedures:
S1, collecting data platform characteristic information and data service characteristic information from the equipment update log database, and performing data processing to obtain the data calling frequency, data fluctuation value, data valid period duration and data area coverage value;
The data processing comprises data type conversion, missing value processing, data standardization and feature extraction operation;
The data type conversion refers to converting different types of data (such as character strings, dates, values and the like) according to the source format of the data and the requirements of the target application, for example converting values that an equipment log or data record stores as character strings into floating point or integer data;
The missing value processing addresses null values in equipment logs or data records caused by equipment faults, network transmission problems, incomplete recording and similar reasons; missing values are preferably handled by a deletion method or a mean filling method;
Data standardization is carried out on all acquired data according to a unified scale, and data calling frequency, data fluctuation value, data valid period duration and data area coverage range value are subjected to standardization;
specifically, the above data operation methods are all in the prior art, and are not described herein in detail;
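As an illustration of the preprocessing steps listed above, the following minimal sketch chains type conversion, mean filling and standardization for one column of values. The function names and sample values are hypothetical, and z-score scaling is only one possible reading of "a unified scale".

```python
from statistics import mean, pstdev

def to_float(raw):
    """Convert a string-typed log value to float; None if empty or malformed."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None

def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a mean from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def standardize(values):
    """Z-score standardization (an assumed choice); constant columns map to zeros."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical raw column as it might appear in an equipment log.
raw_column = ["21.5", "", "23.5", "n/a", "22.5"]
parsed = [to_float(v) for v in raw_column]
filled = fill_missing_with_mean(parsed)
scaled = standardize(filled)
```

The deletion method mentioned in the text would simply drop the `None` entries instead of filling them.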
The data platform characteristic information comprises data calling frequency and a data fluctuation value, and the data service characteristic information comprises data valid period duration and a data area coverage value;
The data calling frequency refers to the number of data calls within a set unit time and measures the response capability, stability and efficiency of the data service. The acquisition logic determines the unit time to be analyzed, records the event of each data call, counts the total number of data calls occurring in the set unit time, and calculates the ratio of that total to the length of the unit time to obtain the data calling frequency of the g-th data tag in the i-th unit time;
It should be noted that, the setting of the unit time may be the lengths of "24 hours", "36 hours" and "72 hours", and the specific length setting is determined by the present experimenter according to the historical data calling frequency and the historical service period length, which is not described herein in detail;
it should be noted that, the data tag refers to that the platform performs classification processing on all data in the database in advance, divides a plurality of groups, and attaches a corresponding data tag to the groups, and the specific format of the data tag is not limited;
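A minimal sketch of the call-frequency calculation follows. It assumes each call event is available as a pair of (hours since the start of observation, data tag); that event format and the sample values are hypothetical.

```python
from collections import Counter

def call_frequency(events, unit_hours=24.0):
    """Return {(i, tag): calls per hour} for the i-th unit time and each data tag.

    events: iterable of (hours_since_start, tag) pairs (assumed format).
    """
    if unit_hours <= 0:
        raise ValueError("unit_hours must be positive")
    counts = Counter()
    for hours, tag in events:
        if hours < 0:
            raise ValueError("event time must not precede the start")
        window_index = int(hours // unit_hours)  # which unit time the call falls in
        counts[(window_index, tag)] += 1
    # Frequency = total calls in the unit time / length of the unit time.
    return {key: n / unit_hours for key, n in counts.items()}

events = [
    (1.0, "temperature"), (5.5, "temperature"), (23.9, "temperature"),
    (30.0, "temperature"), (2.0, "pressure"),
]
freq = call_frequency(events, unit_hours=24.0)
```

With a 24-hour unit time, the three "temperature" calls in the first window give a frequency of 3 / 24 calls per hour.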
The data fluctuation value refers to the change amplitude of the operation parameters (such as temperature, pressure, current and the like) of the equipment in unit time and is used for evaluating the data screenability of the equipment, the acquisition logic acquires the operation parameter time sequence data of different data labels from the equipment operation data record, and the data fluctuation value is obtained by calculating the standard deviation of the operation parameter value relative to the average value;
The equipment service management platform first divides the data into a plurality of tags, denoted g, i.e. g is the g-th data tag. The division may follow the experimenter's classification of the original data or the similarity among the data; the specific classification method is not limited and is chosen by the experimenter according to the specific implementation, which is not repeated herein;
specifically, the above operation parameters are not limited, and may be temperature, pressure, current, etc., and in this embodiment, only the operation parameter that can be most represented is selected, for example, for a refrigeration device, a fluctuation value of refrigeration efficiency data of the refrigeration device is obtained, for a heating device, a fluctuation value of a heating indication temperature of the heating device is obtained, etc., and the selection of a specific operation parameter is set by the present experimenter according to a specific device application scenario and a device operation feature, which is not described herein in detail;
wherein the standard deviation of the operation parameter values relative to their mean value is calculated as follows:
$$\sigma_{i,g} = \sqrt{\frac{1}{P}\sum_{g=1}^{P}\left(x_{g} - \bar{x}\right)^{2}};$$
In the formula, $\sigma_{i,g}$ denotes the standard deviation of the g-th data in the i-th unit time, i.e. the data fluctuation value, $P$ is the total number of data tags, $x_{g}$ is the operation parameter value of the g-th class of data, and $\bar{x}$ is the mean value of the operation parameters;
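For illustration, the fluctuation value can be computed as a population standard deviation over one time series of operating-parameter readings. Treating the input as a plain list of samples for a single data tag and unit time is an assumption of this sketch.

```python
import math

def fluctuation_value(samples):
    """Population standard deviation of the samples relative to their mean."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    mean_value = sum(samples) / n
    variance = sum((x - mean_value) ** 2 for x in samples) / n
    return math.sqrt(variance)

# Hypothetical temperature readings within one unit time.
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sigma = fluctuation_value(readings)
```

A perfectly steady parameter yields a fluctuation value of zero, and larger swings around the mean yield larger values.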
The data validity period length refers to the time left from the current moment to the end of the data validity period, is an important parameter for measuring the state of the current life cycle of the data and is used for predicting the filterability and the validity period length of the data, and the acquisition logic acquires the validity period end time and the current time of the data from the data information database, and subtracts the current time from the validity period end time of the data to obtain the data validity period length;
Specifically, when determining the data valid period duration, the system by default compares the valid period end time of the data with the current time. If the valid period end time is later than the current time, the data is still within its valid period; otherwise the data has expired and is deleted;
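The validity check can be sketched as below. The record layout (a mapping from a record key to its expiry time) is hypothetical and chosen only to keep the example short.

```python
from datetime import datetime, timedelta

def remaining_validity(expiry, now):
    """Return expiry - now, or None when the data has already expired."""
    remaining = expiry - now
    return remaining if remaining > timedelta(0) else None

def purge_expired(records, now):
    """Keep only unexpired records, mapped to their remaining validity."""
    kept = {}
    for key, expiry in records.items():
        remaining = remaining_validity(expiry, now)
        if remaining is not None:
            kept[key] = remaining
    return kept

now = datetime(2025, 3, 1, 12, 0)
records = {
    "a": datetime(2025, 3, 3, 12, 0),   # still valid for two days
    "b": datetime(2025, 2, 28, 0, 0),   # already expired
}
valid = purge_expired(records, now)
```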
The data area coverage value refers to the extent of the geographic area covered by the data of each data tag within a unit time, and is generally used to measure the availability and coverage of the data platform at different geographic positions. The acquisition logic uses the longitude and latitude coordinates of the data of each data tag point to obtain the geographic position distribution coordinate area of the data tag, and calculates the ratio of this area to the coverage area within the time range to obtain the data area coverage value;
The data longitude and latitude coordinates of each data tag point refer to a specific geographic position associated with each data tag, so as to obtain corresponding longitude and latitude points, the area of the data coverage area is calculated by calculating the minimum rectangular boundary of the points, the range of the boundary frame is determined by the maximum longitude and latitude value and the minimum longitude and latitude value, a rectangular area is formed, and the specific area calculation can be calculated by using a spherical triangle formula and the like and is not described herein;
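A minimal sketch of the bounding-box calculation follows. Instead of a spherical triangle formula it uses the closed-form area of a latitude/longitude rectangle on a sphere, R² · (sin φ_max − sin φ_min) · Δλ. The Earth radius constant, the reference coverage area and the assumption that the points do not straddle the antimeridian are all choices of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed spherical

def bbox_area_km2(points):
    """Area of the minimal lat/lon bounding rectangle of (lat, lon) points in degrees."""
    if not points:
        raise ValueError("at least one point is required")
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    lat_min = math.radians(min(lats))
    lat_max = math.radians(max(lats))
    lon_span = math.radians(max(lons) - min(lons))
    return EARTH_RADIUS_KM ** 2 * (math.sin(lat_max) - math.sin(lat_min)) * lon_span

def coverage_value(points, reference_area_km2):
    """Ratio of the tag's bounding-box area to the reference coverage area."""
    if reference_area_km2 <= 0:
        raise ValueError("reference area must be positive")
    return bbox_area_km2(points) / reference_area_km2

# Hypothetical device locations for one data tag.
points = [(30.0, 120.0), (31.0, 121.0), (30.5, 120.5)]
area = bbox_area_km2(points)
```

A single point, or points sharing one latitude or one longitude, produces a zero area and therefore a zero coverage value.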
S2, acquiring data calling frequency, data fluctuation value, data valid period duration and data area coverage value, establishing a data analysis model, and performing logistic regression calculation to obtain screening evaluation coefficients;
the data analysis model refers to a logistic regression calculation model, and screening evaluation coefficients are generated through logistic regression calculation;
the data calling frequency, the data fluctuation value, the data valid period duration and the data area coverage value are substituted into the logistic regression calculation, expressed by the following formula:
$$S = \frac{1}{1 + e^{-y}};$$
In the formula, $S$ is the logistic regression calculation result, i.e. the screening evaluation coefficient, $e$ is the natural base, and $y$ is the linear combination term of the logistic regression model, which can be set as follows:
$$y = \beta_0 + \beta_1 F_{i,g} + \beta_2 \sigma_{i,g} + \beta_3 T_{i,g} + \beta_4 C_{i,g};$$
In the formula, $\beta_0$ is the bias term, and $\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$ are the regression coefficients of the data calling frequency $F_{i,g}$, the data fluctuation value $\sigma_{i,g}$, the data valid period duration $T_{i,g}$ and the data area coverage value $C_{i,g}$, respectively;
the data calling frequency, the data fluctuation value, the data valid period duration and the data area coverage range value are all data embodiments for directly expressing the screening selection of the current equipment data paragraph;
The formula shows that the higher the data calling frequency, the data fluctuation value and the data area coverage value, the more valuable the current equipment data paragraph is in terms of demand, dynamics and applicability, the higher the screening evaluation coefficient, and the more the paragraph should be screened with priority. Conversely, the longer the data valid period duration, the longer the current equipment data paragraph remains valid, so its priority is reduced and the screening evaluation coefficient is lower;
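The logistic calculation described above can be sketched as follows. The regression coefficients here are hypothetical placeholders (in practice they would be fitted on historical data), with a negative coefficient for the valid period duration to reflect the stated direction of influence, and the inputs are assumed to be standardized already.

```python
import math

def screening_coefficient(features, coefficients, bias=0.0):
    """Logistic function of the linear combination bias + sum(coefficient * feature)."""
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have the same length")
    y = bias + sum(b * x for b, x in zip(coefficients, features))
    # Numerically stable evaluation of 1 / (1 + e^(-y)).
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    exp_y = math.exp(y)
    return exp_y / (1.0 + exp_y)

# Hypothetical coefficients: frequency, fluctuation, validity duration, coverage.
COEFFICIENTS = (0.8, 0.6, -0.5, 0.4)

short_validity = screening_coefficient((1.2, 0.9, -0.5, 0.7), COEFFICIENTS)
long_validity = screening_coefficient((1.2, 0.9, 2.0, 0.7), COEFFICIENTS)
```

With these placeholder coefficients, the paragraph with the longer remaining validity receives the lower coefficient, all else being equal.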
S3, acquiring a screening evaluation coefficient, comparing the screening evaluation coefficient with a preset screening threshold value, obtaining equipment data paragraphs with priority screening according to equipment data paragraph results marked as priority paragraphs in the comparison result, and determining data screening paragraphs according to the data similarity of each piece of equipment data with priority screening;
The screening threshold is acquired as follows: a priority classification set of historical equipment data paragraphs is collected and divided into a training set and a test set, and an evaluation index and a clustering algorithm are set. In each iteration of cross-validation, the model is trained on the training set and its performance is evaluated on the test set; the screening threshold is then adjusted according to that performance, so that the screening threshold is continuously and iteratively updated;
In the invention, the clustering algorithm is an unsupervised learning algorithm used to group the priorities of the equipment data paragraphs in a data set into labeled groups or clusters. The commonly used K-means clustering divides the data in the data set into K clusters so as to minimize the distance between the priority of each equipment data paragraph and the center point (centroid) of the cluster to which it belongs; finally, the priority distribution of the equipment data paragraphs is measured by Euclidean distance, and the screening threshold is set accordingly;
After the screening evaluation coefficient is obtained, the screening evaluation coefficient is compared with a continuously iterated screening threshold value for analysis;
if the screening evaluation coefficient is greater than or equal to the screening threshold value, marking the current equipment data paragraph as a priority paragraph, and generating a screening signal;
if the screening evaluation coefficient is smaller than the screening threshold value, marking the current equipment data paragraph as a screened paragraph, and generating an ending signal;
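One way to picture the threshold setting and comparison is the sketch below: a one-dimensional K-means with K = 2 over historical screening evaluation coefficients, with the threshold placed midway between the two centroids. Choosing K = 2, the midpoint rule, and the sample history are assumptions of this sketch, and the cross-validation loop is omitted.

```python
def two_means_threshold(scores, iterations=100):
    """1-D K-means (K = 2); returns the midpoint between the two centroids."""
    if len(set(scores)) < 2:
        raise ValueError("need at least two distinct scores")
    low, high = min(scores), max(scores)  # deterministic initial centroids
    for _ in range(iterations):
        low_cluster = [s for s in scores if abs(s - low) <= abs(s - high)]
        high_cluster = [s for s in scores if abs(s - low) > abs(s - high)]
        new_low = sum(low_cluster) / len(low_cluster)
        new_high = sum(high_cluster) / len(high_cluster)
        if new_low == low and new_high == high:
            break  # centroids stopped moving
        low, high = new_low, new_high
    return (low + high) / 2.0

def label_paragraph(coefficient, threshold):
    """Mark a paragraph 'priority' if coefficient >= threshold, else 'screened'."""
    return "priority" if coefficient >= threshold else "screened"

# Hypothetical historical screening evaluation coefficients.
history = [0.10, 0.15, 0.20, 0.80, 0.85, 0.90]
threshold = two_means_threshold(history)
labels = [label_paragraph(c, threshold) for c in (0.72, 0.31)]
```

Re-running `two_means_threshold` as new history arrives is one simple way to keep the threshold iteratively updated.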
marking the current device data paragraph marked as the priority paragraph as the device data paragraph of the priority screening;
Vectorizing the three dimensions of the numerical value, distribution and structure in the preferentially screened equipment data paragraph, and calculating to obtain numerical value similarity, time sequence similarity and feature vector similarity through a cosine similarity formula;
Specifically, this embodiment analyzes three dimensions: numerical similarity, time sequence similarity and feature vector similarity. In practice, the experimenter may define additional, finer-grained similarity dimensions according to the application in order to express the data similarity of each preferentially screened equipment data paragraph more accurately and improve screening precision, which is not described herein in detail;
it should be noted that this example uses cosine similarity as the similarity formula. In practical applications, however, the numerical similarity may also be determined by Euclidean distance, and the time sequence similarity may also be determined by a dynamic time warping method; the method for calculating similarity is not limited and is set according to the calculation model preset by the experimenter, which is not described herein;
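For reference, cosine similarity between two equal-length vectors can be computed as in this minimal sketch; how a paragraph is vectorized is left open by the text, so the sample vectors are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing in the same direction score close to 1 regardless of magnitude, which is why scaled copies of the same data are treated as near-duplicates.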
Substituting the numerical similarity, the time sequence similarity and the feature vector similarity into a weighted formula to calculate and obtain the data similarity of each preferentially screened equipment data segment;
Comparing the data similarity of the preferentially screened equipment data paragraphs with a preset similarity threshold value; if the data similarity is larger than the similarity threshold value, screening out the equipment data paragraph corresponding to that similarity, otherwise retaining it, until all comparisons are completed;
Collecting the equipment data paragraphs reserved by the comparison result to obtain data screening paragraphs;
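The weighting and filtering steps can be sketched together as follows. The weights, the similarity threshold, the dictionary layout of a paragraph and the greedy keep-first strategy are all assumptions of this sketch; the text does not fix any of them.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def data_similarity(a, b, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of numerical, time-sequence and feature-vector similarity."""
    parts = (
        cosine(a["values"], b["values"]),
        cosine(a["series"], b["series"]),
        cosine(a["features"], b["features"]),
    )
    return sum(w * p for w, p in zip(weights, parts))

def screen_paragraphs(paragraphs, threshold=0.95):
    """Keep a paragraph only if it is not too similar to any already kept one."""
    kept = []
    for candidate in paragraphs:
        if all(data_similarity(candidate, k) <= threshold for k in kept):
            kept.append(candidate)
    return kept

# Hypothetical preferentially screened paragraphs; B is a scaled copy of A.
paragraphs = [
    {"id": "A", "values": [1, 2, 3], "series": [1, 1, 2], "features": [1, 0]},
    {"id": "B", "values": [2, 4, 6], "series": [2, 2, 4], "features": [2, 0]},
    {"id": "C", "values": [3, -1, 0], "series": [2, -1, 0], "features": [0, 1]},
]
kept_ids = [p["id"] for p in screen_paragraphs(paragraphs)]
```

Because the loop keeps the first paragraph of each near-duplicate group, the input order affects which copy survives.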
According to the method, a data analysis model is built through collecting data calling frequency, data fluctuation values, data valid period duration and data area coverage values, logistic regression calculation is conducted to obtain screening evaluation coefficients, the screening evaluation coefficients are compared with preset screening thresholds, equipment data paragraphs which are screened preferentially are obtained according to comparison results, then the data screening paragraphs are determined according to the data similarity of the equipment data paragraphs which are screened preferentially, multidimensional comprehensive analysis is conducted, screening accuracy is improved, data collection efficiency is improved, and waste of calculation resources is avoided.
Example 2
Embodiment 1 of the invention mainly describes the operation strategy of collecting the data calling frequency, the data fluctuation value, the data valid period duration and the data area coverage value, establishing a data analysis model, performing the logistic regression calculation to obtain the screening evaluation coefficient, comparing it with the preset screening threshold to obtain the preferentially screened equipment data paragraphs, and then determining the data screening paragraphs according to the data similarity of those paragraphs;
S4, acquiring the data screening paragraphs and, according to them, collecting the real-time average uploading speed of the data user side calling each data screening paragraph and the data distribution frequency of the data screening paragraph in different areas;
Specifically, a data screening paragraph contains a plurality of data tags, and each data tag contains a plurality of data;
The real-time average uploading speed of the data user side calling a data screening paragraph is acquired as follows: the number of calling users in the current data screening paragraph is obtained, the real-time uploading speeds of all calling users are accumulated, and the sum is divided by the number of calling users in the current data screening paragraph;
When a user scans a code or performs another interactive operation, the client application monitors the uploading speed in real time. Specifically, the upload module of the client collects the uploading speed periodically, for example by recording the uploaded amount and the elapsed time once per second, and stores the real-time uploading speed in a local cache. The real-time collection method is not limited and is not described further here;
The data distribution frequency of a data screening paragraph in different areas reflects the activity of the data screening paragraph across geographic locations. It is acquired as follows: the geographic location data corresponding to the data screening paragraph is extracted from the records of the data source, the ratio of the amount of data at each location to the total amount of data of the data screening paragraph is calculated, and the data distribution frequencies of the areas are accumulated to obtain the data distribution frequency of the data screening paragraph in different areas;
It should be noted that, regarding the selection and number of geographic locations, locations with a large number of users or high activity may be selected preferentially by analyzing user activity at each geographic location, which is not described further here;
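The two quantities collected in S4 can be sketched as follows. The function names are hypothetical, and restricting the accumulation to a set of preferentially selected areas is an assumption: the text only says that the per-area frequencies are accumulated and that active locations may be selected preferentially.

```python
from collections import Counter
from typing import Iterable, List, Set


def average_upload_speed(speeds: List[float]) -> float:
    """Sum of the calling users' real-time upload speeds divided by the number of calling users."""
    if not speeds:
        return 0.0
    return sum(speeds) / len(speeds)


def regional_distribution_frequency(record_regions: Iterable[str],
                                    selected_regions: Set[str]) -> float:
    """Accumulate the share of records that fall in each selected area.

    record_regions holds one area label per data record of the paragraph;
    selected_regions is the (assumed) set of preferentially selected areas.
    """
    counts = Counter(record_regions)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Counter returns 0 for areas that have no records.
    return sum(counts[region] / total for region in selected_regions)
```

With three calling users uploading at 2.0, 4.0 and 6.0 units per second, the average is 4.0; with records in areas east, east, west and north and the selected areas east and west, the accumulated frequency is 0.75.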
S5, determining a data screening paragraph calling sequencing result by using fuzzy logic according to the real-time average uploading speed of the data screening paragraph calling data user side and the data distribution frequency of the data screening paragraphs in different areas;
For example, the real-time average uploading speed of the data user side calling the data screening paragraph is divided into the fuzzy sets "Fast", "Moderate" and "Slow", and the data distribution frequency of the data screening paragraph in different areas is divided into the fuzzy sets "High", "Medium" and "Low";
A set of fuzzy rules is formulated to describe the influence of different input variables on the output variables. The definition of rules may be based on expertise or may be obtained through data analysis and experimentation. For example:
Marking the real-time average uploading speed of a data screening paragraph call data user end as X, marking the data distribution frequency degree of the data screening paragraph in different areas as U, and marking the data screening paragraph call sequencing result as C_results;
Then it is possible to define:
Rule 1: IF (X is Fast) AND (U is High) THEN (C_results is High)
Rule 2: IF (X is Slow) AND (U is Low) THEN (C_results is Low)
...
Fuzzy reasoning is then performed according to the fuzzy rules to determine the data screening paragraph calling sequencing result;
It should be noted that the division into fuzzy sets may be adjusted according to the actual situation. Although this embodiment uses three fuzzy sets as an example, the real-time average uploading speed of the data user side calling the data screening paragraph and the data distribution frequency of the data screening paragraph in different areas may in practice each be divided into more than three sets, allowing finer adjustment for different data tags;
Further, the criteria for classifying the real-time average uploading speed of the data user side calling the data screening paragraph and the data distribution frequency of the data screening paragraph in different areas may be set according to the actual situation. For example, the real-time average uploading speed is marked as Fast when it exceeds 75%, the data distribution frequency is marked as High when it is higher than 70%, and so on, which is not described further here;
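The fuzzy step can be sketched as follows. Everything beyond the two rules quoted above is an assumption made for illustration: the inputs X and U are taken to be normalized to [0, 1], the membership functions are triangular, the remaining seven rules are invented to complete the rule table, and the output is defuzzified with a zero-order Sugeno-style weighted average, which the source does not specify.

```python
from typing import Dict, List, Tuple


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership; left == peak or peak == right gives a shoulder."""
    if x < left or x > right:
        return 0.0
    if x == peak:
        return 1.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets over normalized inputs in [0, 1].
SPEED_SETS = {"Slow": (0.0, 0.0, 0.5), "Moderate": (0.0, 0.5, 1.0), "Fast": (0.5, 1.0, 1.0)}
FREQ_SETS = {"Low": (0.0, 0.0, 0.5), "Medium": (0.0, 0.5, 1.0), "High": (0.5, 1.0, 1.0)}
OUTPUT_LEVELS = {"Low": 0.0, "Medium": 0.5, "High": 1.0}

# Only the first and last rules come from the text; the rest are assumed.
RULES: List[Tuple[str, str, str]] = [
    ("Fast", "High", "High"),
    ("Fast", "Medium", "High"),
    ("Fast", "Low", "Medium"),
    ("Moderate", "High", "High"),
    ("Moderate", "Medium", "Medium"),
    ("Moderate", "Low", "Low"),
    ("Slow", "High", "Medium"),
    ("Slow", "Medium", "Low"),
    ("Slow", "Low", "Low"),
]


def infer(x: float, u: float) -> float:
    """Return a crisp ranking score in [0, 1] for speed x and distribution frequency u."""
    x = min(max(x, 0.0), 1.0)
    u = min(max(u, 0.0), 1.0)
    numerator = 0.0
    denominator = 0.0
    for speed_label, freq_label, out_label in RULES:
        # AND is modeled as the minimum of the two membership degrees.
        strength = min(triangular(x, *SPEED_SETS[speed_label]),
                       triangular(u, *FREQ_SETS[freq_label]))
        numerator += strength * OUTPUT_LEVELS[out_label]
        denominator += strength
    return numerator / denominator if denominator > 0.0 else 0.0


def rank_paragraphs(inputs: Dict[str, Tuple[float, float]]) -> List[str]:
    """Order data screening paragraphs by descending fuzzy score."""
    return sorted(inputs, key=lambda name: infer(*inputs[name]), reverse=True)
```

Under these assumptions a paragraph with a fast user side and a high distribution frequency is ranked ahead of one with a slow user side and a low distribution frequency; adding more fuzzy sets only requires extending the three dictionaries and the rule list.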
In this embodiment, a group of fuzzy rules is formulated from the real-time average uploading speed of the data user side calling the data screening paragraph and the data distribution frequency of the data screening paragraph in different areas, and fuzzy reasoning is performed to determine the data screening paragraph calling sequencing result, which reduces the calculation pressure of subsequent data analysis, improves service efficiency and enhances the user experience.
Example 3
Referring to Fig. 2, a data acquisition system for a device service management platform includes a data acquisition module, a data processing module, a screening analysis module and a paragraph sorting module;
The data acquisition module is used for collecting the data platform characteristic information and the data service characteristic information according to the device update log database, performing data processing to obtain the data calling frequency, the data fluctuation value, the data valid period duration and the data area coverage value, and sending them to the data processing module;
The data processing module is used for acquiring data calling frequency, data fluctuation value, data valid period duration and data area coverage range value, establishing a data analysis model, performing logistic regression calculation to obtain screening evaluation coefficients, and sending the screening evaluation coefficients to the screening analysis module;
The screening analysis module is used for acquiring the screening evaluation coefficient, comparing it with the preset screening threshold, obtaining the preferentially screened equipment data paragraphs from the equipment data paragraphs marked as priority paragraphs in the comparison results, determining the data screening paragraphs according to the data similarity of each preferentially screened equipment data paragraph, and sending the data screening paragraphs to the paragraph sorting module;
The paragraph sorting module is used for acquiring the data screening paragraphs, collecting, according to the data screening paragraphs, the real-time average uploading speed of the data user side calling each data screening paragraph and the data distribution frequency of the data screening paragraph in different areas, and determining the data screening paragraph calling sequencing result by using fuzzy logic.
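The hand-off between the data processing module and the screening analysis module can be sketched as follows. This assumes the standard logistic function applied to a linear combination of the four indexes; the bias, the regression coefficients and the function names are hypothetical placeholders, and the inputs are assumed to be dimensionless.

```python
import math
from typing import Dict, List, Tuple


def screening_coefficient(call_frequency: float, fluctuation: float,
                          validity: float, coverage: float,
                          bias: float = 0.0,
                          coefficients: Tuple[float, float, float, float] = (1.0, 1.0, 1.0, 1.0)) -> float:
    """Logistic function of the linear combination y; all parameters are assumed."""
    b1, b2, b3, b4 = coefficients
    y = bias + b1 * call_frequency + b2 * fluctuation + b3 * validity + b4 * coverage
    # Numerically stable form of 1 / (1 + e^(-y)).
    if y >= 0.0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)


def split_paragraphs(features: Dict[str, Tuple[float, float, float, float]],
                     threshold: float) -> Tuple[List[str], List[str]]:
    """Mark paragraphs as priority (coefficient >= threshold) or screened out."""
    priority: List[str] = []
    screened_out: List[str] = []
    for name, values in features.items():
        if screening_coefficient(*values) >= threshold:
            priority.append(name)
        else:
            screened_out.append(name)
    return priority, screened_out
```

The priority list would then be passed on to the similarity comparison, and the resulting data screening paragraphs to the paragraph sorting module.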
The above formulas are all dimensionless and operate on numerical values only. Each formula was obtained by collecting a large amount of data and running software simulation to approximate the real situation as closely as possible, and the preset parameters in the formulas are set by those skilled in the art according to the actual situation.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any other combination. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer program are loaded or executed on a computer, the processes or functions described in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that contains one or more sets of available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto; any variation or substitution that a person skilled in the art could readily conceive falls within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A data collection method for a device service management platform, characterized by comprising:
S1: collecting data platform characteristic information and data service characteristic information according to the device update log database, and performing data processing to obtain the data calling frequency, the data fluctuation value, the data valid period duration and the data area coverage value;
S2: acquiring the data calling frequency, the data fluctuation value, the data valid period duration and the data area coverage value, establishing a data analysis model, and performing a logistic regression calculation to obtain the screening evaluation coefficient;
S3: acquiring the screening evaluation coefficient and comparing it with a preset screening threshold, obtaining the preferentially screened equipment data paragraphs from the equipment data paragraphs marked as priority paragraphs in the comparison results, and then determining the data screening paragraphs according to the data similarity of each preferentially screened equipment data paragraph;
S4: acquiring the data screening paragraphs and, according to the data screening paragraphs, collecting the real-time average uploading speed of the data user side calling each data screening paragraph and the data distribution frequency of the data screening paragraph in different areas;
S5: determining the data screening paragraph calling sequencing result by using fuzzy logic according to the real-time average uploading speed of the data user side calling the data screening paragraph and the data distribution frequency of the data screening paragraph in different areas.

2. The data collection method for a device service management platform according to claim 1, characterized in that: the data platform characteristic information includes the data calling frequency and the data fluctuation value; the data service characteristic information includes the data valid period duration and the data area coverage value;
the unit time of the analysis is determined, every data calling event is recorded, the total number of data calls occurring within the set unit time is counted, and the ratio of the total number of data calls within the unit time to the length of the unit time is calculated to obtain the data calling frequency, where i is the i-th unit time and g is the g-th data tag;
time series data of the operating parameters of the different data tags is obtained from the equipment operation data records, and the data fluctuation value is obtained by calculating the standard deviation of the operating parameter values relative to their mean;
the end time of the valid period of the data and the current time are obtained from the data information database, and the current time is subtracted from the end time of the valid period to obtain the data valid period duration;
the area of the geographic distribution of the data tags is calculated from the latitude and longitude coordinates of each data tag point, and the ratio of this area to the coverable area within the time range is calculated to obtain the data area coverage value.

3. The data collection method for a device service management platform according to claim 2, characterized in that: the data calling frequency, the data fluctuation value, the data valid period duration and the data area coverage value are substituted into the logistic regression calculation, whose result is the screening evaluation coefficient; in the formula of the logistic regression calculation, e is the natural base and y is the linear combination term of the logistic regression model;
the linear combination term y is composed of a bias term and the regression coefficients of the data calling frequency, the data fluctuation value, the data valid period duration and the data area coverage value, respectively.

4. The data collection method for a device service management platform according to claim 3, characterized in that: after the screening evaluation coefficient is obtained, the screening evaluation coefficient is compared and analyzed against the continuously iterated screening threshold;
if the screening evaluation coefficient is greater than or equal to the screening threshold, the current equipment data paragraph is marked as a priority paragraph and a screening signal is generated;
if the screening evaluation coefficient is less than the screening threshold, the current equipment data paragraph is marked as a screened-out paragraph and an end signal is generated.

5. The data collection method for a device service management platform according to claim 4, characterized in that: the current equipment data paragraph marked as a priority paragraph is recorded as a preferentially screened equipment data paragraph;
the three dimensions of value, distribution and structure within the preferentially screened equipment data paragraph are represented as vectors, and the numerical similarity, the time sequence similarity and the feature vector similarity are calculated by the cosine similarity formula.

6. The data collection method for a device service management platform according to claim 5, characterized in that: the numerical similarity, the time sequence similarity and the feature vector similarity are substituted into a weighted formula to calculate the data similarity of each preferentially screened equipment data paragraph;
the data similarity of each preferentially screened equipment data paragraph is compared with a preset similarity threshold; if the data similarity is greater than the similarity threshold, the corresponding equipment data paragraph is screened out; otherwise, the corresponding equipment data paragraph is retained, until the comparison is completed;
the equipment data paragraphs retained by the comparison are collected to obtain the data screening paragraphs.

7. The data collection method for a device service management platform according to claim 6, characterized in that: a data screening paragraph contains a plurality of data tags, and a data tag contains a plurality of data;
the number of calling users in the current data screening paragraph is obtained, the real-time uploading speeds of all calling users are accumulated, and the ratio of the sum to the number of calling users in the current data screening paragraph is calculated to obtain the real-time average uploading speed of the data user side calling the data screening paragraph;
the geographic location data corresponding to the data screening paragraph is extracted from the records of the data source, the ratio of the amount of data at each location to the total amount of data of the data screening paragraph is calculated, and the data distribution frequencies of the areas are accumulated to obtain the data distribution frequency of the data screening paragraph in different areas.

8. The data collection method for a device service management platform according to claim 7, characterized in that: the real-time average uploading speed of the data user side calling the data screening paragraph and the data distribution frequency of the data screening paragraph in different areas are each divided into different fuzzy sets;
the data screening paragraph calling sequencing result is defined as the output variable and divided into fuzzy sets;
fuzzy rules are formulated to describe the influence of the real-time average uploading speed of the data user side calling the data screening paragraph and of the data distribution frequency of the data screening paragraph in different areas on the data screening paragraph calling sequencing result;
fuzzy reasoning is performed according to the fuzzy rules to determine the data screening paragraph calling sequencing result.

9. A data collection system for a device service management platform, used to implement the data collection method for a device service management platform according to any one of claims 1 to 8, characterized by comprising a data acquisition module, a data processing module, a screening analysis module and a paragraph sorting module;
the data acquisition module is used for collecting data platform characteristic information and data service characteristic information according to the device update log database, performing data processing to obtain the data calling frequency, the data fluctuation value, the data valid period duration and the data area coverage value, and sending them to the data processing module;
the data processing module is used for acquiring the data calling frequency, the data fluctuation value, the data valid period duration and the data area coverage value, establishing a data analysis model, performing a logistic regression calculation to obtain the screening evaluation coefficient, and sending it to the screening analysis module;
the screening analysis module is used for acquiring the screening evaluation coefficient and comparing it with the preset screening threshold, obtaining the preferentially screened equipment data paragraphs from the equipment data paragraphs marked as priority paragraphs in the comparison results, determining the data screening paragraphs according to the data similarity of each preferentially screened equipment data paragraph, and sending them to the paragraph sorting module;
the paragraph sorting module is used for acquiring the data screening paragraphs, collecting, according to the data screening paragraphs, the real-time average uploading speed of the data user side calling each data screening paragraph and the data distribution frequency of the data screening paragraph in different areas, and determining the data screening paragraph calling sequencing result by using fuzzy logic.
CN202510324745.1A | Priority date 2025-03-19 | Filing date 2025-03-19 | A data collection method and system for device service management platform | Pending | CN120196616A (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN202510324745.1A | CN120196616A (en) | 2025-03-19 | 2025-03-19 | A data collection method and system for device service management platform

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN202510324745.1A | CN120196616A (en) | 2025-03-19 | 2025-03-19 | A data collection method and system for device service management platform

Publications (1)

Publication Number | Publication Date
CN120196616A | 2025-06-24

Family

ID=96071321

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN202510324745.1A | Pending | CN120196616A (en) | 2025-03-19 | 2025-03-19 | A data collection method and system for device service management platform

Country Status (1)

Country | Link
CN (1) | CN120196616A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP2017054554A (en) * | 2016-12-20 | 2017-03-16 | ヤフー株式会社 | Calculation device, calculation method, and calculation program
US20180216946A1 (en) * | 2016-09-30 | 2018-08-02 | Mamadou Mande Gueye | Method and system for facilitating provisioning of social activity data to a mobile device based on user preferences
CN109542927A (en) * | 2018-10-24 | 2019-03-29 | 南京邮电大学 | Valid data screening technique, readable storage medium storing program for executing and terminal
CN119088830A (en) * | 2024-08-21 | 2024-12-06 | 南京信息工程大学 | A data processing system suitable for accounting and financial management
CN119624511A (en) * | 2025-02-17 | 2025-03-14 | 山东恒迈信息科技有限公司 | A customer behavior analysis method and system based on retail data



Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
