Technical Field
The present invention relates to the technical field of charging stations, and in particular to a data processing method and apparatus for a charging station, a storage medium, and an electronic device.
Background
As the number of electric vehicles in use grows, the number of charging stations serving as supporting infrastructure is also growing, and charging load prediction is the basis for charging station planning and dispatching. At present, charging station load is usually predicted with methods that rely on large amounts of historical data.
Traditional load prediction methods are supported by large amounts of historical data, and the scale and quality of that data directly affect the accuracy of the prediction result. The historical data currently used for charging station load prediction is very large and contains many records that differ greatly from the characteristics of the charging station and have no reference value. As a result, load prediction takes a long time and the accuracy of the prediction result is low.
Summary of the Invention
In view of this, embodiments of the present invention provide a data processing method and apparatus for a charging station, a storage medium, and an electronic device. By applying the present invention, the historical sample data of a charging station can be screened to obtain a smaller, higher-precision data set. Using that data set to predict the load of the charging station shortens the time required for prediction and improves the accuracy of the prediction result.
To achieve the above objective, embodiments of the present invention provide the following technical solutions:
A data processing method for a charging station, comprising:
constructing a knowledge graph of the charging station;
obtaining feature data of the charging station based on the knowledge graph;
obtaining each piece of historical sample data of the charging station;
screening the pieces of historical sample data by applying the feature data to obtain target sample data, wherein screening the pieces of historical sample data means selecting or eliminating at least one piece of historical sample data based on correlation information between each piece of historical sample data and the feature data.
In the above method, optionally, constructing the knowledge graph of the charging station comprises:
obtaining semi-structured data and application layer data of the charging station;
extracting load influencing factor data from the semi-structured data;
constructing the knowledge graph using the load influencing factor data and the application layer data.
In the above method, optionally, screening the pieces of historical sample data by applying the feature data to obtain the target sample data comprises:
obtaining a first screening flag for each piece of historical sample data based on the feature data;
determining, as preliminary screening sample data, the historical sample data whose first screening flag satisfies a preset first screening condition;
determining the target sample data based on the pieces of preliminary screening sample data.
In the above method, optionally, determining the target sample data based on the pieces of preliminary screening sample data comprises:
obtaining impact factors related to the feature data;
obtaining a second screening flag for each piece of preliminary screening sample data based on the feature data and the impact factors;
determining, as the target sample data, the preliminary screening sample data whose second screening flag satisfies a preset second screening condition.
In the above method, optionally, obtaining the first screening flag for each piece of historical sample data based on the feature data comprises:
obtaining a screening coefficient between each piece of historical sample data and the feature data;
determining the first screening flag of each piece of historical sample data based on a preset screening threshold and the screening coefficient of that piece of historical sample data.
In the above method, optionally, obtaining the second screening flag for each piece of preliminary screening sample data based on the feature data and the impact factors comprises:
obtaining a feature similarity coefficient for each piece of preliminary screening sample data based on the feature data and the impact factors;
determining the second screening flag of each piece of preliminary screening sample data based on a preset feature similarity coefficient threshold and the feature similarity coefficient of that piece of preliminary screening sample data.
In the above method, optionally, obtaining the feature similarity coefficient for each piece of preliminary screening sample data based on the feature data and the impact factors comprises:
determining a similarity coefficient of each piece of preliminary screening sample data in each feature dimension based on the feature data and the impact factors;
for each piece of preliminary screening sample data, summing its similarity coefficients over all feature dimensions to obtain the feature similarity coefficient of that piece of preliminary screening sample data.
In the above method, optionally, determining the similarity coefficient of each piece of preliminary screening sample data in each feature dimension based on the feature data and the impact factors comprises:
determining a parameter set of each piece of preliminary screening sample data in each feature dimension, the parameter set comprising a first feature value, which is the value in the preliminary screening sample data corresponding to the feature dimension, a second feature value, which is the value in the feature data corresponding to the feature dimension, and a factor value, which is the value in the impact factors corresponding to the feature dimension;
for each piece of preliminary screening sample data, performing an operation on the first feature value, the second feature value, and the factor value in each of its parameter sets to obtain the similarity coefficient of the feature dimension corresponding to that parameter set.
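The second screening stage described above can be sketched as follows. The text does not specify the operation applied to the first feature value, second feature value, and factor value, nor the form of the second screening condition, so this sketch makes two assumptions: a dimension contributes its factor value when the two feature values are equal and zero otherwise, and a sample is kept when its summed coefficient reaches the threshold. All function names and dimension names are hypothetical.

```python
def dimension_similarity(first_value, second_value, factor_value):
    """Similarity coefficient of one feature dimension.

    Assumed operation (not given in the source): the factor value counts
    in full when the sample value matches the station's feature value.
    """
    return factor_value * (1.0 if first_value == second_value else 0.0)


def feature_similarity(sample, feature_data, impact_factors):
    """Sum the per-dimension similarity coefficients of one sample."""
    return sum(
        dimension_similarity(sample[dim], feature_data[dim], impact_factors[dim])
        for dim in feature_data
    )


def second_screening(prelim_samples, feature_data, impact_factors, threshold):
    """Keep samples whose feature similarity coefficient reaches the threshold.

    Assumed second screening condition: coefficient >= threshold.
    """
    return [
        s for s in prelim_samples
        if feature_similarity(s, feature_data, impact_factors) >= threshold
    ]


# Hypothetical dimensions and values, for illustration only.
feature_data = {"date_type": "workday", "weather_type": "sunny", "season_type": "summer"}
impact_factors = {"date_type": 0.5, "weather_type": 0.25, "season_type": 0.25}
sample_a = {"date_type": "workday", "weather_type": "sunny", "season_type": "winter"}
sample_b = {"date_type": "holiday", "weather_type": "sunny", "season_type": "winter"}

targets = second_screening([sample_a, sample_b], feature_data, impact_factors, 0.7)
```

Weighting each dimension by its factor value lets dimensions that are more strongly coupled to the load dominate the decision, which is the role the text assigns to the impact factors.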
The above method, optionally, further comprises:
training a preset prediction model with the target sample data, and taking the trained prediction model as a charging station load prediction model;
inputting operating data of the charging station into the charging station load prediction model to obtain a load prediction result for the charging station.
A sample data screening apparatus for a charging station, comprising:
a construction unit, configured to construct a knowledge graph of the charging station;
a first obtaining unit, configured to obtain feature data of the charging station based on the knowledge graph;
a second obtaining unit, configured to obtain each piece of historical sample data of the charging station;
a screening unit, configured to screen the pieces of historical sample data by applying the feature data to obtain target sample data, wherein screening the pieces of historical sample data means selecting or eliminating at least one piece of historical sample data based on correlation information between each piece of historical sample data and the feature data.
In the above apparatus, optionally, the process by which the construction unit constructs the knowledge graph of the charging station comprises:
obtaining semi-structured data and application layer data of the charging station;
extracting load influencing factor data from the semi-structured data;
constructing the knowledge graph using the load influencing factor data and the application layer data.
In the above apparatus, optionally, the process by which the screening unit screens the pieces of historical sample data by applying the feature data to obtain the target sample data comprises:
obtaining a first screening flag for each piece of historical sample data based on the feature data;
determining, as preliminary screening sample data, the historical sample data whose first screening flag satisfies a preset first screening condition;
determining the target sample data based on the pieces of preliminary screening sample data.
In the above apparatus, optionally, the process by which the screening unit determines the target sample data based on the pieces of preliminary screening sample data comprises:
obtaining impact factors related to the feature data;
obtaining a second screening flag for each piece of preliminary screening sample data based on the feature data and the impact factors;
determining, as the target sample data, the preliminary screening sample data whose second screening flag satisfies a preset second screening condition.
In the above apparatus, optionally, the process by which the screening unit obtains the first screening flag for each piece of historical sample data based on the feature data comprises:
obtaining a screening coefficient between each piece of historical sample data and the feature data;
determining the first screening flag of each piece of historical sample data based on a preset screening threshold and the screening coefficient of that piece of historical sample data.
In the above apparatus, optionally, the process by which the screening unit obtains the second screening flag for each piece of preliminary screening sample data based on the feature data and the impact factors comprises:
obtaining a feature similarity coefficient for each piece of preliminary screening sample data based on the feature data and the impact factors;
determining the second screening flag of each piece of preliminary screening sample data based on a preset feature similarity coefficient threshold and the feature similarity coefficient of that piece of preliminary screening sample data.
In the above apparatus, optionally, the process by which the screening unit obtains the feature similarity coefficient for each piece of preliminary screening sample data based on the feature data and the impact factors comprises:
determining a similarity coefficient of each piece of preliminary screening sample data in each feature dimension based on the feature data and the impact factors;
for each piece of preliminary screening sample data, summing its similarity coefficients over all feature dimensions to obtain the feature similarity coefficient of that piece of preliminary screening sample data.
In the above apparatus, optionally, the process by which the screening unit determines the similarity coefficient of each piece of preliminary screening sample data in each feature dimension based on the feature data and the impact factors comprises:
determining a parameter set of each piece of preliminary screening sample data in each feature dimension, the parameter set comprising a first feature value, which is the value in the preliminary screening sample data corresponding to the feature dimension, a second feature value, which is the value in the feature data corresponding to the feature dimension, and a factor value, which is the value in the impact factors corresponding to the feature dimension;
for each piece of preliminary screening sample data, performing an operation on the first feature value, the second feature value, and the factor value in each of its parameter sets to obtain the similarity coefficient of the feature dimension corresponding to that parameter set.
The above apparatus, optionally, further comprises:
a training unit, configured to train a preset prediction model with the target sample data and take the trained prediction model as a charging station load prediction model;
an input unit, configured to input operating data of the charging station into the charging station load prediction model to obtain a load prediction result for the charging station.
Compared with the prior art, the present invention has the following advantages:
The present invention provides a data processing method and apparatus for a charging station, a storage medium, and an electronic device. The method comprises: constructing a knowledge graph of the charging station; obtaining feature data of the charging station based on the knowledge graph; obtaining each piece of historical sample data of the charging station; and screening the pieces of historical sample data by applying the feature data to obtain target sample data. The feature data of the charging station is extracted from the constructed knowledge graph, and the historical sample data is then screened against it: invalid historical sample data that differs greatly from the features of the charging station is filtered out, and historical sample data that is highly similar to the features of the charging station is retained. This effectively reduces the volume of data used when predicting the charging station load and improves the precision of that data, which shortens the time required for prediction and improves prediction accuracy.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flowchart of a data processing method for a charging station provided by an embodiment of the present invention;
Figure 2 is a flowchart of a method for constructing a knowledge graph of a charging station provided by an embodiment of the present invention;
Figure 3 is an example diagram of constructing a knowledge graph and deriving feature data and impact factors from the knowledge graph, provided by an embodiment of the present invention;
Figure 4 is a flowchart of a method for obtaining target sample data from the pieces of historical sample data, provided by an embodiment of the present invention;
Figure 5 is another flowchart of a data processing method for a charging station provided by an embodiment of the present invention;
Figure 6 is a schematic structural diagram of a data processing apparatus for a charging station provided by an embodiment of the present invention;
Figure 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
In this application, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the statement "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
As the number of electric vehicles in use grows, the number of charging stations serving as supporting infrastructure is also growing, and charging load prediction is the basis for charging station planning and dispatching. The charging load of a single vehicle is affected by many factors, such as the user's driving habits, battery characteristics, charging facility characteristics, and charging mode preferences. The total charging load of a charging station is further affected by the number of electric vehicles and the charging location. The superposition of these influencing factors makes the vehicle charging load random, intermittent, and fluctuating in both its temporal and spatial distribution.
Current charging station load prediction methods are mainly either model-driven or data-driven. Model-driven methods mostly build a load prediction model from user travel characteristics, and also involve influencing factors such as traffic and the road network. Data-driven methods predict from historical statistics. Model-driven methods do not need large amounts of historical data, but the accuracy of the model directly determines the accuracy of the final prediction, and they are inflexible. Key influencing factors such as electric vehicle ownership and the urban transportation network are currently changing rapidly, so modeling accuracy is low and prediction errors are large. Data-driven methods are flexible and their accuracy improves over time. Their drawback is that they need large amounts of historical data, and as the data volume grows the computation time keeps lengthening, which severely degrades the performance of the load prediction module and can easily make the prediction result inaccurate.
To solve the above problems, the present invention provides a data processing solution for a charging station that screens the historical sample data of the charging station, removes invalid historical sample data that differs greatly from the features of the charging station, and retains data that is highly similar to the features of the charging station, which effectively reduces the data volume and improves prediction accuracy.
The present invention can be used in numerous general-purpose or special-purpose computing environments or configurations, for example personal computers, server computers, handheld or portable devices, tablet devices, multi-processor apparatuses, and distributed computing environments that include any of the above apparatuses or devices. As an example, the method provided by the present invention is applied to a charging station prediction system.
Referring to Figure 1, a flowchart of a data processing method for a charging station provided by an embodiment of the present invention is described as follows:
S101. Construct a knowledge graph of the charging station.
In the method provided by the embodiment of the present invention, the knowledge graph of the charging station is constructed from the semi-structured data and the application layer data of the charging station.
Further, the knowledge graph contains the various load-related features of the charging station and the various factors that affect the load.
Referring to Figure 2, a flowchart of a method for constructing a knowledge graph of a charging station provided by an embodiment of the present invention is described as follows:
S201. Obtain the semi-structured data and the application layer data of the charging station.
The semi-structured data of the charging station includes, but is not limited to, the number of charging piles, battery capacity, the number of electric vehicles in the area, charging duration, regional traffic conditions, drivers' driving habits, charging price, and the number of parking spaces.
The application layer data includes the feature layer data and the data layer data of the charging station, including but not limited to date type, time period type, holiday type, city type, season type, area type, weather type, and road network type.
S202. Extract load influencing factor data from the semi-structured data.
S203. Construct the knowledge graph using the load influencing factor data and the application layer data.
The data of the charging station is mapped and fused to construct the knowledge graph of the charging station. The knowledge graph captures the mapping and association relationships among the various kinds of charging station data, which makes it easier to extract the feature data and impact factors of the charging station later.
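Steps S201 to S203 can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the text gives no schema or graph representation, so the field names, the relation names, and the choice of storing the graph as a set of (subject, relation, object) triples are all assumptions.

```python
# Hypothetical field names; the text lists these kinds of items but no schema.
LOAD_FACTOR_KEYS = {"charging_pile_count", "battery_capacity_kwh", "charging_price"}


def extract_load_factors(semi_structured):
    """S202: keep only the fields treated as load influencing factors."""
    return {k: v for k, v in semi_structured.items() if k in LOAD_FACTOR_KEYS}


def build_knowledge_graph(station_id, load_factors, application_layer):
    """S203: fuse both sources into a set of (subject, relation, object) triples."""
    triples = set()
    for name, value in load_factors.items():
        triples.add((station_id, "has_load_factor", name))
        triples.add((name, "has_value", value))
    for name, value in application_layer.items():
        triples.add((station_id, "has_context", name))
        triples.add((name, "has_value", value))
    return triples


# S201: data obtained for one station (illustrative values only).
semi_structured = {
    "charging_pile_count": 12,
    "charging_price": 1.2,
    "operator_note": "n/a",  # not a load factor, dropped in S202
}
application_layer = {"city_type": "tier_1", "weather_type": "sunny"}

graph = build_knowledge_graph(
    "station_A", extract_load_factors(semi_structured), application_layer
)
```

A triple set is only one possible representation. A graph database or a dedicated graph library would serve equally well when reasoning over the graph is needed.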
S102. Obtain the feature data of the charging station based on the knowledge graph.
The feature data of the charging station is derived from the knowledge graph. The feature data contains feature values of N feature dimensions. When the feature data is obtained, impact factors related to the feature data can also be obtained, and the impact factors can likewise be used to screen the historical sample data. The impact factors contain factor values of N feature dimensions, where N is a positive integer. Preferably, the impact factors derived from the knowledge graph reflect how strongly each feature is coupled to the charging station load prediction.
Referring to Figure 3, an example diagram of constructing a knowledge graph and deriving feature data and impact factors from the knowledge graph is provided by an embodiment of the present invention. Information is extracted from the semi-structured data of the charging station and fused to obtain load data samples such as charging station information and charging vehicle information (the load influencing factor data described above). The knowledge graph is constructed from the load influencing factor data together with the feature layer and data layer data. Knowledge reasoning and matching then yield the dominant load features of the charging station (the feature data described above) and the participation factors (the impact factors described above).
As shown in Figure 3, the semi-structured data includes, but is not limited to, the number of charging piles, battery capacity, the number of electric vehicles in the area, charging duration, regional traffic conditions, drivers' driving habits, charging price, and the number of parking spaces. The feature layer and data layer are the application layer data. The application layer data includes the feature layer data and the data layer data of the charging station, including but not limited to date type, time period type, holiday type, city type, season type, area type, weather type, and road network type.
S103. Obtain each piece of historical sample data of the charging station.
Preferably, the historical sample data of the charging station is data within a specified time range, for example the charging station's historical data from the past year.
The historical sample data is in fact the historical load feature data of the charging station, which contains the load features of the charging station during operation.
S104. Screen the pieces of historical sample data by applying the feature data to obtain the target sample data.
It should be noted that screening the pieces of historical sample data means selecting or eliminating at least one piece of historical sample data based on correlation information between each piece of historical sample data and the feature data.
It should also be noted that the feature data is used to screen the pieces of historical sample data so that sample data with high similarity and high relevance to the features of the charging station is retained and sample data that differs greatly from the features of the charging station is filtered out. This reduces the volume of the sample data and improves its precision.
Preferably, multiple pieces of target sample data are obtained. After the target sample data is obtained, it is used to train a preset prediction model, and the trained prediction model is taken as the charging station load prediction model. The operating data of the charging station is then input into the charging station load prediction model to obtain the load prediction result for the charging station. It should be noted that the prediction model may be a model built with a neural network, and the operating data of the charging station may be data generated while the charging station operates within a preset time range, including but not limited to the number of charging piles at the charging station, the number of charging piles operating normally, the number of faulty charging piles, and the charging data of the charging piles.
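The train-then-predict flow can be illustrated with a deliberately simple stand-in. The text says the prediction model may be built with a neural network; the sketch below substitutes a one-variable least-squares line so that it runs without any external library. The single input (number of normally operating charging piles) and the sample values are hypothetical.

```python
def train_linear_model(samples):
    """Fit load ~= a * x + b by ordinary least squares on (x, load) pairs.

    Stand-in for the "preset prediction model"; this is not a neural network.
    """
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return lambda x: a * x + b


# Target sample data after screening: (operating piles, observed load in kW).
target_samples = [(2, 50.0), (4, 90.0), (6, 130.0)]

# Train on the screened samples, then feed in current operating data.
load_model = train_linear_model(target_samples)
predicted_load = load_model(5)
```

The point of the sketch is the data flow: only the screened target samples reach the training step, so training cost scales with the reduced sample count rather than with the full history.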
In the embodiment provided by the present invention, a knowledge graph of the charging station is constructed; the feature data of the charging station is obtained based on the knowledge graph; each piece of historical sample data of the charging station is obtained; and the pieces of historical sample data are screened by applying the feature data to obtain the target sample data. The feature data of the charging station is extracted from the constructed knowledge graph, and the historical sample data is then screened against it: invalid historical sample data that differs greatly from the features of the charging station is filtered out, and historical sample data that is highly similar to the features of the charging station is retained. This effectively reduces the volume of data used when predicting the charging station load and improves the precision of that data, which shortens the time required for prediction and improves prediction accuracy.
Referring to Figure 4, which is a flowchart of a method for obtaining the target sample data from the historical sample data according to an embodiment of the present invention, the details are as follows:
S301. Based on the feature data, obtain a first screening flag for each piece of historical sample data.
Preferably, the historical sample data may be referred to as charging station load sample features; further, each piece of historical sample data contains feature values for N feature dimensions.
Preferably, the charging station load feature data involved in the present invention may be expressed as F = [f_1, f_2, ..., f_N], where f_N denotes the feature value of the N-th feature dimension of the charging station. Preferably, the feature data of the charging station may be denoted F_g and a piece of historical sample data may be denoted F_yi; the feature data of the charging station likewise contains feature values for N feature dimensions, where N is a positive integer.
Preferably, the first screening flag of each piece of historical sample data is obtained as follows: obtain the screening coefficient between each piece of historical sample data and the feature data; then, based on a preset screening threshold and the screening coefficient of each piece of historical sample data, determine the first screening flag of each piece of historical sample data.
Preferably, the screening coefficient may be the Jaccard coefficient. Each piece of historical sample data and the feature data may be treated as one group of data, and each group is evaluated according to a preset coefficient calculation to obtain the Jaccard coefficient of that group.
Illustratively, the preset coefficient calculation takes the standard Jaccard form:
$$J_i = \frac{|F_g \cap F_{yi}|}{|F_g \cup F_{yi}|}$$
where J_i denotes the Jaccard coefficient of historical sample data number i, F_g denotes the feature data of the charging station, and F_yi denotes historical sample data number i, with i = 1, 2, ..., m and m a positive integer.
Illustratively, the feature data F_g of the charging station and the pieces of historical sample data may be expressed as:
$$\begin{aligned}
F_g &= [f_{g1}, f_{g2}, \ldots, f_{gN}] \\
F_{y1} &= [f_{y11}, f_{y12}, \ldots, f_{y1N}] \\
F_{y2} &= [f_{y21}, f_{y22}, \ldots, f_{y2N}] \\
&\;\;\vdots \\
F_{ym} &= [f_{ym1}, f_{ym2}, \ldots, f_{ymN}]
\end{aligned}$$
where F_g is the feature data of the charging station; F_y1, F_y2, F_y3, ..., F_ym are the pieces of historical sample data; f_gN is the feature value of the N-th feature dimension in the feature data F_g; and f_ymN is the feature value of the N-th feature dimension in the m-th piece of historical sample data F_ym.
Further, the first screening flag of each piece of historical sample data is determined from the preset screening threshold and the screening coefficient of that sample data as follows:
$$T_i = \begin{cases} 1, & J_i > K_1 \\ 0, & J_i \le K_1 \end{cases}$$
where K_1 is the preset screening threshold and T_i is the first screening flag of the i-th piece of historical sample data.
Illustratively, when the screening coefficient J_i of historical sample data i is greater than the screening threshold, the first screening flag T_i of historical sample data i is 1; when J_i is not greater than the screening threshold, T_i is 0. Further, the screening threshold may be set according to actual needs.
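The first screening described above can be sketched in a few lines. This is a hedged illustration, not the definitive implementation: it assumes each feature vector is viewed as a set of (dimension, value) pairs so that the Jaccard coefficient counts the dimensions on which a sample agrees with the station, and the function names, feature values, and threshold `k1 = 0.5` are hypothetical.

```python
from typing import Hashable, List, Sequence


def jaccard_coefficient(station: Sequence[Hashable],
                        sample: Sequence[Hashable]) -> float:
    """Jaccard coefficient J_i between feature data F_g and one sample F_yi.

    Assumption: each vector is treated as a set of (dimension, value) pairs.
    """
    a = set(enumerate(station))
    b = set(enumerate(sample))
    union = a | b
    if not union:
        return 0.0  # two empty vectors: treated as no overlap
    return len(a & b) / len(union)


def first_screening_flags(station: Sequence[Hashable],
                          samples: Sequence[Sequence[Hashable]],
                          k1: float) -> List[int]:
    """T_i = 1 when J_i > K_1, otherwise 0."""
    return [1 if jaccard_coefficient(station, s) > k1 else 0 for s in samples]


# Hypothetical feature values for a station and three historical samples.
station = ["urban", "fast", "weekday", "commercial"]
samples = [
    ["urban", "fast", "weekday", "commercial"],   # agrees on 4 of 4 dimensions
    ["urban", "slow", "weekday", "commercial"],   # agrees on 3 of 4 dimensions
    ["rural", "slow", "weekend", "residential"],  # agrees on none
]
flags = first_screening_flags(station, samples, k1=0.5)  # [1, 1, 0]
```

With this set view, a sample that agrees on 3 of 4 dimensions has an intersection of 3 pairs and a union of 5 pairs, giving J = 0.6, which passes the hypothetical threshold of 0.5.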
S302. Determine the historical sample data whose first screening flag satisfies a preset first screening condition as pre-screened sample data.
Continuing from the description of the first screening flag in S301: a first screening flag with a value of 1 satisfies the preset first screening condition, and a first screening flag with a value of 0 does not.
Preferably, the historical sample data whose first screening flag does not satisfy the first screening condition is deleted, and the historical sample data whose first screening flag satisfies the first screening condition is determined as pre-screened sample data.
S303. Determine the target sample data based on the pre-screened sample data.
Illustratively, every piece of pre-screened sample data may be determined as target sample data; alternatively, the pre-screened sample data may be screened further to obtain the target sample data.
The process of further screening the pre-screened sample data to obtain the target sample data is as follows:
Obtain the influence factors related to the feature data;
Based on the feature data and the influence factors, obtain a second screening flag for each piece of pre-screened sample data;
Determine the pre-screened sample data whose second screening flag satisfies a preset second screening condition as the target sample data.
Further, for how the influence factors related to the feature data are obtained, refer to the description of step S102; it is not repeated here.
By using both the feature data and the influence factors, the historical sample data can be screened more than once, which improves the accuracy of the resulting data.
It should be noted that the second screening flag of each piece of pre-screened sample data is determined as follows: based on the feature data and the influence factors, obtain the feature similarity coefficient of each piece of pre-screened sample data; then, based on a preset feature similarity coefficient threshold and the feature similarity coefficient of each piece of pre-screened sample data, determine its second screening flag.
To determine the feature similarity coefficient of a piece of pre-screened sample data, the similarity coefficient of that sample data in each feature dimension is determined first.
The similarity coefficient of the pre-screened sample data in each feature dimension is determined as follows: determine the parameter set of each piece of pre-screened sample data in each feature dimension, where the parameter set includes the first feature value (the value in the pre-screened sample data corresponding to the feature dimension), the second feature value (the value in the feature data corresponding to the feature dimension), and the factor value (the value among the influence factors corresponding to the feature dimension). For each piece of pre-screened sample data, the first feature value, second feature value, and factor value in each of its parameter sets are combined to obtain the similarity coefficient of the feature dimension corresponding to that parameter set.
Preferably, taking pre-screened sample data F_yi and feature data F_g as an example, determine the parameter set Y_j of the pre-screened sample data in feature dimension j, where the maximum value of j is N. The parameter set Y_j contains f_yij, f_gj, and k_j: f_yij is the first feature value, corresponding to feature dimension j in the pre-screened sample data; f_gj is the second feature value, corresponding to feature dimension j in the feature data; and k_j is the factor value corresponding to feature dimension j among the influence factors.
It should be noted that because the pre-screened sample data contains feature values for N feature dimensions, N parameter sets are obtained.
The similarity coefficient corresponding to parameter set Y_j is H = k_j (f_gj ⊙ f_yij), where H is the similarity of the pre-screened sample data in feature dimension j.
Therefore, adding up the similarity coefficients of a piece of pre-screened sample data across all feature dimensions gives its feature similarity coefficient.
Further, the feature similarity coefficient of the pre-screened sample data may also be obtained as follows:
$$S(F_g, F_{yi}) = \sum_{j=1}^{N} k_j \, (f_{gj} \odot f_{yij})$$
where S(F_g, F_yi) is the feature similarity coefficient between the feature data of the charging station and historical sample data F_yi, which can be understood as the feature similarity coefficient of F_yi and may also be written S_i; k_j is the factor value corresponding to the j-th feature dimension among the influence factors; f_yij is the first feature value, corresponding to the j-th feature dimension in the pre-screened sample data; f_gj is the second feature value, corresponding to the j-th feature dimension in the feature data; further, f_gj ⊙ f_yij is the XNOR (exclusive-NOR) of the j-th feature value f_gj in the feature data and the j-th feature value f_yij in the pre-screened sample data.
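The weighted XNOR sum above can be illustrated with a short sketch. It assumes that the XNOR of two feature values is 1 when the values are equal and 0 otherwise (which is exactly XNOR for binary features); the function names, the binary feature vectors, and the factor values are hypothetical.

```python
from typing import Hashable, Sequence


def xnor(a: Hashable, b: Hashable) -> int:
    """1 when the two feature values agree, 0 otherwise (XNOR for binary values)."""
    return 1 if a == b else 0


def feature_similarity(station: Sequence[Hashable],
                       sample: Sequence[Hashable],
                       factors: Sequence[float]) -> float:
    """S(F_g, F_yi): sum over dimensions j of k_j * (f_gj XNOR f_yij)."""
    if not (len(station) == len(sample) == len(factors)):
        raise ValueError("feature data, sample data and factors must share N dimensions")
    return sum(k * xnor(g, y) for k, g, y in zip(factors, station, sample))


# Hypothetical binary features and influence factors for N = 4 dimensions.
f_g = [1, 0, 1, 1]
f_y = [1, 1, 1, 0]
k = [0.4, 0.3, 0.2, 0.1]
s_i = feature_similarity(f_g, f_y, k)  # dimensions 1 and 3 agree: 0.4 + 0.2
```

Because each dimension contributes its factor value only when the sample agrees with the station, a dimension with a larger influence factor has more weight in deciding whether a sample is retained.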
Further, after the feature similarity coefficient of each piece of pre-screened sample data is obtained, its second screening flag is determined based on the preset feature similarity coefficient threshold as follows:
$$R_i = \begin{cases} 1, & S_i > K_2 \\ 0, & S_i \le K_2 \end{cases}$$
where K_2 is the feature similarity coefficient threshold and R_i is the second screening flag of the i-th piece of historical sample data.
Illustratively, when the feature similarity coefficient S_i of historical sample data i is greater than the feature similarity coefficient threshold, the second screening flag R_i of historical sample data i is 1; when S_i is not greater than the threshold, R_i is 0. Further, the feature similarity coefficient threshold may be set according to actual needs.
Continuing from the above description of the second screening flag: a second screening flag with a value of 1 satisfies the preset second screening condition, and a second screening flag with a value of 0 does not.
Referring to Figure 5, which is another flowchart of a data processing method for a charging station according to an embodiment of the present invention, the specific flow is as follows:
Step S1: Extract and fuse the information of the charging station to construct a knowledge graph; then use the knowledge graph to derive the dominant features and participation factors of the charging station. The dominant features here are the feature data described above, and the participation factors here are the influence factors described above.
Step S2: Based on the dominant features of the charging station, calculate the Jaccard coefficient between each piece of historical sample data and the dominant features.
Step S3: Perform the first screening of the samples according to the calculated Jaccard coefficients to obtain a pre-screened training sample set, which contains the pieces of historical sample data whose Jaccard coefficients satisfy the condition.
Step S4: Based on the participation factors, calculate the feature similarity coefficient of each piece of historical sample data in the pre-screened training sample set, and perform a second screening of the set according to the calculated feature similarity coefficients to obtain the final training sample set for prediction.
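Steps S2 to S4 can be sketched end to end as follows. This is a minimal illustration under stated assumptions: the dominant features and participation factors are taken as already derived in step S1, feature vectors are compared dimension by dimension, and the function names, sample vectors, and thresholds `k1` and `k2` are hypothetical.

```python
from typing import List, Sequence


def jaccard(g: Sequence[int], y: Sequence[int]) -> float:
    """Jaccard coefficient over (dimension, value) pairs (assumed set view)."""
    a, b = set(enumerate(g)), set(enumerate(y))
    return len(a & b) / len(a | b) if (a or b) else 0.0


def weighted_similarity(g: Sequence[int], y: Sequence[int],
                        k: Sequence[float]) -> float:
    """Sum of participation factors over the dimensions where g and y agree."""
    return sum(kj for kj, gj, yj in zip(k, g, y) if gj == yj)


def two_stage_screen(g: Sequence[int], samples: Sequence[Sequence[int]],
                     k: Sequence[float], k1: float, k2: float) -> List[Sequence[int]]:
    # Steps S2 and S3: first screening by Jaccard coefficient.
    pre_screened = [y for y in samples if jaccard(g, y) > k1]
    # Step S4: second screening by weighted feature similarity.
    return [y for y in pre_screened if weighted_similarity(g, y, k) > k2]


dominant = [1, 0, 1, 1]          # hypothetical dominant features
factors = [0.4, 0.3, 0.2, 0.1]   # hypothetical participation factors
history = [
    [1, 0, 1, 1],  # identical to the station
    [1, 0, 1, 0],  # differs only on the least weighted dimension
    [0, 1, 1, 1],  # agrees on 2 dimensions, fails the first screening
    [0, 0, 1, 1],  # passes the first screening, fails the second
]
training_set = two_stage_screen(dominant, history, factors, k1=0.5, k2=0.7)
```

The fourth sample shows why the second stage matters: it agrees with the station on three of four dimensions, so it survives the Jaccard screening, but it disagrees on the most heavily weighted dimension and is therefore removed by the weighted similarity check.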
By applying the present invention, the features of the historical load data samples can be fully exploited. The prediction sample set obtained through the two-stage screening better ensures consistency with the features of the prediction target and significantly improves the accuracy of the prediction model. Many weakly similar samples are screened out, which substantially reduces the amount of data used in computation and speeds up training; the gain in load prediction speed is especially pronounced in application scenarios with many historical data samples.
An embodiment of the present invention further provides a data processing device for a charging station. The device is arranged in a charging station prediction system and can support the specific implementation of the method shown in Figure 1.
Referring to Figure 6, which is a schematic structural diagram of a data processing device for a charging station according to an embodiment of the present invention, the details are as follows:
A construction unit 601, configured to construct a knowledge graph of the charging station;
A first acquisition unit 602, configured to obtain the feature data of the charging station based on the knowledge graph;
A second acquisition unit 603, configured to obtain the historical sample data of the charging station;
A screening unit 604, configured to apply the feature data to screen the historical sample data and obtain the target sample data; here, screening the historical sample data means selecting or removing at least one piece of historical sample data based on the correlation information between each piece of historical sample data and the feature data.
In the embodiments provided by the present invention, a knowledge graph of the charging station is constructed; feature data of the charging station is obtained based on the knowledge graph; the historical sample data of the charging station is obtained; and the feature data is applied to screen the historical sample data to obtain the target sample data. The feature data of the charging station is extracted through the constructed knowledge graph, and the historical sample data is then screened based on that feature data: historical sample data that differs greatly from the features of the charging station and is therefore invalid is screened out, while historical sample data highly similar to the features of the charging station is retained. This effectively reduces the volume of data used when predicting the charging station load, improves the accuracy of the data used, shortens the time required for prediction, and improves the prediction accuracy.
In another embodiment provided by the present invention, the process by which the construction unit of the device constructs the knowledge graph of the charging station includes:
Obtaining the semi-structured data and the application layer data of the charging station;
Extracting load influencing factor data from the semi-structured data;
Constructing the knowledge graph using the load influencing factor data and the application layer data.
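The three construction steps above can be sketched as follows. This is only a rough illustration under loudly stated assumptions: the source does not specify the data formats or the graph representation, so the semi-structured data is assumed to be a flat record, the knowledge graph is assumed to be a mapping from a station node to (relation, value) edges, and every field name and function name here is hypothetical.

```python
from typing import Any, Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, Any]]]  # node -> list of (relation, value) edges

# Hypothetical fields assumed to carry load influencing factors.
LOAD_FACTOR_FIELDS = ("weather", "day_type", "location_type")


def extract_load_factors(semi_structured: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields assumed to influence the charging load."""
    return {f: semi_structured[f] for f in LOAD_FACTOR_FIELDS if f in semi_structured}


def build_knowledge_graph(station_id: str,
                          load_factors: Dict[str, Any],
                          app_layer: Dict[str, Any]) -> Graph:
    """Attach load factors and application layer data to the station node."""
    edges: List[Tuple[str, Any]] = []
    edges.extend(load_factors.items())
    edges.extend(app_layer.items())
    return {station_id: edges}


# Hypothetical inputs for one charging station.
semi = {"weather": "rain", "day_type": "weekday", "operator_note": "n/a"}
app = {"pile_count": 12}
graph = build_knowledge_graph("station-1", extract_load_factors(semi), app)
```

A real knowledge graph would typically hold many entity types and relations; the sketch only shows the order of operations, with extraction from the semi-structured data happening before the graph is assembled.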
In another embodiment provided by the present invention, the process by which the screening unit of the device applies the feature data to screen the historical sample data and obtain the target sample data includes:
Based on the feature data, obtaining the first screening flag of each piece of historical sample data;
Determining the historical sample data whose first screening flag satisfies the preset first screening condition as pre-screened sample data;
Determining the target sample data based on the pre-screened sample data.
In another embodiment provided by the present invention, the process by which the screening unit of the device determines the target sample data based on the pre-screened sample data includes:
Obtaining the influence factors related to the feature data;
Based on the feature data and the influence factors, obtaining the second screening flag of each piece of pre-screened sample data;
Determining the pre-screened sample data whose second screening flag satisfies the preset second screening condition as the target sample data.
In another embodiment provided by the present invention, the process by which the screening unit of the device obtains the first screening flag of each piece of historical sample data based on the feature data includes:
Obtaining the screening coefficient between each piece of historical sample data and the feature data;
Based on the preset screening threshold and the screening coefficient of each piece of historical sample data, determining the first screening flag of each piece of historical sample data.
In another embodiment provided by the present invention, the process by which the screening unit of the device obtains the second screening flag of each piece of pre-screened sample data based on the feature data and the influence factors includes:
Based on the feature data and the influence factors, obtaining the feature similarity coefficient of each piece of pre-screened sample data;
Based on the preset feature similarity coefficient threshold and the feature similarity coefficient of each piece of pre-screened sample data, determining the second screening flag of each piece of pre-screened sample data.
In another embodiment provided by the present invention, the process by which the screening unit of the device obtains the feature similarity coefficient of each piece of pre-screened sample data based on the feature data and the influence factors includes:
Based on the feature data and the influence factors, determining the similarity coefficient of each piece of pre-screened sample data in each feature dimension;
For each piece of pre-screened sample data, summing its similarity coefficients to obtain the feature similarity coefficient of that pre-screened sample data.
In another embodiment provided by the present invention, the process by which the screening unit of the device determines the similarity coefficient of each piece of pre-screened sample data in each feature dimension based on the feature data and the influence factors includes:
Determining the parameter set of each piece of pre-screened sample data in each feature dimension, where the parameter set includes the first feature value in the pre-screened sample data corresponding to the feature dimension, the second feature value in the feature data corresponding to the feature dimension, and the factor value among the influence factors corresponding to the feature dimension;
For each piece of pre-screened sample data, combining the first feature value, the second feature value, and the factor value in each of its parameter sets to obtain the similarity coefficient of the feature dimension corresponding to that parameter set.
In another embodiment provided by the present invention, the device further includes:
A training unit, configured to train a preset prediction model using the target sample data and to use the trained prediction model as the charging station load prediction model;
An input unit, configured to input the operating data of the charging station into the charging station load prediction model to obtain the load prediction result of the charging station.
An embodiment of the present invention further provides a storage medium that includes stored instructions, where, when the instructions run, the device on which the storage medium resides is controlled to execute the above data processing method for a charging station.
An embodiment of the present invention further provides an electronic device, whose schematic structural diagram is shown in Figure 7. It includes a memory 701 and one or more instructions 702, where the one or more instructions 702 are stored in the memory 701 and are configured to be executed by one or more processors 703 to perform the above data processing method for a charging station.
The specific implementation processes of the above embodiments and their derivatives all fall within the protection scope of the present invention; the information, data, and other content used in this application are all lawful.
The embodiments in this specification are described in a progressive manner; for parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system or system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiments. The system and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement it without creative effort.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms according to function. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The above description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310884206.4A (publication CN116933019A) | 2023-07-18 | 2023-07-18 | Charging station data processing method and device, storage medium and electronic equipment |
| Publication Number | Publication Date |
|---|---|
| CN116933019A | 2023-10-24 |
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310884206.4APendingCN116933019A (en) | 2023-07-18 | 2023-07-18 | Charging station data processing method and device, storage medium and electronic equipment |
| Country | Link |
|---|---|
| CN (1) | CN116933019A (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118674175A (en)* | 2024-08-22 | 2024-09-20 | 国网安徽省电力有限公司合肥供电公司 | Electric vehicle charging load characteristic modeling method based on knowledge graph |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110093249A1 (en)* | 2009-10-19 | 2011-04-21 | Theranos, Inc. | Integrated health data capture and analysis system |
| WO2017162063A1 (en)* | 2016-03-24 | 2017-09-28 | 阿里巴巴集团控股有限公司 | Similarity processing method and object screening method and device |
| CN111985719A (en)* | 2020-08-27 | 2020-11-24 | 华中科技大学 | Power load prediction method based on improved long-term and short-term memory network |
| CN112183878A (en)* | 2020-10-13 | 2021-01-05 | 东北大学 | A power load forecasting method combining knowledge graph and neural network |
| US20210209467A1 (en)* | 2018-09-25 | 2021-07-08 | Ennew Digital Technology Co., Ltd. | Method and device for predicting thermal load of electrical system |
| CN113379143A (en)* | 2021-06-23 | 2021-09-10 | 阳光电源股份有限公司 | Typical meteorological year construction method, power generation amount prediction method and related device |
| CN115775053A (en)* | 2022-12-21 | 2023-03-10 | 福州大学 | Distributed photovoltaic power short-term prediction method based on improved similar time method |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110093249A1 (en)* | 2009-10-19 | 2011-04-21 | Theranos, Inc. | Integrated health data capture and analysis system |
| WO2017162063A1 (en)* | 2016-03-24 | 2017-09-28 | 阿里巴巴集团控股有限公司 | Similarity processing method and object screening method and device |
| US20210209467A1 (en)* | 2018-09-25 | 2021-07-08 | Ennew Digital Technology Co., Ltd. | Method and device for predicting thermal load of electrical system |
| CN111985719A (en)* | 2020-08-27 | 2020-11-24 | 华中科技大学 | Power load prediction method based on improved long-term and short-term memory network |
| CN112183878A (en)* | 2020-10-13 | 2021-01-05 | 东北大学 | A power load forecasting method combining knowledge graph and neural network |
| CN113379143A (en)* | 2021-06-23 | 2021-09-10 | 阳光电源股份有限公司 | Typical meteorological year construction method, power generation amount prediction method and related device |
| CN115775053A (en)* | 2022-12-21 | 2023-03-10 | 福州大学 | Distributed photovoltaic power short-term prediction method based on improved similar time method |
| Title |
|---|
| 李恒杰;朱月阳;陈伟;吕俊青;: "基于节点-支路信息的电动汽车充电站负荷预测", 电气自动化, no. 03, 30 May 2020 (2020-05-30), pages 17 - 20* |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118674175A (en)* | 2024-08-22 | 2024-09-20 | 国网安徽省电力有限公司合肥供电公司 | Electric vehicle charging load characteristic modeling method based on knowledge graph |
| Publication | Publication Date | Title |
|---|---|---|
| Wang et al. | | A hybrid model for prediction in asphalt pavement performance based on support vector machine and grey relation analysis |
| CN111950603B (en) | | Prediction method and device for road section traffic accident rate and computer storage medium |
| CN110555148B (en) | | User behavior evaluation method, computing device and storage medium |
| CN105786681A (en) | | Server performance evaluating and server updating method for data center |
| CN113838303B (en) | | Parking lot recommendation method and device, electronic equipment and storage medium |
| CN112069635B (en) | | Method and device for deploying battery changing cabinet, medium and electronic equipment |
| CN111464337B (en) | | Resource allocation method and device and electronic equipment |
| CN108876056A (en) | | A kind of shared bicycle Demand Forecast method, apparatus, equipment and storage medium |
| Lee et al. | | Dynamic BIM component recommendation method based on probabilistic matrix factorization and grey model |
| CN113570867A (en) | | An urban traffic state prediction method, device, device and readable storage medium |
| CN111859172A (en) | | Information pushing method and device, electronic equipment and computer readable storage medium |
| EP3192061B1 (en) | | Measuring and diagnosing noise in urban environment |
| CN107862555A (en) | | Forecasting system and method based on exponential smoothing |
| CN116933019A (en) | | Charging station data processing method and device, storage medium and electronic equipment |
| CN111143769B (en) | | Travel mode sharing rate prediction method and prediction device based on big data |
| CN112990530A (en) | | Regional population number prediction method and device, electronic equipment and storage medium |
| CN116030617B (en) | | A method and device for predicting traffic flow based on highway OD data |
| CN115860626A (en) | | Path planning method and device, electronic equipment and storage medium |
| CN119203304A (en) | | Highway route selection optimization method and system based on AI and quantitative parameter constraints |
| Zheng et al. | | Investigating the transferability of Bayesian hierarchical extreme value model for traffic conflict-based crash estimation |
| Rahman et al. | | MDLpark: available parking prediction for smart parking through mobile deep learning |
| CN113516315A (en) | | Wind power generation power interval prediction method, device and medium |
| CN116030616B (en) | | A method and device for predicting traffic volume using big data |
| CN116071912B (en) | | A method and device for determining road traffic volume distribution |
| CN115271157A (en) | | Multi-task traffic flow prediction method, device, terminal and storage medium based on Transformer |
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |