CN102457569B - Redundancy check method and system for Web services facing IOT (Internet of Things) application - Google Patents

Redundancy check method and system for Web services facing IOT (Internet of Things) application

Info

Publication number
CN102457569B
Authority
CN
China
Prior art keywords
service
similarity
rule
services
wsdl
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201110206923.9A
Other languages
Chinese (zh)
Other versions
CN102457569A (en)
Inventor
牛温佳
徐月梅
赵志军
唐晖
谭红艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Original Assignee
Institute of Acoustics CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Acoustics CAS
Priority to CN201110206923.9A
Publication of CN102457569A
Application granted
Publication of CN102457569B
Expired - Fee Related
Anticipated expiration


Abstract

Translated from Chinese

The invention relates to a redundancy detection method and system for Web services oriented to Internet of Things (IoT) applications. The method comprises the following steps: a service acquisition step, in which a Web server obtains, through an interface, the services to be checked for redundancy; a rule-based WSDL parsing and rule vectorization step, in which each service is described using a rule-based WSDL syntax and the Web service uses an extended WSDL service element that adds a rule description of the service function; a similarity calculation step; and a redundancy decision step. The invention has the following advantages: 1. for the first time, the WSDL syntax is extended with rules based on logical expressions, which adds a rule description of service functions and supports finer-grained service similarity calculation; 2. for the first time, logical expression rules are introduced into service similarity calculation, and on that basis a new service redundancy detection method is given, which improves the ability to detect and identify redundant services in the IoT.

Figure 201110206923

Description

Redundancy detection method and system for Web services oriented to Internet of Things applications

Technical field

The invention relates to service management methods for the Internet of Things, and in particular to a redundancy detection method and system for Web service representations oriented to Internet of Things applications.

Background art

The Internet of Things (IoT) is an emerging concept: a network that connects any item to the Internet through information-sensing devices such as radio frequency identification (RFID), infrared sensors, global positioning systems, and laser scanners, so that information can be exchanged and communicated to achieve intelligent identification, positioning, tracking, monitoring, and management. As this definition shows, the core and foundation of the IoT is still the Internet; the difference is that the connected endpoints are extended from users to objects. Therefore, deploying the IoT application services offered by providers over the Internet, so that clients can access and invoke them remotely, will be an inevitable trend in IoT service management.

Web services (WS) are currently the principal service implementation technology on the Internet. Software applications are described and encapsulated with the Web Services Description Language (WSDL), identified and located with Uniform Resource Identifiers (URIs), and communicate through the Simple Object Access Protocol (SOAP), which ultimately enables interoperation among services across regions and industries.

Web service technology provides effective technical support for service management on the Internet. For concrete IoT applications, however, Web services still face the following problems and challenges:

First, from the user's perspective, IoT services are massive in scale. On the one hand, most of these services rely on underlying sensing devices (such as sensors), so the data processed by each service grows exponentially over time; on the other hand, the items connected to the Internet are of many kinds with differing functions, which leads to massive growth in the number of services. If this scale is not controlled, without degrading the quality of service for users, it will severely strain existing storage, search, and communication technologies and become a bottleneck for IoT development at the current stage.

Second, from the provider's perspective, IoT services are dynamic. As sensing devices such as sensors are used more frequently, problems such as excessive energy consumption and reduced sensitivity arise. Frequently replacing hardware, or rebuilding and republishing services whenever sensor performance changes, wastes resources heavily and is not a reasonable solution. It is therefore very important to adjust the corresponding services dynamically and promptly when the performance of sensing devices changes.

In IoT applications, the traditional Web service representation (WSDL 2.0) does not characterize service functions adequately and, in particular, lacks any description of a service's core rules. Services with similar functions therefore cannot be told apart, service redundancy cannot be eliminated, and the storage and search burden of IoT services increases greatly. On the basis of WSDL 2.0, the invention therefore proposes a new Web service representation that extends the WSDL syntax with rules based on logical expressions, and further provides a corresponding service redundancy detection method.

Given the rapid development of IoT applications, solving these problems of Web service technology is urgent. Our analysis shows that the key is to seek a breakthrough in the core foundation of Web service technology, namely Web service representation.

At present, Web service representation mainly uses the WSDL description language proposed by IBM, Microsoft, and other companies in March 2001; after revision it became a W3C Recommendation in June 2007, and the latest version is WSDL 2.0. WSDL 2.0 is a service description specification built on XML. Syntactically, description is the root element, which encapsulates four child elements: types, interface, binding, and service. The types element defines the data types used in the messages the service exchanges; the interface element defines the concrete Web service operations, including the service's inputs, outputs, and the sequence of fault messages returned when the service fails; the binding element defines the protocol by which users communicate with the Web service; and the service element declares a unique Web service access address for each binding element. The WSDL 2.0 syntax framework is shown in Figure 6.
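The structure described above can be illustrated with a short Python sketch that reads the four child elements of a WSDL 2.0 description and the endpoint address declared under service. The sample document is a hypothetical, deliberately minimal skeleton (the names LightService, LightInterface and the example.org URLs are assumptions made for illustration, not taken from the patent); only the standard-library ElementTree parser is used.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://www.w3.org/ns/wsdl"  # WSDL 2.0 namespace

# Hypothetical, minimal WSDL 2.0 skeleton; all names and URLs are illustrative.
SAMPLE_WSDL = """<description xmlns="http://www.w3.org/ns/wsdl"
    targetNamespace="http://example.org/light"
    xmlns:tns="http://example.org/light">
  <types/>
  <interface name="LightInterface"/>
  <binding name="LightBinding" interface="tns:LightInterface"/>
  <service name="LightService" interface="tns:LightInterface">
    <endpoint name="LightEndpoint" binding="tns:LightBinding"
              address="http://example.org/light/control"/>
  </service>
</description>"""


def top_level_elements(wsdl_text):
    """Return the local names of the children of the description root."""
    root = ET.fromstring(wsdl_text)
    return [child.tag.split("}", 1)[-1] for child in root]


def endpoint_addresses(wsdl_text):
    """Return the access addresses declared by endpoint elements."""
    root = ET.fromstring(wsdl_text)
    return [ep.get("address") for ep in root.iter(f"{{{WSDL_NS}}}endpoint")]


print(top_level_elements(SAMPLE_WSDL))
print(endpoint_addresses(SAMPLE_WSDL))
```

Note that the only function-related information recoverable from the service element is an address, which is the limitation the following paragraphs discuss.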

In fact, however WSDL is defined and extended syntactically, it follows one basic principle when describing Web services: the characterization of any Web service must fully account for three aspects of service semantics, namely input, output, and service function. Figure 1 shows how the four child elements of the WSDL 2.0 syntax map to service input, output, and service function.

In summary, WSDL 2.0 has the following three characteristics when characterizing service semantics:

First, the types and interface elements cover most of the content related to service input and output, including data types, message formats, and message transmission order.

Second, the service element is mainly used to characterize the service function. Note that it gives no concrete functional description, only the access address of the service function.

Third, the binding element defines the message format (such as SOAP) and transport protocol (such as HTTP) of the service's input and output. Compared with the other three child elements, it has no effect on the core content of the Web service from the standpoint of service function.

From this analysis it is clear that, syntactically, only the service element of current WSDL 2.0 involves any description related to service function. There is a single description interface for service functions and the descriptive information is insufficient, so two similar Web services cannot be distinguished by their WSDL descriptions. Consider room lighting control: one service turns off the light if the brightness in the room is greater than 300 nits, while another turns off the light if the brightness is less than 500 nits. Both take brightness as input, both output a light-off command, and both describe the service function only as an access address. Traditional WSDL has no way to express or distinguish the core rules involved in these services.
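The lighting example can be made concrete with a small Python sketch. The ServiceDescription class and its field names are hypothetical stand-ins for what a WSDL description exposes; the rule field represents information that plain WSDL 2.0 cannot carry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    # Hypothetical model of a service: inputs and outputs are visible in
    # WSDL 2.0, while the rule is NOT expressible in plain WSDL 2.0.
    inputs: frozenset
    outputs: frozenset
    rule: str


def io_signature(service):
    """What a comparison based only on WSDL 2.0 input/output can see."""
    return (service.inputs, service.outputs)


service_a = ServiceDescription(
    frozenset({"brightness"}), frozenset({"light_off"}), "brightness > 300"
)
service_b = ServiceDescription(
    frozenset({"brightness"}), frozenset({"light_off"}), "brightness < 500"
)

same_io = io_signature(service_a) == io_signature(service_b)
same_rule = service_a.rule == service_b.rule
print(same_io, same_rule)
```

The two services look identical from their input/output signature even though their rules differ, which is why a rule description is needed to tell them apart.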

To sum up, in IoT applications the defect of the traditional Web service representation (WSDL 2.0) is that it does not characterize service functions adequately and, in particular, lacks a description of core rules. As a result, services with similar functions cannot be distinguished, service redundancy cannot be eliminated, and the storage and search burden of IoT services increases greatly.

Therefore, on the basis of traditional WSDL, the invention proposes a new Web service representation method and further provides a corresponding service redundancy detection method.

Summary of the invention

The purpose of the invention is to overcome a problem of prior-art redundancy detection methods based on Web service representations for IoT applications: they lack a description of the core rules of different services, so services with similar functions cannot be distinguished and service redundancy cannot be eliminated, which greatly increases the storage and search burden of IoT services. To this end, the invention provides a redundancy detection method and system for Web service representations oriented to IoT applications.

Because the service element in a Web service description involves only the access address of the service function, the invention starts from the service element and introduces a rule description that captures the relationship between a service's inputs and outputs, thereby providing a redundancy detection method based on Web service representations for IoT applications.

To achieve this purpose, the invention performs redundancy detection by calculating the similarity between several input services. The method comprises the following steps:

A service acquisition step: the Web server obtains, through an interface, the services to be checked for redundancy.

A rule-based WSDL parsing and rule vectorization step: following the WSDL syntax rules, the WSDL of each service is parsed as XML, and the input and output variables of each service and the corresponding logical expression rules are extracted. Each service is described using the rule-based WSDL syntax; the Web service uses the extended WSDL service element, which adds a rule description of the service function.

A similarity calculation step: the rules of each service are vectorized, one service is compared with each of the remaining services on the basis of their logical expressions, and several rule-based service similarity values are obtained.

A redundancy decision step: each similarity value obtained is compared with a preset threshold. If the similarity is greater than the threshold, the two services are judged redundant; otherwise the services are judged dissimilar.
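The decision step can be sketched in Python as a threshold test over service pairs. This is a minimal illustration under stated assumptions: the function name find_redundant_pairs is hypothetical, all pairs are compared (a generalization of comparing one service with the rest), and the toy jaccard function is only a stand-in for the similarity measure defined later in the text.

```python
from itertools import combinations


def find_redundant_pairs(services, similarity, threshold):
    """Return name pairs whose similarity is strictly greater than threshold."""
    redundant = []
    for (name_a, a), (name_b, b) in combinations(services.items(), 2):
        if similarity(a, b) > threshold:
            redundant.append((name_a, name_b))
    return redundant


def jaccard(a, b):
    """Toy stand-in similarity on sets (an assumption for this sketch)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical services, each reduced to a set of rule strings.
services = {
    "A": {"brightness > 300"},
    "B": {"brightness > 300"},
    "C": {"temperature < 18"},
}
pairs = find_redundant_pairs(services, jaccard, threshold=0.8)
print(pairs)
```

The threshold value 0.8 is arbitrary here; the text only states that a preset threshold is used.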

In the above technical solution, the extended WSDL service element contains the following tags: policy, condition, element, relation, bracket, operand, loperator, and roperator.

The policy element marks the service rule.

A condition child element is defined within the policy element; it marks the preconditions that the input must satisfy for the service to be used.

The preconditions consist of several element tags. Each element is a simple relational expression, and these expressions can be combined into complex logical expressions through relation and bracket. Within each element, operand defines the operands of the expression, loperator defines the relational operator, and roperator defines the basic operation.

The relational operators include >, <, >=, <=, ==, and !=; the basic operators include +, -, *, /, and %.
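A possible reading of this tag set is sketched below in Python. The exact XML layout is an assumption: the patent's own example (Figure 7) is not reproduced in the text, so the nesting, the use of type and value attributes on operand, and the sample rule are illustrative guesses built only from the tag names listed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy fragment: "brightness > 300 && brightness < 500".
# The layout is assumed for illustration and is not the patent's Figure 7.
POLICY_XML = """<policy>
  <condition>
    <element>
      <operand type="xs:int" value="brightness"/>
      <loperator>&gt;</loperator>
      <operand type="xs:int" value="300"/>
    </element>
    <relation>&amp;&amp;</relation>
    <element>
      <operand type="xs:int" value="brightness"/>
      <loperator>&lt;</loperator>
      <operand type="xs:int" value="500"/>
    </element>
  </condition>
</policy>"""


def parse_policy(policy_text):
    """Extract relational expressions and the logical operators joining them."""
    policy = ET.fromstring(policy_text)
    condition = policy.find("condition")
    expressions, relations = [], []
    for node in condition:
        if node.tag == "element":
            operands = [op.get("value") for op in node.findall("operand")]
            operator = node.findtext("loperator")
            expressions.append((operands[0], operator, operands[1]))
        elif node.tag == "relation":
            relations.append(node.text)
    return expressions, relations


expressions, relations = parse_policy(POLICY_XML)
print(expressions, relations)
```

The sketch ignores bracket and roperator for brevity; a fuller parser would need them to handle grouped and arithmetic expressions.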

In the above technical solution, the rules are vectorized as follows:

Step 1: locate the service element in the WSDL, parse the policy tag as XML, extract the set of relational expressions in the rule according to the logical operators, and form a binary tree data structure for the logical expression.
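One way to picture the binary tree of Step 1 is the Python sketch below, in which internal nodes hold logical operators and leaves hold relational expressions. It is a simplification under stated assumptions: the Node class and helper names are hypothetical, and the tree is built left-associatively without operator precedence or bracket handling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: str                      # logical operator or relational expression
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(expressions, relations):
    """Combine expressions left to right; precedence and brackets are ignored."""
    tree = Node(expressions[0])
    for op, expr in zip(relations, expressions[1:]):
        tree = Node(op, left=tree, right=Node(expr))
    return tree


def leaves(node):
    """Collect the relational expressions stored at the leaves, left to right."""
    if node.left is None and node.right is None:
        return [node.value]
    result = []
    if node.left is not None:
        result += leaves(node.left)
    if node.right is not None:
        result += leaves(node.right)
    return result


tree = build_tree(["a > b", "c < 5", "d == 1"], ["&&", "||"])
print(tree.value, leaves(tree))
```

Traversing the leaves recovers the set of relational expressions that Step 2 then vectorizes one by one.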

Step 2: through transposition (moving terms across the relational operator), vectorize each relational expression into the standard multidimensional vector $t_i$ of the following form:

$$t = (s_1 v_{11} q_{11} v_{12} q_{12} v_{13} \dots v_{1n}, \dots, s_i v_{i1} q_{i1} v_{i2} q_{i2} v_{i3} \dots v_{in}, \dots, s_n v_{n1} q_{n1} v_{n2} q_{n2} v_{n3} \dots v_{nn}, p, c)$$

where $V$ is the set of variables, $C$ the set of constants, $P$ the set of relational operators, and $Q$ the set of basic operators; the transposition sign $s_i \in \{+, -\}$, $v_{ij} \in V$, $q_{ij} \in Q$, $p \in P$, and $c \in C$. The logical operators include &&, !, ||, and the operator given in the following figure:

Figure GDA0000108962380000041
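The transposition idea can be sketched in Python for a restricted case. This is an assumption-laden simplification, not the patent's full vector: it handles only sums and differences of single variables and numeric constants (no *, /, or % inside a term), moves every variable to the left with its sign and every constant to the right, and returns the signed variables followed by the relational operator p and the constant c. The function names are hypothetical.

```python
import re

REL_OPS = (">=", "<=", "==", "!=", ">", "<")  # two-character operators first


def _signed_terms(side):
    """Split 'a - b + 3' into [('+', 'a'), ('-', 'b'), ('+', '3')]."""
    tokens = re.findall(r"[+-]|[A-Za-z_]\w*|\d+(?:\.\d+)?", side)
    terms, sign = [], "+"
    for tok in tokens:
        if tok in ("+", "-"):
            sign = tok
        else:
            terms.append((sign, tok))
            sign = "+"
    return terms


def vectorize(expression):
    """Normalize 'lhs p rhs' to (signed variables..., p, c) by transposition."""
    op = next(o for o in REL_OPS if o in expression)
    left, right = expression.split(op, 1)
    flip = {"+": "-", "-": "+"}
    variables, constant = [], 0.0
    for sign, tok in _signed_terms(left):
        if tok[0].isdigit():
            # A constant moved from left to right changes sign.
            constant += -float(tok) if sign == "+" else float(tok)
        else:
            variables.append((sign, tok))
    for sign, tok in _signed_terms(right):
        if tok[0].isdigit():
            constant += float(tok) if sign == "+" else -float(tok)
        else:
            # A variable moved from right to left changes sign.
            variables.append((flip[sign], tok))
    return tuple(sorted(variables, key=lambda t: t[1])) + (op, constant)


print(vectorize("a > b"))
print(vectorize("brightness > 300"))
```

With this normal form, syntactically different but equivalent expressions such as a > b and a - b > 0 map to the same vector, which is what makes rule comparison meaningful.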

The rule-based service similarity is calculated using the Dice coefficient with binary weights. The similarity formula is:

$$\mathrm{Sim}(W_1, W_2) = 0.5\,\mathrm{SimIO}(W_1, W_2) + 0.5\,\mathrm{SimRule}(W_1, W_2)$$

where $W_1$ and $W_2$ are the two services whose similarity is to be calculated. Service similarity comprises the service IO similarity and the service rule similarity, denoted $\mathrm{SimIO}(W_1, W_2)$ and $\mathrm{SimRule}(W_1, W_2)$ respectively. Service IO and service rules are considered equally important in the similarity calculation, so each is given a weight of 0.5.
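A minimal Python sketch of this weighted sum follows. It assumes that both SimIO and SimRule are computed as a Dice coefficient over sets (binary weights meaning an item is either present or absent); the text states Dice for the rule similarity, and applying it to IO as well is an assumption of this sketch. The function names and the sample data are hypothetical.

```python
def dice(set_a, set_b):
    """Dice coefficient with binary weights: 2|A ∩ B| / (|A| + |B|)."""
    if not set_a and not set_b:
        return 1.0  # convention chosen here for two empty sets
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


def service_similarity(io_a, io_b, rules_a, rules_b):
    """Sim = 0.5 * SimIO + 0.5 * SimRule, with both parts computed by Dice."""
    return 0.5 * dice(io_a, io_b) + 0.5 * dice(rules_a, rules_b)


# The two lighting services: identical IO, different rules.
io = {"brightness", "light_off"}
rules_1 = {("brightness", ">", 300)}
rules_2 = {("brightness", "<", 500)}

print(service_similarity(io, io, rules_1, rules_2))
print(service_similarity(io, io, rules_1, rules_1))
```

Under these assumptions the two lighting services score 0.5 rather than 1.0, so a threshold above 0.5 would no longer flag them as redundant.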

Based on the above method, the invention also provides a redundancy detection system for Web services oriented to IoT applications, used to calculate the similarity between several input services. The system comprises a service acquisition module, a similarity calculation module, a redundancy decision module, and a post-processing module, and is characterized in that it further comprises a rule-based WSDL parsing module and a rule vectorization module.

The rule-based WSDL parsing module, following the WSDL syntax rules, parses the WSDL of each service as XML and extracts the input and output variables of each service and the corresponding logical expression rules.

The rule vectorization module vectorizes the rules of each service.

The similarity calculation module uses the rule-based service similarity calculation method.

The redundancy decision module decides whether services are similar on the basis of the similarity values obtained by the rule-based service similarity calculation method.

Each service is described using the rule-based WSDL syntax; the Web service representation extends the service element to add a rule description of the service function, and the WSDL syntax extension is based on logical expression rules; the similarity calculation is a similarity calculation based on logical expressions.

The extended WSDL service element contains the following tags: policy, condition, element, relation, bracket, operand, loperator, and roperator. The policy element marks the service rule. A condition child element is defined within the policy element; it marks the preconditions that the input must satisfy for the service to be used. The preconditions consist of several element tags; each element is a simple relational expression, and these expressions can be combined into complex logical expressions through relation and bracket. Within each element, operand defines the operands of the expression, loperator defines the relational operator, and roperator defines the basic operation.

The relational operators include >, <, >=, <=, ==, and !=; the basic operators include +, -, *, /, and %.

The rule vectorization module further comprises: a parsing and binary-tree-building submodule, which locates the service element in the WSDL, parses the policy tag as XML, extracts the set of relational expressions in the rule according to the logical operators, and forms a binary tree data structure for the logical expression; and a vectorization submodule, which vectorizes each relational expression, through transposition, into the standard multidimensional vector $t_i$ of the following form:

$$t = (s_1 v_{11} q_{11} v_{12} q_{12} v_{13} \dots v_{1n}, \dots, s_i v_{i1} q_{i1} v_{i2} q_{i2} v_{i3} \dots v_{in}, \dots, s_n v_{n1} q_{n1} v_{n2} q_{n2} v_{n3} \dots v_{nn}, p, c)$$

where $V$ is the set of variables, $C$ the set of constants, $P$ the set of relational operators, and $Q$ the set of basic operators; the transposition sign $s_i \in \{+, -\}$, $v_{ij} \in V$, $q_{ij} \in Q$, $p \in P$, and $c \in C$.

The rule-based service similarity is calculated using the Dice coefficient with binary weights.

The invention relates to a redundancy detection method based on Web service representations oriented to IoT applications. The method extends the WSDL syntax with rules based on logical expressions and further provides a corresponding service redundancy detection method comprising: 1) rule vectorization; 2) service similarity calculation; 3) service redundancy detection.

Compared with the prior art, the invention has the following advantages:

1. For the first time, the WSDL syntax is extended with rules based on logical expressions, which adds a rule description of service functions and supports finer-grained service similarity calculation;

2. For the first time, logical expression rules are introduced into service similarity calculation, and on that basis a new service redundancy detection method is given, which improves the ability to detect and identify redundant services in the IoT and thereby greatly reduces the storage and search burden of IoT services.

Description of the drawings

Figure 1 shows the mapping between the WSDL 2.0 syntax elements of the prior art and service input, output, and service function;

Figure 2 shows the extension of the service element according to the invention;

Figure 3-a1 is a workflow diagram of the rule-based WSDL representation of the invention;

Figure 3-a2 is a workflow diagram of the rule-based WSDL parsing module of the invention;

Figure 3-b is a workflow diagram of the rule vectorization module of the invention;

Figure 3-c is a workflow diagram of the similarity calculation module of the invention;

Figure 4 is a schematic diagram of the binary tree of a logical expression in a rule of the invention;

Figure 5 is a flowchart of the service redundancy detection method of the invention;

Figure 6 is a schematic diagram of the WSDL 2.0 syntax framework of the prior art;

Figure 7 is a schematic diagram of a description example of the invention.

Detailed description

The invention is described in further detail below with reference to the drawings and specific embodiments. Figure 2 shows the invention's improvement to the service element; the gray boxes are the newly added parts. The improvement and extension proceed from outside to inside and from coarse to fine. First, a policy element is defined at the same level as the endpoint element to mark the whole service rule. Next, a condition child element is defined inside the policy element to mark the preconditions that the input must satisfy for the service to be used. The preconditions consist of several element items; each element is a simple relational expression (such as a>b), and complex logical expressions can be formed through relation and bracket. Within each element, operand defines the operands of the expression, loperator defines the relational operator (such as >, <, >=, <=, ==, !=), and roperator defines the basic operation (such as +, -, *, /, %). Note that, for simplicity, these operations are currently limited to basic numeric operations, and character operations are not considered. This does not limit the method's ability to be extended to character operations, however, because any character operation can ultimately be converted into numeric operations on ASCII codes.

After the improvement, eight custom tags are added: policy, condition, element, relation, bracket, operand, loperator, and roperator. Two existing WSDL attribute keywords are referenced inside the new tags: type and value. Two dependency constraints are introduced: the type attribute of operand depends on the types defined in the namespace, and the value attribute depends on the parameters defined in input. Corresponding value sets are defined for the loperator, roperator, and relation elements, as shown in Table 1.

Table 1: Value sets of loperator, roperator, and relation

Figure GDA0000108962380000071

The data types defined in a WSDL document follow the XML Schema standard established by the W3C, so the data type definitions of the loperator, roperator, and relation elements must also conform to XML Schema. Data types in XML Schema are divided into root data types, built-in data types, and user-derived data types. User-derived data types let users create their own data types with the restriction, list, or union keywords: restriction creates a user-derived data type by limiting an existing data type with the permitted constraining facets, listed in Table 2; list creates a list data type; union creates a new data type as the union of one or more data types. It follows that the restriction keyword can be used to define, for the loperator, roperator, and relation elements, custom data types derived from the string data type, with enumeration constraining the acceptable values of each derived type to those in Table 1. The data types the invention defines for the loperator, roperator, and relation elements are loperatorType, roperatorType, and relationType respectively; a description example is shown in Figure 7.

Table 2: Constraint facets allowed by the restriction keyword

No.  Facet           Meaning
1    enumeration     Defines a list of acceptable values
2    fractionDigits  Defines the maximum number of decimal places allowed
3    length          Defines the exact number of characters or list items allowed
4    maxExclusive    Defines the upper bound of a numeric value; allowed values must be less than this value
5    maxInclusive    Defines the upper bound of a numeric value; allowed values must be less than or equal to this value
6    maxLength       Defines the maximum number of characters or list items allowed
7    minExclusive    Defines the lower bound of a numeric value; allowed values must be greater than this value
8    minInclusive    Defines the lower bound of a numeric value; allowed values must be greater than or equal to this value
9    minLength       Defines the minimum number of characters or list items allowed
10   pattern         Defines the exact sequence of acceptable characters
11   totalDigits     Defines the exact number of digits allowed
12   whiteSpace      Defines how whitespace characters (line feed, carriage return, space and tab) are handled

Figures 3-a and 3-b show the workflow of the rule-based WSDL representation and parsing module. The core modules of the invention are the rule-based WSDL representation and parsing module, the rule vectorization module and the similarity calculation module.

The rule-based WSDL representation and parsing module provides two functions, representation and parsing. The representation function follows the WSDL language specification: the policy tag marks the start of a rule description, and the condition tag indicates that the rule describes the input conditions the service must satisfy. These input conditions are expressed with relational expressions and logical operators; each pair of element tags describes one relational expression, and relation type describes the relationship between expressions. The parsing function follows the XML Schema syntax: it locates the policy, condition and element tags in turn and extracts the relational expressions that describe the service rule. It also extracts the logical relationships between those expressions, marked with relation type, in preparation for rule vectorization.
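A minimal sketch of the parsing function, assuming a concrete layout for the extended tags. The tag and attribute names (policy, condition, element, relation with a type attribute, operand with type and value attributes, loperator) come from the text, but their exact nesting is an assumption because the description example of Fig. 7 is not available here. The function name `parse_policy` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule fragment for the condition (T > 300) && (T < 500).
POLICY = """
<policy>
  <condition>
    <element>
      <operand type="xs:int" value="T"/>
      <loperator>&gt;</loperator>
      <operand type="xs:int" value="300"/>
    </element>
    <relation type="&amp;&amp;"/>
    <element>
      <operand type="xs:int" value="T"/>
      <loperator>&lt;</loperator>
      <operand type="xs:int" value="500"/>
    </element>
  </condition>
</policy>
"""


def parse_policy(xml_text):
    """Return (relational expressions as token lists, logical relations)."""
    root = ET.fromstring(xml_text)          # the policy element
    condition = root.find("condition")      # preconditions on the input
    expressions, relations = [], []
    for child in condition:
        if child.tag == "element":
            tokens = []
            for part in child:
                if part.tag == "operand":
                    tokens.append(part.get("value"))
                elif part.tag in ("loperator", "roperator"):
                    tokens.append(part.text.strip())
            expressions.append(tokens)
        elif child.tag == "relation":
            relations.append(child.get("type"))
    return expressions, relations


expressions, relations = parse_policy(POLICY)
print(expressions)
print(relations)
```

Keeping each relational expression as a token list, separate from the logical relations, mirrors the split the text makes between relational expressions and the relationships among them.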

The rule vectorization module represents the relational expressions obtained from WSDL parsing, together with the logical relationships between them, as the binary tree data structure shown in Fig. 4, and vectorizes each relational expression into the standard vector of formula (1).

The similarity calculation module takes both the service IO similarity and the rule similarity into account and gives them equal weight in the overall service similarity. The service IO similarity uses the Dice coefficient with binary weights. The novel part is the rule similarity, which applies the Levenshtein method to compute the similarity between relational expression vectors after their constants have been removed.

1) Rule vectorization

A logical expression in a rule can be formally defined as a pair <E, F>, where E is the set of relational expressions in the logical expression and F is the set of relations between those relational expressions. There are four kinds of logical operation in total: &&, !, || and ⊕. A relational expression T is built from the elements of four sets: the variable set V = {v1, v2, …, vn}, the constant set C, the relational operator set P = {>, <, >=, <=, ==, !=} and the basic operator set Q = {+, -, *, /, %}.

A relational expression is split at its relational operator, and transposition moves the terms containing variables to one side of the operator and the constant terms to the other. By convention, the direction of transposition is chosen so that, afterwards, the positive terms outnumber the negative terms. The relational expression can then be written as a standard multidimensional vector t:

$$t = (s_1 v_{11} q_{11} v_{12} q_{12} v_{13} \cdots v_{1n},\ \ldots,\ s_i v_{i1} q_{i1} v_{i2} q_{i2} v_{i3} \cdots v_{in},\ \ldots,\ s_n v_{n1} q_{n1} v_{n2} q_{n2} v_{n3} \cdots v_{nn},\ p,\ c) \tag{1}$$

where the transposition sign s_i ∈ {+, -}, v_ij ∈ V, q_ij ∈ Q, p ∈ P and c ∈ C.

Given the analysis and definitions above, rule vectorization takes two steps, as shown in Fig. 3-b:

First, locate the service element in the WSDL, parse its policy sub-element with an XML parser, and use the logical operators (given as an inline formula image in the original, Figure GDA0000108962380000091) to extract the set of relational expressions E = {e1, e2, …, en} from the rule, forming the binary tree data structure Tr of the logical expression shown in Fig. 4.

Second, vectorize each relational expression ei, by transposition, into the standard multidimensional vector ti of formula (1).
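The transposition step can be sketched as below, under several assumptions that the text does not spell out: each side of the expression is supplied as a list of signed terms, a term is treated as a constant when it parses as a number, variable terms are collected on the left and constants on the right, and the sign convention is enforced by multiplying both sides by -1 (which reverses the inequality). The case where positive and negative terms are equally many is left unchanged. The function name `vectorize` is hypothetical.

```python
# Relational operator obtained when both sides are multiplied by -1.
FLIP = {">": "<", "<": ">", ">=": "<=", "<=": ">=", "==": "==", "!=": "!="}


def _is_constant(body):
    """A term counts as a constant if it parses as a number (assumption)."""
    try:
        float(body)
        return True
    except ValueError:
        return False


def vectorize(lhs, op, rhs):
    """Turn 'lhs op rhs' into (signed variable terms..., p, c).

    lhs and rhs are lists of (sign, body) with sign +1 or -1; body is a
    variable term such as "a*b" or a numeric constant such as "10".
    """
    var_terms = []   # signed variable terms, collected on the left
    constant = 0.0   # constant value, collected on the right
    for sign, body in lhs:
        if _is_constant(body):
            constant -= sign * float(body)   # moved left -> right
        else:
            var_terms.append((sign, body))
    for sign, body in rhs:
        if _is_constant(body):
            constant += sign * float(body)
        else:
            var_terms.append((-sign, body))  # moved right -> left
    positives = sum(1 for s, _ in var_terms if s > 0)
    negatives = len(var_terms) - positives
    if negatives > positives:
        # Multiply both sides by -1 so that positive terms dominate.
        var_terms = [(-s, b) for s, b in var_terms]
        constant = -constant
        op = FLIP[op]
    terms = tuple(("+" if s > 0 else "-") + b for s, b in var_terms)
    return terms + (op, constant)


# 5 - x < y  is rewritten as  x + y > 5
print(vectorize([(1, "5"), (-1, "x")], "<", [(1, "y")]))
```

The last two entries of the returned tuple play the roles of p and c in formula (1); the preceding entries are the signed variable terms.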

Rule vectorization thus combines two things: vectorizing the relational expressions, and arranging the logical expression as a binary tree whose leaves are those relational expressions. A logical expression could instead be turned directly into a binary tree over finer-grained operands and operators. The coarser design, which builds the tree from relational expressions and logical operators and then vectorizes each relational expression, is chosen for two reasons. First, the analysis of a service rule should focus on the logical relationships at the level of relational expressions rather than on individual input variables. Second, besides vectorizing the relational expressions, the logical operations between them must also be captured; linking them through a binary tree describes the overall structure of the logical expression clearly. In short, rule vectorization both exposes the internal structure of each relational expression and captures the logical operations between relational expressions as a whole, which has an important effect on the calculation of service similarity.

2) Service similarity calculation, as shown in Fig. 3-c.

A rule-based description of service functionality consists of two parts: the service IO and the rules that connect the IO. The IO can be characterized by the sets of input and output variables, I = {v1, v2, …} and O = {o1, o2, …}. A rule can be represented by the binary tree Tr of its logical expression, in which each relational expression node e can be further vectorized into the standard multidimensional vector t of formula (1), and the logical operator nodes are given by an inline formula image in the original (Figure GDA0000108962380000092). Because not all input variables take part in the service rules, the service similarity must take the service IO similarity into account in addition to its main focus, the rule similarity.

There are many ways to calculate service IO similarity; the invention simply adopts one of the simpler comparison methods, the Dice coefficient with binary weights. Let the two Web services be W1 and W2; the IO similarity is then calculated as follows:

$$\mathrm{Sim}_{IO}(W_1, W_2) = \frac{2\,(B_I + B_O)}{|I_1| + |O_1| + |I_2| + |O_2|} \tag{2}$$

where B_I is the number of input variables common to the input variable sets I1 and I2 of W1 and W2, B_O is the number of output variables common to their output variable sets O1 and O2, and |·| denotes the number of elements in a set.
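Formula (2) can be sketched directly with set operations. The sketch assumes that two variables are "the same" when their names are equal, which the text does not define, and it returns 0.0 when both services have no IO at all, a case formula (2) leaves undefined. The function name `sim_io` and the example variable names are hypothetical.

```python
def sim_io(inputs1, outputs1, inputs2, outputs2):
    """Dice coefficient over the input and output variable sets (formula 2)."""
    i1, o1, i2, o2 = map(set, (inputs1, outputs1, inputs2, outputs2))
    denominator = len(i1) + len(o1) + len(i2) + len(o2)
    if denominator == 0:
        return 0.0  # undefined in formula (2); 0.0 is an assumption
    shared = len(i1 & i2) + len(o1 & o2)   # B_I + B_O
    return 2 * shared / denominator


# W1: inputs {T, H}, output {alarm}; W2: input {T}, output {alarm}
# B_I = 1, B_O = 1, so the result is 2 * 2 / 5
print(sim_io({"T", "H"}, {"alarm"}, {"T"}, {"alarm"}))
```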

The method for calculating service rule similarity is proposed for the first time by the invention. The rules of the two Web services W1 and W2 can be represented by the binary trees Tr1 and Tr2 respectively. Both trees are traversed in the same pre-order manner, yielding two pre-order traversal sets STr1 = {e1, f1, e2, …} and STr2 = {e'1, f'1, e'2, …}, where e denotes a relational expression node and f a logical operator node.
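A minimal sketch of the rule tree and its traversal, assuming that logical operators are internal nodes and relational expressions are leaves. With that arrangement a pre-order walk visits the operator before its operands; the ordering e1, f1, e2 given in the text suggests that the tree of Fig. 4 may be laid out differently, and Fig. 4 is not available here. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                      # relational expression or logical operator
    is_operator: bool = False
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder(node):
    """Visit the node itself, then its left subtree, then its right subtree."""
    if node is None:
        return []
    return [node] + preorder(node.left) + preorder(node.right)


# Rule: (T > 300) && (T < 500)
tree = Node("&&", True, Node("T > 300"), Node("T < 500"))
print([n.label for n in preorder(tree)])
```

Whatever the layout, both trees must be walked with the same traversal so that node positions in the two resulting sequences are comparable.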

First, the similarity of relational expression nodes e is calculated. Each e can be represented by the standard multidimensional vector t of formula (1). For the relational expressions T>300 and T<500, the constants 300 and 500 are considered to have a negligible effect on similarity, whereas T> and T< have an important effect. The constant is therefore removed from t, giving tw = (s1 v11 q11 v12 q12 v13 … v1n, …, si vi1 qi1 vi2 qi2 vi3 … vin, …, sn vn1 qn1 vn2 qn2 vn3 … vnn, p) = (tw1, tw2, …, twn). An adapted Levenshtein edit distance is then used to calculate the similarity of two relational expression node vectors (tw1, tw2, …, twn) and (tw'1, tw'2, …, tw'n). The Levenshtein method measures the similarity of two strings by counting the transformation operations (character insertion, deletion and replacement) needed to turn one string into the other. With a small adaptation the same method applies to relational expression vectors made up of strings and relational operators; the difference is that it counts the operations needed to transform one relational expression vector into another. Let the transformation from ei to ej be xform(ei, ej), where the transformation consists of insert, delete and replace operations on the components tw, and the cost function c of the three operations satisfies c(delete)+c(insert)<=c(replace). The similarity of the relational expression vectors ei and ej is then defined as:

$$\mathrm{Sim}_{Leven}(e_i, e_j) = 1 - \frac{\mathrm{xform}(e_i, e_j)}{\max(|e_i|, |e_j|)} \tag{3}$$
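Formula (3) can be sketched with the standard dynamic-programming edit distance applied to vector components instead of characters. The operation costs are parameters; the defaults are the unit costs of the classical Levenshtein distance, which is an assumption of this sketch rather than the cost function the text stipulates, so other costs can be passed in to match a chosen cost relation. The function names follow the formula but are otherwise hypothetical.

```python
def xform(a, b, c_insert=1, c_delete=1, c_replace=1):
    """Minimum total cost of turning sequence a into sequence b."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * c_delete          # delete every component of a
    for j in range(1, cols):
        dist[0][j] = j * c_insert          # insert every component of b
    for i in range(1, rows):
        for j in range(1, cols):
            same = a[i - 1] == b[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + c_delete,
                dist[i][j - 1] + c_insert,
                dist[i - 1][j - 1] + (0 if same else c_replace),
            )
    return dist[-1][-1]


def sim_leven(a, b, **costs):
    """Similarity of two constant-free expression vectors (formula 3)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty vectors are treated as identical (assumption)
    return 1 - xform(a, b, **costs) / longest


# T > 300 and T < 500 after removing the constants: one replacement out of two
print(sim_leven(("+T", ">"), ("+T", "<")))
```

With unit costs the result stays within [0, 1]; with other cost settings the quotient in formula (3) can exceed 1, so the range of the similarity depends on the costs chosen.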

The similarity between the binary trees Tr1 and Tr2 of the logical expressions in the two service rules is then calculated as follows:

$$\mathrm{Sim}_{Rule}(W_1, W_2) = \frac{2\,|STr_{sf}| \cdot \max\{\mathrm{Sim}_{Leven}(e_i, e_j) : e_i \in STr_1,\ e_j \in STr_2\}}{|STr_{1f}| + |STr_{2f}|} \tag{4}$$

where STr1 and STr2 are the pre-order traversal sets of Tr1 and Tr2; STr1f and STr2f are the subsets of logical operators in those traversal sets; STrsf is the number of positions at which the two traversal sets hold the same logical operator f; and the second factor of the numerator is the maximum similarity between relational expression nodes of the two traversal sets.

The original text adds that STr denotes the pre-order traversal set of Tr and that d is the number of positions at which the logical operators f in the two traversal sets differ.
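A sketch of formula (4) under stated assumptions: each pre-order traversal is given as a list of ("f", operator) and ("e", constant-free vector) items; "same position" is read as the same index in the two traversal lists; expression similarity uses formula (3) with unit edit costs; and 0.0 is returned when neither rule contains a logical operator, a case in which formula (4) is undefined. All names are hypothetical.

```python
def sim_leven(a, b):
    """Formula (3) with unit edit costs (an assumption of this sketch)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - prev[-1] / longest


def sim_rule(str1, str2):
    """Rule similarity of two pre-order traversal lists (formula 4)."""
    ops1 = [v for kind, v in str1 if kind == "f"]
    ops2 = [v for kind, v in str2 if kind == "f"]
    denominator = len(ops1) + len(ops2)
    if denominator == 0:
        return 0.0  # formula (4) is undefined here; 0.0 is an assumption
    # Positions where both traversals hold the same logical operator.
    same_ops = sum(
        1
        for (k1, v1), (k2, v2) in zip(str1, str2)
        if k1 == "f" and k2 == "f" and v1 == v2
    )
    exprs1 = [v for kind, v in str1 if kind == "e"]
    exprs2 = [v for kind, v in str2 if kind == "e"]
    # Maximum similarity over all pairs of relational expression nodes.
    best = max((sim_leven(a, b) for a in exprs1 for b in exprs2), default=0.0)
    return 2 * same_ops * best / denominator


rule1 = [("f", "&&"), ("e", ("+T", ">")), ("e", ("+T", "<"))]
rule2 = [("f", "&&"), ("e", ("+H", ">")), ("e", ("+H", "<"))]
print(sim_rule(rule1, rule2))
```

Because the numerator takes the maximum over expression pairs, a single closely matching pair of relational expressions dominates the score; the operator term then scales it by how well the logical structure agrees.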

Finally, the overall similarity is:

$$\mathrm{Sim}(W_1, W_2) = 0.5\,\mathrm{Sim}_{IO}(W_1, W_2) + 0.5\,\mathrm{Sim}_{Rule}(W_1, W_2) \tag{5}$$

3) Service redundancy detection

Fig. 5 shows the flow of the service redundancy detection method. The functional modules inside the dashed box (rule-based WSDL representation and parsing, rule vectorization and similarity calculation) are the core parts that distinguish the invention from other methods. First, "service acquisition" obtains, through the Web service API, the two services to be checked for redundancy. Both services are described using the rule-based WSDL grammar proposed by the invention. Next, following the grammar rules, the service WSDL is parsed as XML to extract the service's I/O variables and the corresponding logical expression rules. The rules of each service are then vectorized and the similarity is calculated. If the similarity is greater than a predetermined threshold, the two services are judged to be similar and therefore redundant, and the redundant services are marked accordingly.
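The final decision step can be sketched as follows: formula (5) combines the two partial similarities with equal weights, and every pair of services whose total exceeds the threshold is flagged. The threshold value, the service names and the precomputed scores are all made up for illustration; the text only says that the threshold is predetermined.

```python
from itertools import combinations


def total_similarity(sim_io_value, sim_rule_value):
    """Formula (5): equal weights for IO similarity and rule similarity."""
    return 0.5 * sim_io_value + 0.5 * sim_rule_value


def find_redundant(names, similarity, threshold):
    """Return the pairs of services whose similarity exceeds the threshold."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if similarity(a, b) > threshold
    ]


# Hypothetical precomputed (SimIO, SimRule) values for three services.
PAIR_SCORES = {
    ("A", "B"): (0.8, 1.0),
    ("A", "C"): (0.2, 0.0),
    ("B", "C"): (0.4, 0.5),
}


def similarity(a, b):
    return total_similarity(*PAIR_SCORES[(a, b)])


# Only the pair (A, B), with a total of 0.9, exceeds the threshold 0.7.
print(find_redundant(["A", "B", "C"], similarity, threshold=0.7))
```

Passing the similarity function in as a parameter keeps the decision step independent of how the IO and rule similarities are computed.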

Service redundancy detection is a necessary operation for users, service providers and UDDI administrators alike. For a user, when redundant services are found, "post-processing" means choosing among the requested services. For a service provider, if a newly published service turns out to be redundant with one it published earlier, the new service needs to be deleted or the old one modified accordingly. For a UDDI administrator, if several services are found to be redundant, the services may need to be further classified or cleaned up. Note that redundancy is also an important means of safe backup and failure recovery, but that use is outside the scope of the invention.

Finally, it should be noted that the above embodiments only illustrate the technical solutions of the invention and do not limit them. Although the invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements of the technical solutions that do not depart from their spirit and scope are all covered by the claims of the invention.

Claims (11)

Translated from Chinese
1. A redundancy detection method for Web services oriented to Internet of Things applications, which performs redundancy detection by calculating the similarity between several input services, the method comprising the following steps:
a service acquisition step, in which a Web server obtains, through an interface, the several services to be checked for redundancy;
a rule-based WSDL parsing and rule vectorization step, in which, according to the WSDL grammar rules, the WSDL of each service is parsed as XML and the input and output variables of each service and the corresponding logical expression rules are extracted; each service is described using a rule-based WSDL grammar; the Web service uses extended WSDL service elements, thereby adding a rule description of the service function; WSDL is the abbreviation of Web Services Description Language;
a similarity calculation step, in which the rules of each service are vectorized, one of the services is compared with each of the remaining services one by one on the basis of logical expressions, and several similarity values based on service rules are obtained;
a redundancy decision step, in which every similarity value obtained is compared with a set threshold; if the similarity is greater than the threshold, the two services are determined to be redundant; otherwise the services are not similar;
wherein the extended WSDL service elements comprise the following tags: policy, condition, element, relation, bracket, operand, loperator and roperator;
the policy element is used to mark a service rule;
a condition sub-element is defined in the policy element, the condition sub-element marking the preconditions that the input used by the service must satisfy;
wherein the preconditions consist of several element tags, each element being a simple relational expression, and the expressions form a logical expression through relation and bracket; in each element, operand defines the operands of the expression, loperator defines the relational operator, and roperator defines the basic operation.

2. The redundancy detection method for Web services oriented to Internet of Things applications according to claim 1, characterized in that the relational operators comprise >, <, >=, <=, == and !=, and the basic operators comprise +, -, *, / and %.

3. The redundancy detection method based on Web services oriented to Internet of Things applications according to claim 1, characterized in that the rules are vectorized by the following steps:
step 1, locating the service element in the WSDL, parsing the policy tag as XML, and extracting the set of relational expressions in the rule according to the logical operators, forming the binary tree data structure of the logical expression;
step 2, vectorizing each relational expression, by transposition, into the standard multidimensional vector ti of the following formula:
t=(s1v11q11v12q12v13…v1n,…,si vi1qi1vi2qi2vi3…vin,…,sn vn1qn1vn2qn2vn3…vnn,p,c)
where V is the set of variables, C is the set of constants, P is the set of relational operators and Q is the set of basic operators; the transposition sign si∈{+,-}, vij∈V, qij∈Q, p∈P, c∈C.

4. The redundancy detection method based on Web services oriented to Internet of Things applications according to claim 3, characterized in that the logical operators comprise &&, !, || and ⊕.

5. The redundancy detection method based on Web services oriented to Internet of Things applications according to claim 1, characterized in that the similarity calculation based on service rules uses the Dice coefficient binary weight method.

6. The redundancy detection method based on Web services oriented to Internet of Things applications according to claim 5, characterized in that the similarity is calculated as:
Sim(W1,W2)=0.5SimIO(W1,W2)+0.5SimRule(W1,W2);
where W1 and W2 are the two services whose similarity is to be calculated; the service similarity comprises the service IO similarity and the service rule similarity, denoted SimIO(W1,W2) and SimRule(W1,W2) respectively; and 0.5 is the weight given to the service IO and to the service rules in the service similarity calculation.

7. A redundancy detection system based on Web services oriented to Internet of Things applications, used to calculate the similarity between several input services, the system comprising a service acquisition module, a similarity calculation module, a redundancy decision module and a post-processing module, characterized in that the system further comprises a rule-based WSDL parsing module and a rule vectorization module;
the rule-based WSDL parsing module, according to the WSDL grammar rules, parses the WSDL of each service as XML and extracts the input and output variables of each service and the corresponding logical expression rules;
the rule vectorization module is used to vectorize the rules of each service;
wherein the similarity calculation module uses a similarity calculation method based on service rules;
the redundancy decision module decides on the similarity of the services from the similarity value obtained by the similarity calculation method based on service rules;
each service is described using a rule-based WSDL grammar; the Web service uses extended WSDL service elements, which add a rule description of the service function; the similarity calculation is a similarity calculation based on logical expressions;
wherein the extended WSDL service elements comprise the following tags: policy, condition, element, relation, bracket, operand, loperator and roperator;
the policy element is used to mark a service rule;
a condition sub-element is defined in the policy element, the condition sub-element marking the preconditions that the input used by the service must satisfy;
wherein the preconditions consist of several element tags, each element being a simple relational expression, and the expressions can form a logical expression through relation and bracket; in each element, operand defines the operands of the expression, loperator defines the relational operator, and roperator defines the basic operation.

8. The redundancy detection method based on Web services oriented to Internet of Things applications according to claim 7, characterized in that the relational operators comprise >, <, >=, <=, == and !=, and the basic operators comprise +, -, *, / and %.

9. The redundancy detection method based on Web services oriented to Internet of Things applications according to claim 7, characterized in that the rule vectorization module further comprises:
a parsing and binary tree forming sub-module, used to locate the service element in the WSDL, parse the policy tag as XML, and extract the set of relational expressions in the rule according to the logical operators, forming the binary tree data structure of the logical expression;
a vectorization module, used to vectorize each relational expression, by transposition, into the standard multidimensional vector ti of the following formula:
t=(s1v11q11v12q12v13…v1n,…,si vi1qi1vi2qi2vi3…vin,…,sn vn1qn1vn2qn2vn3…vnn,p,c)
where V is the set of variables, C is the set of constants, P is the set of relational operators and Q is the set of basic operators; the transposition sign si∈{+,-}, vij∈V, qij∈Q, p∈P, c∈C.

10. The redundancy detection method based on Web services oriented to Internet of Things applications according to claim 7, characterized in that the similarity calculation based on service rules uses the Dice coefficient binary weight method.

11. The redundancy detection method based on Web services oriented to Internet of Things applications according to claim 7, characterized in that the similarity is calculated as:
Sim(W1,W2)=0.5SimIO(W1,W2)+0.5SimRule(W1,W2);
where W1 and W2 are the two services whose similarity is to be calculated; the service similarity comprises the service IO similarity and the service rule similarity, denoted SimIO(W1,W2) and SimRule(W1,W2) respectively; and 0.5 is the weight given to the service IO and to the service rules in the service similarity calculation.
Application CN201110206923.9A; priority date 2010-10-25; filing date 2011-07-22; title: Redundancy check method and system for Web services facing IOT (Internet of Things) application; status: Expired - Fee Related; publication: CN102457569B (en)

Priority Applications (1)

Application Number; Publication; Priority Date; Filing Date; Title
CN201110206923.9A; CN102457569B (en); 2010-10-25; 2011-07-22; Redundancy check method and system for Web services facing IOT (Internet of Things) application

Applications Claiming Priority (3)

CN201010525971.X; 2010-10-25
CN201010525971; 2010-10-25
CN201110206923.9A; CN102457569B (en); 2010-10-25; 2011-07-22; Redundancy check method and system for Web services facing IOT (Internet of Things) application

Publications (2)

Publication Number; Publication Date
CN102457569A; 2012-05-16
CN102457569B; 2014-04-02

Family

ID=46040219

Family Applications (1)

Application Number; Status; Publication; Priority Date; Filing Date; Title
CN201110206923.9A; Expired - Fee Related; CN102457569B (en); 2010-10-25; 2011-07-22; Redundancy check method and system for Web services facing IOT (Internet of Things) application

Country Status (1)

Country; Link
CN (1); CN102457569B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN104111106B (en)*2014-07-012016-03-16武汉领傲科技有限公司A kind of Internet of Things cognitive method based on article consumption and composition transfer and system
CN107329946B (en)*2016-04-292021-08-24阿里巴巴集团控股有限公司Similarity calculation method and device
CN110532260B (en)*2019-07-232021-05-25北京三快在线科技有限公司Logic expression storage and reading method and device, electronic equipment and medium
CN110474929B (en)*2019-09-272021-06-22新华三信息安全技术有限公司Redundancy rule detection method and device
CN112990466A (en)*2021-03-312021-06-18龙马智芯(珠海横琴)科技有限公司Redundancy rule detection method and device and server
CN119493813B (en)*2025-01-172025-05-27福建文达森信息科技有限公司 Industrial data analysis method and system based on dynamic form

Citations (1)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN101764837A (en)*2009-12-232010-06-30宁波东海蓝帆科技有限公司Web service dynamic calling system and method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP4285658B2 (en) * | 2006-10-17 | 2009-06-24 | International Business Machines Corporation | Apparatus and method for providing web service
US7865535B2 (en) * | 2007-05-18 | 2011-01-04 | International Business Machines Corporation | Apparatus, system, and method for a data server-managed web services runtime
CN100583846C (en) * | 2008-01-08 | 2010-01-20 | 北京邮电大学 | A semantic telecommunication network capability service gateway component, network system and work method
CN101827125B (en) * | 2010-03-31 | 2013-04-10 | 吉林大学 | Semantic Web service body and application thereof

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101764837A (en) * | 2009-12-23 | 2010-06-30 | 宁波东海蓝帆科技有限公司 | Web service dynamic calling system and method

Also Published As

Publication number | Publication date
CN102457569A (en) | 2012-05-16

Similar Documents

Publication | Title
US10929598B2 (en) | Validating an XML document
US6487566B1 (en) | Transforming documents using pattern matching and a replacement language
US8032828B2 (en) | Method and system of document transformation between a source extensible markup language (XML) schema and a target XML schema
US8640087B2 (en) | Semantic system for integrating software components
CN102457569B (en) | Redundancy check method and system for Web services facing IOT (Internet of Things) application
US20070136698A1 (en) | Method, system and apparatus for a parser for use in the processing of structured documents
US7409400B2 (en) | Applications of an appliance in a data center
CN113051285A (en) | SQL statement conversion method, system, equipment and storage medium
US20090125529A1 (en) | Extracting information based on document structure and characteristics of attributes
US8938668B2 (en) | Validation based on decentralized schemas
CN106649769B (en) | Semantic-based conversion method from XBRL data to OWL data
CN103246732B (en) | A method and system for extracting online web news content
US10489493B2 (en) | Metadata reuse for validation against decentralized schemas
CN105005606A (en) | XML Data Query Method and System Based on MapReduce
WO2024221562A1 (en) | Semi-structured webpage attribute value extraction method based on prompt learning, and electronic device and storage medium
WO2023155303A1 (en) | Webpage data extraction method and apparatus, computer device, and storage medium
US12242815B1 (en) | Method and apparatus for data processing, computer, storage medium, and program product
CN103092973B (en) | information extraction method and device
CN106874397A (en) | A kind of automatic semanteme marking method of internet of things oriented equipment
CN118503450A (en) | A method and system for identifying key nodes of network pollution based on knowledge graph
CN111259087B (en) | Computer network protocol entity linking method and system based on domain knowledge base
JP2013218627A (en) | Method and device for extracting information from structured document and program
CN116775889B (en) | Threat information automatic extraction method, system, equipment and storage medium based on natural language processing
CN111199259B (en) | Identification conversion method, device and computer readable storage medium
Zhang et al. | Representing and reasoning about xml with ontologies

Legal Events

Code | Title | Description
C06 | Publication |
PB01 | Publication |
C10 | Entry into substantive examination |
SE01 | Entry into force of request for substantive examination |
C14 | Grant of patent or utility model |
GR01 | Patent grant |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20140402; Termination date: 20160722
CF01 | Termination of patent right due to non-payment of annual fee |
