BACKGROUND OF THE INVENTIONThe present invention relates generally to data stream processing applications, and relates more specifically to the processing of location-based data streams to allow monitoring of location-sensitive data.
The availability of inexpensive location-sensing technologies and the advancement of wireless communication technology have led to an explosion in location-based services. At the same time, other technological advancements have led to an abundance of location-sensitive data. Within this context, information needs may be expressed using location-centric triggers.
For example, a user of a mobile device may install a location-centric trigger in a location-based monitoring server for a particular gas station. This trigger may specify spatial and non-spatial predicate conditions for activating the trigger. For example, the user may request that the trigger be activated when the user is within one mile of the gas station and the gas price is below four dollars. In this case, “within one mile of the gas station” is a spatial predicate condition, while “the gas price is below four dollars” is a non-spatial predicate condition. Data relating to the gas price is “location-sensitive” because it is tied to a particular location (i.e., the gas station).
The location-based information monitoring server receives information in the form of data streams that arrive continuously, rapidly, and in real time from multiple sources. In the above example, these data streams may include, for example, a first data stream identifying the location of the user at various times and a second data stream identifying the price of gas at various times. The data streams are processed against the user's location-specific trigger in order to determine when the trigger should be activated.
Simplistic systems process the data streams as they are received. However, if the location-based information monitoring system receives a large number of data streams and/or processes location-centric triggers for a large number of users, processing data streams as they arrive may delay the delivery of information to the users because processing resources are wasted on large amounts of irrelevant data (i.e., data that does not activate any of the triggers).
SUMMARY OF THE INVENTIONA method for processing a first data stream specifying locations of a user at different times and at least a second data stream specifying values of a monitored attribute at a location of interest at different times includes: receiving a location-centric trigger specifying at least one spatial predicate condition relative to the location of interest and at least one non-spatial predicate condition relevant to the location of interest, calculating a safe region that includes locations whose probability of satisfying the spatial predicate condition falls below a first threshold, calculating one or more safe value containers that include values whose probabilities of satisfying the non-spatial predicate conditions fall below one or more second thresholds, and processing the first data stream and the at least a second data stream against the location-centric trigger, by considering only those locations that are not contained within the safe region and only those values that are not contained within respective safe value containers for the corresponding non-spatial predicate conditions.
BRIEF DESCRIPTION OF THE DRAWINGSSo that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
FIG. 1 is a schematic diagram illustrating one embodiment of a location-based information monitoring system, according to the present invention;
FIG. 2 is a flow diagram illustrating one embodiment of a method for processing location-sensitive data streams, according to the present invention;
FIG. 3A is a graph illustrating an exemplary user location in a two-dimensional coordinate space;
FIG. 3B is a graph illustrating exemplary one-dimensional value domains associated with each of a plurality of monitored attributes;
FIG. 4 is a flow diagram illustrating one embodiment of a method for assisting in the processing of location-sensitive data streams, according to the present invention;
FIG. 5A is a schematic diagram illustrating a mobile user at a location;
FIG. 5B is a graph illustrating the probability density function for a mobile user's motion inside a safe region;
FIG. 6 is a flow diagram illustrating one embodiment of a method for computing a safe region, according to the present invention;
FIG. 7A is a schematic diagram illustrating an exemplary grid cell that is divided into four partitions or quadrants;
FIG. 7B is a schematic diagram illustrating the exemplary grid cell ofFIG. 7A, including a set of tension points obtained from the candidate points illustrated inFIG. 7A;
FIG. 7C is a schematic diagram illustrating the exemplary grid cell ofFIGS. 7A-B, including a set of component rectangles obtained from the tension points illustrated inFIG. 7B;
FIG. 7D is a schematic diagram illustrating the exemplary grid cell ofFIGS. 7A-C, in which a final safe region composed of the component rectangles illustrated inFIG. 7C is selected;
FIG. 8 is a flow diagram illustrating one embodiment of a method for computing a safe value container, according to the present invention;
FIG. 9 is a high-level block diagram of the location-based monitoring method that is implemented using a general purpose computing device; and
FIGS. 10A-B respectively illustrate the use of single and multiple safe value containers.
DETAILED DESCRIPTIONIn one embodiment, the invention is a method and apparatus for selective processing of location-sensitive data streams. Embodiments of the invention implement a selective approach to the processing of incoming data streams, as opposed to processing the incoming data streams on delivery. In particular, stream data with a low probability of activating a location-centric trigger is discarded without being processed, allowing more server resources to be devoted to the processing of stream data with a greater probability of activating a trigger.
Embodiments of the invention rely on the use of “safe regions” for location-based stream data and “safe value containers” for monitored stream data, where data is “safe” if it can be discarded (because it is not likely to activate a trigger). Specifically, a “safe region” is a physical location in which a user's location-centric triggers are not likely to be activated. For instance, referring to the gas station example discussed in the background, the trigger's safe region encompasses any location outside of an approximately one-mile radius from the gas station. A “safe value container” is a range of values for a monitored attribute that is not likely to activate the user's location-centric triggers. For instance, referring again to the gas station example, the trigger's “safe value container” includes any gas prices above four dollars. The probability threshold for the location data and the monitored data may be the same, or each may have a different threshold. For ease of explanation, discussion of the invention herein refers to a single safe value container. However, those skilled in the art will appreciate that a location-centric trigger may be associated with a plurality of safe value containers (e.g., one safe value container for each non-spatial predicate condition). Thus, any instance discussed herein in which reference is made to a single safe value container inherently contemplates the existence of multiple safe value containers.
As discussed above, each mobile user ui∈U expresses her location-based information needs in the form of location-centric triggers ti,j∈T at a location of interest lj∈L. In one embodiment, the triggers are installed at an information monitoring server that receives location updates from mobile users as well as data updates from other data sources and processes these updates to determine if any relevant triggers need to be activated. The term “mobile user” and the designation uiare used interchangeably herein to refer to both the user of a mobile device and the mobile device itself (e.g., a mobile global positioning system (GPS) device, a cellular telephone, a personal digital assistant, a laptop computer, a satellite radio receiver, or the like).
FIG. 1 is a schematic diagram illustrating one embodiment of a location-basedinformation monitoring system100, according to the present invention. As illustrated, thesystem100 comprises aninformation monitoring server102 and a plurality of data sources that provide data updates to theinformation monitoring server102. These data sources includelocation data sources104 and monitoreddata sources106.
The
location data sources104 comprise one or more base stations that are in communication with mobile users and that provide location data relevant to the mobile users (i.e., the locations of the mobile users at given times). Each of the
location data sources104 delivers updates in the form of location data streams from multiple mobile users to the
information monitoring server102. A location data stream contains tuples of the form l
ui(t)=
u
i,t,x,y
, which indicates the location of the mobile user u
iin two-dimensional coordinate space at time t. In one embodiment, the
location data sources104 operate within a wireless environment in which the base stations communicate directly with the mobile users.
The monitored
data sources106 comprise one or more information delivery systems that provide monitored data relevant to one or more locations of interest (e.g., gas prices at a specified gas station). Each of the monitored
data sources106 delivers updates in the form of monitored data streams relevant to the monitored data at different locations of interest to the
information monitoring server102. A monitored data stream contains tuples of the form m
lj(t)=
l
j,t,a
1,a
2, . . . , a
r, which contains r-dimensional monitored data. a
k, k∈[1 . . . r] is a value in domain D
k, which represents the value of the k
thmonitored attribute at location of interest l
jat time t. In one embodiment, the monitored
data sources106 operate within an Ethernet environment.
Theinformation monitoring server102 maintains a set T of location centric triggers. Each trigger ti,j∈T represents a trigger installed by mobile device user ui∈U on location of interest lj∈L. Theinformation monitoring server102 receives updates from thelocation data sources104 and the monitoreddata sources106 in the form of streaming data, processes the updates, and activates location-centric triggers in response to the updates. As illustrated, theinformation monitoring server102 comprises five main components, each of which may individually comprise a processor. These components include: adata processor108, atrigger manager110, anoptimizer112, anevent detector114, and adata manager116.
Thedata processor108 receives updates directly from thelocation data sources104 and the monitoreddata sources106. In addition, thedata processor108 receives requests to install new location-centric triggers from the location data sources104. Thedata processor108 classifies the updates according to their sources (e.g., location data source or monitored data source) and then provides the classified updates to theoptimizer112. The requests to install new location-centric triggers are forwarded by thedata processor108 to thetrigger manager110.
Theoptimizer112 receives the classified updates from thedata processor108 and determines whether the classified updates should be processed. As discussed in further detail below, the determination as to whether a classified update should be processed is based on the probability that the classified update will activate (or not activate) a location-centric trigger for at least one of the users. In one embodiment, theoptimizer112 facilitates this determination by computing “safe regions” for the location data and “safe value containers” for the monitored data. Theoptimizer112 delivers the safe regions and safe value containers to thedata manager116.
Thetrigger manager110 receives the requests to install new location-centric triggers from thedata processor108 and handles the addition and removal of triggers in accordance with these requests. In addition, thetrigger manager110 coordinates with theoptimizer112 and theevent detector114 in order to determine whether any triggers should be activated in response to incoming updates.
Thedata manager116 receives the safe regions and safe value containers from theoptimizer112 and communicates the safe regions back to the relevant mobile users for use in self-monitoring, discussed in further detail below. Thedata manager116 also stores the safe value containers. In one embodiment, safe value containers are not communicated to the monitoreddata sources106 because it is assumed that the monitoreddata sources106 do not possess computational power that can be devoted to self-monitoring; however, in other embodiments, thedata manager116 communicates the safe value containers to the monitoreddata sources106. In addition, thedata manager116 receives updates for processing from theoptimizer112.
Thedata manager116 delivers the safe region information, safe value container information, and updates to theevent detector114. In addition, theevent detector114 receives trigger information from thetrigger manager110. Theevent detector114 determines whether to activate a trigger by processing the data received from thedata manager116 against the data provided by thetrigger manager110.
FIG. 2 is a flow diagram illustrating one embodiment of amethod200 for processing location-sensitive data streams, according to the present invention. Themethod200 may be implemented, for example, by theinformation monitoring server102 illustratedFIG. 1. As such, reference is made in the discussion of themethod200 to various components of theinformation monitoring server102. However, it is understood that themethod200 is not limited to operation in conjunction with theinformation monitoring server102, and may readily be deployed in systems having different configurations from that illustrated.
Themethod200 is initialized atstep202 and proceeds to step204, where thedata processor108 receives one or more location-centric triggers from one or more mobile users. In one embodiment, these location-centric triggers are received from one or more of thelocation data sources104, such as base stations that are in communication with mobile users. Each location-centric trigger specifies a set of spatial and non-spatial predicate conditions for activating the trigger (e.g., “Notify User A when User A is within one mile of gas station G and the gas price is below four dollars”). In one embodiment, these triggers are expressed in the form of <monitored attribute><op><value>, where <op>∈{<, >, ≦, ≧}, combined using the logical ̂ operator. For instance, User A's trigger can be expressed as tA,G=(x≧−1̂x≦1̂y≧−1̂y≦1̂p<4), where the first four constraints express the spatial trigger region using the minimum bounding rectangle of a circle of one-mile radius around the gas station G, assuming that the gas station is located at the origin of the coordinate space. The last constraint expresses the gas price requirement. The predicate conditions specified on the spatial region will be a common feature of all triggers; however, different locations of interest will have different monitored attributes associated with them. In one embodiment, themethod200 assumes that a trigger specifies predicate conditions on all of the monitored attributes associated with the corresponding location of interest.
In one embodiment, the triggers are classified into one of three categories depending on their relevance to the population of mobile users: private, public, or shared. Considering a location-based information monitoring system with n mobile users, private triggers ti,jprivate∈T are relevant to a single mobile user, where i∈[1 . . . n] and |i|=1. Shared triggers ti,jshared∈T are relevant to at least two mobile users under the constraints i∈[1 . . . n] and 2≦|i|≦n′, where n′ specifies system limitations on the maximum number of mobile users permitted to share a trigger. Public triggers ti,jpublic∈T are relevant to all of the mobile users, |i|=n. In a further embodiment, an additional constraint specifies that a mobile user may have only one trigger relevant to a given location of interest lj.
Instep206, thedata processor108 receives one or more location updates from the location data sources104. As discussed above, the location updates indicate the physical locations of the mobile users at given times (e.g., where user A is at time t).
Instep208, thedata processor108 receives one or more data updates from the monitoreddata sources106. As discussed above, the data updates indicate monitored data relevant to one or more locations of interest (e.g., the gas price at gas station G at time t).
Instep210, theoptimizer112 computes one or more safe regions in accordance with the location updates and one or more safe value containers in accordance with the data updates. As discussed above, a safe region is a physical location in which a mobile user's location-centric triggers are not likely to be activated, while safe value container is a range of values for a given parameter that is not likely to activate any of the mobile users' location-centric triggers.
The safe region for each user uiin a set of users U may be defined as ψ (ui). One specific embodiment of a method for calculating a safe region is discussed in greater detail with respect toFIG. 6, while one specific embodiment of a method for calculating a safe value container is discussed in greater detail with respect toFIG. 8.
Instep212, thedata manager116 delivers the safe regions to the mobile users (e.g., via the base stations within the location data sources104). Thedata manager116 also stores the safe value containers (e.g., locally).
Instep214, theevent detector114 processes the location updates and the monitored data updates against the safe regions and the safe value containers in order to produce a reduced set of updates. In particular, any location updates that indicate mobile user locations within the safe regions are discarded. This is because locations within the safe regions have zero probability of activating any location-centric triggers. Once the safe regions have been delivered to mobile users (e.g., as in step212), the number of location updates that have to be discarded should be greatly reduced. This is because the mobile users can use the safe region information to control when they send location updates, as discussed in further detail in connection withFIG. 4. In addition, data updates that indicate data values falling within the safe value containers are discarded. This is because data values falling within the safe value containers have a zero probability of activating any triggers.
Instep216, theevent detector114 processes the reduced set of updates against the location-centric triggers in order to determine whether any of the triggers should be activated. A trigger is to be activated when all of its spatial and non-spatial predicate conditions are satisfied. Thus, following the above example, User A's trigger tA,Gis activated when the location updates indicate that User A's current location is within one mile of gas station G and the data updates indicate that the price of gas at gas station G is below four dollars.
Instep218, theevent detector114 determines whether any triggers should be activated, based on the processing performed instep216. If theevent detector114 concludes instep218 that no triggers should be activated, themethod200 returns to step206, and thedata processor108 continues to receive location updates and data updates.
Alternatively, if theevent detector114 concludes instep218 that at least one trigger should be activated, themethod200 proceeds to step220, and theevent detector114 activates the trigger(s) by delivering an update to the relevant mobile user(s) (e.g., by informing User A that he is within one mile of gas station G and that the gas price at gas station G is under four dollars). Themethod200 then returns to step206, and thedata processor108 continues to receive location updates and data updates.
The
method200 therefore employs a selective processing approach that drops data updates with less than a threshold probability (e.g., zero probability) of activating any relevant triggers. The probability of a location data update l
ui(t) from a location data source being able to activate a trigger t
i,j, denoted by Pr[l
ui(t)
t
i,j], is dependent on two factors: (1) the probability of the mobile user entering the spatial region R
i,jSassociated with the trigger t
i,j, denoted Pr[l
ui(t)∈R
i,jS]; and (2) the probability of monitored attribute values m
lj(t′) at the location of interest satisfying the non-spatial predicate conditions specified by the installed trigger, denoted by Pr[m
lj(t′)∈R
i,jNS]. Here, t′ is the time instant at which the latest location update from the location of interest l
jhas been received by the location monitoring server, such that t′<t. Thus, one has:
Pr[lui(
t)
t
i,j]=f(
Pr[lui(
t)∈
Ri,jS],Pr[mlj(
t′)∈
Ri,jNS]) (EQN. 1)
Similarly, the probability of a monitored data update m
lj(t) from a monitored data source being able to activate a trigger t
i,j, denoted by Pr[m
lj(t)
t
i,j], is dependent on two factors: (1) the probability of the monitored attribute values m
lj(t) at the location of interest l
jsatisfying the non-spatial predicate conditions R
i,jNSon the trigger t
i,j, denoted by Pr[m
lj(t)∈R
i,jNS]; and (2) the probability of the mobile user location l
ui(t′) lying within the region R
i,jSassociated with the trigger t
i,j, denoted by Pr[l
ui(t′)∈R
i,jS]. Thus, one has:
Pr[mlj(
t)
ti,j]=f(
Pr[mlj(
t)∈
Ri,jNS],Pr[lui(
t′)∈
Ri,jS]) (EQN. 2)
In some embodiments, the location-based
information monitoring system100 has installed therein a large number of triggers associated with each mobile user u
iand each location of interest l
j. In this embodiment, the set of triggers T
i⊂T is relevant to the mobile user u
i. Any location update from the mobile user u
ishould be processed by the information monitoring server only if the probability of activating at least one trigger in the set T
i, denoted by Pr[l
ui(t)
≧1T
i], is greater than a predefined threshold (e.g., zero). Thus, one has:
Pr[lui(
t)
≧1Ti]=1−
Pr[lui(
t)≠>
Ti] (EQN. 3)
where Pr[lui(t)≠>Ti] denotes the probability of not activating any relevant triggers in the set Ti. Assuming that each trigger in the set Tiis independent of all other triggers, one has:
The monitored attributes associated with a trigger ti,jare considered to be independent of each other. This allows one to represent the probability of the non-spatial predicate conditions being satisfied as a product of the probabilities along each dimension in EQN. 6.
Similarly, in one embodiment, the set of triggers Tj⊂T is relevant to the location of interest lj; any monitored update for this location of interest ljshould be processed by theinformation monitoring server102 only if the probability of activating at least one trigger in the set Tjis greater than a predefined threshold (e.g., zero). The probability that none of the triggers in the set Tjwill be activated is given by:
It can be seen from EQNs. 6 and 9 that the probability of a location data update or a monitored data update activating any relevant trigger is zero if and only if: (1) Pr[lui(t′)∈Ri,jS]=0,∀j|ti,j∈Ti; or (2) Pr[ak(t)∈Ri,jk]=0 for any of the k monitored attributes ∀i‘ti,j∈T. The safe regions for the spatial locations and the safe value containers for the non-spatial attributes allow one to quickly determine the time instants when these two conditions on probability are satisfied.
A safe region Ω(ui) for each mobile user uican thus be defined such that as long as the mobile user's location lies within the safe region, the condition Pr[lui(t′)∈Ri,jS]=0 holds ∀j|ti,j∈Ti. Consequently, any location data updates from the mobile user uimay be discarded without processing, since the probability of any relevant triggers being activated by these updates is zero. Thus:
Pr[lui(
t)
≧1Ti|lui(
t)∈ψ(
ui)]=0 (EQN. 10)
A safe value container can be defined for each monitored attribute ajk,k∈[1 . . . r], relevant to each location of interest lj, such that as long as the value of a monitored attribute falls within one of its safe value containers, denoted by δk(lj), the condition Pr[mlj∈Ri,jk]=0 holds ∀i|ti,j∈T. As long as the value of even one of the monitored attributes associated with the location of interest ljlies within a safe value container, corresponding monitored data updates for the location of interest ljmay be discarded without processing, since the probability of any relevant triggers being activated by these updates is zero. Thus:
Prmlj[lui(
t)
≧1Ti|∃k∈[1 . . .
r],ajk∈δ
k(
lj)]=0 (EQN. 11)
FIG. 3A is a graph illustrating an exemplary user location P in a two-dimensional coordinatespace300. The spatial regions associated withexemplary triggers1,2,3, and4 are shown, along with thesafe region302 containing the user at position P.
FIG. 3B is a graph illustrating exemplary one-dimensional value domains associated with each of a plurality of r monitored attributes. Corresponding value ranges for triggers along each of the attributes are displayed. The computed safe value containers are displayed under three different scenarios: (1) when the value of the monitored attribute lies outside any relevant trigger's value range (e.g., as in the case of attribute a1); (2) when the value of the monitored attribute lies inside a single trigger's value range (e.g., as in the case of attribute a2); and (3) when the value of the monitored attribute lies within multiple intersecting triggers' value ranges (e.g., as in the case of attribute ar).
FIG. 4 is a flow diagram illustrating one embodiment of amethod400 for assisting in the processing location-sensitive data streams, according to the present invention. Themethod400 may be implemented, for example, at a mobile user such as a mobile GPS device.
Themethod400 is initialized atstep402 and proceeds to step404, where the mobile user delivers a location-centric trigger to an information monitoring server (e.g.,information monitoring server102 ofFIG. 1), via a base station.
Instep406, the mobile user receives a safe region from the information monitoring server, via the base station. The safe region, as discussed above, indicates a physical location within which the location-centric trigger is not likely to be activated. The mobile user stores the safe region instep408.
Instep410, the mobile user processes its current location against the stored safe region. Instep412, the mobile user determines whether its current location falls within the safe region.
If the mobile user concludes instep412 that its current location falls within the safe region, themethod400 returns to step410, and the mobile user continues to process its location against the stored safe region. Alternatively, if the mobile user concludes instep412 that its current location does not fall within the safe region, themethod400 proceeds to step414, and the mobile user delivers a location update to the information monitoring server, via the base station. Themethod400 then returns to step410, and the mobile user continues to process its location against the stored safe region.
Thus, the mobile users use the safe region information to self-monitor their locations. In particular, once a mobile user knows where its safe region is, it can avoid sending location updates to theinformation monitoring server102 when it knows that it is inside its safe region. Shifting the location monitoring burden from theinformation monitoring server102 to the mobile users allows the mobile users to conserve energy and bandwidth by reducing the number of updates that must be sent. Theinformation monitoring server102 also conserves energy and bandwidth because the number of updates that it has to process is reduced.
In one embodiment, safe regions are represented as rectangular regions. There are at least three advantages to the rectangular representation: (1) the safe region can be represented compactly through the specification of two points (e.g., bottom-left and top-right corners), making it easy to communicate to the relevant mobile user; (2) mobile users can quickly determine their locations within a rectangular region, which facilitates the self-monitoring of location; and (3) computation of a rectangular safe region requires relatively low processing overhead.
FIG. 5A is a schematic diagram illustrating a mobile user at location {right arrow over (P)}, where {right arrow over (P)}lastindicates the last-recorded location of the mobile user. The probability density function for the mobile user's motion inside the safe region, denoted by p(φ), is given by the following density function:
In EQN. 12, s and t are parameters of steadiness such that s/t<1.FIG. 5B is a graph illustrating the probability density function for the mobile user's motion inside the safe region for s=1 and for different values of t. The value of s in particular determines the weight to be assigned to the probability of the mobile user moving in the same direction, whereas t determines the granularity of change in φ for which the probability value changes. As illustrated inFIG. 5B, the probability of the mobile user moving in a direction such that 0≦φ≦π/t is the same; for values of φ>π/t, this probability decreases. Assuming random motion would lead to the probability of motion in any direction being ½π.
Assuming that a mobile user moves in a direction φ as illustrated inFIG. 5A with a speed of v, and assuming a convex safe region ψ(ui), a last-recorded location {right arrow over (P)}last, and an updated location {right arrow over (P)}, the average location update cost Cui
over time can be computed as:
where Clis the cost of a single location update, φ is the angle between the direction of motion of the mobile user and the mobile user's last-recorded direction of motionPlastP and r(φ) is the length of the segmentPR (where R is the point on the safe region boundary at which the next update is expected to occur). Given the angle φ, the elapsed time before the next update is r(φ)/v. The average elapsed time over all values of φ is given
Thus, the average location update cost Cuican be re-written as:
where λ(φ)=∫−ππr(φ)p(φ)d(φ) is the weighted perimeter of the safe region. In order to minimize the update costs, one must maximize the value of the weighted perimeter. Therefore, the problem of minimizing update costs reduces to finding a rectangular safe region with a maximum weighted perimeter.
In one embodiment, the maximum weighted perimeter safe region for a given mobile user is calculated by calculating the individual safe regions for each of the mobile user's location-centric triggers and then calculating the intersection of these safe regions. In another embodiment, illustrated in greater detail with respect toFIG. 6, the maximum weighted perimeter safe region is calculated for a set of relevant spatial trigger regions for a given mobile user.
FIG. 6 is a flow diagram illustrating one embodiment of amethod600 for computing a safe region, according to the present invention. Specifically, themethod600 computes a maximum weighted perimeter safe region, as described above. Themethod600 may be implemented, for example, in accordance withstep210 of themethod200 and by theinformation monitoring server102 illustratedFIG. 1. As such, reference is made in the discussion of themethod600 to various components of theinformation monitoring server102. However, it is understood that themethod600 is not limited to operation in conjunction with theinformation monitoring server102, and may readily be deployed in systems having different configurations from that illustrated.
In one embodiment, themethod600 reduces computation costs by considering only relevant triggers in the vicinity of the mobile user's current location. In one embodiment, this is achieved by overlaying a grid over the entire universe of discourse U (or map).
Themethod600 is initialized at step and proceeds to step604, where thedata processor108 receives the mobile user's current location vector {right arrow over (P)} and the current grid cell G({right arrow over (P)}) in which the mobile user resides.
Instep606, theoptimizer112 identifies the set of triggers that intersect the current grid cell G({right arrow over (P)}). As discussed above, embodiments of themethod600 consider only these triggers in the calculation of the safe region. In one embodiment, if none of the user's triggers intersect the current grid cell G({right arrow over (P)}), then theoptimizer112 returns the entire current grid cell G({right arrow over (P)}) as the safe region.
Instep608, theoptimizer112 partitions the current grid cell G({right arrow over (P)}) into a plurality of partitions, with the mobile user's current location {Px, Py} as the origin. In one embodiment, theoptimizer112 partitions the current grid cell G({right arrow over (P)}) into four quadrants.
Instep610, theoptimizer112 defines, for each partition, a set of candidate points (cpSetPart). The set of candidate points comprises the set of points that can potentially form a corner of a rectangular safe region. In one embodiment, the set of candidate points is defined by first selecting the spatial region corner of each of the mobile user's triggers as a candidate point in its appropriate partition. For triggers that do not lie completely inside the current grid cell G({right arrow over (P)}), the intersection points of the boundary of the current grid cell G({right arrow over (P)}) and the trigger spatial region are selected as candidate points instead of the corner points (which fall outside the region of the current grid cell G({right arrow over (P)})). In a further embodiment, the set of candidate points is expanded by also selecting, for trigger spatial conditions that intersect the x axis or y axis of the coordinate axes with origin at {Px, Py}, the points of intersection of the triggers with the axes. In a further embodiment still, if no points of intersection with an axis exist, the point of intersection of the axis with the current grid cell G({right arrow over (P)}) is added to the set of candidate points.
In one embodiment, the set of candidate points is trimmed. In one embodiment, trimming includes removing, in the case of multiple candidate points in a partition that intersect the x or y axis, all candidate points on this axis except for the one that is closest to the origin. In a further embodiment, all points that dominate any other point in the candidate set are removed. A point P1dominates a point P2if P1·x>P2·x and P1·y>P2·y. In yet another embodiment, the candidate points are sorted according to increasing distance of the x coordinate from the origin. Points with the same x coordinate are arranged in order of decreasing distance of the y coordinate from the origin.
FIG. 7A is a schematic diagram illustrating anexemplary grid cell700 that is divided into four partitions or quadrants, labeled I-IV. As illustrated, a plurality of candidate points (labeled C11-C44) has been identified within thegrid cell700. Black dots represent the candidate points that have survived trimming, as discussed above, while hollow dots represent candidate points that have been trimmed.
Referring back toFIG. 6, instep612, theoptimizer112 defines, for each partition, a set of tension points (tpSetPart). Tension points are obtained from candidate points by ensuring that only points that form the largest possible rectangular regions that do not overlap the spatial region of any trigger are selected. In one embodiment, each tension point TQi(where Q∈{1,2,3,4} represents the partition that the tension point belongs to) is assigned to the same x-coordinate as the corresponding candidate point CQi. TQiis assigned the same y-coordinate as that of CQi−1, or TQi−1if TQiand TQi−1have the same x coordinate. The y-coordinate of TQiis set as either the top bound of the current grid cell G({right arrow over (P)}) or the y-coordinate of a candidate point that intersects the y axis (if any such candidate points exist).
FIG. 7B is a schematic diagram illustrating theexemplary grid cell700 ofFIG. 7A, including a set of tension points (labeled T11-T44) obtained from the candidate points illustrated inFIG. 7A. If one imagines an elastic band laid around the set of candidate points, the set of tension points can be obtained by stretching the elastic band to obtain a rectilinear polygonal shape that does not overlap any of the triggers' spatial regions.
Referring back toFIG. 6, instep614, theoptimizer112 defines a set of component rectangles. The set of tension points forms the opposite corner (i.e., opposite to the origin) of the set of candidate component rectangles in each partition. The safe region that is ultimately calculated is composed of the intersection of the component rectangles from each partition. In one embodiment, component rectangles with the maximum perimeter in each partition are selected, as this will facilitate the calculation of safe regions with the maximum weighted perimeter.
Instep616, theoptimizer112 calculates the safe region in accordance with the component rectangles. In one embodiment, this is accomplished using greedy heuristics that first select the partition in which the probability density function of the expected future movement of the mobile user is maximum. The component rectangle with the largest weighted perimeter in this partition is then selected. Partitions are further selected dependent on the distribution of probability density function values in the partition using the steady motion assumption. At each step, the component rectangle with the largest weighted perimeter is selected, and this continues until all partitions are processed using this greedy heuristic.
FIG. 7C is a schematic diagram illustrating theexemplary grid cell700 ofFIGS. 7A-B, including a set of component rectangles obtained from the tension points illustrated inFIG. 7B. As illustrated, the largest component rectangle in Quadrant I is first selected, since the probability density function values for mobile user movement are maximum in this quadrant. The next quadrant (i.e., Quadrant IV) is selected in a clockwise manner; probability density function values of motion direction are expected to be higher in Quadrant IV because the angle θ illustrated inFIG. 7C is such that θ<π/4. If the value of the angle θ was greater than π/4, Quadrant II would be selected next rather than Quadrant IV. Addition of the component rectangle at tension point T44provides a safe region with a larger perimeter than the safe region obtained by adding the component rectangle with the tension point T41,T42. Finally, the component rectangles with tension points at T23in Quadrant II and T34in Quadrant III (selected in order of increasing expected probability density function values) are selected.
FIG. 7D is a schematic diagram illustrating theexemplary grid cell700 ofFIGS. 7A-C, in which a finalsafe region702 composed of the component rectangles illustrated inFIG. 7C is selected.
Thedata manger116 outputs the safe regions to the mobile users instep618. Themethod600 then returns to step604, and proceeds as described above to process a new location vector and grid cell.
As discussed above, each location-centric trigger may also have a non-spatial predicate condition that requires monitoring of a non-spatial attribute a
jk(e.g., the price of gas at a location of interest). Each monitored attribute a
jkat a location of interest l
jhas at least one safe value container δ
k(l
j) associated with it. The following condition holds true for any safe value container Pr[m
lj(t)
≧1T
j|∃k∈[1 . . . r],a
jk∈δ(l
j)]=0, which implies that if the value of even a single monitored attribute at a location of interest lies within the corresponding safe value container, the probability of any relevant triggers being activated is zero.
In one embodiment, the safe value containers are constructed to satisfy at least three goals: (1) quick calculation of the safe value containers (since calculation must be performed for each of the r monitored attributes at each location of interest); (2) quick containment check to verify that the current monitored attribute value lies within a one-dimensional value range; and (3) maximization of the value range covered by the safe value containers, so as to minimize the number of updates that need to be processed by thelocation monitoring server102.
In one embodiment, the safe value containers for any monitored attribute comprise either a set of multiple safe value containers or a single safe value container. Multiple safe value containers reduce the number of updates for low update frequency data streams and data streams with high rates of change of data values, where the values of the monitored data may jump from one safe value container to another.
FIGS. 10A-B, for example, respectively illustrate the use of single and multiple safe value containers. In particular,FIG. 10A illustrates a single safe value container being invalidated at time instant t2when a monitored data source delivers data exhibiting a high rate of change and a low update frequency. By contrast,FIG. 10B illustrates that the use of multiple safe value containers eliminates the need to recomputed the safe value containers at time instants t2and t3.
FIG. 8 is a flow diagram illustrating one embodiment of amethod800 for computing a safe value container, according to the present invention. Themethod800 may be implemented, for example, in accordance withstep210 of themethod200 and by theinformation monitoring server102 illustratedFIG. 1. As such, reference is made in the discussion of themethod800 to various components of theinformation monitoring server102. However, it is understood that themethod800 is not limited to operation in conjunction with theinformation monitoring server102, and may readily be deployed in systems having different configurations from that illustrated.
Themethod800 is initialized atstep802 and proceeds to step804, where theoptimizer112 receives the values of an attribute being monitored. For ease of explanation, themethod800 describes monitoring the values for a single attribute; however, it will be appreciated that themethod800 can just as easily be implemented to monitor values for multiple attributes.
Instep806, theoptimizer112 divides the entire value domain of the attribute into a plurality of equally sized blocks. This reduces the computation costs associated with calculating the safe value containers, since only triggers whose attribute value range intersects the current monitored block are considered for the purposes of safe value container calculation. Consider, for example, the monitored attribute values a={a1, a2, . . . ar} and the corresponding blocks identified as B(a1), B(a2), . . . B(ar), respectively. Further consider the calculation of safe value containers for the kthattribute ak.
Instep808, the optimizer selects the block B(ak) to monitor as the block in which the current value of attribute aklies. Themethod800 then proceeds to step810, where theoptimizer112 determines the set Tjof triggers such that the intersection of the predicate conditions of the triggers with the currently monitored block is not an empty set. In other words, theoptimizer112 determines the set Tjof triggers such that Ri,jk∩B(ak)≠Ø.
Instep812, theoptimizer112 determines whether, for any of the triggers in the set Tjof triggers, the monitored attribute value satisfies the predicate conditions. In other words, theoptimizer112 determines whether for any of the triggers ti,j∈Tj,mlj(t)·ak∈Ri,jk.
If theoptimizer112 concludes instep812 that the monitored attribute value does not satisfy the predicate condition for any of the triggers in the set Tj, the safe value container is set instep814 as the currently monitored block minus the union of the predicate conditions for all triggers that intersect the currently monitored block. In other words, the safe value container is set as B(ak)−(R1,jk∪R2,jk∪ . . . Rq,jk), where q denotes the number of triggers that intersect the currently monitored block B(ak). Themethod800 then terminates instep812.
Alternatively, if theoptimizer112 concludes instep812 that the monitored attribute value satisfies the predicate condition for at least one of the triggers in the set Tj, themethod800 proceeds to step816, where the optimizer determines whether more than one trigger is satisfied by the monitored attribute value.
If theoptimizer112 concludes instep816 that the monitored attribute value satisfies the predicate condition for more than one of the triggers in the set Tj, themethod800 proceeds to step818, and theoptimizer112 sets the safe value container as the intersection of the predicate conditions for all triggers whose value range contains the value of the currently monitored attribute. In other words, the safe value container is set as R1,jk∩R2,jk∩ . . . Rp,jk, where p denotes the number of triggers whose value range contains the value mlj(t)·ak. Themethod800 then terminates instep812.
Alternatively, if theoptimizer112 concludes instep816 that the monitored attribute value satisfies the predicate condition for only one of the triggers in the set Tj, themethod800 proceeds to step820, and theoptimizer112 sets the safe value container to the value range of the predicate condition. In other words, the safe value container is set as Ri,jk. Themethod800 then terminates instep812.
FIG. 9 is a high-level block diagram of the location-based monitoring method that is implemented using a generalpurpose computing device900. In one embodiment, a generalpurpose computing device900 comprises aprocessor902, amemory904, a location-basedmonitoring module905 and various input/output (I/O)devices906 such as a display, a keyboard, a mouse, a stylus, a wireless network access card, an Ethernet interface, and the like. In one embodiment, at least one I/O device is a storage device (e.g., a disk drive, an optical disk drive, a floppy disk drive). It should be understood that the location-basedmonitoring module905 can be implemented as a physical device or subsystem that is coupled to a processor through a communication channel.
Alternatively, the location-basedmonitoring module905 can be represented by one or more software applications (or even a combination of software and hardware, e.g., using Application Specific Integrated Circuits (ASIC)), where the software is loaded from a storage medium (e.g., I/O devices906) and operated by theprocessor902 in thememory904 of the generalpurpose computing device900. Thus, in one embodiment, the location-basedmonitoring module905 for providing fault tolerance for stream processing applications, as described herein with reference to the preceding figures, can be stored on a computer readable storage medium (e.g., RAM, magnetic or optical drive or diskette, and the like).
It should be noted that although not explicitly specified, one or more steps of the methods described herein may include a storing, displaying and/or outputting step as required for a particular application. In other words, any data, records, fields, and/or intermediate results discussed in the methods can be stored, displayed, and/or outputted to another device as required for a particular application. Furthermore, steps or blocks in the accompanying figures that recite a determining operation or involve a decision, do not necessarily require that both branches of the determining operation be practiced. In other words, one of the branches of the determining operation can be deemed as an optional step.
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof. Various embodiments presented herein, or portions thereof, may be combined to create further embodiments. Furthermore, terms such as top, side, bottom, front, back, and the like are relative or positional terms and are used with respect to the exemplary embodiments illustrated in the figures, and as such these terms may be interchangeable.