CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application No. 62/110,263, filed Jan. 30, 2015; U.S. Provisional Application No. 62/112,032, filed Feb. 4, 2015; and U.S. Provisional Application No. 62/113,092, filed Feb. 6, 2015, which are incorporated by reference herein.
TECHNICAL FIELD
This description relates generally to data analysis, and more particularly to denoising and data fusion of biophysiological rate features.
BACKGROUND
Data analysis generally encompasses processes of collecting, cleaning, processing, transforming, and modeling data with the goal, for example, of accurately describing the data, discovering useful information or features among the data, suggesting conclusions, or supporting decision-making. Data analysis typically includes systematically applying statistical or logical techniques to describe, condense, illustrate and evaluate data. Various analytic techniques facilitate distinguishing the signal or phenomenon of interest from unrelated noise and uncertainties inherent in observed data.
Sensor data fusion techniques typically provide higher-level information from data observed at multiple sensors, for example, employing spatio-temporal data integration, exploiting redundant and complementary information, as well as available context. Exploratory data analysis often applies quantitative methods for outlier detection that attempt to identify and eliminate inaccurate data. In addition, descriptive statistics, such as the statistical mean, median, variation or standard deviation, may be generated to help interpret the data. Further, data visualization may also be used to examine the data in graphical format, providing insight regarding the information embedded in the data.
In general, statistical hypothesis testing, or confirmatory data analysis, employs statistical inference to determine if a result is significant based on a confidence interval or threshold probability. Model selection techniques may be employed to determine the most appropriate model from multiple hypotheses. Decision theory and optimization techniques, including chi-square testing, may further be employed to select the best of multiple descriptive models. Statistical inference methods include, but are not limited to, the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the focused information criterion (FIC), the deviance information criterion (DIC), and the Hannan-Quinn information criterion (HQC).
A photoplethysmogram (PPG) is an optically obtained plethysmogram, or volumetric measurement of an organ. The pulse oximeter, a type of PPG sensor, illuminates the skin with one or more colors of light and measures changes in light absorption at each wavelength. The PPG sensor illuminates the skin, for example, using an optical emitter, such as a light-emitting diode (LED), and measures either the amount of light transmitted through a relatively thin body segment, such as a finger or earlobe, or the amount of light reflected from the skin, for example, using a photodetector, such as a photodiode. PPG sensors have been used to monitor respiration and heart rates, blood oxygen saturation, hypovolemia, and other circulatory conditions.
Conventional PPGs typically monitor the perfusion of blood to the dermis and subcutaneous tissue of the skin, which may be used to detect, for example, the change in volume corresponding to the pressure pulses of consecutive cardiac cycles of the heart. If the PPG is attached without compressing the skin, a secondary pressure peak may also be seen from the venous plexus. A microcontroller typically processes and calculates the peaks in the waveform signal to count heart beats per minute (bpm).
However, signal noise from sources unrelated to desired features, including, for example, motion artifacts and electrical signal contamination, has proven to be a limiting factor affecting the accuracy of PPG sensor readings. While such signal noise may be avoided in a clinical environment, it may have an undesirable effect on PPG sensor readings taken in free living conditions, for example, during exercise. As a result, some existing data analysis methodologies may have drawbacks when used with PPG sensor readings taken in free living conditions.
SUMMARY
According to one embodiment, a device includes a memory that stores machine instructions and a processor coupled to the memory that executes the machine instructions to receive a plurality of feature data points and extract a feature from a feature data point of the plurality of feature data points that satisfy a predetermined range. The processor further executes the machine instructions to perform a plurality of hypothesis tests to determine whether the feature corresponds to each of a plurality of predetermined hypothesis distributions comprising a first hypothesis distribution. If the feature corresponds to the first hypothesis distribution, the processor further executes the machine instructions to qualify the feature as a qualified estimate of an actual feature.
According to another embodiment, a method includes receiving a plurality of feature data points and extracting a feature from a feature data point of the plurality of feature data points that satisfy a predetermined range. The method further includes performing a plurality of hypothesis tests to determine whether or not the feature corresponds to each of a plurality of predetermined hypothesis distributions comprising a first hypothesis distribution. The method also includes qualifying the feature as a qualified estimate of an actual feature if the feature corresponds to the first hypothesis distribution.
According to yet another embodiment, a computer program product includes a non-transitory, computer-readable storage medium encoded with instructions adapted to be executed by a processor to implement receiving a plurality of feature data points and extracting a feature from a feature data point of the plurality of feature data points that satisfy a predetermined range. The instructions are further adapted to implement performing a plurality of hypothesis tests to determine whether or not the feature corresponds to each of a plurality of predetermined hypothesis distributions comprising a first hypothesis distribution. The instructions are also adapted to implement qualifying the feature as a qualified estimate of an actual feature if the feature corresponds to the first hypothesis distribution.
The details of one or more embodiments of the present disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the present disclosure will be apparent from the description and drawings, and from the claims.
DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates a block diagram depicting an exemplary biophysiological periodic data analyzer in accordance with an embodiment.
FIG. 2 illustrates a flowchart of an exemplary method of multiple-model adaptive estimation used to analyze biophysiological periodic data in accordance with an embodiment.
FIG. 3 illustrates a graph depicting exemplary statistical hypotheses for use in performing statistical inference regarding feature data in accordance with an embodiment.
FIG. 4A illustrates a flowchart of an exemplary method of analyzing biophysiological periodic data in accordance with an embodiment.
FIG. 4B illustrates another flowchart of an exemplary method of analyzing biophysiological periodic data in accordance with an embodiment.
FIG. 4C illustrates another flowchart of an exemplary method of analyzing biophysiological periodic data in accordance with an embodiment.
FIG. 5 illustrates a schematic view depicting a computing system that may be employed in a biophysiological periodic data analyzer in accordance with an embodiment.
DETAILED DESCRIPTION
FIG. 1 illustrates a block diagram of an exemplary biophysiological periodic data analyzer, according to one embodiment. A biophysiological periodic data analyzer 10 includes a feature receiver 12, a rate calculator 14, an outlier eliminator 16, a recent rate calculator 18, a rate filter 20, a rate change computer 22, a biosemantic binary qualifier 24, a feature modifier 26, and a filter generator 28. The feature receiver 12 is configured to receive multiple simultaneous data points from various sensors monitoring biophysiological features of a subject, including, but not limited to, a heart rate (HR), a respiration rate, a fluid solution concentration, and a bodily movement. The subject may include, but is not limited to, a person, an animal, or another living organism.
The data points represent a fusion of data from multiple sources, which may be different features derived from the same underlying sensors or from different sensors. For example, the data points include feature data regarding a subject's heart rate and respiration rate observed over time using photoplethysmogram (PPG) sensors, such as pulse oximeters. In one embodiment, the PPG sensor and the biophysiological periodic data analyzer may be embedded in a wearable device that is fastened to a subject, for example, to the subject's head, foot, finger, or wrist.
The feature receiver 12 sorts the monitored feature data points and places the data points in order, for example, feature-by-feature. The feature receiver 12 outputs each ordered data point along with a synchronous time output. The rate calculator 14 uses the most recent data point and a corresponding time output to calculate the current feature rate based on a series of recent data points.
The outlier eliminator 16 determines whether the current feature rate falls within an acceptable range based on a set of predetermined biological limits regarding the feature, for example, minimum and maximum rate limits. A current feature rate that falls outside the acceptable range is not used in further calculations. The recent rate calculator 18 uses a series of current feature rates within the acceptable range during a desired window of time to calculate an updated recent feature rate.
The outlier eliminator 16 imposes constraints on the hypotheses based on biophysiological limits. For example, a minimum limit (‘minHR’) and a maximum limit (‘maxHR’) may be based on the realistic expected range of human heart rates. Similarly, minimum and maximum relative limits (‘+/−deltaHR’) centered around the recently observed heart rate value (uRecent) may be based on physiological limitations regarding the rate of change of the heart rate over the sampling time.
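The absolute and relative limits described above can be illustrated with a minimal Python sketch. The numeric limits and the function name `within_limits` are assumptions chosen only for illustration; the disclosure does not specify particular values.

```python
from typing import Optional

# Illustrative limits in beats per minute. These values are assumptions,
# not values taken from the disclosure.
MIN_HR = 30.0     # 'minHR'
MAX_HR = 220.0    # 'maxHR'
DELTA_HR = 25.0   # '+/-deltaHR' around the recent rate


def within_limits(rate: float, u_recent: Optional[float] = None) -> bool:
    """Return True if rate passes the absolute and relative limits."""
    # Absolute limits: the realistic expected range of human heart rates.
    if not (MIN_HR <= rate <= MAX_HR):
        return False
    # Relative limits: centered on the recently observed rate, if one exists.
    if u_recent is not None and abs(rate - u_recent) > DELTA_HR:
        return False
    return True
```

In this sketch, a rate of 120 bpm passes the absolute limits but fails the relative limits when the recent rate is 72 bpm.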
The rate filter 20 performs statistical calculations on qualified feature data from the biosemantic binary qualifier 24, which is further explained below. FIG. 2 illustrates a flowchart of an exemplary method of multiple-model adaptive estimation (MMAE) used to analyze biophysiological periodic data in accordance with an embodiment. MMAE 30 may be implemented by the rate filter 20 to analyze qualified feature data. In an embodiment, the rate filter 20 includes multiple Kalman filters, each based on a different model. For example, a first Kalman filter 32 is based on a first model, a second Kalman filter 34 is based on a second model, a third Kalman filter 36 is based on a third model, and a fourth Kalman filter 38 is based on a fourth model. Optionally, the statistical calculations may implement weightings attached to the data from each of the input streams, for example, indicating a preference for information from one stream over that of another stream. The rate change computer 22 continuously computes the current rates of change regarding the filtered and unfiltered rates.
The fusion at the hypothesis level follows an approach equivalent to that used in the generic multiple-model adaptive estimation framework, as described in the context of Kalman filters by P. D. Hanlon and P. S. Maybeck in “Multiple-Model Adaptive Estimation Using a Residual Correlation Kalman Filter Bank,” IEEE Transactions on Aerospace and Electronic Systems, Vol. AES-36, No. 2, April 2000, pp. 393-406, the entirety of which is incorporated herein by reference. The Kalman filter estimation involves an estimate and an uncertainty of the state of the system. For instance, in an embodiment, an unscented Kalman filter associated with alternate hypotheses of system behavior is used, which explicitly fits a distribution from deterministic sampling of the input, as described in Simon J. Julier & Jeffrey K. Uhlmann, “A new extension of the Kalman filter to nonlinear systems”, Int. Symp. Aerospace/Defense Sensing, Simul. and Controls, vol. 3, p. 182, 1997, the entirety of which is incorporated herein by reference.
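As a rough illustration of a bank of Kalman filters combined at the hypothesis level, the following Python sketch uses four scalar random-walk filters that differ only in process noise and weights them by the likelihood of their residuals. The class `ScalarKalman`, the function `mmae_step`, the random-walk model, and all noise values are assumptions made for this sketch; it is a simplified linear stand-in, not the residual correlation method or the unscented Kalman filter of the cited references.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ScalarKalman:
    """One-dimensional random-walk Kalman filter (hypothetical model)."""
    x: float  # state estimate, e.g. a rate
    p: float  # variance of the estimate
    q: float  # process noise variance (differs per model)
    r: float  # measurement noise variance

    def step(self, z: float) -> float:
        """Predict, update with z, and return the residual likelihood."""
        p_pred = self.p + self.q   # predict: uncertainty grows
        s = p_pred + self.r        # variance of the residual
        resid = z - self.x
        k = p_pred / s             # Kalman gain
        self.x += k * resid
        self.p = (1.0 - k) * p_pred
        return math.exp(-0.5 * resid * resid / s) / math.sqrt(2.0 * math.pi * s)


def mmae_step(
    filters: Sequence[ScalarKalman], weights: Sequence[float], z: float
) -> Tuple[float, List[float]]:
    """Update every filter with z; return (fused estimate, new weights)."""
    likelihoods = [f.step(z) for f in filters]
    posterior = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(posterior)
    if total <= 0.0:
        # Every likelihood underflowed; keep the previous weights.
        new_weights = list(weights)
    else:
        new_weights = [p / total for p in posterior]
    estimate = sum(w * f.x for w, f in zip(new_weights, filters))
    return estimate, new_weights


# Example: four models that differ only in assumed process noise.
bank = [ScalarKalman(x=70.0, p=4.0, q=q, r=4.0) for q in (0.01, 0.1, 1.0, 10.0)]
weights = [0.25] * 4
for z in (71.0, 72.5, 71.8):
    estimate, weights = mmae_step(bank, weights, z)
```

Each filter carries both an estimate and an uncertainty, and the weights shift toward whichever model best explains the recent residuals.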
The biosemantic binary qualifier 24 determines qualified data, or qualifies data, using a binary selection criterion for each input feature, based on compatibility with learned probabilistic models (many methods for model development are possible). The binary selection approach handles input data even when there is a large fraction of anomalies, or uncertainty, in the feature data. The biosemantic binary qualifier 24 includes, for example, a maximum likelihood decision engine. The biosemantic binary qualifier 24 produces qualified data as output.
In an embodiment, the biosemantic binary qualifier 24 uses the recent rate along with the filtered and unfiltered rates of change to perform a hypothesis testing method 40. Multiple hypothetical models are considered for each observed data point, and the decision to accept the point is made based on a decision rule for each hypothesis. The model hypotheses incorporate limits on both the rates of change and the hard limits on the values of the inputs, grounded in biophysiological constraints. Each hypothesis transforms the input feature differently, depending on the nature of the hypothesis.
FIG. 3 illustrates a graph depicting exemplary statistical hypotheses for use in performing statistical inference regarding feature data in accordance with an embodiment. A graph 50 illustrates various exemplary test hypotheses. Based on the window statistics with respect to a particular time window, such as the mean and standard deviation of the windowed rates, multiple hypothetical probability models are trained, or developed. In an embodiment, the test hypotheses consist of discrete expected probability distributions, for example, including a recent distribution 52, a trial distribution 54, and an artifact distribution 56.
Referring to FIG. 3, the decision question is presented: “Should a new beat 58 be accepted as a legitimate heart beat?” Three exemplary hypotheses have been developed with respect to the heart rate (HR), as follows: A first hypothesis, the recent distribution 52, presumes the measured input feature is consistent with the recently observed heart rate. A second hypothesis, the trial distribution 54, presumes the measured input feature has been corrupted and is consistent with one-half the recently observed heart rate. The second hypothesis is related to a specific sort of signal corruption that gives an accurate estimate of one-half the heart rate, which is grossly inaccurate for the true rate. A third hypothesis, the artifact distribution 56, presumes the measured input feature has been corrupted and is consistent with an artifact that is unrelated to the true heart rate. In other embodiments, additional hypotheses may be included, for example, based on characteristics of the input data stream.
The biosemantic binary qualifier 24 tests each of the hypotheses on the basis of a probabilistic test. For instance, in the case of the first hypothesis type described, both the recent distribution 52 and the candidate point 58 are available. Therefore, the a posteriori likelihood of the point being derived from the distribution is computed and used to represent the a posteriori likelihood of the associated hypothesis.
Each hypothesis is considered independently, on the basis of its own test against a null hypothesis. For instance, a hypothesis is accepted based on exceeding a threshold in a log-likelihood ratio test, or on exceeding a threshold with respect to the affinity to the distribution associated with the hypothesis. Following this, all hypotheses that overcome the null hypothesis are ranked based on an a priori ranking among hypotheses, and the highest-ranked hypothesis is selected. This has the advantage that diverse hypothesis types may be considered: some with an explicit probability model for which likelihood may be computed, and others using logical triggers for which no explicit probability model exists.
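The independent tests and a priori ranking described above might be sketched in Python as follows. The Gaussian distributions, the broad null distribution, the threshold of zero, the logical trigger, and the names `llr_test`, `select_by_rank`, and `HYPOTHESES` are all assumptions introduced for illustration.

```python
import math
from typing import Callable, Optional, Sequence, Tuple


def gaussian_log_likelihood(x: float, mean: float, std: float) -> float:
    """Log density of x under a normal distribution N(mean, std^2)."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def llr_test(mean: float, std: float, null_mean: float, null_std: float,
             threshold: float) -> Callable[[float], bool]:
    """Build a log-likelihood ratio test against a null distribution."""
    def passes(x: float) -> bool:
        llr = (gaussian_log_likelihood(x, mean, std)
               - gaussian_log_likelihood(x, null_mean, null_std))
        return llr > threshold
    return passes


def select_by_rank(
    x: float, hypotheses: Sequence[Tuple[str, Callable[[float], bool]]]
) -> Optional[str]:
    """Return the highest-ranked hypothesis whose own test passes.

    The sequence order encodes the a priori ranking, highest rank first.
    """
    for name, passes in hypotheses:
        if passes(x):
            return name
    return None  # no hypothesis overcame the null hypothesis


# Hypothetical ranking: two probabilistic tests and one logical trigger.
HYPOTHESES = [
    ("recent", llr_test(72.0, 3.0, 100.0, 40.0, 0.0)),
    ("half_rate", llr_test(36.0, 1.5, 100.0, 40.0, 0.0)),
    ("saturated", lambda x: x >= 250.0),  # trigger with no probability model
]
```

Because each test returns only pass or fail, hypotheses with and without explicit probability models can share one ranked list.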
Thus, these statistics are combined among the different data sources, and then applied across each of the hypotheses. Alternatively, separate statistics may be calculated associated with each data type and these may be selectively attached to different hypotheses.
In an alternate embodiment in which all of the hypotheses have explicit probabilities, the hypothesis selection may then proceed by computing the relative likelihood of each hypothesis and selecting the most likely hypothesis as being correct. This triggers certain logic, as described below, to either accept or reject the candidate point.
For example, the feature data point may be accepted as measured, based on a relatively high correlation to the hypothesis associated with the recent distribution 52. Otherwise, the feature modifier 26 may modify the feature data point before it is accepted, for example, based on a relatively high correlation to the hypothesis associated with the trial distribution 54. On the other hand, the feature data point may be dropped from the output stream, based on a relatively high correlation to the hypothesis associated with the artifact distribution 56.
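A minimal Python sketch of maximum likelihood selection followed by the accept, modify, or drop logic is given below. It assumes Gaussian recent and trial distributions, a uniform artifact distribution over an illustrative physiological range, and a modification that doubles the rate under the half-rate hypothesis; none of these specifics is stated in the disclosure, and the function name `qualify` is hypothetical.

```python
import math
from typing import Optional, Tuple


def normal_pdf(x: float, mean: float, std: float) -> float:
    """Density of x under a normal distribution N(mean, std^2)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def qualify(rate: float, recent_mean: float, recent_std: float,
            min_hr: float = 30.0, max_hr: float = 220.0
            ) -> Tuple[str, Optional[float]]:
    """Select the most likely hypothesis; return (hypothesis, output rate)."""
    likelihoods = {
        # Consistent with the recently observed rate.
        "recent": normal_pdf(rate, recent_mean, recent_std),
        # Consistent with one-half the recently observed rate.
        "trial": normal_pdf(rate, 0.5 * recent_mean, 0.5 * recent_std),
        # Unrelated artifact: assumed uniform over the allowed range.
        "artifact": 1.0 / (max_hr - min_hr),
    }
    best = max(likelihoods, key=likelihoods.get)
    if best == "recent":
        return best, rate        # accept as measured
    if best == "trial":
        return best, 2.0 * rate  # assumed modification: undo the halving
    return best, None            # drop from the output stream
```

With a recent mean of 72 bpm and a standard deviation of 3 bpm, a measurement of 36.5 bpm is attributed to the trial hypothesis and emitted as 73 bpm, while a measurement of 120 bpm is dropped as an artifact.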
The filter generator 28 updates the rate filter 20 and provides feedback to the biosemantic binary qualifier 24 to develop the model hypotheses. The model hypotheses are stochastic processes, which calculate the increases in uncertainty associated with the time-sensitivity of information gathered. If no recent feature data has been explained, the uncertainty grows. In an embodiment, the statistics calculation implements, for example, a Langevin correction. This modifies the probability model to account for the time value of data by growing the model variance with the time gap period. In an embodiment, the Langevin model, which is based on physical models of Brownian motion, grows the model variance linearly with time.
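The linear variance growth can be expressed in a few lines of Python. The function name `grown_variance` and the `diffusion` parameter (variance added per second of gap) are assumptions for this sketch; the disclosure states only that the variance grows linearly with time.

```python
def grown_variance(base_variance: float, gap_seconds: float,
                   diffusion: float) -> float:
    """Grow the model variance linearly with the time gap.

    diffusion is an assumed tuning parameter: variance added per second
    elapsed since the last qualified feature.
    """
    if gap_seconds < 0.0:
        raise ValueError("gap_seconds must be non-negative")
    return base_variance + diffusion * gap_seconds
```

For example, a variance of 9.0 with a diffusion of 0.5 per second becomes 14.0 after a ten-second gap, so a later candidate point is tested against a wider distribution.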
FIGS. 4A through 4C illustrate flowcharts of an exemplary method of analyzing biophysiological periodic data in accordance with an embodiment. Examples of biophysiological periodic data that may be analyzed using the present method described in this disclosure include, for example, a heart rate (HR), a respiration rate, a fluid solution concentration, and a bodily movement. The present method processes one or more streams of feature data regarding a biophysiological feature over time and outputs a single stream of qualified data.
Referring to FIG. 4A, input data tracks 62, 64, and 65 are fed in order, feature-by-feature, at 60. In one embodiment, the features may include, for example, the interbeat interval of a heart, a respiration rate, a step rate, and any other periodic signal from a biophysiological sensor. A feature data stream is separated into a sensed event at 68, and a corresponding time at 70. The output time at 70 is presented to a process that continues at FIG. 4B, and the output rate and/or output trial rate at 72 is presented to processes that continue at FIGS. 4B and 4C. At 72, a current rate (thisRate) associated with the sensed event and a trial rate (trialRate) associated with a statistical hypothesis are each calculated based on the event at 68.
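For an interbeat interval feature, the calculation at 72 might look like the following Python sketch. The conversion of an interval in seconds to beats per minute is standard; taking the trial rate as twice the current rate, to correspond to the half-rate hypothesis, is an assumption, as is the function name `rates_from_interval`.

```python
from typing import Tuple


def rates_from_interval(ibi_seconds: float) -> Tuple[float, float]:
    """Return (this_rate, trial_rate) in beats per minute."""
    if ibi_seconds <= 0.0:
        raise ValueError("interbeat interval must be positive")
    this_rate = 60.0 / ibi_seconds
    # Assumption: the trial rate is the rate implied if the measurement
    # captured only one-half of the true heart rate.
    trial_rate = 2.0 * this_rate
    return this_rate, trial_rate
```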
A set of fixed, or absolute, biophysiological limits regarding the features are received at 74, and a determination is made at 76 regarding whether the rate and/or trial rate at 72 fall within an acceptable range defined by the biophysiological limits. If the rate and/or trial rate at 72 are found to be within the acceptable range at 76, the process continues at 80 of FIG. 4B. Otherwise, the rate and trial rate at 72 that fall outside the acceptable range are discarded at 78. The biophysiological limits are forwarded to the process at 80 of FIG. 4B.
Referring to FIG. 4B, if the rate and/or trial rate at 72 are found to be within the acceptable range at 76, the recent rate based on statistics over a trailing window of time is updated at 80, based on the rate at 72 and the time at 70 in FIG. 4A. Data points that fall outside the acceptable range at 76 of FIG. 4A are trimmed from the input to the recent rate. At 82, the current rate of change of the rate of block 72 is computed, resulting in a delta rate (deltaRate) at 84. The recent rate calculated over a fixed window of time is stored in a buffer, at 86.
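The trailing-window recent rate and the delta rate can be sketched in Python as follows. The class name `RecentRate`, the ten-second window, the use of a mean as the window statistic, and the finite-difference form of the delta rate are assumptions for illustration.

```python
from collections import deque
from typing import Deque, Tuple


class RecentRate:
    """Track in-range rates over a trailing time window (hypothetical)."""

    def __init__(self, window_seconds: float = 10.0) -> None:
        self.window_seconds = window_seconds
        self.samples: Deque[Tuple[float, float]] = deque()  # (time, rate)

    def update(self, time: float, rate: float) -> Tuple[float, float]:
        """Add a sample; return (recent_rate, delta_rate)."""
        delta_rate = 0.0
        if self.samples:
            prev_time, prev_rate = self.samples[-1]
            if time > prev_time:
                # Rate of change of the rate since the previous sample.
                delta_rate = (rate - prev_rate) / (time - prev_time)
        self.samples.append((time, rate))
        # Drop samples that have aged out of the trailing window.
        while self.samples[0][0] < time - self.window_seconds:
            self.samples.popleft()
        recent = sum(r for _, r in self.samples) / len(self.samples)
        return recent, delta_rate
```

Only rates that have already passed the range check would be passed to `update`, so out-of-range points never enter the window.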
In addition to the absolute limits applied at 76, the present method also detects conditions in which limits on the allowable rate of change have been exceeded. A dynamic limit, such as a confidence interval, is computed from the statistics of the recent time window. For example, a ninety-percent confidence interval, a ninety-two-percent confidence interval, or a ninety-five-percent confidence interval is applied based on a probabilistic model fit with respect to the previous window.
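One way to compute such a dynamic limit, assuming a normal model fit to the previous window, is shown in the Python sketch below. The function name `dynamic_limits` and the choice of a normal model are assumptions; the standard library `statistics.NormalDist` supplies the quantile.

```python
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def dynamic_limits(window_rates: Sequence[float],
                   confidence: float = 0.95) -> Tuple[float, float]:
    """Return (lower, upper) limits from a normal fit to the window.

    Requires at least two rates so that a standard deviation exists.
    """
    mu = mean(window_rates)
    sigma = stdev(window_rates)
    # Two-sided interval: split the excluded probability between tails.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mu - z * sigma, mu + z * sigma
```

A higher confidence level yields a wider interval, so a ninety-five-percent limit rejects fewer candidate rates than a ninety-percent limit.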
Statistical feedback data from FIG. 4C is used to modify the recent rate filter (recentRateFilt), which is calculated over a time window and stored in a buffer 88 as illustrated in FIG. 4B. For example, the recent rate filter includes multiple Kalman filters, as described above. The data fusion among the different streams entering at the top of the block diagram of FIG. 4A is managed in the calculation of statistics in the recent window at 88. Referring to FIG. 4B, at 90, the current rates of change of the recent rate filter at 88 and the trial rate at 72 are computed, resulting in a delta rate (deltaRateFilt) at 92.
Statistical hypothesis testing and data fusion are performed at 94, for example, by a maximum likelihood decision engine (biosemBinaryQualifier, or BBQ), to determine the event type based on the biophysiological limits at 74, the recent rate at 86, the delta rate at 84, the filter delta rate and the trial delta filter rate at 92, and statistical feedback data at 112 from FIG. 4C. The resultant event type at 96 is forwarded to the process at FIG. 4C.
Referring to FIG. 4C, based on the event type at 96 in FIG. 4B, decision logic at 100 determines the hypothesis category, for example, type 0, type 1, or type 2. In an embodiment, the decision rule (decision logic) may be framed as a question, for example, “Should a newly observed feature (beat) be accepted as legitimate?” The question may be answered probabilistically, for example based on whether the feature lies within a certain confidence interval of each of the hypotheses, or alternatively by computing the chi-squared statistics associated with each of the hypotheses.
If the event type at 96 is determined to belong to a hypothesis category, type 0, no further processing is performed regarding the event type at 102. If the event type 96 is determined to belong to a hypothesis category, type 1, the feature is passed along without modification at 104. If the event type 96 is determined to belong to the category, type 2, the feature is modified according to a suitable model at 106.
At 108, the feature outputs at 104 and 106 are combined with the time at 70 of FIG. 4A to produce a qualified feature with a timestamp. The result for each timestamp is sent as an output at 110, for example, including a post-qualified feature and the corresponding hypothesis category or type. Optionally, a corresponding weight may be included in the output.
In addition, in an alternative embodiment, the final result may be temporally smoothed to improve the precision, albeit at the expense of responsiveness. For example, the feature stream may be estimated using various data smoothing approaches including, for example, a boxcar moving average filter, an exponential moving average filter, or the like. For example, the qualified feature stream and the smoothed feature stream provide two estimates of the true heart rate of a subject over time based on the measured heart rate data represented by the feature data streams.
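The two smoothing approaches named above can be sketched in Python as follows. The function names, the window width, and the smoothing factor `alpha` are illustrative assumptions.

```python
from collections import deque
from typing import Deque, Iterable, List, Optional


def boxcar(values: Iterable[float], width: int) -> List[float]:
    """Boxcar moving average over the last `width` values."""
    window: Deque[float] = deque(maxlen=width)
    out = []
    for v in values:
        window.append(v)  # the oldest value drops out automatically
        out.append(sum(window) / len(window))
    return out


def ema(values: Iterable[float], alpha: float) -> List[float]:
    """Exponential moving average; alpha in (0, 1] weights new values."""
    out = []
    state: Optional[float] = None
    for v in values:
        state = v if state is None else alpha * v + (1.0 - alpha) * state
        out.append(state)
    return out
```

A wider boxcar or a smaller `alpha` gives a smoother but slower estimate, which is the trade-off between precision and responsiveness noted above.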
Statistical data is computed based on the qualified feature with regard to a corresponding window of time at 112, and the filter criteria are developed to update the recent rate filter at 88 in FIG. 4B. For example, a Langevin correction is made for time gaps in the data streams. In an embodiment, all of the required filtering criteria are determined at 112. A corollary output is sent to a buffer at 114, for example, including statistics such as the qualified feature mean and standard deviation with respect to the time window corresponding to each timestamp. The windowed statistics may be used, for example, to produce a confidence measure on the output qualified feature stream.
As illustrated in FIG. 5, an exemplary computing device 120 that may be employed in the biophysiological periodic data analyzer 10 of FIG. 1 includes a processor 122, a memory 124, an input/output device (I/O) 126, storage 128, and a network interface 130. The various components of the computing device 120 are coupled by a local data link 132, which in various embodiments incorporates, for example, an address bus, a data bus, a serial bus, a parallel bus, or any combination of these.
The computing device 120 may be used, for example, to implement the method of analyzing biophysiological periodic data of FIG. 1. Programming code, such as source code, object code or executable code, stored on a computer-readable medium, such as the storage 128 or a peripheral storage component coupled to the computing device 120, may be loaded into the memory 124 and executed by the processor 122 in order to perform the functions of the method of analyzing biophysiological periodic data of FIG. 1.
Aspects of this disclosure are described herein with reference to flowchart illustrations or block diagrams, in which each block or any combination of blocks may be implemented by computer program instructions. The instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to effectuate a machine or article of manufacture, and when executed by the processor the instructions create means for implementing the functions, acts or events specified in each block or combination of blocks in the diagrams.
In this regard, each block in the flowchart or block diagrams may correspond to a module, segment, or portion of code that includes one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functionality associated with any block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or blocks may sometimes be executed in reverse order.
A person of ordinary skill in the art will appreciate that aspects of this disclosure may be embodied as a device, system, method or computer program product. Accordingly, aspects of this disclosure, generally referred to herein as circuits, modules, components or systems, may be embodied in hardware, in software (including firmware, resident software, micro-code, etc.), or in any combination of software and hardware, including computer program products embodied in a computer-readable medium having computer-readable program code embodied thereon.
It will be understood that various modifications may be made. For example, useful results still could be achieved if steps of the disclosed techniques were performed in a different order, and/or if components in the disclosed systems were combined in a different manner and/or replaced or supplemented by other components. Accordingly, other implementations are within the scope of the following claims.