FIELD OF THE INVENTION The present invention relates generally to the diagnosis of aspiration and more particularly relates to an apparatus and method for detecting swallowing and related activity.
BACKGROUND OF THE INVENTION Dysphagia refers to any deglutition (swallowing) disorder, including abnormalities within the oral, pharyngeal and esophageal phases of swallowing. Dysphagia is common in individuals with neurological impairment, due to, for example, cerebral palsy, cerebrovascular accident, brain injury, Parkinson's disease, stroke and multiple sclerosis. Individuals with dysphagia are often at risk of aspiration. Aspiration refers to the entry of foreign material into the airway during inspiration. Aspiration may manifest itself in a number of different ways. The individual may begin to perspire and the face may become flushed. Alternatively, the individual may cough subsequent to swallowing. In silent aspiration, there are no overt clinical or easily recognizable signs of bolus inhalation. The invention is particularly useful for individuals with silent aspiration, but is also applicable for other manifestations of aspiration. Aspiration bears serious health consequences such as chronic lung disease, aspiration pneumonia, dehydration, and malnutrition.
Dysphagia afflicts an estimated fifteen million people in the United States. Certain sources indicate that fifty thousand people die each year from aspiration pneumonia (Dray et al., 1998). The occurrence of diffuse aspiration bronchiolitis in patients with dysphagia is not uncommon, regardless of age (Matsuse et al., 1998). Silent aspiration is especially prominent in children with dysphagia, occurring in an estimated 94% of that population (Arvedson et al., 1994). Half of stroke survivors have swallowing difficulties (Zorowitz & Robinson, 1999), which translates to 500,000 people per year in the United States (Broniatowski et al., 2001), and aspiration is reported in 75% of these cases while 32% report chest infections (Perry & Love, 2001). The incidence of dysphagia is particularly significant in acute care settings (25-45%), chronic care units (50%) (Finiels et al., 2001) and homes for the aged (68%) (Steele et al., 1997). Dysphagia tremendously diminishes quality of life for people of all ages, compromising not only medical, but also social, emotional and psychosocial well-being.
The modified barium swallow using videofluoroscopy is the current gold standard for confirmation of aspiration (Wright et al., 1996). Its clinical utility in dysphagia management continues to be asserted (e.g., Martin-Harris, 2000; Scott et al., 1998). The patient ingests barium-coated material and a video sequence of radiographic images is obtained via X-radiation. The modified barium swallow procedure is invasive and costly both in terms of time and labor (approximately 1,000 health care dollars per procedure in Canada), and renders the patient susceptible to the effects of ionizing radiation (Beck & Gayler, 1991).
Fibreoptic endoscopy, another invasive technique in which a flexible endoscope is inserted transnasally into the hypopharynx, has also been applied in the diagnosis of post-operative aspiration (Brehmer & Laubert, 1999) and bedside identification of silent aspiration (Leder et al., 1998). Fibreoptic endoscopy is generally comparable to the modified barium swallow in terms of sensitivity and specificity for aspiration identification (e.g., Madden et al., 2000; Leder & Karas, 2000), with the advantage of bedside assessment.
Pulse oximetry has been proposed as a non-invasive adjunct to bedside assessment of aspiration (e.g., Sherman et al., 1999; Lim et al., 2001). However, several controlled studies comparing pulse oximetric data to videofluoroscopic (Sellars et al., 1998) and fiberoptic endoscopic evaluation (Leder, 2000; Colodny, 2000) have raised doubts about the existence of a relationship between arterial oxygen saturation and the occurrence of aspiration.
Cervical auscultation involves listening to the breath sounds near the larynx by way of a laryngeal microphone, stethoscope or accelerometer (Zenner et al., 1995) placed on the neck. It is generally recognized as a limited but valuable tool for aspiration detection and dysphagia assessment in long-term care (Zenner et al., 1995; Cichero & Murdoch, 2002; Stroud et al., 2002). However, when considered against the gold standard of videofluoroscopy, bedside evaluation even with cervical auscultation yields limited accuracy (40-60%) in detecting aspirations (Sherman et al., 1999; Selina et al., 2001; Sellars et al., 1998). Indeed, our recent research shows that aspirations identified by clinicians using cervical auscultation represent only a quarter of all aspirations (Chau, Casas, Berall & Kenny, submitted).
Swallowing accelerometry (Reddy et al., 2000) is closely related to cervical auscultation, but has entailed digital signal processing and artificial intelligence as discrimination tools, rather than trained clinicians. In clinical studies, accelerometry has demonstrated moderate agreement with videofluoroscopy in identifying aspiration risk (Reddy et al., 1994), whereas the signal magnitude has been linked to the extent of laryngeal elevation (Reddy et al., 2000). Recently, fuzzy committee neural networks have demonstrated extremely high accuracy at classifying normal and “dysphagic” swallows (Das et al., 2001). However, prior art swallowing accelerometry only provides limited information in classifying normal from “dysphagic” swallows and does not provide broader information about the clinical status of the patient.
Administration of videofluoroscopy or nasal endoscopy requires expensive equipment and trained professionals such as a radiologist, otolaryngologist or speech-language pathologist (Sonies, 1994). Invasive procedures are not well-tolerated by children and cannot be practically administered for extended periods of feeding. There is a need for an economical, non-invasive and portable method of aspiration detection, for use at the bedside and outside of the institutional setting.
SUMMARY OF THE INVENTION It is an object of the present invention to provide a novel apparatus and method for detecting swallowing activity that obviates or mitigates at least one of the above-identified disadvantages of the prior art.
An aspect of the invention provides a method for detecting swallowing activity comprising the steps of:
- receiving an electronic signal representing swallowing activity;
- extracting at least two features from the signal;
- classifying the signal as a type of swallowing activity based on the features; and,
- generating an output representing the classification.
The electronic signal can be generated by an accelerometer. The features can include at least one of stationarity, normality and dispersion ratio. The classifying step can be performed using a radial basis neural network.
The swallowing activity can include at least one of a swallow and an aspiration.
The extracting step can include stationarity as one of the features, the extracting step of stationarity including the following sub-steps:
- dividing the signal into a plurality of non-overlapping bins;
- determining a total number of reverse arrangements (A_Total) in a mean square sequence;
- extracting the stationarity feature (z), determined according to the following equation:
$$z = \frac{A_{Total} - \mu_A}{\sigma_A}$$
- where:
- μ_A is the mean number of reverse arrangements expected for a stationary signal of the same length.
- σ_A is the standard deviation for an equal length stationary signal.
Each of the bins can be between about one ms and about nine ms in length. Each of the bins can be between about three ms and about seven ms in length. Each of the bins can be about five milliseconds (“ms”) in length.
The extracting step can include normality as one of the features, the extracting step of normality including the following sub-steps:
- standardizing the signal to have zero mean and unit variance (“s”);
- dividing the standardized signal into a plurality of bins (“I”) each of about 0.4 Volts, where
$$I = \frac{\max(s) - \min(s)}{0.4}$$
and wherein a highest bin extends to infinity and a lowest bin extends to negative infinity;
- determining observed frequencies (“n”) for each bin by counting the number of samples in the standardized signal (“s”) that fall within each bin;
- determining expected frequencies (“m̂”) for each bin under the assumption of normality, and determining a Chi-square statistic (X̂²) using the following:
$$\hat{X}^2 = \sum_{i=1}^{I} \frac{(n_i - \hat{m}_i)^2}{\hat{m}_i}$$
- determining the normality feature using the following:
$$\log_{10}(\hat{X}^2)$$
The extracting step can include a dispersion ratio as one of the features, the extracting step of dispersion ratio including the following sub-steps:
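- determining a mean absolute deviation, S1, of the signal according to the following, where x_i are the n samples of the signal and med(x) is their median:
$$S_1 = \frac{1}{n}\sum_{i=1}^{n} \left| x_i - \mathrm{med}(x) \right|$$
- determining an interquartile range, S2, of the signal;
- extracting the dispersion ratio according to the following:
$$\frac{S_1}{S_2}$$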
Another aspect of the invention provides a device for detecting swallowing activity comprising an input device for receiving an electronic signal from a sensor. The electronic signal can represent swallowing activity. The device also comprises a microcomputer connected to the input device that is operable to extract at least two features from the signal. The microcomputer is further operable to classify the signal as a type of swallowing activity based on the features. The device also includes an output device connected to the microcomputer for generating an output representing the classification.
BRIEF DESCRIPTION OF THE DRAWINGS The invention will now be described by way of example only, and with reference to the accompanying drawings, in which:
FIG. 1 is a schematic representation of an apparatus for detecting swallowing activity in accordance with an embodiment of the invention;
FIG. 2 is a flow chart depicting a method for detecting swallowing activity in accordance with another embodiment of the invention;
FIG. 3 is a set of graphs showing exemplary signals that can be detected using the apparatus in FIG. 1; and
FIG. 4 is a graph showing exemplary output that can be generated by the method in FIG. 2.
DETAILED DESCRIPTION OF THE INVENTION As used herein the terms “swallow” and “penetration” are distinguished from the term “aspiration”. As used herein, a “swallow” is the safe passage of foodstuffs from the oral cavity, through the hypopharynx and into the esophagus. Further, a swallow is accompanied by a period of apnea with no entry of foodstuffs into the protected airway. “Penetration” is the entry of foreign material into the airway but not accompanied by inspiration. However, an “aspiration” is the entry of foreign material into the airway during inspiration. As used in relation to the embodiments discussed below, the term “swallowing activity” means a swallow or an aspiration or the absence of either, but in other embodiments “swallowing activity” can refer to other types of activities including penetration.
Referring now to FIG. 1, an apparatus for detecting swallowing activity is indicated generally at 30. Apparatus 30 includes an accelerometer 34 that is positioned on the throat of a patient 38. In a present embodiment, accelerometer 34 is placed infero-anterior to the thyroid notch, so that the axis of the accelerometer 34 is aligned to measure anterior-posterior vibrations. Apparatus 30 also includes a computing device 42 that is connected to accelerometer 34 via a link 46. Link 46 can be wired or wireless as desired, corresponding to appropriate interfaces on accelerometer 34 and device 42. Apparatus 30 is operable to receive acceleration signals from accelerometer 34 that reflect swallowing activity in patient 38.
In a present embodiment, accelerometer 34 is the EMT 25-C single axis accelerometer from Siemens Canada, Mississauga, Ontario, Canada (“EMT 25-C”). Other accelerometers that can be used will occur to those of skill in the art.
In a present embodiment, computing device 42 is based on the computing environment and functionality of a personal digital assistant that includes a chassis 50 that frames a display 54 for presenting user output and a plurality of keys 58 for receiving user input. Computing device 42 thus includes an interface to allow device 42 to connect to accelerometer 34 via link 46. Computing device 42 also includes any suitable arrangement of microprocessor, random access memory, non-volatile storage, operating system, etc. As will be explained in greater detail below, computing device 42 is operable to receive signals from accelerometer 34, to detect swallowing activity from such signals, and to report on those activities by presenting output on display 54.
In order to help explain certain of these implementations and various other aspects of apparatus 30, reference will now be made to FIG. 2, which shows a method for detecting swallowing activity and which is indicated generally at 200. However, it is to be understood that apparatus 30 and/or method 200 can be varied, and need not work exactly as discussed herein in conjunction with each other, and that such variations are within the scope of the present invention.
Beginning first at step 210, signals representing swallowing activity are received. When method 200 is implemented using apparatus 30, step 210 refers to the generation of electrical signals by accelerometer 34 and the receipt of those signals at computing device 42. The use of accelerometer 34 means that acceleration signals representing the swallowing activity of patient 38 are received, and due to the unique characteristics of the EMT 25-C accelerometer used in the present embodiment, unique features can be found in the appearance of those signals. FIG. 3 shows examples of signals that can be received using EMT 25-C, indicated generally at 300, and specifically at 304, 308 and 312. Speaking in very general terms, signal 304 is an example of typical paediatric aspiration signals that portray weak or wide-sense stationarity; signal 308 is an aspiration signal that portrays nonstationarity due to evolving variance; and signal 312 is an aspiration signal that portrays nonstationarity due to time-varying frequency and variance structure.
However, it is to be understood that signals 300 are simply raw data, and can represent aspirations or swallows or motion artifact. It has been determined by the inventors that the distribution of median acceleration magnitude is right-skewed for both aspirations and swallows. Due to the skewness of the distribution, a gamma distribution is used to estimate the spread and location parameters within signals 300. In particular, the spread a and location b parameters of the gamma distributions for aspirations and swallows that can be associated with signals such as signals 300 are summarized in Table I.
TABLE I
Location and spread of signal accelerations

| Parameter | Aspirations: maximum likelihood estimate | Aspirations: 95% confidence interval | Swallows: maximum likelihood estimate | Swallows: 95% confidence interval |
| --- | --- | --- | --- | --- |
| Spread (a) | 1.3647 g | [0.9343, 1.7952] | 3.642 g | [2.2713, 5.0128] |
| Location (b) | 1.176 g | [0.732, 1.62] | 0.063 g | [0.041, 0.086] |
The stationarity and normality characteristics of signals 300 are summarized in Table II. Stationarity is measured by the nonparametric reverse arrangements test, while normality is measured by a chi-squared distribution-based test of histogram bin counts. Further details about stationarity and normality can be found in “Random Data: Analysis and Measurement Procedures”, 3rd Edition, Julius S. Bendat and Allan G. Piersol, John Wiley & Sons Inc., (c) 2000, New York (“Bendat”), the contents of which are incorporated herein by reference. Chapter 10 of Bendat discusses tests for stationarity, while Chapter 4 of Bendat discusses normality.
Table II thus shows a very general, exemplary summary of how aspirations and swallows can correspond to the stationarity and normality characteristics of received signals such as signals 300.
TABLE II
Stationarity and normality characteristics of signal accelerations

| Characteristic | Aspirations | Swallows |
| --- | --- | --- |
| Stationarity | 41% not stationary | 46% not stationary |
| Normality | 90% violating normality | 100% violating normality |
Due to the skewness of the distributions of the bandwidths, a gamma distribution is used to determine the location estimate. The frequency bandwidths can be calculated using a discrete wavelet decomposition at ten levels and determining the level at which the cumulative energy (starting from the final level of decomposition) exceeds 85% of the total energy. This determines the 85% bandwidth for the signal in question.
The location estimate of the about 85% frequency bandwidth can be between about 700 Hz and about 1100 Hz for aspiration signals, and more preferably can be between about 900 Hz and about 950 Hz, and even more preferably between about 910 Hz and about 940 Hz, and still further preferably about 928 Hz for aspiration signals.
The location estimate of the about 85% frequency bandwidth can be between about 400 Hz and about 700 Hz for swallow signals, and more preferably can be between about 500 Hz and about 650 Hz, and even more preferably between about 590 Hz and about 630 Hz, and still further preferably about 613 Hz for swallows.
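By way of illustration only, the 85% bandwidth calculation can be sketched in Python as follows. The sketch rests on several assumptions that are not taken from the description: a Haar wavelet is used because the wavelet family is not specified; a sampling rate of 10 kHz is inferred from about 12,000 samples corresponding to about 1.2 s; the energy of the final approximation is counted together with the final detail level; and the bandwidth is reported as the upper frequency edge (fs / 2^level) of the level at which the threshold is crossed. The names `haar_step` and `bandwidth_85` are hypothetical.

```python
import math


def haar_step(x):
    """One level of a Haar wavelet transform: (approximation, detail)."""
    if len(x) % 2:
        x = x + [x[-1]]  # assumption: pad odd-length input by repeating the last sample
    r = math.sqrt(2)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]
    return approx, detail


def bandwidth_85(signal, fs=10_000, levels=10, fraction=0.85):
    """Frequency (Hz) below which more than `fraction` of the wavelet energy lies."""
    approx = list(signal)
    detail_energy = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        detail_energy.append(sum(d * d for d in detail))
    approx_energy = sum(a * a for a in approx)
    total = approx_energy + sum(detail_energy)
    # Accumulate energy from the final (coarsest) level towards the first (finest).
    cumulative = approx_energy
    for level in range(levels, 0, -1):
        cumulative += detail_energy[level - 1]
        if cumulative > fraction * total:
            return fs / 2 ** level  # assumed upper band edge of this level
    return fs / 2  # reached only for an all-zero signal
```

A signal alternating between +1 and -1 carries all of its energy in the first (finest) level, so the sketch reports fs / 2; a constant signal carries all of its energy in the final approximation, so it reports fs / 2^10.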
Having received signals at step 210, method 200 advances to step 220. At step 220, a determination is made as to whether an event is present inside the signals received at step 210. The criteria for making such a determination are not particularly limited. In a present embodiment, when computing device 42 receives a signal magnitude from accelerometer 34 that exceeds an “on” threshold (in a present embodiment of about 0.025 Volts (“V”)) for a pre-determined “onset” period (in a present embodiment about thirty milliseconds (“ms”)), event initiation is identified and signal recording begins. The next about 12,000 samples are recorded, corresponding to about 1.2 seconds (“s”) of data. Back-trimming is then performed to determine when the signal activity substantially ceased. Such back-trimming involves counting the number of data samples below about 0.05 V, starting from the end of the recording. Once this count exceeds about thirty data points, the end of the useful signal is deemed to have been identified and the end of the signal is trimmed therefrom. In a present embodiment, about 12,000 samples are recorded, but about 15,000 samples (i.e. about 1.5 s of above-threshold signal activity) can also be recorded for analysis as a single signal. In other embodiments other numbers of samples can be recorded, as desired. If the foregoing criteria are not met, then it is determined at step 220 that an event has not occurred and method 200 returns to step 210. However, if the criteria are met then method 200 advances from step 220 to step 230, and the signals that are recorded at step 220 are retained for use at step 230.
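By way of illustration only, the event-initiation criterion of step 220 can be sketched in Python as follows. The sketch assumes a sampling rate of 10 kHz (inferred from about 12,000 samples corresponding to about 1.2 s), assumes that the “on” threshold must be exceeded by every consecutive sample throughout the onset period, and uses the hypothetical name `find_event`. Back-trimming is not sketched here.

```python
def find_event(samples, fs=10_000, on_threshold=0.025, onset_ms=30,
               record_len=12_000):
    """Return the samples recorded after event initiation, or None if no event.

    Event initiation is declared once the signal magnitude has stayed above
    `on_threshold` (Volts) for `onset_ms` milliseconds of consecutive samples.
    """
    onset_len = int(fs * onset_ms / 1000)  # 300 samples at 10 kHz
    run = 0  # length of the current above-threshold run
    for i, x in enumerate(samples):
        run = run + 1 if abs(x) > on_threshold else 0
        if run >= onset_len:
            start = i + 1  # recording begins once the onset is confirmed
            return samples[start:start + record_len]
    return None
```

For example, a burst that stays above the threshold for only 299 samples at 10 kHz does not qualify as an event under these assumptions, whereas a burst of 300 or more samples does.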
Next, atstep230, features are extracted from the recorded signals. In a presently preferred embodiment, stationarity, normality and dispersion ratio are three features that are extracted.
In order to extract the stationarity feature, the procedure in Chapter 10 of Bendat is employed. First, the received signal is divided into non-overlapping bins each of about five milliseconds (“ms”) (i.e. for a total of fifty samples) in length. (The received signal can, however, be divided into non-overlapping bins of between about one ms and about nine ms, or more preferably between about three ms and about seven ms.) Where the signal length, defined herein as “L”, is not an integral multiple of fifty, the signal is trimmed at the beginning and end of the signal by approximately (L mod 50)/2. Next, the mean square value within each bin is computed. Next, the total number of reverse arrangements, referred to herein as A_Total, in the mean square sequence is determined. Finally, the z-deviate, which serves as the stationarity feature, is determined according to Equation 1.
$$z = \frac{A_{Total} - \mu_A}{\sigma_A} \qquad \text{(Equation 1)}$$
- where:
- μ_A is the mean number of reverse arrangements expected for a stationary signal of the same length.
- σ_A is the standard deviation for an equal length stationary signal.
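By way of illustration only, the stationarity feature can be sketched in Python as follows. The sketch computes the z-deviate as (A_Total − μ_A)/σ_A and assumes the standard moments of the reverse arrangements test for a sequence of N values, namely μ_A = N(N − 1)/4 and σ_A² = N(2N + 5)(N − 1)/72. The name `stationarity_feature` is hypothetical, and the default of fifty samples per bin corresponds to five ms at an assumed sampling rate of 10 kHz.

```python
import math


def stationarity_feature(signal, bin_len=50):
    """z-deviate of the reverse arrangements test on the mean square sequence."""
    # Trim (L mod bin_len) samples, split between the beginning and the end.
    extra = len(signal) % bin_len
    start = extra // 2
    usable = len(signal) - extra
    trimmed = signal[start:start + usable]

    # Mean square value within each non-overlapping bin.
    ms = [sum(x * x for x in trimmed[i:i + bin_len]) / bin_len
          for i in range(0, usable, bin_len)]

    # A reverse arrangement is a pair (i, j) with i < j and ms[i] > ms[j].
    n = len(ms)
    a_total = sum(1 for i in range(n) for j in range(i + 1, n) if ms[i] > ms[j])

    # Moments expected for a stationary sequence of n values (assumed formulas).
    mu_a = n * (n - 1) / 4
    sigma_a = math.sqrt(n * (2 * n + 5) * (n - 1) / 72)
    return (a_total - mu_a) / sigma_a
```

Under this sketch, a signal whose energy decays steadily yields a large positive z, a signal whose energy grows steadily yields a large negative z, and a stationary signal yields a z near zero. At least two bins are needed for σ_A to be non-zero.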
In order to extract the normality feature, an adaptation of the procedure in Chapter 4 of Bendat is employed. First, the signal is standardized to have zero mean and unit variance. The standardized signal is referred to herein as “s”. Next, the amplitude of the standardized signal, s, is divided into I bins each of about 0.4 Volts, where
$$I = \frac{\max(s) - \min(s)}{0.4}$$
The highest bin extends to infinity and the lowest bin extends to negative infinity.
Next, the observed frequencies n for each bin are determined by counting the number of samples in the standardized signal that fall within each bin. The expected frequencies m̂ for each bin are determined under the assumption of normality. The Chi-square statistic is computed as shown in Equation 2.
$$\hat{X}^2 = \sum_{i=1}^{I} \frac{(n_i - \hat{m}_i)^2}{\hat{m}_i} \qquad \text{(Equation 2)}$$
Finally, the normality feature is computed as shown in Equation 3.
$$\log_{10}(\hat{X}^2) \qquad \text{(Equation 3)}$$
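By way of illustration only, the normality feature can be sketched in Python as follows. The sketch assumes that the number of bins is the amplitude range of the standardized signal divided by the bin width and rounded up, that the bin edges start at the minimum of the standardized signal, and that the Chi-square statistic is the usual sum over bins of (observed − expected)²/expected. The name `normality_feature` is hypothetical.

```python
import math
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def normality_feature(signal, bin_width=0.4):
    """log10 of a chi-square goodness-of-fit statistic against a normal density."""
    # Standardize to zero mean and unit variance.
    mu, sd = fmean(signal), pstdev(signal)
    s = [(x - mu) / sd for x in signal]

    # Number of bins: amplitude range over bin width, rounded up (assumption).
    lo = min(s)
    n_bins = max(2, math.ceil((max(s) - lo) / bin_width))
    # Interior edges only: the lowest and highest bins are open-ended.
    edges = [lo + bin_width * k for k in range(1, n_bins)]

    # Observed frequency for each bin.
    observed = [0] * n_bins
    for v in s:
        observed[bisect_right(edges, v)] += 1

    # Expected frequency for each bin under the assumption of normality.
    cdf = [0.0] + [NormalDist().cdf(e) for e in edges] + [1.0]
    chi2 = 0.0
    for i in range(n_bins):
        expected = len(s) * (cdf[i + 1] - cdf[i])
        if expected > 0:  # guard against underflow in the extreme tails
            chi2 += (observed[i] - expected) ** 2 / expected
    return math.log10(chi2)
```

Under this sketch, larger feature values indicate a stronger departure from normality. A constant signal has zero variance and cannot be standardized, so it would need to be rejected before this step.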
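In order to determine the dispersion ratio feature, the mean absolute deviation, S1, of each signal is determined according to Equation 4, where x_i are the n samples of the signal and med(x) is their median.
$$S_1 = \frac{1}{n}\sum_{i=1}^{n} \left| x_i - \mathrm{med}(x) \right| \qquad \text{(Equation 4)}$$
Next, the interquartile range, S2, of each signal is determined. The interquartile range is defined in Chapter 2 of “Introduction to Robust Estimation and Hypothesis Testing”, Rand R. Wilcox, 1997, Academic Press, CA. Finally, the dispersion ratio feature is determined according to Equation 5.
$$\frac{S_1}{S_2} \qquad \text{(Equation 5)}$$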
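By way of illustration only, the dispersion ratio feature can be sketched in Python as follows. The sketch assumes that the mean absolute deviation is taken about the median and that the feature is the ratio of that deviation to the interquartile range. It uses the default quartile estimator of the Python standard library, which may differ from the estimator defined by Wilcox. The name `dispersion_ratio` is hypothetical.

```python
from statistics import median, quantiles


def dispersion_ratio(signal):
    """Ratio of the mean absolute deviation (about the median) to the IQR."""
    med = median(signal)
    s1 = sum(abs(x - med) for x in signal) / len(signal)
    # Quartiles from the standard library's default ("exclusive") method;
    # this estimator is an assumption of the sketch.
    q1, _, q3 = quantiles(signal, n=4)
    s2 = q3 - q1
    return s1 / s2
```

Because both the numerator and the denominator scale with the signal amplitude, the ratio under this sketch is unchanged when the signal is multiplied by a constant gain.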
Having extracted these features from the signal, method 200 advances to step 240, at which point the signal is classified based on the features extracted at step 230. In a presently preferred embodiment, the classification is performed using a radial basis function neural network implemented on the microcontroller of device 42 to classify swallowing events in real-time, as either swallows or aspirations. Further details about such a radial basis function neural network can be found in Chapter 5 of “Neural Networks for Pattern Recognition”, Christopher Bishop, 1995, Clarendon Press, Oxford (“Bishop”), the contents of which are incorporated herein by reference. The network is operable to take the three extracted features as inputs, and output a single number as its classification of the detected type of swallowing activity. In particular, an output level of about 0.1 is assigned to represent swallows and an output level of about 0.9 to represent aspirations. The network architecture consists of three inputs, one corresponding to each extracted feature, eighty-nine radial basis function units determined from an interactive training procedure as outlined in Bishop, and one output unit, representing swallowing or aspiration. While eighty-nine radial basis units is presently preferred, in other embodiments from about seventy-five to about one hundred radial basis units can be used, and in other embodiments from about eighty to about ninety-five radial basis units can be used, all corresponding to one output. The first layer is nonlinear and the second layer is linear. Put in other words, the first layer of the network consists of the nonlinear radial basis functions while the second layer of the network is a weighted linear summation of the radial basis function outputs.
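By way of illustration only, the forward pass of such a two-layer network can be sketched in Python as follows. The sketch assumes Gaussian basis functions, and the centres, widths, weights and bias are hypothetical placeholders for values that would come from training. The decision threshold of 0.5, midway between the target levels of about 0.1 and about 0.9, is likewise an assumption, as are the names `rbf_output` and `classify`.

```python
import math


def rbf_output(features, centers, widths, weights, bias):
    """Nonlinear first layer of Gaussian units, linear second layer."""
    total = bias
    for center, width, weight in zip(centers, widths, weights):
        dist2 = sum((f - c) ** 2 for f, c in zip(features, center))
        total += weight * math.exp(-dist2 / (2 * width ** 2))
    return total


def classify(output, threshold=0.5):
    """Map the network output to a label (the threshold is an assumption)."""
    return "ASPIRATION" if output >= threshold else "SWALLOW"


# Toy network with two units instead of eighty-nine; all values are made up.
centers = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
widths = [1.0, 1.0]
weights = [0.8, 0.0]
bias = 0.1

near = rbf_output((0.0, 0.0, 0.0), centers, widths, weights, bias)
far = rbf_output((10.0, 10.0, 10.0), centers, widths, weights, bias)
```

In this toy network a feature vector lying on the first centre produces an output of about 0.9, while a feature vector far from both centres produces an output of about 0.1.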
Referring now to FIG. 4, a scatter plot is shown for the results of performing steps 210-240 for a number of different signals. The scatter plot in FIG. 4 is only two dimensional, showing only a plot of the stationarity features vs. the normality features. The squares on the scatter plot indicate where aspirations actually occurred, whereas the circles indicate where swallows actually occurred. The scatter plot was generated while performing method 200 in conjunction with videofluoroscopy, so that the actual swallowing activity could be determined independently of the classification performed at step 240 and the accuracy of that classification could be verified. The line indicated at 400 in FIG. 4 represents a rough dividing line between classifications associated with swallows and aspirations. While some measurements in the scatter plot show a classification that does not reflect the actual type of swallowing activity, the majority of swallowing events are in fact correctly classified. Further improvement to the results shown in FIG. 4 is obtained when the third feature, dispersion ratio, is used to assist in the determination.
Method 200 then advances to step 250, at which point an output is generated corresponding to the classification performed at step 240. Thus, where a particular event was classified as a swallow, then display 54 of device 42 would be instructed to present the message “SWALLOW”, whereas if the event was classified as an aspiration then display 54 of device 42 would be instructed to present the message “ASPIRATION”. Such messages presented by device 42 could also include colours (e.g. green associated with swallows, red associated with aspirations) and/or auditory signals (e.g. no sound for swallows, beeping for aspirations).
Using method 200, an individual feeding patient 38 can adjust how the feeding is being performed in order to reduce aspirations and increase swallows. Such adjustments to feedings can be based on changing consistency or type of food, the size and/or frequency of mouthfuls being offered to patient 38, and the like.
It should now be understood that as method 200 is implemented using device 42, the microcontroller of device 42 will be provided with software programming instructions corresponding to method 200.
While only specific combinations of the various features and components of the present invention have been discussed herein, it will be apparent to those of skill in the art that desired subsets of the disclosed features and components and/or alternative combinations of these features and components can be utilized, as desired. For example, it is also to be understood that types of vibration sensors other than accelerometer 34 can be used with appropriate modifications to computing device 42. While presently less preferred, another sensor can include a sensor that measures displacement (e.g. a microphone), with computing device 42 recording received displacement signals over time. Another type of sensor can include a sensor that measures velocity, with computing device 42 recording received velocity signals over time. Such signals can then be converted into acceleration signals and processed according to the above, or other techniques of feature extraction and classification thereof that work with the type of received signal can be employed, as desired.
As an additional example, while at step 230 of method 200 stationarity, normality and dispersion ratio are three features that are extracted, it is to be understood that in other embodiments other features and/or combinations thereof can be extracted that can be used to detect a swallowing event. For example, while presently less preferred, it can be desired to simply extract any two of stationarity, normality and dispersion ratio in order to make a determination as to whether a particular swallowing event is to be classified as a swallow or aspiration.
Furthermore, while computing device 42 is a personal digital assistant having a programmable microprocessor and display, in other embodiments device 42 can simply be an electronic device that includes circuitry dedicated to processing signals from an accelerometer (or other sensor) and classifying those signals as different types of swallowing activity. Similarly, the device can simply include a set of indicator lights, e.g. a pair of indicator lights, one light for indicating a swallow, the other for indicating an aspiration. Whatever the format of device 42, device 42 can also include an interface for connection to a personal computer or other computing device so that updated programming instructions for detecting aspirations, swallows and/or other types of swallowing activity can be uploaded thereto.
The above-described embodiments of the invention are intended to be examples of the present invention and alterations and modifications may be effected thereto, by those of skill in the art, without departing from the scope of the invention which is defined solely by the claims appended hereto.