
Sequential probability ratio test

From Wikipedia, the free encyclopedia
Hypothesis test in mathematics
"SPRT" redirects here. For standard platinum resistance thermometers, see resistance thermometer.

The sequential probability ratio test (SPRT) is a specific sequential hypothesis test, developed by Abraham Wald[1] and later proven to be optimal by Wald and Jacob Wolfowitz.[2] Neyman and Pearson's 1933 result inspired Wald to reformulate it as a sequential analysis problem. The Neyman-Pearson lemma, by contrast, offers a rule of thumb for when all the data have been collected (and their likelihood ratio is known).

Although originally developed for quality control studies in manufacturing, the SPRT has also been formulated as a termination criterion for the computerized testing of human examinees.[3][4][5]

Theory


As in classical hypothesis testing, SPRT starts with a pair of hypotheses, say $H_0$ and $H_1$ for the null hypothesis and alternative hypothesis respectively. They must be specified as follows:

$$H_0: p = p_0$$
$$H_1: p = p_1$$

The next step is to calculate the cumulative sum of the log-likelihood ratio, $\log \Lambda_i$, as new data arrive: with $S_0 = 0$, then, for $i = 1, 2, \ldots$,

$$S_i = S_{i-1} + \log \Lambda_i$$

The stopping rule is a simple thresholding scheme:

$$\begin{cases} a < S_i < b: & \text{continue monitoring} \\ S_i \geq b: & \text{accept } H_1 \\ S_i \leq a: & \text{accept } H_0 \end{cases}$$

where $a$ and $b$ ($a < 0 < b < \infty$) depend on the desired type I and type II errors, $\alpha$ and $\beta$. They may be chosen as follows:

$$a \approx \log \frac{\beta}{1-\alpha} \quad \text{and} \quad b \approx \log \frac{1-\beta}{\alpha}$$

In other words, $\alpha$ and $\beta$ must be decided beforehand in order to set the thresholds appropriately. Their numerical values will depend on the application. The thresholds are only approximate because, in the discrete case, the signal may cross a threshold between samples. Thus, depending on the penalty of making an error and the sampling frequency, one might set the thresholds more aggressively. In the continuous case the bounds are exact.
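The threshold formulas can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `wald_thresholds` and the example error rates are assumptions made here, and the check that $\alpha + \beta < 1$ is added only to guarantee $a < 0 < b$.

```python
import math


def wald_thresholds(alpha, beta):
    """Return the approximate SPRT thresholds (a, b) for error rates alpha, beta."""
    # Hypothetical validity check: alpha + beta < 1 ensures a < 0 < b.
    if not (0 < alpha < 1 and 0 < beta < 1 and alpha + beta < 1):
        raise ValueError("need 0 < alpha, beta < 1 and alpha + beta < 1")
    a = math.log(beta / (1 - alpha))   # lower threshold: accept H0 at or below it
    b = math.log((1 - beta) / alpha)   # upper threshold: accept H1 at or above it
    return a, b


# Example error rates chosen only for illustration.
a, b = wald_thresholds(alpha=0.05, beta=0.10)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Smaller error rates push both thresholds further from zero, so more evidence is needed before the test stops.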

Example


A textbook example is parameter estimation of a probability distribution function. Consider the exponential distribution:

$$f_\theta(x) = \theta^{-1} e^{-\frac{x}{\theta}}, \qquad x, \theta > 0$$

The hypotheses are

$$\begin{cases} H_0: \theta = \theta_0 \\ H_1: \theta = \theta_1 \end{cases} \qquad \theta_1 > \theta_0.$$

Then the log-likelihood ratio (LLR) for one sample is

$$\begin{aligned}
\log \Lambda(x) &= \log \left( \frac{\theta_1^{-1} e^{-\frac{x}{\theta_1}}}{\theta_0^{-1} e^{-\frac{x}{\theta_0}}} \right) \\
&= \log \left( \frac{\theta_0}{\theta_1} e^{\frac{x}{\theta_0} - \frac{x}{\theta_1}} \right) \\
&= \log \left( \frac{\theta_0}{\theta_1} \right) + \log \left( e^{\frac{x}{\theta_0} - \frac{x}{\theta_1}} \right) \\
&= -\log \left( \frac{\theta_1}{\theta_0} \right) + \left( \frac{x}{\theta_0} - \frac{x}{\theta_1} \right) \\
&= -\log \left( \frac{\theta_1}{\theta_0} \right) + \left( \frac{\theta_1 - \theta_0}{\theta_0 \theta_1} \right) x
\end{aligned}$$

The cumulative sum of the LLRs over the first $n$ samples $x_1, \ldots, x_n$ is

$$S_n = \sum_{i=1}^{n} \log \Lambda(x_i) = -n \log \left( \frac{\theta_1}{\theta_0} \right) + \left( \frac{\theta_1 - \theta_0}{\theta_0 \theta_1} \right) \sum_{i=1}^{n} x_i$$

Accordingly, the stopping rule is to continue sampling as long as:

$$a < -n \log \left( \frac{\theta_1}{\theta_0} \right) + \left( \frac{\theta_1 - \theta_0}{\theta_0 \theta_1} \right) \sum_{i=1}^{n} x_i < b$$

After re-arranging we finally find

$$a + n \log \left( \frac{\theta_1}{\theta_0} \right) < \left( \frac{\theta_1 - \theta_0}{\theta_0 \theta_1} \right) \sum_{i=1}^{n} x_i < b + n \log \left( \frac{\theta_1}{\theta_0} \right)$$

The thresholds are simply two parallel lines with slope $\log(\theta_1/\theta_0)$. Sampling should stop when the sum of the samples makes an excursion outside the continue-sampling region.
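The exponential example can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name `sprt_exponential`, the decision labels, the default error rates and the simulated data are all hypothetical choices made here, and the thresholds use Wald's approximations for $a$ and $b$.

```python
import math
import random


def sprt_exponential(samples, theta0, theta1, alpha=0.05, beta=0.05):
    """SPRT for H0: theta = theta0 vs H1: theta = theta1 (theta1 > theta0).

    Returns (decision, n), where n is the number of samples consumed.
    """
    if not 0 < theta0 < theta1:
        raise ValueError("need 0 < theta0 < theta1")
    a = math.log(beta / (1 - alpha))          # lower threshold
    b = math.log((1 - beta) / alpha)          # upper threshold
    log_ratio = math.log(theta1 / theta0)     # constant term of each increment
    slope = (theta1 - theta0) / (theta0 * theta1)
    s = 0.0                                   # cumulative sum, S_0 = 0
    n = 0
    for n, x in enumerate(samples, start=1):
        s += -log_ratio + slope * x           # log-likelihood ratio of one sample
        if s >= b:
            return "accept H1", n
        if s <= a:
            return "accept H0", n
    return "continue sampling", n             # data ran out before a decision


# Simulated data with true mean 2.0; random.expovariate takes the rate 1/mean.
rng = random.Random(0)
data = (rng.expovariate(1 / 2.0) for _ in range(10_000))
print(sprt_exponential(data, theta0=1.0, theta1=2.0))
```

Because the data are consumed lazily, the loop stops drawing samples as soon as the cumulative sum leaves the continue-sampling region.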

Applications


Manufacturing


The test is done on the proportion metric, and tests that a variable $p$ is equal to one of two desired points, $p_1$ or $p_2$. The region between these two points is known as the indifference region (IR). For example, suppose you are performing a quality control study on a factory lot of widgets. Management would like the lot to have 3% or fewer defective widgets, but 1% or fewer is the ideal lot that would pass with flying colors. In this example, $p_1 = 0.01$ and $p_2 = 0.03$, and the region between them is the IR because management considers these lots to be marginal and is willing to see them classified either way. Widgets would be sampled one at a time from the lot (sequential analysis) until the test determines, within an acceptable error level, that the lot is ideal or should be rejected.
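The widget example can be sketched as a Bernoulli SPRT. This is a hedged illustration: the article does not say which point plays the role of the null hypothesis, so the sketch assumes $H_0: p = p_1$ (ideal lot) and $H_1: p = p_2$ (lot to reject). The function name `sprt_defect_rate`, the decision labels and the default error rates are likewise assumptions. Each inspected widget adds $\log(p_2/p_1)$ to the cumulative sum if it is defective and $\log\frac{1-p_2}{1-p_1}$ otherwise.

```python
import math


def sprt_defect_rate(inspections, p1=0.01, p2=0.03, alpha=0.05, beta=0.05):
    """Sequentially classify a lot from per-widget results (True = defective).

    Assumes H0: p = p1 (ideal lot) and H1: p = p2 (reject lot), with p1 < p2.
    Returns (decision, n), where n is the number of widgets inspected.
    """
    if not 0 < p1 < p2 < 1:
        raise ValueError("need 0 < p1 < p2 < 1")
    a = math.log(beta / (1 - alpha))             # lower threshold
    b = math.log((1 - beta) / alpha)             # upper threshold
    step_defect = math.log(p2 / p1)              # increment for a defective widget
    step_good = math.log((1 - p2) / (1 - p1))    # increment for a good widget
    s = 0.0
    n = 0
    for n, defective in enumerate(inspections, start=1):
        s += step_defect if defective else step_good
        if s >= b:
            return "reject lot", n
        if s <= a:
            return "accept lot", n
    return "undecided", n


# Three defective widgets in a row are enough to reject at these settings.
print(sprt_defect_rate([True, True, True]))
```

A single defective widget moves the sum far more than a single good one, so rejecting a bad lot typically needs far fewer inspections than confirming an ideal one.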

Testing of human examinees


The SPRT is currently the predominant method of classifying examinees in a variable-length computerized classification test (CCT) [citation needed]. The two parameters, $p_1$ and $p_2$, are specified by determining a cutscore (threshold) for examinees on the proportion correct metric, and selecting a point above and below that cutscore. For instance, suppose the cutscore is set at 70% for a test. We could select $p_1 = 0.65$ and $p_2 = 0.75$. The test then evaluates the likelihood that an examinee's true score on that metric is equal to one of those two points. If the examinee is determined to be at 75%, they pass, and they fail if they are determined to be at 65%.

These points are not specified completely arbitrarily. A cutscore should always be set with a legally defensible method, such as a modified Angoff procedure. Again, the indifference region represents the range of scores that the test designer is willing to see classified either way (pass or fail). The upper parameter $p_2$ is conceptually the highest level that the test designer is willing to accept for a fail (because everyone below it has a good chance of failing), and the lower parameter $p_1$ is the lowest level that the test designer is willing to accept for a pass (because everyone above it has a decent chance of passing). While this definition may seem to be a relatively small burden, consider the high-stakes case of a licensing test for medical doctors: at just what point should we consider somebody to be at one of these two levels?

While the SPRT was first applied to testing in the days of classical test theory, as in the previous paragraph, Reckase (1983) suggested that item response theory (IRT) be used to determine the $p_1$ and $p_2$ parameters. The cutscore and indifference region are defined on the latent ability (theta) metric, and translated onto the proportion metric for computation. Research on CCT since then has applied this methodology for several reasons:

  1. Large item banks tend to be calibrated with IRT
  2. This allows more accurate specification of the parameters
  3. By using the item response function for each item, the parameters are easily allowed to vary between items.
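The IRT-based approach can be sketched as follows. The article does not name a specific item response model, so this sketch assumes a two-parameter logistic (2PL) item response function. The names `irf_2pl` and `sprt_irt`, the item parameters, and the ability points `theta_low` and `theta_high` are hypothetical. The two ability points bounding the indifference region are translated into item-specific success probabilities, so the log-likelihood ratio increment varies from item to item.

```python
import math


def irf_2pl(theta, discrimination, difficulty):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def sprt_irt(responses, items, theta_low, theta_high, alpha=0.05, beta=0.05):
    """Classify an examinee as pass or fail from scored item responses.

    responses: iterable of booleans (True = correct answer)
    items: iterable of (discrimination, difficulty) pairs, one per response
    Returns (decision, n), where n is the number of items administered.
    """
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    s = 0.0
    n = 0
    for n, (correct, (disc, diff)) in enumerate(zip(responses, items), start=1):
        p_low = irf_2pl(theta_low, disc, diff)     # item-specific p1
        p_high = irf_2pl(theta_high, disc, diff)   # item-specific p2
        if correct:
            s += math.log(p_high / p_low)
        else:
            s += math.log((1 - p_high) / (1 - p_low))
        if s >= upper:
            return "pass", n
        if s <= lower:
            return "fail", n
    return "undecided", n


# Hypothetical bank of identical items and an examinee who answers all correctly.
bank = [(1.0, 0.0)] * 50
print(sprt_irt([True] * 50, bank, theta_low=-0.5, theta_high=0.5))
```

With a constant item response function this reduces to the fixed $p_1$, $p_2$ test of the classical formulation; letting the parameters differ per item is what the IRT formulation adds.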

Detection of anomalous medical outcomes


Spiegelhalter et al.[6] have shown that SPRT can be used to monitor the performance of doctors, surgeons and other medical practitioners in such a way as to give early warning of potentially anomalous results. They showed how it could have helped identify Harold Shipman as a murderer well before he was actually identified.[6]

Extensions


MaxSPRT


In 2011, an extension of the SPRT method called the Maximized Sequential Probability Ratio Test (MaxSPRT)[7] was introduced. The salient features of MaxSPRT are the allowance of a composite, one-sided alternative hypothesis and the introduction of an upper stopping boundary. The method has been used in several medical research studies.[8]


References

  1. Wald, Abraham (June 1945). "Sequential Tests of Statistical Hypotheses". Annals of Mathematical Statistics. 16 (2): 117–186. doi:10.1214/aoms/1177731118. JSTOR 2235829.
  2. Wald, A.; Wolfowitz, J. (1948). "Optimum Character of the Sequential Probability Ratio Test". The Annals of Mathematical Statistics. 19 (3): 326–339. doi:10.1214/aoms/1177730197. JSTOR 2235638.
  3. Ferguson, Richard L. (1969). The development, implementation, and evaluation of a computer-assisted branched test for a program of individually prescribed instruction. Unpublished doctoral dissertation, University of Pittsburgh.
  4. Reckase, M. D. (1983). A procedure for decision making using tailored testing. In D. J. Weiss (Ed.), New horizons in testing: Latent trait theory and computerized adaptive testing (pp. 237-254). New York: Academic Press.
  5. Eggen, T. J. H. M. (1999). "Item Selection in Adaptive Testing with the Sequential Probability Ratio Test". Applied Psychological Measurement. 23 (3): 249–261. doi:10.1177/01466219922031365. S2CID 120780131.
  6. Spiegelhalter, David; Grigg, Olivia; Kinsman, Robin; Treasure, Tom (2003-02-01). "Risk-adjusted sequential probability ratio tests: applications to Bristol, Shipman and adult cardiac surgery". International Journal for Quality in Health Care. 15 (1): 7–13. doi:10.1093/intqhc/15.1.7. ISSN 1353-4505.
  7. Kulldorff, Martin; Davis, Robert L.; Kolczak†, Margarette; Lewis, Edwin; Lieu, Tracy; Platt, Richard (2011). "A Maximized Sequential Probability Ratio Test for Drug and Vaccine Safety Surveillance". Sequential Analysis. 30: 58–78. doi:10.1080/07474946.2011.539924.
  8. Second-to-last paragraph of section 1 of: Kulldorff, M. et al. "A Maximized Sequential Probability Ratio Test for Drug and Vaccine Safety Surveillance". Sequential Analysis: Design Methods and Applications. 30 (1). http://www.tandfonline.com/doi/full/10.1080/07474946.2011.539924

Retrieved from "https://en.wikipedia.org/w/index.php?title=Sequential_probability_ratio_test&oldid=1321902819"