Odds ratio

From Wikipedia, the free encyclopedia
Statistic quantifying the association between two events

An odds ratio (OR) is a statistic that quantifies the strength of the association between two events, A and B. The odds ratio is defined as the ratio of the odds of event A taking place in the presence of B to the odds of A in the absence of B. By symmetry, the odds ratio also equals the ratio of the odds of B occurring in the presence of A to the odds of B in the absence of A. Two events are independent if and only if the OR equals 1, i.e., the odds of one event are the same in either the presence or absence of the other event. If the OR is greater than 1, then A and B are associated (correlated) in the sense that, compared to the absence of B, the presence of B raises the odds of A, and symmetrically the presence of A raises the odds of B. Conversely, if the OR is less than 1, then A and B are negatively correlated, and the presence of one event reduces the odds of the other event occurring.

Note that the odds ratio is symmetric in the two events, and no causal direction is implied (correlation does not imply causation): an OR greater than 1 does not establish that B causes A, or that A causes B.[1]

Two similar statistics that are often used to quantify associations are the relative risk (RR) and the absolute risk reduction (ARR). Often, the parameter of greatest interest is actually the RR, which is the ratio of the probabilities analogous to the odds used in the OR. However, available data frequently do not allow for the computation of the RR or the ARR, but do allow for the computation of the OR, as in case-control studies, as explained below. On the other hand, if one of the properties (A or B) is sufficiently rare (in epidemiology this is called the rare disease assumption), then the OR is approximately equal to the corresponding RR.

The OR plays an important role in the logistic model.

Definition and basic properties


Intuition from an example for laypeople


If we flip an unbiased coin, the probability of getting heads and the probability of getting tails are equal: both are 50%. Imagine we get a biased coin that makes it two times more likely to get heads. But what does "twice as likely" mean in terms of a probability? It cannot literally mean to double the original probability value, because doubling 50% would yield 100%. Rather, it is the odds that are doubling: from 1:1 odds to 2:1 odds. The new probabilities would be 66 2/3% for heads and 33 1/3% for tails.
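The coin arithmetic can be checked with a minimal Python sketch. The helper names `prob_to_odds` and `odds_to_prob` are illustrative choices, not part of any library; exact fractions are used so that the thirds are not rounded.

```python
from fractions import Fraction


def prob_to_odds(p):
    """Convert a probability p (with p < 1) to odds p / (1 - p)."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Convert odds back to a probability odds / (1 + odds)."""
    return odds / (1 + odds)


fair_odds = prob_to_odds(Fraction(1, 2))   # 1:1 odds for the fair coin
biased_odds = 2 * fair_odds                # doubling the odds gives 2:1
p_heads = odds_to_prob(biased_odds)        # probability of heads
p_tails = 1 - p_heads                      # probability of tails

print(p_heads, p_tails)
```

Doubling the odds rather than the probability keeps the result a valid probability for any starting value below 1.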

A motivating example, in the context of the rare disease assumption


Suppose a radiation leak in a village of 1,000 people increased the incidence of a rare disease. The total number of people exposed to the radiation was $V_E = 400$, out of which $D_E = 20$ developed the disease and $H_E = 380$ stayed healthy. The total number of people not exposed was $V_N = 600$, out of which $D_N = 6$ developed the disease and $H_N = 594$ stayed healthy. We can organize this in a contingency table:

$$\begin{array}{|r|cc|}\hline & \text{Diseased} & \text{Healthy} \\ \hline \text{Exposed} & 20 & 380 \\ \text{Not exposed} & 6 & 594 \\ \hline \end{array}$$

The risk of developing the disease given exposure is $D_E/V_E = 20/400 = 0.05$ and of developing the disease given non-exposure is $D_N/V_N = 6/600 = 0.01$. One obvious way to compare the risks is to use the ratio of the two, the relative risk.

$$\text{Relative risk} = \frac{D_E/(D_E+H_E)}{D_N/(D_N+H_N)} = \frac{D_E/V_E}{D_N/V_N} = \frac{20/400}{6/600} = \frac{0.05}{0.01} = 5.$$

The odds ratio is different. The odds of getting the disease if exposed are $D_E/H_E = 20/380 \approx 0.0526$, and the odds if not exposed are $D_N/H_N = 6/594 \approx 0.0101$. The odds ratio is the ratio of the two,

$$\text{Odds ratio} = \frac{D_E/H_E}{D_N/H_N} = \frac{20/380}{6/594} \approx \frac{0.0526}{0.0101} \approx 5.2.$$

As illustrated by this example, in a rare-disease case like this, the relative risk and the odds ratio are almost the same. By definition, rare disease implies that $V_E \approx H_E$ and $V_N \approx H_N$. Thus, the denominators in the relative risk and odds ratio are almost the same ($400 \approx 380$ and $600 \approx 594$).

Relative risk is easier to understand than the odds ratio, but one reason to use the odds ratio is that usually, data on the entire population is not available and random sampling must be used. In the example above, if it were very costly to interview villagers and find out if they were exposed to the radiation, then the prevalence of radiation exposure would not be known, and neither would the values of $V_E$ or $V_N$. One could take a random sample of fifty villagers, but quite possibly such a random sample would not include anybody with the disease, since only 2.6% of the population are diseased. Instead, one might use a case-control study[2] in which all 26 diseased villagers are interviewed as well as a random sample of 26 who do not have the disease. The results might turn out as follows ("might", because this is a random sample):

$$\begin{array}{|r|cc|}\hline & \text{Diseased} & \text{Healthy} \\ \hline \text{Exposed} & 20 & 10 \\ \text{Not exposed} & 6 & 16 \\ \hline \end{array}$$

The odds in this sample of getting the disease given that someone is exposed are 20/10 and the odds given that someone is not exposed are 6/16. The odds ratio is thus $\frac{20/10}{6/16} \approx 5.3$, quite close to the odds ratio calculated for the entire village. The relative risk, however, cannot be calculated, because it is the ratio of the risks of getting the disease and we would need $V_E$ and $V_N$ to figure those out. Because the study selected for people with the disease, half the people in the sample have the disease, and it is known that this is more than the population-wide prevalence.
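The village and case-control figures can be reproduced with a short Python sketch. The function names and argument order (diseased exposed, healthy exposed, diseased unexposed, healthy unexposed) are illustrative assumptions, not an established API.

```python
def relative_risk(d_e, h_e, d_n, h_n):
    """Risk of disease among the exposed divided by risk among the unexposed."""
    return (d_e / (d_e + h_e)) / (d_n / (d_n + h_n))


def odds_ratio(d_e, h_e, d_n, h_n):
    """Odds of disease among the exposed divided by odds among the unexposed."""
    return (d_e / h_e) / (d_n / h_n)


village = (20, 380, 6, 594)   # full population of the village
sample = (20, 10, 6, 16)      # case-control sample: all 26 cases, 26 controls

rr_village = relative_risk(*village)
or_village = odds_ratio(*village)
or_sample = odds_ratio(*sample)

print(round(rr_village, 2), round(or_village, 2), round(or_sample, 2))
```

Calling `relative_risk(*sample)` would also return a number, but it would not estimate the population relative risk, because the case-control design fixes the share of diseased subjects in the sample.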

It is standard in the medical literature to calculate the odds ratio and then use the rare-disease assumption (which is usually reasonable) to claim that the relative risk is approximately equal to it. This not only allows for the use of case-control studies, but makes controlling for confounding variables such as weight or age using regression analysis easier and has the desirable properties, discussed in other sections of this article, of invariance and insensitivity to the type of sampling.[3]

Definition in terms of group-wise odds


The odds ratio is the ratio of the odds of an event occurring in one group to the odds of it occurring in another group. The term is also used to refer to sample-based estimates of this ratio. These groups might be men and women, an experimental group and a control group, or any other dichotomous classification. If the probabilities of the event in each of the groups are $p_1$ (first group) and $p_2$ (second group), then the odds ratio is:

$$OR = \frac{p_1/(1-p_1)}{p_2/(1-p_2)} = \frac{p_1/q_1}{p_2/q_2} = \frac{p_1 q_2}{p_2 q_1},$$

where $q_x = 1 - p_x$. An odds ratio of 1 indicates that the condition or event under study is equally likely to occur in both groups. An odds ratio greater than 1 indicates that the condition or event is more likely to occur in the first group. An odds ratio less than 1 indicates that the condition or event is less likely to occur in the first group. The odds ratio must be nonnegative if it is defined. It is undefined if $p_2 q_1$ equals zero, i.e., if $p_2$ equals zero or $q_1$ equals zero.

Definition in terms of joint and conditional probabilities


The odds ratio can also be defined in terms of the joint probability distribution of two binary random variables. The joint distribution of binary random variables $X$ and $Y$ can be written

$$\begin{array}{c|cc} & Y=1 & Y=0 \\ \hline X=1 & p_{11} & p_{10} \\ X=0 & p_{01} & p_{00} \end{array}$$

where $p_{11}$, $p_{10}$, $p_{01}$ and $p_{00}$ are non-negative "cell probabilities" that sum to one. The odds for $Y$ within the two subpopulations defined by $X = 1$ and $X = 0$ are defined in terms of the conditional probabilities given $X$, i.e., $P(Y \mid X)$:

$$\begin{array}{c|cc} & Y=1 & Y=0 \\ \hline X=1 & \frac{p_{11}}{p_{11}+p_{10}} & \frac{p_{10}}{p_{11}+p_{10}} \\ X=0 & \frac{p_{01}}{p_{01}+p_{00}} & \frac{p_{00}}{p_{01}+p_{00}} \end{array}$$

Thus the odds ratio is

$$OR = \dfrac{p_{11}/(p_{11}+p_{10})}{p_{10}/(p_{11}+p_{10})} \bigg/ \dfrac{p_{01}/(p_{01}+p_{00})}{p_{00}/(p_{01}+p_{00})} = \frac{p_{11}p_{00}}{p_{10}p_{01}}$$

The simple expression on the right, above, is easy to remember as the product of the probabilities of the "concordant cells" ($X = Y$) divided by the product of the probabilities of the "discordant cells" ($X \neq Y$). However, in some applications the labeling of categories as zero and one is arbitrary, so there is nothing special about concordant versus discordant values in these applications.

Symmetry


If we had calculated the odds ratio based on the conditional probabilities given $Y$,

$$\begin{array}{c|cc} & Y=1 & Y=0 \\ \hline X=1 & \frac{p_{11}}{p_{11}+p_{01}} & \frac{p_{10}}{p_{10}+p_{00}} \\ X=0 & \frac{p_{01}}{p_{11}+p_{01}} & \frac{p_{00}}{p_{10}+p_{00}} \end{array}$$

we would have obtained the same result

$$\dfrac{p_{11}/(p_{11}+p_{01})}{p_{01}/(p_{11}+p_{01})} \bigg/ \dfrac{p_{10}/(p_{10}+p_{00})}{p_{00}/(p_{10}+p_{00})} = \dfrac{p_{11}p_{00}}{p_{10}p_{01}}.$$

Other measures of effect size for binary data, such as the relative risk, do not have this symmetry property.
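The symmetry can be checked numerically. The following Python sketch uses an arbitrary illustrative joint distribution (the cell values are an assumption chosen for the demonstration) and compares the odds ratio and the relative risk computed by conditioning on $X$ and on $Y$.

```python
def or_given_x(p11, p10, p01, p00):
    """Odds ratio from the conditional probabilities P(Y | X)."""
    odds_x1 = (p11 / (p11 + p10)) / (p10 / (p11 + p10))
    odds_x0 = (p01 / (p01 + p00)) / (p00 / (p01 + p00))
    return odds_x1 / odds_x0


def or_given_y(p11, p10, p01, p00):
    """Odds ratio from the conditional probabilities P(X | Y)."""
    odds_y1 = (p11 / (p11 + p01)) / (p01 / (p11 + p01))
    odds_y0 = (p10 / (p10 + p00)) / (p00 / (p10 + p00))
    return odds_y1 / odds_y0


def rr_given_x(p11, p10, p01, p00):
    """Relative risk P(Y=1 | X=1) / P(Y=1 | X=0)."""
    return (p11 / (p11 + p10)) / (p01 / (p01 + p00))


def rr_given_y(p11, p10, p01, p00):
    """Relative risk P(X=1 | Y=1) / P(X=1 | Y=0)."""
    return (p11 / (p11 + p01)) / (p10 / (p10 + p00))


cells = (0.1, 0.3, 0.2, 0.4)  # illustrative p11, p10, p01, p00

or_x, or_y = or_given_x(*cells), or_given_y(*cells)
rr_x, rr_y = rr_given_x(*cells), rr_given_y(*cells)
print(or_x, or_y)  # the two odds ratios agree
print(rr_x, rr_y)  # the two relative risks differ
```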

Relation to statistical independence


If $X$ and $Y$ are independent, their joint probabilities can be expressed in terms of their marginal probabilities $p_x = P(X = 1)$ and $p_y = P(Y = 1)$, as follows

$$\begin{array}{c|cc} & Y=1 & Y=0 \\ \hline X=1 & p_x p_y & p_x(1-p_y) \\ X=0 & (1-p_x)p_y & (1-p_x)(1-p_y) \end{array}$$

In this case, the odds ratio equals one, and conversely the odds ratio can only equal one if the joint probabilities can be factored in this way. Thus the odds ratio equals one if and only if $X$ and $Y$ are independent.

Recovering the cell probabilities from the odds ratio and marginal probabilities


The odds ratio is a function of the cell probabilities, and conversely, the cell probabilities can be recovered given knowledge of the odds ratio and the marginal probabilities $P(X = 1) = p_{11} + p_{10}$ and $P(Y = 1) = p_{11} + p_{01}$. If the odds ratio $R$ differs from 1, then

$$p_{11} = \frac{1 + (p_{1\cdot} + p_{\cdot 1})(R-1) - S}{2(R-1)}$$

where $p_{1\cdot} = p_{11} + p_{10}$, $p_{\cdot 1} = p_{11} + p_{01}$, and

$$S = \sqrt{(1 + (p_{1\cdot} + p_{\cdot 1})(R-1))^2 + 4R(1-R)\,p_{1\cdot}\,p_{\cdot 1}}.$$

In the case where $R = 1$, we have independence, so $p_{11} = p_{1\cdot}\,p_{\cdot 1}$.

Once we have $p_{11}$, the other three cell probabilities can easily be recovered from the marginal probabilities.
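As a sketch of this recovery, the following Python function applies the formula for $p_{11}$ and then fills in the remaining cells from the margins. The function name `recover_cells` and the test distribution are illustrative assumptions.

```python
import math


def recover_cells(R, p_row, p_col):
    """Return (p11, p10, p01, p00) from the odds ratio R and the margins
    p_row = P(X = 1) and p_col = P(Y = 1)."""
    if R == 1:
        p11 = p_row * p_col  # independence
    else:
        a = 1 + (p_row + p_col) * (R - 1)
        s = math.sqrt(a * a + 4 * R * (1 - R) * p_row * p_col)
        p11 = (a - s) / (2 * (R - 1))
    p10 = p_row - p11
    p01 = p_col - p11
    p00 = 1 - p11 - p10 - p01
    return p11, p10, p01, p00


# Illustrative check: cells (0.4, 0.1, 0.1, 0.4) have odds ratio 16
# and both margins equal to 0.5.
cells = recover_cells(16, 0.5, 0.5)
print(cells)
```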

Example

A graph showing how the log odds ratio relates to the underlying probabilities of the outcome X occurring in two groups, denoted A and B. The log odds ratio shown here is based on the odds for the event occurring in group B relative to the odds for the event occurring in group A. Thus, when the probability of X occurring in group B is greater than the probability of X occurring in group A, the odds ratio is greater than 1, and the log odds ratio is greater than 0.

Suppose that in a sample of 100 men, 90 drank wine in the previous week (so 10 did not), while in a sample of 80 women only 20 drank wine in the same period (so 60 did not). This forms the contingency table:

$$\begin{array}{c|cc} & M=1 & M=0 \\ \hline D=1 & 90 & 20 \\ D=0 & 10 & 60 \end{array}$$

The odds ratio (OR) can be directly calculated from this table as:

$$OR = \frac{90 \times 60}{10 \times 20} = 27$$

Alternatively, the odds of a man drinking wine are 90 to 10, or 9:1, while the odds of a woman drinking wine are only 20 to 60, or 1:3 = 0.33. The odds ratio is thus 9/0.33, or 27, showing that men are much more likely to drink wine than women. The detailed calculation is:

$$\frac{0.9/0.1}{0.2/0.6} = \frac{0.9 \times 0.6}{0.1 \times 0.2} = \frac{0.54}{0.02} = 27$$

This example also shows how odds ratios are sometimes sensitive in stating relative positions: in this sample men are (90/100)/(20/80) = 3.6 times as likely to have drunk wine as women, but have 27 times the odds. The logarithm of the odds ratio, the difference of the logits of the probabilities, tempers this effect, and also makes the measure symmetric with respect to the ordering of groups. For example, using natural logarithms, an odds ratio of 27/1 maps to 3.296, and an odds ratio of 1/27 maps to −3.296.
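A minimal Python sketch of the wine example, using the counts given above; the variable names are illustrative.

```python
import math

men_drank, men_not = 90, 10
women_drank, women_not = 20, 60

# Cross-product form of the odds ratio.
odds_ratio = (men_drank * women_not) / (men_not * women_drank)

# Ratio of the proportions who drank wine, for comparison.
risk_ratio = (men_drank / (men_drank + men_not)) / (
    women_drank / (women_drank + women_not)
)

log_or = math.log(odds_ratio)              # men relative to women
log_or_swapped = math.log(1 / odds_ratio)  # women relative to men

print(odds_ratio, round(risk_ratio, 2), round(log_or, 3), round(log_or_swapped, 3))
```

Swapping the order of the groups inverts the odds ratio but only flips the sign of its logarithm, which is the symmetry referred to above.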

Statistical inference

A graph showing the minimum value of the sample log odds ratio statistic that must be observed to be deemed significant at the 0.05 level, for a given sample size. The three lines correspond to different settings of the marginal probabilities in the 2×2 contingency table (the row and column marginal probabilities are equal in this graph).

Several approaches to statistical inference for odds ratios have been developed.

One approach to inference uses large sample approximations to the sampling distribution of the log odds ratio (the natural logarithm of the odds ratio). If we use the joint probability notation defined above, the population log odds ratio is

$$\log\left(\frac{p_{11}p_{00}}{p_{01}p_{10}}\right) = \log(p_{11}) + \log(p_{00}) - \log(p_{10}) - \log(p_{01}).$$

If we observe data in the form of a contingency table

$$\begin{array}{c|cc} & Y=1 & Y=0 \\ \hline X=1 & n_{11} & n_{10} \\ X=0 & n_{01} & n_{00} \end{array}$$

then the probabilities in the joint distribution can be estimated as

$$\begin{array}{c|cc} & Y=1 & Y=0 \\ \hline X=1 & \hat{p}_{11} & \hat{p}_{10} \\ X=0 & \hat{p}_{01} & \hat{p}_{00} \end{array}$$

where $\hat{p}_{ij} = n_{ij}/n$, with $n = n_{11} + n_{10} + n_{01} + n_{00}$ being the sum of all four cell counts. The sample log odds ratio is

$$L = \log\left(\dfrac{\hat{p}_{11}\hat{p}_{00}}{\hat{p}_{10}\hat{p}_{01}}\right) = \log\left(\dfrac{n_{11}n_{00}}{n_{10}n_{01}}\right).$$

The distribution of the log odds ratio is approximately normal with:

$$L \sim \mathcal{N}(\log(OR),\, \sigma^2).$$

The standard error for the log odds ratio is approximately

$$\mathrm{SE} = \sqrt{\dfrac{1}{n_{11}} + \dfrac{1}{n_{10}} + \dfrac{1}{n_{01}} + \dfrac{1}{n_{00}}}.$$

This is an asymptotic approximation, and will not give a meaningful result if any of the cell counts are very small. If $L$ is the sample log odds ratio, an approximate 95% confidence interval for the population log odds ratio is $L \pm 1.96\,\mathrm{SE}$.[4] This can be mapped to $(\exp(L - 1.96\,\mathrm{SE}),\ \exp(L + 1.96\,\mathrm{SE}))$ to obtain a 95% confidence interval for the odds ratio. If we wish to test the hypothesis that the population odds ratio equals one, the two-sided p-value is $2P(Z < -|L|/\mathrm{SE})$, where $P$ denotes a probability, and $Z$ denotes a standard normal random variable.
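The large-sample procedure can be sketched in Python as follows. The function name is illustrative, and the counts reuse the wine example from earlier in the article (treating sex as $X$ and drinking as $Y$ is an assumption made for the demonstration). The normal tail probability is obtained from `math.erfc`, using the identity $2P(Z < -a) = \operatorname{erfc}(a/\sqrt{2})$ for $a \ge 0$.

```python
import math


def log_odds_ratio_inference(n11, n10, n01, n00, z=1.96):
    """Return (L, SE, CI for the odds ratio, two-sided p-value for OR = 1)."""
    L = math.log((n11 * n00) / (n10 * n01))
    se = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    ci = (math.exp(L - z * se), math.exp(L + z * se))
    p_value = math.erfc(abs(L) / se / math.sqrt(2))  # equals 2 * P(Z < -|L|/SE)
    return L, se, ci, p_value


L, se, ci, p_value = log_odds_ratio_inference(90, 10, 20, 60)
print(round(L, 3), round(se, 3), ci, p_value)
```

Because the interval is built on the log scale and then exponentiated, it is not symmetric around the sample odds ratio itself.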

An alternative approach to inference for odds ratios looks at the distribution of the data conditionally on the marginal frequencies of $X$ and $Y$. An advantage of this approach is that the sampling distribution of the odds ratio can be expressed exactly.

Role in logistic regression


Logistic regression is one way to generalize the odds ratio beyond two binary variables. Suppose we have a binary response variable $Y$ and a binary predictor variable $X$, and in addition we have other predictor variables $Z_1, \ldots, Z_p$ that may or may not be binary. If we use multiple logistic regression to regress $Y$ on $X, Z_1, \ldots, Z_p$, then the estimated coefficient $\hat{\beta}_x$ for $X$ is related to a conditional odds ratio. Specifically, at the population level

$$e^{\beta_x} = \exp(\beta_x) = \frac{P(Y=1 \mid X=1, Z_1, \ldots, Z_p)\,/\,P(Y=0 \mid X=1, Z_1, \ldots, Z_p)}{P(Y=1 \mid X=0, Z_1, \ldots, Z_p)\,/\,P(Y=0 \mid X=0, Z_1, \ldots, Z_p)},$$

so $\exp(\hat{\beta}_x)$ is an estimate of this conditional odds ratio. The interpretation of $\exp(\hat{\beta}_x)$ is as an estimate of the odds ratio between $Y$ and $X$ when the values of $Z_1, \ldots, Z_p$ are held fixed.
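In the simplest special case with no additional predictors $Z$, the fitted coefficient reproduces the log of the sample odds ratio. The sketch below illustrates this with a hand-written Newton-Raphson fit in plain Python; it is not the routine of any particular statistics library, the function name `fit_logistic` is hypothetical, and the grouped counts reuse the wine example (x = 1 for men, x = 0 for women) as an assumption.

```python
import math

# Grouped data: (x, number of subjects, number of subjects with Y = 1).
groups = [(1, 100, 90), (0, 80, 20)]


def fit_logistic(groups, iterations=25):
    """Newton-Raphson fit of logit P(Y = 1) = b0 + b1 * x on grouped data."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, n, y in groups:
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * mu * (1.0 - mu)      # binomial variance weight
            g0 += y - n * mu             # score component for b0
            g1 += x * (y - n * mu)       # score component for b1
            h00 += w                     # information matrix entries
            h01 += x * w
            h11 += x * x * w
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_logistic(groups)
print(math.exp(b1))  # estimated odds ratio for X
```

With further predictors in the model, the coefficient for $X$ would instead estimate the conditional odds ratio described above, which generally differs from the crude 2×2 value.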

Insensitivity to the type of sampling


If the data form a "population sample", then the cell probabilities $\hat{p}_{ij}$ are interpreted as the frequencies of each of the four groups in the population as defined by their $X$ and $Y$ values. In many settings it is impractical to obtain a population sample, so a selected sample is used. For example, we may choose to sample units with $X = 1$ with a given probability $f$, regardless of their frequency in the population (which would necessitate sampling units with $X = 0$ with probability $1 - f$). In this situation, our data would follow the following joint probabilities:

$$\begin{array}{c|cc} & Y=1 & Y=0 \\ \hline X=1 & \frac{f p_{11}}{p_{11}+p_{10}} & \frac{f p_{10}}{p_{11}+p_{10}} \\ X=0 & \frac{(1-f) p_{01}}{p_{01}+p_{00}} & \frac{(1-f) p_{00}}{p_{01}+p_{00}} \end{array}$$

The odds ratio $p_{11}p_{00}/(p_{01}p_{10})$ for this distribution does not depend on the value of $f$. This shows that the odds ratio (and consequently the log odds ratio) is invariant to non-random sampling based on one of the variables being studied. Note however that the standard error of the log odds ratio does depend on the value of $f$.[citation needed]

This fact is exploited in two important situations:

  • Suppose it is inconvenient or impractical to obtain a population sample, but it is practical to obtain a convenience sample of units with different $X$ values, such that within the $X = 0$ and $X = 1$ subsamples the $Y$ values are representative of the population (i.e. they follow the correct conditional probabilities).
  • Suppose the marginal distribution of one variable, say $X$, is very skewed. For example, if we are studying the relationship between high alcohol consumption and pancreatic cancer in the general population, the incidence of pancreatic cancer would be very low, so it would require a very large population sample to get a modest number of pancreatic cancer cases. However, we could use data from hospitals to contact most or all of their pancreatic cancer patients, and then randomly sample an equal number of subjects without pancreatic cancer (this is called a "case-control study").

In both these settings, the odds ratio can be calculated from the selected sample, without biasing the results relative to what would have been obtained for a population sample.
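The invariance to the sampling fraction can be illustrated numerically. In this Python sketch the population cell probabilities are an arbitrary illustrative choice, and the function names are assumptions.

```python
def sampled_distribution(p11, p10, p01, p00, f):
    """Joint probabilities when units with X = 1 are sampled with probability f."""
    row1 = p11 + p10
    row0 = p01 + p00
    return (
        f * p11 / row1,
        f * p10 / row1,
        (1 - f) * p01 / row0,
        (1 - f) * p00 / row0,
    )


def odds_ratio(p11, p10, p01, p00):
    """Cross-product odds ratio of a 2x2 joint distribution."""
    return (p11 * p00) / (p10 * p01)


population = (0.4, 0.1, 0.1, 0.4)  # illustrative cells with odds ratio 16

results = {f: odds_ratio(*sampled_distribution(*population, f)) for f in (0.1, 0.5, 0.9)}
print(results)  # the odds ratio is the same for every sampling fraction f
```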

Use in quantitative research


Due to the widespread use of logistic regression, the odds ratio is widely used in many fields of medical and social science research. The odds ratio is commonly used in survey research, in epidemiology, and to express the results of some clinical trials, such as in case-control studies. It is often abbreviated "OR" in reports. When data from multiple surveys are combined, it will often be expressed as "pooled OR".

Relation to relative risk

Risk ratio vs odds ratio

As explained in the "Motivating Example" section, the relative risk is usually better than the odds ratio for understanding the relation between risk and some variable such as radiation or a new drug. That section also explains that if the rare disease assumption holds, the odds ratio is a good approximation to relative risk[5] and that it has some advantages over relative risk. When the rare disease assumption does not hold, the unadjusted odds ratio will be farther from 1 than the relative risk (greater when both exceed 1, smaller when both are below 1),[6][7][8] but novel methods can easily use the same data to estimate the relative risk, risk differences, base probabilities, or other quantities.[9]

If the absolute risk in the unexposed group is available, conversion between the two is calculated by:[6]

$$\text{Relative risk} \approx \frac{\text{Odds ratio}}{1 - R_C + (R_C \times \text{Odds ratio})}$$

where $R_C$ is the absolute risk of the unexposed group.

If the rare disease assumption does not apply, the odds ratio may be very different from the relative risk and should not be interpreted as a relative risk.

Consider the death rate of men and women passengers when a ship sank.[3] Of 462 women, 154 died and 308 survived. Of 851 men, 709 died and 142 survived. Clearly a man on the ship was more likely to die than a woman, but how much more likely? Since over half the passengers died, the rare disease assumption is strongly violated.

To compute the odds ratio, note that for women the odds of dying were 1 to 2 (154/308). For men, the odds were 5 to 1 (709/142). The odds ratio is 9.99 (4.99/.5). Men had ten times the odds of dying as women.

For women, the probability of death was 33% (154/462). For men the probability was 83% (709/851). The relative risk of death is 2.5 (.83/.33). A man had 2.5 times a woman's probability of dying.
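The ship figures, and the conversion formula given earlier in this section, can be checked with a short Python sketch. Treating women as the reference ("unexposed") group, so that $R_C$ is the women's risk of death, is an assumption made for this illustration.

```python
women_died, women_survived = 154, 308
men_died, men_survived = 709, 142

risk_women = women_died / (women_died + women_survived)  # about 0.33
risk_men = men_died / (men_died + men_survived)          # about 0.83

relative_risk = risk_men / risk_women
odds_ratio = (men_died / men_survived) / (women_died / women_survived)

# Conversion from odds ratio to relative risk, with R_C = risk_women.
rr_from_or = odds_ratio / (1 - risk_women + risk_women * odds_ratio)

print(round(odds_ratio, 2), round(relative_risk, 2), round(rr_from_or, 2))
```

The converted value matches the directly computed relative risk, while the odds ratio itself is roughly four times larger.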

Confusion and exaggeration


Odds ratios have often been confused with relative risk in medical literature. For non-statisticians, the odds ratio is a difficult concept to comprehend, and it gives a more impressive figure for the effect.[10] However, most authors consider that the relative risk is readily understood.[11] In one study, members of a national disease foundation were actually 3.5 times more likely than nonmembers to have heard of a common treatment for that disease – but the odds ratio was 24 and the paper stated that members were ‘more than 20-fold more likely to have heard of’ the treatment.[12] A study of papers published in two journals reported that 26% of the articles that used an odds ratio interpreted it as a risk ratio.[13]

This may reflect the simple process of uncomprehending authors choosing the most impressive-looking and publishable figure.[11] But its use may in some cases be deliberately deceptive.[14] It has been suggested that the odds ratio should only be presented as a measure of effect size when the risk ratio cannot be estimated directly,[10] but with newly available methods it is always possible to estimate the risk ratio, which should generally be used instead.[9]

While relative risks are potentially easier to interpret for a general audience, there are mathematical and conceptual advantages when using an odds ratio instead of a relative risk, particularly in regression models. For that reason, there is not a consensus within the fields of epidemiology or biostatistics that relative risks or odds ratios should be preferred when both can be validly used, such as in clinical trials and cohort studies.[15]

Invertibility and invariance


The odds ratio has another unique property: it is directly mathematically invertible, whether the outcome is analyzed as disease survival or as disease onset incidence. The OR for survival is the reciprocal of the OR for risk, i.e., 1/OR. This is known as the 'invariance of the odds ratio'. In contrast, the relative risk does not possess this invertibility property when studying disease survival vs. onset incidence. This difference between OR invertibility and RR non-invertibility is best illustrated with an example:

Suppose in a clinical trial, one has an adverse event risk of 4/100 in the drug group and 2/100 in the placebo group, yielding RR = 2 and OR = 2.04166 for drug-vs-placebo adverse risk. However, if the analysis were inverted and adverse events were instead analyzed as event-free survival, then the drug group would have a rate of 96/100 and the placebo group a rate of 98/100, yielding a drug-vs-placebo RR = 0.9796 for survival, but an OR = 0.48979. As one can see, an RR of 0.9796 is clearly not the reciprocal of an RR of 2. In contrast, an OR of 0.48979 is indeed the direct reciprocal of an OR of 2.04166.

This is again what is called the 'invariance of the odds ratio', and why an RR for survival is not simply the reciprocal of an RR for risk, while the OR has this symmetrical property when analyzing either survival or adverse risk. The danger to clinical interpretation for the OR comes when the adverse event rate is not rare, so that the rare-disease assumption is not met and the OR exaggerates differences. On the other hand, when the disease is rare, using an RR for survival (e.g. the RR = 0.9796 from the example above) can conceal an important doubling of adverse risk associated with a drug or exposure.[citation needed]
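The clinical-trial figures above can be reproduced with a few lines of Python; the function names are illustrative.

```python
def relative_risk(p_a, p_b):
    """Ratio of two probabilities."""
    return p_a / p_b


def odds_ratio(p_a, p_b):
    """Ratio of the odds corresponding to two probabilities."""
    return (p_a / (1 - p_a)) / (p_b / (1 - p_b))


drug, placebo = 0.04, 0.02  # adverse event risks

rr_event = relative_risk(drug, placebo)
or_event = odds_ratio(drug, placebo)

# Same trial, analysed as event-free survival.
rr_survival = relative_risk(1 - drug, 1 - placebo)
or_survival = odds_ratio(1 - drug, 1 - placebo)

print(round(rr_event, 4), round(rr_survival, 4), round(rr_event * rr_survival, 4))
print(round(or_event, 5), round(or_survival, 5), round(or_event * or_survival, 4))
```

The product of the two odds ratios is 1, confirming that they are reciprocals, whereas the product of the two relative risks is close to 2.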

Estimators of the odds ratio


Sample odds ratio


The sample odds ratio $n_{11}n_{00}/(n_{10}n_{01})$ is easy to calculate, and for moderate and large samples performs well as an estimator of the population odds ratio. When one or more of the cells in the contingency table can have a small value, the sample odds ratio can be biased and exhibit high variance.

Alternative estimators


A number of alternative estimators of the odds ratio have been proposed to address limitations of the sample odds ratio. One alternative estimator is the conditional maximum likelihood estimator, which conditions on the row and column margins when forming the likelihood to maximize (as in Fisher's exact test).[16] Another alternative estimator is the Mantel–Haenszel estimator.[citation needed]

Numerical examples


The following four contingency tables contain observed cell counts, along with the corresponding sample odds ratio (OR) and sample log odds ratio (LOR):

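$$\begin{array}{c|cc} \text{OR} = 1,\ \text{LOR} = 0 & Y=1 & Y=0 \\ \hline X=1 & 10 & 10 \\ X=0 & 5 & 5 \end{array}$$

$$\begin{array}{c|cc} \text{OR} = 1,\ \text{LOR} = 0 & Y=1 & Y=0 \\ \hline X=1 & 100 & 100 \\ X=0 & 50 & 50 \end{array}$$

$$\begin{array}{c|cc} \text{OR} = 4,\ \text{LOR} = 1.39 & Y=1 & Y=0 \\ \hline X=1 & 20 & 10 \\ X=0 & 10 & 20 \end{array}$$

$$\begin{array}{c|cc} \text{OR} = 0.25,\ \text{LOR} = -1.39 & Y=1 & Y=0 \\ \hline X=1 & 10 & 20 \\ X=0 & 20 & 10 \end{array}$$

The following joint probability distributions contain the population cell probabilities, along with the corresponding population odds ratio (OR) and population log odds ratio (LOR):

$$\begin{array}{c|cc} \text{OR} = 1,\ \text{LOR} = 0 & Y=1 & Y=0 \\ \hline X=1 & 0.2 & 0.2 \\ X=0 & 0.3 & 0.3 \end{array}$$

$$\begin{array}{c|cc} \text{OR} = 1,\ \text{LOR} = 0 & Y=1 & Y=0 \\ \hline X=1 & 0.4 & 0.4 \\ X=0 & 0.1 & 0.1 \end{array}$$

$$\begin{array}{c|cc} \text{OR} = 16,\ \text{LOR} = 2.77 & Y=1 & Y=0 \\ \hline X=1 & 0.4 & 0.1 \\ X=0 & 0.1 & 0.4 \end{array}$$

$$\begin{array}{c|cc} \text{OR} = 0.67,\ \text{LOR} = -0.41 & Y=1 & Y=0 \\ \hline X=1 & 0.1 & 0.3 \\ X=0 & 0.2 & 0.4 \end{array}$$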

Numerical example

Example of risk reduction

| Quantity | Experimental group (E) | Control group (C) | Total |
|---|---|---|---|
| Events (E) | EE = 15 | CE = 100 | 115 |
| Non-events (N) | EN = 135 | CN = 150 | 285 |
| Total subjects (S) | ES = EE + EN = 150 | CS = CE + CN = 250 | 400 |
| Event rate (ER) | EER = EE / ES = 0.1, or 10% | CER = CE / CS = 0.4, or 40% | |

| Variable | Abbr. | Formula | Value |
|---|---|---|---|
| Absolute risk reduction | ARR | CER − EER | 0.3, or 30% |
| Number needed to treat | NNT | 1 / (CER − EER) | 3.33 |
| Relative risk (risk ratio) | RR | EER / CER | 0.25 |
| Relative risk reduction | RRR | (CER − EER) / CER, or 1 − RR | 0.75, or 75% |
| Preventable fraction among the unexposed | PFu | (CER − EER) / CER | 0.75 |
| Odds ratio | OR | (EE / EN) / (CE / CN) | 0.167 |
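The derived quantities in the table can be recomputed from the four cell counts with a short Python sketch; the variable names mirror the table's abbreviations.

```python
ee, en = 15, 135    # experimental group: events, non-events
ce, cn = 100, 150   # control group: events, non-events

eer = ee / (ee + en)              # experimental event rate
cer = ce / (ce + cn)              # control event rate

arr = cer - eer                   # absolute risk reduction
nnt = 1 / arr                     # number needed to treat
rr = eer / cer                    # relative risk
rrr = 1 - rr                      # relative risk reduction
odds_ratio = (ee / en) / (ce / cn)

print(round(arr, 3), round(nnt, 2), round(rr, 3), round(rrr, 3), round(odds_ratio, 3))
```

Here the odds ratio (about 0.167) is farther from 1 than the relative risk (0.25), as expected when the event is not rare in the control group.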

Related statistics


There are various other summary statistics for contingency tables that measure association between two events, such as Yule's Y and Yule's Q; these two are normalized so they are 0 for independent events, 1 for perfectly correlated, −1 for perfectly negatively correlated. A. W. F. Edwards studied these and argued that these measures of association must be functions of the odds ratio, which he referred to as the cross-ratio.[17]

Odds Ratio for a Matched Case-Control Study


A case-control study involves selecting representative samples of cases and controls who do, and do not, have some disease, respectively. These samples are usually independent of each other. The prior prevalence of exposure to some risk factor is observed in subjects from both samples. This permits the estimation of the odds ratio for disease in exposed vs. unexposed people as noted above.[18] Sometimes, however, it makes sense to match cases to controls on one or more confounding variables.[19] In this case, the prior exposure of interest is determined for each case and her/his matched control. The data can be summarized in the following table.

Matched 2 × 2 Table

$$\begin{array}{r|cc} \text{Case-control pairs} & \text{Control exposed} & \text{Control unexposed} \\ \hline \text{Case exposed} & n_{11} & n_{10} \\ \text{Case unexposed} & n_{01} & n_{00} \end{array}$$

This table gives the exposure status of the matched pairs of subjects. There are $n_{11}$ pairs where both the case and her/his matched control were exposed, $n_{10}$ pairs where the case patient was exposed but the control subject was not, $n_{01}$ pairs where the control subject was exposed but the case patient was not, and $n_{00}$ pairs where neither subject was exposed. The exposure of matched case and control pairs is correlated due to the similar values of their shared confounding variables.

The following derivation is due to Breslow & Day.[19] We consider each pair as belonging to a stratum with identical values of the confounding variables. Conditioned on belonging to the same stratum, the exposure status of cases and controls are independent of each other. For any case-control pair within the same stratum let

$p_1$ be the probability that a case patient is exposed,
$p_0$ be the probability that a control patient is exposed,
$q_1 = 1 - p_1$ be the probability that a case patient is not exposed, and
$q_0 = 1 - p_0$ be the probability that a control patient is not exposed.

Then the probability that a case is exposed and a control is not is $p_1 q_0$, and the probability that a control is exposed and a case is not is $p_0 q_1$. The within-stratum odds ratio for exposure in cases relative to controls is

$$\psi = (p_1/q_1)/(p_0/q_0) = p_1 q_0/(q_1 p_0)$$

We assume that $\psi$ is constant across strata.[19]

Now concordant pairs in which either both the case and the control are exposed, or neither are exposed tell us nothing about the odds of exposure in cases relative to the odds of exposure among controls. The probability that the case is exposed and the control is not given that the pair is discordant is

π=(p1q0)/(p1q0+q1p0)=ψ/(ψ+1){\displaystyle \pi =(p_{1}q_{0})/(p_{1}q_{0}+q_{1}p_{0})=\psi /(\psi +1)}

The distribution of $n_{10}$ given the number of discordant pairs is binomial, $n_{10} \sim B(n_{10} + n_{01}, \pi)$, and the maximum likelihood estimate of $\pi$ is

$$\hat{\pi} = n_{10}/(n_{10} + n_{01}) = \hat{\psi}/(\hat{\psi} + 1)$$

Multiplying both sides of this equation by $(n_{10} + n_{01})(\hat{\psi} + 1)$ and subtracting $n_{10}\hat{\psi}$ gives

$$n_{10} = \hat{\psi}(n_{10} + n_{01} - n_{10})$$ and hence
$$\hat{\psi} = n_{10}/n_{01}.$$
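
As an illustration of the estimate just derived, the sketch below computes the odds ratio from the two discordant-pair counts and checks that it agrees with the binomial proportion through the relation π = ψ/(ψ + 1). The function names and the counts (30 and 10) are hypothetical, chosen only for this example.

```python
def matched_pair_odds_ratio(n10: int, n01: int) -> float:
    """Conditional estimate of the odds ratio from discordant pairs.

    n10: pairs in which only the case was exposed
    n01: pairs in which only the control was exposed
    """
    if n10 < 0 or n01 < 0:
        raise ValueError("pair counts must be non-negative")
    if n01 == 0:
        raise ZeroDivisionError("the estimate is undefined when n01 is 0")
    return n10 / n01


def pi_from_psi(psi: float) -> float:
    """Probability that a discordant pair has the exposed case: psi / (psi + 1)."""
    return psi / (psi + 1.0)


# Hypothetical counts, for illustration only.
n10, n01 = 30, 10
psi_hat = matched_pair_odds_ratio(n10, n01)
pi_hat = n10 / (n10 + n01)

print(psi_hat)                       # 3.0
print(pi_hat, pi_from_psi(psi_hat))  # 0.75 0.75
```

Note that the concordant counts do not enter the calculation at all, which mirrors the argument that concordant pairs carry no information about the odds ratio.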

Now $\hat{\pi}$ is the maximum likelihood estimate of $\pi$, and $\psi$ is a monotonic function of $\pi$. It follows that $\hat{\psi}$ is the conditional maximum likelihood estimate of $\psi$ given the number of discordant pairs. Rothman et al.[20] give an alternate derivation of $\hat{\psi}$ by showing that it is a special case of the Mantel-Haenszel estimate of the intra-strata odds ratio for stratified 2x2 tables. They also reference Breslow & Day[19] as providing the derivation given here.

Under the null hypothesis that $\psi = 1$, we have $\pi = 1/(1+1) = 0.5$.

Hence, we can test the null hypothesis that $\psi = 1$ by testing the null hypothesis that $\pi = 0.5$. This is done using McNemar's test.
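
For reference, the uncorrected form of McNemar's statistic can be sketched from the binomial model above: under the null hypothesis, n10 has mean (n10 + n01)/2 and variance (n10 + n01)/4, and squaring the standardized count gives a statistic that is approximately chi-squared with one degree of freedom. A continuity-corrected variant also exists and is not shown here.

```latex
z = \frac{n_{10} - \tfrac{1}{2}(n_{10} + n_{01})}{\sqrt{\tfrac{1}{4}(n_{10} + n_{01})}}
  = \frac{n_{10} - n_{01}}{\sqrt{n_{10} + n_{01}}},
\qquad
\chi^2 = z^2 = \frac{(n_{10} - n_{01})^2}{n_{10} + n_{01}}.
```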

There are a number of ways to calculate a confidence interval for $\pi$. Let $\hat{\pi}_{LB}$ and $\hat{\pi}_{UB}$ denote the lower and upper bound of a confidence interval for $\pi$, respectively. Since $\psi = \pi/(1 - \pi)$, the corresponding confidence interval for $\psi$ is

$$\left(\frac{\hat{\pi}_{LB}}{1 - \hat{\pi}_{LB}},\ \frac{\hat{\pi}_{UB}}{1 - \hat{\pi}_{UB}}\right).$$

Matched 2x2 tables may also be analyzed using conditional logistic regression.[21] This technique has the advantage of allowing users to regress case-control status against multiple risk factors from matched case-control data.

Example

McEvoy et al.[22] studied the use of cell phones by drivers as a risk factor for automobile crashes in a case-crossover study.[18] All study subjects were involved in an automobile crash requiring hospital attendance. Each driver's cell phone use at the time of the crash was compared to the same driver's cell phone use in a control interval at the same time of day one week earlier. We would expect that a person's cell phone use at the time of the crash would be correlated with that person's use one week earlier. Comparing usage during the crash and control intervals adjusts for the driver's characteristics, the time of day, and the day of the week. The data can be summarized in the following table.

Case-control pairs | Phone used in control interval | Phone not used in control interval
Phone used in crash interval | 5 | 27
Phone not used in crash interval | 6 | 288

There were 5 drivers who used their phones in both intervals, 27 who used them in the crash but not the control interval, 6 who used them in the control but not the crash interval, and 288 who did not use them in either interval. The odds ratio for crashing while using their phone relative to driving when not using their phone was

$$\hat{\psi} = 27/6 = 4.5.$$

Testing the null hypothesis that $\psi = 1$ is the same as testing the null hypothesis that $\pi = 0.5$, given 27 out of 33 discordant pairs in which the driver was using the phone at the time of the crash. McNemar's $\chi^2 = 13.36$. This statistic has one degree of freedom and yields a P value of 0.0003. This allows us to reject the hypothesis that cell phone use has no effect on the risk of automobile crashes ($\psi = 1$) with a high level of statistical significance.
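
The test statistic and P value quoted above can be checked with a few lines of standard-library Python. This is a minimal sketch that assumes the uncorrected McNemar statistic (no continuity correction); the function name `mcnemar_test` is hypothetical, not taken from any library.

```python
import math


def mcnemar_test(n10: int, n01: int) -> tuple[float, float]:
    """Uncorrected McNemar chi-squared statistic and its P value (1 d.f.)."""
    n_discordant = n10 + n01
    if n_discordant == 0:
        raise ValueError("the test needs at least one discordant pair")
    chi2 = (n10 - n01) ** 2 / n_discordant
    # For one degree of freedom, the chi-squared upper tail equals
    # the two-sided normal tail: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Discordant pairs from the cell phone example: 27 and 6.
chi2, p_value = mcnemar_test(27, 6)
print(round(chi2, 2), round(p_value, 4))  # 13.36 0.0003
```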

Using Wilson's method, a 95% confidence interval for $\pi$ is (0.6561, 0.9139). Hence, a 95% confidence interval for $\psi$ is

$$\left(\frac{0.6561}{1 - 0.6561},\ \frac{0.9139}{1 - 0.9139}\right) = (1.9, 10.6)$$

(McEvoy et al.[22] analyzed their data using conditional logistic regression and obtained almost identical results to those given here. See the last row of Table 3 in their paper.)
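
The Wilson interval for π and its transformation into an interval for ψ can be reproduced as follows. This is a sketch under stated assumptions: it uses the standard Wilson score formula with the normal quantile z ≈ 1.959964 for a 95% level, and the helper names `wilson_interval` and `odds` are hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z defaults to the two-sided 95% normal quantile (an assumption of this sketch).
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


def odds(p: float) -> float:
    """Map a probability pi to the corresponding odds ratio psi = pi / (1 - pi)."""
    return p / (1.0 - p)


# 27 of the 33 discordant pairs had phone use in the crash interval only.
lower, upper = wilson_interval(27, 33)
print(round(lower, 4), round(upper, 4))              # 0.6561 0.9139
print(round(odds(lower), 1), round(odds(upper), 1))  # 1.9 10.6
```

Because π/(1 − π) is increasing in π, applying it to each endpoint preserves the coverage of the interval.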

References

  1. Szumilas M (August 2010). "Explaining Odds Ratios". Journal of the Canadian Academy of Child and Adolescent Psychiatry. 19 (3): 227–229. ISSN 1719-8429. PMC 2938757. PMID 20842279.
  2. LaMorte WW (May 13, 2013), Case-Control Studies, Boston University School of Public Health, archived from the original on 2013-10-08, retrieved 2013-09-02
  3. Simon S (July–August 2001). "Understanding the Odds Ratio and the Relative Risk". Journal of Andrology. 22 (4): 533–536. doi:10.1002/j.1939-4640.2001.tb02212.x. PMID 11451349. S2CID 6150799.
  4. Morris JA, Gardner MJ (May 1988). "Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates". British Medical Journal (Clinical Research Ed.). 296 (6632): 1313–6. doi:10.1136/bmj.296.6632.1313. PMC 2545775. PMID 3133061.
  5. Viera AJ (July 2008). "Odds ratios and risk ratios: what's the difference and why does it matter?". Southern Medical Journal. 101 (7): 730–4. doi:10.1097/SMJ.0b013e31817a7ee4. PMID 18580722.
  6. Zhang J, Yu KF (November 1998). "What's the relative risk? A method of correcting the odds ratio in cohort studies of common outcomes". JAMA. 280 (19): 1690–1. doi:10.1001/jama.280.19.1690. PMID 9832001. S2CID 30509187.
  7. Robbins AS, Chao SY, Fonseca VP (October 2002). "What's the relative risk? A method to directly estimate risk ratios in cohort studies of common outcomes". Annals of Epidemiology. 12 (7): 452–4. doi:10.1016/S1047-2797(01)00278-2. PMID 12377421.
  8. Nurminen M (August 1995). "To use or not to use the odds ratio in epidemiologic analyses?". European Journal of Epidemiology. 11 (4): 365–71. doi:10.1007/BF01721219. PMID 8549701. S2CID 11609059.
  9. King G, Zeng L (2002-05-30). "Estimating risk and rate levels, ratios and differences in case-control studies" (PDF). Statistics in Medicine. 21 (10): 1409–1427. doi:10.1002/sim.1032. ISSN 0277-6715. PMID 12185893. S2CID 11387977.
  10. Taeger D, Sun Y, Straif K (10 August 1998). "On the use, misuse and interpretation of odds ratios". The BMJ.
  11. A'Court C, Stevens R, Heneghan C (March 2012). "Against all odds? Improving the understanding of risk reporting". The British Journal of General Practice. 62 (596): e220-3. doi:10.3399/bjgp12X630223. PMC 3289830. PMID 22429441.
  12. Nijsten T, Rolstad T, Feldman SR, Stern RS (January 2005). "Members of the national psoriasis foundation: more extensive disease and better informed about treatment options". Archives of Dermatology. 141 (1): 19–26. doi:10.1001/archderm.141.1.19. PMID 15655138.
  13. Holcomb W (2001). "An odd measure of risk: Use and misuse of the odds ratio". Obstetrics & Gynecology. 98 (4): 685–688. doi:10.1016/S0029-7844(01)01488-0. PMID 11576589. S2CID 44782438.
  14. Taylor HG (January 1975). "Social perception of the mentally retarded". Journal of Clinical Psychology. 31 (1): 100–2. doi:10.1136/bmj.316.7136.989. PMC 1112884. PMID 9550961.
  15. Wells GA (2022). "Commentary on controversy and debate 4 paper series: Questionable utility of the relative risk in clinical research". Journal of Clinical Epidemiology. 142: 268–270. doi:10.1016/j.jclinepi.2021.09.016. PMID 34560254.
  16. Rothman KJ, Greenland S, Lash TL (2008). Modern Epidemiology. Lippincott Williams & Wilkins. ISBN 978-0-7817-5564-1. [page needed]
  17. Edwards AW (1963). "The Measure of Association in a 2 × 2 Table". Journal of the Royal Statistical Society. A (General). 126 (1): 109–114. doi:10.2307/2982448. JSTOR 2982448.
  18. Celentano DD, Szklo M, Gordis L (2019). Gordis Epidemiology, Sixth Edition. Philadelphia, PA: Elsevier. p. 149-177.
  19. Breslow NE, Day NE (1980). Statistical Methods in Cancer Research: Vol. 1 - The Analysis of Case-Control Studies. Lyon, France: IARC Scientific Publications. p. 162-189.
  20. Rothman KJ, Greenland S, Lash TL (2008). Modern Epidemiology, Third Edition. Philadelphia, PA: Lippincott Williams & Wilkins. p. 287,288.
  21. Breslow NE, Day NE, Halvorsen KT, Prentice RL, Sabai C (1978). "Estimation of multiple relative risk functions in matched case-control studies". Am J Epidemiol. 108 (4): 299–307. doi:10.1093/oxfordjournals.aje.a112623. PMID 727199.
  22. McEvoy SP, Stevenson MR, McCartt AT, Woodward M, Haworth C, Palamara P, Cercarelli R (2005). "Role of mobile phones in motor vehicle crashes resulting in hospital attendance: a case-crossover study". BMJ. 331 (7514): 428. doi:10.1136/bmj.38537.397512.55. PMC 1188107. PMID 16012176.
