The probability distribution of the number X of Bernoulli trials needed to get one success, supported on {1, 2, 3, ...};
The probability distribution of the number Y = X − 1 of failures before the first success, supported on {0, 1, 2, ...}.
These two different geometric distributions should not be confused with each other. Often, the name shifted geometric distribution is adopted for the former one (the distribution of X); however, to avoid ambiguity, it is considered wise to indicate which is intended by mentioning the support explicitly.
The geometric distribution gives the probability that the first occurrence of success requires k independent trials, each with success probability p. If the probability of success on each trial is p, then the probability that the k-th trial is the first success is
Pr(X = k) = (1 − p)^(k−1) p
for k = 1, 2, 3, 4, ...
The above form of the geometric distribution is used for modeling the number of trials up to and including the first success. By contrast, the following form of the geometric distribution is used for modeling the number of failures until the first success:
Pr(Y = k) = Pr(X = k + 1) = (1 − p)^k p
for k = 0, 1, 2, 3, ...
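The two probability mass functions can be evaluated side by side in a short Python sketch (function names here are illustrative, not from any particular library):

```python
def pmf_trials(k: int, p: float) -> float:
    # Pr(X = k): the first success occurs on trial k, for k = 1, 2, 3, ...
    return (1 - p) ** (k - 1) * p

def pmf_failures(k: int, p: float) -> float:
    # Pr(Y = k): k failures occur before the first success, for k = 0, 1, 2, ...
    return (1 - p) ** k * p

p = 0.25
# The two conventions differ only by a shift: Pr(X = k) = Pr(Y = k - 1).
print(pmf_trials(3, p), pmf_failures(2, p))  # 0.140625 0.140625
```

Summing either mass function over its support gives 1, as a probability distribution requires.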
The geometric distribution gets its name because its probabilities follow a geometric sequence. It is sometimes called the Furry distribution after Wendell H. Furry.[1]: 210
The support may also be {0, 1, 2, ...}, defining Y = X − 1. This alters the probability mass function into Pr(Y = k) = (1 − p)^k p, where k is the number of failures before the first success.[3]: 66
An alternative parameterization of the distribution gives the probability mass function Pr(Y = y) = (β/(1 + β))^y · 1/(1 + β), where β = (1 − p)/p and y ∈ {0, 1, 2, ...}.[1]: 208–209
An example of a geometric distribution arises from rolling a six-sided die until a "1" appears. Each roll is independent with a 1/6 chance of success. The number of rolls needed follows a geometric distribution with p = 1/6.
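The die example can be simulated directly; a minimal sketch using the standard library:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def rolls_until_one() -> int:
    # Roll a fair six-sided die until a "1" appears; return the number of rolls.
    n = 0
    while True:
        n += 1
        if random.randint(1, 6) == 1:
            return n

samples = [rolls_until_one() for _ in range(100_000)]
mean_rolls = sum(samples) / len(samples)
print(mean_rolls)  # close to 1/p = 6
```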
The geometric distribution is the only memoryless discrete probability distribution.[4] It is the discrete version of the same property found in the exponential distribution.[1]: 228 The property asserts that the number of previously failed trials does not affect the number of future trials needed for a success.
Because there are two definitions of the geometric distribution, there are also two definitions of memorylessness for discrete random variables.[5] Expressed in terms of conditional probability, the two definitions are
Pr(X > m + n | X > n) = Pr(X > m)
and
Pr(Y > m + n | Y ≥ n) = Pr(Y > m),
where m and n are natural numbers, X is a geometrically distributed random variable defined over {1, 2, 3, ...}, and Y is a geometrically distributed random variable defined over {0, 1, 2, ...}. Note that these definitions are not equivalent for discrete random variables; Y does not satisfy the first equation and X does not satisfy the second.
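Memorylessness for the trials-until-success form can be checked numerically from the closed-form tail probability Pr(X > n) = (1 − p)^n; a small sketch:

```python
p = 0.3

def tail(n: int) -> float:
    # Pr(X > n) = (1 - p)^n for X supported on {1, 2, 3, ...}
    return (1 - p) ** n

m, n = 4, 7
lhs = tail(m + n) / tail(n)  # Pr(X > m + n | X > n)
rhs = tail(m)                # Pr(X > m)
print(lhs, rhs)              # equal up to floating-point rounding
```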
The expected value and variance of a geometrically distributed random variable X defined over {1, 2, 3, ...} are[2]: 261
E[X] = 1/p,  Var[X] = (1 − p)/p².
With a geometrically distributed random variable Y defined over {0, 1, 2, ...}, the expected value changes into E[Y] = (1 − p)/p, while the variance stays the same.[6]: 114–115
For example, when rolling a six-sided die until landing on a "1", the average number of rolls needed is 1/(1/6) = 6 and the average number of failures is (1 − 1/6)/(1/6) = 5.
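These formulas are simple enough to evaluate directly for the die example:

```python
p = 1 / 6
mean_trials = 1 / p            # expected rolls until the first "1"
mean_failures = (1 - p) / p    # expected non-"1" rolls before it
variance = (1 - p) / p ** 2    # same under both conventions
print(mean_trials, mean_failures, variance)  # ≈ 6, 5, 30
```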
The moment generating function of the geometric distribution when defined over {1, 2, 3, ...} and {0, 1, 2, ...} respectively is[7][6]: 114
M_X(t) = p e^t / (1 − (1 − p) e^t),  M_Y(t) = p / (1 − (1 − p) e^t),  for t < −ln(1 − p).
The moments for the number of failures before the first success are given by
E[Y^n] = Σ_{k=0}^∞ (1 − p)^k p · k^n = p Li_{−n}(1 − p)  (for n ≥ 1),
where Li_{−n} is the polylogarithm function.
Consider the expected value E[X] as above, i.e. the average number of trials until a success. The first trial either succeeds with probability p, or fails with probability 1 − p. If it fails, the remaining mean number of trials until a success is identical to the original mean; this follows from the fact that all trials are independent.
From this we get the formula:
E[X] = p · 1 + (1 − p)(1 + E[X]),
which, when solved for E[X], gives:
E[X] = 1/p.
The expected number of failures can be found from the linearity of expectation, E[Y] = E[X − 1] = E[X] − 1 = 1/p − 1 = (1 − p)/p. It can also be shown in the following way, writing q = 1 − p:
E[Y] = Σ_{k=0}^∞ k q^k p = p q Σ_{k=1}^∞ k q^(k−1) = p q · d/dq ( Σ_{k=0}^∞ q^k ) = p q · d/dq ( 1/(1 − q) ) = p q / (1 − q)² = (1 − p)/p.
The interchange of summation and differentiation is justified by the fact that convergent power series converge uniformly on compact subsets of the set of points where they converge.
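The series value can also be checked numerically by truncating the sum, since the terms decay geometrically; a small sketch:

```python
p = 0.4
q = 1 - p
# Truncated series for E[Y] = sum_{k >= 0} k * q^k * p; for this p the terms
# beyond k = 1000 are far below floating-point precision.
expected_failures = sum(k * q ** k * p for k in range(1001))
print(expected_failures, (1 - p) / p)  # both ≈ 1.5
```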
The mean of the geometric distribution is its expected value which is, as previously discussed in § Moments and cumulants, 1/p or (1 − p)/p when defined over {1, 2, 3, ...} or {0, 1, 2, ...} respectively.
The median of the geometric distribution is ⌈−1/log₂(1 − p)⌉ when defined over {1, 2, 3, ...}[9] and ⌈−1/log₂(1 − p)⌉ − 1 when defined over {0, 1, 2, ...}.[3]: 69
The mode of the geometric distribution is the first value in the support set. This is 1 when defined over {1, 2, 3, ...} and 0 when defined over {0, 1, 2, ...}.[3]: 69
The skewness of the geometric distribution is (2 − p)/√(1 − p).[6]: 115
The kurtosis of the geometric distribution is 9 + p²/(1 − p).[6]: 115 The excess kurtosis of a distribution is the difference between its kurtosis and the kurtosis of a normal distribution, 3.[10]: 217 Therefore, the excess kurtosis of the geometric distribution is 6 + p²/(1 − p). Since p²/(1 − p) > 0 for 0 < p < 1, the excess kurtosis is always positive so the distribution is leptokurtic.[3]: 69 In other words, the tail of a geometric distribution decays faster than a Gaussian.[10]: 217
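The closed-form skewness can be cross-checked against the series definition E[(Y − μ)³]/σ³, again by truncating the rapidly decaying sum; a sketch:

```python
import math

p = 0.3
q = 1 - p
mu = q / p                 # mean of Y (failures before first success)
var = q / p ** 2           # variance
skew_formula = (2 - p) / math.sqrt(q)
# Series definition of skewness: third central moment over sigma^3.
m3 = sum((k - mu) ** 3 * q ** k * p for k in range(600))
skew_series = m3 / var ** 1.5
print(skew_formula, skew_series)  # agree to floating-point precision
```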
Entropy is a measure of uncertainty in a probability distribution. For the geometric distribution that models the number of failures before the first success, the probability mass function is:
Pr(Y = k) = (1 − p)^k p,  k = 0, 1, 2, ...
The entropy H(Y) for this distribution is defined as:
H(Y) = −Σ_{k=0}^∞ (1 − p)^k p log₂((1 − p)^k p) = [ −(1 − p) log₂(1 − p) − p log₂ p ] / p.
The entropy increases as the probability p decreases, reflecting greater uncertainty as success becomes rarer.
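The closed-form entropy H(Y) = [−(1 − p) log₂(1 − p) − p log₂ p]/p is easy to evaluate; a minimal sketch:

```python
import math

def geometric_entropy(p: float) -> float:
    # Shannon entropy (in bits) of the failures-before-first-success form.
    q = 1 - p
    return (-q * math.log2(q) - p * math.log2(p)) / p

print(geometric_entropy(0.5))  # 2.0 bits
print(geometric_entropy(0.1))  # larger: rarer success means more uncertainty
```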
Fisher's information (geometric distribution, failures before success)
Fisher information measures the amount of information that an observable random variable carries about an unknown parameter. For the geometric distribution (failures before the first success), the Fisher information with respect to p is given by:
I(p) = 1 / (p²(1 − p)).
Proof:
The likelihood function for a geometric random variable Y with observed value k is:
L(p; k) = (1 − p)^k p.
The log-likelihood function is:
ℓ(p; k) = k ln(1 − p) + ln p.
The score function (first derivative of the log-likelihood w.r.t. p) is:
∂ℓ/∂p = −k/(1 − p) + 1/p.
The second derivative of the log-likelihood function is:
∂²ℓ/∂p² = −k/(1 − p)² − 1/p².
Fisher information is calculated as the negative expected value of the second derivative:
I(p) = −E[∂²ℓ/∂p²] = E[Y]/(1 − p)² + 1/p² = ((1 − p)/p)/(1 − p)² + 1/p² = 1/(p(1 − p)) + 1/p² = 1/(p²(1 − p)).
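The final expectation step can be verified numerically by computing −E[∂²ℓ/∂p²] as a truncated series; a sketch:

```python
p = 0.2
q = 1 - p
fisher_closed = 1 / (p ** 2 * q)
# Direct expectation of -d^2 l/dp^2 = k/q^2 + 1/p^2 under Pr(Y = k) = q^k p.
fisher_expect = sum((k / q ** 2 + 1 / p ** 2) * q ** k * p for k in range(2000))
print(fisher_closed, fisher_expect)  # both ≈ 31.25
```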
Fisher information increases as p decreases, indicating that rarer successes provide more information about the parameter.
Entropy (geometric distribution, trials until success)
Since X = Y + 1 is a deterministic shift of Y, and shifting a distribution does not change its entropy, the trials-until-success form has the same entropy: H(X) = H(Y) = [ −(1 − p) log₂(1 − p) − p log₂ p ] / p.
The geometric distribution defined on {0, 1, 2, ...} is infinitely divisible, that is, for any positive integer n, there exist n independent identically distributed random variables whose sum follows the same geometric distribution. This is because the negative binomial distribution can be derived from a Poisson-stopped sum of logarithmic random variables.[11]: 606–607
The decimal digits of the geometrically distributed random variable Y are a sequence of independent (and not identically distributed) random variables.[citation needed] For example, the hundreds digit D has this probability distribution:
Pr(D = d) = q^(100d) / (1 + q^100 + q^200 + ⋯ + q^900),  d = 0, 1, ..., 9,
where q = 1 − p, and similarly for the other digits, and, more generally, similarly for numeral systems with other bases than 10. When the base is 2, this shows that a geometrically distributed random variable can be written as a sum of independent random variables whose probability distributions are indecomposable.
The sum of r independent geometric random variables with parameter p is a negative binomial random variable with parameters r and p.[14] The geometric distribution is a special case of the negative binomial distribution, with r = 1.
The minimum of n independent geometric random variables with parameters p₁, ..., pₙ is also geometrically distributed, with parameter 1 − ∏ᵢ₌₁ⁿ (1 − pᵢ).[15]
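The minimum property can be checked by simulation; a sketch comparing the sample mean of the minimum against 1/p for the predicted parameter:

```python
import random

random.seed(1)

def geom_trials(p: float) -> int:
    # Sample X on {1, 2, 3, ...} by simulating Bernoulli trials directly.
    n = 1
    while random.random() >= p:
        n += 1
    return n

p1, p2 = 0.2, 0.3
p_min = 1 - (1 - p1) * (1 - p2)   # predicted parameter of the minimum: 0.44
samples = [min(geom_trials(p1), geom_trials(p2)) for _ in range(200_000)]
mean_min = sum(samples) / len(samples)
print(mean_min, 1 / p_min)  # both ≈ 2.27
```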
Suppose 0 < r < 1, and for k = 1, 2, 3, ... the random variable Xₖ has a Poisson distribution with expected value r^k / k. Then Σ_{k=1}^∞ k·Xₖ has a geometric distribution taking values in {0, 1, 2, ...}, with expected value r/(1 − r).[citation needed]
The exponential distribution is the continuous analogue of the geometric distribution. Applying the floor function to the exponential distribution with parameter λ creates a geometric distribution with parameter p = 1 − e^(−λ) defined over {0, 1, 2, ...}.[3]: 74 This can be used to generate geometrically distributed random numbers as detailed in § Random variate generation.
If p = 1/n and X is geometrically distributed with parameter p, then the distribution of X/n approaches an exponential distribution with expected value 1 as n → ∞, since
Pr(X/n > a) = Pr(X > na) = (1 − p)^⌊na⌋ = (1 − 1/n)^⌊na⌋ → e^(−a)  as n → ∞.
More generally, if p = λ/n, where λ is a parameter, then as n → ∞ the distribution of X/n approaches an exponential distribution with rate λ:
Pr(X/n > a) = (1 − λ/n)^⌊na⌋ → e^(−λa),
therefore the distribution function of X/n converges to 1 − e^(−λx), which is that of an exponential random variable.[citation needed]
Provided they exist, the first l moments of a probability distribution can be estimated from a sample x₁, ..., xₙ using the formula
mᵢ = (1/n) Σⱼ₌₁ⁿ xⱼⁱ,
where mᵢ is the i-th sample moment and 1 ≤ i ≤ l.[16]: 349–350 Estimating E[X] with m₁ gives the sample mean, denoted x̄. Substituting this estimate in the formula for the expected value of a geometric distribution and solving for p gives the estimators p̂ = 1/x̄ and p̂ = 1/(x̄ + 1) when supported on {1, 2, 3, ...} and {0, 1, 2, ...} respectively. These estimators are biased since E[1/x̄] > 1/E[x̄] = p as a result of Jensen's inequality.[17]: 53–54
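The method-of-moments estimator for the trials-until-success form is just the reciprocal of the sample mean; a simulation sketch:

```python
import random

random.seed(7)

def geom_trials(p: float) -> int:
    # Sample X on {1, 2, 3, ...} by simulating Bernoulli trials directly.
    n = 1
    while random.random() >= p:
        n += 1
    return n

p_true = 0.4
data = [geom_trials(p_true) for _ in range(50_000)]
sample_mean = sum(data) / len(data)
p_hat = 1 / sample_mean   # estimator for support {1, 2, 3, ...}
print(p_hat)              # ≈ 0.4 (slightly biased upward, per Jensen's inequality)
```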
In Bayesian inference, the parameter p is a random variable from a prior distribution with a posterior distribution calculated using Bayes' theorem after observing samples.[17]: 167 If a beta distribution is chosen as the prior distribution, then the posterior will also be a beta distribution, and it is called the conjugate distribution. In particular, if a Beta(α, β) prior is selected, then the posterior, after observing samples k₁, ..., kₙ ∈ {1, 2, 3, ...}, is[19]
Beta(α + n, β + Σᵢ₌₁ⁿ (kᵢ − 1)).
Alternatively, if the samples are in {0, 1, 2, ...}, the posterior distribution is[20]
Beta(α + n, β + Σᵢ₌₁ⁿ kᵢ).
Since the expected value of a Beta(α, β) distribution is α/(α + β),[11]: 145 as α and β approach zero, the posterior mean approaches its maximum likelihood estimate.
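The conjugate update is a one-line computation; a sketch with a uniform Beta(1, 1) prior and hypothetical data (both are illustrative choices, not from the source):

```python
# Conjugate update for trials-until-success samples k_i in {1, 2, 3, ...}:
# prior Beta(a, b)  ->  posterior Beta(a + n, b + sum(k_i - 1)).
a, b = 1.0, 1.0              # uniform Beta(1, 1) prior (an arbitrary choice)
data = [3, 1, 5, 2, 2]       # hypothetical observed trial counts
n = len(data)
a_post = a + n
b_post = b + sum(k - 1 for k in data)
posterior_mean = a_post / (a_post + b_post)
print(a_post, b_post, posterior_mean)  # 6.0 9.0 0.4
```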
The geometric distribution can be generated experimentally from i.i.d. standard uniform random variables by finding the first such random variable to be less than or equal to p. However, the number of random variables needed is also geometrically distributed and the algorithm slows as p decreases.[21]: 498
Random generation can be done in constant time by truncating exponential random numbers. An exponential random variable E can become geometrically distributed with parameter p through ⌈E/(−ln(1 − p))⌉, taking values in {1, 2, 3, ...}. In turn, E can be generated from a standard uniform random variable U, altering the formula into ⌈ln(U)/ln(1 − p)⌉.[21]: 499–500 [22]
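The constant-time inversion method described above can be sketched as follows (the guard against u = 0 is a practical detail, since random.random() returns values in [0, 1)):

```python
import math
import random

random.seed(42)

def geometric_variate(p: float) -> int:
    # Constant-time sampler on {1, 2, 3, ...}: invert a standard uniform,
    # equivalent to truncating an exponential with rate -ln(1 - p).
    u = random.random()
    while u == 0.0:          # guard: log(0) is undefined
        u = random.random()
    return math.ceil(math.log(u) / math.log(1 - p))

samples = [geometric_variate(0.25) for _ in range(100_000)]
mean_sample = sum(samples) / len(samples)
print(mean_sample)  # ≈ 1/p = 4
```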
The geometric distribution is used in many disciplines. In queueing theory, the M/M/1 queue has a steady state following a geometric distribution.[23] In stochastic processes, the Yule–Furry process is geometrically distributed.[24] The distribution also arises when modeling the lifetime of a device in discrete contexts.[25] It has also been used to fit data including modeling patients spreading COVID-19.[26]