Algebraic statistics is the use of algebra to advance statistics. Algebra has been useful for experimental design, parameter estimation, and hypothesis testing.
Traditionally, algebraic statistics has been associated with the design of experiments and multivariate analysis (especially time series). In recent years, the term "algebraic statistics" has sometimes been used in a narrower sense, to label the use of algebraic geometry and commutative algebra in statistics.
In the past, statisticians have used algebra to advance research in statistics. Some work in algebraic statistics led to the development of new topics in algebra and combinatorics, such as association schemes.
For example, Ronald A. Fisher, Henry B. Mann, and Rosemary A. Bailey applied Abelian groups to the design of experiments. Experimental designs were also studied with affine geometry over finite fields and then with the introduction of association schemes by R. C. Bose. Orthogonal arrays were introduced by C. R. Rao, also for experimental designs.
Invariant measures on locally compact groups have long been used in statistical theory, particularly in multivariate analysis. Beurling's factorization theorem and much of the work on (abstract) harmonic analysis sought better understanding of the Wold decomposition of stationary stochastic processes, which is important in time series statistics.
Encompassing previous results on probability theory on algebraic structures, Ulf Grenander developed a theory of "abstract inference". Grenander's abstract inference and his theory of patterns are useful for spatial statistics and image analysis; these theories rely on lattice theory.
Partially ordered vector spaces and vector lattices are used throughout statistical theory. Garrett Birkhoff metrized the positive cone using Hilbert's projective metric and proved Jentzsch's theorem using the contraction mapping theorem.[1] Birkhoff's results have been used for maximum entropy estimation (which can be viewed as linear programming in infinite dimensions) by Jonathan Borwein and colleagues.
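The contraction idea can be illustrated numerically in finite dimensions. The following is a minimal sketch, not Birkhoff's proof: it assumes the standard formula for Hilbert's projective metric on strictly positive vectors, d(x, y) = log(max_i(x_i/y_i) / min_i(x_i/y_i)), and checks on one example that a matrix with strictly positive entries brings two positive vectors closer together in this metric. The function names and the matrix `A` are hypothetical, chosen only for illustration.

```python
from math import log

def hilbert_metric(x, y):
    """Hilbert's projective metric between two strictly positive vectors."""
    ratios = [xi / yi for xi, yi in zip(x, y)]
    return log(max(ratios) / min(ratios))

def apply(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Hypothetical example: a matrix with strictly positive entries
# and two positive vectors that are not multiples of each other.
A = [[2.0, 1.0],
     [1.0, 3.0]]
x = [1.0, 5.0]
y = [4.0, 1.0]

before = hilbert_metric(x, y)
after = hilbert_metric(apply(A, x), apply(A, y))

# The metric ignores positive rescaling (it is a metric on rays of the cone),
# and the positive matrix shrinks the distance between the two rays.
print(before, after, after < before)
```

Because the distance shrinks under each application of `A`, repeated application drives all positive vectors toward a single ray, which is the mechanism behind applying the contraction mapping theorem in this setting.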
Vector lattices and conical measures were introduced into statistical decision theory by Lucien Le Cam.
In recent years, the term "algebraic statistics" has been used more restrictively, to label the use of algebraic geometry and commutative algebra to study problems related to discrete random variables with finite state spaces. Commutative algebra and algebraic geometry have applications in statistics because many commonly used classes of discrete random variables can be viewed as algebraic varieties.
Consider a random variable $X$ which can take on the values 0, 1, 2. Such a variable is completely characterized by the three probabilities

$$p_i = \Pr(X = i), \quad i = 0, 1, 2,$$

and these numbers satisfy

$$\sum_{i=0}^{2} p_i = 1 \quad \text{and} \quad 0 \le p_i \le 1.$$

Conversely, any three such numbers unambiguously specify a random variable, so we can identify the random variable $X$ with the tuple $(p_0, p_1, p_2) \in \mathbb{R}^3$.

Now suppose $X$ is a binomial random variable with parameter $q$ and $n = 2$, i.e. $X$ represents the number of successes when repeating a certain experiment two times, where each experiment has an individual success probability of $q$. Then

$$p_i = \Pr(X = i) = \binom{2}{i} q^i (1 - q)^{2 - i},$$

so that $p_0 = (1 - q)^2$, $p_1 = 2q(1 - q)$ and $p_2 = q^2$, and it is not hard to show that the tuples $(p_0, p_1, p_2)$ which arise in this way are precisely the ones satisfying

$$4 p_0 p_2 - p_1^2 = 0.$$

The latter is a polynomial equation defining an algebraic variety (or surface) in $\mathbb{R}^3$, and this variety, when intersected with the simplex given by

$$\sum_{i=0}^{2} p_i = 1 \quad \text{and} \quad 0 \le p_i \le 1,$$

yields a piece of an algebraic curve which may be identified with the set of all binomial random variables with $n = 2$. Determining the parameter $q$ amounts to locating one point on this curve; testing the hypothesis that a given variable $X$ is binomial amounts to testing whether a certain point lies on that curve or not.
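The correspondence between binomial variables with n = 2 and points on the curve can be checked numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the tolerance `tol` is an arbitrary choice, and the membership check is an exact geometric test on a known probability vector. It is not a statistical hypothesis test on sampled data, which would also have to account for sampling error.

```python
from math import comb, isclose

def binomial2_point(q):
    """Return (p0, p1, p2) for a binomial variable with n = 2 and success probability q."""
    return tuple(comb(2, i) * q**i * (1 - q)**(2 - i) for i in range(3))

def on_simplex(p, tol=1e-9):
    """Check that the entries lie in [0, 1] and sum to 1, up to tol."""
    return all(-tol <= pi <= 1 + tol for pi in p) and isclose(sum(p), 1.0, abs_tol=tol)

def invariant(p):
    """Evaluate the polynomial 4*p0*p2 - p1**2, which vanishes on the variety."""
    p0, p1, p2 = p
    return 4 * p0 * p2 - p1**2

def is_binomial2(p, tol=1e-9):
    """True if p lies on the curve: inside the simplex and on the variety."""
    return on_simplex(p, tol) and abs(invariant(p)) <= tol

def recover_q(p):
    """Locate the point on the curve: E[X] = p1 + 2*p2 = 2q, so q = p1/2 + p2."""
    p0, p1, p2 = p
    return p1 / 2 + p2

p = binomial2_point(0.3)
print(is_binomial2(p), recover_q(p))      # on the curve; q is recovered
print(is_binomial2((1/3, 1/3, 1/3)))      # uniform distribution: not on the curve
print(is_binomial2((4.0, -4.0, 1.0)))     # on the variety, but outside the simplex
```

The last example shows why the intersection with the simplex matters: the tuple (4, -4, 1) satisfies the polynomial equation and sums to 1, yet it is not a probability distribution.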
Algebraic geometry has also recently found applications to statistical learning theory, including a generalization of the Akaike information criterion to singular statistical models.[2]