
Theil index

From Wikipedia, the free encyclopedia
Index to measure economic inequality

The Theil index is a statistic primarily used to measure economic inequality[1] and other economic phenomena, though it has also been used to measure racial segregation.[2][3] The Theil index $T_T$ is the same as redundancy in information theory, which is the maximum possible entropy of the data minus the observed entropy. It is a special case of the generalized entropy index. It can be viewed as a measure of redundancy, lack of diversity, isolation, segregation, inequality, non-randomness, and compressibility. It was proposed by the Dutch econometrician Henri Theil (1924–2000) at the Erasmus University Rotterdam.[3]

Henri Theil himself said (1967): "The (Theil) index can be interpreted as the expected information content of the indirect message which transforms the population shares as prior probabilities into the income shares as posterior probabilities."[4] Amartya Sen noted, "But the fact remains that the Theil index is an arbitrary formula, and the average of the logarithms of the reciprocals of income shares weighted by income is not a measure that is exactly overflowing with intuitive sense."[4]

Formula


For a population of $N$ "agents", each with characteristic $x$, the situation may be represented by the list $x_i$ ($i = 1, \ldots, N$), where $x_i$ is the characteristic of agent $i$. For example, if the characteristic is income, then $x_i$ is the income of agent $i$.

The Theil T index is defined as[5]

$$T_T = T_{\alpha=1} = \frac{1}{N}\sum_{i=1}^{N}\frac{x_i}{\mu}\ln\left(\frac{x_i}{\mu}\right)$$

and the Theil L index is defined as[5]

$$T_L = T_{\alpha=0} = \frac{1}{N}\sum_{i=1}^{N}\ln\left(\frac{\mu}{x_i}\right)$$

where $\mu$ is the mean income:

$$\mu = \frac{1}{N}\sum_{i=1}^{N}x_i$$
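
As a minimal sketch of the two definitions above, the following Python functions compute Theil T and Theil L for a list of incomes. The function names and the sample incomes are illustrative assumptions. Treating a zero income as contributing 0 to Theil T (the limit of x ln x as x approaches 0) is a convention assumed here rather than something stated in the formulas.

```python
import math

def theil_t(incomes):
    """Theil T: mean of (x/mu) * ln(x/mu). A zero income contributes 0 (assumed convention)."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum((x / mu) * math.log(x / mu) for x in incomes if x > 0) / n

def theil_l(incomes):
    """Theil L: mean of ln(mu/x). Requires every income to be strictly positive."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(math.log(mu / x) for x in incomes) / n

incomes = [10.0, 20.0, 30.0, 40.0]   # made-up sample data
print("Theil T:", theil_t(incomes))
print("Theil L:", theil_l(incomes))
print("Equal incomes:", theil_t([25.0] * 4), theil_l([25.0] * 4))
```

With equal incomes every ratio x/mu is 1, so both functions return 0.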

Theil L is an income distribution's dis-entropy per person, measured with respect to the maximum entropy, which is achieved with complete equality.

(In an alternative interpretation, Theil L is the natural logarithm of the geometric mean of the ratio (mean income)/(income $i$), taken over all incomes. The related Atkinson(1) index is 1 minus the geometric mean of (income $i$)/(mean income), taken over the income distribution.)
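
The geometric-mean reading can be checked numerically. The sketch below uses made-up incomes and illustrative variable names, and relies on `math.prod` (Python 3.8 or later); it is a check of the stated relationship, not a reference implementation.

```python
import math

incomes = [10.0, 20.0, 30.0, 40.0]   # made-up sample incomes
n = len(incomes)
mu = sum(incomes) / n

# Theil L from its definition: mean of ln(mu / x_i)
theil_l = sum(math.log(mu / x) for x in incomes) / n

# Geometric mean of the ratios (mean income) / (income i)
geo_mean_inverse = math.prod(mu / x for x in incomes) ** (1 / n)

# Atkinson(1): 1 minus the geometric mean of (income i) / (mean income)
atkinson_1 = 1 - math.prod(x / mu for x in incomes) ** (1 / n)

print(theil_l, math.log(geo_mean_inverse))   # the two values agree
print(atkinson_1, 1 - math.exp(-theil_l))    # so Atkinson(1) = 1 - exp(-Theil L)
```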

Because a transfer between a larger income and a smaller one changes the smaller income's ratio more than it changes the larger income's ratio, this index satisfies the transfer principle.
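
A small numerical illustration of the transfer principle, under the assumption of made-up incomes and a hypothetical transfer of 5 units: moving income from the richest to the poorest agent should lower Theil L, and the reverse transfer should raise it.

```python
import math

def theil_l(incomes):
    """Theil L: mean of ln(mu/x) over strictly positive incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(math.log(mu / x) for x in incomes) / n

before = [10.0, 20.0, 30.0, 40.0]
after_progressive = [15.0, 20.0, 30.0, 35.0]   # 5 units moved from richest to poorest
after_regressive = [5.0, 20.0, 30.0, 45.0]     # 5 units moved from poorest to richest

print(theil_l(after_progressive), theil_l(before), theil_l(after_regressive))
```

The mean is the same in all three lists, so any change in the index comes from the redistribution alone.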

Equivalently, if the situation is characterized by a discrete distribution function $f_k$ ($k = 0, \ldots, W$), where $f_k$ is the fraction of the population with income $k$ and $W = N\mu$ is the total income, then $\sum_{k=0}^{W} f_k = 1$ and the Theil index is:

$$T_T = \sum_{k=0}^{W} f_k\,\frac{k}{\mu}\ln\left(\frac{k}{\mu}\right)$$

where $\mu$ is again the mean income:

$$\mu = \sum_{k=0}^{W} k f_k$$

Note that in this case income $k$ is an integer and $k = 1$ represents the smallest increment of income possible (e.g., cents).
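
The equivalence between the per-agent and the discrete-distribution forms can be sketched as follows. The incomes (in cents) and the variable names are assumptions made for illustration; `collections.Counter` is used only to tabulate the fractions $f_k$.

```python
import math
from collections import Counter

incomes_cents = [1000, 1000, 2000, 4000, 4000, 4000]   # made-up integer incomes
n = len(incomes_cents)

# f[k]: fraction of the population whose income is exactly k
f = {k: count / n for k, count in Counter(incomes_cents).items()}

# Discrete-distribution form; incomes k that nobody has simply do not appear in f
mu = sum(k * f_k for k, f_k in f.items())
theil_discrete = sum(f_k * (k / mu) * math.log(k / mu) for k, f_k in f.items() if k > 0)

# Per-agent form of the same index, for comparison
mean = sum(incomes_cents) / n
theil_per_agent = sum((x / mean) * math.log(x / mean) for x in incomes_cents) / n

print(theil_discrete, theil_per_agent)   # the two forms agree
```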

If the situation is characterized by a continuous distribution function $f(k)$ (supported from 0 to infinity), where $f(k)\,dk$ is the fraction of the population with income $k$ to $k + dk$, then the Theil index is:

$$T_T = \int_{0}^{\infty} f(k)\,\frac{k}{\mu}\ln\left(\frac{k}{\mu}\right)dk$$

where the mean is:

$$\mu = \int_{0}^{\infty} k f(k)\,dk$$

Theil indices for some common continuous probability distributions are given in the table below:

| Income distribution function | PDF(x) (x ≥ 0) | Theil coefficient (nats) |
|---|---|---|
| Dirac delta function | $\delta(x - x_0),\ x_0 > 0$ | 0 |
| Uniform distribution | $\begin{cases}\frac{1}{b-a} & a \leq x \leq b \\ 0 & \text{otherwise}\end{cases}$ | $\ln\left(\frac{2a}{(a+b)\sqrt{e}}\right) + \frac{b^2}{b^2 - a^2}\ln(b/a)$ |
| Exponential distribution | $\lambda e^{-x\lambda},\ x > 0$ | $1 - \gamma$ |
| Log-normal distribution | $\frac{1}{x\sigma\sqrt{2\pi}}\,e^{-(\ln(x) - \mu)^2 / (2\sigma^2)}$ | $\frac{\sigma^2}{2}$ |
| Pareto distribution | $\begin{cases}\frac{\alpha k^{\alpha}}{x^{\alpha+1}} & x \geq k \\ 0 & x < k\end{cases}$ | $\ln(1 - 1/\alpha) + \frac{1}{\alpha - 1}$ ($\alpha > 1$) |
| Chi-squared distribution | $\frac{2^{-k/2} e^{-x/2} x^{k/2-1}}{\Gamma(k/2)}$ | $\ln(2/k) + \psi^{(0)}(1 + k/2)$ |
| Gamma distribution[6] | $\frac{e^{-x/\theta} x^{k-1} \theta^{-k}}{\Gamma(k)}$ | $\psi^{(0)}(1 + k) - \ln(k)$ |
| Weibull distribution | $\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^k}$ | $\frac{1}{k}\psi^{(0)}(1 + 1/k) - \ln\left(\Gamma(1 + 1/k)\right)$ |
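
Two of the closed forms in the table can be checked against the integral definition with a simple midpoint rule. This is a rough numerical sketch: the function name, the step count, the truncation of the exponential at x = 50, and the uniform bounds a = 1, b = 3 are all assumptions chosen for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant (gamma in the table)

def theil_t_continuous(pdf, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the continuous Theil T on [lower, upper]."""
    h = (upper - lower) / steps
    midpoints = [lower + (i + 0.5) * h for i in range(steps)]
    mu = sum(x * pdf(x) for x in midpoints) * h
    return sum(pdf(x) * (x / mu) * math.log(x / mu) for x in midpoints) * h

# Exponential distribution with rate 1, truncated at 50 where the tail is negligible
exp_numeric = theil_t_continuous(lambda x: math.exp(-x), 0.0, 50.0)
exp_table = 1 - EULER_GAMMA

# Uniform distribution on [a, b]
a, b = 1.0, 3.0
uni_numeric = theil_t_continuous(lambda x: 1 / (b - a), a, b)
uni_table = (math.log(2 * a / ((a + b) * math.sqrt(math.e)))
             + b**2 / (b**2 - a**2) * math.log(b / a))

print(exp_numeric, exp_table)
print(uni_numeric, uni_table)
```

Heavy-tailed entries such as the Pareto distribution would need a more careful integration scheme than this fixed-range midpoint rule.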

If everyone has the same income, then $T_T$ equals 0. If one person has all the income, then $T_T$ equals $\ln N$, which is maximum inequality. Dividing $T_T$ by $\ln N$ normalizes the index to range from 0 to 1, but then the independence axiom is violated, since $T[x \cup x] \neq T[x]$, and the normalized index does not qualify as a measure of inequality.
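
The effect of normalization can be seen on a tiny example. The sketch below assumes a made-up population of four in which one person holds all the income, and compares it with the same population duplicated ($x \cup x$). A zero income is taken to contribute 0 to the sum, which is an assumed convention.

```python
import math

def theil_t(incomes):
    """Theil T, with zero incomes contributing 0 (assumed convention)."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum((x / mu) * math.log(x / mu) for x in incomes if x > 0) / n

x = [0.0, 0.0, 0.0, 100.0]   # one person has all the income, N = 4
xx = x + x                   # the replicated population, N = 8

t_x, t_xx = theil_t(x), theil_t(xx)
norm_x = t_x / math.log(len(x))
norm_xx = t_xx / math.log(len(xx))

print(t_x, t_xx)         # unnormalized index: unchanged by replication
print(norm_x, norm_xx)   # normalized index: changes when the population is replicated
```

Here the unnormalized index is ln 4 for both populations, while dividing by ln N gives 1 for the original and 2/3 for the replica.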

The Theil index measures the entropic "distance" of the population from the egalitarian state in which everyone has the same income. The numerical result is expressed as negative entropy, so that a higher number indicates more order, that is, a state farther from complete equality. Formulating the index to represent negative entropy instead of entropy allows it to be a measure of inequality rather than equality.

Relation to Atkinson Index


The Theil index can be transformed into an Atkinson index, which has a range between 0 and 1 (0% and 100%), where 0 indicates perfect equality and 1 (100%) indicates maximum inequality. (See Generalized entropy index for the transformation.)

Derivation from entropy


The Theil index is derived from Shannon's measure of information entropy $S$, where entropy is a measure of randomness in a given set of information. In information theory, physics, and the Theil index, the general form of entropy is

$$S = k\sum_{i=1}^{N}\left(p_i \log_a\left(\frac{1}{p_i}\right)\right) = -k\sum_{i=1}^{N}\left(p_i \log_a\left(p_i\right)\right)$$

where $k$ is a constant[note 1] and $a$ is the base of the logarithm.[note 2]

When looking at the distribution of income in a population, $p_i$ is equal to the ratio of a particular individual's income to the total income of the entire population. This gives the observed entropy $S_{\text{Theil}}$ of a population:

$$S_{\text{Theil}} = \sum_{i=1}^{N}\left(\frac{x_i}{N\bar{x}}\ln\left(\frac{N\bar{x}}{x_i}\right)\right)$$

where $x_i$ is the income of the $i$th individual, $\bar{x}$ is the mean income, and $N$ is the number of individuals, so that $N\bar{x}$ is the total income of the population.

The Theil index $T_T$ measures how far the observed entropy $S_{\text{Theil}}$, which represents how randomly income is distributed, is from the highest possible entropy $S_{\text{max}} = \ln(N)$,[note 3] which represents income being maximally distributed among the individuals in the population (a distribution analogous to the most likely outcome of an infinite number of random coin tosses: an equal distribution of heads and tails). The Theil index is therefore the theoretical maximum entropy, which would be reached if the incomes of every individual were equal, minus the observed entropy:

$$T_T = S_{\text{max}} - S_{\text{Theil}} = \ln(N) - S_{\text{Theil}}$$
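
The step from this difference back to the defining formula for $T_T$ can be worked out explicitly. The sketch below uses only the symbols already introduced, with $\bar{x}$ playing the role of the mean $\mu$, and the fact that the income shares $x_i / (N\bar{x})$ sum to 1:

$$
\begin{aligned}
T_T &= \ln(N) - \sum_{i=1}^{N}\frac{x_i}{N\bar{x}}\ln\left(\frac{N\bar{x}}{x_i}\right) \\
    &= \sum_{i=1}^{N}\frac{x_i}{N\bar{x}}\left[\ln(N) - \ln\left(\frac{N\bar{x}}{x_i}\right)\right] \\
    &= \sum_{i=1}^{N}\frac{x_i}{N\bar{x}}\ln\left(\frac{x_i}{\bar{x}}\right)
     = \frac{1}{N}\sum_{i=1}^{N}\frac{x_i}{\bar{x}}\ln\left(\frac{x_i}{\bar{x}}\right)
\end{aligned}
$$

The second line moves $\ln(N)$ inside the sum, which is allowed because the weights $x_i / (N\bar{x})$ add up to 1.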


When $x$ is in units of population/species, $S_{\text{Theil}}$ is a measure of biodiversity and is called the Shannon index. If the Theil index is used with $x$ = population/species, it is a measure of inequality of population among a set of species, or "bio-isolation" as opposed to "wealth isolation".

The Theil index measures what is called redundancy in information theory.[5] It is the left-over "information space" that was not utilized to convey information, which reduces the effectiveness of the price signal.[original research?] The Theil index is a measure of the redundancy of income (or other measure of wealth) in some individuals. Redundancy in some individuals implies scarcity in others. A high Theil index indicates that the total income is not distributed evenly among individuals, in the same way that an uncompressed text file does not have a similar number of byte locations assigned to each of the available unique byte characters.

| Notation | Information theory | Theil index $T_T$ |
|---|---|---|
| $N$ | number of unique characters | number of individuals |
| $i$ | a particular character | a particular individual |
| $x_i$ | count of $i$th character | income of $i$th individual |
| $N\bar{x}$ | total characters in document | total income in population |
| $T_T$ | unused information space | unused potential in price mechanism[original research?] |
| | data compression | progressive tax[original research?] |
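
The correspondence in the table can be made concrete by computing the redundancy of a short string, treating each unique character as an "individual" and its count as its "income". The function name and the sample strings are illustrative assumptions.

```python
import math
from collections import Counter

def redundancy(text):
    """Maximum entropy ln(N) minus observed entropy of the character counts, in nats."""
    counts = Counter(text)                  # x_i: count of the i-th unique character
    n_unique = len(counts)                  # N: number of unique characters
    total = sum(counts.values())            # N * x_bar: total characters in the document
    observed = sum((c / total) * math.log(total / c) for c in counts.values())
    return math.log(n_unique) - observed    # S_max - S_Theil

print(redundancy("abcabc"))       # every character equally frequent: essentially 0
print(redundancy("aaaaaaabbc"))   # one character dominates: clearly positive
```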

Decomposability


According to the World Bank,

"The best-known entropy measures are Theil’s T ($T_T$) and Theil’s L ($T_L$), both of which allow one to decompose inequality into the part that is due to inequality within areas (e.g. urban, rural) and the part that is due to differences between areas (e.g. the rural-urban income gap). Typically at least three-quarters of inequality in a country is due to within-group inequality, and the remaining quarter to between-group differences."[7]

If the population is divided into $m$ subgroups, where $N_i$ is the population of subgroup $i$, $\overline{x}_i$ is its mean income, $T_i$ is the Theil T index computed within it, and $s_i$ is its share of total income, then Theil's T index is

$$T_T = \sum_{i=1}^{m} s_i T_i + \sum_{i=1}^{m} s_i \ln\frac{\overline{x}_i}{\mu} \quad\text{for}\quad s_i = \frac{N_i}{N}\frac{\overline{x}_i}{\mu}$$

For example, inequality within the United States is the average inequality within each state, weighted by state income, plus the inequality between states.
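
The decomposition can be verified on a toy population. In the sketch below the two groups, their incomes, and all variable names are made up for illustration; the within-group and between-group terms follow the formula for $T_T$ given above.

```python
import math

def theil_t(incomes):
    """Theil T for a list of strictly positive incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum((x / mu) * math.log(x / mu) for x in incomes) / n

groups = {"urban": [30.0, 50.0, 70.0], "rural": [10.0, 20.0]}   # made-up data
everyone = [x for incomes in groups.values() for x in incomes]
n = len(everyone)
mu = sum(everyone) / n

within = 0.0
between = 0.0
for incomes in groups.values():
    n_i = len(incomes)
    mean_i = sum(incomes) / n_i
    s_i = (n_i / n) * (mean_i / mu)            # income share of the group
    within += s_i * theil_t(incomes)           # income-weighted inequality inside groups
    between += s_i * math.log(mean_i / mu)     # inequality between group means

print(within + between, theil_t(everyone))     # the two values agree
```

The between-group term is what remains if every member of a group is assigned the group's mean income.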

Map of economic inequality in the United States using the Theil index. A high positive value indicates more income than population, while a negative value shows more population than income. A value of zero shows equality between population and income.
Note: This image is not the Theil Index in each area of the United States, but of contributions to the Theil Index for the U.S. by each area. The Theil Index is always positive, although individual contributions to the Theil Index may be negative or positive.

The decomposition of the Theil index, which identifies the share attributable to the between-region component, is a helpful tool for the positive analysis of regional inequality, as it suggests the relative importance of the spatial dimension of inequality.[8]

Theil's T versus Theil's L


Both Theil's T and Theil's L are decomposable. The difference between them is based on the part of the outcomes distribution that each is used for. Indexes of inequality in the generalized entropy (GE) family are more sensitive to differences in income shares among the poor or among the rich depending on a parameter that defines the GE index. The smaller the parameter value for GE, the more sensitive it is to differences at the bottom of the distribution.[9]

  • GE(0) = Theil's L and is more sensitive to differences at the lower end of the distribution. It is also referred to as the mean log deviation measure.
  • GE(1) = Theil's T and is more sensitive to differences at the top of the distribution.
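
The difference in sensitivity can be illustrated with a toy comparison. The sketch below assumes a made-up four-person population and two hypothetical mean-preserving changes of equal size, one among the two poorest and one among the two richest, and reports how much each index moves.

```python
import math

def theil_t(incomes):
    n = len(incomes)
    mu = sum(incomes) / n
    return sum((x / mu) * math.log(x / mu) for x in incomes) / n

def theil_l(incomes):
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(math.log(mu / x) for x in incomes) / n

baseline = [20.0, 20.0, 100.0, 100.0]
spread_low = [10.0, 30.0, 100.0, 100.0]    # 10 units moved between the two poorest
spread_high = [20.0, 20.0, 90.0, 110.0]    # 10 units moved between the two richest

dT_low = theil_t(spread_low) - theil_t(baseline)
dL_low = theil_l(spread_low) - theil_l(baseline)
dT_high = theil_t(spread_high) - theil_t(baseline)
dL_high = theil_l(spread_high) - theil_l(baseline)

print("change at the bottom: T", dT_low, " L", dL_low)
print("change at the top:    T", dT_high, " L", dL_high)
```

In this example Theil L rises more than Theil T when the bottom of the distribution spreads out, and Theil T rises more than Theil L when the top spreads out. A single example illustrates the tendency but does not prove it in general.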

Decomposability is a property of the Theil index that the more popular Gini coefficient does not offer. The Gini coefficient is more intuitive to many people since it is based on the Lorenz curve. However, it is not easily decomposable like the Theil index.

Applications


In addition to a multitude of economic applications, the Theil index has been applied to assess the performance of irrigation systems[10] and the distribution of software metrics.[11]


Notes

  1. When this equation is used in physics, $k$ typically represents the Boltzmann constant. In information theory or statistics, $k$ is typically equal to 1 (such as in the Theil index).
  2. In information theory, when information is given in binary digits, the binary logarithm is used (with $a$ equal to 2). In physics and also in computation of the Theil index, the natural logarithm is used (with $a$ equal to $e$).
  3. When the income of every individual is equal to the average income, $x_i / \bar{x} = 1$ for every $i$, so $\sum_{i=1}^{N}\left(\frac{x_i}{\bar{x}}\,\frac{1}{N}\ln\left(\frac{\bar{x}}{x_i}\,N\right)\right) = \sum_{i=1}^{N}\left(\frac{1}{N}\ln(N)\right) = \ln(N)$

References

  1. Introduction to the Theil index from the University of Texas
  2. "Segregation Measures". www.urban.org. Urban Institute. Retrieved 5 February 2018.
  3. Parker, Lauren (20 July 2015). "Racial and Ethnic Segregation: In the News and On PolicyMap". PolicyMap. Retrieved 5 February 2018.
  4. Conceicao, Pedro; Ferreira, Pedro M. (2000). "The Young Person's Guide to the Theil Index: Suggesting Intuitive Interpretations and Exploring Analytical Applications". SSRN Electronic Journal. doi:10.2139/ssrn.228703. ISSN 1556-5068. S2CID 19009769.
  5. http://www.poorcity.richcity.org (Redundancy, Entropy and Inequality Measures)
  6. McDonald, James B; Jensen, Bartell C. (December 1979). "An Analysis of Some Properties of Alternative Measures of Income Inequality Based on the Gamma Distribution Function". Journal of the American Statistical Association. 74 (368): 856–860. doi:10.1080/01621459.1979.10481042.
  7. "6. Inequality Measures". Poverty Manual (PDF). World Bank. 8 August 2005. p. 95. Retrieved 4 February 2018.
  8. Novotny, J. (2007). "On the measurement of regional inequality: Does spatial dimension of income inequality matter?" (PDF). Annals of Regional Science. 41 (3): 563–580. doi:10.1007/s00168-007-0113-y. S2CID 51753883.
  9. "Inequality Measures". www.urban.org. Urban Institute. Retrieved 5 February 2018.
  10. Rajan K. Sampath. Equity Measures for Irrigation Performance Evaluation. Water International, 13(1), 1988.
  11. A. Serebrenik, M. van den Brand. Theil index for aggregation of software metrics values. 26th IEEE International Conference on Software Maintenance. IEEE Computer Society.

External links

  • Software:
    • Free Online Calculator computes the Gini Coefficient, plots the Lorenz curve, and computes many other measures of concentration for any dataset
    • Free Calculator: Online and downloadable scripts (Python and Lua) for Atkinson, Gini, and Hoover inequalities
    • Users of the R data analysis software can install the "ineq" package, which allows for computation of a variety of inequality indices including Gini, Atkinson, Theil.
    • A MATLAB Inequality Package (archived 2008-10-04 at the Wayback Machine), including code for computing Gini, Atkinson, Theil indexes and for plotting the Lorenz Curve. Many examples are available.
Retrieved from "https://en.wikipedia.org/w/index.php?title=Theil_index&oldid=1277865187"