
Descriptive statistics

From Wikipedia, the free encyclopedia

Type of statistics


A descriptive statistic (in the count noun sense) is a summary statistic that quantitatively describes or summarizes features from a collection of information,^[1] while descriptive statistics (in the mass noun sense) is the process of using and analysing those statistics. Descriptive statistics is distinguished from inferential statistics (or inductive statistics) by its aim to summarize a sample, rather than use the data to learn about the population that the sample is thought to represent.^[2] This generally means that descriptive statistics, unlike inferential statistics, are not developed on the basis of probability theory and are frequently nonparametric.^[3] Even when a data analysis draws its main conclusions using inferential statistics, descriptive statistics are generally also presented.^[4] For example, papers reporting on human subjects typically include a table giving the overall sample size, the sample sizes in important subgroups (e.g., for each treatment or exposure group), and demographic or clinical characteristics such as the average age, the proportion of subjects of each sex, and the proportion of subjects with related co-morbidities.

Some measures that are commonly used to describe a data set are measures of central tendency and measures of variability or dispersion. Measures of central tendency include the mean, median and mode, while measures of variability include the standard deviation (or variance), the minimum and maximum values of the variables, kurtosis and skewness.^[5]

Use in statistical analysis


Descriptive statistics provide simple summaries about the sample and about the observations that have been made. Such summaries may be either quantitative, i.e. summary statistics, or visual, i.e. simple-to-understand graphs. These summaries may either form the basis of the initial description of the data as part of a more extensive statistical analysis, or they may be sufficient in and of themselves for a particular investigation.

For example, the shooting percentage in basketball is a descriptive statistic that summarizes the performance of a player or a team. It is the number of shots made divided by the number of shots taken, so a player who shoots 33% is making approximately one shot in every three. The percentage summarizes or describes multiple discrete events. Consider also the grade point average: this single number describes the general performance of a student across the range of their course experiences.^[6]
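As a minimal sketch of the two summaries above, the following Python computes a shooting percentage and a grade point average. The function names are illustrative assumptions, and the credit-weighted form of the grade point average is one common convention rather than something the text specifies.

```python
def shooting_percentage(made, attempted):
    """Shots made as a percentage of shots taken."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return 100.0 * made / attempted


def grade_point_average(courses):
    """Credit-weighted mean of (grade_points, credits) pairs.

    Assumption: each course contributes in proportion to its credits;
    institutions differ in how they actually compute a GPA.
    """
    total_credits = sum(credits for _, credits in courses)
    if total_credits <= 0:
        raise ValueError("total credits must be positive")
    weighted = sum(points * credits for points, credits in courses)
    return weighted / total_credits
```

For instance, `shooting_percentage(1, 3)` gives roughly 33.3, matching the one-in-three player described above.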

The use of descriptive and summary statistics has an extensive history and, indeed, the simple tabulation of populations and of economic data was the first way the topic of statistics appeared. More recently, a collection of summarisation techniques has been formulated under the heading of exploratory data analysis: an example of such a technique is the box plot.

In the business world, descriptive statistics provides a useful summary of many types of data. For example, investors and brokers may use a historical account of return behaviour, built from empirical and analytical study of their investments, in order to make better investing decisions in the future.

Univariate analysis


Univariate analysis involves describing the distribution of a single variable, including its central tendency (the mean, median, and mode) and dispersion (the range and quartiles of the data set, and measures of spread such as the variance and standard deviation). The shape of the distribution may also be described via indices such as skewness and kurtosis. Characteristics of a variable's distribution may also be depicted in graphical or tabular format, including histograms and stem-and-leaf displays.
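The univariate measures listed above can be sketched with Python's standard `statistics` module. The function name `describe` is an assumption made for illustration. Several conventions are also assumed here: the quartiles use the module's default "exclusive" method (other software may give slightly different values), the variance and standard deviation use the n - 1 denominator, and skewness and kurtosis are the simple moment-based versions.

```python
import math
import statistics


def describe(values):
    """Return common univariate descriptive statistics as a dict."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")

    mean = statistics.fmean(data)
    # Population central moments, used for the shape indices below.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n

    q1, q2, q3 = statistics.quantiles(data, n=4)
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(data),
        "mode": statistics.multimode(data),   # list: there may be ties
        "min": data[0],
        "max": data[-1],
        "range": data[-1] - data[0],
        "quartiles": (q1, q2, q3),
        "variance": statistics.variance(data),  # sample variance (n - 1)
        "stdev": statistics.stdev(data),
        # Shape indices are undefined when every value is identical.
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else math.nan,
        "excess_kurtosis": m4 / m2 ** 2 - 3 if m2 > 0 else math.nan,
    }
```

Returning the mode as a list avoids having to pick one value arbitrarily when a data set has several equally frequent values.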

Bivariate and multivariate analysis


When a sample consists of more than one variable, descriptive statistics may be used to describe the relationship between pairs of variables. In this case, descriptive statistics include:

Cross-tabulations and contingency tables
Graphical representation via scatterplots
Quantitative measures of dependence
Descriptions of conditional distributions
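Two of the items above, the contingency table and the conditional distribution, can be sketched in a few lines of Python. The function names and the table layout (a mapping from `(row, column)` pairs to counts) are assumptions chosen for illustration.

```python
from collections import Counter


def crosstab(rows, cols):
    """Count how often each (row value, column value) pair occurs."""
    if len(rows) != len(cols):
        raise ValueError("rows and cols must have the same length")
    return Counter(zip(rows, cols))


def conditional_distribution(table, given_row):
    """Proportions of each column value among cases with the given row value."""
    counts = {c: n for (r, c), n in table.items() if r == given_row}
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no observations with row value {given_row!r}")
    return {c: n / total for c, n in counts.items()}
```

With hypothetical data such as a list of treatment groups and a parallel list of outcomes, `crosstab(groups, outcomes)` gives the cell counts, and `conditional_distribution(table, "control")` gives the share of each outcome within the control group.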

The main reason for differentiating univariate and bivariate analysis is that bivariate analysis is not only a simple descriptive analysis: it also describes the relationship between two different variables.^[7] Quantitative measures of dependence include correlation (such as Pearson's r when both variables are continuous, or Spearman's rho if one or both are not) and covariance (which reflects the scale the variables are measured on). The slope, in regression analysis, also reflects the relationship between variables. The unstandardised slope indicates the unit change in the criterion variable for a one-unit change in the predictor. The standardised slope indicates this change in standardised (z-score) units. Highly skewed data are often transformed by taking logarithms. The use of logarithms makes graphs more symmetrical and more similar to the normal distribution, making them easier to interpret intuitively.^[8]:47
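The dependence measures in this paragraph can be sketched directly from their definitions. All function names below are illustrative assumptions. The sketch also assumes the n - 1 denominator for covariance, average ranks for tied values in Spearman's rho, and a simple regression with a single predictor, in which case the standardised slope equals Pearson's r.

```python
import math
import statistics


def covariance(x, y):
    """Sample covariance (n - 1 denominator); depends on measurement scale."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def pearson_r(x, y):
    """Pearson's r: covariance rescaled by both standard deviations."""
    return covariance(x, y) / (statistics.stdev(x) * statistics.stdev(y))


def ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


def unstandardised_slope(x, y):
    """Change in the criterion y per one-unit change in the predictor x."""
    return covariance(x, y) / statistics.variance(x)


def log_transform(values):
    """Natural logarithm of each value; only defined for positive data."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform requires strictly positive values")
    return [math.log(v) for v in values]
```

Because Spearman's rho only uses ranks, it is unchanged by any strictly increasing transformation such as the logarithm, whereas Pearson's r and the covariance generally are not.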

References


[1] Mann, Prem S. (1995). Introductory Statistics (2nd ed.). Wiley. ISBN 0-471-31009-3.
[2] Christopher, Andrew N. (2017). "Drawing Conclusions From Data: Descriptive Statistics, Inferential Statistics, and Hypothesis Testing". Interpreting and Using Statistics in Psychological Research. Thousand Oaks, CA: SAGE Publications, Inc. pp. 145–183. doi:10.4135/9781506304144.n6. ISBN 978-1-5063-0416-8. Retrieved 2021-06-01.
[3] Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms. OUP. ISBN 0-19-850994-4.
[4] Christopher, Andrew N. (2017). "Drawing Conclusions From Data: Descriptive Statistics, Inferential Statistics, and Hypothesis Testing". Interpreting and Using Statistics in Psychological Research. Thousand Oaks, CA: SAGE Publications, Inc. pp. 145–183. doi:10.4135/9781506304144.n6. ISBN 978-1-5063-0416-8. Retrieved 2021-06-01.
[5] Investopedia. Descriptive Statistics Terms.
[6] Trochim, William M. K. (2006). "Descriptive statistics". Research Methods Knowledge Base. Retrieved 14 March 2011.
[7] Babbie, Earl R. (2009). The Practice of Social Research (12th ed.). Wadsworth. pp. 436–440. ISBN 978-0-495-59841-1.
[8] Nick, Todd G. (2007). "Descriptive Statistics". Topics in Biostatistics. Methods in Molecular Biology. Vol. 404. New York: Springer. pp. 33–52. doi:10.1007/978-1-59745-530-5_3. ISBN 978-1-58829-531-6. PMID 18450044.

External links


Descriptive Statistics Lecture: University of Pittsburgh Supercourse: http://www.pitt.edu/~super1/lecture/lec0421/index.htm


Retrieved from "https://en.wikipedia.org/w/index.php?title=Descriptive_statistics&oldid=1306028973"
