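| John Tukey | |
|---|---|
| Born | June 16, 1915 |
| Died | July 26, 2000 (aged 85) |
| Education | Brown University (B.A., M.S.); Princeton University (PhD) |
| Known for | Fast Fourier transform (FFT) algorithm; box plot; coining the term bit |
| Awards | National Medal of Science (1973); IEEE Medal of Honor (1982) |
| Scientific career | |
| Fields | Topology |
| Institutions | Princeton University; AT&T Bell Laboratories |
| Thesis | On Denumerability in Topology[2] |
| Doctoral advisor | Solomon Lefschetz[2] |
| Doctoral students | |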
John Wilder Tukey (/ˈtuːki/;[3] June 16, 1915 – July 26, 2000) was an American mathematician and statistician, best known for the development of the fast Fourier transform (FFT) algorithm and the box plot.[4] The Tukey range test, the Tukey lambda distribution, the Tukey test of additivity, and the Teichmüller–Tukey lemma all bear his name. He is also credited with coining the term bit and with the first published use of the word software.
Tukey was born in New Bedford, Massachusetts, in 1915, to a father who taught Latin and a mother who worked as a private tutor. He was mainly taught by his mother and attended regular classes only for certain subjects, such as French.[5] Tukey obtained a B.A. in 1936 and an M.S. in 1937 in chemistry from Brown University, before moving to Princeton University, where in 1939 he received a PhD in mathematics after completing a doctoral dissertation titled "On denumerability in topology".[2][6][7]
During World War II, Tukey worked at the Fire Control Research Office and collaborated with Samuel Wilks and William Cochran. After the war, he returned to Princeton, dividing his time between the university and AT&T Bell Laboratories. In 1962, Tukey was elected to the American Philosophical Society.[8] He became a full professor at 35 and founding chairman of the Princeton statistics department in 1965.[5]
He was awarded the National Medal of Science by President Nixon in 1973.[5] He was awarded the IEEE Medal of Honor in 1982 "For his contributions to the spectral analysis of random processes and the fast Fourier transform (FFT) algorithm".[9]
Tukey retired in 1985. He died in New Brunswick, New Jersey, on July 26, 2000.[5]
Early in his career Tukey worked on developing statistical methods for computers at Bell Labs, where he coined the word bit in 1947.[10][11][12]
His statistical interests were many and varied. He is particularly remembered for his development with James Cooley of the Cooley–Tukey FFT algorithm. In 1970, he contributed significantly to what is today known as the jackknife, also termed the Quenouille–Tukey jackknife. He introduced the box plot in his 1977 book, "Exploratory Data Analysis".
Tukey's range test, the Tukey lambda distribution, Tukey's test of additivity, Tukey's lemma, and the Tukey window all bear his name. He is also the creator of several little-known methods such as the trimean and the median-median line, an easier alternative to linear regression.
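As an illustration of the trimean mentioned above, the following is a minimal Python sketch. It assumes the usual definition, a weighted average of the median and the two quartiles, (Q1 + 2·median + Q3) / 4, and it assumes the quartiles are taken as hinges (the medians of the lower and upper halves of the sorted data, with the middle value shared when the count is odd). The function names `hinges` and `trimean` are hypothetical and chosen only for this example.

```python
from statistics import median


def hinges(values):
    """Return (lower hinge, upper hinge) of the data.

    Assumption: hinges are the medians of the lower and upper halves of
    the sorted data; when the count is odd, the middle value belongs to
    both halves.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("hinges() requires at least one value")
    half = (n + 1) // 2          # size of each half, middle value shared if n is odd
    lower = data[:half]
    upper = data[n - half:]
    return median(lower), median(upper)


def trimean(values):
    """Weighted average of the quartiles and the median: (Q1 + 2*M + Q3) / 4."""
    q1, q3 = hinges(values)
    return (q1 + 2 * median(values) + q3) / 4


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 100]   # one extreme value
    print("mean:   ", sum(sample) / len(sample))
    print("trimean:", trimean(sample))
```

Because it is built only from the median and the quartiles, the trimean in this sketch is much less affected by the single extreme value than the arithmetic mean is.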
In 1974, he developed, with Jerome H. Friedman, the concept of projection pursuit.[13]
John Tukey contributed greatly to statistical practice and data analysis in general. In fact, some regard John Tukey as the father of data science. At the very least, he pioneered many of the key foundations of what came later to be known as data science.[14]
Making sense of data has a long history and has been addressed by statisticians, mathematicians, scientists, and others for many years. During the 1960s, Tukey challenged the then-dominant approach that he called "confirmatory data analysis": statistical analyses driven by rigid mathematical configurations.[15] Tukey emphasized the importance of having a more flexible attitude towards data analysis and of exploring data carefully to see what structures and information might be contained therein. He called this "exploratory data analysis" (EDA). In many ways, EDA was a precursor to data science.
Tukey also realized the importance of computer science to EDA. Graphics are an integral part of EDA methodology and, while much of Tukey's work focused on static displays (such as box plots) that could be drawn by hand, he realized that computer graphics would be much more effective for studying multivariate data. PRIM-9, the first program for viewing multivariate data, was conceived by him during the early 1970s.[16]
This coupling of data analysis and computer science is what is now called data science.
Tukey articulated the important distinction between exploratory data analysis and confirmatory data analysis, believing that much statistical methodology placed too great an emphasis on the latter. Though he believed in the utility of separating the two types of analysis, he pointed out that sometimes, especially in natural science, this was problematic and termed such situations uncomfortable science.
A. D. Gordon offered the following summary of Tukey's principles for statistical practice:[17]
... the usefulness and limitation of mathematical statistics; the importance of having methods of statistical analysis that are robust to violations of the assumptions underlying their use; the need to amass experience of the behaviour of specific methods of analysis in order to provide guidance on their use; the importance of allowing the possibility of data's influencing the choice of method by which they are analysed; the need for statisticians to reject the role of "guardian of proven truth", and to resist attempts to provide once-for-all solutions and tidy over-unifications of the subject; the iterative nature of data analysis; implications of the increasing power, availability, and cheapness of computing facilities; the training of statisticians.
Tukey's lectures were described as unusual. McCullagh described a lecture Tukey gave in London in 1977:[17][18]
Tukey ambled to the podium, a great bear of a man dressed in baggy pants and a black knitted shirt. These might once have been a matching pair but the vintage was such that it was hard to tell. ... Carefully and deliberately a list of headings was chalked on the blackboard. The words came too, not many, like overweight parcels, delivered at a slow unfaltering pace. ... When it was complete, Tukey turned to face the audience and the podium ... "Comments, queries, suggestions?" he asked the audience ... As he waited for a response, he clambered onto the podium and manoeuvred until he was sitting cross-legged facing the audience. ... We in the audience sat like spectators at the zoo waiting for the great bear to move or say something. But the great bear appeared to be doing the same thing, and the feeling was not comfortable.
Tukey made wide-ranging contributions beyond statistics, once reportedly remarking, "The best thing about being a statistician is that you get to play in everyone's backyard."[5]
In the 1950s, Tukey served on a committee of the National Research Council that produced a report critiquing the statistical methodology of the Kinsey Report, and he chaired a committee in the 1970s on the role of aerosol sprays in damaging the ozone layer.[5]
From 1960 to 1980, Tukey helped design the NBC television network polls used to predict and analyze elections. He was also a consultant to the Educational Testing Service, the Xerox Corporation, and Merck & Company.[5]
During the 1970s and early 1980s, Tukey played a key role in the design and conduct of the National Assessment of Educational Progress.[citation needed]
While working with John von Neumann on early computer designs, Tukey introduced the word bit as a portmanteau of binary digit.[19] The term bit was first used in an article by Claude Shannon in 1948.
Tukey is also credited with the first use of the word software to describe computer programs, in a 1958 article[20] in American Mathematical Monthly.[21]
The choice of a logarithmic base corresponds to the choice of a unit for measuring information. If the base 2 is used the resulting units may be called binary digits, or more briefly bits, a word suggested by J. W. Tukey.
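The relationship between logarithmic base and unit described in the quotation above can be illustrated with a short Python sketch. It assumes the simplest case, a choice among N equally likely outcomes, whose information content is the logarithm of N in the chosen base. The function name `information_content` is hypothetical and used only for this example.

```python
import math


def information_content(n_outcomes, base=2.0):
    """Information in a choice among n equally likely outcomes.

    The logarithm base sets the unit: base 2 gives bits,
    base e gives natural units (nats).
    """
    if n_outcomes < 1:
        raise ValueError("n_outcomes must be at least 1")
    if base <= 1:
        raise ValueError("base must be greater than 1")
    return math.log(n_outcomes) / math.log(base)


if __name__ == "__main__":
    n = 8
    bits = information_content(n, base=2)
    nats = information_content(n, base=math.e)
    print(f"{n} equally likely outcomes: {bits:.3f} bits, {nats:.3f} nats")
    # Changing the base only rescales the result by a constant factor.
    print("nats / bits =", nats / bits, "(compare ln 2 =", math.log(2), ")")
```

Switching the base changes only the unit, not the quantity being measured: in this sketch the value in nats is the value in bits multiplied by ln 2.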