Serial axes coordinates are a methodology for visualizing \(p\)-dimensional geometry and multivariate data. As the name suggests, all axes are shown in series. The axes can span a finite \(p\)-dimensional space or be transformed to an infinite space (e.g. by a Fourier transformation).
In the finite \(p\)-dimensional space, all axes can be displayed in parallel, which is known as parallel coordinates; alternatively, all axes can be displayed in a polar layout, which is often known as radial coordinates or a radar plot. In the infinite space, a mathematical transformation is applied. More details are given in the sub-section Infinite axes.
A point in Euclidean \(p\)-space \(R^p\) is represented as a polyline in serial axes coordinates. This induces a point \(\leftrightarrow\) line duality in the Euclidean plane \(R^2\) (A. Inselberg and Dimsdale 1990).
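The duality can be checked numerically in base R. The sketch below assumes two parallel axes placed at horizontal positions 0 and `d`, so the point \((x_1, x_2)\) becomes the segment from \((0, x_1)\) to \((d, x_2)\). The helper `height_at` and the values of `m`, `b` and `d` are hypothetical and chosen only for illustration. For points on the Cartesian line \(x_2 = m x_1 + b\) with \(m \neq 1\), every such segment (extended to a full line) passes through the single point \((d/(1-m),\, b/(1-m))\).

```r
# Height of the parallel-coordinate line of the point (x1, x2)
# at horizontal position s, with the two axes at 0 and d.
height_at <- function(x1, x2, s, d = 1) {
  x1 + (s / d) * (x2 - x1)
}

# Hypothetical Cartesian line x2 = m * x1 + b (requires m != 1)
m <- -2
b <- 3
d <- 1
x1 <- c(-1, 0, 0.5, 2, 10)
x2 <- m * x1 + b

# Candidate common intersection point in the parallel-coordinate plane
s_star <- d / (1 - m)
h_star <- b / (1 - m)

# Every line has the same height at s_star (up to floating-point error)
heights <- height_at(x1, x2, s = s_star, d = d)
print(cbind(x1, x2, heights))
```

All entries of `heights` agree with `h_star`, so the five points on one Cartesian line map to five lines through one point.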
Before we start, a couple of things should be noted:
- In the serial axes coordinate system, no `x` or `y` (or even `group`) aesthetics are required; other aesthetics, such as `colour`, `fill`, `size`, etc., are accommodated.
- Layer `geom_path` is used to draw the serial lines; layers `geom_histogram`, `geom_quantiles`, and `geom_density` are used to draw the histograms, quantiles (not quantile regression) and densities. Users can also customize their own layers (e.g. `geom_boxplot`, `geom_violin`, etc.) by editing the function `add_serialaxes_layers`.
Suppose we are interested in the data set `iris`. A parallel coordinate chart can be created as follows:
```r
library(ggmulti)
# parallel axes plot
ggplot(iris,
       mapping = aes(Sepal.Length = Sepal.Length,
                     Sepal.Width = Sepal.Width,
                     Petal.Length = Petal.Length,
                     Petal.Width = Petal.Width,
                     colour = factor(Species))) +
  geom_path(alpha = 0.2) +
  coord_serialaxes() -> p
p
```
A histogram layer can be displayed by adding the layer `geom_histogram`:
```r
p +
  geom_histogram(alpha = 0.3,
                 mapping = aes(fill = factor(Species))) +
  theme(axis.text.x = element_text(angle = 30, hjust = 0.7))
```
A density layer can be drawn by adding the layer `geom_density`.
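A minimal sketch of such a density layer is given here, mirroring the histogram call. The argument values (`alpha = 0.3`, `fill` mapped to `Species`) and the object names `p_base` and `p_density` are assumptions made for illustration, not values confirmed by the package documentation.

```r
library(ggmulti)

# Rebuild the parallel axes plot so this block is self-contained
p_base <- ggplot(iris,
                 mapping = aes(Sepal.Length = Sepal.Length,
                               Sepal.Width = Sepal.Width,
                               Petal.Length = Petal.Length,
                               Petal.Width = Petal.Width,
                               colour = factor(Species))) +
  geom_path(alpha = 0.2) +
  coord_serialaxes()

# Assumed usage: one density per axis, filled by species
p_density <- p_base +
  geom_density(alpha = 0.3,
               mapping = aes(fill = factor(Species)))
p_density
```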
A parallel coordinate plot can be converted to a radial coordinate plot by setting `axes.layout = "radial"` in the function `coord_serialaxes`.
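As a sketch, the radial version of the `iris` plot could be built by passing that argument directly when the coordinate system is created. The object name `p_radial` is hypothetical.

```r
library(ggmulti)

# Same mapping as the parallel axes plot, but with a radial axes layout
p_radial <- ggplot(iris,
                   mapping = aes(Sepal.Length = Sepal.Length,
                                 Sepal.Width = Sepal.Width,
                                 Petal.Length = Petal.Length,
                                 Petal.Width = Petal.Width,
                                 colour = factor(Species))) +
  geom_path(alpha = 0.2) +
  coord_serialaxes(axes.layout = "radial")
p_radial
```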
The Andrews (1972) plot is a way to project multi-response observations into a function \(f(t)\), by defining \(f(t)\) as an inner product of the observed values of the responses and orthonormal functions in \(t\):
\[f_{y_i}(t) = \langle \mathbf{y}_i, \mathbf{a}_t \rangle\]
where \(\mathbf{y}_i\) is the \(i\)th response vector and \(\mathbf{a}_t\) is a vector of orthonormal functions on a given interval. Andrews suggests using the Fourier basis
\[\mathbf{a}_t = \left\{\frac{1}{\sqrt{2}}, \sin(t), \cos(t), \sin(2t), \cos(2t), \ldots\right\}\]
which are orthonormal on the interval \((-\pi,\pi)\). In other words, we can project a \(p\)-dimensional space onto an infinite \((-\pi, \pi)\) space. The following figure illustrates how to construct an “Andrews plot”.
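To make the inner product concrete, the following base R sketch evaluates \(f_{y_i}(t)\) for every row of `iris` on a grid of \(t\) values. The helper `andrews_basis` is a hypothetical name written for this illustration; it is not the package's `andrews` function and may differ from it in scaling and grid choice.

```r
# Andrews basis evaluated on a grid of t values.
# Returns a length(t) x p matrix whose columns are
# 1/sqrt(2), sin(t), cos(t), sin(2t), cos(2t), ...
andrews_basis <- function(t, p) {
  basis <- matrix(0, nrow = length(t), ncol = p)
  for (j in seq_len(p)) {
    if (j == 1) {
      basis[, j] <- 1 / sqrt(2)
    } else {
      freq <- j %/% 2  # frequencies 1, 1, 2, 2, 3, ...
      basis[, j] <- if (j %% 2 == 0) sin(freq * t) else cos(freq * t)
    }
  }
  basis
}

y <- as.matrix(iris[, 1:4])                 # n x p observations
t_grid <- seq(-pi, pi, length.out = 200)    # grid on (-pi, pi)
A <- andrews_basis(t_grid, ncol(y))         # 200 x p basis values
curves <- y %*% t(A)                        # n x 200; row i is f_{y_i}(t)
dim(curves)
```

Each row of `curves` is one observation's curve; a quick look is possible with `matplot(t_grid, t(curves), type = "l")`.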
```r
p <- ggplot(iris,
            mapping = aes(Sepal.Length = Sepal.Length,
                          Sepal.Width = Sepal.Width,
                          Petal.Length = Petal.Length,
                          Petal.Width = Petal.Width,
                          colour = Species)) +
  geom_path(alpha = 0.2,
            stat = "dotProduct") +
  coord_serialaxes()
p
```
A quantile layer can be displayed on top.
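A hedged sketch of such a quantile layer follows. It assumes that `geom_quantiles` accepts `stat = "dotProduct"` and a `quantiles` argument giving the probabilities to draw. Both are assumptions about the layer's interface and should be checked against the package help page. The object names are hypothetical.

```r
library(ggmulti)

# Rebuild the Andrews plot so this block is self-contained
p_andrews <- ggplot(iris,
                    mapping = aes(Sepal.Length = Sepal.Length,
                                  Sepal.Width = Sepal.Width,
                                  Petal.Length = Petal.Length,
                                  Petal.Width = Petal.Width,
                                  colour = Species)) +
  geom_path(alpha = 0.2, stat = "dotProduct") +
  coord_serialaxes()

# Assumed usage: quartile curves per species, drawn as dashed lines
p_quantiles <- p_andrews +
  geom_quantiles(stat = "dotProduct",
                 quantiles = c(0.25, 0.5, 0.75),
                 linetype = 2)
p_quantiles
```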
A couple of things should be noted:
- The mapping aesthetics define the \(p\)-dimensional space; if they are not provided, all columns in the data set `iris` will be transformed. An alternative way to determine the \(p\)-dimensional space is to set the parameter `axes.sequence` in each layer or in `coord_serialaxes`.
- To construct a dot product serial axes plot (e.g. the Fourier transformation used in an Andrews plot), we need to set the parameter `stat` in `geom_path` to `"dotProduct"`. The default transformation function is Andrews' (function `andrews`). Users can supply their own; for example, Tukey suggests the following projected space
\[\mathbf{a}_t = \{\cos(t),\cos(\sqrt{2}t), \cos(\sqrt{3}t), \cos(\sqrt{5}t), ...\}\]
where \(t \in [0, k\pi]\) (Gnanadesikan 2011).
```r
# Tukey's basis: returns the grid of t values and a p x k matrix
# whose i-th row is the i-th basis function evaluated on that grid
tukey <- function(p = 4, k = 50 * (p - 1), ...) {
  # k grid points on [0, p * pi]
  t <- seq(0, p * base::pi, length.out = k)
  seq_k <- seq(p)
  values <- sapply(seq_k,
                   function(i) {
                     if (i == 1) return(cos(t))
                     if (i == 2) return(cos(sqrt(2) * t))
                     # sum of the two previous indices: 3 for i = 3, 5 for i = 4
                     Fibonacci <- seq_k[i - 1] + seq_k[i - 2]
                     cos(sqrt(Fibonacci) * t)
                   })
  list(
    vector = t,
    matrix = matrix(values, nrow = p, byrow = TRUE)
  )
}
ggplot(iris,
       mapping = aes(Sepal.Length = Sepal.Length,
                     Sepal.Width = Sepal.Width,
                     Petal.Length = Petal.Length,
                     Petal.Width = Petal.Width,
                     colour = Species)) +
  geom_path(alpha = 0.2, stat = "dotProduct", transform = tukey) +
  coord_serialaxes()
```
Rather than calling the function `coord_serialaxes`, an alternative way to create a serial axes object is to add a `geom_serialaxes_...` object to the plot.
For example, Figures 1 to 4 can be created by calling
```r
g <- ggplot(iris,
            mapping = aes(Sepal.Length = Sepal.Length,
                          Sepal.Width = Sepal.Width,
                          Petal.Length = Petal.Length,
                          Petal.Width = Petal.Width,
                          colour = Species))
g + geom_serialaxes(alpha = 0.2)
g +
  geom_serialaxes(alpha = 0.2) +
  geom_serialaxes_hist(mapping = aes(fill = Species), alpha = 0.2)
g +
  geom_serialaxes(alpha = 0.2) +
  geom_serialaxes_density(mapping = aes(fill = Species), alpha = 0.2)
# radial axes can be created by
# calling `coord_radial()`
# this is slightly different, check it out!
g +
  geom_serialaxes(alpha = 0.2) +
  geom_serialaxes(alpha = 0.2) +
  coord_radial()
```
Figures 5 and 7 can be created by setting `stat` and `transform` in `geom_serialaxes`; for Figure 6, `geom_serialaxes_quantile` can be added to create a serial axes quantile layer.
Some slight differences should be noted here:
- One benefit of calling `coord_serialaxes` rather than `geom_serialaxes_...` is that `coord_serialaxes` can accommodate duplicated axes in the mapping aesthetics (e.g. Eulerian path, Hamiltonian path, etc.). However, in `geom_serialaxes_...`, duplicated axes will be omitted.
- Meaningful axes labels in `coord_serialaxes` can be created automatically, while in `geom_serialaxes_...`, users have to set axes labels manually with `ggplot2::scale_x_continuous` or `ggplot2::scale_y_continuous`.
- When we turn the serial axes into interactive graphics (via package loon.ggplot), serial axes lines in `coord_serialaxes()` can become interactive, but in `geom_serialaxes_...` all objects are static.
```r
# The serial axes are `Sepal.Length`, `Sepal.Width`, `Sepal.Length`
# With meaningful labels
ggplot(iris,
       mapping = aes(Sepal.Length = Sepal.Length,
                     Sepal.Width = Sepal.Width,
                     Sepal.Length = Sepal.Length)) +
  geom_path() +
  coord_serialaxes()
# The serial axes are `Sepal.Length`, `Sepal.Length`
# No meaningful labels
ggplot(iris,
       mapping = aes(Sepal.Length = Sepal.Length,
                     Sepal.Width = Sepal.Width,
                     Sepal.Length = Sepal.Length)) +
  geom_serialaxes()
```
Also, if the dimension of the data is large, typing each variate in the mapping aesthetics is a headache. The parameter `axes.sequence` is provided to determine the axes, so a serial axes object can be created without listing each variate in the mapping.
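A minimal sketch of this shortcut, assuming `axes.sequence` takes a character vector of column names as described above. The object name `p_seq` is hypothetical, and the `colour` mapping is added only to distinguish the species.

```r
library(ggmulti)

# Use the four numeric columns of iris as axes, in their original order,
# without naming each one in the mapping aesthetics
p_seq <- ggplot(iris, mapping = aes(colour = Species)) +
  geom_path(alpha = 0.2) +
  coord_serialaxes(axes.sequence = colnames(iris)[-5])
p_seq
```

Reordering or repeating entries in the vector passed to `axes.sequence` would change the order of the axes accordingly.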
Finally, please report bugs here. Enjoy high-dimensional visualization! “Don’t panic… Just do it in ‘serial’” (Alfred Inselberg 1999).