An R package for plotting wide-format data
xuan-liang/ggmatplot
`ggmatplot` is a quick and easy way of plotting the columns of two matrices or data frames against each other using `ggplot2`.
`ggplot2` requires wide format data to be wrangled into long format for plotting, which can be quite cumbersome when creating simple plots. Therefore, the motivation for `ggmatplot` is to provide a solution that allows `ggplot2` to handle wide format data. Although `ggmatplot` doesn’t provide the same flexibility as `ggplot2`, it can be used as a workaround for having to wrangle wide format data into long format and creating simple plots using `ggplot2`.
`ggmatplot` is built upon `ggplot2`, and its functionality is inspired by `matplot`. Therefore, `ggmatplot` can be considered a `ggplot` version of `matplot`.
Similar to `matplot`, `ggmatplot` plots a vector against the columns of a matrix, or the columns of two matrices against each other, or a vector/matrix on its own. However, unlike `matplot`, `ggmatplot` returns a `ggplot` object. Therefore, `ggplot` add-ons such as scales, faceting specifications, coordinate systems and themes can also be added on to the resultant `ggplot` object.
The latest version can be installed from CRAN:

    install.packages("ggmatplot")

Or you can install the development version from GitHub:

    # install.packages("devtools")
    devtools::install_github("xuan-liang/ggmatplot")

    library(ggmatplot)

The first example plots a vector against each column of a matrix using the default `plot_type = "point"` of `ggmatplot()`. We consider a simple case in which we have a covariate vector `x` and a matrix `z` with the response `y` and the fitted value `fit.y` as its two columns.

    # vector x
    x <- c(rnorm(100, sd = 2))

    # matrix z
    y <- x * 0.5 + rnorm(100, sd = 1)
    fit.y <- fitted(lm(y ~ x))
    z <- cbind(actual = y, fitted = fit.y)

    ggmatplot(x, z)

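For comparison, here is a rough sketch of what the same plot takes in plain `ggplot2`: the wide matrix `z` first has to be stacked into long format by hand. The data frame `long` and its column names (`value`, `Group`) are illustrative choices made for this sketch, and the seed is only there to make it reproducible.

```r
library(ggplot2)

set.seed(1) # only so the sketch is reproducible
x <- rnorm(100, sd = 2)
y <- x * 0.5 + rnorm(100, sd = 1)
z <- cbind(actual = y, fitted = fitted(lm(y ~ x)))

# stack the columns of z on top of each other (wide -> long);
# as.vector() on a matrix reads it column by column
long <- data.frame(
  x     = rep(x, times = ncol(z)),
  value = as.vector(z),
  Group = rep(colnames(z), each = nrow(z))
)

p <- ggplot(long, aes(x = x, y = value, color = Group)) +
  geom_point()
p
```

With `ggmatplot(x, z)` the reshaping step is not written out by the user, which is the convenience the package is aiming for.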
If two matrices with an equal number of columns are used, the corresponding columns of the matrices will be plotted against each other, i.e. column 1 of matrix `y` will be plotted against column 1 of matrix `x`, column 2 of matrix `y` will be plotted against column 2 of matrix `x`, etc.
The next example uses the iris dataset, with matrices `x` and `y` as shown below. `Sepal.Width` is plotted against `Sepal.Length` and `Petal.Width` is plotted against `Petal.Length`. Therefore, the groups ‘Column 1’ and ‘Column 2’ can be interpreted as ‘Sepal’ and ‘Petal’ respectively. To make the plot more meaningful, we can further modify the legend labels and axis names using `legend_label`, `xlab` and `ylab`.

    x <- (iris[, c(1, 3)])
    head(x, 5)
    #>   Sepal.Length Petal.Length
    #> 1          5.1          1.4
    #> 2          4.9          1.4
    #> 3          4.7          1.3
    #> 4          4.6          1.5
    #> 5          5.0          1.4

    y <- (iris[, c(2, 4)])
    head(y, 5)
    #>   Sepal.Width Petal.Width
    #> 1         3.5         0.2
    #> 2         3.0         0.2
    #> 3         3.2         0.2
    #> 4         3.1         0.2
    #> 5         3.6         0.2

    ggmatplot(x, y)

    ggmatplot(x, y,
      xlab = "Length",
      ylab = "Width",
      legend_label = c("Sepal", "Petal")
    )

The next example creates a line plot of vector `x` against the columns of matrix `y` by using `plot_type = "line"`. Although the lines are drawn in different colors by default, the `color` parameter allows custom colors to be assigned to them.
The following plot assigns custom colors to the lines, and the limits of the y axis are set using the `ylim` parameter. Further, a ggplot theme is added on to the resultant ggplot object.

    # vector x
    x <- 1:10

    # matrix y
    y <- cbind(square = x^2, cube = x^3)

    ggmatplot(x, y,
      plot_type = "line",
      color = c("blue", "purple"),
      ylim = c(0, 750)
    ) +
      theme_minimal()

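Because the result is an ordinary ggplot object, other `ggplot2` components can be layered on in the same way as the theme. A minimal sketch, assuming `ggplot2` is installed alongside `ggmatplot`; the log scale and the labels are illustrative choices, not part of the original example.

```r
library(ggplot2)
library(ggmatplot)

x <- 1:10
y <- cbind(square = x^2, cube = x^3)

p <- ggmatplot(x, y, plot_type = "line")

# the returned value is a ggplot object
inherits(p, "ggplot")

# so scales and labels can be added with `+`
p +
  scale_y_log10() +
  labs(title = "Powers of x", y = "value (log scale)")
```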
Next is a plot of US personal expenditure over 5 categories and 5 years, a simple example of how wide format data can be used with `ggmatplot()`. Note how the expenditure categories to be used on the x axis are passed as vector `x`, and the expenditure values are passed in wide format as matrix `y`, with its columns corresponding to the grouping structure.
The plot specifies the plot type as `plot_type = "both"`, which is a combination of ‘point’ and ‘line’ plots. It is further customized using `ggmatplot()` parameters and a `ggplot` theme.

    USPersonalExpenditure
    #>                       1940   1945  1950 1955  1960
    #> Food and Tobacco    22.200 44.500 59.60 73.2 86.80
    #> Household Operation 10.500 15.500 29.00 36.5 46.20
    #> Medical and Health   3.530  5.760  9.71 14.0 21.10
    #> Personal Care        1.040  1.980  2.45  3.4  5.40
    #> Private Education    0.341  0.974  1.80  2.6  3.64

    # vector x
    x <- rownames(USPersonalExpenditure)

    ggmatplot(x, USPersonalExpenditure,
      plot_type = "both",
      xlab = "Category",
      ylab = "Expenditure (in Billions of Dollars)",
      legend_title = "Year",
      legend_label = c(1940, 1945, 1950, 1955, 1960)
    ) +
      theme(axis.text.x = element_text(angle = 45, hjust = 1))

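Which dimension ends up as the grouping structure depends only on the orientation of the matrix. As a sketch (not one of the original examples), transposing the matrix with base R's `t()` should put the years on the x axis and turn the expenditure categories into the groups. The names `years` and `by_category` and the labels are illustrative.

```r
library(ggmatplot)

# the column names "1940", "1945", ... become a numeric x vector
years <- as.numeric(colnames(USPersonalExpenditure))

# after t(), rows are years and columns are expenditure categories
by_category <- t(USPersonalExpenditure)

ggmatplot(years, by_category,
  plot_type = "both",
  xlab = "Year",
  ylab = "Expenditure (in Billions of Dollars)",
  legend_title = "Category"
)
```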
Density plots only accept a single matrix or data frame and will group the plot based on its columns. The following density plot uses a two-column matrix, and groups the plot by the two columns. While the default density estimate is represented in the measurement units of the data, an aesthetic mapping is added on to the ggplot object to scale the density estimate to a maximum of 1.

    # matrix x
    x <- (iris[, 1:2])

    # scale each density estimate to a maximum of 1
    ggmatplot(x, plot_type = "density") +
      aes(y = after_stat(scaled)) +
      theme_bw()

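To make the scaling concrete, the following base R sketch computes a kernel density estimate for one column and divides it by its maximum, which matches how `ggplot2` documents the `scaled` computed variable of its density stat. The bandwidth and evaluation grid are `density()` defaults and may not match the plot exactly.

```r
# kernel density estimate of one iris column (base R defaults)
d <- density(iris$Sepal.Length)

# raw estimate: its peak height depends on the units and spread of the data
max(d$y)

# scaled estimate: divided by its own maximum, so the peak is exactly 1
scaled <- d$y / max(d$y)
max(scaled)
#> [1] 1
```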
Boxplots also accept only a single matrix or data frame, and use its columns as individual groups. By default, the fill color is white, but it is easy to customize, and the transparency can be modified with the `alpha` parameter.
It is also worth noting that `alpha` isn’t a parameter defined in `ggmatplot()`, but it can still be used. This is because `ggmatplot` is built upon `ggplot2`, and each `plot_type` corresponds to a `geom` as listed here. Therefore, all valid parameters of the underlying `ggplot2` geom can be used with `ggmatplot()`.

    # matrix x
    x <- (iris[, 1:4])

    ggmatplot(x,
      plot_type = "boxplot",
      color = "black", fill = "grey", alpha = 0.8,
      xlab = "", ylab = ""
    )

Violin plots accept a single matrix or data frame input, and behave similarly to density plots and boxplots.
This plot updates the colors of the two groups using the `color` parameter, and it can be seen that the fill of the violin plots has been updated too. This is because updating either the `color` or `fill` parameter will automatically update the other, unless they are both defined simultaneously.

    # matrix x
    x <- (iris[, 1:2])

    ggmatplot(x,
      plot_type = "violin",
      color = c("#00AFBB", "#E7B800"),
      xlab = "", ylab = ""
    )

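A sketch of the other case described above, where `color` and `fill` are both supplied so that neither is derived from the other. The specific colors are arbitrary, and recycling a single `color` value across groups follows the behavior this README describes for its histogram example; treat it as an assumption for violin plots.

```r
library(ggmatplot)

x <- iris[, 1:2]

# outline color and fill are set independently here
p <- ggmatplot(x,
  plot_type = "violin",
  color = "black",
  fill = c("#00AFBB", "#E7B800"),
  xlab = "", ylab = ""
)
p
```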
Dotplots too accept a single matrix input and plot the distribution of each of its columns. The next example uses `plot_type = "dotplot"` to visualize the distribution of the data with custom colors and a custom `binwidth` value. Note that the default setting for `binwidth` is 1/30 of the range of the data.

    # matrix x
    x <- (iris[, 1:2])

    ggmatplot(x,
      plot_type = "dotplot",
      color = c("#00AFBB", "#E7B800"),
      binwidth = 0.1,
      xlab = "", ylab = ""
    )

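For reference, the default bin width can be worked out by hand. This sketch assumes the range is taken over all plotted values of both columns together, which is an assumption about how the default is applied rather than something stated above.

```r
x <- iris[, 1:2]

# range over every value in the two columns
rng <- range(as.matrix(x))
rng
#> [1] 2.0 7.9

# 1/30 of that range, for comparison with binwidth = 0.1 used above
default_binwidth <- diff(rng) / 30
default_binwidth
#> [1] 0.1966667
```

Under that assumption, the `binwidth = 0.1` in the example is roughly half the default, which gives smaller, more finely binned dots.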
Similar to density plots, violin plots, dotplots and boxplots, histograms too accept a single matrix or data frame input and group the plot using its columns. The histogram in the following example uses a matrix of 4 columns, and therefore groups the plots based on these 4 columns. The plot is also faceted by group, and the legend is removed by a `ggplot` theme setting.
The `color` and `fill` parameters have been defined simultaneously on this plot. However, only a single `color` value is defined, whereas the number of `fill` colors corresponds to the number of groups. If a single value is defined, it will be used for all groups, just as the black line color is used across all groups in this example.

    # matrix x
    x <- (iris[, 1:4])

    ggmatplot(x,
      plot_type = "histogram",
      xlab = "",
      color = "black",
      fill = c("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
    ) +
      facet_wrap(~Group, scales = "free") +
      theme(legend.position = "none")

The next example uses `plot_type = "ecdf"`, and also takes a single matrix input to plot the empirical cumulative distributions of the columns of the matrix individually.

    # matrix x
    x <- (iris[, 1:4])

    ggmatplot(x,
      plot_type = "ecdf",
      xlab = "", ylab = "Empirical CDF",
      size = 1
    ) +
      theme_minimal()

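The quantity being plotted can be reproduced for a single column with base R's `ecdf()`, which returns a step function giving the proportion of observations less than or equal to a value. This is a sketch of the underlying statistic only, not of how `ggmatplot` computes it internally.

```r
# empirical CDF of one column
Fn <- ecdf(iris$Sepal.Length)

# proportion of observations at or below the median
Fn(median(iris$Sepal.Length))

# the function is 0 below the smallest value and 1 at the largest
Fn(min(iris$Sepal.Length) - 1)
#> [1] 0
Fn(max(iris$Sepal.Length))
#> [1] 1
```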
Error plots also accept only a single matrix input, and plot error bars for each column of the matrix. The `desc_stat` parameter of `ggmatplot()` can be used to define what the midpoint and error bars of the plot should represent.
The next example plots an `errorplot` using the medians and interquartile ranges of each variable.

    # matrix x
    x <- (iris[, 1:4])

    ggmatplot(x,
      plot_type = "errorplot",
      desc_stat = "median_iqr",
      xlab = "",
      size = 1
    ) +
      theme_minimal()

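The summary statistics behind this plot can be checked with base R. The sketch below computes the quartiles of each column; it assumes `median_iqr` means the median as the midpoint with the first and third quartiles as the bar ends, and `quantile()`'s default method may differ from the one `ggmatplot` uses.

```r
x <- iris[, 1:4]

# rows: 25%, 50% (median), 75%; columns: the four variables
stats <- sapply(x, quantile, probs = c(0.25, 0.5, 0.75))
stats

# interquartile range of each variable
stats["75%", ] - stats["25%", ]
```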











