📈 📊 Introduces geom_pointdensity(): A Cross Between a Scatter Plot and a 2D Density Plot.
LKremer/ggpointdensity
Introduces geom_pointdensity(): A cross between a scatter plot and a 2D density plot.
To install the package, type this command in R:

    install.packages("ggpointdensity")

    # Alternatively, you can install the latest
    # development version from GitHub:
    if (!requireNamespace("devtools", quietly = TRUE))
      install.packages("devtools")
    devtools::install_github("LKremer/ggpointdensity")

There are several ways to visualize data points on a 2D coordinate system: If you have lots of data points on top of each other, geom_point() fails to give you an estimate of how many points are overlapping. geom_density2d() and geom_bin2d() solve this issue, but they make it impossible to investigate individual outlier points, which may be of interest.

geom_pointdensity() aims to solve this problem by combining the best of both worlds: individual points are colored by the number of neighboring points. This allows you to see the overall distribution, as well as individual points.

Generate some toy data and visualize it with geom_pointdensity():

    library(ggplot2)
    library(dplyr)
    library(viridis)
    library(ggpointdensity)

    dat <- bind_rows(
      tibble(x = rnorm(7000, sd = 1),
             y = rnorm(7000, sd = 10),
             group = "foo"),
      tibble(x = rnorm(3000, mean = 1, sd = .5),
             y = rnorm(3000, mean = 7, sd = 5),
             group = "bar"))

    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity() +
      scale_color_viridis()

Each point is colored according to the number of neighboring points. The distance threshold to consider two points as neighbors (smoothing bandwidth) can be adjusted with the adjust argument, where adjust = 0.5 means use half of the default bandwidth.

    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity(adjust = .1) +
      scale_color_viridis()

    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity(adjust = 4) +
      scale_color_viridis()

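To make the idea of counting neighbors within a distance threshold more concrete, here is a minimal base R sketch. It is an illustration only and not the package's own implementation: the helper count_neighbors(), the value of base_radius, and the toy data are all assumptions made for this example.

```r
# Hypothetical helper (not part of ggpointdensity): for every point, count
# how many other points lie within Euclidean distance `radius`.
count_neighbors <- function(x, y, radius) {
  d <- as.matrix(dist(cbind(x, y)))  # pairwise Euclidean distances
  rowSums(d <= radius) - 1           # subtract 1 to exclude the point itself
}

set.seed(1)
x <- c(rnorm(200, sd = 1), rnorm(100, mean = 1, sd = 0.5))
y <- c(rnorm(200, sd = 1), rnorm(100, mean = 1, sd = 0.5))

base_radius <- 0.5  # assumed threshold, chosen only for illustration

n_default <- count_neighbors(x, y, base_radius)
n_half    <- count_neighbors(x, y, 0.5 * base_radius)  # analogous to adjust = 0.5

# A smaller threshold can never increase a point's neighbor count
summary(n_default)
summary(n_half)
```

A smaller threshold resolves finer structure but gives noisier counts, while a larger one smooths the counts over a wider area.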
Of course you can combine the geom with standard ggplot2 features such as facets...

    # Facetting by group
    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity() +
      scale_color_viridis() +
      facet_wrap(~ group)

... or point shape and size:

    dat_subset <- sample_frac(dat, .1)  # smaller data set
    ggplot(data = dat_subset, mapping = aes(x = x, y = y)) +
      geom_pointdensity(size = 3, shape = 17) +
      scale_color_viridis()

Zooming into the axes works as well, but keep in mind that xlim() and ylim() change the density since they remove data points. It may be better to use coord_cartesian() instead.

    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity() +
      scale_color_viridis() +
      xlim(c(-1, 3)) +
      ylim(c(-5, 15))

    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity() +
      scale_color_viridis() +
      coord_cartesian(xlim = c(-1, 3), ylim = c(-5, 15))

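The difference between the two zooming approaches can be sketched in base R. This is a conceptual illustration, not what ggplot2 or ggpointdensity run internally: the helper count_neighbors(), the radius, and the zoom window are assumptions made for this example.

```r
# Hypothetical helper (not part of ggpointdensity): neighbors within `radius`.
count_neighbors <- function(x, y, radius) {
  d <- as.matrix(dist(cbind(x, y)))  # pairwise Euclidean distances
  rowSums(d <= radius) - 1           # exclude the point itself
}

set.seed(2)
x <- rnorm(500)
y <- rnorm(500)
inside <- x >= -1 & x <= 1  # assumed zoom window on the x axis

# Like coord_cartesian(): count on all data, then only show the window
n_zoomed <- count_neighbors(x, y, radius = 0.3)[inside]

# Like xlim(): drop points outside the window first, then count
n_filtered <- count_neighbors(x[inside], y[inside], radius = 0.3)

# Points near the window border may lose neighbors that were filtered out,
# so the filtered counts are lower than or equal to the zoomed counts
sum(n_filtered < n_zoomed)
```

Counting before zooming keeps the density of points near the window border consistent with the full data set.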
You can re-use or modify the density estimates using ggplot2's after_stat() function.

For instance, let's say you want to plot the density estimates on a relative instead of an absolute scale, i.e. scaled from 0 to 1. Of course this can be achieved by dividing the absolute density values by the maximum, but how do you access the density estimates in R code? The short answer is to use after_stat(density) inside an aesthetics mapping like so:

    ggplot(data = dat, aes(x = x, y = y, color = after_stat(density / max(density)))) +
      geom_pointdensity(size = .3) +
      scale_color_viridis() +
      labs(color = "relative\ndensity")

For a more in-depth explanation of after_stat(), check out the relevant ggplot2 documentation.

Since plotting the relative density is a common use-case, we provide a little shortcut. Instead of the solution above you can simply use after_stat(ndensity). This is especially useful when facetting data, since sometimes you want to inspect the point density separately for each facet:

    ggplot(data = dat, aes(x = x, y = y, color = after_stat(ndensity))) +
      geom_pointdensity(size = .25) +
      scale_color_viridis() +
      facet_wrap(~ group) +
      labs(color = "relative\ndensity")

Even though the foo data group is not as dense as bar overall, this plot uses the whole color scale between 0 and 1 in both facets.
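
The effect of scaling within each facet, compared with scaling over the whole data set, can be sketched in base R. The density values below are made up for illustration, and the sketch assumes that the relative density is computed separately per facet, as the plot description above suggests. It does not call the package.

```r
# Hypothetical absolute density values for two groups (made-up numbers)
density <- c(2, 4, 8, 10, 20, 40)
group   <- c("foo", "foo", "foo", "bar", "bar", "bar")

# Scaling by the global maximum: the sparser group never reaches 1
rel_global <- density / max(density)

# Scaling by each group's own maximum: both groups span up to 1
rel_by_group <- ave(density, group, FUN = function(d) d / max(d))

tapply(rel_global, group, max)
tapply(rel_by_group, group, max)
```

With per-group scaling, colors are comparable within a facet but no longer between facets.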

Lastly, you can use after_stat() to affect other plot aesthetics such as point size:

    ggplot(data = dat, aes(x = x, y = y, size = after_stat(1 / density^1.8))) +
      geom_pointdensity(adjust = .2) +
      scale_color_viridis() +
      scale_size_continuous(range = c(.001, 3))

Here the point size is proportional to 1 / density ^ 1.8.

Lukas PM Kremer (@LPMKremer) and Simon Anders (@s_anders_m), 2019