LKremer/ggpointdensity
Introduces geom_pointdensity(): a cross between a scatter plot and a 2D density plot.

Installation

To install the package, type this command in R:

    install.packages("ggpointdensity")

    # Alternatively, you can install the latest
    # development version from GitHub:
    if (!requireNamespace("devtools", quietly = TRUE))
        install.packages("devtools")
    devtools::install_github("LKremer/ggpointdensity")

Motivation

There are several ways to visualize data points on a 2D coordinate system: If you have lots of data points on top of each other, geom_point() fails to give you an estimate of how many points are overlapping. geom_density2d() and geom_bin2d() solve this issue, but they make it impossible to investigate individual outlier points, which may be of interest.

geom_pointdensity() aims to solve this problem by combining the best of both worlds: individual points are colored by the number of neighboring points. This allows you to see the overall distribution, as well as individual points.
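To make "number of neighboring points" concrete, here is a minimal base-R sketch that counts, for each point, how many other points lie within a fixed radius. It is an illustration only, not the package's implementation: the function name count_neighbors_sketch and the radius argument are hypothetical, and the pairwise distance matrix it builds is only practical for a few thousand points.

```r
# Hypothetical helper: count the neighbors of each point within `radius`.
count_neighbors_sketch <- function(x, y, radius) {
  d <- as.matrix(dist(cbind(x, y)))  # pairwise Euclidean distances
  rowSums(d <= radius) - 1           # subtract 1 to exclude the point itself
}

set.seed(1)
x <- c(rnorm(200), 8)  # a dense cluster plus one isolated outlier at (8, 8)
y <- c(rnorm(200), 8)
n_neighbors <- count_neighbors_sketch(x, y, radius = 0.5)
```

Points inside the cluster get high counts, while the isolated outlier gets a count of zero. Mapping such a count to color is what keeps both the overall distribution and individual outliers visible.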

Changelog

Added a method argument and renamed the n_neighbor stat to density. The available options are method="auto", method="default" and method="kde2d". default is the regular n_neighbor calculation as in the CRAN package. kde2d uses 2D kernel density estimation to estimate the point density (credits to @slowkow). This method is slower for few points, but faster for many (ca. >20k) points. By default, method="auto" picks either kde2d or default depending on the number of points.
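As a rough illustration of the kernel density idea, the sketch below estimates the density on a regular grid with MASS::kde2d() and then looks up the grid cell that each point falls into. This is a sketch under assumptions, not necessarily how the package does it: the function name density_at_points_sketch and the n_grid argument are hypothetical. Because the grid size is fixed, the cost of the lookup grows only linearly with the number of points, which fits the remark that this approach pays off for large data sets.

```r
library(MASS)  # provides kde2d(); MASS ships with standard R installations

# Hypothetical helper: approximate the density at each point from a KDE grid.
density_at_points_sketch <- function(x, y, n_grid = 100) {
  grid <- MASS::kde2d(x, y, n = n_grid)    # density on an n_grid x n_grid grid
  ix <- pmax(findInterval(x, grid$x), 1L)  # grid index along x for each point
  iy <- pmax(findInterval(y, grid$y), 1L)  # grid index along y for each point
  grid$z[cbind(ix, iy)]                    # look up one grid value per point
}

set.seed(42)
x <- rnorm(5000)
y <- rnorm(5000, sd = 10)
dens <- density_at_points_sketch(x, y)
```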

Demo

Generate some toy data and visualize it with geom_pointdensity():

    library(ggplot2)
    library(dplyr)
    library(viridis)
    library(ggpointdensity)

    dat <- bind_rows(
      tibble(x = rnorm(7000, sd = 1),
             y = rnorm(7000, sd = 10),
             group = "foo"),
      tibble(x = rnorm(3000, mean = 1, sd = .5),
             y = rnorm(3000, mean = 7, sd = 5),
             group = "bar"))

    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity() +
      scale_color_viridis()

Each point is colored according to the number of neighboring points. (Note: this is the dev branch, which plots the density estimate instead of n_neighbors.) The distance threshold to consider two points as neighbors (smoothing bandwidth) can be adjusted with the adjust argument, where adjust = 0.5 means use half of the default bandwidth.

    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity(adjust = .1) +
      scale_color_viridis()

    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity(adjust = 4) +
      scale_color_viridis()

Of course you can combine the geom with standard ggplot2 features such as facets...

    # Facetting by group
    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity() +
      scale_color_viridis() +
      facet_wrap(~ group)

... or point shape and size:

    dat_subset <- sample_frac(dat, .1)  # smaller data set

    ggplot(data = dat_subset, mapping = aes(x = x, y = y)) +
      geom_pointdensity(size = 3, shape = 17) +
      scale_color_viridis()

Zooming into the axes works as well, but keep in mind that xlim() and ylim() change the density since they remove data points. It may be better to use coord_cartesian() instead.

    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity() +
      scale_color_viridis() +
      xlim(c(-1, 3)) + ylim(c(-5, 15))

    ggplot(data = dat, mapping = aes(x = x, y = y)) +
      geom_pointdensity() +
      scale_color_viridis() +
      coord_cartesian(xlim = c(-1, 3), ylim = c(-5, 15))

Authors

Lukas PM Kremer (@LPMKremer) and Simon Anders (@s_anders_m), 2019
