mtb: Change Axis

Y. Hsu

2025-02-28

Background - Why use a data-derived axis transformation

For continuous \(x\) values, the common options for the \(x\)-axis scale are linear or log. Suppose the \(x\) vector has a small number of distinct values, some clustered tightly together and others far from the cluster. In that case, part of the plot is overcrowded while other parts are left nearly empty.

The function trans_composition() yields a linear scale for the \(x\)-axis below a break point. All unique \(x\) values above the break point are plotted with equal spacing between them. The function trans_loglinear() applies an algorithmically derived log transformation so that the \(x\) values are spaced out for easier interpretation.
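The idea behind the composite scale can be sketched in base R. This is an illustration only, not the implementation used by trans_composition(): the helper name equal_space_above() and its gap argument are hypothetical, and the sketch assumes that equal spacing starts from the largest distinct value at or below the break point.

```r
# Hypothetical helper (not part of mtb): plotting positions that are linear up
# to `brk` and equally spaced, `gap` apart, for distinct values above `brk`.
equal_space_above <- function(x, brk, gap) {
  ux <- sort(unique(x))
  low <- ux[ux <= brk]    # kept on the linear scale
  high <- ux[ux > brk]    # re-spaced at equal distances
  # Assumption: equal spacing starts from the last value on the linear part
  anchor <- if (length(low) > 0) max(low) else brk
  pos <- c(low, anchor + gap * seq_along(high))
  # Return the position of every original value, duplicates included
  pos[match(x, ux)]
}

x <- c(0.5, 1, 10, 11.5, 17, 100, 300)
equal_space_above(x, brk = 50, gap = 10)
```

With brk = 50, the values up to 17 keep their linear positions, while 100 and 300 are placed one and two gaps beyond 17. With brk = 0, every distinct value is equally spaced.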

Caution - A nonlinear axis scale can be DISTORTING and MISLEADING. The use of transformed scales should be **clearly noted** in the text or on the figure to avoid misleading readers.

These transformations can work with position_dodge() and other adjustments.

How to use

Below is an example that uses the default linearly scaled axis. The \(x\) values are chosen to demonstrate a situation in which they cluster at the lower end. As shown in the figure below, the details near \(x<15\) are hard to see.

    library(mtb)
    library(ggplot2)

    # x values cluster at the low end; five groups are observed at each x
    pdt = data.frame(x = rep(c(0.5, 1, 10, 11.5, 17, 100, 300), each = 5),
                     gp = factor(rep(seq(1, 5), 7)))
    pdt$y = log10(pdt$x) + rnorm(length(pdt$x))

    p = ggplot(pdt, aes(x = x, y = y, group = gp, color = gp)) +
      geom_point() +
      geom_line() +
      ggtitle("Plant Growth Chart")
    p
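As a rough check of the crowding described above, the share of the linear axis taken up by the clustered values can be computed directly. This is a minimal base-R sketch; the cutoff of 17 is taken from the example data, and the variable names are illustrative.

```r
x <- c(0.5, 1, 10, 11.5, 17, 100, 300)

# Width of the whole axis on a linear scale
axis_range <- diff(range(x))

# Width occupied by the five values clustered at the low end
clustered <- x[x <= 17]
share <- diff(range(clustered)) / axis_range

round(share, 3)
```

Five of the seven distinct values fall within about 5.5% of the axis width, which is why the points below 17 overlap on the linear scale.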

Below is an example that shows how to use trans_composition() with the break point set to 50. Distinct \(x\) values above 50 are plotted with equal spacing between them. A nonlinearly transformed x-axis gives a distorted sense of the distance between the last two measured time points. In addition, the long distances between the last three \(x\) values make the linear connections between observations less certain, but the connections are still needed to identify individual groups. The example below uses dashed lines above \(x = 17\) to indicate the change of scale more clearly.

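    t = trans_composition(pdt$x, nb = 30, brk = 50, dab = 1.5, dgrd = 1, dgrd2 = 0.5)

    # largest observed x at or below the break point (17 for this data)
    x_last = max(pdt$x[pdt$x <= t$brk])

    p2 = ggplot(pdt, aes(x = x, y = y, group = gp, color = gp)) +
      geom_point() +
      # solid lines on the linear part of the axis
      geom_line(data = pdt[pdt$x <= t$brk, ],
                mapping = aes(x = x, y = y, group = gp, color = gp)) +
      # dashed lines from the last linear point onward
      geom_line(data = pdt[pdt$x >= x_last, ],
                mapping = aes(x = x, y = y, group = gp, color = gp), lty = 2) +
      ggtitle("Plant Growth Chart")

    p2 + scale_x_continuous(trans = t) +
      geom_vline(xintercept = t$brk, lwd = 6, alpha = 0.7, color = 'lightgray') +
      geom_text(x = -Inf, y = Inf, hjust = 0, vjust = 1,
                label = 'Caution: X-axis scale is not linear above 17',
                color = 'darkred')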
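The dashed segments are meant to start at the largest observed \(x\) at or below the break point. A small base-R check of that value is sketched below; the variable names are illustrative. Note that the comparison has to be used as an index, because taking max() of the comparison itself returns 1 (TRUE) rather than an \(x\) value.

```r
x <- c(0.5, 1, 10, 11.5, 17, 100, 300)
brk <- 50

# Index x by the logical vector, then take the maximum of the selected values
last_linear <- max(x[x <= brk])

# Maximum of the logical vector itself: TRUE is coerced to 1
max_of_flags <- max(x <= brk)

c(last_linear = last_linear, max_of_flags = max_of_flags)
```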

Below is another example that shows how to use trans_composition(), this time with the break point set to 0. In this example, all \(x\) values are plotted with equal spacing between them.

    t = trans_composition(pdt$x, nb = 30, brk = 0, dab = 2, dgrd = 1, dgrd2 = 1)

    p3 = ggplot(pdt, aes(x = x, y = y, group = gp, color = gp)) +
      geom_point() +
      geom_line(lty = 2) +
      ggtitle("Plant Growth Chart")

    p3 + scale_x_continuous(trans = t) +
      geom_text(x = -Inf, y = Inf, hjust = 0, vjust = 1,
                label = 'Caution: X-axis scale is not linear',
                color = 'darkred')

Danger Zone - Please Use Responsibly

Below is an example that shows how to use trans_loglinear(). The trans_loglinear() transformation increases the spacing between smaller \(x\) values, which may be appropriate only in certain situations.

    t = trans_loglinear(pdt$x, mindist = 0.03)

    p4 = ggplot(pdt, aes(x = x, y = y, group = gp, color = gp)) +
      geom_point() +
      geom_line(lty = 3) +
      ggtitle("Plant Growth Chart")

    p4 + scale_x_continuous(trans = t) +
      geom_text(x = -Inf, y = Inf, hjust = 0, vjust = 1,
                label = 'Caution: X-axis scale is not linear',
                color = 'red')
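To get a feel for how a log-type transformation redistributes space, the gaps between neighbouring \(x\) values can be compared on a linear and on a plain log10 scale, each expressed as a fraction of the axis width. This is only a stand-in: the algorithm behind trans_loglinear() and its mindist argument is not reproduced here, and rel_gaps() is a hypothetical helper.

```r
x <- c(0.5, 1, 10, 11.5, 17, 100, 300)

# Hypothetical helper: gaps between neighbours as a fraction of the axis width
rel_gaps <- function(v) diff(v) / diff(range(v))

linear_gaps <- rel_gaps(x)
log_gaps <- rel_gaps(log10(x))

# One column per neighbouring pair, one row per scale
round(rbind(linear = linear_gaps, log10 = log_gaps), 3)
```

On the log10 scale the gap between 0.5 and 1 grows from a negligible fraction of the axis to roughly a tenth of it, while the gap between 100 and 300 shrinks. This is the sense in which smaller values receive more room.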

In some line plots, a straight line between two points is used to indicate that the two points are observations from the same group. Those straight lines do not necessarily indicate or suggest a linear trend between the points. Also note that a straight line on a log-scale plot corresponds to a curve, not a straight line, on a linear-scale plot.
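A short numeric check illustrates the last point. The two endpoints below are made-up values chosen for easy arithmetic. The sketch compares the height of the connecting segment at the same \(x\) when the segment is straight on a log10 x-axis and when it is straight on a linear x-axis.

```r
# Two made-up observations from the same group
x1 <- 10;   y1 <- 1
x2 <- 1000; y2 <- 3

# The visual midpoint on a log10 axis is the geometric mean of x1 and x2
x_mid <- 10^((log10(x1) + log10(x2)) / 2)

# Height at x_mid of the segment that is straight on the log10 axis
y_log_line <- y1 + (y2 - y1) * (log10(x_mid) - log10(x1)) / (log10(x2) - log10(x1))

# Height at the same x of the segment that is straight on the linear axis
y_lin_line <- y1 + (y2 - y1) * (x_mid - x1) / (x2 - x1)

c(x_mid = x_mid, log_axis = y_log_line, linear_axis = y_lin_line)
```

The two heights differ, so a segment that is straight on one scale is curved on the other. Which way it bends depends on which axis is log-scaled and on the sign of the slope.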

