Type: Package
Title: Hierarchical Heatmaps
Version: 0.0.1.1
Maintainer: Michael Mahony <michael.mahony@cantab.net>
Description: Allows users to create high-quality heatmaps from labelled, hierarchical data. Specifically, for data with a two-level hierarchical structure, it will produce a heatmap where each row and column represents a category at the lower level. These rows and columns are then grouped by the higher-level group each category belongs to, with the names for each category and groups shown in the margins. While other packages (e.g. 'dendextend') allow heatmap rows and columns to be arranged by groups only, 'hhmR' also allows the labelling of the data at both the category and group level.
License: MIT + file LICENSE
Encoding: UTF-8
URL: https://github.com/sgmmahon/hhmR, https://sgmmahon.github.io/hhmR/
BugReports: https://github.com/sgmmahon/hhmR/issues
Depends: R (≥ 3.5.0)
Imports: dplyr, purrr, tidyr, rlang, grid, ggplot2, patchwork, grDevices, magrittr, utils
Language: en-GB
LazyData: true
RoxygenNote: 7.3.2
Suggests: knitr, rmarkdown
VignetteBuilder: knitr
NeedsCompilation:
Packaged: 2025-10-20 17:29:04 UTC; hornik
Author: Michael Mahony [cre, aut, cph], Francisco Rowe [aut], Carmen Cabrera-Arnau [aut]
Repository: CRAN
Date/Publication: 2025-10-20 17:41:43 UTC

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Arguments

lhs

A value or the magrittr placeholder.

rhs

A function call using the magrittr semantics.

Value

The result of calling 'rhs(lhs)'.

cg

Description

Creates colour gradient between two hexcodes.

Usage

cg(colour1, colour2, n = 15)

Arguments

colour1

The first hexcode colour.

colour2

The second hexcode colour.

n

The length of the vector returned by the function.

Value

A vector of hexcodes of length n, containing a colour gradient between colour1 and colour2.

Examples

cg("white","black",20)
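As a rough illustration of the kind of result described above, the sketch below builds a two-colour gradient with grDevices::colorRampPalette from base R. The helper name gradient_sketch is hypothetical, and this is not necessarily how cg is implemented internally; it only shows the shape of the output, a character vector of n hexcodes running from the first colour to the second.

```r
# Hypothetical stand-in for illustration only (not the package's code):
# interpolate n colours between two hexcodes using base R.
gradient_sketch <- function(colour1, colour2, n = 15) {
  grDevices::colorRampPalette(c(colour1, colour2))(n)
}

# Five steps from white to black; the first and last elements are the
# two input colours themselves.
grad <- gradient_sketch("#FFFFFF", "#000000", 5)
print(grad)
```

A vector like this could plausibly be passed wherever a vector of hexcodes is expected, such as the cclrs argument of hhm, provided its length matches the requirement stated for that argument.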

decimalplaces

Description

Tests the number of non-zero decimal places within a number.

Usage

decimalplaces(x)

Arguments

x

The number for which the number of decimal places is to be measured.

Value

A single number, indicating the number of non-zero decimal places in 'x'.

Examples

decimalplaces(23.43234525)
decimalplaces(334.3410000000000000)
decimalplaces(2.000)
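One common way to count decimal places in base R is to inspect the character representation of the number. The sketch below follows that approach; count_decimals_sketch is a hypothetical name and the package's own implementation may differ. It assumes a single finite number that R prints in fixed (not scientific) notation.

```r
# Hypothetical helper for illustration only (not the package's code).
# Assumes `x` is a single finite number printed in fixed notation;
# values such as 1e-05 are not handled by this sketch.
count_decimals_sketch <- function(x) {
  if (x %% 1 == 0) return(0)                   # whole numbers have no decimals
  txt <- sub("0+$", "", as.character(x))       # drop any trailing zeros
  nchar(strsplit(txt, ".", fixed = TRUE)[[1]][2])
}

count_decimals_sketch(334.3410000000000000)
count_decimals_sketch(2.000)
```

Trailing zeros typed in the source (as in 334.3410000000000000) are not stored in the number, so they do not contribute to the count.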

example_migration

Description

Fake migration dataset used to demonstrate the functionality of the hhm function in the hhmR package. It contains information on the number of people who have moved between a series of fictional geographies. The geographies themselves have a hierarchical structure, with each county belonging to one of a smaller set of regions.

Usage

data(example_migration)

Format

A data frame with 324 rows and 5 variables.

Origin County: The county (lower-level geography) that each migrant began in.
Destination County: The county (lower-level geography) that each migrant ended up in.
Origin Region: The region (higher-level geography) that each migrant began in.
Destination Region: The region (higher-level geography) that each migrant ended up in.
Migration: The number of migrants that moved from each origin county to each destination county.

Examples

library(dplyr)
# Code to create dataset
# Define names of fake counties
fake_counties = c("Greenridge","Windermoor","Bramblewood","Silverlake",
                  "Thornbury","Maplewood","Hawthorne","Pinehurst",
                  "Riverton","Meadowbrook","Fairhaven","Oakdale","Stonebridge",
                  "Brookfield","Ashford","Glenville","Sunnyvale","Westfield")
# Create region county lookup tables
rc_lkp = data.frame(region = c(rep("North",3),rep("Midlands",5),
                               rep("South West",4),rep("South East",6)),
                    county = fake_counties)
og_lkp = rc_lkp %>% setNames(c("Origin Region"     ,"Origin County"     ))
dn_lkp = rc_lkp %>% setNames(c("Destination Region","Destination County"))
# Create dataframe of fake migration data
set.seed(1234)
example_migration = expand.grid(fake_counties,fake_counties) %>%
                    setNames(paste(c("Origin","Destination"),"County",sep=" ")) %>%
                    full_join(og_lkp) %>% full_join(dn_lkp) %>%
                    mutate(Migration = (1/rgamma(18*18, shape = 17, rate = 0.5)) %>%
                                       {. * 1000} %>% round())
example_migration[example_migration$`Origin County` ==
                  example_migration$`Destination County`,"Migration"] =
  example_migration[example_migration$`Origin County` ==
                    example_migration$`Destination County`,"Migration"] * 10

example_time_series

Description

Fake migration dataset used to demonstrate the functionality of the tshhm function in the hhmR package. It contains information on the number of people who have immigrated to a series of fictional geographies over the years 2011 to 2015. The geographies themselves have a hierarchical structure, with each county belonging to one of a smaller set of regions.

Usage

data(example_time_series)

Format

A data frame with 90 rows and 4 variables.

County: The county (lower-level geography) that immigrants move to.
Region: The region (higher-level geography) that immigrants move to.
Year: The year during which each wave of immigration occurred.
Immigration: The number of immigrants that moved to each county in each year.

Examples

library(dplyr)
library(tidyr)
# Define names of fake counties
fake_counties = c("Greenridge","Windermoor","Bramblewood","Silverlake",
                  "Thornbury","Maplewood","Hawthorne","Pinehurst",
                  "Riverton","Meadowbrook","Fairhaven","Oakdale","Stonebridge",
                  "Brookfield","Ashford","Glenville","Sunnyvale","Westfield")
# Create dataframe of fake migration data
set.seed(1234)
example_time_series = data.frame(region = c(rep("North",3),rep("Midlands",5),
                                            rep("South West",4),rep("South East",6)),
                                 county = fake_counties,
                                 year_2011 = sample(1:10000,length(fake_counties)),
                                 year_2012 = sample(1:10000,length(fake_counties)),
                                 year_2013 = sample(1:10000,length(fake_counties)),
                                 year_2014 = sample(1:10000,length(fake_counties)),
                                 year_2015 = sample(1:10000,length(fake_counties))) %>%
  setNames(c("Region","County",2011:2015)) %>%
  pivot_longer(cols = `2011`:`2015`,
               names_to = "Year",
               values_to = "Immigration") %>%
  mutate(Year = as.numeric(Year))
example_time_series[sample(1:(length(fake_counties)*5),5),"Immigration"] = NA

exp_seq

Description

Creates a vector of exponentially increasing values between 0 and a specified value 'n'. If 'n' is specified as 1, the vector will be scaled to between 0 and 1.

Usage

exp_seq(n, ln = 15, exponent = 2, round_values = TRUE, rmv_extremes = TRUE)

Arguments

n

The maximum value that the values in the sequence are scaled to.

ln

How long the vector should be (defaults to 15).

exponent

The exponential power applied to the sequence (defaults to 2).

round_values

Option to round values to whole numbers (defaults to 'TRUE'). If 'n' equals 1, round_values will automatically be set to FALSE.

rmv_extremes

Option to remove zero and the maximum value (i.e. 'n') from the beginning and the end of the returned vector (defaults to 'TRUE', as shown in Usage). Note that this will mean the length of the returned vector will be 'ln' - 2.

Value

A vector containing exponentially increasing values between 0 and a specified value 'n'.

Examples

# Create sequence of length 8, scaled between 0 and 10000
exp_seq(10000,8)
# Set rmv_extremes = FALSE to get full sequence
exp_seq(10000,8,rmv_extremes = FALSE)
# The exponent defaults to 2. Setting it to between 1 and 2 causes it to converge on
# a linear sequence. When exponent is set to 1 the sequence increases linearly
exp_seq(10000,8,exponent=1)
# Setting it to greater than 2 will cause the values in the sequence to shift towards zero
exp_seq(10000,8,exponent=4)
# Create sequence of length 12, scaled between 0 and 1
exp_seq(1,12)
exp_seq(1,12,rmv_extremes = FALSE)
exp_seq(1,12,exponent=1)
exp_seq(1,12,exponent=4)
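The effect of the exponent described in the comments above can be sketched in base R by raising an evenly spaced sequence on [0, 1] to a power and rescaling it to 'n'. This is an assumption about the general technique, not the package's actual implementation; exp_seq_sketch is a hypothetical name, and rounding and the removal of extremes are left out.

```r
# Hypothetical sketch for illustration only (not the package's code):
# power-transform an evenly spaced sequence, then scale it to n.
exp_seq_sketch <- function(n, ln = 15, exponent = 2) {
  (seq(0, 1, length.out = ln)^exponent) * n
}

# With exponent = 1 the spacing is linear
linear  <- exp_seq_sketch(100, 5, exponent = 1)
# With exponent = 2 the lower values are pulled towards zero
squared <- exp_seq_sketch(100, 5, exponent = 2)

print(linear)
print(squared)
```

Under this sketch, larger exponents concentrate more of the sequence near zero, which matches the behaviour the example comments describe.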

Hierarchical Heatmap

Description

Creates a labelled heatmap from hierarchical data. This function is useful if you wish to create a heatmap where the categories shown on both the x and y axes can be grouped in some way. This heatmap will order the categories by their assigned group and present both the category and group labels along the axes. An example might be a series of smaller geographies (lower categories) which aggregate into larger geographical regions (upper groups).

Usage

hhm(
  df,
  ylower,
  yupper,
  xlower,
  xupper,
  values,
  rm_diag = FALSE,
  lgttl = NULL,
  bins = NULL,
  cbrks = NULL,
  cclrs = NULL,
  norm_lgd = FALSE,
  lgdps = 0,
  xttl_height = 0.15,
  yttl_width = 0.15
)

Arguments

df

A data.frame containing the values with which to populate the heatmap. The data.frame must include columns specifying the lower categories ('ylower', 'xlower') and upper groups ('yupper', 'xupper') that each value corresponds to. These categories and groups will be used to arrange and label the rows and columns of the heatmap. It must also contain a 'values' variable containing the values used to populate the heatmap. Note that the groups will by default be arranged alphabetically (top to bottom / left to right). The ordering of the groups can be manually specified by converting yupper and/or xupper to factors. In this case, the groups will be ordered based on the ordering of the factor levels.

ylower

A column in 'df' containing the categories that will be presented as rows along the y-axis of the heatmap.

yupper

A column in 'df' containing the groupings that will be used to arrange the heatmap rows.

xlower

A column in 'df' containing the categories that will be presented as columns along the x-axis of the heatmap.

xupper

A column in 'df' containing the groupings that will be used to arrange the heatmap columns.

values

A column in 'df' containing the values used to populate the heatmap.

rm_diag

Do not show values for categories along the x and y axes that are identical (defaults to 'FALSE'). This is particularly useful for origin-destination heatmaps, where the user may want to hide the diagonal values.

lgttl

Option to manually define legend title.

bins

Option to break the data into a specified number of groups (defaults to 'NULL'). The thresholds between these groups will be equally spaced between zero and the maximum value observed in 'values'.

cbrks

Vector of custom breaks, if users wish to use a discrete legend colour scheme (defaults to 'NULL'). For example, a supplied vector of 'c(5,10,20)' would break the values up into 5 ordered groups of ranges 0, 0-5, 5-10, 10-20 and 20+.

cclrs

Vector of hexcodes with which to create a custom legend colour scheme (defaults to 'NULL'). If 'cbrks' is supplied, 'cclrs' must have a length two longer than 'cbrks'. If 'bins' is supplied, 'cclrs' must have a length equal to the value provided to 'bins'.

norm_lgd

Option to normalise the legend values to between 0 and 1 (defaults to 'FALSE'). Allows for consistency when comparing heatmaps across different datasets. At present, this only works if all heatmap values are positive.

lgdps

If using custom breaks, define the number of decimal places to round the legend scale to (defaults to 0). If 'norm_lgd' is 'TRUE', it will default to 3.

xttl_height

The space allocated to the group titles on the x-axis as a proportion of the heatmap's height (defaults to 0.15).

yttl_width

The space allocated to the group titles on the y-axis as a proportion of the heatmap's width (defaults to 0.15).
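The relationship between 'cbrks' and 'cclrs' can be checked with a small base R sketch. It reproduces the five groups from the 'cbrks' example (0, 0-5, 5-10, 10-20 and 20+) using cut(), which is why 'cclrs' needs two more colours than there are breaks. The use of cut() and the treatment of values falling exactly on a break are assumptions made for illustration, not a description of how hhm bins values internally.

```r
# Illustration only: count the groups implied by a vector of custom breaks.
cbrks  <- c(5, 10, 20)
values <- c(0, 3, 7, 15, 40)

# One group for zero, one between each pair of breaks, one above the last
# break. Intervals are right-closed here, which is an assumption.
groups <- cut(values,
              breaks = c(-Inf, 0, cbrks, Inf),
              labels = c("0", "0-5", "5-10", "10-20", "20+"))

n_groups <- nlevels(groups)
print(table(groups))
```

Here n_groups equals length(cbrks) + 2, so a 'cclrs' vector for these breaks would need five hexcodes.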

Value

A ggplot object containing the final heatmap.

Examples

# Import toy demonstration dataset (see `?example_migration` for details)
data(example_migration)
# Initial heatmap
hierarchical_heatmap = hhm(df = example_migration,
                           ylower = "Origin County",
                           xlower = "Destination County",
                           yupper = "Origin Region",
                           xupper = "Destination Region",
                           values = "Migration",
                           yttl_width = 0.22,
                           xttl_height = 0.4)
# For more details, see the package vignette at
# https://sgmmahon.github.io/hhmR/articles/hhmR_overview.html
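The 'df' argument notes that group order follows factor levels when 'yupper' or 'xupper' is a factor. The base R sketch below shows how such a factor might be prepared before calling hhm. The small data frame is made up for illustration, although its region names are the ones used in example_migration, and the hhm call itself is omitted so the snippet runs without the package.

```r
# Illustration only: a toy column of upper-group labels.
df <- data.frame(`Origin Region` = c("North", "Midlands", "South West", "South East"),
                 check.names = FALSE)

# Default ordering of a character column is alphabetical
default_order <- levels(factor(df$`Origin Region`))

# Convert to a factor with explicit levels to impose a custom group order
custom_levels <- c("North", "Midlands", "South West", "South East")
df$`Origin Region` <- factor(df$`Origin Region`, levels = custom_levels)

print(default_order)
print(levels(df$`Origin Region`))
```

The same conversion would be applied to the column passed as 'xupper' if the columns of the heatmap should follow the same custom order.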

log_seq

Description

Creates a vector of logarithmically increasing values between 0 and a specified value 'n'. If 'n' is specified as 1, the vector will be scaled to between 0 and 1.

Usage

log_seq(n, ln = 15, round_values = TRUE, rmv_extremes = FALSE)

Arguments

n

The maximum value that the values in the sequence are scaled to.

ln

How long the vector should be (defaults to 15).

round_values

Option to round values to whole numbers (defaults to 'TRUE').

rmv_extremes

Value

A vector containing logarithmically increasing values between 0 and a specified value 'n'.

Examples

# Create sequence of length 20, scaled between 0 and 500
log_seq(500,20)
# Create sequence of length 12, scaled between 0 and 1
log_seq(1,12)

plt_ttl

Description

Creates a plot containing the name of a given upper group. Used in combination with the patchwork package to plot the names of the upper groups within the hhm function.

Usage

plt_ttl(ttl, axs = "x", rotate_title = TRUE)

Arguments

ttl

The name of the upper group.

axs

The axis on which the name will appear (defaults to "x"). If 'x', the text will be written at the top-centre of the plot. If 'y', the text will be written at the middle-right of the plot.

rotate_title

Whether the title should be rotated to be perpendicular to the axis (defaults to TRUE). If TRUE, the title text on the x and y axes will be printed horizontally and vertically respectively, with the reverse orientation if set to FALSE.

Value

A ggplot object containing the title of a given upper group, for use in the hhm function.

Examples

plt_ttl("Group 1", axs = "y")
plt_ttl("Group 2")
plt_ttl("Group 1", axs = "y",rotate_title = FALSE)
plt_ttl("Group 2"           ,rotate_title = FALSE)

Time-series Hierarchical Heatmap

Description

Creates a labelled time-series heatmap from hierarchical data. This function is useful if you wish to create a time-series heatmap where the categories shown on the y-axis can be grouped in some way. This heatmap will order the categories by their assigned group and present both the category and group labels along the y-axis. An example might be a series of smaller geographies (lower categories) which aggregate into larger geographical regions (upper groups).

Usage

tshhm(
  df,
  lower,
  upper,
  times,
  values,
  sort_lower = "alphabetical",
  lgttl = NULL,
  bins = NULL,
  cbrks = NULL,
  cclrs = NULL,
  norm_lgd = FALSE,
  lgdps = 0,
  na_colour = NULL,
  xttl_height = 0.05,
  yttl_width = 0.15
)

Arguments

df

A data.frame containing the values with which to populate the heatmap. The data.frame must include columns specifying the lower categories ('lower') and upper groups ('upper') that each value corresponds to. These categories and groups will be used to arrange and label the rows of the heatmap. 'df' must also contain a 'values' variable, containing the values used to populate the heatmap, and a 'times' variable, containing the time period during which each value was observed. Note that the groups in 'upper' will by default be arranged alphabetically (top to bottom). The ordering of the groups can be manually specified by converting 'upper' to a factor. In this case, the groups will be ordered based on the ordering of the factor levels. The ordering of rows within each group can also be specified using the 'sort_lower' argument.

lower

A column in 'df' containing the categories that will be presented as rows along the y-axis of the heatmap.

upper

A column in 'df' containing the groupings that will be used to arrange the heatmap rows.

times

A column in 'df' containing the time period during which each value in 'values' was observed.

values

A column in 'df' containing the values used to populate the heatmap.

sort_lower

Option to define how rows (lower) within each group (upper) are ordered. The default option is 'alphabetical', which orders rows in alphabetical order from top to bottom. Other options include 'sum_ascend' and 'mean_ascend', which order rows in ascending order (top to bottom) based on the row totals and row means respectively. This order can be reversed with the options 'sum_descend' and 'mean_descend'.

lgttl

Option to manually define legend title.

bins

Option to break the data into a specified number of groups (defaults to 'NULL'). The thresholds between these groups will be equally spaced between the minimum and maximum values observed in 'values'.

cbrks

cclrs

norm_lgd

Option to normalise the legend values to between 0 and 1 (defaults to 'FALSE'). Allows for consistency when comparing heatmaps across different datasets. At present, this only works if all heatmap values are positive.

lgdps

If using custom breaks, define the number of decimal places to round the legend scale to (defaults to 0). If 'norm_lgd' is 'TRUE', it will default to 3.

na_colour

Option to define the colour of NA values in the legend (defaults to 'NULL', meaning NA values will be assigned no colour).

xttl_height

The space allocated to the title on the x-axis as a proportion of the heatmap's height (defaults to 0.05).

yttl_width

The space allocated to the group titles on the y-axis as a proportion of the heatmap's width (defaults to 0.15).

Value

A ggplot object containing the final heatmap.

Examples

library(dplyr)
# Import toy demonstration dataset (see `?example_time_series` for details)
data(example_time_series)
# Initial heatmap
time_series_heatmap = tshhm(df = example_time_series,
                            lower  = "County",
                            upper  = "Region",
                            times  = "Year",
                            values = "Immigration",
                            yttl_width  = 0.25)
# View result
time_series_heatmap
# For more details, see the package vignette at
# https://sgmmahon.github.io/hhmR/articles/hhmR_overview.html
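To give a feel for what the 'sum_ascend' and 'sum_descend' options of 'sort_lower' describe, the base R sketch below orders the categories of a toy time series by their row totals. The data, the use of tapply() and the choice to ignore NA values when summing are all assumptions made for illustration; how tshhm handles missing values when sorting is not stated here.

```r
# Illustration only: a toy long-format time series with one missing value.
ts_toy <- data.frame(County      = rep(c("A", "B", "C"), each = 2),
                     Year        = rep(2011:2012, times = 3),
                     Immigration = c(5, 7, 1, 2, NA, 4))

# Row totals per county, ignoring NA values (an assumption)
totals <- tapply(ts_toy$Immigration, ts_toy$County, sum, na.rm = TRUE)

# Counties ordered by ascending and descending total
ascending  <- names(totals)[order(totals)]
descending <- rev(ascending)

print(ascending)
print(descending)
```

Replacing sum with mean in the tapply() call gives the analogous ordering for the 'mean_ascend' and 'mean_descend' options.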