We will use the same example as in the introduction vignette, the `Animals` object.
```r
library(tidyverse)

animals <- as.matrix(MASS::Animals)
log_animals <- log(animals)

# Build the row annotation: one row per animal, keyed by the "Animal" column
animal_info <- MASS::Animals %>%
  rownames_to_column("Animal") %>%
  mutate(
    is_extinct = case_when(
      Animal %in% c("Dipliodocus", "Triceratops", "Brachiosaurus") ~ TRUE,
      TRUE ~ FALSE
    ),
    class = case_when(
      Animal %in% c("Mountain beaver", "Guinea pig", "Golden hamster",
                    "Mouse", "Rabbit", "Rat") ~ "Rodent",
      Animal %in% c("Potar monkey", "Gorilla", "Human", "Rhesus monkey",
                    "Chimpanzee") ~ "Primate",
      Animal %in% c("Cow", "Goat", "Giraffe", "Sheep") ~ "Ruminant",
      Animal %in% c("Asian elephant", "African elephant") ~ "Elephantidae",
      Animal %in% c("Grey wolf") ~ "Canine",
      Animal %in% c("Cat", "Jaguar") ~ "Feline",
      Animal %in% c("Donkey", "Horse") ~ "Equidae",
      Animal == "Pig" ~ "Sus",
      Animal == "Mole" ~ "Talpidae",
      Animal == "Kangaroo" ~ "Macropodidae",
      TRUE ~ "Dinosaurs"
    )
  ) %>%
  select(-body, -brain)
```

Annotations are internally stored as `tibble::tibble()` objects and can be viewed as simple databases. As such, a key is needed to uniquely identify the rows or the columns. This key is the rownames for row annotation and the colnames for column annotation.
This key is called the tag and, unless specified otherwise when the `matrixset` is created, is stored as `.rowname`/`.colname`. This special tag can be used almost like any other annotation trait - see Applying Functions.
When using an external `data.frame` to create new annotations, the data frame must contain this key in a single column - the column doesn't have to be called `.rowname`/`.colname`.
Moreover, the key values must correspond to rownames/colnames. Values that do not match will simply be left out.
To use the annotation at creation, simply use a command like this:
```r
ms <- matrixset(msr = animals, log_msr = log_animals, row_info = animal_info,
                row_key = "Animal")
ms
#> matrixset of 2 28 × 2 matrices
#>
#> matrix_set: msr
#> A 28 × 2 <dbl> matrix
#>                   body  brain
#> Mountain beaver   1.35   8.10
#>             ...    ...    ...
#>             Pig 192.00 180.00
#>
#> matrix_set: log_msr
#> A 28 × 2 <dbl> matrix
#>                 body brain
#> Mountain beaver 0.30  2.09
#>             ...  ...   ...
#>             Pig 5.26  5.19
#>
#>
#> row_info:
#> # A tibble: 28 × 3
#>    .rowname        is_extinct class
#>    <chr>           <lgl>      <chr>
#>  1 Mountain beaver FALSE      Rodent
#>  2 Cow             FALSE      Ruminant
#>  3 Grey wolf       FALSE      Canine
#>  4 Goat            FALSE      Ruminant
#>  5 Guinea pig      FALSE      Rodent
#>  6 Dipliodocus     TRUE       Dinosaurs
#>  7 Asian elephant  FALSE      Elephantidae
#>  8 Donkey          FALSE      Equidae
#>  9 Horse           FALSE      Equidae
#> 10 Potar monkey    FALSE      Primate
#> # ℹ 18 more rows
#>
#>
#> column_info:
#> # A tibble: 2 × 1
#>   .colname
#>   <chr>
#> 1 body
#> 2 brain
```

Notice how we used the `row_key` argument to specify how to link the two objects together.
The internal `tibble` can be replaced by a new one. This can be a convenient way to add annotations to an existing `matrixset` object where none were registered.
To do so, you can simply do
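A minimal sketch of such a replacement, assuming the replacement form `row_info(ms) <- value` is available alongside the `row_info()` getter used elsewhere in this vignette. The reduced `animal_info` built here is only for illustration; its key column is renamed to `.rowname` so that it matches the default row tag.

```r
library(matrixset)
library(dplyr)
library(tibble)

animals <- as.matrix(MASS::Animals)

# Illustrative annotation table: a key column plus a single trait
animal_info <- MASS::Animals %>%
  rownames_to_column("Animal") %>%
  mutate(is_extinct = Animal %in% c("Dipliodocus", "Triceratops",
                                    "Brachiosaurus")) %>%
  select(Animal, is_extinct)

# A matrixset created without any row annotation
ms_new <- matrixset(msr = animals)

# Assumed replacement form: the key column must carry the row tag name
row_info(ms_new) <- animal_info %>% rename(.rowname = Animal)

row_info(ms_new)
```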
For the operation to work, a column called `.rowname` (or more generally, what is returned by `row_tag()`) must be part of the data frame.
The column equivalents are `column_info` and `column_tag`.
Annotation tibble replacement works even if annotations were registered. Be aware of two things:
This is equivalent to performing a mutating join (default: `dplyr::left_join()`, though all mutating joins - except cross-joins - are available via the `type` argument) between the `matrixset` (`.ms`) object's annotation `tibble` and a `data.frame` (`.y`).
The `by` argument determines how to join the two `data.frame`s together, so it is not necessary for `.y` to have a `.rowname`/`.colname` column. When the `by` argument is not provided, a natural join is performed.
One behavior that differs from a true mutating join is that when a row from `.ms` matches more than one row in `.y`, no row duplication is performed. Instead, a condition error is issued. This preserves the `matrixset` property that all row names (and column names) must be unique.
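The uniqueness rule above can be illustrated with a small sketch. The `dup_info` table is hypothetical and deliberately lists the same animal twice; based on the behavior described above, the join is expected to signal an error rather than duplicate the row. The exact error message is not shown, since it may differ between package versions.

```r
library(matrixset)
library(dplyr)
library(tibble)

animals <- as.matrix(MASS::Animals)

# Hypothetical annotation in which "Cow" appears twice
dup_info <- tibble(Animal = c("Cow", "Cow"), note = c("first", "second"))

# Capture the condition instead of stopping the script
res <- tryCatch(
  matrixset(msr = animals) %>%
    join_row_info(dup_info, by = c(".rowname" = "Animal")),
  error = function(e) e
)

# TRUE if the join was refused, as the text above describes
inherits(res, "error")
```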
```r
matrixset(msr = animals, log_msr = log_animals) %>%
  join_row_info(animal_info, by = c(".rowname" = "Animal"))
```

In fact, when using `join_row_info()`/`join_column_info()`, `.y` can also be a `matrixset` object, in which case the appropriate annotation `tibble` will be used.
The only difference is when using the default `by = NULL` argument. In that case the row/column tag of each object is used.
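As a sketch of the `matrixset`-as-`.y` case, the example below joins the row annotation of one `matrixset` onto another that has none. The object names (`ms_annot`, `ms_plain`, `ms_joined`) and the reduced `animal_info` are illustrative only; `by` is left at its default, so the row tags of both objects are assumed to be used as the key.

```r
library(matrixset)
library(dplyr)
library(tibble)

animals <- as.matrix(MASS::Animals)

# Illustrative annotation table: a key column plus a single trait
animal_info <- MASS::Animals %>%
  rownames_to_column("Animal") %>%
  mutate(is_extinct = Animal %in% c("Dipliodocus", "Triceratops",
                                    "Brachiosaurus")) %>%
  select(Animal, is_extinct)

# One matrixset carries the annotation, the other does not
ms_annot <- matrixset(msr = animals, row_info = animal_info,
                      row_key = "Animal")
ms_plain <- matrixset(log_msr = log(animals))

# .y is a matrixset: its row annotation tibble is joined on the row tag
ms_joined <- ms_plain %>% join_row_info(ms_annot)

row_info(ms_joined)
```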
If you are familiar with `dplyr::mutate()`, then you know almost everything you need to know about using `annotate_row()` and `annotate_column()`.
```r
ms <- matrixset(msr = animals, log_msr = log_animals) %>%
  join_row_info(animal_info, by = c(".rowname" = "Animal")) %>%
  annotate_column(unit = case_when(.colname == "body" ~ "kg",
                                   TRUE ~ "g")) %>%
  annotate_column(us_unit = case_when(unit == "kg" ~ "lb",
                                      TRUE ~ "oz"))
column_info(ms)
#> # A tibble: 2 × 3
#>   .colname unit  us_unit
#>   <chr>    <chr> <chr>
#> 1 body     kg    lb
#> 2 brain    g     oz
```

You can decide that you don't need two unit systems and keep only one.
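One way to drop a unit system, sketched under the assumption that `annotate_column()` follows `dplyr::mutate()` in removing a trait when it is set to `NULL`. The objects `ms_units` and `ms_one` are illustrative names, built here from scratch so the snippet runs on its own.

```r
library(matrixset)
library(dplyr)

animals <- as.matrix(MASS::Animals)

# Rebuild the two unit annotations on a fresh object
ms_units <- matrixset(msr = animals) %>%
  annotate_column(unit = case_when(.colname == "body" ~ "kg",
                                   TRUE ~ "g")) %>%
  annotate_column(us_unit = case_when(unit == "kg" ~ "lb",
                                      TRUE ~ "oz"))

# Assumption: as with dplyr::mutate(), assigning NULL removes the trait
ms_one <- ms_units %>% annotate_column(us_unit = NULL)

column_info(ms_one)
```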
Applying functions to a `matrixset`'s matrices is covered in the Applying Functions vignette.
The idea here is the same, but with the added benefit that the function result is stored directly as an annotation for the `matrixset` object.
```r
ms %>%
  annotate_row_from_apply(msr, ratio_brain_body = ~ .i[2] / (10 * .i[1])) %>%
  row_info()
#> # A tibble: 28 × 4
#>    .rowname        is_extinct class        ratio_brain_body
#>    <chr>           <lgl>      <chr>                   <dbl>
#>  1 Mountain beaver FALSE      Rodent               0.6
#>  2 Cow             FALSE      Ruminant             0.0910
#>  3 Grey wolf       FALSE      Canine               0.329
#>  4 Goat            FALSE      Ruminant             0.416
#>  5 Guinea pig      FALSE      Rodent               0.529
#>  6 Dipliodocus     TRUE       Dinosaurs            0.000427
#>  7 Asian elephant  FALSE      Elephantidae         0.181
#>  8 Donkey          FALSE      Equidae              0.224
#>  9 Horse           FALSE      Equidae              0.126
#> 10 Potar monkey    FALSE      Primate              1.15
#> # ℹ 18 more rows
```

When groups are registered, results are spread using `tidyr::pivot_wider()`.
```r
ms %>%
  row_group_by(class) %>%
  annotate_column_from_apply(msr, mean) %>%
  column_info()
#> # A tibble: 2 × 13
#>   .colname unit  Canine Dinosaurs Elephantidae Equidae Feline Macropodidae
#>   <chr>    <chr>  <dbl>     <dbl>        <dbl>   <dbl>  <dbl>        <dbl>
#> 1 body     kg      36.3   36033.         4600.    354.   51.6           35
#> 2 brain    g      120.       91.5        5158.    537    91.3           56
#> # ℹ 5 more variables: Primate <dbl>, Rodent <dbl>, Ruminant <dbl>, Sus <dbl>,
#> #   Talpidae <dbl>
```