How to annotate a matrixset

Annotate at object creation

We will use the same example as in the introduction vignette, the Animals object.

library(tidyverse)
library(matrixset)

animals <- as.matrix(MASS::Animals)
log_animals <- log(animals)

# Row annotations: one row per animal, keyed by the "Animal" column
animal_info <- MASS::Animals %>%
  rownames_to_column("Animal") %>%
  mutate(
    is_extinct = case_when(
      Animal %in% c("Dipliodocus", "Triceratops", "Brachiosaurus") ~ TRUE,
      TRUE ~ FALSE
    ),
    class = case_when(
      Animal %in% c("Mountain beaver", "Guinea pig", "Golden hamster", "Mouse", "Rabbit", "Rat") ~ "Rodent",
      Animal %in% c("Potar monkey", "Gorilla", "Human", "Rhesus monkey", "Chimpanzee") ~ "Primate",
      Animal %in% c("Cow", "Goat", "Giraffe", "Sheep") ~ "Ruminant",
      Animal %in% c("Asian elephant", "African elephant") ~ "Elephantidae",
      Animal %in% c("Grey wolf") ~ "Canine",
      Animal %in% c("Cat", "Jaguar") ~ "Feline",
      Animal %in% c("Donkey", "Horse") ~ "Equidae",
      Animal == "Pig" ~ "Sus",
      Animal == "Mole" ~ "Talpidae",
      Animal == "Kangaroo" ~ "Macropodidae",
      TRUE ~ "Dinosaurs"
    )
  ) %>%
  select(-body, -brain)

Annotations are internally stored as [tibble::tibble()] objects and can be viewed as simple databases. As such, a key is needed to uniquely identify the rows or the columns. This key is the rownames for row annotation and the colnames for column annotation.

This key is called the tag and, unless specified otherwise at the matrixset creation, is stored as .rowname/.colname. This special tag can be used almost like any other annotation trait - see Applying Functions.

When using an external data.frame to create new annotations, the data frame must contain this key in a single column - it doesn’t have to be called .rowname/.colname.

Moreover, the key values must correspond to rownames/colnames. Values that do not match will simply be left out.
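To see in advance which key values would be left out, the keys can be compared with the row names using base R. This is only a sketch on made-up toy data (m, info and the id column are hypothetical names), not something matrixset requires:

```r
# Toy matrix with three named rows (hypothetical data, for illustration only)
m <- matrix(1:6, nrow = 3, dimnames = list(c("a", "b", "c"), c("x", "y")))

# Annotation data frame whose key column is called "id"
info <- data.frame(id = c("a", "c", "d"), group = c("g1", "g2", "g3"))

# Key values with no matching row name: these annotations would be left out
unmatched_keys <- setdiff(info$id, rownames(m))

# Row names for which the data frame provides no annotation
unannotated_rows <- setdiff(rownames(m), info$id)
```

Here the key "d" matches no row of m, and row "b" has no entry in info.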

To use the annotation at creation, use a command like this:

ms <- matrixset(msr = animals, log_msr = log_animals, row_info = animal_info, row_key = "Animal")
ms
#> matrixset of 2 28 × 2 matrices
#>
#> matrix_set: msr
#> A 28 × 2 <dbl> matrix
#>                   body  brain
#> Mountain beaver   1.35   8.10
#>             ...    ...    ...
#>             Pig 192.00 180.00
#>
#> matrix_set: log_msr
#> A 28 × 2 <dbl> matrix
#>                 body brain
#> Mountain beaver 0.30  2.09
#>             ...  ...   ...
#>             Pig 5.26  5.19
#>
#>
#> row_info:
#> # A tibble: 28 × 3
#>    .rowname        is_extinct class
#>    <chr>           <lgl>      <chr>
#>  1 Mountain beaver FALSE      Rodent
#>  2 Cow             FALSE      Ruminant
#>  3 Grey wolf       FALSE      Canine
#>  4 Goat            FALSE      Ruminant
#>  5 Guinea pig      FALSE      Rodent
#>  6 Dipliodocus     TRUE       Dinosaurs
#>  7 Asian elephant  FALSE      Elephantidae
#>  8 Donkey          FALSE      Equidae
#>  9 Horse           FALSE      Equidae
#> 10 Potar monkey    FALSE      Primate
#> # ℹ 18 more rows
#>
#>
#> column_info:
#> # A tibble: 2 × 1
#>   .colname
#>   <chr>
#> 1 body
#> 2 brain

Notice how we used the row_key argument to specify how to link the two objects together.

Replacing an annotation tibble

The internal tibble can be replaced by a new one. This can be a convenient way to add annotations to an existing matrixset object where none were registered.

To do so, you can simply do

row_info(ms) <- animal_info %>% rename(.rowname = Animal)

For the operation to work, a column called .rowname (or more generally, what is returned by row_tag()) must be part of the data frame.

The column equivalents are column_info and column_tag.
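As a minimal sketch of the column equivalent, assuming the default column tag .colname (the col_annot data frame and its unit column are made up for illustration):

```r
library(matrixset)

animals <- as.matrix(MASS::Animals)
ms_cols <- matrixset(msr = animals)

# The replacement data frame must carry the column tag as one of its columns
col_annot <- data.frame(.colname = c("body", "brain"), unit = c("kg", "g"))

# Replace the column annotation tibble, mirroring row_info(ms) <- ...
column_info(ms_cols) <- col_annot

column_tag(ms_cols)   # name of the key column expected in the data frame
column_info(ms_cols)
```

If the matrixset was created with a different column tag, the key column of the data frame would need that name instead.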

Annotation tibble replacement works even if annotations were registered. Be aware of two things:

Appending data frame values to the annotation tibble

This is equivalent to performing a mutating join (default: [dplyr::left_join()], though all mutating joins - except cross-joins - are available via the type argument) between the matrixset (.ms) object’s annotation tibble and a data.frame (.y).

The by argument will determine how to join the two data.frames together, so it is not necessary for y to have a .rowname/.colname column. But when the by argument is not provided, a natural join is performed.

One behavior that differs from a true mutating join is that when a row from .ms matches more than one row in .y, no row duplication will be performed. Instead, a condition error will be issued. This is to preserve the matrixset property that all row names (and column names) must be unique.

matrixset(msr = animals, log_msr = log_animals) %>%
  join_row_info(animal_info, by = c(".rowname" = "Animal"))
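Because a key that matches several rows of the data frame leads to an error, it can help to look for duplicated key values before joining. The following is a base R sketch on made-up data (info and its id column are hypothetical), not a matrixset function:

```r
# Hypothetical annotation data frame in which the key "b" appears twice
info <- data.frame(id = c("a", "b", "b"), group = c("g1", "g2", "g3"))

# Key values that occur more than once; each would match a row several times
dup_keys <- unique(info$id[duplicated(info$id)])

# One possible fix among others: keep the first occurrence of each key
info_unique <- info[!duplicated(info$id), ]
```

Which duplicate to keep is a data decision; dropping all but the first is only one option.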

The data frame can be taken from a second matrixset object

Indeed! When using join_row_info()/join_column_info(), .y can be a matrixset object, in which case the appropriate annotation tibble will be used.

The only difference is when using the default by = NULL argument. In that case the row/column tag of each object is used.
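A minimal sketch of this use, assuming the default tags and the default by = NULL. The objects annotated and bare, and the heavy annotation, are made up for illustration:

```r
library(matrixset)

animals <- as.matrix(MASS::Animals)

# First object: carries a (hypothetical) row annotation called "heavy"
heavy_info <- data.frame(
  Animal = rownames(animals),
  heavy  = unname(animals[, "body"] > 100)
)
annotated <- matrixset(msr = animals, row_info = heavy_info, row_key = "Animal")

# Second object: same rows, no annotation registered
bare <- matrixset(log_msr = log(animals))

# With by left at its default, the row tag of each object is the join key
joined <- join_row_info(bare, annotated)
row_info(joined)
```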

Creating new annotations from existing ones (and modify/delete)

If you are familiar with dplyr::mutate(), then you know almost everything you need to know about using annotate_row() and annotate_column().

ms <- matrixset(msr = animals, log_msr = log_animals) %>%
  join_row_info(animal_info, by = c(".rowname" = "Animal")) %>%
  annotate_column(unit = case_when(.colname == "body" ~ "kg", TRUE ~ "g")) %>%
  annotate_column(us_unit = case_when(unit == "kg" ~ "lb", TRUE ~ "oz"))
column_info(ms)
#> # A tibble: 2 × 3
#>   .colname unit  us_unit
#>   <chr>    <chr> <chr>
#> 1 body     kg    lb
#> 2 brain    g     oz

You can decide that you don’t need two unit systems and keep only one:

ms <- ms %>% annotate_column(us_unit = NULL)
column_info(ms)
#> # A tibble: 2 × 2
#>   .colname unit
#>   <chr>    <chr>
#> 1 body     kg
#> 2 brain    g
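The row version works the same way. As a sketch, assuming the default row tag .rowname, the is_extinct annotation could be derived directly from the tag (ms_rows is a hypothetical object name):

```r
library(matrixset)

animals <- as.matrix(MASS::Animals)
ms_rows <- matrixset(msr = animals)

# Create a row annotation from the row tag, in the style of dplyr::mutate()
ms_rows <- annotate_row(
  ms_rows,
  is_extinct = .rowname %in% c("Dipliodocus", "Triceratops", "Brachiosaurus")
)

# Number of rows flagged as extinct
n_extinct <- sum(row_info(ms_rows)$is_extinct)
```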

Creating new annotations from applying function(s) to an object’s matrix

Applying functions to a matrixset’s matrices is covered in the Applying Functions vignette.

The idea here is the same, but with the added benefit that the function result is stored directly as annotation for the matrixset object.

ms %>%
  annotate_row_from_apply(msr, ratio_brain_body = ~ .i[2] / (10 * .i[1])) %>%
  row_info()
#> # A tibble: 28 × 4
#>    .rowname        is_extinct class        ratio_brain_body
#>    <chr>           <lgl>      <chr>                   <dbl>
#>  1 Mountain beaver FALSE      Rodent               0.6
#>  2 Cow             FALSE      Ruminant             0.0910
#>  3 Grey wolf       FALSE      Canine               0.329
#>  4 Goat            FALSE      Ruminant             0.416
#>  5 Guinea pig      FALSE      Rodent               0.529
#>  6 Dipliodocus     TRUE       Dinosaurs            0.000427
#>  7 Asian elephant  FALSE      Elephantidae         0.181
#>  8 Donkey          FALSE      Equidae              0.224
#>  9 Horse           FALSE      Equidae              0.126
#> 10 Potar monkey    FALSE      Primate              1.15
#> # ℹ 18 more rows

When groups are registered, results are spread using tidyr::pivot_wider().

ms %>%
  row_group_by(class) %>%
  annotate_column_from_apply(msr, mean) %>%
  column_info()
#> # A tibble: 2 × 13
#>   .colname unit  Canine Dinosaurs Elephantidae Equidae Feline Macropodidae
#>   <chr>    <chr>  <dbl>     <dbl>        <dbl>   <dbl>  <dbl>        <dbl>
#> 1 body     kg      36.3   36033.         4600.    354.   51.6           35
#> 2 brain    g      120.       91.5        5158.    537    91.3           56
#> # ℹ 5 more variables: Primate <dbl>, Rodent <dbl>, Ruminant <dbl>, Sus <dbl>,
#> #   Talpidae <dbl>
