Thepedmut package is part of thepedsuiteecosystem for pedigree analysis in R. Its aim is to provide a frameworkfor modelling mutations in pedigree computations.
Althoughpedmut is self-contained, its main purposeis to be imported by otherpedsuite packages, likepedprobr (markerprobabilities and pedigree likelihoods),forrel (forensic pedigreeanalysis) anddvir.
For the theoretical background of mutation models and theirproperties (stationarity, reversibility, lumpability), I recommendChapter 5 ofPedigreeanalysis in R, and the references therein.
# The easiest way to get `pedmut` is to install the entire `pedsuite`:install.packages("pedsuite")# Alternatively, you can install just `pedmut`:install.packages("pedmut")# If you need the latest development version, install it from GitHub:# install.packages("devtools")devtools::install_github("magnusdv/pedmut")The examples below require the packagespedtools andpedprobr in addition topedmut. Thefirst two are core members of the pedsuite and can be loadedcollectively withlibrary(pedsuite).
library(pedsuite)library(pedmut)The figure below shows a father and son who are homozygous fordifferent alleles. We assume that the locus is an autosomal marker withtwo alleles, labelled 1 and 2.
# Create pedigreex=nuclearPed(father ="fa",mother ="mo",child ="boy")# Add markerx=addMarker(x,fa ="1/1",boy ="2/2")# Plot with genotypesplot(x,marker =1)
The data clearly constitutes aMendelian error, and gives alikelihood of 0 without mutation modelling:
likelihood(x)#> [1] 0The following code sets a simple mutation model and recomputes thepedigree likelihood.
x2=setMutmod(x,model ="equal",rate =0.1)likelihood(x2)#> [1] 0.0125Under the mutation model, the combination of genotypes is no longerimpossible, yielding a non-zero likelihood. To see details about themutation model, we can use themutmod() accessor:
mutmod(x2,marker =1)#> Unisex mutation matrix:#> 1 2#> 1 0.9 0.1#> 2 0.1 0.9#>#> Model: Equal#> Rate: 0.1#> Frequencies: 0.5, 0.5#>#> Bounded: Yes#> Stationary: Yes#> Reversible: Yes#> Lumpable: AlwaysA mutation matrix inpedmut is a stochastic matrix,with each row summing to 1, where the rows and columns are named withallele labels.
Two central functions of package aremutationMatrix()andmutationModel(). The first constructs a single mutationmatrix according to various model specifications. The second produceswhat is typically required in applications, namely a list oftwo mutation matrices, named “male” and “female”.
The mutation models currently implemented inpedmutare:
equal: All mutations equally likely; probability1-rate of no mutation. Parameters:rate.
proportional: Mutation probabilities areproportional to the target allele frequencies. Parameters:rate,afreq.
onestep: Applicable if all alleles are integers.Mutations are allowed only to the nearest integer neighbour. Parameters:rate.
stepwise: For this model alleles must be integers orsingle-decimal microvariants (e.g. 17.1). Mutation rates depend on group(integer vs microvariant), withrate for same-group andrate2 for between-group mutations. Mutations also depend onstep size; therange parameter gives the relativeprobability of mutating n+1 steps versus n steps. Parameters:rate,rate2,range.
dawid: A reversible stepwise mutation model,following the approach of Dawid et al. (2002). Parameters:rate,range.
random: Generates a random mutation matrix,optionally conditioned on a fixed overall mutation rate. Parameters:rate,seed (both optional).
trivial: Diagonal mutation matrix with 1 on thediagonal. Parameters: None.
custom: Any valid mutation matrix provided by theuser. Parameters:matrix.
Several properties of mutation models are of interest (boththeoretical and practical) for likelihood computations. Thepedmut package provides utility functions for quicklychecking these:
isBounded(M, afreq): Checks ifM isbounded by the allele frequencies, meaning that the probability ofmutating into an allele never exceeds the population frequency of thatallele. Unbounded models may give counter-intuitive results, like LR> 1 in a paternity case where the alleged father and child have noalleles in common.
isStationary(M, afreq): Checks ifafreqis a right eigenvector of the mutation matrixM. Stationarymodels have the desirable property that allele frequencies don’t changeacross generations.
isReversible(M, afreq): Checks ifMtogether withafreq form areversible Markovchain, i.e., that they satisfy thedetailedbalance criterion.
isLumpable(M, lump): Checks ifM allowsclustering (“lumping”) of a given subset of alleles. This implements thenecessary and sufficient condition ofstrong lumpability ofKemeny and Snell (Finite Markov Chains, 1976).
alwaysLumpable(M): Checks ifM allowslumping ofany allele subset.
Anequal model with rate 0.1:
mutationMatrix("equal",rate =0.1,alleles =c("a","b","c"))#> a b c#> a 0.90 0.05 0.05#> b 0.05 0.90 0.05#> c 0.05 0.05 0.90#>#> Model: Equal#> Rate: 0.1#>#> Lumpable: AlwaysNext, aproportional model with rate 0.1. Note that thismodel depends on the allele frequencies.
mutationMatrix("prop",rate =0.1,alleles =c("a","b","c"),afreq =c(0.7,0.2,0.1))#> a b c#> a 0.93478261 0.04347826 0.02173913#> b 0.15217391 0.82608696 0.02173913#> c 0.15217391 0.04347826 0.80434783#>#> Model: Proportional#> Rate: 0.1#> Frequencies: 0.7, 0.2, 0.1#>#> Bounded: Yes#> Stationary: Yes#> Reversible: Yes#> Lumpable: AlwaysTo illustrate thestepwise model, we recreate themutation matrix in Section 2.1.3 of Simonsson and Mostad (FSI:Genetics,2015). This is done as follows:
mutationMatrix(model ="stepwise",alleles =c("16","17","18","16.1","17.1"),rate =0.003,rate2 =0.001,range =0.5)#> 16 17 18 16.1 17.1#> 16 0.9960000000 0.0020000000 0.0010000000 0.0005000000 0.0005000000#> 17 0.0015000000 0.9960000000 0.0015000000 0.0005000000 0.0005000000#> 18 0.0010000000 0.0020000000 0.9960000000 0.0005000000 0.0005000000#> 16.1 0.0003333333 0.0003333333 0.0003333333 0.9960000000 0.0030000000#> 17.1 0.0003333333 0.0003333333 0.0003333333 0.0030000000 0.9960000000#>#> Model: Stepwise#> Rate: 0.003#>#> Lumpable: Not alwaysA simpler version of thestepwise model above, is theonestep model, in which only the immediate neighbouringintegers are reachable by mutation. This model is only applicable whenall alleles are integers.
mutationMatrix(model ="onestep",alleles =c("16","17","18"),rate =0.04)#> 16 17 18#> 16 0.96 0.04 0.00#> 17 0.02 0.96 0.02#> 18 0.00 0.04 0.96#>#> Model: Onestep#> Rate: 0.04#>#> Lumpable: Not always