bpp-phyl  2.1.0
 All Classes Namespaces Files Functions Variables Friends Pages
bpp::SubstitutionModel Class Referenceabstract

Interface for all substitution models. More...

#include <Bpp/Phyl/Model/SubstitutionModel.h>

+ Inheritance diagram for bpp::SubstitutionModel:
+ Collaboration diagram for bpp::SubstitutionModel:

Public Member Functions

 SubstitutionModel ()
 
virtual ~SubstitutionModel ()
 
SubstitutionModelclone () const =0
 
virtual std::string getName () const =0
 Get the name of the model. More...
 
virtual const std::vector< int > & getAlphabetChars () const =0
 
virtual int getAlphabetChar (size_t i) const =0
 Get the char in the alphabet corresponding to a given state in the model. More...
 
virtual std::vector< size_t > getModelStates (int i) const =0
 Get the state in the model corresponding to a particular char in the alphabet. More...
 
virtual double freq (size_t i) const =0
 
virtual double Qij (size_t i, size_t j) const =0
 
virtual double Pij_t (size_t i, size_t j, double t) const =0
 
virtual double dPij_dt (size_t i, size_t j, double t) const =0
 
virtual double d2Pij_dt2 (size_t i, size_t j, double t) const =0
 
virtual const VdoublegetFrequencies () const =0
 
virtual const Matrix< double > & getGenerator () const =0
 
virtual const Matrix< double > & getExchangeabilityMatrix () const =0
 
virtual double Sij (size_t i, size_t j) const =0
 
virtual const Matrix< double > & getPij_t (double t) const =0
 
virtual const Matrix< double > & getdPij_dt (double t) const =0
 
virtual const Matrix< double > & getd2Pij_dt2 (double t) const =0
 
virtual void enableEigenDecomposition (bool yn)=0
 Set if eigenValues and Vectors must be computed. More...
 
virtual bool enableEigenDecomposition ()=0
 Tell if eigenValues and Vectors must be computed. More...
 
virtual const VdoublegetEigenValues () const =0
 
virtual const VdoublegetIEigenValues () const =0
 
virtual bool isDiagonalizable () const =0
 
virtual bool isNonSingular () const =0
 
virtual const Matrix< double > & getRowLeftEigenVectors () const =0
 
virtual const Matrix< double > & getColumnRightEigenVectors () const =0
 
virtual const AlphabetgetAlphabet () const =0
 
virtual size_t getNumberOfStates () const =0
 Get the number of states. More...
 
virtual double getInitValue (size_t i, int state) const =0 throw (IndexOutOfBoundsException, BadIntException)
 
virtual double getScale () const =0
 Get the scalar product of diagonal elements of the generator and the frequencies vector. If the generator is normalized, then scale=1. Otherwise each element must be multiplied by 1/scale. More...
 
virtual void setScale (double scale)=0
 Set the rate of the generator, defined as the scalar product of diagonal elements of the generator and the frequencies vector. More...
 
virtual double getRate () const =0
 Get the rate. More...
 
virtual void setRate (double rate)=0
 Set the rate of the model (must be positive). More...
 
virtual void addRateParameter ()=0
 
virtual void setFreqFromData (const SequenceContainer &data, double pseudoCount=0)=0
 Set equilibrium frequencies equal to the frequencies estimated from the data. More...
 
virtual void setFreq (std::map< int, double > &frequencies)
 Set equilibrium frequencies. More...
 
virtual const FrequenciesSetgetFrequenciesSet () const
 If the model owns a FrequenciesSet, returns a pointer to it, otherwise return 0. More...
 
virtual size_t getNumberOfIndependentParameters () const =0
 
virtual void aliasParameters (const std::string &p1, const std::string &p2)=0
 
virtual void unaliasParameters (const std::string &p1, const std::string &p2)=0
 
virtual const ParameterListgetIndependentParameters () const =0
 
virtual std::vector< std::string > getAlias (const std::string &name) const =0
 

Detailed Description

Interface for all substitution models.

A substitution model is based on a Markov generator $Q$, the size of which depends on the alphabet used (4 for nucleotides, 20 for proteins, etc.). Each SubstitutionModel object hence includes a pointer toward an alphabet, and provides a method to retrieve the alphabet used (getAlphabet() method).

What we want from a substitution model is to compute the probabilities of state j at time t geven state j at time 0 ( $P_{i,j}(t)$). Typically, this is computed using the formula

\[ P(t) = e^{r \times t \times Q}, \]

where $P(t)$ is the matrix with all probabilities $P_{i,j}(t)$, and $ r $ the rate. For some models, the $P_{i,j}(t)$'s can be computed analytically.

For more complex models, we need to use a eigen-decomposition of $Q$:

\[ Q = U^{-1} . D . U, \]

where $D = diag(\lambda_i)$ is a diagonal matrix. Hence

\[ P(t) = e^{r \times t \times Q} = U^{-1} . e^{r \times D \times t} . U, \]

where $e^{r \times D \times t} = diag\left(e^{r \times \lambda_i \times t}\right)$ is a diagonal matrix with all terms equal to exp the terms in $D$ multiplied per $ r \times t $. $U$ is the matrix of left eigen vectors (by row), and $U^{-1}$ is the matrix of right eigen vectors (by column). The values in $D$ are the eigen values of $Q$. All $Q,U,U^{-1}$ and $D$ (its diagonal) may be retrieved from the class (getEigenValues(), getRowRightEigenVectors() and getColumnLeftEigenVectors() functions).

First and second order derivatives of $P(t)$ with respect to $t$ can also be retrieved. These methods may be useful for optimization processes. Derivatives may be computed analytically, or using the general formulas:

\[ \frac{\partial P(t)}{\partial t} = U^{-1} . diag\left(r \times \lambda_i \times e^{r \times \lambda_i \times t}\right) . U \]

and

\[ \frac{\partial^2 P(t)}{\partial t^2} = U^{-1} . diag\left(r^2 \times \lambda_i^2 \times e^{r \times \lambda_i \times t}\right) . U \]

If Q is not symmetric, then the eigenvalue matrix D is block diagonal with the real eigenvalues in 1-by-1 blocks and any complex eigenvalues, a + i*b, in 2-by-2 blocks, [a, b; -b, a]. That is, if the complex eigenvalues look like

          a + ib   .        .    .
          .        a - ib   .    .
          .        .        x    .
          .        .        .    y

then D looks like

          a          b      .    .
         -b          a      .    .
          .          .      x    .
          .          .      .    y

and exp(tD) equals

          exp(ta)cos(tb)   exp(ta)sin(tb)  .        .
         -exp(ta)sin(tb)   exp(ta)cos(tb)  .        . 
          .                .               exp(tx)  .
          .                .               .        exp(ty)

If U is singular, it cannot be inverted. In this case exp(tQ) is approximated using Taylor decomposition:

\[ P(t) = Id + tQ + \frac{(tQ)^2}{2!} + ... + \frac{(tQ)^n}{n!} + ... \]

To prevent approximation issues, if \f$ max(tQ) \f$ is too high (currently above 0.5), \f$ t \f$ is divided in an ad hoc way (e.g. by \f$ N \f$), and we compute \f$ P(t) = (P(t/N))^N \f$ with a Taylor decomposition for \f$ P(t/N) \f$.

In this case, derivatives according to \f$ t \f$ are computed analytically too.

Definition at line 195 of file SubstitutionModel.h.

Constructor & Destructor Documentation

bpp::SubstitutionModel::SubstitutionModel ( )
inline

Definition at line 199 of file SubstitutionModel.h.

virtual bpp::SubstitutionModel::~SubstitutionModel ( )
inlinevirtual

Definition at line 200 of file SubstitutionModel.h.

Member Function Documentation

virtual void bpp::SubstitutionModel::addRateParameter ( )
pure virtual
SubstitutionModel* bpp::SubstitutionModel::clone ( ) const
pure virtual

Implements bpp::Clonable.

Implemented in bpp::ReversibleSubstitutionModel, bpp::AbstractReversibleSubstitutionModel, bpp::YpR_Gen, bpp::AbstractSubstitutionModel, bpp::YpR_Sym, bpp::HKY85, bpp::F84, bpp::JCprot, bpp::MarkovModulatedSubstitutionModel, bpp::T92, bpp::K80, bpp::GTR, bpp::RN95, bpp::CodonDistanceFrequenciesSubstitutionModel, bpp::MixtureOfSubstitutionModels, bpp::TN93, bpp::RE08, bpp::JCnuc, bpp::AbstractCodonSubstitutionModel, bpp::CodonDistancePhaseFrequenciesSubstitutionModel, bpp::CodonDistanceFitnessPhaseFrequenciesSubstitutionModel, bpp::RN95s, bpp::BinarySubstitutionModel, bpp::CodonDistanceSubstitutionModel, bpp::UserProteinSubstitutionModel, bpp::MixtureOfASubstitutionModel, bpp::WAG01, bpp::SSR, bpp::L95, bpp::AbstractMixedSubstitutionModel, bpp::DSO78, bpp::JTT92, bpp::LLG08_EX3, bpp::G2001, bpp::LG08, bpp::LLG08_EHO, bpp::LLG08_UL3, bpp::CodonRateFrequenciesSubstitutionModel, bpp::LGL08_CAT, bpp::LLG08_EX2, bpp::LLG08_UL2, bpp::TS98, bpp::gBGC, bpp::GY94, bpp::WordSubstitutionModel, bpp::TripletSubstitutionModel, bpp::YN98, bpp::MG94, bpp::YNGKP_M1, bpp::CodonRateSubstitutionModel, bpp::YNGKP_M8, bpp::Coala, bpp::YNGKP_M7, bpp::YNGKP_M2, bpp::YNGKP_M3, bpp::LLG08_EX3::EmbeddedModel, bpp::LLG08_EHO::EmbeddedModel, bpp::LLG08_UL3::EmbeddedModel, bpp::LGL08_CAT::EmbeddedModel, bpp::LLG08_EX2::EmbeddedModel, bpp::LLG08_UL2::EmbeddedModel, bpp::AbstractBiblioSubstitutionModel, bpp::AbstractBiblioMixedSubstitutionModel, bpp::MixedSubstitutionModel, bpp::CodonSubstitutionModel, bpp::NucleotideSubstitutionModel, and bpp::ProteinSubstitutionModel.

Referenced by bpp::NonHomogeneousSequenceSimulator::NonHomogeneousSequenceSimulator().

virtual double bpp::SubstitutionModel::d2Pij_dt2 ( size_t  i,
size_t  j,
double  t 
) const
pure virtual
Returns
The second order derivative of the probability of change from state i to state j with respect to time t, at time t.
See Also
getd2Pij_dt2(), getStates()

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, bpp::RE08, bpp::JCnuc, bpp::BinarySubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::AbstractBiblioSubstitutionModel::d2Pij_dt2(), and bpp::RE08::d2Pij_dt2().

virtual double bpp::SubstitutionModel::dPij_dt ( size_t  i,
size_t  j,
double  t 
) const
pure virtual
Returns
The first order derivative of the probability of change from state i to state j with respect to time t, at time t.
See Also
getdPij_dt(), getStates()

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, bpp::RE08, bpp::JCnuc, bpp::BinarySubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::RE08::d2Pij_dt2(), bpp::AbstractBiblioSubstitutionModel::dPij_dt(), and bpp::RE08::dPij_dt().

virtual bool bpp::SubstitutionModel::enableEigenDecomposition ( )
pure virtual

Tell if eigenValues and Vectors must be computed.

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

virtual int bpp::SubstitutionModel::getAlphabetChar ( size_t  i) const
pure virtual

Get the char in the alphabet corresponding to a given state in the model.

In most cases, this method will return i.

Parameters
iThe index of the state.
Returns
The corresponding state in the alphabet.
See Also
MarkovModulatedSubstitutionModel
getStates()

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::AbstractBiblioSubstitutionModel::getAlphabetChar(), bpp::MarginalAncestralStateReconstruction::getAncestralSequenceForNode(), bpp::MarginalAncestralStateReconstruction::getAncestralSequenceForNode(), and bpp::NonHomogeneousSequenceSimulator::multipleEvolve().

virtual const std::vector<int>& bpp::SubstitutionModel::getAlphabetChars ( ) const
pure virtual
virtual const Matrix<double>& bpp::SubstitutionModel::getColumnRightEigenVectors ( ) const
pure virtual
virtual const Vdouble& bpp::SubstitutionModel::getEigenValues ( ) const
pure virtual
virtual const Matrix<double>& bpp::SubstitutionModel::getExchangeabilityMatrix ( ) const
pure virtual
Returns
The matrix of exchangeability terms. It is recommended that exchangeability matrix be normalized so that the normalized generator be obtained directly by the dot product $S . \pi$.

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::AbstractBiblioSubstitutionModel::getExchangeabilityMatrix(), bpp::RE08::updateMatrices(), and bpp::MarkovModulatedSubstitutionModel::updateMatrices().

virtual const Matrix<double>& bpp::SubstitutionModel::getGenerator ( ) const
pure virtual
Returns
The normalized Markov generator matrix, i.e. all normalized rates of changes from state i to state j. The generator is normalized so that (i) $ \forall i; \sum_j Q_{i,j} = 0 $, meaning that $ $ \forall i; Q_{i,i} = -\sum_{j \neq i}Q_{i,j}$, and (ii) $ \sum_i Q_{i,i} \times \pi_i = -1$. This means that, under normalization, the mean rate of replacement at equilibrium is 1 and that time $t$ are measured in units of expected number of changes per site. Additionnaly, the rate_ attibute provides the possibility to increase or decrease this mean rate.

See Kosiol and Goldman (2005), Molecular Biology And Evolution 22(2) 193-9.

See Also
Qij()

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::LaplaceSubstitutionCount::computeCounts(), bpp::UniformizationSubstitutionCount::computeCounts_(), bpp::AbstractBiblioSubstitutionModel::getGenerator(), bpp::SimpleMutationProcess::SimpleMutationProcess(), bpp::RE08::updateMatrices(), and bpp::MarkovModulatedSubstitutionModel::updateMatrices().

virtual const Vdouble& bpp::SubstitutionModel::getIEigenValues ( ) const
pure virtual
Returns
A vector with all imaginary parts of the eigen values of the generator of this model;

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::AbstractBiblioSubstitutionModel::getIEigenValues().

virtual double bpp::SubstitutionModel::getInitValue ( size_t  i,
int  state 
) const throw (IndexOutOfBoundsException, BadIntException)
pure virtual

This method is used to initialize likelihoods in reccursions. It typically sends 1 if i = state, 0 otherwise, where i is one of the possible states of the alphabet allowed in the model and state is the observed state in the considered sequence/site.

Parameters
ithe index of the state in the model.
stateAn observed state in the sequence/site.
Returns
1 or 0 depending if the two states are compatible.
Exceptions
IndexOutOfBoundsExceptionif array position is out of range.
BadIntExceptionif states are not allowed in the associated alphabet.
See Also
getStates();

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, bpp::RE08, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::AbstractBiblioSubstitutionModel::getInitValue().

virtual std::vector<size_t> bpp::SubstitutionModel::getModelStates ( int  i) const
pure virtual

Get the state in the model corresponding to a particular char in the alphabet.

Parameters
iThe alphabet char to check.
Returns
A vector of indices of model states.

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::AbstractBiblioSubstitutionModel::getModelStates(), bpp::MarkovModulatedSubstitutionModel::getModelStates(), and bpp::MarginalAncestralStateReconstruction::recursiveMarginalAncestralStates().

virtual std::string bpp::SubstitutionModel::getName ( ) const
pure virtual

Get the name of the model.

Returns
The name of this model.

Implemented in bpp::YpR_Gen, bpp::YpR_Sym, bpp::F84, bpp::HKY85, bpp::JCprot, bpp::T92, bpp::K80, bpp::RN95, bpp::CodonDistanceFrequenciesSubstitutionModel, bpp::GTR, bpp::TN93, bpp::RE08, bpp::MixtureOfSubstitutionModels, bpp::JCnuc, bpp::CodonDistancePhaseFrequenciesSubstitutionModel, bpp::CodonDistanceFitnessPhaseFrequenciesSubstitutionModel, bpp::RN95s, bpp::BinarySubstitutionModel, bpp::CodonDistanceSubstitutionModel, bpp::UserProteinSubstitutionModel, bpp::MixtureOfASubstitutionModel, bpp::LLG08_EX3, bpp::WordSubstitutionModel, bpp::LLG08_EHO, bpp::LLG08_UL3, bpp::WAG01, bpp::SSR, bpp::L95, bpp::LGL08_CAT, bpp::LLG08_EX2, bpp::LLG08_UL2, bpp::DSO78, bpp::JTT92, bpp::gBGC, bpp::CodonRateFrequenciesSubstitutionModel, bpp::G2001, bpp::LG08, bpp::YNGKP_M1, bpp::YNGKP_M8, bpp::TS98, bpp::YNGKP_M7, bpp::GY94, bpp::YNGKP_M2, bpp::YNGKP_M3, bpp::TripletSubstitutionModel, bpp::YN98, bpp::CodonRateSubstitutionModel, bpp::MG94, bpp::Coala, bpp::LLG08_EX3::EmbeddedModel, bpp::LLG08_EHO::EmbeddedModel, bpp::LLG08_UL3::EmbeddedModel, bpp::LGL08_CAT::EmbeddedModel, bpp::LLG08_EX2::EmbeddedModel, and bpp::LLG08_UL2::EmbeddedModel.

Referenced by bpp::AbstractCodonFitnessSubstitutionModel::AbstractCodonFitnessSubstitutionModel(), bpp::AbstractSubstitutionModel::fireParameterChanged(), bpp::gBGC::getName(), bpp::MixtureOfSubstitutionModels::getSubmodelNumbers(), bpp::BppOSubstitutionModelFormat::readMixed_(), bpp::AbstractWordSubstitutionModel::updateMatrices(), bpp::AbstractBiblioSubstitutionModel::updateMatrices(), and bpp::BppOSubstitutionModelFormat::write().

virtual const Matrix<double>& bpp::SubstitutionModel::getRowLeftEigenVectors ( ) const
pure virtual
virtual double bpp::SubstitutionModel::getScale ( ) const
pure virtual

Get the scalar product of diagonal elements of the generator and the frequencies vector. If the generator is normalized, then scale=1. Otherwise each element must be multiplied by 1/scale.

Returns
Minus the scalar product of diagonal elements and the frequencies vector.

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::AbstractBiblioSubstitutionModel::getScale().

virtual bool bpp::SubstitutionModel::isDiagonalizable ( ) const
pure virtual
virtual bool bpp::SubstitutionModel::isNonSingular ( ) const
pure virtual
virtual void bpp::SubstitutionModel::setFreqFromData ( const SequenceContainer data,
double  pseudoCount = 0 
)
pure virtual

Set equilibrium frequencies equal to the frequencies estimated from the data.

Parameters
dataThe sequences to use.
pseudoCountA quantity $\psi$ to add to adjust the observed values in order to prevent issues due to missing states on small data set. The corrected frequencies shall be computed as

\[ \pi_i = \frac{n_i+\psi}{\sum_j (f_j+\psi)} \]

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::AbstractBiblioSubstitutionModel::setFreqFromData(), and bpp::MarkovModulatedSubstitutionModel::setFreqFromData().

virtual void bpp::SubstitutionModel::setRate ( double  rate)
pure virtual
virtual void bpp::SubstitutionModel::setScale ( double  scale)
pure virtual

Set the rate of the generator, defined as the scalar product of diagonal elements of the generator and the frequencies vector.

When the generator is normalized, scale=1. Otherwise each element is multiplied such that the correct scale is set.

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::AbstractBiblioSubstitutionModel::setScale(), and bpp::MarkovModulatedSubstitutionModel::setScale().

virtual double bpp::SubstitutionModel::Sij ( size_t  i,
size_t  j 
) const
pure virtual
Returns
The exchangeability between state i and state j.

By definition Sij(i,j) = Sij(j,i).

Implemented in bpp::AbstractSubstitutionModel, bpp::MarkovModulatedSubstitutionModel, and bpp::AbstractBiblioSubstitutionModel.

Referenced by bpp::AbstractBiblioSubstitutionModel::Sij().


The documentation for this class was generated from the following file: