Compute shared polymorphism

From Bio++ Wiki
Jump to: navigation, search

This recipe is meant to compare two alignments in order to estimate the number of shared / fixed synonymous polymorphism.

Get the alignments

We assume here that the data for the two populations to compare are in two SiteContainers names sites1 and sites2. These containers need not be PolymorphismSiteContainers, but have to contain codon sequences and the exact same number of sites.

Restrict to synonymous polymorphism

[This step can be omitted, depending on your needs]

There we need to loop over all sites in the containers. This can efficiently be done via a function:

<source lang="cpp"> StandardGeneticCode gc(&AlphabetTools::DNA_ALPHABET); //Of course you use the one you need here... unsigned int n = sites1.getNumberOfSites(); for (unsigned int i = n; i > 0; --i) {

 if (! CodonSiteTools::isSynonymousPolymorphic(sites1.getSite(i-1), gc) ||
     ! CodonSiteTools::isSynonymousPolymorphic(sites2.getSite(i-1), gc))

} </source>

Compare the two containers

<source lang="cpp">

  1. include <NumCalc/VectorTools.h>

using namespace bpp;

unsigned int ntot = sites1.getNumberOfSites(); //This might be different from the n before, if there was some nonsynonymous polymorphism. unsigned int npoly1 = 0; unsigned int npoly2 = 0; unsigned int nfixed = 0; unsigned int nshared = 0; for (unsigned int i = 0; i < ntot; ++i) {

 vector<int> v1 = VectorTools::unique(sites1.getSite(i).getContent());
 vector<int> v2 = VectorTools::unique(sites2.getSite(i).getContent());
 if (v1.size() == 1 && v2.size() == 1 && v1[0] != v2[0])
 if (v1.size() == 1 && v2.size() > 1)
 if (v1.size() > 1 && v2.size() == 1)
 if (v1.size() > 1 && v2.size() > 1 && VectorTools::vectorIntersection(v1, v2).size() > 1)
 //Note: it is possible to set some additional filter to restrict the analysis for biallelic sites only, for instance by changing all > 1 to == 2.

} </source>

There you go!