Difference between revisions of "List of methods"
From Bio++ Wiki
Revision as of 20:03, 6 March 2013
Here is a list of the methods available in the Bio++ libraries, with the corresponding class/function names, links and references.
Sequence analysis
Data structures
Method  Class(es) / Function(s)  Library  References / Links / Notes 

Simple sequence data structure  BasicSequence

bppseq  
Sequence with annotations  SequenceWithAnnotation

bppseq  
Sequence with quality scores  SequenceWithQuality

bppseq  
Simple container of sequences  VectorSequenceContainer

bppseq  Provides access and editing by id and by index 
Alignment container, optimized for sequence access  AlignedSequenceContainer

bppseq  Sequence access is <math>O(1)</math>, site access is <math>O(n)</math>, where n is the number of sequences. 
Alignment container, optimized for memory usage  CompressedVectorSiteContainer

bppseq  Same efficiency as VectorSiteContainer, but with a reduced memory footprint. Sequence editing is not possible, and meta-information such as the original position or any inherited attribute is lost.
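The access costs quoted for the sequence-oriented alignment container follow from its layout. The sketch below is a minimal illustration, assuming one string per sequence; `RowMajorAlignment` and its member names are hypothetical and are not the Bio++ classes or their API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical row-major alignment: one std::string per sequence.
// Illustration only; this is not the Bio++ AlignedSequenceContainer.
class RowMajorAlignment {
public:
    void addSequence(const std::string& seq) { rows_.push_back(seq); }

    // O(1): a sequence is already stored as one contiguous row.
    const std::string& sequence(std::size_t i) const { return rows_.at(i); }

    // O(n): a site (column) must be gathered from all n sequences.
    std::string site(std::size_t pos) const {
        std::string column;
        column.reserve(rows_.size());
        for (const std::string& row : rows_) {
            column.push_back(row.at(pos));
        }
        return column;
    }

    std::size_t numberOfSequences() const { return rows_.size(); }

private:
    std::vector<std::string> rows_;
};
```

A site-oriented container would store columns instead, which reverses the two costs.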

File formats
Supported formats are listed here with their corresponding parser classes, together with whether each format is implemented for reading and/or writing. The streaming option indicates that the parser also implements an iterator function, so that it is possible to loop over all sequences without storing them in memory.
Format  Class  Library  Reading  Writing  Streaming  References / Links / Notes 

Fasta  Fasta

bppseq  yes  yes  yes  
Mase  Mase

bppseq  yes  yes  no  
Clustal  Clustal

bppseq  yes  yes  no  
Phylip sequential  Phylip

bppseq  yes  yes  no  
Phylip interleaved  Phylip

bppseq  yes  yes  no  
Phylip sequential, extended (for PAML and PhyML)  Phylip

bppseq  yes  yes  no  
Phylip interleaved, extended (for PAML and PhyML)  Phylip

bppseq  yes  yes  no  
Nexus  NexusIOSequence

bppseq  yes  no  no  
GenBank  GenBank

bppseq  yes  no  no  Only raw sequences are imported; annotations are ignored. 
DCSE  DCSE

bppseq  yes  no  no  Format used by the Dedicated Comparative Sequence Editor, which can encode RNA secondary structure. The format does not seem to be maintained anymore. 
Stockholm  Stockholm

bppseq  no  yes  no  Contains structure information, although the current parser does not support it. This format is used notably by Pfam and Rfam. 
Phylogenetics
Method  Class(es) / Function(s)  Library  Reference  Links 

Neighbor Joining  NeighborJoining

bppphyl  Saitou and Nei (1987)  
BioNJ  BioNJ

bppphyl  Gascuel (1997) 
Substitution models
These models can be used for pairwise distance estimation, likelihood estimation, sequence simulation, ancestral sequence reconstruction, etc.
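As an example of pairwise distance estimation, the Jukes-Cantor model admits a closed-form distance, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The sketch below assumes two aligned, ungapped nucleotide sequences. The function names are hypothetical, and the code does not reflect how the Bio++ classes estimate distances.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <string>

// Observed proportion of differing sites (p-distance) between two
// aligned sequences of equal, non-zero length.
double pDistance(const std::string& a, const std::string& b) {
    if (a.size() != b.size() || a.empty()) {
        throw std::invalid_argument("sequences must be aligned and non-empty");
    }
    std::size_t diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) ++diff;
    }
    return static_cast<double>(diff) / static_cast<double>(a.size());
}

// Jukes-Cantor corrected distance for nucleotides:
//   d = -(3/4) * ln(1 - 4p/3)
// Returns +infinity when p >= 3/4, where the correction is undefined.
double jukesCantorDistance(double p) {
    if (p >= 0.75) return std::numeric_limits<double>::infinity();
    return -0.75 * std::log(1.0 - 4.0 * p / 3.0);
}
```

Richer models such as K80 or GTR have more parameters and are generally fitted numerically rather than through a single formula.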
Substitution models for nucleotides
Model  Class(es) / Function(s)  Library  Comment  Reference  Links 

Jukes-Cantor model for nucleotides  JCnuc

bppphyl  [Jukes & Cantor (1969), Evolution of protein molecules, 121-123 in Mammalian protein metabolism]  
Kimura 1980  K80

bppphyl  Kimura (1980)  
Felsenstein 1984  F84

bppphyl  [Felsenstein (1984), Phylip version 2.6]  
Hasegawa, Kishino & Yano 1985  HKY85

bppphyl  Hasegawa et al. (1985)  
Tamura 1992  T92

bppphyl  for strong transition-transversion and G+C content biases  Tamura (1992)  
Tamura & Nei 1993  TN93

bppphyl  Tamura & Nei (1993)  
General Time-Reversible substitution model  GTR

bppphyl  Yang (1994)  
Lobry 1995  L95

bppphyl  No-strand bias  Lobry (1995)  
Rzhetsky & Nei 1995  RN95

bppphyl  Rzhetsky and Nei (1995)  
Strand symmetric reversible model  SSR

bppphyl  Hobolth et al. (2007) 
Substitution models for proteins
Model  Class(es) / Function(s)  Library  Comment  Reference  Links 

Jukes-Cantor model for proteins  JCprot

bppphyl  [Jukes & Cantor (1969), Evolution of protein molecules, 121-123 in Mammalian protein metabolism]  
Dayhoff, Schwartz & Orcutt  DSO78

bppphyl  Kosiol & Goldman (2005)  
Jones, Taylor & Thornton 1992  JTT92

bppphyl  Jones et al. (1992)  
Whelan & Goldman 2001  WAG01

bppphyl  Whelan & Goldman (2001)  
Le & Gascuel 2008  LG08

bppphyl  mixture substitution model for proteins  Le et al. (2008)  
EX2 model  LLG08_EX2

bppphyl  mixture model: buried/exposed sites  Le et al. (2008)  
EX3 model  LLG08_EX3

bppphyl  mixture model: buried/intermediate/highly exposed sites  Le et al. (2008)  
EHO model  LLG08_EHO

bppphyl  mixture model: helix/elongated/other sites  Le et al. (2008)  
UL2 model  LLG08_UL2

bppphyl  mixture of 2 models built by unsupervised method  Le et al. (2008)  
UL3 model  LLG08_UL3

bppphyl  mixture of 3 models, Q1, Q2, Q3 built by unsupervised method  Le et al. (2008) 
Substitution models for codons
Model  Class(es) / Function(s)  Library  Comment  Reference  Links 

Goldman & Yang 1994  GY94

bppphyl  uses biochemical distances between residues  Goldman & Yang (1994)  
Muse & Gaut 1994  MG94

bppphyl  Muse & Gaut (1994)  
Yang & Nielsen 1998  YN98

bppphyl  Yang & Nielsen (1998)  
M1 model  YNGKP_M1

bppphyl  mixture of YN98 class models  Yang et al. (2000)  
M2 model  YNGKP_M2

bppphyl  mixture of YN98 class models  Yang et al. (2000)  
M3 model  YNGKP_M3

bppphyl  mixture of YN98 class models  Yang et al. (2000)  
M7 model  YNGKP_M7

bppphyl  mixture of YN98 class models  Yang et al. (2000)  
M8 model  YNGKP_M8

bppphyl  mixture of YN98 class models  Yang et al. (2000) 
Covarion models
Model  Class(es) / Function(s)  Library  Comment  Reference  Links 

Tuffley & Steel 1998  TS98

bppphyl  Tuffley & Steel (1998)  
Galtier 2001  G2001

bppphyl  Galtier (2001)  
YpR  YpR

bppphyl  Dinucleotides transition model  Bérard et al. (2008) 
Miscellaneous
Model  Class(es) / Function(s)  Library  Comment  Reference  Links 

RN95∩L95  RN95s

bppphyl  Intersection of models RN95 and L95  Lobry (1995)  
Rivas-Eddy  RE08

bppphyl  substitution model with gap characters  Rivas & Eddy (2008)  
Custom model  UserProteinSubstitutionModel

bppphyl  model customizable by user  
2-state substitution model  BinarySubstitutionModel

bppphyl 
Population genetics
Method  Class(es) / Function(s)  Library  Reference  Links / Notes 

Alignment container for sequences with references to group (population) identifiers  PolymorphismSequenceContainer

bpppopgen  
Container for allelic data  PolymorphismMultiGContainer

bpppopgen  
Expected heterozygosity or Gene diversity  SequenceStatistics / heterozygosity

bpppopgen  Weir (1996)  
Genetic diversity estimator <math>\theta</math> of Watterson  SequenceStatistics / watterson75

bpppopgen  Watterson (1975)  Also exists for synonymous and non-synonymous sites, via the functions watterson75Synonymous / watterson75NonSynonymous

Mean nucleotide diversity estimator <math>\pi</math> of Tajima  SequenceStatistics / tajima83

bpppopgen  Tajima (1983)  Also exists for synonymous and non-synonymous sites, via the functions piSynonymous / piNonSynonymous

Diversity estimator H of Fay and Wu  SequenceStatistics / FayWu2000

bpppopgen  Fay and Wu (2000)  
Haplotype number in the sample  SequenceStatistics / DVK

bpppopgen  Depaulis and Veuille (1998)  
Haplotype diversity in the sample  SequenceStatistics / DVH

bpppopgen  Depaulis and Veuille (1998)  
Scaled recombination parameter (C = 4Nr)  SequenceStatistics / hudson87

bpppopgen  Hudson (1987)  
McDonald-Kreitman contingency table  SequenceStatistics / MKtable

bpppopgen  McDonald and Kreitman (1991)  
Neutrality index (NI)  SequenceStatistics / neutralityIndex

bpppopgen  Rand and Kann (1996)  
Tajima's D  SequenceStatistics / tajimaDSS and tajimaDTNM

bpppopgen  Tajima (1989)  tajimaDSS is the calculation using the number of polymorphic (segregating) sites and tajimaDTNM is the calculation using the total number of mutations.

Fu and Li (1993) statistics D, D*, F and F*  SequenceStatistics / fuliD, fuliDstar, fuliF and fuliFstar

bpppopgen  Fu and Li (1993)  
Fst from frequencies at polymorphic sites  SequenceStatistics / FstHudson92

bpppopgen  Hudson, Slatkin and Maddison (1992)  Taken from eq. 3 of Hudson, Slatkin and Maddison (1992) 
F statistics of Weir and Cockerham (including Fit, Fis and Fst)  MultilocusGenotypeStatistics / getAllelesFstats

bpppopgen  Weir and Cockerham (1984) 
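Several of the statistics listed above are simple functions of an alignment. The sketch below computes the number of segregating sites S, Watterson's <math>\theta</math> (per locus), Tajima's <math>\pi</math> (mean number of pairwise differences) and Tajima's D from S. It assumes an ungapped alignment of at least two equal-length sequences held as plain strings. All names are hypothetical, and the code is not the SequenceStatistics implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

using Alignment = std::vector<std::string>;  // ungapped rows of equal length

// S: number of columns that show more than one state.
std::size_t segregatingSites(const Alignment& aln) {
    std::size_t s = 0;
    for (std::size_t pos = 0; pos < aln.at(0).size(); ++pos) {
        for (std::size_t i = 1; i < aln.size(); ++i) {
            if (aln[i][pos] != aln[0][pos]) { ++s; break; }
        }
    }
    return s;
}

// Sum over i = 1 .. n-1 of 1 / i^power (a1 for power 1, a2 for power 2).
double harmonicSum(std::size_t n, int power) {
    double sum = 0.0;
    for (std::size_t i = 1; i < n; ++i) {
        sum += 1.0 / std::pow(static_cast<double>(i), power);
    }
    return sum;
}

// Watterson's estimator, per locus: S / a1.
double wattersonTheta(const Alignment& aln) {
    return static_cast<double>(segregatingSites(aln)) / harmonicSum(aln.size(), 1);
}

// Tajima's pi: mean number of differences over all pairs of sequences.
double tajimaPi(const Alignment& aln) {
    const std::size_t n = aln.size();
    double diffs = 0.0;
    for (std::size_t i = 0; i + 1 < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j)
            for (std::size_t pos = 0; pos < aln[i].size(); ++pos)
                if (aln[i][pos] != aln[j][pos]) diffs += 1.0;
    return diffs / (static_cast<double>(n) * static_cast<double>(n - 1) / 2.0);
}

// Tajima's D based on S, with the variance constants of Tajima (1989).
// Returns NaN when there is no segregating site.
double tajimaD(const Alignment& aln) {
    const double n = static_cast<double>(aln.size());
    const double s = static_cast<double>(segregatingSites(aln));
    if (s == 0.0) return std::numeric_limits<double>::quiet_NaN();
    const double a1 = harmonicSum(aln.size(), 1);
    const double a2 = harmonicSum(aln.size(), 2);
    const double b1 = (n + 1.0) / (3.0 * (n - 1.0));
    const double b2 = 2.0 * (n * n + n + 3.0) / (9.0 * n * (n - 1.0));
    const double c1 = b1 - 1.0 / a1;
    const double c2 = b2 - (n + 2.0) / (a1 * n) + a2 / (a1 * a1);
    const double e1 = c1 / a1;
    const double e2 = c2 / (a1 * a1 + a2);
    return (tajimaPi(aln) - s / a1) / std::sqrt(e1 * s + e2 * s * (s - 1.0));
}
```

For the four sequences AAAAA, AAAAT, AATAT and ATTAT, this gives S = 3, <math>\theta</math> = 18/11 and <math>\pi</math> = 5/3, so D is slightly positive.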