This page lists the methods available in the Bio++ libraries, with the corresponding class/function names, links and references.
Sequence analysis
Data structures
Method | Class(es) / Function(s) | Library | References / Links / Notes |
Simple sequence data structure | BasicSequence | bpp-seq | |
Sequence with annotations | SequenceWithAnnotation | bpp-seq | |
Sequence with quality scores | SequenceWithQuality | bpp-seq | |
Simple container of sequences | VectorSequenceContainer | bpp-seq | Provides access and editing by id and by index. |
Alignment container, optimized for sequence access | AlignedSequenceContainer | bpp-seq | Sequence access is <math>O(1)</math>, site access is <math>O(n)</math>, where <math>n</math> is the number of sequences. |
Alignment container, optimized for memory usage | CompressedVectorSiteContainer | bpp-seq | Same efficiency as VectorSiteContainer, yet with a reduced memory footprint. Sequences cannot be edited, and meta information such as the original position or any inherited attribute is lost. |
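The access costs quoted for the sequence-oriented alignment container can be illustrated with a toy layout. The sketch below is plain Python and does not use the Bio++ API; the class name `SequenceMajorAlignment` and its methods are hypothetical and exist only for this illustration. It assumes the alignment is stored as one string per sequence, so a sequence is returned directly while a site has to be assembled from every row.

```python
class SequenceMajorAlignment:
    """Toy alignment stored as one string per sequence (hypothetical class)."""

    def __init__(self, names, sequences):
        if len(names) != len(sequences):
            raise ValueError("names and sequences must have the same length")
        if len({len(s) for s in sequences}) > 1:
            raise ValueError("all sequences must have the same length")
        self.names = list(names)
        self.rows = list(sequences)

    def sequence(self, index):
        # O(1): the row is stored as-is and returned directly.
        return self.rows[index]

    def site(self, position):
        # O(n): one character is collected from each of the n sequences.
        return "".join(row[position] for row in self.rows)


aln = SequenceMajorAlignment(["s1", "s2", "s3"], ["ACGT", "AC-T", "ATGT"])
second_sequence = aln.sequence(1)
third_site = aln.site(2)
```

A site-oriented container would make the opposite trade-off: columns are stored directly and a sequence has to be rebuilt from all columns.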
File formats
Supported formats are listed here with their corresponding parser classes. For each format, the table indicates whether reading and/or writing is implemented. The streaming column indicates that the parser also implements an iterator function, so that it is possible to loop over all sequences without storing them in memory.
Format | Class | Library | Reading | Writing | Streaming | References / Links / Notes |
Fasta | Fasta | bpp-seq | yes | yes | yes | |
Mase | Mase | bpp-seq | yes | yes | no | |
Clustal | Clustal | bpp-seq | yes | yes | no | |
Phylip sequential | Phylip | bpp-seq | yes | yes | no | |
Phylip interleaved | Phylip | bpp-seq | yes | yes | no | |
Phylip sequential, extended (for PAML and PhyML) | Phylip | bpp-seq | yes | yes | no | |
Phylip interleaved, extended (for PAML and PhyML) | Phylip | bpp-seq | yes | yes | no | |
Nexus | NexusIOSequence | bpp-seq | yes | no | no | |
GenBank | GenBank | bpp-seq | yes | no | no | Only raw sequences are imported; annotations are ignored. |
DCSE | DCSE | bpp-seq | yes | no | no | Format used by the Dedicated Comparative Sequence Editor, which can encode RNA secondary structure. It does not seem to be maintained anymore. |
Stockholm | Stockholm | bpp-seq | no | yes | no | The format can contain structure information, although the current implementation does not support it. It is used notably by Pfam and Rfam. |
FastQ | Fastq | bpp-seq-omics | yes | yes | yes | Quality scores are also imported. The parser returns SequenceWithQuality objects. |
Multiple Alignment Format (MAF) | MafParser / OutputMafIterator | bpp-seq-omics | yes | yes | yes | Streaming is performed on alignment blocks. Meta information, including quality scores, is supported. Uses specific classes for alignment blocks and sequences. |
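The streaming option described for the parsers above can be sketched as a generator that yields one record at a time. The example below is plain Python, not the Bio++ `Fasta` class; the function name `iter_fasta` is hypothetical, and the sketch assumes a simple Fasta file in which each record starts with a `>` header line followed by one or more sequence lines.

```python
import io


def iter_fasta(handle):
    """Yield (name, sequence) pairs one record at a time (hypothetical helper)."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    # Emit the last record once the input is exhausted.
    if name is not None:
        yield name, "".join(chunks)


example = io.StringIO(">seq1\nACGT\nAC\n>seq2\nGGA\n")
records = list(iter_fasta(example))
```

In real use the generator would be consumed in a `for` loop rather than collected into a list, so that only the current record is held in memory.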
Phylogenetics
Substitution models
These models can be used for pairwise distance estimation, likelihood estimation, sequence simulation, ancestral sequence reconstruction, etc.
Substitution models for nucleotides
Model | Class(es) / Function(s) | Library | Comment | Reference | Links |
Jukes-Cantor model for nucleotides | JCnuc | bpp-phyl | | Jukes & Cantor (1969), Evolution of protein molecules, pp. 21-132 in Mammalian Protein Metabolism | |
Kimura 1980 | K80 | bpp-phyl | | Kimura (1980) | |
Felsenstein 1984 | F84 | bpp-phyl | | Felsenstein (1984), Phylip version 2.6 | |
Hasegawa, Kishino & Yano 1985 | HKY85 | bpp-phyl | | Hasegawa et al. (1985) | |
Tamura 1992 | T92 | bpp-phyl | For strong transition-transversion and G+C content biases | Tamura (1992) | |
Tamura & Nei 1993 | TN93 | bpp-phyl | | Tamura & Nei (1993) | |
General Time-Reversible substitution model | GTR | bpp-phyl | | Yang (1994) | |
Lobry 1995 | L95 | bpp-phyl | No-strand-bias model | Lobry (1995) | |
Rzhetsky & Nei 1995 | RN95 | bpp-phyl | | Rzhetsky & Nei (1995) | |
Strand symmetric reversible model | SSR | bpp-phyl | | Hobolth et al. (2007) | |
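As an illustration of pairwise distance estimation with the two simplest nucleotide models listed above, the following sketch applies the standard closed-form distance formulas for the Jukes-Cantor and Kimura (1980) models. It is plain Python and does not call bpp-phyl; the function names are hypothetical, and sites containing anything other than A, C, G or T are simply skipped, which is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}


def count_differences(seq_a, seq_b):
    """Return (compared sites, transitions, transversions) for two aligned sequences."""
    n = ts = tv = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous characters
        n += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            ts += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            tv += 1
    return n, ts, tv


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor distance: d = -3/4 ln(1 - 4p/3), p = proportion of differences."""
    n, ts, tv = count_differences(seq_a, seq_b)
    arg = 1.0 - 4.0 * ((ts + tv) / n) / 3.0
    if arg <= 0:
        raise ValueError("sequences are too divergent for the JC69 correction")
    return -0.75 * math.log(arg)


def k80_distance(seq_a, seq_b):
    """Kimura 1980 distance: d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)."""
    n, ts, tv = count_differences(seq_a, seq_b)
    p, q = ts / n, tv / n
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences are too divergent for the K80 correction")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


seq_a = "AAAAAAAAAA"
seq_b = "AAAAAAAAGC"  # one transition (A/G) and one transversion (A/C)
d_jc = jc69_distance(seq_a, seq_b)
d_k80 = k80_distance(seq_a, seq_b)
```

Both distances are expressed in expected substitutions per site. Richer models such as HKY85 or GTR have no such simple closed form in general and are usually fitted by maximum likelihood.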
Substitution models for proteins
Model | Class(es) / Function(s) | Library | Comment | Reference | Links |
Jukes-Cantor model for proteins | JCprot | bpp-phyl | | Jukes & Cantor (1969), Evolution of protein molecules, pp. 21-132 in Mammalian Protein Metabolism | |
Dayhoff, Schwartz & Orcutt 1978 | DSO78 | bpp-phyl | | Kosiol & Goldman (2005) | |
Jones, Taylor & Thornton 1992 | JTT92 | bpp-phyl | | Jones et al. (1992) | |
Whelan & Goldman 2001 | WAG01 | bpp-phyl | | Whelan & Goldman (2001) | |
Le & Gascuel 2008 | LG08 | bpp-phyl | Empirical substitution model for proteins | Le & Gascuel (2008) | |
EX2 model | LLG08_EX2 | bpp-phyl | Mixture model: buried/exposed sites | Le et al. (2008) | |
EX3 model | LLG08_EX3 | bpp-phyl | Mixture model: buried/intermediate/highly exposed sites | Le et al. (2008) | |
EHO model | LLG08_EHO | bpp-phyl | Mixture model: extended/helix/other sites | Le et al. (2008) | |
UL2 model | LLG08_UL2 | bpp-phyl | Mixture of 2 models built by an unsupervised method | Le et al. (2008) | |
UL3 model | LLG08_UL3 | bpp-phyl | Mixture of 3 models (Q1, Q2, Q3) built by an unsupervised method | Le et al. (2008) | |
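The idea behind a mixture model can be sketched with the simplest protein model in the table: under a mixture, a transition probability is the weighted average of the probabilities given by each component. The code below is plain Python and not the bpp-phyl implementation. It uses the equal-rates (Jukes-Cantor-like) model on 20 states, and, as a deliberate simplification, the components differ only by a hypothetical rate multiplier, whereas the mixtures of Le et al. (2008) combine distinct exchangeability matrices. The weights and rates are made-up values.

```python
import math


def equal_rates_probability(same_state, t, n_states=20):
    """Transition probability under an equal-rates model.

    t is the branch length in expected substitutions per site.
    """
    decay = math.exp(-n_states * t / (n_states - 1))
    if same_state:
        return 1.0 / n_states + (n_states - 1) / n_states * decay
    return 1.0 / n_states - decay / n_states


def mixture_probability(same_state, t, weights, rates, n_states=20):
    """Weighted average over components (here: hypothetical rate multipliers)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(
        w * equal_rates_probability(same_state, r * t, n_states)
        for w, r in zip(weights, rates)
    )


# Illustrative values only: two components with equal weights.
p_slow = equal_rates_probability(True, 0.5 * 0.3)
p_fast = equal_rates_probability(True, 1.5 * 0.3)
p_mix = mixture_probability(True, 0.3, weights=[0.5, 0.5], rates=[0.5, 1.5])
```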
Substitution models for codons
Covarion models
Miscellaneous
Model | Class(es) / Function(s) | Library | Comment | Reference | Links |
RN95∩L95 | RN95s | bpp-phyl | Intersection of models RN95 and L95 | Lobry (1995) | |
Rivas-Eddy | RE08 | bpp-phyl | Substitution model with gap characters | Rivas & Eddy (2008) | |
Custom model | UserProteinSubstitutionModel | bpp-phyl | Model customizable by the user | | |
Two-state substitution model | BinarySubstitutionModel | bpp-phyl | | | |
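For a model with only two states, the transition probabilities have a simple closed form, which the sketch below computes. It is plain Python for a generic two-state continuous-time Markov chain; the parametrization by two rates (`rate_01` and `rate_10`) is an assumption of this example and is not necessarily the one used by BinarySubstitutionModel.

```python
import math


def two_state_probabilities(rate_01, rate_10, t):
    """Return the 2x2 transition matrix P(t) of a two-state Markov chain.

    rate_01 is the rate of change from state 0 to state 1, rate_10 the reverse.
    """
    total = rate_01 + rate_10
    if total <= 0:
        raise ValueError("the two rates must sum to a positive value")
    pi1 = rate_01 / total  # stationary frequency of state 1
    pi0 = rate_10 / total  # stationary frequency of state 0
    decay = math.exp(-total * t)
    return [
        [pi0 + pi1 * decay, pi1 * (1.0 - decay)],
        [pi0 * (1.0 - decay), pi1 + pi0 * decay],
    ]


p_short = two_state_probabilities(1.0, 3.0, 0.1)
p_long = two_state_probabilities(1.0, 3.0, 50.0)
```

For long branches each row converges to the stationary frequencies, here 0.75 for state 0 and 0.25 for state 1.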
Population genetics
Method | Class(es) / Function(s) | Library | Reference | Links / Notes |
Alignment container for sequences, with references to group (population) identifiers | PolymorphismSequenceContainer | bpp-popgen | | |
Container for allelic data | PolymorphismMultiGContainer | bpp-popgen | | |
Expected heterozygosity or gene diversity | SequenceStatistics / heterozygosity | bpp-popgen | Weir (1996) | |
Genetic diversity estimator <math>\theta</math> of Watterson | SequenceStatistics / watterson75 | bpp-popgen | Watterson (1975) | Also available for synonymous and non-synonymous sites through the functions watterson75Synonymous / watterson75NonSynonymous |
Mean nucleotide diversity estimator <math>\pi</math> of Tajima | SequenceStatistics / tajima83 | bpp-popgen | Tajima (1983) | Also available for synonymous and non-synonymous sites through the functions piSynonymous / piNonSynonymous |
Diversity estimator H of Fay and Wu | SequenceStatistics / FayWu2000 | bpp-popgen | Fay and Wu (2000) | |
Haplotype number in the sample | SequenceStatistics / DVK | bpp-popgen | Depaulis and Veuille (1998) | |
Haplotype diversity in the sample | SequenceStatistics / DVH | bpp-popgen | Depaulis and Veuille (1998) | |
Scaled recombination parameter (C = 4Nr) | SequenceStatistics / hudson87 | bpp-popgen | Hudson (1987) | |
McDonald-Kreitman contingency table | SequenceStatistics / MKtable | bpp-popgen | McDonald and Kreitman (1991) | |
Neutrality index (NI) | SequenceStatistics / neutralityIndex | bpp-popgen | Rand and Kann (1996) | |
Tajima's D | SequenceStatistics / tajimaDSS and tajimaDTNM | bpp-popgen | Tajima (1989) | tajimaDSS uses the number of polymorphic (segregating) sites, whereas tajimaDTNM uses the total number of mutations. |
Fu and Li (1993) statistics D, D*, F and F* | SequenceStatistics / fuliD, fuliDstar, fuliF and fuliFstar | bpp-popgen | Fu and Li (1993) | |
Fst from frequencies at polymorphic sites | SequenceStatistics / FstHudson92 | bpp-popgen | Hudson, Slatkin and Maddison (1992) | Taken from eq. 3 of Hudson, Slatkin and Maddison (1992) |
F statistics of Weir and Cockerham (including Fit, Fis and Fst) | MultilocusGenotypeStatistics / getAllelesFstats | bpp-popgen | Weir and Cockerham (1984) | |
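To make three of the statistics above concrete, the following sketch computes Watterson's <math>\theta</math>, Tajima's <math>\pi</math> and Tajima's D from a small alignment, using the textbook formulas of Watterson (1975), Tajima (1983) and Tajima (1989). It is plain Python and does not call bpp-popgen; the function names are hypothetical. The sketch assumes an alignment without gaps, reports values per locus rather than per site, and bases Tajima's D on the number of segregating sites (the variant that the table associates with tajimaDSS).

```python
import math
from itertools import combinations


def segregating_sites(alignment):
    """Number of alignment columns showing more than one state."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def watterson_theta(alignment):
    """Watterson's estimator: S / a1, with a1 = sum_{i=1}^{n-1} 1/i."""
    n = len(alignment)
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(alignment) / a1


def tajima_pi(alignment):
    """Mean number of pairwise differences between sequences."""
    pairs = list(combinations(alignment, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / len(pairs)


def tajima_d(alignment):
    """Tajima's D computed from the number of segregating sites."""
    n = len(alignment)
    s = segregating_sites(alignment)
    if s == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    return (tajima_pi(alignment) - s / a1) / math.sqrt(variance)


alignment = ["ACGT", "ACGA", "ATGA", "ACGT"]
theta = watterson_theta(alignment)
pi = tajima_pi(alignment)
d = tajima_d(alignment)
```

In this toy alignment two of the four sites are polymorphic and <math>\pi</math> exceeds <math>\theta</math>, so D is positive.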
Numerical methods
Function minimization