bpp-seq  2.1.0
bpp::WordAlphabet Class Reference

The base class for word alphabets.

#include <Bpp/Seq/Alphabet/WordAlphabet.h>

Public Member Functions

 WordAlphabet (const std::vector< const Alphabet * > &vAlpha)
 Builds a new word alphabet from a vector of Alphabets.
 WordAlphabet (const Alphabet *pAlpha, unsigned int num)
 Builds a new word alphabet of a given length from a pointer to a single Alphabet, used at every position.
virtual ~WordAlphabet ()
bool hasUniqueAlphabet () const
 Returns true if the Alphabets of all letters in the word are of the same type.
unsigned int getLength () const
 Returns the length of the word.
unsigned int getNumberOfTypes () const
 Returns the number of resolved states + one for unresolved.
std::string getAlphabetType () const
 Identification method.
int getUnknownCharacterCode () const
bool isUnresolved (int state) const
bool isUnresolved (const std::string &state) const
std::vector< int > getAlias (int state) const throw (BadIntException)
 Get all resolved states that match a generic state.
std::vector< std::string > getAlias (const std::string &state) const throw (BadCharException)
 Get all resolved states that match a generic state.
int getGeneric (const std::vector< int > &states) const throw (BadIntException)
 Get the generic state that matches a set of states.
std::string getGeneric (const std::vector< std::string > &states) const throw (BadCharException)
 Get the generic state that matches a set of states.
Methods redefined from Alphabet
std::string getName (const std::string &state) const throw (BadCharException)
 Get the complete name of a state given its string description.
int charToInt (const std::string &state) const throw (BadCharException)
 Give the int description of a state given its string description.
unsigned int getSize () const
 Get the number of resolved states in the alphabet (e.g. returns 4 for a DNA alphabet). This is the method you'll need in most cases.
Word specific methods
const Alphabet * getNAlphabet (size_t n) const
 Get the pointer to the Alphabet of the n-position.
virtual int getWord (const std::vector< int > &vint, size_t pos=0) const throw (IndexOutOfBoundsException)
 Get the int code for a word given the int code of the underlying positions.
virtual std::string getWord (const std::vector< std::string > &vpos, size_t pos=0) const throw (IndexOutOfBoundsException, BadCharException)
 Get the char code for a word given the char code of the underlying positions.
int getNPosition (int word, size_t n) const throw (BadIntException)
 Get the int code of the n-position of a word given its int description.
std::vector< int > getPositions (int word) const throw (BadIntException)
 Get the int codes of each position of a word given its int description.
std::string getNPosition (const std::string &word, size_t n) const throw (BadCharException)
 Get the char code of the n-position of a word given its char description.
std::vector< std::string > getPositions (const std::string &word) const throw (BadCharException)
 Get the char codes of each position of a word given its char description.
Sequence * translate (const Sequence &sequence, size_t pos=0) const throw (AlphabetMismatchException, Exception)
 Translate a whole sequence from letters alphabet to words alphabet.
Sequence * reverse (const Sequence &sequence) const throw (AlphabetMismatchException, Exception)
 Translate a whole sequence from words alphabet to letters alphabet.
Overloaded AbstractAlphabet methods.
unsigned int getStateCodingSize () const
 Get the size of the string coding a state.
Implement these methods from the Alphabet interface.
unsigned int getNumberOfChars () const
 Get the number of supported characters in this alphabet, including generic characters (e.g. returns 20 for a DNA alphabet).
std::string getName (int state) const throw (BadIntException)
 Get the complete name of a state given its int description.
std::string intToChar (int state) const throw (BadIntException)
 Give the string description of a state given its int description.
bool isIntInAlphabet (int state) const
 Tell if a state (specified by its int description) is allowed by the alphabet.
bool isCharInAlphabet (const std::string &state) const
 Tell if a state (specified by its string description) is allowed by the alphabet.
const std::vector< int > & getSupportedInts () const
const std::vector< std::string > & getSupportedChars () const
int getGapCharacterCode () const
bool isGap (int state) const
bool isGap (const std::string &state) const
Specific methods to access AlphabetState
const AlphabetState & getState (const std::string &letter) const throw (BadCharException)
 Get a state by its letter.
const AlphabetState & getState (int num) const throw (BadIntException)
 Get a state by its num.

Protected Member Functions

virtual void registerState (const AlphabetState &st)
 Add a state to the Alphabet.
virtual void setState (size_t pos, const AlphabetState &st) throw (IndexOutOfBoundsException)
 Set a state in the Alphabet.
void resize (unsigned int size)
 Resize the private alphabet_ vector.
virtual AlphabetState & getStateAt (size_t pos) throw (IndexOutOfBoundsException)
 Get a state at a position in the alphabet_ vector.
virtual const AlphabetState & getStateAt (size_t pos) const throw (IndexOutOfBoundsException)
 Get a state at a position in the alphabet_ vector.
void remap ()
 Re-update the maps using the alphabet_ vector content.

Protected Attributes

std::vector< const Alphabet * > vAbsAlph_
Available codes

These vectors are computed the first time you call the getSupportedInts or getSupportedChars method.

std::vector< std::string > charList_
std::vector< int > intList_

Private Member Functions

Inner utility functions
bool containsUnresolved (const std::string &state) const throw (BadCharException)
bool containsGap (const std::string &state) const throw (BadCharException)
void build_ ()

Detailed Description

The base class for word alphabets.

These alphabets are compounds of several alphabets. The only constraint on the component alphabets is that their words have length one (so it is not possible to build a WordAlphabet from other WordAlphabets). A WordAlphabet is constructed from a vector of pointers to Alphabets.

The strings of the WordAlphabet are concatenations of the strings of the Alphabets. They are made from the resolved letters of the Alphabets.

Definition at line 66 of file WordAlphabet.h.
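
The concatenation rule described above can be illustrated without Bio++. The following is a minimal sketch, not the library's build_() implementation: buildWords is a hypothetical helper, unit alphabets are represented as plain vectors of one-character strings, and the enumeration order shown is an assumption of the sketch rather than a statement about the int codes Bio++ assigns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Bio++: enumerate every word obtained by
// concatenating one resolved letter from each unit alphabet, position by
// position. The last position varies fastest.
std::vector<std::string> buildWords(
    const std::vector<std::vector<std::string> >& unitLetters)
{
  std::vector<std::string> words(1, "");  // start from the empty word
  for (std::size_t p = 0; p < unitLetters.size(); ++p)
  {
    std::vector<std::string> extended;
    for (std::size_t w = 0; w < words.size(); ++w)
    {
      for (std::size_t l = 0; l < unitLetters[p].size(); ++l)
      {
        // Unit alphabets must have states of length one.
        assert(unitLetters[p][l].size() == 1);
        extended.push_back(words[w] + unitLetters[p][l]);
      }
    }
    words.swap(extended);
  }
  return words;
}
```

With two copies of a four-letter alphabet, the sketch yields 4 x 4 = 16 words, each of length two, which is the sense in which the number of resolved words is the product of the sizes of the unit alphabets.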


Constructor & Destructor Documentation

WordAlphabet::WordAlphabet ( const std::vector< const Alphabet * > &  vAlpha)

Builds a new word alphabet from a vector of Alphabets.

The unit alphabets are not owned by the word alphabet, and won't be destroyed when this instance is destroyed.

Parameters:
vAlpha  The vector of Alphabets to be used.

Definition at line 51 of file WordAlphabet.cpp.

References build_().

WordAlphabet::WordAlphabet ( const Alphabet *  pAlpha,
unsigned int  num 
)

Builds a new word alphabet of a given length from a pointer to a single Alphabet, used at every position.

Parameters:
pAlpha  The pointer to the Alphabet to be used.
num  The length of the words.

Definition at line 58 of file WordAlphabet.cpp.

References build_(), and vAbsAlph_.

virtual bpp::WordAlphabet::~WordAlphabet ( ) [inline, virtual]

Definition at line 93 of file WordAlphabet.h.


Member Function Documentation

bool WordAlphabet::containsGap ( const std::string &  state) const throw (BadCharException) [private]

Definition at line 169 of file WordAlphabet.cpp.

Referenced by charToInt().

bool WordAlphabet::containsUnresolved ( const std::string &  state) const throw (BadCharException) [private]

Definition at line 151 of file WordAlphabet.cpp.

Referenced by charToInt().

std::vector< int > WordAlphabet::getAlias ( int  state) const throw (BadIntException) [virtual]

Get all resolved states that match a generic state.

If the given state is not a generic code, the output vector will contain only this code.

Parameters:
state  The alias to resolve.
Returns:
A vector of resolved states.
Exceptions:
BadIntException  When state is not a valid integer.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 200 of file WordAlphabet.cpp.

std::vector< std::string > WordAlphabet::getAlias ( const std::string &  state) const throw (BadCharException) [virtual]

Get all resolved states that match a generic state.

If the given state is not a generic code, the output vector will contain only this code.

Parameters:
state  The alias to resolve.
Returns:
A vector of resolved states.
Exceptions:
BadCharException  When state is not a valid char description.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 224 of file WordAlphabet.cpp.

References bpp::TextTools::toUpper().
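
The contract shared by both getAlias overloads can be sketched as follows. This is an illustration only, not the Bio++ code: aliasOf is a hypothetical function, std::out_of_range stands in for BadIntException, and the coding scheme (resolved states numbered 0 to size-1, a single generic state numbered size) is an assumption made for the sketch.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the documented contract of getAlias(int).
// Assumption: resolved states are coded 0 .. size-1 and the only generic
// ("unknown") state is coded `size`.
std::vector<int> aliasOf(int state, int size)
{
  if (size <= 0)
    throw std::invalid_argument("alphabet must have at least one state");
  if (state < 0 || state > size)
    throw std::out_of_range("state is not a valid code");

  std::vector<int> resolved;
  if (state == size)
  {
    // Generic code: it matches every resolved state.
    for (int s = 0; s < size; ++s)
      resolved.push_back(s);
  }
  else
  {
    // Already resolved: the output contains only this code.
    resolved.push_back(state);
  }
  assert(!resolved.empty());
  return resolved;
}
```

Under these assumptions, aliasOf(3, 64) returns a vector holding only 3, while aliasOf(64, 64) returns all 64 resolved codes.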

std::string WordAlphabet::getAlphabetType ( ) const [virtual]

Identification method.

Used to tell if two alphabets describe the same type of sequences. For instance, this method is used by sequence containers to compare two alphabets and allow or deny addition of sequences.

Returns:
A text describing the alphabet.

Implements bpp::Alphabet.

Reimplemented in bpp::CodonAlphabet, bpp::InvertebrateMitochondrialCodonAlphabet, bpp::YeastMitochondrialCodonAlphabet, bpp::EchinodermMitochondrialCodonAlphabet, bpp::VertebrateMitochondrialCodonAlphabet, and bpp::StandardCodonAlphabet.

Definition at line 129 of file WordAlphabet.cpp.

References vAbsAlph_.

Referenced by hasUniqueAlphabet().

int bpp::AbstractAlphabet::getGapCharacterCode ( ) const [inline, virtual, inherited]
Returns:
The int code for gap characters.

Implements bpp::Alphabet.

Definition at line 132 of file AbstractAlphabet.h.

Referenced by bpp::SequenceTools::replaceStopsWithGaps().

int WordAlphabet::getGeneric ( const std::vector< int > &  states) const throw (BadIntException) [virtual]

Get the generic state that matches a set of states.

If the given states contain generic codes, each generic code is first resolved and then the new generic state is returned. If only a single resolved state is given, the function returns this state.

Parameters:
states  A vector of states to resolve.
Returns:
An int code for the computed state.
Exceptions:
BadIntException  When a state is not a valid integer.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 256 of file WordAlphabet.cpp.

std::string WordAlphabet::getGeneric ( const std::vector< std::string > &  states) const throw (BadCharException) [virtual]

Get the generic state that matches a set of states.

If the given states contain generic codes, each generic code is first resolved and then the new generic state is returned. If only a single resolved state is given, the function returns this state.

Parameters:
states  A vector of states to resolve.
Returns:
A string code for the computed state.
Exceptions:
BadCharException  When a state is not a valid char description.
CharStateNotSupportedException  When the alphabet does not support a char state for the unresolved state.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 263 of file WordAlphabet.cpp.

unsigned int bpp::WordAlphabet::getLength ( ) const [inline]

Returns the length of the word.

Definition at line 142 of file WordAlphabet.h.

References vAbsAlph_.

const Alphabet* bpp::WordAlphabet::getNAlphabet ( size_t  n) const [inline]

Get the pointer to the Alphabet of the n-position.

Parameters:
n  The position in the word (starting at 0).
Returns:
The pointer to the Alphabet of the n-position.

Definition at line 195 of file WordAlphabet.h.

References vAbsAlph_.

std::string WordAlphabet::getName ( const std::string &  state) const throw (BadCharException) [virtual]

Get the complete name of a state given its string description.

In case of undefined characters (i.e. N and X for nucleic alphabets), this method will return the name of the undefined word.

Parameters:
state  The string description of the given state.
Returns:
The name of the state.
Exceptions:
BadCharException  When state is not a valid char description.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 186 of file WordAlphabet.cpp.

References bpp::AbstractAlphabet::getName().

Referenced by bpp::CodonAlphabet::isStop().

std::string AbstractAlphabet::getName ( int  state) const throw (BadIntException) [virtual, inherited]

Get the complete name of a state given its int description.

In case of several states with identical number (i.e. N and X for nucleic alphabets), this method returns the name of the first found in the vector.

Parameters:
state  The int description of the given state.
Returns:
The name of the state.
Exceptions:
BadIntException  When state is not a valid integer.

Implements bpp::Alphabet.

Definition at line 130 of file AbstractAlphabet.cpp.

int bpp::WordAlphabet::getNPosition ( int  word,
size_t  n 
) const throw (BadIntException) [inline]

Get the int code of the n-position of a word given its int description.

Parameters:
word  The int description of the word.
n  The position in the word (starting at 0).
Returns:
The int description of the n-position of the word.

Definition at line 233 of file WordAlphabet.h.

References bpp::AbstractAlphabet::intToChar(), and vAbsAlph_.

Referenced by bpp::SequenceContainerTools::getCodonPosition().

std::string bpp::WordAlphabet::getNPosition ( const std::string &  word,
size_t  n 
) const throw (BadCharException) [inline]

Get the char code of the n-position of a word given its char description.

Parameters:
word  The char description of the word.
n  The position in the word (starting at 0).
Returns:
The char description of the n-position of the word.

Definition at line 267 of file WordAlphabet.h.

References charToInt(), and vAbsAlph_.

unsigned int bpp::WordAlphabet::getNumberOfTypes ( ) const [inline, virtual]

Returns the number of resolved states + one for unresolved.

Implements bpp::Alphabet.

Definition at line 152 of file WordAlphabet.h.

References bpp::AbstractAlphabet::getNumberOfChars().

std::vector<int> bpp::WordAlphabet::getPositions ( int  word) const throw (BadIntException) [inline]

Get the int codes of each position of a word given its int description.

Parameters:
word  The int description of the word.
Returns:
The int description of each position of the word.

Definition at line 249 of file WordAlphabet.h.

References charToInt(), bpp::AbstractAlphabet::intToChar(), and vAbsAlph_.

Referenced by bpp::GeneticCode::isFourFoldDegenerated(), bpp::CodonSiteTools::numberOfSynonymousDifferences(), and bpp::CodonSiteTools::numberOfSynonymousPositions().

std::vector<std::string> bpp::WordAlphabet::getPositions ( const std::string &  word) const throw (BadCharException) [inline]

Get the char codes of each position of a word given its char description.

Parameters:
word  The char description of the word.
Returns:
The char description of each position of the word.

Definition at line 285 of file WordAlphabet.h.

References charToInt().
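
The char-based position accessors documented above amount to splitting a word string into its letters. A minimal sketch of that idea follows; nthPosition and allPositions are hypothetical names, std::out_of_range is used in place of the Bio++ exception classes, and the sketch assumes each position is coded by exactly one character and does not validate letters against any alphabet.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helpers, not the Bio++ implementation.
// Assumption: every position of the word is coded by one character.

// Return the letter found at position n (starting at 0).
std::string nthPosition(const std::string& word, std::size_t n)
{
  if (n >= word.size())
    throw std::out_of_range("position is outside the word");
  return word.substr(n, 1);
}

// Return the letters of every position, in order.
std::vector<std::string> allPositions(const std::string& word)
{
  std::vector<std::string> positions;
  for (std::size_t n = 0; n < word.size(); ++n)
    positions.push_back(nthPosition(word, n));
  assert(positions.size() == word.size());
  return positions;
}
```

For the word "ATG", nthPosition("ATG", 2) gives "G" and allPositions("ATG") gives the three letters "A", "T" and "G".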

unsigned int bpp::WordAlphabet::getSize ( ) const [inline, virtual]

Get the number of resolved states in the alphabet (e.g. returns 4 for a DNA alphabet). This is the method you'll need in most cases.

Returns:
The number of resolved states.

Implements bpp::Alphabet.

Definition at line 124 of file WordAlphabet.h.

References bpp::AbstractAlphabet::getNumberOfChars().

Referenced by build_(), charToInt(), and getUnknownCharacterCode().

const AlphabetState & AbstractAlphabet::getState ( const std::string &  letter) const throw (BadCharException) [virtual, inherited]

Get a state by its letter.

This method must be overloaded in specialized classes to return a reference of the correct type.

Parameters:
letter  The letter of the state to find.
Exceptions:
BadCharException  If the letter is not in the Alphabet.

Implements bpp::Alphabet.

Reimplemented in bpp::NucleicAlphabet, and bpp::ProteicAlphabet.

Definition at line 89 of file AbstractAlphabet.cpp.

Referenced by bpp::CaseMaskedAlphabet::CaseMaskedAlphabet().

const AlphabetState & AbstractAlphabet::getState ( int  num) const throw (BadIntException) [virtual, inherited]

Get a state by its num.

This method must be overloaded in specialized classes to return a reference of the correct type.

Parameters:
num  The num of the state to find.
Exceptions:
BadIntException  If the num is not in the Alphabet.

Implements bpp::Alphabet.

Reimplemented in bpp::NucleicAlphabet, and bpp::ProteicAlphabet.

Definition at line 98 of file AbstractAlphabet.cpp.

AlphabetState & AbstractAlphabet::getStateAt ( size_t  pos) throw (IndexOutOfBoundsException) [protected, virtual, inherited]

Get a state at a position in the alphabet_ vector.

This method must be overloaded in specialized classes to return a reference of the correct type.

Parameters:
pos  The index of the state in the alphabet_ vector.
Exceptions:
IndexOutOfBoundsException  If pos is out of the vector.

Definition at line 107 of file AbstractAlphabet.cpp.

Referenced by build_(), bpp::EchinodermMitochondrialCodonAlphabet::EchinodermMitochondrialCodonAlphabet(), bpp::InvertebrateMitochondrialCodonAlphabet::InvertebrateMitochondrialCodonAlphabet(), bpp::StandardCodonAlphabet::StandardCodonAlphabet(), bpp::VertebrateMitochondrialCodonAlphabet::VertebrateMitochondrialCodonAlphabet(), and bpp::YeastMitochondrialCodonAlphabet::YeastMitochondrialCodonAlphabet().

const AlphabetState & AbstractAlphabet::getStateAt ( size_t  pos) const throw (IndexOutOfBoundsException) [protected, virtual, inherited]

Get a state at a position in the alphabet_ vector.

This method must be overloaded in specialized classes to return a reference of the correct type.

Parameters:
pos  The index of the state in the alphabet_ vector.
Exceptions:
IndexOutOfBoundsException  If pos is out of the vector.

Definition at line 115 of file AbstractAlphabet.cpp.

unsigned int bpp::WordAlphabet::getStateCodingSize ( ) const [inline, virtual]

Get the size of the string coding a state.

Returns:
The size of the string coding each state in the Alphabet.
Author:
Sylvain Gaillard

Reimplemented from bpp::AbstractAlphabet.

Definition at line 324 of file WordAlphabet.h.

References vAbsAlph_.

const std::vector< std::string > & AbstractAlphabet::getSupportedChars ( ) const [virtual, inherited]
Returns:
A list of all supported character codes.

Note for developers of new alphabets: we return a const reference here since the list is supposed to be stored within the class and should not be modified outside the class.

Implements bpp::Alphabet.

Definition at line 260 of file AbstractAlphabet.cpp.

Referenced by bpp::CaseMaskedAlphabet::CaseMaskedAlphabet().

const std::vector< int > & AbstractAlphabet::getSupportedInts ( ) const [virtual, inherited]
Returns:
A list of all supported int codes.

Note for developers of new alphabets: we return a const reference here since the list is supposed to be stored within the class and should not be modified outside the class.

Implements bpp::Alphabet.

Definition at line 243 of file AbstractAlphabet.cpp.

int bpp::WordAlphabet::getUnknownCharacterCode ( ) const [inline, virtual]
Returns:
The int code for unknown characters.

Implements bpp::Alphabet.

Definition at line 158 of file WordAlphabet.h.

References getSize().

Referenced by isUnresolved().

int WordAlphabet::getWord ( const std::vector< int > &  vint,
size_t  pos = 0 
) const throw (IndexOutOfBoundsException) [virtual]

Get the int code for a word given the int code of the underlying positions.

The int code of each position must match the corresponding alphabet specified at this position.

Parameters:
vint  Description for all the positions.
pos  The start position to match in the vector.
Returns:
The int code of the word.
Exceptions:
IndexOutOfBoundsException  In case of wrong position.

Definition at line 270 of file WordAlphabet.cpp.

std::string WordAlphabet::getWord ( const std::vector< std::string > &  vpos,
size_t  pos = 0 
) const throw (IndexOutOfBoundsException, BadCharException) [virtual]

Get the char code for a word given the char code of the underlying positions.

The char code of each position must match the corresponding alphabet specified at this position.

Parameters:
vpos  Vector description for all the positions.
pos  The start position to match in the vector.
Returns:
The string of the word.
Exceptions:
IndexOutOfBoundsException  In case of wrong position.

Definition at line 286 of file WordAlphabet.cpp.
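
The string overload of getWord can be pictured as reading one letter per position from the input vector, starting at pos, and concatenating them. The sketch below illustrates that reading under stated assumptions: wordAt is a hypothetical function, the word length is passed explicitly instead of being taken from the alphabet, standard exceptions replace IndexOutOfBoundsException and BadCharException, and letters are only checked for length, not for membership in a unit alphabet.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch, not the Bio++ implementation: build the word made of
// `wordLength` consecutive letters of `letters`, starting at index `pos`.
std::string wordAt(const std::vector<std::string>& letters,
                   std::size_t wordLength,
                   std::size_t pos = 0)
{
  // The whole word must fit inside the vector.
  if (wordLength > letters.size() || pos > letters.size() - wordLength)
    throw std::out_of_range("not enough letters from this position");

  std::string word;
  for (std::size_t i = 0; i < wordLength; ++i)
  {
    // Assumption: each position is coded by one character.
    if (letters[pos + i].size() != 1)
      throw std::invalid_argument("each letter must be one character long");
    word += letters[pos + i];
  }
  assert(word.size() == wordLength);
  return word;
}
```

Given the letters A, T, G, C, C, A and a word length of 3, wordAt returns "ATG" for pos = 0 and "CCA" for pos = 3, and throws for pos = 4 because only two letters remain.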

bool WordAlphabet::hasUniqueAlphabet ( ) const

Returns true if the Alphabets of all letters in the word are of the same type.

Definition at line 140 of file WordAlphabet.cpp.

References getAlphabetType(), and vAbsAlph_.

bool AbstractAlphabet::isCharInAlphabet ( const std::string &  state) const [virtual, inherited]

Tell if a state (specified by its string description) is allowed by the the alphabet.

Parameters:
state  The string description.
Returns:
'true' if the state is known.

Implements bpp::Alphabet.

Reimplemented in bpp::LetterAlphabet.

Definition at line 161 of file AbstractAlphabet.cpp.

bool bpp::AbstractAlphabet::isGap ( int  state) const [inline, virtual, inherited]
Parameters:
state  The state to test.
Returns:
'True' if the state is a gap.

Implements bpp::Alphabet.

Reimplemented in bpp::RNY.

Definition at line 133 of file AbstractAlphabet.h.

bool bpp::AbstractAlphabet::isGap ( const std::string &  state) const [inline, virtual, inherited]
Parameters:
state  The state to test.
Returns:
'True' if the state is a gap.

Implements bpp::Alphabet.

Definition at line 134 of file AbstractAlphabet.h.

References bpp::AbstractAlphabet::charToInt().

bool AbstractAlphabet::isIntInAlphabet ( int  state) const [virtual, inherited]

Tell if a state (specified by its int description) is allowed by the the alphabet.

Parameters:
state  The int description.
Returns:
'true' if the state is known.

Implements bpp::Alphabet.

Definition at line 151 of file AbstractAlphabet.cpp.

bool bpp::WordAlphabet::isUnresolved ( int  state) const [inline, virtual]
Parameters:
state  The state to test.
Returns:
'True' if the state is unresolved.

Implements bpp::Alphabet.

Definition at line 163 of file WordAlphabet.h.

References getUnknownCharacterCode().

Referenced by bpp::CodonSiteTools::numberOfSynonymousPositions().

bool bpp::WordAlphabet::isUnresolved ( const std::string &  state) const [inline, virtual]
Parameters:
state  The state to test.
Returns:
'True' if the state is unresolved.

Implements bpp::Alphabet.

Definition at line 164 of file WordAlphabet.h.

References charToInt(), and getUnknownCharacterCode().

void AbstractAlphabet::registerState ( const AlphabetState &  st) [protected, virtual, inherited]

Add a state to the Alphabet.

Parameters:
st  The state to add.

Reimplemented in bpp::LetterAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.

Definition at line 65 of file AbstractAlphabet.cpp.

References bpp::AlphabetState::clone().

void bpp::AbstractAlphabet::remap ( ) [inline, protected, inherited]

Re-update the maps using the alphabet_ vector content.

Definition at line 207 of file AbstractAlphabet.h.

References bpp::AbstractAlphabet::alphabet_, and bpp::AbstractAlphabet::updateMaps_().

Referenced by build_().

void bpp::AbstractAlphabet::resize ( unsigned int  size) [inline, protected, inherited]

Resize the private alphabet_ vector.

Parameters:
size  The new size of the Alphabet.

Definition at line 182 of file AbstractAlphabet.h.

References bpp::AbstractAlphabet::alphabet_.

Referenced by bpp::BinaryAlphabet::BinaryAlphabet(), build_(), bpp::DefaultAlphabet::DefaultAlphabet(), and bpp::RNY::RNY().

Sequence * WordAlphabet::reverse ( const Sequence &  sequence) const throw (AlphabetMismatchException, Exception)

Translate a whole sequence from words alphabet to letters alphabet.

Parameters:
sequence  A sequence in words alphabet.
Returns:
The corresponding sequence in letters alphabet.
Exceptions:
AlphabetMismatchException  If the sequence alphabet does not match the target alphabet.
Exception  Other kind of error, depending on the implementation.

Definition at line 327 of file WordAlphabet.cpp.

References bpp::Sequence::append(), and bpp::BasicSymbolList::size().
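
The words-to-letters direction can be sketched as splitting every word into its individual letters and appending them in order. This is only an illustration: wordsToLetters is a hypothetical function, sequences are represented as vectors of strings rather than bpp::Sequence objects, and std::invalid_argument stands in for the Bio++ exceptions.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch, not the Bio++ implementation: expand a sequence of
// words into the sequence of their letters.
// Assumption: every word has exactly `wordLength` one-character positions.
std::vector<std::string> wordsToLetters(const std::vector<std::string>& words,
                                        std::size_t wordLength)
{
  std::vector<std::string> letters;
  for (std::size_t w = 0; w < words.size(); ++w)
  {
    if (words[w].size() != wordLength)
      throw std::invalid_argument("word has an unexpected length");
    // Append the letters of this word, position by position.
    for (std::size_t n = 0; n < wordLength; ++n)
      letters.push_back(std::string(1, words[w][n]));
  }
  assert(letters.size() == words.size() * wordLength);
  return letters;
}
```

Applied to the two words "ATG" and "CCA" with a word length of 3, the sketch produces the six letters A, T, G, C, C, A.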

void AbstractAlphabet::setState ( size_t  pos,
const AlphabetState &  st 
) throw (IndexOutOfBoundsException) [protected, virtual, inherited]

Set a state in the Alphabet.

Parameters:
pos  The index of the state in the alphabet_ vector.
st  The new state to put in the Alphabet.

Reimplemented in bpp::LetterAlphabet.

Definition at line 74 of file AbstractAlphabet.cpp.

Referenced by bpp::BinaryAlphabet::BinaryAlphabet(), build_(), and bpp::RNY::RNY().

Sequence * WordAlphabet::translate ( const Sequence &  sequence,
size_t  pos = 0 
) const throw (AlphabetMismatchException, Exception)

Translate a whole sequence from letters alphabet to words alphabet.

Parameters:
sequence  A sequence in letters alphabet.
pos  The start position (default 0).
Returns:
The corresponding sequence in words alphabet.
Exceptions:
AlphabetMismatchException  If the sequence alphabet does not match the source alphabet.
Exception  Other kind of error, depending on the implementation.

Definition at line 303 of file WordAlphabet.cpp.
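
The letters-to-words direction can be sketched as cutting the letter sequence into consecutive chunks of the word length, beginning at the start position. The code below is an illustration under assumptions, not the Bio++ implementation: lettersToWords is a hypothetical function, the letter sequence is a plain string with one character per letter, and trailing letters that do not fill a whole word are silently dropped here, which this page does not confirm for Bio++.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch, not the Bio++ implementation: group the letters of
// `letters` into words of `wordLength` characters, starting at `pos`.
// Assumption: an incomplete trailing word is dropped.
std::vector<std::string> lettersToWords(const std::string& letters,
                                        std::size_t wordLength,
                                        std::size_t pos = 0)
{
  if (wordLength == 0)
    throw std::invalid_argument("word length must be positive");

  std::vector<std::string> words;
  for (std::size_t i = pos; i + wordLength <= letters.size(); i += wordLength)
  {
    words.push_back(letters.substr(i, wordLength));
    assert(words.back().size() == wordLength);
  }
  return words;
}
```

For the letters "ATGCCATT" and a word length of 3, starting at 0 gives the words "ATG" and "CCA" (the last two letters are left over), while starting at 2 gives "GCC" and "ATT". The start position therefore selects the reading frame.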


Member Data Documentation

std::vector<std::string> bpp::AbstractAlphabet::charList_ [mutable, protected, inherited]

Definition at line 99 of file AbstractAlphabet.h.

std::vector<int> bpp::AbstractAlphabet::intList_ [mutable, protected, inherited]

Definition at line 100 of file AbstractAlphabet.h.

