|
bpp-seq
2.1.0
|
The Alphabet interface. More...
#include <Bpp/Seq/Alphabet/Alphabet.h>
Inheritance diagram for bpp::Alphabet: [diagram omitted]
Public Member Functions | |
| Alphabet () | |
| virtual | ~Alphabet () |
| virtual std::string | getName (int state) const =0 throw (BadIntException) |
| Get the complete name of a state given its int description. More... | |
| virtual std::string | getName (const std::string &state) const =0 throw (BadCharException) |
| Get the complete name of a state given its string description. More... | |
| virtual std::string | getAlphabetType () const =0 |
| Identification method. More... | |
| virtual unsigned int | getStateCodingSize () const =0 |
| Get the size of the string coding a state. More... | |
Tests | |
| virtual bool | isIntInAlphabet (int state) const =0 |
| Tell if a state (specified by its int description) is allowed by the alphabet. More... | |
| virtual bool | isCharInAlphabet (const std::string &state) const =0 |
| Tell if a state (specified by its string description) is allowed by the alphabet. More... | |
State access | |
| virtual const AlphabetState & | getState (int state) const =0 throw (BadIntException) |
| Get a state given its int description. More... | |
| virtual const AlphabetState & | getState (const std::string &state) const =0 throw (BadCharException) |
| Get a state given its string description. More... | |
Conversion methods | |
| virtual std::string | intToChar (int state) const =0 throw (BadIntException) |
| Give the string description of a state given its int description. More... | |
| virtual int | charToInt (const std::string &state) const =0 throw (BadCharException) |
| Give the int description of a state given its string description. More... | |
Sizes | |
| virtual unsigned int | getNumberOfChars () const =0 |
| Get the number of supported characters in this alphabet, including generic characters (e.g. returns 20 for the DNA alphabet). More... | |
| virtual unsigned int | getNumberOfTypes () const =0 |
| Get the number of distinct states in the alphabet (e.g. returns 15 for the DNA alphabet). This is the number of integers used for state description. More... | |
| virtual unsigned int | getSize () const =0 |
| Get the number of resolved states in the alphabet (e.g. returns 4 for the DNA alphabet). This is the method you'll need in most cases. More... | |
Utility methods | |
| virtual std::vector< int > | getAlias (int state) const =0 throw (BadIntException) |
| Get all resolved states that match a generic state. More... | |
| virtual std::vector< std::string > | getAlias (const std::string &state) const =0 throw (BadCharException) |
| Get all resolved states that match a generic state. More... | |
| virtual int | getGeneric (const std::vector< int > &states) const =0 throw (BadIntException) |
| Get the generic state that matches a set of states. More... | |
| virtual std::string | getGeneric (const std::vector< std::string > &states) const =0 throw (AlphabetException) |
| Get the generic state that matches a set of states. More... | |
| virtual const std::vector< int > & | getSupportedInts () const =0 |
| virtual const std::vector < std::string > & | getSupportedChars () const =0 |
| virtual int | getUnknownCharacterCode () const =0 |
| virtual int | getGapCharacterCode () const =0 |
| virtual bool | isGap (int state) const =0 |
| virtual bool | isGap (const std::string &state) const =0 |
| virtual bool | isUnresolved (int state) const =0 |
| virtual bool | isUnresolved (const std::string &state) const =0 |
The Alphabet interface.
An alphabet object defines all the states allowed for a particular type of sequence. Each state is coded both as a string and as an integer. The string description is the one found in the human-readable text description of sequences, typically in sequence files. For computational purposes, however, it is often more efficient to store sequences as vectors of integers. The Alphabet classes link the two descriptions through the methods intToChar() and charToInt(). The Alphabet interface also provides other methods, such as getting the full name of a state.
Alphabet objects may throw several exceptions derived from the AlphabetException class.
Definition at line 121 of file Alphabet.h.
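The two-way coding described above can be illustrated with a minimal, self-contained sketch. ToyAlphabet is a hypothetical class written for this example only; it is not part of bpp-seq and does not derive from bpp::Alphabet. The integer codes and the use of std::invalid_argument (in place of BadCharException and BadIntException) are assumptions of the sketch.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical toy class: it only mimics the idea behind charToInt() and
// intToChar(), and is not the Bio++ implementation.
class ToyAlphabet {
public:
  ToyAlphabet() {
    // Assumed coding for this sketch: gap = -1, A = 0, C = 1, G = 2, T = 3.
    add(-1, "-");
    add(0, "A");
    add(1, "C");
    add(2, "G");
    add(3, "T");
  }

  // String description -> int description.
  int charToInt(const std::string& state) const {
    auto it = charToInt_.find(state);
    if (it == charToInt_.end())
      throw std::invalid_argument("unknown char state: " + state);
    return it->second;
  }

  // Int description -> string description.
  std::string intToChar(int state) const {
    auto it = intToChar_.find(state);
    if (it == intToChar_.end())
      throw std::invalid_argument("unknown int state: " + std::to_string(state));
    return it->second;
  }

  // Encode a text sequence as a vector of ints, one letter per state.
  std::vector<int> encode(const std::string& text) const {
    std::vector<int> out;
    out.reserve(text.size());
    for (char c : text) out.push_back(charToInt(std::string(1, c)));
    return out;
  }

private:
  // Register one state in both lookup directions.
  void add(int code, const std::string& letter) {
    intToChar_[code] = letter;
    charToInt_[letter] = code;
  }

  std::map<int, std::string> intToChar_;
  std::map<std::string, int> charToInt_;
};
```

With this sketch, encoding a text sequence yields one integer per letter, and passing a letter that was never registered raises an exception, which mirrors the role the exception specifications play in the interface.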
|
inline |
Definition at line 124 of file Alphabet.h.
|
inlinevirtual |
Definition at line 125 of file Alphabet.h.
|
pure virtual |
Give the int description of a state given its string description.
| state | The string description. |
| BadCharException | When state is not a valid char description. |
Implemented in bpp::AbstractAlphabet, bpp::WordAlphabet, bpp::LetterAlphabet, and bpp::RNY.
|
pure virtual |
Get all resolved states that match a generic state.
If the given state is not a generic code then the output vector will contain this unique code.
| state | The alias to resolve. |
| BadIntException | When state is not a valid integer. |
Implemented in bpp::WordAlphabet, bpp::AbstractAlphabet, bpp::ProteicAlphabet, bpp::RNY, bpp::DNA, and bpp::RNA.
Referenced by bpp::SymbolListTools::getCounts(), bpp::SequenceTools::getPutativeHaplotypes(), bpp::AlphabetTools::match(), bpp::AlphabetTools::match(), and bpp::SequenceTools::subtractHaplotype().
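The behaviour described for alias resolution (a generic code expands to its resolved states, any other code is returned unchanged as a one-element vector) might look like the following sketch. The function toyGetAlias and its integer codes are hypothetical and chosen for illustration; they do not reproduce the coding used by bpp::DNA, and std::out_of_range stands in for BadIntException.

```cpp
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical codes for this sketch only: A=0, C=1, G=2, T=3 are resolved;
// R=4 (A or G), Y=5 (C or T) and N=6 (any base) are generic.
std::vector<int> toyGetAlias(int state) {
  static const std::map<int, std::vector<int>> generic = {
    {4, {0, 2}},
    {5, {1, 3}},
    {6, {0, 1, 2, 3}},
  };
  if (state < 0 || state > 6)
    throw std::out_of_range("not a valid int state");
  auto it = generic.find(state);
  if (it != generic.end()) return it->second;
  // Not a generic code: the output contains this unique code.
  return std::vector<int>(1, state);
}
```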
|
pure virtual |
Get all resolved states that match a generic state.
If the given state is not a generic code then the output vector will contain this unique code.
| state | The alias to resolve. |
| BadCharException | When state is not a valid char description. |
Implemented in bpp::WordAlphabet, bpp::AbstractAlphabet, bpp::ProteicAlphabet, bpp::RNY, bpp::DNA, and bpp::RNA.
|
pure virtual |
Identification method.
Used to tell if two alphabets describe the same type of sequences. For instance, this method is used by sequence containers to compare two alphabets and allow or deny addition of sequences.
Implemented in bpp::WordAlphabet, bpp::ProteicAlphabet, bpp::CodonAlphabet, bpp::RNY, bpp::CaseMaskedAlphabet, bpp::DNA, bpp::RNA, bpp::DefaultAlphabet, bpp::BinaryAlphabet, bpp::InvertebrateMitochondrialCodonAlphabet, bpp::YeastMitochondrialCodonAlphabet, bpp::EchinodermMitochondrialCodonAlphabet, bpp::VertebrateMitochondrialCodonAlphabet, and bpp::StandardCodonAlphabet.
Referenced by bpp::SiteTools::areSitesIdentical(), bpp::SiteTools::areSitesIdentical(), bpp::SequenceTools::invertComplement(), and bpp::SequenceTools::invertComplement().
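As an illustration of the container use case mentioned above, the sketch below compares type strings before accepting a sequence. ToyAlphabet, ToySequence and ToyContainer are hypothetical stand-ins written for this example; the real bpp-seq containers and their exception types are not shown here, and the type strings are made up.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an alphabet: it only carries a type string.
struct ToyAlphabet {
  std::string type;
  std::string getAlphabetType() const { return type; }
};

// Hypothetical stand-in for a sequence bound to an alphabet.
struct ToySequence {
  const ToyAlphabet* alphabet;
  std::string name;
};

class ToyContainer {
public:
  explicit ToyContainer(const ToyAlphabet* alphabet) : alphabet_(alphabet) {}

  // Accept the sequence only if both alphabets report the same type.
  void addSequence(const ToySequence& seq) {
    if (seq.alphabet->getAlphabetType() != alphabet_->getAlphabetType())
      throw std::invalid_argument("alphabet mismatch for sequence " + seq.name);
    sequences_.push_back(seq);
  }

  std::size_t size() const { return sequences_.size(); }

private:
  const ToyAlphabet* alphabet_;
  std::vector<ToySequence> sequences_;
};
```

Comparing type strings rather than object addresses lets two distinct alphabet instances of the same kind be treated as compatible.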
|
pure virtual |
Implemented in bpp::AbstractAlphabet.
Referenced by bpp::SymbolListTools::changeUnresolvedCharactersToGaps(), bpp::SiteContainerTools::changeUnresolvedCharactersToGaps(), bpp::SiteContainerTools::computeSimilarity(), bpp::SequenceTools::getPutativeHaplotypes(), bpp::SiteContainerTools::getSequencePositions(), bpp::SequenceWithAnnotation::setToSizeL(), bpp::BasicSequence::setToSizeL(), bpp::SequenceWithAnnotation::setToSizeR(), bpp::BasicSequence::setToSizeR(), and bpp::AbstractTransliterator::translate().
|
pure virtual |
Get the generic state that matches a set of states.
If the given states contain generic codes, each generic code is first resolved and then the new generic state is returned. If only a single resolved state is given, the function returns this state.
| states | A vector of states to resolve. |
| BadIntException | When a state is not a valid integer. |
Implemented in bpp::WordAlphabet, bpp::AbstractAlphabet, bpp::ProteicAlphabet, bpp::DNA, and bpp::RNA.
Referenced by bpp::SequenceTools::combineSequences(), bpp::PhredPoly::nextSequence(), and bpp::SequenceTools::subtractHaplotype().
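The two steps described above (resolve every input state, then find the generic state covering the union) can be sketched as follows. The function toyGetGeneric, its reduced code table and the fallback to an "any base" code are assumptions made for this example; a full nucleotide alphabet has a code for every combination, and the actual Bio++ implementations may proceed differently.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical reduced table: A=0, C=1, G=2, T=3 are resolved;
// R=4 {A,G}, Y=5 {C,T} and N=6 {A,C,G,T} are generic.
static const std::map<int, std::set<int>>& toyAliasTable() {
  static const std::map<int, std::set<int>> table = {
    {0, {0}}, {1, {1}}, {2, {2}}, {3, {3}},
    {4, {0, 2}}, {5, {1, 3}}, {6, {0, 1, 2, 3}},
  };
  return table;
}

int toyGetGeneric(const std::vector<int>& states) {
  const auto& table = toyAliasTable();
  // Step 1: resolve every input state and take the union.
  std::set<int> resolved;
  for (int s : states) {
    auto it = table.find(s);
    if (it == table.end()) throw std::out_of_range("not a valid int state");
    resolved.insert(it->second.begin(), it->second.end());
  }
  // Step 2: look for the code whose resolved set equals the union.
  for (const auto& entry : table)
    if (entry.second == resolved) return entry.first;
  // Assumption of this sketch: with no exact match in the reduced table
  // (or with an empty input), fall back to the "any base" code.
  return 6;
}
```

A single resolved state comes back unchanged because its own table entry is the first exact match.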
|
pure virtual |
Get the generic state that matches a set of states.
If the given states contain generic codes, each generic code is first resolved and then the new generic state is returned. If only a single resolved state is given, the function returns this state.
| states | A vector of states to resolve. |
| BadCharException | When a state is not a valid char description. |
| CharStateNotSupportedException | When the alphabet does not support Char state for unresolved state. |
Implemented in bpp::WordAlphabet, bpp::AbstractAlphabet, bpp::ProteicAlphabet, bpp::DNA, and bpp::RNA.
|
pure virtual |
Get the complete name of a state given its int description.
If several states share the same number (e.g. N and X for nucleic alphabets), this method returns the name of the first one found in the vector.
| state | The int description of the given state. |
| BadIntException | When state is not a valid integer. |
Implemented in bpp::AbstractAlphabet.
Referenced by bpp::SequenceTools::subtractHaplotype().
|
pure virtual |
Get the complete name of a state given its string description.
If several states share the same number (e.g. N and X for nucleic alphabets), this method returns the name of the first one found in the vector.
| state | The string description of the given state. |
| BadCharException | When state is not a valid char description. |
Implemented in bpp::AbstractAlphabet, and bpp::WordAlphabet.
|
pure virtual |
Get the number of supported characters in this alphabet, including generic characters (e.g. returns 20 for the DNA alphabet).
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Get the number of distinct states in the alphabet (e.g. returns 15 for the DNA alphabet). This is the number of integers used for state description.
Implemented in bpp::NucleicAlphabet, bpp::WordAlphabet, bpp::ProteicAlphabet, bpp::RNY, bpp::CaseMaskedAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.
Referenced by bpp::CaseMaskedAlphabet::getNumberOfTypes().
|
pure virtual |
Get the number of resolved states in the alphabet (e.g. returns 4 for the DNA alphabet). This is the method you'll need in most cases.
Implemented in bpp::NucleicAlphabet, bpp::WordAlphabet, bpp::ProteicAlphabet, bpp::RNY, bpp::CaseMaskedAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.
Referenced by bpp::SequenceTools::bowkerTest(), bpp::SequenceContainerTools::getFrequencies(), bpp::SequenceContainerTools::getFrequencies(), bpp::SequenceApplicationTools::getSitesToAnalyse(), bpp::CaseMaskedAlphabet::getSize(), and bpp::SimpleScore::SimpleScore().
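One reason the number of resolved states is the size most often needed is that per-state tables, such as frequency vectors, are usually dimensioned by it. The sketch below assumes that resolved states are coded 0 to size-1 and that every other code (gap, generic, unknown) is skipped. toyFrequencies is a hypothetical helper, not a bpp-seq function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: `size` plays the role of the value reported by
// getSize(). Assumption of this sketch: resolved states are coded
// 0 .. size-1, and all other codes are ignored.
std::vector<double> toyFrequencies(const std::vector<int>& seq, unsigned int size) {
  std::vector<double> freqs(size, 0.0);
  std::size_t counted = 0;
  for (int s : seq) {
    if (s >= 0 && static_cast<unsigned int>(s) < size) {
      freqs[static_cast<std::size_t>(s)] += 1.0;
      ++counted;
    }
  }
  // Normalise the counts; leave all zeros if nothing was counted.
  if (counted > 0)
    for (double& f : freqs) f /= static_cast<double>(counted);
  return freqs;
}
```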
|
pure virtual |
Get a state given its int description.
| state | The int description. |
| BadIntException | When state is not a valid integer. |
Implemented in bpp::AbstractAlphabet, bpp::NucleicAlphabet, and bpp::ProteicAlphabet.
|
pure virtual |
Get a state given its string description.
| state | The string description. |
| BadCharException | When state is not a valid string. |
Implemented in bpp::AbstractAlphabet, bpp::NucleicAlphabet, and bpp::ProteicAlphabet.
|
pure virtual |
Get the size of the string coding a state.
Implemented in bpp::WordAlphabet, and bpp::AbstractAlphabet.
Referenced by bpp::SequenceWithQuality::append(), bpp::SequenceWithQuality::append(), bpp::MaseTools::getSelectedSites(), bpp::Phylip::writeInterleaved(), and bpp::Phylip::writeSequential().
|
pure virtual |
Note for developers of new alphabets: we return a const reference here since the list is supposed to be stored within the class and should not be modified outside the class.
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Note for developers of new alphabets: we return a const reference here since the list is supposed to be stored within the class and should not be modified outside the class.
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Implemented in bpp::NucleicAlphabet, bpp::WordAlphabet, bpp::ProteicAlphabet, bpp::RNY, bpp::CaseMaskedAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.
Referenced by bpp::SymbolListTools::changeGapsToUnknownCharacters(), bpp::SiteContainerTools::changeGapsToUnknownCharacters(), bpp::CaseMaskedAlphabet::getUnknownCharacterCode(), bpp::SiteTools::hasUnknown(), and bpp::SequenceTools::subtractHaplotype().
|
pure virtual |
Give the string description of a state given its int description.
| state | The int description. |
| BadIntException | When state is not a valid integer. |
Implemented in bpp::AbstractAlphabet, and bpp::RNY.
Referenced by bpp::SequenceTools::subtractHaplotype(), and bpp::SequenceTools::subtractHaplotype().
|
pure virtual |
Tell if a state (specified by its string description) is allowed by the alphabet.
| state | The string description. |
Implemented in bpp::AbstractAlphabet, and bpp::LetterAlphabet.
|
pure virtual |
| state | The state to test. |
Implemented in bpp::AbstractAlphabet, and bpp::RNY.
Referenced by bpp::SequenceTools::bowkerTest(), bpp::SequenceTools::bowkerTest(), bpp::SymbolListTools::changeGapsToUnknownCharacters(), bpp::SiteContainerTools::changeGapsToUnknownCharacters(), bpp::SiteContainerTools::computeSimilarity(), bpp::SiteContainerTools::computeSimilarity(), bpp::SiteContainerTools::computeSimilarity(), bpp::SiteContainerTools::computeSimilarity(), bpp::SiteContainerTools::computeSimilarity(), bpp::SiteContainerTools::computeSimilarity(), bpp::SequenceTools::getNumberOfCompleteSites(), bpp::SequenceTools::getNumberOfSites(), bpp::SequenceTools::getSequenceWithoutGaps(), bpp::SiteTools::hasGap(), bpp::SiteTools::isComplete(), bpp::SiteTools::isGapOnly(), bpp::SiteTools::isGapOrUnresolvedOnly(), bpp::SequenceWithQualityTools::removeGaps(), and bpp::SequenceTools::removeGaps().
|
pure virtual |
| state | The state to test. |
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Tell if a state (specified by its int description) is allowed by the alphabet.
| state | The int description. |
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
| state | The state to test. |
Implemented in bpp::NucleicAlphabet, bpp::WordAlphabet, bpp::ProteicAlphabet, bpp::RNY, bpp::CaseMaskedAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.
Referenced by bpp::SequenceTools::bowkerTest(), bpp::SequenceTools::bowkerTest(), bpp::SymbolListTools::changeUnresolvedCharactersToGaps(), bpp::SiteContainerTools::changeUnresolvedCharactersToGaps(), bpp::SiteContainerTools::computeSimilarity(), bpp::SiteContainerTools::computeSimilarity(), bpp::SequenceTools::getNumberOfCompleteSites(), bpp::SequenceTools::getNumberOfUnresolvedSites(), bpp::SiteTools::isComplete(), bpp::SiteTools::isGapOrUnresolvedOnly(), bpp::CaseMaskedAlphabet::isUnresolved(), bpp::CaseMaskedAlphabet::isUnresolved(), bpp::SequenceTools::subtractHaplotype(), and bpp::SequenceTools::subtractHaplotype().
|
pure virtual |
| state | The state to test. |
Implemented in bpp::NucleicAlphabet, bpp::WordAlphabet, bpp::ProteicAlphabet, bpp::RNY, bpp::CaseMaskedAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.