bpp-seq  2.1.0
bpp::RNA Class Reference

This alphabet is used to deal with RNA sequences.

#include <Bpp/Seq/Alphabet/RNA.h>

Public Member Functions

 RNA (bool exclamationMarkCountsAsGap=false)
virtual ~RNA ()
std::vector< int > getAlias (int state) const throw (BadIntException)
 Get all resolved states that match a generic state.
std::vector< std::string > getAlias (const std::string &state) const throw (BadCharException)
 Get all resolved states that match a generic state.
int getGeneric (const std::vector< int > &states) const throw (BadIntException)
 Get the generic state that matches a set of states.
std::string getGeneric (const std::vector< std::string > &states) const throw (BadCharException)
 Get the generic state that matches a set of states.
std::string getAlphabetType () const
 Identification method.
unsigned int getSize () const
 Get the number of resolved states in the alphabet (e.g. return 4 for DNA alphabet). This is the method you'll need in most cases.
unsigned int getNumberOfTypes () const
 Get the number of distinct states in alphabet (e.g. return 15 for DNA alphabet). This is the number of integers used for state description.
int getUnknownCharacterCode () const
bool isUnresolved (int state) const
bool isUnresolved (const std::string &state) const
bool isCharInAlphabet (char state) const
bool isCharInAlphabet (const std::string &state) const
 Tell if a state (specified by its string description) is allowed by the alphabet.
int charToInt (const std::string &state) const throw (BadCharException)
 Give the int description of a state given its string description.
Specific methods
const NucleicAlphabetState & getStateByBinCode (int code) const throw (BadIntException)
 Get a state by its binary representation.
int subtract (int s1, int s2) const throw (BadIntException)
 Subtract states.
std::string subtract (const std::string &s1, const std::string &s2) const throw (BadCharException)
 Subtract states.
int getOverlap (int s1, int s2) const throw (BadIntException)
 Get the overlap between two states.
std::string getOverlap (const std::string &s1, const std::string &s2) const throw (BadCharException)
 Get the overlap between two states.
Implement these methods from the Alphabet interface.
unsigned int getNumberOfChars () const
 Get the number of supported characters in this alphabet, including generic characters (e.g. return 20 for DNA alphabet).
std::string getName (const std::string &state) const throw (BadCharException)
 Get the complete name of a state given its string description.
std::string getName (int state) const throw (BadIntException)
 Get the complete name of a state given its int description.
std::string intToChar (int state) const throw (BadIntException)
 Give the string description of a state given its int description.
bool isIntInAlphabet (int state) const
 Tell if a state (specified by its int description) is allowed by the alphabet.
const std::vector< int > & getSupportedInts () const
const std::vector< std::string > & getSupportedChars () const
int getGapCharacterCode () const
bool isGap (int state) const
bool isGap (const std::string &state) const

Protected Member Functions

void registerState (const AlphabetState &st)
 Add a state to the Alphabet.
void setState (size_t pos, const AlphabetState &st) throw (IndexOutOfBoundsException)
 Set a state in the Alphabet.
virtual AlphabetState & getStateAt (size_t pos) throw (IndexOutOfBoundsException)
 Get a state at a position in the alphabet_ vector.
virtual const AlphabetState & getStateAt (size_t pos) const throw (IndexOutOfBoundsException)
 Get a state at a position in the alphabet_ vector.
void resize (unsigned int size)
 Resize the private alphabet_ vector.
void remap ()
 Re-update the maps using the alphabet_ vector content.
unsigned int getStateCodingSize () const
 Get the size of the string coding a state.

Protected Attributes

Available codes

These vectors will be computed the first time you call the getSupportedInts or getSupportedChars method.

std::vector< std::string > charList_
std::vector< int > intList_

Overloaded methods from AbstractAlphabet

const NucleicAlphabetState & getState (const std::string &letter) const throw (BadCharException)
 Get a state by its letter.
const NucleicAlphabetState & getState (int num) const throw (BadIntException)
 Get a state by its num.
void registerState (const NucleicAlphabetState &st)
void setState (unsigned int pos, const NucleicAlphabetState &st)
const NucleicAlphabetState & getStateAt (unsigned int pos) const throw (IndexOutOfBoundsException)
NucleicAlphabetState & getStateAt (unsigned int pos) throw (IndexOutOfBoundsException)

Detailed Description

This alphabet is used to deal with RNA sequences.

It supports all 4 nucleotides (A, U, G and C) with their standard denomination. Gaps are coded by '-', and unresolved characters are coded by 'X', 'N', 'O', '0' or '?'. Extensive support for generic characters (e.g. 'R', 'Y', etc.) is provided.

Definition at line 58 of file RNA.h.


Constructor & Destructor Documentation

RNA::RNA ( bool  exclamationMarkCountsAsGap = false)
Parameters:
exclamationMarkCountsAsGap  If yes, '!' characters are replaced by gaps. Otherwise, they are counted as unknown characters.

Definition at line 55 of file RNA.cpp.
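
As a rough illustration of the exclamationMarkCountsAsGap flag, the sketch below classifies a single character along the lines of the class description. The CharKind enum and the classifyRnaChar function are hypothetical names made up for this example and are not part of bpp-seq; only the character sets are taken from the description above.

```cpp
#include <cassert>
#include <string>

// Hypothetical categories, for illustration only (not a bpp-seq type).
enum class CharKind { Resolved, Gap, Unknown, Other };

// Classify one character following the class description:
// A, U, G, C are resolved; '-' is a gap; X, N, O, 0 and ? are unresolved.
// '!' is a gap when the flag is set and an unknown character otherwise.
CharKind classifyRnaChar(char c, bool exclamationMarkCountsAsGap = false) {
  if (c == '!')
    return exclamationMarkCountsAsGap ? CharKind::Gap : CharKind::Unknown;
  if (c == '-') return CharKind::Gap;
  if (std::string("AUGC").find(c) != std::string::npos) return CharKind::Resolved;
  if (std::string("XNO0?").find(c) != std::string::npos) return CharKind::Unknown;
  return CharKind::Other;  // generic codes and unsupported characters are not modelled here
}
```

With the default flag, classifyRnaChar('!') yields CharKind::Unknown; passing true as the second argument yields CharKind::Gap.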

virtual bpp::RNA::~RNA ( ) [inline, virtual]

Definition at line 67 of file RNA.h.


Member Function Documentation

std::vector< int > RNA::getAlias ( int  state) const throw (BadIntException) [virtual]

Get all resolved states that match a generic state.

If the given state is not a generic code then the output vector will contain this unique code.

Parameters:
state  The alias to resolve.
Returns:
A vector of resolved states.
Exceptions:
BadIntException  When state is not a valid integer.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 90 of file RNA.cpp.

References bpp::NucleicAlphabetState::getBinaryCode().
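
The reference to getBinaryCode() suggests that alias resolution works on a binary representation of the states. The following is a minimal sketch of that idea, assuming one bit per resolved nucleotide (A=1, C=2, G=4, U=8). The bit layout and the resolveAlias function are assumptions for illustration, not the bpp-seq implementation.

```cpp
#include <cassert>
#include <vector>

// Assumed bit layout for this sketch only: one bit per resolved nucleotide.
const int A = 1, C = 2, G = 4, U = 8;

// Return the resolved states contained in a bit mask.
// A resolved state yields a one-element vector holding that state.
std::vector<int> resolveAlias(int mask) {
  std::vector<int> resolved;
  for (int bit : {A, C, G, U}) {
    if (mask & bit) resolved.push_back(bit);
  }
  return resolved;
}
```

Under these assumptions, resolveAlias(A | C) returns the two masks A and C, and resolveAlias(G) returns a vector holding only G.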

std::vector< std::string > RNA::getAlias ( const std::string &  state) const throw (BadCharException) [virtual]

Get all resolved states that match a generic state.

If the given state is not a generic code then the output vector will contain this unique code.

Parameters:
state  The alias to resolve.
Returns:
A vector of resolved states.
Exceptions:
BadCharException  When state is not a valid char description.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 112 of file RNA.cpp.

References bpp::TextTools::toUpper().

std::string bpp::RNA::getAlphabetType ( ) const [inline, virtual]

Identification method.

Used to tell if two alphabets describe the same type of sequences. For instance, this method is used by sequence containers to compare two alphabets and allow or deny addition of sequences.

Returns:
A text describing the alphabet.

Implements bpp::Alphabet.

Definition at line 74 of file RNA.h.
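
To make the "allow or deny addition of sequences" use case concrete, here is a small sketch of a container that compares alphabet type strings before accepting a sequence. SketchSequence, SketchContainer and the type strings used with them are hypothetical stand-ins; the real bpp-seq containers and the exact text returned by getAlphabetType() are not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a sequence; not a bpp-seq class.
struct SketchSequence {
  std::string alphabetType;  // what getAlphabetType() would return
  std::string content;
};

// Hypothetical container that only accepts sequences of one alphabet type.
class SketchContainer {
public:
  explicit SketchContainer(const std::string& alphabetType)
      : alphabetType_(alphabetType) {}

  // Add the sequence and return true only if the alphabet types match.
  bool tryAdd(const SketchSequence& seq) {
    if (seq.alphabetType != alphabetType_) return false;
    sequences_.push_back(seq);
    return true;
  }

  std::size_t size() const { return sequences_.size(); }

private:
  std::string alphabetType_;
  std::vector<SketchSequence> sequences_;
};
```

A container built with one type string accepts sequences carrying the same string and leaves its content unchanged when the strings differ.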

int bpp::AbstractAlphabet::getGapCharacterCode ( ) const [inline, virtual, inherited]
Returns:
The int code for gap characters.

Implements bpp::Alphabet.

Definition at line 132 of file AbstractAlphabet.h.

Referenced by bpp::SequenceTools::replaceStopsWithGaps().

int RNA::getGeneric ( const std::vector< int > &  states) const throw (BadIntException) [virtual]

Get the generic state that matches a set of states.

If the given states contain generic codes, each generic code is first resolved and then the new generic state is returned. If only a single resolved state is given, the function returns this state.

Parameters:
states  A vector of states to resolve.
Returns:
An int code for the computed state.
Exceptions:
BadIntException  When a state is not a valid integer.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 125 of file RNA.cpp.
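
The combination step described above can be sketched with bit masks, again assuming one bit per resolved nucleotide (A=1, C=2, G=4, U=8). The layout and the combineStates function are assumptions for illustration and do not reproduce the code in RNA.cpp.

```cpp
#include <cassert>
#include <vector>

// Assumed bit layout for this sketch only: one bit per resolved nucleotide.
const int A = 1, C = 2, G = 4, U = 8;

// Combine states into one generic state by OR-ing their masks.
// A generic input is already a union of bits, so OR-ing it in has the
// same effect as resolving it first and then combining the parts.
int combineStates(const std::vector<int>& states) {
  int mask = 0;
  for (int s : states) mask |= s;
  return mask;
}
```

A single resolved state comes back unchanged, for instance combineStates({A}) is A, which mirrors the last sentence of the description.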

std::string RNA::getGeneric ( const std::vector< std::string > &  states) const throw (BadCharException) [virtual]

Get the generic state that matches a set of states.

If the given states contain generic codes, each generic code is first resolved and then the new generic state is returned. If only a single resolved state is given, the function returns this state.

Parameters:
states  A vector of states to resolve.
Returns:
A string code for the computed state.
Exceptions:
BadCharException  When a state is not a valid char description.
CharStateNotSupportedException  When the alphabet does not support Char state for unresolved state.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 137 of file RNA.cpp.

std::string AbstractAlphabet::getName ( const std::string &  state) const throw (BadCharException) [virtual, inherited]

Get the complete name of a state given its string description.

In case of several states with identical number (e.g. N and X for nucleic alphabets), this method will return the name of the first found in the vector.

Parameters:
state  The string description of the given state.
Returns:
The name of the state.
Exceptions:
BadCharException  When state is not a valid char description.

Implements bpp::Alphabet.

Reimplemented in bpp::WordAlphabet.

Definition at line 123 of file AbstractAlphabet.cpp.

Referenced by bpp::WordAlphabet::getName().

std::string AbstractAlphabet::getName ( int  state) const throw (BadIntException) [virtual, inherited]

Get the complete name of a state given its int description.

In case of several states with identical number (e.g. N and X for nucleic alphabets), this method returns the name of the first found in the vector.

Parameters:
state  The int description of the given state.
Returns:
The name of the state.
Exceptions:
BadIntException  When state is not a valid integer.

Implements bpp::Alphabet.

Definition at line 130 of file AbstractAlphabet.cpp.
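
The "first found in the vector" rule can be pictured with a linear search. The sketch below uses a hypothetical SketchState struct and firstNameForNum function. Unlike the documented method, it returns an empty string instead of throwing when nothing matches.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical state record for this sketch; not the bpp-seq AlphabetState.
struct SketchState {
  int num;
  std::string letter;
  std::string name;
};

// Return the name of the first state whose number matches.
// Returns an empty string when no state matches (the documented
// method throws BadIntException in that situation).
std::string firstNameForNum(const std::vector<SketchState>& states, int num) {
  for (const SketchState& st : states) {
    if (st.num == num) return st.name;
  }
  return "";
}
```

When two entries share a number, only the position in the vector decides which name is reported.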

unsigned int bpp::NucleicAlphabet::getNumberOfTypes ( ) const [inline, virtual, inherited]

Get the number of distinct states in alphabet (e.g. return 15 for DNA alphabet). This is the number of integers used for state description.

Returns:
The number of distinct states.

Implements bpp::Alphabet.

Definition at line 247 of file NucleicAlphabet.h.
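
One way to relate the 15 distinct states to the 4 resolved states is to count the non-empty sets of resolved nucleotides. This reading is an assumption, since the page does not say how the figure is derived or whether the gap is counted. The countNonEmptySubsets function is a hypothetical helper for the arithmetic.

```cpp
#include <cassert>

// Number of non-empty subsets of n resolved states: 2^n - 1.
// Only meaningful for n smaller than the bit width of unsigned int.
unsigned int countNonEmptySubsets(unsigned int n) {
  return (1u << n) - 1u;
}
```

For n = 4 this gives 2^4 - 1 = 15, which matches the value quoted for the DNA alphabet.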

int bpp::NucleicAlphabet::getOverlap ( int  s1,
int  s2 
) const throw (BadIntException) [inline, inherited]

Get the overlap between two states.

Get the states shared by two states.

 int m = alpha->charToInt("M");
 int r = alpha->charToInt("R");
 int a = alpha->getOverlap(m, r);

 cout << alpha->intToChar(a) << endl;

 // should print A because M = A/C and R = A/G
Parameters:
s1  the first state as an int
s2  the second state as an int
Exceptions:
BadIntException  if one of the states is not valid
Returns:
The overlapping state
Author:
Sylvain Gaillard

Definition at line 212 of file NucleicAlphabet.h.

References bpp::AlphabetState::getNum(), bpp::NucleicAlphabet::getState(), and bpp::NucleicAlphabet::getStateByBinCode().

Referenced by bpp::NucleicAlphabet::getOverlap().
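
Since this method relies on getStateByBinCode(), the overlap can be thought of as a bitwise AND of two binary codes. The sketch below assumes one bit per resolved nucleotide (A=1, C=2, G=4, U=8) and uses a hypothetical overlap function. It does not model how the library reports an empty overlap.

```cpp
#include <cassert>

// Assumed bit layout for this sketch only: one bit per resolved nucleotide.
const int A = 1, C = 2, G = 4, U = 8;
const int M = A | C;  // M = A/C
const int R = A | G;  // R = A/G

// States shared by s1 and s2; 0 means the two states have nothing in common.
int overlap(int s1, int s2) {
  return s1 & s2;
}
```

With these masks, overlap(M, R) equals A, in line with the documented example.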

std::string bpp::NucleicAlphabet::getOverlap ( const std::string &  s1,
const std::string &  s2 
) const throw (BadCharException) [inline, inherited]

Get the overlap between two states.

Get the states shared by two states.

 string m = "M";
 string r = "R";

 cout << alpha->getOverlap(m, r) << endl;

 // should print A because M = A/C and R = A/G
Parameters:
s1  the first state as a string
s2  the second state as a string
Exceptions:
BadCharException  if one of the states is not valid
Returns:
The overlapping state
Author:
Sylvain Gaillard

Definition at line 236 of file NucleicAlphabet.h.

References bpp::LetterAlphabet::charToInt(), bpp::NucleicAlphabet::getOverlap(), and bpp::AbstractAlphabet::intToChar().

unsigned int bpp::NucleicAlphabet::getSize ( ) const [inline, virtual, inherited]

Get the number of resolved states in the alphabet (e.g. return 4 for DNA alphabet). This is the method you'll need in most cases.

Returns:
The number of resolved states.

Implements bpp::Alphabet.

Definition at line 244 of file NucleicAlphabet.h.

const NucleicAlphabetState& bpp::NucleicAlphabet::getState ( const std::string &  letter) const throw (BadCharException) [inline, virtual, inherited]

Get a state by its letter.

This method must be overloaded in specialized classes to send back a reference of the correct type.

Parameters:
letter  The letter of the state to find.
Exceptions:
BadCharException  If the letter is not in the Alphabet.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 107 of file NucleicAlphabet.h.

Referenced by bpp::NucleicAlphabet::getOverlap(), bpp::NucleicAlphabet::getState(), and bpp::NucleicAlphabet::subtract().

const NucleicAlphabetState& bpp::NucleicAlphabet::getState ( int  num) const throw (BadIntException) [inline, virtual, inherited]

Get a state by its num.

This method must be overloaded in specialized classes to send back a reference of the correct type.

Parameters:
num  The num of the state to find.
Exceptions:
BadIntException  If the num is not in the Alphabet.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 113 of file NucleicAlphabet.h.

References bpp::NucleicAlphabet::getState().

const NucleicAlphabetState& bpp::NucleicAlphabet::getStateAt ( unsigned int  pos) const throw (IndexOutOfBoundsException) [inline, protected, inherited]
NucleicAlphabetState& bpp::NucleicAlphabet::getStateAt ( unsigned int  pos) throw (IndexOutOfBoundsException) [inline, protected, inherited]

Definition at line 94 of file NucleicAlphabet.h.

References bpp::NucleicAlphabet::getStateAt().

AlphabetState & AbstractAlphabet::getStateAt ( size_t  pos) throw (IndexOutOfBoundsException) [protected, virtual, inherited]

Get a state at a position in the alphabet_ vector.

This method must be overloaded in specialized classes to send back a reference of the correct type.

Parameters:
pos  The index of the state in the alphabet_ vector.
Exceptions:
IndexOutOfBoundsException  If pos is out of the vector.

Definition at line 107 of file AbstractAlphabet.cpp.

Referenced by bpp::WordAlphabet::build_(), bpp::EchinodermMitochondrialCodonAlphabet::EchinodermMitochondrialCodonAlphabet(), bpp::InvertebrateMitochondrialCodonAlphabet::InvertebrateMitochondrialCodonAlphabet(), bpp::StandardCodonAlphabet::StandardCodonAlphabet(), bpp::VertebrateMitochondrialCodonAlphabet::VertebrateMitochondrialCodonAlphabet(), and bpp::YeastMitochondrialCodonAlphabet::YeastMitochondrialCodonAlphabet().

const AlphabetState & AbstractAlphabet::getStateAt ( size_t  pos) const throw (IndexOutOfBoundsException) [protected, virtual, inherited]

Get a state at a position in the alphabet_ vector.

This method must be overloaded in specialized classes to send back a reference of the correct type.

Parameters:
pos  The index of the state in the alphabet_ vector.
Exceptions:
IndexOutOfBoundsException  If pos is out of the vector.

Definition at line 115 of file AbstractAlphabet.cpp.

const NucleicAlphabetState& bpp::NucleicAlphabet::getStateByBinCode ( int  code) const throw (BadIntException) [inline, inherited]

Get a state by its binary representation.

Parameters:
code  The binary representation as an unsigned char.
Returns:
The NucleicAlphabetState.
Exceptions:
BadIntException  If the code is not a valid state.
Author:
Sylvain Gaillard

Definition at line 134 of file NucleicAlphabet.h.

References bpp::NucleicAlphabet::binCodes_, and bpp::NucleicAlphabet::getStateAt().

Referenced by bpp::NucleicAlphabet::getOverlap(), and bpp::NucleicAlphabet::subtract().
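
A lookup by binary code can be sketched as a table search that throws on an unknown code. The table below is hypothetical: it assumes one bit per resolved nucleotide (A=1, C=2, G=4, U=8), lists only a few states, and returns letters rather than NucleicAlphabetState objects. It also throws std::out_of_range where the documented method throws BadIntException.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical table for this sketch: binary code -> letter.
// M = A/C gives 1 | 2 = 3 and R = A/G gives 1 | 4 = 5 under the assumed layout.
const std::map<int, std::string>& binCodeTable() {
  static const std::map<int, std::string> table = {
      {1, "A"}, {2, "C"}, {4, "G"}, {8, "U"}, {3, "M"}, {5, "R"}};
  return table;
}

// Return the letter registered for a binary code, or throw if there is none.
std::string letterByBinCode(int code) {
  auto it = binCodeTable().find(code);
  if (it == binCodeTable().end()) {
    throw std::out_of_range("invalid binary code");
  }
  return it->second;
}
```

Codes missing from the table, such as 0 in this sketch, end in an exception rather than a default state.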

unsigned int bpp::AbstractAlphabet::getStateCodingSize ( ) const [inline, protected, virtual, inherited]

Get the size of the string coding a state.

Returns:
The size of the string coding each state in the Alphabet.
Author:
Sylvain Gaillard

Implements bpp::Alphabet.

Reimplemented in bpp::WordAlphabet.

Definition at line 213 of file AbstractAlphabet.h.

const std::vector< std::string > & AbstractAlphabet::getSupportedChars ( ) const [virtual, inherited]
Returns:
A list of all supported character codes.

Note for developers of new alphabets: we return a const reference here since the list is supposed to be stored within the class and should not be modified outside the class.

Implements bpp::Alphabet.

Definition at line 260 of file AbstractAlphabet.cpp.

Referenced by bpp::CaseMaskedAlphabet::CaseMaskedAlphabet().

const std::vector< int > & AbstractAlphabet::getSupportedInts ( ) const [virtual, inherited]
Returns:
A list of all supported int codes.

Note for developers of new alphabets: we return a const reference here since the list is supposed to be stored within the class and should not be modified outside the class.

Implements bpp::Alphabet.

Definition at line 243 of file AbstractAlphabet.cpp.
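
The combination of a const getter, a const reference return value and a mutable list member fits a lazy caching pattern. The sketch below shows that pattern with a hypothetical SketchAlphabet class. It illustrates the idea and is not the code in AbstractAlphabet.cpp.

```cpp
#include <cassert>
#include <vector>

// Hypothetical class for this sketch; not the bpp-seq AbstractAlphabet.
class SketchAlphabet {
public:
  explicit SketchAlphabet(const std::vector<int>& codes) : codes_(codes) {}

  // Fill the cached list on the first call, then hand out a const
  // reference to the same stored vector on every call.
  const std::vector<int>& getSupportedInts() const {
    if (intList_.empty()) intList_ = codes_;
    return intList_;
  }

private:
  std::vector<int> codes_;
  mutable std::vector<int> intList_;  // mutable so a const method can fill the cache
};
```

Returning a const reference avoids copying the list on each call and keeps callers from modifying the stored vector.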

int bpp::NucleicAlphabet::getUnknownCharacterCode ( ) const [inline, virtual, inherited]
Returns:
The int code for unknown characters.

Implements bpp::Alphabet.

Definition at line 249 of file NucleicAlphabet.h.

bool bpp::LetterAlphabet::isCharInAlphabet ( const std::string &  state) const [inline, virtual, inherited]

Tell if a state (specified by its string description) is allowed by the alphabet.

Parameters:
state  The string description.
Returns:
'true' if the state is known.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 73 of file LetterAlphabet.h.

References bpp::LetterAlphabet::isCharInAlphabet().

bool bpp::AbstractAlphabet::isGap ( int  state) const [inline, virtual, inherited]
Parameters:
state  The state to test.
Returns:
'True' if the state is a gap.

Implements bpp::Alphabet.

Reimplemented in bpp::RNY.

Definition at line 133 of file AbstractAlphabet.h.

bool bpp::AbstractAlphabet::isGap ( const std::string &  state) const [inline, virtual, inherited]
Parameters:
state  The state to test.
Returns:
'True' if the state is a gap.

Implements bpp::Alphabet.

Definition at line 134 of file AbstractAlphabet.h.

References bpp::AbstractAlphabet::charToInt().

bool AbstractAlphabet::isIntInAlphabet ( int  state) const [virtual, inherited]

Tell if a state (specified by its int description) is allowed by the alphabet.

Parameters:
state  The int description.
Returns:
'true' if the state is known.

Implements bpp::Alphabet.

Definition at line 151 of file AbstractAlphabet.cpp.

bool bpp::NucleicAlphabet::isUnresolved ( int  state) const [inline, virtual, inherited]
Parameters:
state  The state to test.
Returns:
'True' if the state is unresolved.

Implements bpp::Alphabet.

Definition at line 251 of file NucleicAlphabet.h.

bool bpp::NucleicAlphabet::isUnresolved ( const std::string &  state) const [inline, virtual, inherited]
Parameters:
state  The state to test.
Returns:
'True' if the state is unresolved.

Implements bpp::Alphabet.

Definition at line 252 of file NucleicAlphabet.h.

References bpp::LetterAlphabet::charToInt().

void bpp::NucleicAlphabet::registerState ( const NucleicAlphabetState &  st) [inline, protected, inherited]
void bpp::LetterAlphabet::registerState ( const AlphabetState &  st) [inline, protected, virtual, inherited]

Add a state to the Alphabet.

Parameters:
st  The state to add.

Reimplemented from bpp::AbstractAlphabet.

Reimplemented in bpp::DefaultAlphabet.

Definition at line 83 of file LetterAlphabet.h.

References bpp::LetterAlphabet::caseSensitive_, bpp::AlphabetState::getLetter(), bpp::AlphabetState::getNum(), and bpp::LetterAlphabet::letters_.

Referenced by bpp::CaseMaskedAlphabet::CaseMaskedAlphabet().

void bpp::AbstractAlphabet::remap ( ) [inline, protected, inherited]

Re-update the maps using the alphabet_ vector content.

Definition at line 207 of file AbstractAlphabet.h.

References bpp::AbstractAlphabet::alphabet_, and bpp::AbstractAlphabet::updateMaps_().

Referenced by bpp::WordAlphabet::build_().

void bpp::AbstractAlphabet::resize ( unsigned int  size) [inline, protected, inherited]

Resize the private alphabet_ vector.

Parameters:
size  The new size of the Alphabet.

Definition at line 182 of file AbstractAlphabet.h.

References bpp::AbstractAlphabet::alphabet_.

Referenced by bpp::BinaryAlphabet::BinaryAlphabet(), bpp::WordAlphabet::build_(), bpp::DefaultAlphabet::DefaultAlphabet(), and bpp::RNY::RNY().

void bpp::NucleicAlphabet::setState ( unsigned int  pos,
const NucleicAlphabetState &  st 
) [inline, protected, inherited]

Definition at line 84 of file NucleicAlphabet.h.

References bpp::NucleicAlphabet::updateMaps_().

void bpp::LetterAlphabet::setState ( size_t  pos,
const AlphabetState &  st 
) throw (IndexOutOfBoundsException) [inline, protected, virtual, inherited]

Set a state in the Alphabet.

Parameters:
pos  The index of the state in the alphabet_ vector.
st  The new state to put in the Alphabet.

Reimplemented from bpp::AbstractAlphabet.

Definition at line 93 of file LetterAlphabet.h.

References bpp::LetterAlphabet::caseSensitive_, and bpp::LetterAlphabet::letters_.

Referenced by bpp::DefaultAlphabet::DefaultAlphabet().

int bpp::NucleicAlphabet::subtract ( int  s1,
int  s2 
) const throw (BadIntException) [inline, inherited]

Subtract states.

Get the remaining state when subtracting one state from another.

 int a = alpha->charToInt("A");
 int m = alpha->charToInt("M");
 int c = alpha->subtract(m, a);
 
 cout << alpha->intToChar(c) << endl;

 // should print C because M - A = C
Parameters:
s1  the first state as an int
s2  the second state as an int
Exceptions:
BadIntException  if one of the states is not valid.
Returns:
The remaining state as an int
Author:
Sylvain Gaillard

Definition at line 163 of file NucleicAlphabet.h.

References bpp::AlphabetState::getNum(), bpp::NucleicAlphabet::getState(), and bpp::NucleicAlphabet::getStateByBinCode().

Referenced by bpp::NucleicAlphabet::subtract().
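
As with the overlap, the subtraction can be pictured as a bit operation on binary codes: keep the bits of the first state that are not set in the second. The sketch assumes one bit per resolved nucleotide (A=1, C=2, G=4, U=8) and uses a hypothetical subtractState function. The library's handling of an empty result is not modelled.

```cpp
#include <cassert>

// Assumed bit layout for this sketch only: one bit per resolved nucleotide.
const int A = 1, C = 2, G = 4, U = 8;
const int M = A | C;  // M = A/C

// Remove from s1 every resolved state that is also present in s2.
int subtractState(int s1, int s2) {
  return s1 & ~s2;
}
```

With these masks, subtractState(M, A) equals C, in line with the documented example M - A = C.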

std::string bpp::NucleicAlphabet::subtract ( const std::string &  s1,
const std::string &  s2 
) const throw (BadCharException) [inline, inherited]

Subtract states.

Get the remaining state when subtracting one state from another.

 string a = "A";
 string m = "M";
 
 cout << alpha->subtract(m, a) << endl;

 // should print C because M - A = C
Parameters:
s1  the first state as a string
s2  the second state as a string
Exceptions:
BadCharException  if one of the states is not valid.
Returns:
The remaining state as a string
Author:
Sylvain Gaillard

Definition at line 187 of file NucleicAlphabet.h.

References bpp::LetterAlphabet::charToInt(), bpp::AbstractAlphabet::intToChar(), and bpp::NucleicAlphabet::subtract().


Member Data Documentation

std::vector<std::string> bpp::AbstractAlphabet::charList_ [mutable, protected, inherited]

Definition at line 99 of file AbstractAlphabet.h.

std::vector<int> bpp::AbstractAlphabet::intList_ [mutable, protected, inherited]

Definition at line 100 of file AbstractAlphabet.h.


The documentation for this class was generated from the following files: