Bio++

C++ libraries for bioinformatics

MafFilter is a program to process genome alignments in the Multiple Alignment Format (MAF). The current version is 1.1.2.

Note: a bug was found when extracting annotations. If the alignment is not projected on the reference sequence, some blocks may include the negative strand of the reference sequence, which leads to a segfault during program execution. The bug was fixed in the source repository and in the 1.1.1 release. Statically linked executables that include the fix are provided, but the other packages have not been updated and are still provided as version 1.1.0. New packages will be generated for the next release. Do not hesitate to contact us if you are facing this issue and need help.
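To illustrate why negative-strand blocks need special care, here is a minimal Python sketch of the MAF coordinate convention: on the "-" strand, the start position is given relative to the reverse-complemented source sequence. This is not MafFilter's code, and the function name `forward_interval` is hypothetical; it only shows how such a position maps back to forward-strand coordinates before being compared with annotation features.

```python
from typing import Tuple


def forward_interval(start: int, size: int, strand: str, src_size: int) -> Tuple[int, int]:
    """Return the zero-based, half-open [begin, end) interval on the forward strand.

    Arguments mirror the fields of a MAF "s" line: start, size, strand, srcSize.
    """
    if strand == "+":
        return start, start + size
    if strand == "-":
        # On the negative strand, start counts from the end of the source sequence.
        begin = src_size - start - size
        return begin, begin + size
    raise ValueError("strand must be '+' or '-'")


# Same start and size, different strands, different forward-strand intervals.
plus = forward_interval(10, 5, "+", 100)
minus = forward_interval(10, 5, "-", 100)
print(plus, minus)
```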

Note 2: another bug leading to a segfault was found. It is linked to the use of block statistics when they are written to files. A fix release, MafFilter 1.1.2, has been issued, only as Linux executables. This version can also be compiled from the source code, but this requires the latest development version of bpp-seq-omics.

What does MafFilter do?

MafFilter applies a series of "filters" to a MAF file in order to clean it, extract data and compute statistics, while keeping track of the associated meta-data such as genome coordinates and quality scores.

  • It can process the alignment to remove low-quality / ambiguous / masked regions.
  • It can export data into one or several alignment files in formats such as Fasta or Clustal.
  • It can read annotation data in GFF or GTF format, and extract the corresponding alignment.
  • It can perform sliding windows calculations.
  • It can reconstruct phylogeny/genealogy along the genome alignment.
  • It can compute population genetics statistics, such as site frequency spectrum, number of fixed/polymorphic sites, etc.
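The idea of chaining filters over alignment blocks can be sketched in a few lines of Python. This is an illustration only, not MafFilter's implementation or option syntax: the names `parse_maf`, `min_length`, `max_gap_fraction` and `apply_filters` are hypothetical, the example data is made up, and only the "a" and "s" lines of the MAF format are handled.

```python
from typing import Callable, Iterable, Iterator, List, NamedTuple


class SeqLine(NamedTuple):
    src: str       # source name, e.g. "ref.chr1"
    start: int     # zero-based start in the source sequence
    size: int      # number of non-gap characters
    strand: str    # "+" or "-"
    src_size: int  # total length of the source sequence
    text: str      # aligned characters, "-" marks a gap


Block = List[SeqLine]
Filter = Callable[[Iterable[Block]], Iterator[Block]]


def parse_maf(lines: Iterable[str]) -> Iterator[Block]:
    """Yield alignment blocks from MAF text; only 'a' and 's' lines are used."""
    block: Block = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "a":
            # A blank line or a new 'a' line closes the current block.
            if block:
                yield block
            block = []
        elif fields[0] == "s":
            src, start, size, strand, src_size, text = fields[1:7]
            block.append(SeqLine(src, int(start), int(size), strand, int(src_size), text))
    if block:
        yield block


def min_length(n: int) -> Filter:
    """Keep blocks whose alignment has at least n columns."""
    def run(blocks):
        for b in blocks:
            if len(b[0].text) >= n:
                yield b
    return run


def max_gap_fraction(limit: float) -> Filter:
    """Keep blocks whose overall proportion of gap characters is at most limit."""
    def run(blocks):
        for b in blocks:
            total = sum(len(s.text) for s in b)
            gaps = sum(s.text.count("-") for s in b)
            if gaps / total <= limit:
                yield b
    return run


def apply_filters(blocks: Iterable[Block], filters: List[Filter]) -> Iterable[Block]:
    """Feed the output of each filter into the next one, in order."""
    for f in filters:
        blocks = f(blocks)
    return blocks


# Made-up two-block alignment used only for this illustration.
EXAMPLE = """\
##maf version=1
a score=10
s ref.chr1 0 8 + 100 ACGTACGT
s sp2.chr4 5 6 + 200 ACG--CGT

a score=5
s ref.chr1 20 3 + 100 ACG
s sp2.chr4 40 3 + 200 ACG
"""

kept = list(apply_filters(parse_maf(EXAMPLE.splitlines()),
                          [min_length(5), max_gap_fraction(0.2)]))
for block in kept:
    print(block[0].src, block[0].start, len(block[0].text))
```

Because each filter consumes and yields blocks lazily, the coordinates stored with every sequence line travel through the whole chain, which mirrors the point above about keeping track of meta-data.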

How can I get it?

The MafFilter program is command-line driven. You can get executable files pre-compiled for your system or use pre-compiled packages (where available), or compile the program yourself (this should work on any system with a decent C++ compiler). The latest packaged version of MafFilter (1.1.0) is based on Bio++ 2.2.0 http://biopp.univ-montp2.fr/.

The program depends on the Bio++ libraries. Pre-compiled executables are statically linked, and therefore already include all required code from the libraries. Pre-compiled packages will ask for all required dependencies, which can be found in the same download directory. To compile the program yourself, from the downloaded sources or from the git repository, please follow the instructions on the Bio++ website http://biopp.univ-montp2.fr/wiki/index.php/Installation.

How do I use it?

Several example data sets are distributed along with the source code of the package. A reference manual is also available online and can be downloaded as a PDF. Questions can be asked on the dedicated forum.

Last modified on Jan 26, 2015, 12:08:25 PM