NEW: AnA-FiTS: A very fast forward-in-time simulator for polymorphism data
Andre has designed a very fast
forward simulator for population-genetic simulations that is two to three orders
of magnitude faster than current codes.
For details and to obtain the code, please go to the AnA-FiTS page.
NEW: A pipeline for perpetually updating trees
Fernando, Stephen Smith, and John
Cazes have developed a pipeline that automatically updates reference
trees using RAxML-Light when new sequences for the clade of interest
appear on GenBank. The tool uses RAxML-Light to extend trees and
Stephen's PHLAWD pipeline to extend alignments by new sequences. The
code can be run on stand-alone servers and on cluster systems.
The prototype version including documentation is available via Fernando's github repository.
NEW: PTP, a tool for delimiting species on phylogenies
Jiajie Zhang has designed a tool
called PTP, based on Poisson Tree Processes, that can delimit
species on phylogenies as they are generated, for instance, by RAxML.
Unlike other tools (e.g., GMYC) it does not require a time-calibrated
ultrametric tree as input.
Jiajie also developed an integrated pipeline that combines the PTP
method with the Evolutionary Placement algorithm in RAxML to assess the
diversity of a phylogenetic placement run, by inferring the number of
species per placement.
The code and data are available for download here:
An up-to-date version of the code is maintained by Jiajie on his GitHub repository
NEW: GapsMis, a tool for by-reference genome assembly
GapsMis, the successor of GapMis,
is a tool for flexible pairwise sequence alignment with a variable, but
bounded, number of gaps.
Around 6,000,000 pairwise sequence alignments performed, under
realistic conditions based on the properties of real full-length
genomes, show
that GapsMis can increase the accuracy of extending short-read alignments by 0.01-0.04% compared to state-of-the-art approaches.
The software is available here
The open source github code repository can be found here
NEW: ExaML Exascale Maximum Likelihood Code
Exascale Maximum Likelihood (ExaML) code for phylogenetic inference using MPI.
This code implements the popular RAxML search algorithm for maximum
likelihood based inference of phylogenetic trees. It uses a radically
new MPI parallelization approach that yields improved parallel
efficiency, in particular on partitioned multi-gene or whole-genome
datasets.
It is up to 3.2 times faster than RAxML-Light [1].
Like RAxML-Light, ExaML implements checkpointing, SSE3 and AVX vectorization, and memory-saving techniques.
The code and some documentation can be downloaded via Alexis' GitHub repository
[1] A. Stamatakis, A.J. Aberer, C. Goll, S.A. Smith, S.A. Berger, F.
Izquierdo-Carrasco: "RAxML-Light: A Tool for computing TeraByte
Phylogenies", Bioinformatics 2012; doi: 10.1093/bioinformatics/bts309.
SweeD, a faster version of SweepFinder
We developed SweeD, a
parallel and checkpointable tool that implements a composite likelihood
ratio test for detecting selective sweeps.
SweeD is based on the SweepFinder algorithm (Nielsen et al. 2005).
SweeD can calculate the theoretical SFS of a given demographic model
(stepwise changes or with an exponential growth phase + stepwise
changes) by using the method by Živković and Stephan (2011).
SweeD is numerically more stable than SweepFinder (in terms of
floating-point arithmetic operations and in particular for folded
data), and is faster than SweepFinder when the number of sequences is
large.
SweeD has been tested on simulated datasets with up to 10,000 sequences and 1,000,000 SNPs.
The sequential version of SweeD is up to 21 times faster than
SweepFinder, depending on the number of SNPs and the number of
sequences.
Performance improves over SweepFinder with an increasing number of sequences.
For few sequences, SweeD is as fast as SweepFinder.
SweeD has also been used to analyze chromosome 1 from the 1000 Genomes Project.
The dataset comprises more than 2000 sequences and about 2,896,000 SNPs. The analysis required 8 hours and 15 minutes.
You can download the source code of version 3.1.1 here
bug history:
- v3.1 Fixed bug in the VCF file format parser that was associated with handling missing
- v3.1.1 Changed a default parameter value
The manual is available here
The experimental data and scripts used in the manuscript are available for download here
OPASM for efficient parallel string matching
Contributors: Christos Hadjinikolis, Costas S. Iliopoulos, Solon P. Pissis, Alexandros Stamatakis
Description: The tool focuses on the efficient parallelisation of
approximate string-matching algorithms, which are based on dynamic
programming, using the message-passing programming paradigm (MPI).
Technical report: C. Hadjinikolis, C.S. Iliopoulos, S.P. Pissis, A. Stamatakis: "Minimising processor communication in parallel approximate string matching", Heidelberg
Institute for Theoretical Studies, Exelixis-RRDR-2012-8, August 2012. PDF
Source code and data: provided freely for academic use under the
terms of the GNU General Public License here
An optimized version of DPPDIV
Download the open source code of
the optimized (using vector intrinsics) and parallelized (using
OpenMP) version of DPPDIV (original code by Tracy Heath) for
estimating divergence times with a Dirichlet process prior here
For details about the program please read the paper
When using the optimized DPPDIV code please cite the original paper:
T. A. Heath, M.T. Holder, J.P. Huelsenbeck: "A Dirichlet process prior for estimating lineage-specific substitution rates". Molecular Biology and Evolution, 2011.
and the technical report below:
T. Flouris, A. Stamatakis: "An Improvement to DPPDIV", Heidelberg
Institute for Theoretical Studies, Exelixis-RRDR-2012-7, August 2012. PDF
For support please use the DPPDIV google group.
RAxML-Light Web-Service
The new RAxML-Light tool is now also available as web-service thanks to the efforts of the great colleagues at the San
Diego Supercomputer Center and support by the NSF iPlant collaborative.
To use this service you will first need to create an iPlant login here and subsequently log in on the CIPRES portal using your iPlant credentials.
Microbenchmark for Denormalized Floating Point Numbers
Denormalized floating point
values can have a dramatic impact on program performance, i.e.,
operations of O(n) theoretical run-time can have significantly
different execution times, depending on the input data.
The microbenchmark has been extracted from RAxML, which exhibited some inexplicable run-time variations due to this problem.
The benchmark is available under GNU GPL from Alexis' GitHub repository.
Please read:
Björndalen J., Anshus O. Trusting floating point benchmarks-are your benchmarks really data-independent?
Applied Parallel Computing. State of the art in Scientific Computing 2010; pp 178-188, Springer.
for some background info.
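To make the term concrete, here is a minimal Python sketch (not the RAxML microbenchmark itself) of how repeated scaling drives an IEEE-754 double into the denormalized (subnormal) range. The helper names `is_subnormal` and `halvings_until_subnormal` are hypothetical and chosen only for illustration.

```python
import math
import sys


def is_subnormal(x: float) -> bool:
    """Return True if x is a nonzero double below the smallest normal value."""
    return 0.0 < abs(x) < sys.float_info.min


def halvings_until_subnormal(x: float = 1.0) -> int:
    """Count how often x must be halved before it becomes subnormal."""
    if x == 0.0 or not math.isfinite(x):
        raise ValueError("x must be a finite, nonzero value")
    steps = 0
    while not is_subnormal(x):
        x *= 0.5  # each halving lowers the binary exponent by one
        steps += 1
    return steps


if __name__ == "__main__":
    print("smallest normal double:", sys.float_info.min)
    print("halvings from 1.0:", halvings_until_subnormal(1.0))
```

A long chain of multiplications by small factors is one way a computation can drift into this range without the input size changing, which is the kind of data dependence the benchmark is meant to expose.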
The Gapmis library
Solon, Simon, Tomas, and Nikos have developed the GapMis library:
GapMis is a tool for pairwise sequence alignment with a single gap.
GapMis is a command-driven program implemented in the C
programming language and developed under the GNU/Linux operating system. It
only requires the standard C libraries and the gcc compiler for compilation.
GAPMIS software page
Rogue Taxon Identification Web-Server (alpha version)
The consensus tree of a set of
bootstrap trees can frequently exhibit poor resolution or sub-optimal
branch support because of unstable taxa (also referred to as rogue
taxa). Analogously, rogue taxa may also affect branch support values
for maximum likelihood trees.
Recently, Andre and Denis made available a web service that
identifies rogue taxa in a set of bootstrap trees. Optionally, if a
single best-known tree (under ML/MP) is also provided, our algorithm can
identify rogue taxa with respect to the branch support values drawn
onto this tree (for both cases the set of rogue taxa is often similar,
but not necessarily identical).
The URL to our server is http://exelixis-lab.org/roguenarok.html
Using the web-interface, you can compare the results of various rogue
taxon searches (including stability measures such as the taxonomic
instability index and the leaf stability index). Finally, the website
integrates a tree viewer that can be used to visualize your consensus
tree/best-known tree before and after removing various sets of rogue
taxa.
Feedback and comments are most welcome, preferably via the RAxML google group
OmegaPlus Population Genetics code
A parallel tool for rapid & scalable detection of selective sweeps in whole-genome datasets
We have developed OmegaPlus, a scalable implementation of the omega-statistic (Kim and Nielsen 2004) to detect selective sweeps in whole-genome data based on linkage disequilibrium patterns.
OmegaPlus has been tested with fully phased data, but also with unphased data, where we can determine to which diploid individual a SNP belongs, but cannot determine which of the two chromosomes carries the SNP.
Outgroup information is not required. The program recognizes FASTA, Hudson's ms-like, and MaCS-like (http://www-hsc.usc.edu/~garykche/) formats.
OmegaPlus can scan the DPGP dataset (www.dpgp.org, reference release 1.0 September 2009, 37 sequences and ~340,000 SNPs) for positive selection in 55 seconds.
In addition to the efficient sequential implementation, we provide three parallelized versions that use fine-, coarse-, and multi-grained parallelism.
Right now this is a pure command-line tool, available for Windows and Linux operating systems.
We strongly recommend using the Linux version of OmegaPlus.
Note that only limited support will be provided for the Windows version.
For compiling the code, GCC version 4.4
or greater is recommended. For gcc versions prior to version 4.4 please
remove the optimization flag (-O3) from the Makefiles before compiling
the code.
When OmegaPlus is compiled with older
gcc versions it will yield different results on identical input data
compared to the output it generates when -O3 is activated. This is most
probably due to some overly aggressive optimizations under -O3.
Many thanks to Stefan Laurent (LMU Munich) for pointing this out.
Download the most recent GNU GPL Linux version 2.2.2 here; it includes a bug fix in the VCF file parser that was associated with handling missing data.
previous Linux versions:
- Linux version 2.2.1 here; it includes a minor bug fix in the ms parser
- Linux version 2.2 here; it includes a new command line flag -no-singletons to exclude the singletons from the analysis
- Linux version 2.1 here which can now also parse the Variant Call Format (.vcf)
- Linux version 2.0 here
download GNU GPL Windows version here
download the manual here
Examples are provided with the source code (see directory "examples").
If you have questions or you would like to report a bug, please register at the OmegaPlus google group ( http://groups.google.com/group/omegaplus)
or send an email to pavlidisp@gmail.com or n.alachiotis@gmail.com
RAxML Memory Requirements calculator by Simon Berger
Required size:
(n-2) * m * (x * 8) bytes = MEM
Code Availability
For up-to-date versions of
reconfigurable architectures go to opencores
For up-to-date development versions of RAxML, RAxML-Light,
Parsimonator, and TreeCounter go to github
RAxML questions, help & bug reports
A simple Visual Tree Comparison Tool
Simon Berger has developed a
simple graphical tree comparison tool that highlights differences
between up to four trees by highlighting the branches
(bipartitions/splits) that are not shared among those trees.
The JAVA code can be downloaded here
There are three different operating modes:
- run it on the command line and output a phyloxml file that
can be viewed with any compatible tree viewer: "java -jar vtd.jar tree1 tree2 > out.xml"
- run on the command line and start the Archaeopteryx tree
viewer to view the differences between tree1 and tree2: "java -jar vtd.jar tree1 tree2 -v"
- GUI mode: just type "java -jar vtd.jar" and a file dialogue
will open to select input tree files
In the default mode (when two trees are supplied) branches that occur
(are shared) in both trees are colored white, missing branches are red.
If you open more than two trees, the tool will do a multi
tree comparison (this works for up to 4 tree files).
In this mode the first tree will be compared to the remaining trees.
The branch coloring is then based on rgb color mixing: If a branch
exists in trees 1 and 2 it is colored red. If it exists in trees 1 and
3/4 it will be green/blue.
If a branch exists in more than two trees the colors are mixed
(red+green=yellow, red+green+blue=white and so on...).
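The multi-tree color mixing described above can be sketched in a few lines of Python. This is an illustration of the rule as stated here, not the tool's Java code; the function name `branch_color` is hypothetical, and the all-zero color for a branch that occurs in none of the other trees is an assumption, since the text does not specify it.

```python
from typing import Tuple


def branch_color(in_tree2: bool,
                 in_tree3: bool = False,
                 in_tree4: bool = False) -> Tuple[int, int, int]:
    """RGB color of a branch of tree 1 in multi-tree mode.

    Each of the other trees contributes one channel if it shares the
    branch: tree 2 -> red, tree 3 -> green, tree 4 -> blue.
    """
    red = 255 if in_tree2 else 0
    green = 255 if in_tree3 else 0
    blue = 255 if in_tree4 else 0
    # Assumption: a branch shared with no other tree gets (0, 0, 0).
    return (red, green, blue)


if __name__ == "__main__":
    print(branch_color(True, True, False))  # red + green
    print(branch_color(True, True, True))   # red + green + blue
```

Note that this covers only the comparison of more than two trees; the default two-tree mode uses the separate white/red scheme.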
Here is a screen shot where we compare two small 10 taxon trees:
Graphical User Interface for the RAxML Evolutionary
Placement Algorithm
Denis Krompass, our Master's
student, has put together Java-based GUI packages for running and
analyzing short read placement runs with the RAxML EPA algorithm
described in: S.A. Berger, D.
Krompaß, A. Stamatakis:
"Performance, Accuracy and Web-Server for Evolutionary Placement of
Short Sequence Reads under maximum-likelihood". In Systematic Biology 60(3):291-302,
2011. PDF
This stand-alone GUI is similar in functionality to the EPA
web-server here
It also allows you to build reference trees with RAxML for the original
full-length sequence alignment
Windows version
Linux 32-bit version
Linux 64-bit version
Just download the file (this may take some time), then unzip it: "unzip
RAxML_Workbench_Linux.zip" then change to the directory: "cd
RAxML_Workbench" and then start the GUI by typing: "java -jar
RAxML_Worbench.jar"
Here is a screenshot of the GUI:
Reconfigurable Architectures
Also look at opencores for latest
updates
Reconfigurable FPGA Pipelined
Floating-Point Exponential Unit available here
Source
code under GNU GPL version 3 or higher by Nikos Alachiotis. The
following restriction to GNU GPL applies: Always cite:
Nikos Alachiotis, Alexandros
Stamatakis:
"FPGA Optimizations for a Pipelined Floating-Point Exponential Unit",
accepted for publication, 7th
International Symposium on Applied
Reconfigurable Computing (ARC 2011), Belfast, United Kingdom,
March
2011.
when using this code.
An IEEE-754 compliant logarithm
approximation unit for FPGAs by Nikos Alachiotis
Download
an open-source VHDL implementation of a fast space- and
resource-efficient logarithm approximation unit for FPGAs.
By
using this component you agree to cite it as: "Efficient Floating-Point
Logarithm Unit for FPGAs", by Nikos Alachiotis and Alexandros
Stamatakis, accepted for publication at the RAW workshop, held in
conjunction with IPDPS 2010. PDF
UDP
Transceiver Core by Nikos Alachiotis and Simon A. Berger
Download an
open-source VHDL implementation of a component that can be connected to
the input port of the Virtex-5 Ethernet MAC Local Link Wrapper and that
allows for transceiving IPv4 ethernet packets. The archive contains a
JAVA test application and is also available at opencores.org
By
using this
component, you agree to cite it as: "Efficient PC-FPGA
Communication over Gigabit Ethernet", by Nikos Alachiotis, Simon A.
Berger, and Alexandros Stamatakis, Exelixis Rapid Research
Dissemination Report, Exelixis-RRDR-2010-4, TU Munich, February
2010. PDF
TreeCounter
Code by A. Stamatakis to compute
the number of possible rooted and unrooted binary trees for n taxa or to compute the number of
possible binary trees given a multi-furcating constraint tree. This
code needs the GNU GMP library.
Program options:
- treeCounter -h for
help
- treeCounter -n numTaxa
for the number of all possible trees with numTaxa taxa
- treeCounter -t constraint
for the number of all possible trees under the constraint
TreeCounter download
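For the unconstrained case, the counts follow the standard closed forms: (2n-5)!! unrooted and (2n-3)!! rooted binary trees for n taxa. The Python sketch below illustrates these formulas and is not TreeCounter's code; the function names are hypothetical, and the constrained case (-t) is not covered. Python's built-in integers are arbitrary precision, which plays the role the GMP library plays in a C implementation.

```python
def odd_double_factorial(k: int) -> int:
    """Return k * (k - 2) * ... * 3 * 1 for odd k; 1 if k < 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted binary trees: (2n - 5)!! for n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    return odd_double_factorial(2 * n_taxa - 5)


def num_rooted_trees(n_taxa: int) -> int:
    """Number of rooted binary trees: (2n - 3)!! for n >= 2."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa")
    return odd_double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    for n in (4, 10, 50):
        print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

The counts grow so quickly that fixed-width integers overflow for a few dozen taxa, which is why a big-integer type is needed.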
PaPaRa: PArsimony-based Phylogeny-Aware Read alignment
program
Code by Simon A. Berger for
aligning short reads to reference phylogenies and alignments.
NEW: significantly faster SSE3 vectorized version of PaPaRa 2.0 available for download here
NEW: SSE3-vectorized and hybrid CPU/GPU-optimized version of PaPaRa available for download here
Also please check for code updates on Simon's github repository
Parsimonator: A fast open-source parsimony program
Parsimonator v1.0.2 source code
available here
Parsimonator
is a no-frills light-weight implementation for building starting
trees under parsimony for RAxML-Light (see below)
It deploys a randomized
stepwise addition order algorithm to build trees and thereafter
conducts a couple of SPR
(Subtree Pruning and Re-Grafting)
moves to
further improve the parsimony score.
Right now, parsimonator can only compute
trees on DNA datasets. It uses SSE3 128-bit wide and 256-bit
wide AVX (new as of v101) vector instructions
to significantly accelerate parsimony computations.
Although it is
significantly slower than TNT, I think that it is the fastest
open-source parsimony function implementation, albeit the search
algorithm itself is rather naïve.
It can also extend given trees
that do not comprise all taxa of an input alignment by applying the
randomized stepwise addition order algorithm to those taxa that
are not contained in the starting tree. The source archive includes a
manual.
The
current version has been used to compute a parsimony tree on an
alignment with 1481 taxa and 20,000,000 sites (a phylip file with a
size of 27GB!).
Old version without OpenMP parallelization and bug fixes for very large
datasets available here 1.0.1
Old Version without AVX
instructions still available here 1.0.0
RAxML-Light: a stripped-down checkpointable RAxML
version for computing huge trees
Get the most up-to-date RAxML-Light version from github
RAxML-Light
v1.0.9 source code available here
RAxML-Light
is a stripped-down RAxML version for conducting tree searches on very
large trees under the CAT approximation and GAMMA model of rate
heterogeneity.
Its key features are:
- A light-weight efficient checkpointing and restart
capability
- A
highly optimized fine-grain MPI parallelization that allows you to
concurrently compute the likelihood of a single tree on hundreds or
thousands of processors, provided that you have a low latency
interconnect.
- new as of v102:
memory saving option -S for gappy multi-gene datasets. With this new
option the memory consumption could be reduced from 70GB to 19GB for
analyzing a dataset with about 10 genes and 120,000 taxa with 90%
missing data.
- new as of v102:
AUTO protein model option: RAxML will automatically select the best
protein substitution model (WAG, JTT, LG, etc) when model parameters
are optimized during the tree search.
- new as of v103:
a little bug fix :-)
- new as of v104:
- a little bug fix for the restart from checkpoint option. This
will not affect previous results.
- Also, the so called search
convergence criterion (-D option) from the standard RAxML 728 version
has been re-introduced (including restart capability) for tree searches
on extremely large trees.
- new as of v105:
- Bug
fixes to compute TeraByte trees with MPI, i.e., trees with 1TB memory
requirements for the likelihood vectors of a single tree on more than
600 cores
- Implementation of GAMMA models
of rate heterogeneity
- Parsing option to parse and
compress large alignments into a binary file that can be read much
faster
- new as of v106:
- -r option to save memory by recomputing ancestral probability vectors instead of storing them
- -Q option for improved load balance on partitioned datasets
- new as of v108:
- some minor bug fixes
- improved manual
- new as of v109:
- major bug fix for CAT likelihood computations on protein models
A usage manual is also included
in the archive.
Older version RAxML-Light 1.0.8 available here
Older version RAxML-Light 1.0.6 available here
Older version RAxML-Light 1.0.5 available here
Older version RAxML-Light 1.0.4 available here
Older version RAxML-Light 1.0.3 available here
Older version RAxML-Light 1.0.2 available here
Older version RAxML-Light 1.0.1 available here
RAxML
Get the most up-to-date RAxML version from github
RAxML v7.2.8 alpha release source
code available here
new features:
- several bug fixes
- added some new protein substitution models: MTART, MTZOA,
PMB, HIVB, HIVW, JTTDCMUT, FLU
Documentation:
Read this before running a
RAxML analysis! compute RAxML memory requirements.
Since datasets are getting larger
here is a formula to estimate RAxML
memory requirements:
Given an alignment of n taxa and m distinct patterns the memory
consumption is approximately:
- MEM(AA+GAMMA) = (n-2) * m * (80
* 8) bytes
- MEM(AA+CAT)
= (n-2) * m * (20 * 8) bytes
- MEM(DNA+GAMMA) = (n-2) * m * (16 * 8) bytes
- MEM(DNA+CAT)
= (n-2) * m * (4 * 8) bytes
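The four formulas above differ only in the middle factor, so they are easy to wrap in a small helper. The Python sketch below is an illustration of the arithmetic only; the names `FACTOR_X` and `raxml_memory_bytes` are hypothetical, and the result is the same approximation the formulas give, not a measurement.

```python
# Middle factor x of (n-2) * m * (x * 8), taken from the list above.
FACTOR_X = {
    ("AA", "GAMMA"): 80,
    ("AA", "CAT"): 20,
    ("DNA", "GAMMA"): 16,
    ("DNA", "CAT"): 4,
}


def raxml_memory_bytes(n_taxa: int, n_patterns: int,
                       data_type: str, rate_model: str) -> int:
    """Approximate memory in bytes for n taxa and m distinct patterns."""
    x = FACTOR_X[(data_type.upper(), rate_model.upper())]
    return (n_taxa - 2) * n_patterns * (x * 8)


if __name__ == "__main__":
    mem = raxml_memory_bytes(1000, 10000, "DNA", "GAMMA")
    print(f"{mem} bytes = {mem / 1024**3:.2f} GiB")
```

For example, 1,000 taxa and 10,000 distinct patterns under DNA+GAMMA give 998 * 10,000 * 128 = 1,277,440,000 bytes, roughly 1.2 GiB.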
WEB-Servers for evolutionary
placement of short reads
Web-Servers for phylogenetic
placement of short sequence reads (including alignment and
visualization tools)
Web-Servers for tree building
not maintained by the exelixis lab.
Graphical User Interfaces
RAxML Graphical User Interfaces
- Daniele Silvestro
and Ingo Michalak at the Senckenberg Museum and Research
Center have started developing a GUI for RAxML that runs under
MACs, Windows, and Linux. The code for the GUI is available here. Please send
suggestions and comments to Daniele Silvestro at senckenberg de
- Jacek Kominek from
the University of Gdansk in Poland has developed this nice GUI here
Older Versions
RAxML v7.2.7
(alpha) available for download here
RAxML v7.2.6 available for download here and here is a windows executable
RAxML v7.2.5 (alpha) available for download here and here is a windows executable
Helper Scripts and Tools
Phylogenetic Binning tool
Phylogenetic binning tool for
paper on "Morphology-based phylogenetic
binning of the lichen genera Allographa and Graphis via molecular site
weight calibration" by Simon Berger
available for download here
tech report PDF
Wrapper
Scripts
Apurva
Narechania at the American Museum of Natural
History has kindly put together a couple of wrapper scripts for RAxML
:-)
- raxml_launch_serially.sh:
A simple shell script that launches one job after the other, waiting
for each job to complete before starting the next.
- raxml_nexusPartConvert.pl:
A Perl script that parses a partitioned alignment in Nexus format
with charsets and produces a partition guide file to be fed to RAxML
with -q. Preliminary - works with DNA or AA, but not the two together
yet, so not suitable for mixed-molecule data. Unless the output gets
redirected to a file with ">", it will appear on screen.
- raxml_wrapper.pl:
A Perl script that reads a raxml.config file with common run
parameters and executes a directory of Phylip alignment files in batch,
then outputs the results in another directory. See the documentation
with "perldoc ./raxml_wrapper.pl".
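As an illustration of the charset-to-partition conversion described for raxml_nexusPartConvert.pl, here is a minimal Python sketch. It is not the Perl script's logic: the function name `nexus_charsets_to_partitions` and the `model_label` parameter are hypothetical, the output assumes the common `LABEL, name = range` layout of RAxML partition files, and charset ranges are assumed to contain no internal spaces (e.g. `1-500`, not `1 - 500`).

```python
import re
from typing import List

# Matches lines such as: "charset gene1 = 1-500;"
CHARSET_RE = re.compile(r"^\s*charset\s+(\S+)\s*=\s*([^;]+);", re.IGNORECASE)


def nexus_charsets_to_partitions(nexus_text: str,
                                 model_label: str = "DNA") -> List[str]:
    """Turn Nexus charset statements into partition-file lines."""
    partitions = []
    for line in nexus_text.splitlines():
        match = CHARSET_RE.match(line)
        if not match:
            continue  # skip everything that is not a charset statement
        name = match.group(1)
        # Nexus separates multiple ranges by whitespace; join with commas.
        ranges = ", ".join(match.group(2).split())
        partitions.append(f"{model_label}, {name} = {ranges}")
    return partitions


if __name__ == "__main__":
    example = """begin sets;
    charset gene1 = 1-500;
    charset gene2 = 501-900;
end;"""
    print("\n".join(nexus_charsets_to_partitions(example)))
```

As with the original script, the output goes to the screen unless it is redirected to a file, and a single label is applied to all partitions, so mixed DNA/AA data is not handled.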
Guy
Leonard at Exeter has updated his wrapper environment
called easyRax
Alexis
has developed a couple of Perl scripts
A perl
script for computing bootstrap branch lengths with RAxML. This
script can be used to perform the following task with RAxML:
- Given a
best-known ML tree, generate a number of Bootstrap replicates and just
re-estimate the branch lengths for that given fixed tree topology on
each Bootstrap replicate.
- To invoke the script call it as follows: "perl bsBranchLengths.pl
alignmentFileName treeFileName numberOfReplicates". The
script assumes that the RAxML executable is located in the directory
where you execute it. Otherwise, if RAxML is located in your Linux/Unix
path, just replace every occurrence of "./raxmlHPC"
by "raxmlHPC" in the
script. The bootstrapped trees with branch lengths will be written into
a file called "bsTrees".
- This script is intended for use with programs that infer
divergence
time estimates.
A perl
script for finding the best protein substitution model
- Here
is a little perl-script that will automatically determine the
best-scoring AA substitution model on a fixed starting tree. Note
that raxmlHPC must be in your $PATH for this to work.
- For unpartitioned datasets execute it like this: perl ProteinModelSelection.pl
alignmentFile.phylip > outfile The outfile will then contain
the best-scoring AA model to use with RAxML.
- For partitioned datasets execute it like this: perl ProteinModelSelection.pl
alignmentFile.phylip partitionData.txt > outfile The outfile
will then contain the best-scoring AA model for every partition.
James Munro has written
a Guide
to install RAxML on MACs
Olaf Bininda-Emonds has
written batchRAxML.pl.
This nice script by my good colleague from Munich times Olaf
Bininda-Emonds provides a wrapper around RAxML to easily analyze a set
of data files according to a common set of search criteria. It also
organizes the RAxML output into a set of subdirectories.
Frank Kauff has written PYRAXML2.
Frank Kauff at University of Kaiserslautern (formerly at Duke
University) has written this cool script that reads NEXUS-style data
files and prepares the necessary input files and command-line options
for RAxML-VI-HPC. You can download the BETA-version here: PYRAXML2 It requires
PYTHON and BIOPYTHON to be installed on your computer.
On-Line Material for some old
papers
Material (alignments) for 2008 Systematic
Biology paper on the rapid bootstrap algorithm
- test datasets available here
Material (test datasets) for 2007 Supercomputing paper on parallelizing
RAxML on the IBM BlueGene/L
- test datasets available here
Material for HICOMB2006
paper: “Phylogenetic Models of Rate Heterogeneity: A High Performance
Computing Perspective"
- Click here for a table with the experimental raw
data
Material for
HPCC05 paper: “Parallel Divide-and-Conquer Phylogeny Reconstruction by
Maximum Likelihood”
Material
on RAxML-VI performance:
- 1,000 taxa plot alignment Alignment of 1,000 sequences
from the ARB database containing Eucarya, Bacteria, Archaea by Harald
Meier, TU München

- 1,497 taxa plot Alignment
of 1,497 Bacteria by Josh Wilcox, Pace Lab, University of Colorado at
Boulder, for more information on this alignment please contact the Pace Lab
- 1,663 taxa plot alignment Alignment of 1,663 sequences
from the ARB database containing Eucarya, Bacteria, Archaea by Harald
Meier, TU München
- 1,728 taxa plot alignment Alignment of 1,728 Archaea by Chuck
Robertson, Pace Lab, University of Colorado at Boulder
- 2,000 taxa plot alignment Ribosomal
RNA sequences by Gutell Lab, University of Texas at Austin, for more
information on this alignment please contact Robin Gutell
- 2,560 taxa plot alignment upon request via email Kallersjo, M., et al.,
Simultaneous parsimony jackknife analysis of 2538 rbcL DNA sequences
reveals support for major clades of green plants, land plants, seed
plants and flowering plants. Pl. Syst. Evol., 1998. 213: p. 259-287.
- 4,114 taxa plot alignment 16S ribosomal
Actinobacteria RNA sequences, by Usman Roshan, New Jersey Institute of
Technology
- 6,722 taxa plot alignment
Ribosomal RNA sequences by Gutell Lab, University of Texas at Austin,
for more information on
this alignment please contact Robin Gutell
- 7,769 taxa plot alignment Ribosomal
RNA sequences by Gutell Lab, University of Texas at Austin, for more
information on this
alignment please contact Robin Gutell
- 8,780 taxa plot alignment Alignment
of 8,780 sequences from the ARB database containing Eucarya, Bacteria,
Archaea. Original alignment
by Harald Meier, TU München, modified by Usman Roshan, New Jersey
Institute of Technology
- 25,057 taxa plot alignment Alignment of 25,057 Proteobacteria,
by Usman Roshan, New Jersey Institute
of Technology
Old
Alignment Benchmark Set
The old Alignment Benchmark set:
includes some large real-world alignments and best-known trees for
those alignments
ChromatoGate 1.2
A code for analyzing/editing
chromatogram data by Nikos Alachiotis: Windows code available for
download here
A manual with step-by-step instructions is available for download here PDF
ChromatoGate (CG) accelerates the process of detecting potential errors
in DNA sequences that have been introduced/generated by Sanger
sequencing.
To detect errors, CG starts from a multiple sequence alignment instead
of inspecting every sequence and chromatogram separately prior to
alignment.
CG neither aligns nor changes anything in the sequences, that is, it
does not automatically remove potential sequencing errors. It
implements a series of user-controlled steps that are required in the
multiple sequence alignment generation and correction process. During
the alignment generation procedure (relying on any external MSA tool),
the tool gathers information about alignment gaps, trimmed sequence
edges, forward/reversed/consensus sequences, and corrections that have
already been applied to the sequences by the user. Using this collected
information, CG detects and reports chromatogram peaks to the user for
those bases in the sequence alignment that have been identified as
"problematic" based on a user-defined threshold.
AxParafit: Highly optimized and parallelized
version of Parafit
What do the Programs do?
AxParafit and AxPcoords are highly optimized versions of Pierre
Legendre's Parafit
and DistPCoA
programs for statistical analysis of host-parasite coevolution.
AxParafit has also been parallelized with
MPI (Message Passing Interface) for compute clusters. We have used
parallel AxParafit to carry out the largest co-evolutionary
analysis to date for the paper describing the software.
Citing AxParafit & AxPcoords: When publishing results using
AxParafit or AxPcoords please cite the following papers:
If you also used the CopyCat tool in your analyses, please cite:
Manual, Source Code (under GPL),
and Binaries
Libraries required for compiling
fast version:
Results and data from the
paper: An empirical Study of Smut Fungi and their Hosts
Tree Visualization Tool (pretty old)
MrBayes
A hybrid MPI/OpenMP version of
MrBayes v3.1.2 by Alexis Stamatakis and Wayne Pfeiffer
Download
a hybrid MPI/OpenMP parallelization of MrBayes. DNA and protein
models work correctly. You will probably need an Intel compiler
(icc) to
produce fast code. By
using this component you agree to cite it as:
F. Pratas, P. Trancoso, A. Stamatakis ,
L.
Sousa: "Fine-grain
parallelism using Multi-core, Cell/BE, and GPU systems: Accelerating
the Phylogenetic Likelihood Function". Proceedings of ICPP 2009,
accepted for publication, Vienna, Austria, September 2009. PDF
and
F.
Ronquist, J.P. Huelsenbeck "MrBayes 3: Bayesian Phylogenetic
Inference under mixed models", Bioinformatics
19(12):1572-1574, 2003.
Some performance data: PDF