Enabling Research in Evolutionary Biology
PTP (Poisson Tree Processes) is a model for delimiting species on a rooted phylogenetic tree. PTP models speciation (branching) events in terms of the number of substitutions, so it requires only a phylogenetic tree as input, for example the output of RAxML. To be clear, the branch lengths of the input tree should represent the number of substitutions.
In general, if you have single-locus molecular data (such as 16S, 18S, or ITS) and want to delimit species based on these sequences, you can try running PTP on your data.
A close relative of PTP is the GMYC model. GMYC requires an ultrametric tree as input; in other words, you must time-calibrate your phylogenetic tree before using GMYC, which is known to be a difficult task. The most commonly used programs for obtaining an ultrametric tree are BEAST, DPPDIV, and r8s. Note that after calibration, the branch lengths represent time. PTP avoids this error-prone procedure entirely: only a simple phylogenetic tree is required. Our numerous tests show that PTP outperforms GMYC on simulated data, and that PTP results are comparable to those of GMYC on real data sets.
You can find a Python implementation of the single-threshold GMYC model in my GitHub repository. The original R implementation can be downloaded here. Be aware that results from the Python and R implementations might differ slightly because they optimize parameters in different ways. Also note that the input tree should be strictly ultrametric and bifurcating (with no zero branch lengths).
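The input requirements for GMYC stated above (ultrametric, bifurcating, no zero branch lengths) can be checked before running an analysis. The following is a minimal sketch, not part of the PTP or GMYC code: the function names `parse_newick` and `check_gmyc_input` are hypothetical, the parser handles only plain Newick without quoted labels or comments, and the tolerance `tol` used for the ultrametricity test is an assumption you may want to tighten.

```python
import re

def parse_newick(text):
    """Parse a plain Newick string into nested dicts (name, length, children)."""
    tokens = re.findall(r"[(),;]|[^(),;]+", text.strip())
    structural = {"(", ")", ",", ";"}
    pos = 0

    def parse_node():
        nonlocal pos
        node = {"name": "", "length": None, "children": []}
        if tokens[pos] == "(":
            pos += 1  # consume "("
            while True:
                node["children"].append(parse_node())
                if tokens[pos] == ",":
                    pos += 1  # another sibling follows
                    continue
                pos += 1  # consume ")"
                break
        # Optional "label:length" after a leaf or a closing parenthesis.
        if pos < len(tokens) and tokens[pos] not in structural:
            label, _, length = tokens[pos].partition(":")
            node["name"] = label.strip()
            node["length"] = float(length) if length.strip() else None
            pos += 1
        return node

    return parse_node()

def check_gmyc_input(root, tol=1e-6):
    """Return a list of problems; an empty list means all three checks passed."""
    problems = []
    tip_depths = []

    def walk(node, depth, is_root):
        if not is_root:
            if node["length"] is None or node["length"] <= 0:
                problems.append("zero or missing branch length")
            depth += node["length"] or 0.0
        if node["children"]:
            if len(node["children"]) != 2:
                problems.append("non-bifurcating node")
            for child in node["children"]:
                walk(child, depth, False)
        else:
            tip_depths.append(depth)  # root-to-tip distance of this leaf

    walk(root, 0.0, True)
    # Ultrametric: every tip is at the same distance from the root.
    if max(tip_depths) - min(tip_depths) > tol:
        problems.append("not ultrametric")
    return problems
```

For example, `check_gmyc_input(parse_newick("((A:0.1,B:0.1):0.2,C:0.3);"))` returns an empty list, while a tree with unequal root-to-tip distances is reported as not ultrametric.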
PTP can delimit species based on the Phylogenetic Species Concept, so the entities output by PTP are, in theory, species. OTU-picking, by definition, delimits Operational Taxonomic Units. In essence, OTUs are sequence clusters, and OTU-picking methods are clustering algorithms applied to sequences. In some cases species and OTUs coincide, which happens when the population size is small and the birth rate is low. In such cases, species are well separated and natural sequence clusters correspond to species. When sequence clusters do not exist, OTU-picking methods will inevitably fail. However, we show that PTP can still give reasonable results when OTU-picking methods fail.
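To make "clustering applied to sequences" concrete, here is a minimal sketch of greedy, threshold-based OTU-picking. It is an illustration only, not the method of any particular OTU-picking tool and not part of PTP: the function names are hypothetical, the sequences are assumed to be already aligned to equal length, and the 97% identity default is an assumed, commonly quoted cutoff.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first centroid it matches, else start a new OTU."""
    clusters = []  # list of (centroid, members) pairs
    for s in seqs:
        for centroid, members in clusters:
            if identity(s, centroid) >= threshold:
                members.append(s)
                break
        else:
            # No existing centroid is similar enough: s founds a new OTU.
            clusters.append((s, [s]))
    return [members for _, members in clusters]
```

A method of this kind uses only pairwise similarity to a fixed cutoff, which is why it depends on well-separated sequence clusters existing in the data.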
I also implemented an experimental pipeline that can delimit species on NGS data (e.g. 454 sequencing of 16S). It is similar to so-called open-reference OTU-picking. I first run EPA to place the query sequences onto the tree; each placement is then evaluated independently to count the number of species. Please read our paper for details.
Please find the up-to-date code and user manual at my GitHub repository.
A simple web server for PTP is here: http://species.h-its.org/ptp/
The server will accept a phylogenetic tree as input and output the species delimitation results.
For general questions, please post on the PTP Google group.
If you find any bugs or want to discuss anything with me, here is my e-mail: bestzhangjiajie[at]gmail[dot]com. I am here to help!
My name is Jiajie Zhang, and I am currently a PhD student of Prof. Alexandros Stamatakis.