PAML
Introduction
"PAML (for Phylogenetic Analysis by Maximum Likelihood) contains a few programs for model fitting and phylogenetic tree reconstruction using nucleotide or amino-acid sequence data." (doc/pamlDOC.pdf)
Version
Version 4.4d is currently available at OSC.
Availability
PAML is available on the Glenn Cluster.
Usage
On the Glenn Cluster, PAML is accessed by executing the following commands:
module load biosoftw
module load paml
PAML is a collection of several programs that will be added to the user's PATH: baseml, basemlg, chi2, codeml, ds, evolver, mcmctree, pamp, and yn00. Each program has its own, but typically similar, usage and options.
Options
baseml / basemlg Maximum likelihood analysis of nucleotide sequences: baseml uses a faster discrete model; basemlg implements the (continuous) gamma model of Yang and is computationally intensive
Both baseml and basemlg require a baseml.ctl in the current directory with the following variables set: seqfile, outfile, treefile
The following are optional variables to set in baseml.ctl:
noisy, verbose, runmode, model, Mgene, ndata, clock, fix_kappa, kappa, fix_alpha, alpha, Malpha, ncatG, fix_rho, nparK, nhomo, getSE, RateAncestor, Small_Diff, cleandata, icode, fix_blength, method
chi2 Calculates the χ² critical value and p-value for conducting the likelihood ratio test
chi2 [p | INTEGER DOUBLE]
chi2 prints χ² critical values at set significance levels until ‘q+ENTER’ is entered
chi2 p interactively prompts for the degrees of freedom and the χ² value
chi2 INTEGER DOUBLE computes the probability for INTEGER degrees of freedom and a χ² value of DOUBLE
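The statistic passed to chi2 in a likelihood ratio test is twice the difference in log-likelihood between two nested models. A minimal sketch of that arithmetic follows, assuming hypothetical log-likelihood values (lnL0, lnL1) and one degree of freedom; in practice these values would be taken from the baseml or codeml output.

```shell
# Hypothetical log-likelihoods for the null (lnL0) and alternative (lnL1) models.
lnL0=-1234.56
lnL1=-1230.12
df=1   # assumed difference in the number of free parameters

# Likelihood ratio statistic: 2 * (lnL1 - lnL0)
lrt=$(awk -v a="$lnL0" -v b="$lnL1" 'BEGIN { printf "%.2f", 2 * (b - a) }')
echo "LRT statistic: $lrt (df = $df)"

# chi2 INTEGER DOUBLE reports the probability for this df and statistic.
# Skipped when chi2 is not on the PATH (e.g. the paml module is not loaded).
if command -v chi2 >/dev/null 2>&1; then
    chi2 "$df" "$lrt"
fi
```

With these placeholder values the statistic is 8.88, which exceeds the 3.84 used with one degree of freedom in the job script in the Example section.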
codeml Implements the codon substitution model of Goldman & Yang for DNA and amino acid sequences
codeml requires codeml.ctl to be located in the current directory with the following variables set:
seqfile, outfile, treefile, aaRatefile
The following are optional variables to set in codeml.ctl:
noisy, verbose, runmode, seqtype, CodonFreq, ndata, aaDist, model, NSsites, icode, Mgene, fix_kappa, kappa, fix_omega, omega, fix_alpha, alpha, Malpha, ncatG, getSE, RateAncestor, Small_Diff, cleandata, fix_blength, method
ds Computes descriptive statistics from a baseml/basemlg analysis
ds filename.type
evolver Simulates sequences under nucleotide, codon, and amino acid substitution models; generates random trees; and calculates the partition distances between trees
EVOLVER in paml version 4.4d, March 2011
Results for options 1-4 & 8 go into evolver.out
Options
(1) Get random UNROOTED trees?
(2) Get random ROOTED trees?
(3) List all UNROOTED trees?
(4) List all ROOTED trees?
(5) Simulate nucleotide data sets (use MCbase.dat)?
(6) Simulate codon data sets (use MCcodon.dat)?
(7) Simulate amino acid data sets (use MCaa.dat)?
(8) Calculate identical bi-partitions between trees?
(9) Calculate clade support values (read 2 treefiles)?
(11) Label clades?
(0) Quit?
evolver’s option 5 requires MCbase.dat. evolver’s option 6 requires MCcodon.dat. evolver’s option 7 requires MCaa.dat and dat/mtmam.dat. evolver’s option 9 requires truetree rst1 (formed from stewart.trees & codeml's output rst1). evolver’s option 11 requires name.tress with user input.
mcmctree Implements the Bayesian MCMC algorithm of Yang and Rannala for estimating species divergence times
mcmctree requires mcmctree.ctl to be located in the current directory with the following variables set:
seqfile, treefile, outfile, RootAge, usedata
The following are optional variables to set in mcmctree.ctl:
seed, ndata, clock, model, alpha, ncatG, cleandata, BDparas, kappa_gamma, alpha_gamma, rgene_gamma, sigma2_gamma, finetune, print, burnin, sampfreq, nsample
pamp Implements the parsimony-based analysis of Yang and Kumar
pamp requires pamp.ctl to be located in the current directory with the following variables set:
seqfile, treefile, outfile
The following are optional variables to set in pamp.ctl:
seqtype, ncatG, nhomo
yn00 Implements the method of Yang and Nielsen for estimating synonymous and nonsynonymous substitution rates in pairwise comparisons of protein-coding DNA sequences
yn00 requires yn00.ctl to be located in the current directory with the following variables set:
seqfile, outfile
The following are optional variables to set in yn00.ctl:
verbose, icode, weighting, commonf3x4, ndata
Control Files
In all .ctl files (baseml.ctl, codeml.ctl, mcmctree.ctl, pamp.ctl, and yn00.ctl), comment lines start with '*'.
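As an illustration, the sketch below writes a minimal baseml.ctl that sets only the three required variables, one "name = value" pair per line. The file names example.nuc, example.trees, and example.out are placeholders chosen for this sketch, not files shipped with PAML, and running it overwrites any baseml.ctl already in the current directory.

```shell
# Write a minimal baseml.ctl; lines starting with '*' are comments.
# All file names below are placeholders (assumptions), not PAML-provided files.
cat > baseml.ctl <<'EOF'
* minimal control file sketch for baseml
seqfile = example.nuc
treefile = example.trees
outfile = example.out
EOF

# Show only the active settings by dropping the comment lines.
grep -v '^\*' baseml.ctl
```

Optional variables such as model or clock would be added as further lines in the same layout; their accepted values are described in the PAML documentation.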
Example
#PBS -N paml_test
#PBS -l walltime=0:05:00
#PBS -l nodes=1:ppn=4
cd $PBS_O_WORKDIR
module load biosoftw
module load paml
export PAML_DIR=/usr/local/biosoftw/paml44
cp $PAML_DIR/*.* .
cp -r $PAML_DIR/dat .
cp -r $PAML_DIR/examples .
baseml
chi2 1 3.84
codeml
ds in.baseml
echo -e "1\n5\n5 5\n0\n2\n5\n5 5\n0\n3\n5\n4\n5\n5\n6\n7\n8\n" | evolver
mcmctree
pamp
yn00
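Because each program reads its control file from the current directory, a job like the one above stops being useful if a .ctl file was not copied. A minimal sketch of a pre-flight check that could run before the program calls is shown here; check_ctl is a hypothetical helper name, not part of PAML, and the list of files follows the Options section.

```shell
# Report which PAML control files are missing from a directory.
# check_ctl is a hypothetical helper; basemlg shares baseml.ctl with baseml.
check_ctl() {
    dir=${1:-.}
    missing=0
    for ctl in baseml.ctl codeml.ctl mcmctree.ctl pamp.ctl yn00.ctl; do
        if [ ! -f "$dir/$ctl" ]; then
            echo "missing: $ctl"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every control file was found.
    [ "$missing" -eq 0 ]
}

# Example use: warn instead of running the programs blindly.
check_ctl "$PWD" || echo "copy or create the control files before running PAML"
```

The function only tests for the files' presence; it does not validate that seqfile, treefile, or outfile are set inside them.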
Documentation
Four PDF documents are located in the following folder on Glenn: /usr/local/biosoftw/paml44/doc/
An online discussion group for PAML users is located at the following website: http://www.rannala.org/phpBB2/