
E-Nertia Global Ente Group




PHYLIP
Phylogeny Inference Package
Version 3.695
April, 2013

by Joseph Felsenstein
Department of Genome Sciences and Department of Biology
University of Washington

address:
Department of Genome Sciences
Box 355065
Seattle, WA 98195-5065
USA

E-mail address: joe (at) gs.washington.edu

Contents of This Document

Contents of This Document
A Brief Description of the Programs
Copyright Notice for PHYLIP
The Documentation Files and How to Read Them
What The Programs Do
Running the Programs
  A word about input files
  Installing a recent version of Oracle Java
  Running the programs on a Windows machine
  Running the programs on a Macintosh with Mac OS X
  Running the programs on a Unix or Linux system
  Running the programs on a Macintosh with Mac OS 8 or 9 (deprecated)
  Running the programs in MSDOS
  Running the Drawgram and Drawtree Java interfaces
  Running the Drawgram and Drawtree Java GUI interfaces in Windows
  Running the programs in background or under control of a command file
  An example (Unix, Linux or Mac OS X)
  Subtleties (in Unix, Linux, or Mac OS X)
  An example (Windows)
  Testing for existence of files
  Prototyping keyboard response files
Preparing Input Files
  Input and output files
  Where the files are
  Data file format
The Menu
The Output File
The Tree File
The Options and How To Invoke Them
  Common options in the menu
  The U (User tree) option
  The G (Global) option
  The J (Jumble) option
  The O (Outgroup) option
  The T (Threshold) option
  The M (Multiple data sets) option
  The W (Weights) option
  The option to write out the trees into a tree file
  The (0) terminal type option
The Algorithm for Constructing Trees
  Local rearrangements
  Global rearrangements
  Multiple jumbles
  Saving multiple tied trees
  Strategy for finding the best tree
A Warning on Interpreting Results
Relative Speed of Different Programs and Machines
  Relative speed of the different programs
  Speed with different numbers of species
  Relative speed of different machines
General Comments on Adapting the Package to Different Computer Systems
Compiling the programs
  Unix and Linux
  On Windows systems
  Compiling with Cygnus Gnu C++
  Compiling with Microsoft Visual C++
  Macintosh
  Compiling with GCC on Mac OS X with our Makefile
  Compiling with GCC on Mac OS X with X Windows
  What about the Metrowerks Codewarrior compiler?
  VMS VAX systems
  Parallel computers
  Other computer systems
  Compiling the Java interfaces
Frequently Asked Questions
  How to make it do various things
  Background information needed:
  Questions about distribution and citation:
  Questions about documentation
  Additional Frequently Asked Questions, or: "Why didn't it occur to you to ..."
  (Fortunately) obsolete questions
New Features in This Version
Coming Attractions, Future Plans
Endorsements
  From the pages of Cladistics
  ... in the pages of other journals:
  ... and in the comments made by users when they register:
References for the Documentation Files
Credits
Other Phylogeny Programs Available Elsewhere
  PAUP*
  MrBayes
  MEGA
  PAML
  Phyml
  RAxML
  TNT
  DAMBE
How You Can Help Me
In Case of Trouble

A Brief Description of the Programs

PHYLIP, the Phylogeny Inference Package, is a package of programs for inferring phylogenies (evolutionary trees). It has been distributed since 1980, and has over 30,000 registered users, making it the most widely distributed package of phylogeny programs. It is available free from its web site. It is available as source code in C, and also as executables for some common computer systems. It can infer phylogenies by parsimony, compatibility, distance matrix methods, and likelihood. It can also compute consensus trees, compute distances between trees, draw trees, resample data sets by bootstrapping or jackknifing, edit trees, and compute distance matrices.
It can handle data that are nucleotide sequences, protein sequences, gene frequencies, restriction sites, restriction fragments, distances, discrete characters, and continuous characters.

Copyright Notice for PHYLIP

The following copyright notice is intended to cover all source code, all documentation, and all executable programs of the PHYLIP package.

Copyright 1980-2013. University of Washington. All rights reserved. Permission is granted to reproduce, perform, and modify these programs and documentation files. Permission is granted to distribute or provide access to these programs provided that this copyright notice is not removed, the programs are not integrated with or called by any product or service that generates revenue, and that your distribution of these documentation files and programs are free. Any modified versions of these materials that are distributed or accessible shall indicate that they are based on these programs. Institutions of higher education are granted permission to distribute this material to their students and staff for a fee to recover distribution costs. Permission requests for any other distribution of these programs should be directed to license (at) u.washington.edu.

The Documentation Files and How to Read Them

PHYLIP comes with an extensive set of documentation files. These include the main documentation file (this one), which you should read fairly completely. In addition there are files for groups of programs, including ones for the molecular sequence programs, the distance matrix programs, the gene frequency and continuous characters programs, the discrete characters programs, and the tree drawing programs. Finally, each program has its own documentation file. References for the documentation files are all gathered together in this main documentation file.
A good strategy is to:

1. Read this main documentation file.
2. Tentatively decide which programs are of interest to you.
3. Read the documentation files for the groups of programs that contain those.
4. Read the documentation files for those individual programs.

There is an excellent guide to using PHYLIP 3.6 also available. It was written by Jarno Tuimala of the Center for Scientific Computing in Espoo, Finland and is available as a PDF. It is also distributed at the main PHYLIP web site.

What The Programs Do

Here is a short description of each of the programs. For more detailed discussion you should definitely read the documentation file for the individual program and the documentation file for the group of programs it is in. In this list the name of each program is a link which will take you to the documentation file for that program. Note that there is no program in the PHYLIP package called PHYLIP.

Clique
Finds the largest clique of mutually compatible characters, and the phylogeny which they recommend, for discrete character data with two states. The largest clique (or all cliques within a given size range of the largest one) are found by a very fast branch and bound search method. The method does not allow for missing data. For such cases the T (Threshold) option of Pars or Mix may be a useful alternative. Compatibility methods are particularly useful when some characters are of poor quality and the rest of good quality, but when it is not known in advance which ones are which.

Consense
Computes consensus trees by the majority-rule consensus tree method, which also allows one to easily find the strict consensus tree. Is not able to compute the Adams consensus tree. Trees are input in a tree file in standard nested-parenthesis notation, which is produced by many of the tree estimation programs in the package.
This program can be used as the final step in doing bootstrap analyses for many of the methods in the package.

Contml
Estimates phylogenies from gene frequency data by maximum likelihood under a model in which all divergence is due to genetic drift in the absence of new mutations. Does not assume a molecular clock. An alternative method of analyzing this data is to compute Nei's genetic distance and use one of the distance matrix programs. This program can also do maximum likelihood analysis of continuous characters that evolve by a Brownian Motion model, but it assumes that the characters evolve at equal rates and in an uncorrelated fashion, so that it does not take into account the usual correlations of characters.

Contrast
Reads a tree from a tree file, and a data set with continuous characters data, and produces the independent contrasts for those characters, for use in any multivariate statistics package. Will also produce covariances, regressions and correlations between characters for those contrasts. Can also correct for within-species sampling variation when individual phenotypes are available within a population.

Dnacomp
Estimates phylogenies from nucleic acid sequence data using the compatibility criterion, which searches for the largest number of sites which could have all states (nucleotides) uniquely evolved on the same tree. Compatibility is particularly appropriate when sites vary greatly in their rates of evolution, but we do not know in advance which are the less reliable ones.

Dnadist
Computes four different distances between species from nucleic acid sequences. The distances can then be used in the distance matrix programs. The distances are the Jukes-Cantor formula, one based on Kimura's 2-parameter method, the F84 model used in Dnaml, and the LogDet distance. The distances can also be corrected for gamma-distributed and gamma-plus-invariant-sites-distributed rates of change in different sites.
Rates of evolution can vary among sites in a prespecified way, and also according to a Hidden Markov model. The program can also make a table of percentage similarity among sequences.

Dnainvar
For nucleic acid sequence data on four species, computes Lake's and Cavender's phylogenetic invariants, which test alternative tree topologies. The program also tabulates the frequencies of occurrence of the different nucleotide patterns. Lake's invariants are the method which he calls "evolutionary parsimony".

Dnaml
Estimates phylogenies from nucleotide sequences by maximum likelihood. The model employed allows for unequal expected frequencies of the four nucleotides, for unequal rates of transitions and transversions, and for different (prespecified) rates of change in different categories of sites, and also use of a Hidden Markov model of rates, with the program inferring which sites have which rates. This also allows gamma-distribution and gamma-plus-invariant sites distributions of rates across sites.

Dnamlk
Same as Dnaml but assumes a molecular clock. The use of the two programs together permits a likelihood ratio test of the molecular clock hypothesis to be made.

Dnamove
Interactive construction of phylogenies from nucleic acid sequences, with their evaluation by parsimony and compatibility and the display of reconstructed ancestral bases. This can be used to find parsimony or compatibility estimates by hand.

Dnapars
Estimates phylogenies by the parsimony method using nucleic acid sequences. Allows use of the full IUB ambiguity codes, and estimates ancestral nucleotide states. Gaps are treated as a fifth nucleotide state. It can also do transversion parsimony. Can cope with multifurcations, reconstruct ancestral states, use 0/1 character weights, and infer branch lengths.

Dnapenny
Finds all most parsimonious phylogenies for nucleic acid sequences by branch-and-bound search.
This may not be practical (depending on the data) for more than 10-11 species or so.

Dollop
Estimates phylogenies by the Dollo or polymorphism parsimony criteria for discrete character data with two states (0 and 1). Also reconstructs ancestral states and allows weighting of characters. Dollo parsimony is particularly appropriate for restriction sites data; with ancestor states specified as unknown it may be appropriate for restriction fragments data.

Dolmove
Interactive construction of phylogenies from discrete character data with two states (0 and 1) using the Dollo or polymorphism parsimony criteria. Evaluates parsimony and compatibility criteria for those phylogenies and displays reconstructed states throughout the tree. This can be used to find parsimony or compatibility estimates by hand.

Dolpenny
Finds all most parsimonious phylogenies for discrete-character data with two states, for the Dollo or polymorphism parsimony criteria using the branch-and-bound method of exact search. May be impractical (depending on the data) for more than 10-11 species.

Drawgram
Plots rooted phylogenies, cladograms, circular trees and phenograms in a wide variety of user-controllable formats. The program is interactive. It has an interface in the Java language which gives it a closely similar menu on all three major operating systems. Final output can be to a file formatted for one of the drawing programs, for a ray-tracing or VRML browser, or one that can be sent to a laser printer (such as Postscript or PCL-compatible printers), on graphics screens or terminals, on pen plotters or on dot matrix printers capable of graphics. Many of these formats are historic so we no longer have hardware to test them. If you find a problem please report it.

Drawtree
Similar to Drawgram but plots unrooted phylogenies. It also has a Java interface for previews.

Factor
Takes discrete multistate data with character state trees and produces the corresponding data set with two states (0 and 1).
Written by Christopher Meacham. This program was formerly used to accommodate multistate characters in Mix, but this is less necessary now that Pars is available.

Fitch
Estimates phylogenies from distance matrix data under the "additive tree model" according to which the distances are expected to equal the sums of branch lengths between the species. Uses the Fitch-Margoliash criterion and some related least squares criteria, or the Minimum Evolution distance matrix method. Does not assume an evolutionary clock. This program will be useful with distances computed from molecular sequences, restriction sites or fragments distances, with DNA hybridization measurements, and with genetic distances computed from gene frequencies.

Gendist
Computes one of three different genetic distance formulas from gene frequency data. The formulas are Nei's genetic distance, the Cavalli-Sforza chord measure, and the genetic distance of Reynolds et al. The first is appropriate for data in which new mutations occur in an infinite isoalleles neutral mutation model, the latter two for a model without mutation and with pure genetic drift. The distances are written to a file in a format appropriate for input to the distance matrix programs.

Kitsch
Estimates phylogenies from distance matrix data under the "ultrametric" model which is the same as the additive tree model except that an evolutionary clock is assumed. The Fitch-Margoliash criterion and other least squares criteria, or the Minimum Evolution criterion are possible. This program will be useful with distances computed from molecular sequences, restriction sites or fragments distances, with distances from DNA hybridization measurements, and with genetic distances computed from gene frequencies.

Mix
Estimates phylogenies by some parsimony methods for discrete character data with two states (0 and 1). Allows use of the Wagner parsimony method, the Camin-Sokal parsimony method, or arbitrary mixtures of these.
Also reconstructs ancestral states and allows weighting of characters (does not infer branch lengths).

Move
Interactive construction of phylogenies from discrete character data with two states (0 and 1). Evaluates parsimony and compatibility criteria for those phylogenies and displays reconstructed states throughout the tree. This can be used to find parsimony or compatibility estimates by hand.

Neighbor
An implementation by Mary Kuhner and John Yamato of Saitou and Nei's "Neighbor Joining Method," and of the UPGMA (Average Linkage clustering) method. Neighbor Joining is a distance matrix method producing an unrooted tree without the assumption of a clock. UPGMA does assume a clock. The branch lengths are not optimized by the least squares criterion but the methods are very fast and thus can handle much larger data sets.

Pars
Multistate discrete-characters parsimony method. Up to 8 states (as well as "?") are allowed. Cannot do Camin-Sokal or Dollo Parsimony. Can cope with multifurcations, reconstruct ancestral states, use character weights, and infer branch lengths.

Penny
Finds all most parsimonious phylogenies for discrete-character data with two states, for the Wagner, Camin-Sokal, and mixed parsimony criteria using the branch-and-bound method of exact search. May be impractical (depending on the data) for more than 10-11 species.

Proml
Estimates phylogenies from protein amino acid sequences by maximum likelihood. The PAM, JTT, or PMB models can be employed, and also use of a Hidden Markov model of rates, with the program inferring which sites have which rates. This also allows gamma-distribution and gamma-plus-invariant sites distributions of rates across sites. It also allows different rates of change at known sites.

Promlk
Same as Proml but assumes a molecular clock.
The use of the two programs together permits a likelihood ratio test of the molecular clock hypothesis to be made.

Protdist
Computes a distance measure for protein sequences, using maximum likelihood estimates based on the Dayhoff PAM matrix, the JTT matrix model, the PMB model, Kimura's 1983 approximation to these, or a model based on the genetic code plus a constraint on changing to a different category of amino acid. The distances can also be corrected for gamma-distributed and gamma-plus-invariant-sites-distributed rates of change in different sites. Rates of evolution can vary among sites in a prespecified way, and also according to a Hidden Markov model. The program can also make a table of percentage similarity among sequences. The distances can be used in the distance matrix programs.

Protpars
Estimates phylogenies from protein sequences (input using the standard one-letter code for amino acids) using the parsimony method, in a variant which counts only those nucleotide changes that change the amino acid, on the assumption that silent changes are more easily accomplished.

Restdist
Distances calculated from restriction sites data or restriction fragments data. The restriction sites option is the one to use to also make distances for RAPDs or AFLPs.

Restml
Estimation of phylogenies by maximum likelihood using restriction sites data (not restriction fragments but presence/absence of individual sites). It employs the Jukes-Cantor symmetrical model of nucleotide change, which does not allow for differences of rate between transitions and transversions. This program is very slow.

Retree
Reads in a tree (with branch lengths if necessary) and allows you to reroot the tree, to flip branches, to change species names and branch lengths, and then write the result out.
Can be used to convert between rooted and unrooted trees, and to write the tree into a preliminary version of a new XML tree file format which is under development and which is described in the Retree documentation web page.

Seqboot
Reads in a data set, and produces multiple data sets from it by bootstrap resampling. Since most programs
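
As a rough illustration of the bootstrap resampling mentioned in the Seqboot entry, the sketch below draws alignment sites (columns) with replacement to build replicate data sets of the same size as the original. It is written in Python for brevity and is not PHYLIP's own code. The function names, the dictionary-of-strings alignment, and the toy sequences are all assumptions made for this example, and it does not read or write PHYLIP's file formats.

```python
import random


def bootstrap_columns(alignment, rng):
    """Return one replicate by sampling sites (columns) with replacement.

    `alignment` is a hypothetical in-memory layout: a dict mapping
    species name -> aligned sequence string, all of equal length.
    """
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    if any(len(alignment[name]) != n_sites for name in names):
        raise ValueError("all sequences must have the same length")
    # Draw n_sites column indices with replacement; the same indices are
    # applied to every species so that each sampled column stays intact.
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}


def bootstrap_replicates(alignment, n_replicates, seed=1):
    """Return a list of `n_replicates` resampled data sets."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [bootstrap_columns(alignment, rng) for _ in range(n_replicates)]


# Toy data, invented for this example.
toy = {"Alpha": "AACGTGGCCA", "Beta": "AAGGTCGCCA", "Gamma": "CATTTCGTCA"}
replicates = bootstrap_replicates(toy, 3)
for number, replicate in enumerate(replicates, start=1):
    print("replicate", number)
    for name, sequence in replicate.items():
        print(f"{name:<10}{sequence}")
```

Each replicate keeps the species and the number of sites unchanged; only the mix of columns differs. Analyzing every replicate with a tree-building method and then summarizing the resulting trees with a consensus method, as the Consense entry describes, is the usual way such replicates are used.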

