PlantTribes - Home

With rapidly growing numbers of whole genome and expressed sequence tag (EST) sequences in our public databases, sequence-based protein classification systems are providing foundations for gene annotation, functional genomics, and comparative investigations of gene and genome evolution.

PlantTribes is an objective classification system for plant proteins based on cluster analyses of the inferred proteomes of the sequenced angiosperms Arabidopsis thaliana v. Columbia (TAIR, version 7), Oryza sativa v. japonica (rice; TIGR, version 5), and Populus trichocarpa (poplar; JGI, version 1.0). Sequence data for Carica papaya (papaya v. 1.0) and Medicago truncatula (barrel medic, 60% complete; IMGA, version 1.0) are also included in PlantTribes v. 1.0. In addition to the genome-based tribe scaffold, unigenes from the TIGR Transcript Assemblies of more than 200 plant and algal species have been associated with each tribe (see documentation), resulting in a global classification of about 4 million putative plant protein sequences.

PlantTribes 1.0 incorporates an extensive collection of expression data from Arabidopsis microarray experiments (see documentation). Expression data are linked to the individual genes in PlantTribes and can be accessed through any result that includes Arabidopsis gene sequences.

PlantTribes uses the similarity-based clustering procedure TribeMCL (Enright et al., 2002, 2003) to classify protein-coding genes into putative gene families. MCL classifications have been constructed at three clustering stringencies, allowing the user to explore the stability of the protein classification. A second round of MCL clustering identifies SuperTribes that approximate objective superfamilies. PlantTribes also includes information about domains, traditional gene family names, and a unified nomenclature based on common terms.
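
To illustrate how a clustering stringency changes an MCL classification, the following is a minimal pure-Python sketch of the Markov clustering idea (alternating expansion and inflation of a column-stochastic matrix). It is not the mcl program or the PlantTribes pipeline: the function names, the toy similarity matrix, and the inflation values are assumptions chosen for illustration, and real TribeMCL runs start from BLAST-derived similarity scores rather than hand-written weights.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / s if s else 0.0
    return out


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation step: raise entries to the power r, then renormalize."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(similarity, inflation, iterations=100, tol=1e-9):
    """Toy Markov clustering; returns clusters as sorted lists of node indices.

    `inflation` plays the role of the clustering stringency (hypothetical
    values here, not the settings used to build PlantTribes).
    """
    n = len(similarity)
    m = [list(row) for row in similarity]
    for i in range(n):
        m[i][i] = max(m[i][i], 1.0)  # self-loops stabilize the iteration
    m = normalize_columns(m)
    for _ in range(iterations):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row that still carries flow is an attractor; the columns it
    # reaches form one cluster. Duplicate supports collapse in the set.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical similarity graph: two tight groups of three proteins
# (0-2 and 3-5) joined by a single weak hit between proteins 2 and 3.
similarity = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]

# Illustrative low, medium, and high stringencies (assumed values).
for inflation in (1.2, 2.0, 5.0):
    print(inflation, mcl(similarity, inflation))
```

Comparing the clusters returned at each inflation value shows, in miniature, how a user might judge whether a putative gene family is stable across stringencies. Higher inflation generally favors smaller, tighter clusters.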

Phylogenetic analyses of exemplar gene families show a strong but not perfect correspondence between tribe membership and cladistic relationships. The results of these analyses provide insights into the Arabidopsis, rice, and poplar genomes, gene family evolution, and the evolutionary dynamics of functional domains among gene families. In addition, the resulting classification schemes provide scaffolds for sorting protein sequences from other plant species.

How to cite PlantTribes - We hope you find PlantTribes useful in your research and teaching. Please cite the following paper, which includes many technical details about the construction of the database and the user interface:

Wall, P.K., J. Leebens-Mack, K. Müller, D. Field, N. Altman and C.W. dePamphilis. (2008) PlantTribes: A gene and gene family resource for comparative genomics in plants. Nucleic Acids Research, 36:D970-976.

Please see the Documentation page for help, and send comments, suggestions, questions, and bug reports to Claude dePamphilis (cwd3 at psu dot edu) or Eric Wafula (ekw10 at psu dot edu).