Evolutionary Analysis: We undertook an evolutionary analysis of the zinc finger protein family to help define broad groups of proteins for further structural analysis. The goal of this survey was to determine (a) whether an evolutionary model based on protein sequence produces groupings of functionally related proteins and (b) whether sequence data from outside the zinc finger motifs group proteins with conserved zinc finger structure together.
Identification of sequences: The yeast ZAP1 protein (GI:6322405, accession NP_012479) was used as the query sequence in a BLASTp search (version 2.2.11; database versions as of May 1, 2005). The search returned more than 1,200 hits with an expectation value (E-value) of 1 or less. We attempted to remove redundant entries and smaller splice variants manually, and sampled 500 proteins from the resulting set for further analysis.
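The selection step can be sketched in Python as follows. This is an illustrative sketch only, not the procedure actually used: the function name `select_hits`, the (identifier, E-value) input format, and the fixed random seed are assumptions, and collapsing duplicate identifiers is only a crude stand-in for the manual removal of redundancy and splice variants described above.

```python
import random


def select_hits(hits, max_evalue=1.0, sample_size=500, seed=0):
    """Filter hits by E-value, collapse duplicate identifiers, and sample.

    hits: iterable of (subject_id, evalue) pairs (hypothetical input format).
    Returns a sorted list of at most sample_size unique subject identifiers.
    """
    best = {}
    for subject_id, evalue in hits:
        if evalue > max_evalue:
            continue  # discard hits above the E-value threshold
        # Keep the lowest E-value seen for each identifier.
        if subject_id not in best or evalue < best[subject_id]:
            best[subject_id] = evalue

    ids = sorted(best)
    if len(ids) <= sample_size:
        return ids

    # A fixed seed makes the random sample reproducible.
    rng = random.Random(seed)
    return sorted(rng.sample(ids, sample_size))
```

With the defaults, the function keeps hits with an E-value of 1 or less and returns at most 500 identifiers, matching the thresholds stated in the text.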
Phylogenetic analysis: Unaligned proteins from the zinc finger search were processed with the MEME program (Bailey and Elkan, 1994), as implemented in the Wisconsin Package (version 10.3), to identify conserved motifs within the protein sequences; the "zero or more" setting was used. We identified a set of 8 conserved motifs and used the absolute alignment generated by MEME as input for a distance-based analysis. The distance tree was calculated within the Wisconsin Package (v10.3), using the program distances to compute Kimura-weighted protein distances and the program growtree to build a neighbor-joining phylogenetic tree. A preliminary cladistic analysis of this protein set (PAUP v. 4.0b, heuristic search) showed a similar tree structure, and the major groups did not change in a strict consensus tree. The initial dataset and the complete tree file are available for download.
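For readers unfamiliar with the distance step, the sketch below shows the Kimura (1983) protein distance correction, d = -ln(1 - p - 0.2p^2), where p is the fraction of differing residues in ungapped columns. It is written in Python for illustration and is not the Wisconsin Package implementation; the function names and the choice of "-" and "." as gap characters are assumptions.

```python
import math

GAP_CHARS = "-."  # assumed gap symbols; adjust to the alignment format in use


def kimura_protein_distance(seq_a, seq_b):
    """Kimura-corrected distance between two aligned protein sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # columns with a gap in either sequence are skipped
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")

    p = differing / compared
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0.0:
        # The correction is undefined for very divergent pairs.
        return math.inf
    return -math.log(inner)


def distance_matrix(aligned):
    """Pairwise distances for a dict mapping name -> aligned sequence."""
    names = list(aligned)
    return {
        a: {b: kimura_protein_distance(aligned[a], aligned[b]) for b in names}
        for a in names
    }
```

For example, two sequences that differ at 1 of 4 compared positions (p = 0.25) have a corrected distance of about 0.30. A matrix of this kind is the usual input to a neighbor-joining program.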
Global alignment in clades: Once the major groups were identified, the proteins in each group were globally aligned with pileup (Wisconsin Package, v10.3), using the BLOSUM62 matrix, an affine gap penalty of 10 for opening and 3 for extension, and weighted alignment ends. These alignments were then checked to determine the positions of the zinc finger domains within the proteins, and a majority consensus was generated for the zinc fingers in the proteins. Table XX displays a subset of the functional groups identified in this analysis, which will be the focus of this grant.
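A majority consensus over an existing alignment can be sketched in Python as below. The source does not state the exact consensus rule, so the details here are assumptions: a residue is reported only when it occurs in more than half of all sequences in a column, "x" marks columns without a majority, and "-" marks all-gap columns. The function name `majority_consensus` is hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_seqs, gap_chars="-.", min_fraction=0.5):
    """Return a majority-rule consensus string for aligned sequences."""
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same aligned length")

    consensus = []
    for column in zip(*aligned_seqs):
        residues = [c.upper() for c in column if c not in gap_chars]
        if not residues:
            consensus.append("-")  # column contains only gaps
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # The majority is measured against all sequences, gapped ones included.
        if count / len(column) > min_fraction:
            consensus.append(residue)
        else:
            consensus.append("x")
    return "".join(consensus)
```

Applied to the four toy sequences "CPECGK", "CPECGR", "CAECGK", and "C-ECGS", the function returns "CxECGx": the fully conserved columns are reported, and the two columns where no residue exceeds half of the sequences are masked.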
Results: Five major groups of zinc finger proteins were identified, each with smaller sub-categories (classes) of proteins with more focused function. Group 1 consists largely of proteins with 5 fingers; Groups 2, 3, and 4 consist predominantly of 3-fingered proteins; and Group 5 consists of a disparate set of proteins that includes 4-fingered proteins. The high level of sequence conservation within the zinc finger motifs of each class (and largely within each group) indicates remarkable consistency across the tree, which should enable focused structure/function studies of zinc finger proteins using exemplar protein sequences from each targeted class.