Evolution of substrate specificities in the P-type ATPase superfamily

Journal of Molecular Evolution (1998) 46, 84-101

Kristian B. Axelsen¹,²,*, Michael G. Palmgren¹

¹Department of Plant Physiology, Institute of Molecular Biology, Copenhagen University, Øster Farimagsgade 2A, DK-1353 Copenhagen K, Denmark

²Department of Plant Biology, Royal Veterinary and Agricultural University, Thorvaldsensvej 40, DK-1871 Frederiksberg C, Denmark

*Correspondence should be addressed to K.B.A.
e-mail: kax[at]life.ku.dk

ABSTRACT

P-type ATPases make up a large superfamily of ATP-driven pumps involved in the trans-membrane transport of charged substrates. We have performed an analysis of conserved core sequences in 159 P-type ATPases. The various ATPases group together in five major branches according to substrate specificity, and not according to the evolutionary relationship of the parental species, indicating that the invention of new substrate specificities is accompanied by abrupt changes in the rate of sequence evolution. A hitherto unrecognized family of P-type ATPases has been identified that is expected to be represented in all the major phyla of eukarya.

Key words: Na⁺ - K⁺ - Cu²⁺ - Ca²⁺ - H⁺ - Mg²⁺ - Cd²⁺ - Phospholipid translocases

INTRODUCTION

P-type ATPases comprise a ubiquitous family of proteins involved in the active pumping of charged substrates across biological membranes (Møller et al. 1996). Their distinguishing feature is the formation of a phosphorylated intermediate during the reaction cycle (hence P-type). P-type ATPases of various substrate specificities have several functions. In animals they provide, for example, the basis for action potentials in nervous tissues, secretion and reabsorption of solutes in the kidneys, acidification of the stomach, nutrient absorption in the intestines, relaxation of muscles, and Ca2+ dependent signal transduction.

Several reports have speculated about the relationships among the various P-type ATPases (Serrano 1988; Fagan and Saier 1994; Møller et al. 1996), but the number of proteins included in these analyses has been relatively low. Since the last detailed investigation (Fagan and Saier 1994), at least one new class of P-type ATPases has been discovered (Auland et al. 1994; Tang et al. 1996), systematic sequencing efforts have provided the complete sequences of several genomes (Fleischmann et al. 1995; Fraser et al. 1995; Bult et al. 1996; Goffeau et al. 1997; Himmelreich et al. 1996; O'Brien 1997) and a substantial amount of sequence information has been obtained from other organisms. This has made it possible to analyse the P-type ATPase content of living organisms from all three domains of life and to establish the presence of five major groups of P-type ATPases.

METHODS

Database Searches

The EMBL database (release 51), SWISS-PROT (release 34) and expressed sequence tag (EST) databases (EMEST and GBEST, until December 15, 1996) were screened using the FASTA, TFASTA and BLAST programs of the Wisconsin package, version 8 (Program Manual for the Wisconsin Package, Version 8, September 1994, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin, USA 53711). The genomes of Mycoplasma genitalium, Haemophilus influenzae, and Methanococcus jannaschii were accessed via the world wide web at The Institute for Genome Research, the genome of Synechocystis PCC6803 at the Kazusa DNA Research Institute (KDRI), and the genome of Escherichia coli at the E. coli Genome Project at University of Wisconsin-Madison and at the E. coli Databank.

Identification of P-type ATPase Sequences

The initial screen of the EMBL and SWISS-PROT databases was performed with the PROSITE consensus sequence for P-type ATPases DKTG[T,S][L,I,V,M][T,I]. All but 13 of the sequences obtained matched the sequence DKTGTLT; 12 sequences matched DKTGTIT (six mammalian and two bacterial heavy metal-transporting ATPases, three bacterial KdpB ATPases involved in the transport of K+, and the CTA3 ATPase of Schizosaccharomyces pombe, believed to transport Ca2+) and one matched DKTGTII (a human Cu2+-ATPase; Bull et al. 1993). In addition, all matches contained most or all of ten conserved sequence motifs previously identified in a number of P-type ATPases (Serrano 1988; Møller et al. 1996). Several of these conserved regions were used to re-screen the databases to find possible P-type ATPases that do not match the DKTGT[L,I][T,I] sequence. Using this method, no further sequences encoding putative P-type ATPases were found. The screening of the databases revealed a total of 211 independent full-length open reading frames encoding P-type ATPases.
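As an illustration of the motif screen described above, the consensus DKTG[T,S][L,I,V,M][T,I] can be written as a regular expression. This is only a minimal sketch: the function name and the toy sequences are hypothetical, and the actual screen was run with the Wisconsin package programs, not with this code.

```python
import re

# Regular-expression form of the consensus quoted in the text:
# DKTG[T,S][L,I,V,M][T,I]
P_TYPE_MOTIF = re.compile(r"DKTG[TS][LIVM][TI]")


def find_motif(sequence):
    """Return (start, matched_text) for every motif hit in a protein sequence."""
    return [(m.start(), m.group()) for m in P_TYPE_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequences, not real database entries.
toy_sequences = {
    "toy_LT": "MAAICSDKTGTLTQNRM",
    "toy_IT": "GGVDKTGTITEGKP",
    "toy_none": "MKKLLPTAAAGLLLLA",
}

hits = {name: find_motif(seq) for name, seq in toy_sequences.items()}
for name, found in hits.items():
    print(name, found)
```

Start positions are zero-based. A sequence without the motif simply yields an empty list, which corresponds to the case that prompted the re-screen with other conserved regions.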

Construction of Phylogenetic Trees

Alignments of the sequences were performed with the ClustalW program (Thompson et al. 1994). Phylogenetic analyses were performed with the Protdist and Fitch (Fitch-Margoliash and least-squares method) programs of the PHYLIP package (Felsenstein 1989). To make sure that the phylogenetic relationship found was not skewed because only partial sequences were used in the analysis, the full-length proteins of the members of the different families were aligned and the alignments were inspected for similarities outside the core regions. The closer relationship between members of the families resulting from the phylogenetic analysis could be confirmed by these alignments.

Analysis of the Expressed Sequence Tag (EST) Databases

The GBEST and EMEST databases as well as the TIGR non-redundant EST database were screened with all known cDNA sequences and with the coding regions of genomic sequences encoding P-type ATPases in the organisms analysed. In this way ESTs identical to already cloned P-type ATPases in that organism were identified. In addition, 25 full-length protein sequences from all branches of the evolutionary tree in Fig. 6 were searched against the EST databases. EST sequences that showed at least 20% sequence identity within a stretch of 30 amino acid residues to any one sequence were translated and searched against all databases. EST sequences showing significant homology only to P-type ATPases were subsequently extracted. These EST sequences were compared to the sequences of all cloned P-type ATPases. In this way the P-type ATPase to which each of the EST sequences had the closest resemblance was identified.
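The identity criterion above (at least 20% identity within a stretch of 30 residues) can be sketched as a sliding window over a pairwise alignment. The text does not state how the stretch was evaluated, so the window over alignment columns, the treatment of gaps as mismatches, and both function names are assumptions made for illustration.

```python
def best_window_identity(aligned_a, aligned_b, window=30):
    """Highest fraction of identical residues in any run of `window` alignment columns.

    Columns containing a gap ('-') are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = [int(x == y and x != "-") for x, y in zip(aligned_a, aligned_b)]
    if len(matches) < window:
        return 0.0
    current = sum(matches[:window])
    best = current
    # Slide the window one column at a time, updating the match count.
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return best / window


def passes_est_filter(aligned_a, aligned_b, window=30, threshold=0.20):
    """True if some window reaches the identity threshold quoted in the text."""
    return best_window_identity(aligned_a, aligned_b, window) >= threshold
```

With these defaults, six identical residues in a 30-column window are enough to pass, which shows how permissive the first screening step is and why the subsequent search of the translated ESTs against all databases was needed.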

RESULTS

Conserved Core Sequences in P-type ATPases

A total of 211 P-type ATPase sequences were identified in the databases. Many sequences showed less than 15% identity to each other and varied considerably in length, from 642 amino acid residues in a heavy metal (HM) ATPase from Synechocystis PCC6803 (seq. 27 in Table 1; Fig. 1) to 1956 residues in a gene from Plasmodium falciparum of unknown function (seq. 4 in Table 1; Fig. 1). Due to large differences in primary structure and low overall similarity, a complete multiple alignment of all full-length P-type ATPases could not be obtained. However, we identified in the partial multiple alignment eight regions (A - H; Figs. 2 and 3), comprising a total of 265 amino acids, that always aligned and in which only minimal deletions and insertions were found. The KGAP motif found in a variety of P-type ATPases (Serrano 1988) cannot be unambiguously detected in HM-ATPases (Møller et al. 1996) and was not considered a core region.

The eight conserved regions, defining the core of the P-type ATPase superfamily, were extracted from the various P-type ATPases and arranged in a linear sequence (Fig. 1). The number of sequences was subsequently reduced to avoid overrepresentation of closely related sequences. In the resulting dataset, containing 159 core sequences, no P-type ATPase represented was more than 90% identical (when comparing full-length proteins) to any other member of the set. The excluded P-type ATPases are listed in Table 2.
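The redundancy reduction described above can be sketched as a greedy filter that keeps a sequence only if it is not more than 90% identical to any sequence already kept. The text does not say how the retained representatives were chosen, so the greedy order, the identity definition (identical residues over columns where both sequences have a residue), and the function names are assumptions.

```python
def percent_identity(a, b):
    """Percent identical residues over columns where both aligned sequences have a residue."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def reduce_redundancy(aligned, max_identity=90.0):
    """Greedily keep sequences that are at most `max_identity` % identical to every kept one.

    `aligned` maps names to equal-length aligned sequences; input order decides
    which member of a closely related pair is retained.
    """
    kept = {}
    for name, seq in aligned.items():
        if all(percent_identity(seq, other) <= max_identity for other in kept.values()):
            kept[name] = seq
    return kept


# Hypothetical toy alignment, not real ATPase sequences.
toy = {
    "a": "ACDEFGHIKL",
    "b": "ACDEFGHIKV",  # exactly 90% identical to "a": retained
    "c": "ACDEFGHIKL",  # identical to "a": excluded
    "d": "WWWWWWWWWW",
}
kept = reduce_redundancy(toy)
print(list(kept))
```

Because the filter is order-dependent, a different input order can retain a different representative of each closely related group while still satisfying the 90% limit.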

A Phylogenetic Tree of P-type ATPases

A total of 14 phylogenetic trees were produced using the same dataset of sequences, but presenting them in different orders. Among these, 12 were identical with respect to the position of the major branches of the tree. A simplified version of the different trees obtained is shown in Fig. 4. All trees reveal five main branches, designated Type I - V ATPases, with a number of minor families. Amino acids within core segments conserved in the various branches of P-type ATPases are indicated in Fig. 5. The main difference between the two types of trees presented in Fig. 4 is the position of Type V ATPases. Although Type IV and Type V ATPases group together in most trees, this is not always the case, and therefore these groups were not considered to belong to the same type. In contrast, Type IA and IB ATPases as well as Type IIIA and IIIB ATPases always grouped together and were therefore considered to be of monophyletic origin and belonging to the same type. The phylogenetic tree that appeared most times (12 times out of 14) is shown in Fig. 6 and is discussed below.

The first main branch (Type IB ATPases) encompasses HM-ATPases and constitutes a large, rather distantly related group of proteins. It includes at least two families, one of which contains all known Cu2+-ATPases and the other includes four putative Cd2+-ATPases (seq. 23-26; Fig. 6). The KdpB proteins of Escherichia coli, Synechocystis PCC6803, and Mycobacterium tuberculosis (seq. 20-22; Fig. 6), all of which are involved in K+ transport, form a family of their own (Type IA ATPases) that is closely related to HM-ATPases.

The second branch (Type II ATPases) is split into several families exhibiting various levels of conservation. Most ATPases in this branch are believed to transport Ca2+ and are divided into two relatively distant families (Type IIA and Type IIB ATPases). Type IIA ATPases also include five putative Ca2+-ATPases from three bacterial species that form two separate groups in this family (seq. 130-131 and 153-155; Fig. 6). Two families, Na+/K+ and H+/K+-ATPases, are very closely related and are designated Type IIC ATPases. In addition, a small family of fungal ATPases (Type IID ATPases) and the solitary P-type ATPases in Mycoplasma genitalium and Mycoplasma pneumoniae of unknown function (seq. 156-157; Fraser et al. 1995; Himmelreich et al. 1996) form separate groups in the branch of Type II ATPases.

The third branch (Type IIIA ATPases) covers plasma membrane H+-ATPases. A small group of Mg2+-ATPases from two bacterial species (Type IIIB ATPases) group together with the H+-ATPases.

The fourth branch (Type IV ATPases) constitutes a recently discovered family of enzymes found only in eukarya, some of which (e.g. seq. 14; Table 1) have been shown to be involved in the transport of aminophospholipids (Auland et al. 1994; Tang et al. 1996). A group of P-type ATPases (Type V ATPases) having no assigned specificity constitutes the fifth branch of the phylogenetic tree. This distinct group of eukaryotic ATPases has three members from Caenorhabditis elegans, two members from Saccharomyces cerevisiae and one member each from P. falciparum and Tetrahymena thermophila. A characteristic sequence feature of the Type V ATPases is the PPxxP motif in region D (Fig. 5). These pumps have not been identified as a group before.

Number and Nature of P-type ATPases in Fully Sequenced Bacteria and Archaea

Synechocystis PCC6803, Escherichia coli and Mycobacterium tuberculosis are phylogenetically rather distant bacterial species belonging to cyanobacteria, proteobacteria and Gram-positive bacteria, respectively (Olsen et al. 1994). Still, they have a comparable profile and number of P-type ATPases: nine and four genes are found in the fully sequenced bacteria Synechocystis PCC6803 and E. coli, respectively, and ten P-type ATPases are cloned from the partially sequenced bacterium M. tuberculosis (Table 3). This suggests that a common trait for bacteria is having several P-type ATPases. The P-type ATPases of these organisms fall into three main classes: Type IA ATPases (one each in the three organisms; Fig. 6), Type IB ATPases (five in M. tuberculosis, four in Synechocystis, and two in E. coli), and Type IIA ATPases (one in M. tuberculosis and three in Synechocystis). A number of bacterial P-type ATPases have no or very few close relatives among other ATPases and thus form their own branches. This is true for three P-type ATPases from M. tuberculosis (function unknown; seq. 19, 158, and 159; Fig. 6), the Type IIIB ATPases of E. coli and Salmonella typhimurium (seq. 58-60; Fig. 6), and one of the P-type ATPases from Synechocystis (seq. 96; Fig. 6), which is most closely related to Type IIC ATPases.

Haemophilus influenzae, although very closely related to E. coli (both are in the gamma subdivision of proteobacteria; Olsen et al. 1994), has only a single P-type ATPase (Table 3). The same is true for M. genitalium and M. pneumoniae (Table 3), two very closely related Gram-positive bacteria. The solitary P-type ATPases in H. influenzae and M. genitalium (or M. pneumoniae) exhibit very low similarity to each other and group in distinct families of the phylogenetic tree of ATPases (Type IB ATPases and Type II ATPases, respectively; Table 3; Fig. 6). H. influenzae, M. genitalium and M. pneumoniae live in humans in constant environmental conditions. This may explain why some bacteria in special niches can do without certain P-type ATPases and might have eliminated superfluous ATPases with time.

The archaeon Methanococcus jannaschii possesses only a single P-type ATPase, which shows high similarity to Type IIIA ATPases. Whether the simple genome of M. jannaschii with respect to P-type ATPases is an exception among archaea cannot be determined before more archaeal genomes have been sequenced.

P-type ATPases in the Yeast Saccharomyces cerevisiae

As many as 16 P-type ATPase genes are found in the genome of the unicellular yeast S. cerevisiae (Tables 1, 2, and 3). These belong to all five main branches of the phylogenetic tree (Fig. 6). Single yeast sequences belong to Type IIA and IIB Ca2+-ATPases and two distantly related sequences are located in each of the branches of Type I and Type V ATPases. As many as five Type IV ATPases are present, two of which are closely related. In addition, the S. cerevisiae genome contains three Type IID ATPases and two Type IIIA ATPases; the members of each group are rather similar in sequence and form small isoenzyme subfamilies.

P-type ATPases in Higher Organisms Not Yet Fully Sequenced

Analysis of expressed sequence tags (ESTs) provides a means to study the number and nature of genes in higher eukarya. Twenty-five eukaryotic sequences from all families of the phylogenetic tree were used to search the EST databases of Arabidopsis thaliana, C. elegans, and Homo sapiens for expressed P-type ATPases with similarity to already cloned pumps (Table 4). In all three organisms P-type ATPases were found that belong to the families of Type IB, IIA, IIB, IV, and V. There were, however, also marked differences between the single plant and the two animal (invertebrate and vertebrate, respectively) P-type ATPase related ESTs. Strikingly, in the EST database several related A. thaliana sequences showed similarity to Type IIIA ATPases (H+-pumps; Table 4), but no sequence showed similarity to Type IIC ATPases (Na+/K+- and H+/K+-pumps). On the contrary, several C. elegans and H. sapiens EST sequences were similar to Type IIC ATPases (Table 4), but no EST sequence from these organisms resembled Type IIIA ATPases.

Due to the low general quality of the sequences in the EST databases it was in some cases difficult to ascertain whether two sequences were actually identical or not. With these uncertainties in mind, there are, in the EST databases, between 18 and 27, 19 and 23, and 21 and 34 different P-type ATPase sequences represented from A. thaliana, C. elegans and H. sapiens, respectively (Table 4). A. thaliana appears to harbour at least 10 P-type H+-ATPases (Harper et al. 1994), five of which are represented by ESTs in the database (Table 4). This suggests that far from all ATPases are present in the EST databases. A recent estimate suggests that about half of all Arabidopsis genes are represented by an EST (Rounsley et al. 1996).

Residues Determining Ion Specificity in Two Closely Related Families

Na+/K+-ATPases and H+/K+-ATPases (Type IIC ATPases; Fig. 6) are the two most closely related P-type ATPases having distinct ion specificities and most likely represent a recent development. We may assume that amino acids determining Na+ specificity are conserved in all Na+/K+-ATPases but may not be essential for the function of H+/K+-ATPases. Likewise, amino acids determining H+ specificity are probably conserved in all H+/K+-ATPases. A comparison between all members of the two families (Tables 1 and 2) reveals that amino acids at several positions are conserved in all Na+/K+-ATPases, but may vary in H+/K+-ATPases, and vice versa (Fig. 2). These amino acids are not randomly distributed in the sequences and only a few overlap with the universally conserved core regions (Fig. 2). Rather, they appear to be clustered in the trans-membrane segments (numbers 3, 6, 8, 9, and 10; Fig. 2) and in the stalk regions (Møller et al. 1996; Fig. 2) connecting the cytoplasmic domains with the membrane-spanning segments. The sole criterion in this study for defining the various types of P-type ATPases has been the phylogenetic relationship between the conserved segments defined in Fig. 3.
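The comparison described above, in which positions are conserved throughout one family but not shared by the other, can be sketched as a column-wise scan of two aligned families. The function names and the toy alignments are hypothetical, and the sketch assumes that both families come from one common alignment so that column indices correspond.

```python
def invariant_residue(column):
    """Return the residue if every sequence shares it at this column, else None."""
    residues = set(column)
    if len(residues) == 1 and "-" not in residues:
        return next(iter(residues))
    return None


def family_specific_positions(family_a, family_b):
    """Columns invariant in family_a where family_b does not share that invariant residue.

    Both arguments are lists of equal-length sequences taken from one alignment.
    Returns (column_index, residue_in_family_a) pairs, zero-based.
    """
    positions = []
    for i, (col_a, col_b) in enumerate(zip(zip(*family_a), zip(*family_b))):
        res_a = invariant_residue(col_a)
        if res_a is not None and invariant_residue(col_b) != res_a:
            positions.append((i, res_a))
    return positions


# Hypothetical toy families, not real Na+/K+- or H+/K+-ATPase sequences.
family_a = ["ACDE", "ACDE", "ACDE"]
family_b = ["ACQE", "ASQE"]

print(family_specific_positions(family_a, family_b))
print(family_specific_positions(family_b, family_a))
```

Running the scan in both directions gives the two sets of family-characteristic positions. Columns conserved identically in both families are excluded, which matches the observation that few of these residues fall within the universally conserved core regions.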

DISCUSSION

This investigation shows P-type ATPases dividing into five major branches. This differs from earlier investigations dividing the P-type ATPase superfamily into two (Fagan and Saier 1994) or three (Lutsenko and Kaplan 1995) branches. The five major branches in the phylogenetic tree (Fig. 6) are Type I ATPases (heavy metal pumps), Type II ATPases (Ca2+-ATPases, Na+/K+-ATPases, and H+/K+-ATPases), Type III ATPases (H+ and Mg2+ pumps), Type IV ATPases (phospholipid pumps), and Type V ATPases (a group of pumps having no assigned substrate specificity). It should be noted, however, that this analysis is limited by the fact that the ion specificity of Type IIC ATPases has not been confirmed in all cases (Tables 1 and 2).

A Hitherto Unrecognized Family of P-type ATPases

A wealth of sequences in higher eukarya resembling Type V ATPases are found in the databases of ESTs (Table 4). This indicates that although poorly characterized, this branch of ATPases is large and ubiquitous among eukarya. An electrogenic chloride pump resembling a P-type ATPase has been characterized in the intestine of Aplysia californica (Gerencser and Purushotham 1996), anion-stimulated ATPase activities are found in several tissues and organisms (Gerencser 1996), and electrophysiological evidence points to an electrogenic chloride pump in the plasma membrane of Acetabularia (Gradmann et al. 1982). However, no P-type ATPase involved in the transport of any anion has so far been cloned. Since Type V ATPases form the only P-type ATPase family having no substrate specificities assigned to it, anions may be candidate substrates to be transported by them. Cloning of the respective genes will illuminate whether any of the P-type anion pumps belong to Type V ATPases.

Phospholipid Transporting P-type ATPases Form a Large Eukaryotic Branch of P-type ATPases

Only one Type IV ATPase (seq. 14; Fig. 6) has been characterized at the biochemical level (Auland et al. 1994). This Bos taurus pump transports aminophospholipids such as phosphatidylserine and phosphatidylethanolamine (Auland et al. 1994). Five Type IV ATPases are found in the S. cerevisiae genome (Tables 1 and 3). Deletion of one of these sequences (seq. 15; Fig. 6) abolishes trans-membrane phosphatidylserine transport, suggesting that the S. cerevisiae and the B. taurus genes have similar functions (Tang et al. 1996). This suggests that trans-membrane transport of various phospholipids, which would contribute to establishing lipid bilayer asymmetry, is a general feature of the poorly characterized branch of Type IV ATPases. Type IV ATPases are not found in bacteria and archaea but are represented abundantly in eukaryotes, as is evident from their appearance in the EST database (Table 4).

Type IIA and IIB Ca2+-ATPases are Both Present in a Variety of Membranes

Type IIA Ca2+-ATPases are mainly present in the sarcoplasmic and endoplasmic reticulum but in addition include a plant pump (seq. 150; Fig. 6) that seems to be present in both the vacuolar membrane and the plasma membrane (Ferrol and Bennett 1996). Type IIB Ca2+-ATPases are primarily found in the plasma membrane but this family also includes pumps (seq. 119, 128, and 129; Fig. 6) present in the vacuolar membrane (Cunningham and Fink 1994; Moniakis et al. 1995; Malmström et al. 1997). Since members of both families can be present in a variety of membranes it does not seem appropriate to name these families according to intracellular localization, i.e. SERCA pumps for sarco-endo-plasmatic reticulum Ca2+-ATPases and PMCA pumps for plasma membrane Ca2+-ATPases, but rather Type IIA and Type IIB Ca2+-ATPases, respectively, according to their phylogenetic relationship.

Complementary Function of Type III and Type IIC ATPases

H+ (Type IIIA)- and Na+/K+ (Type IIC)- transporting P-type ATPases form large eukaryotic families. Analysis of EST databases (Table 4) suggests that among plants, fungi and animals, plasma membrane H+-ATPases are specific for fungi and plants, and, on the other hand, that plasma membrane Na+/K+-ATPases are distinguishing features of animal cells. Secondary active transport in fungi and plants is energized by H+ gradients, whereas in animals Na+ gradients are used as an energy source for such transport systems (Skulachev 1994). Therefore, Type IIIA and Type IIC pumps seem to serve complementary functions in the various classes of eukarya.

Fungi apparently form an intermediate group harbouring both H+- and Na+-pumps, the latter in a distinct branch (Type IID ATPases) separate from that of animal Na+/K+-ATPases. However, it has not been conclusively demonstrated that Type IID ATPases transport Na+. The ENA genes (ENA1; seq. 94) were suggested to be Na+ pumps based on genetic evidence demonstrating that deletion of the genes confers Na+ sensitivity on S. cerevisiae (Haro et al. 1991; Garciadeblas et al. 1993). Nevertheless, disruption of the related CTA3 gene (seq. 93) in S. pombe leads to higher levels of cytosolic free Ca2+ (Ghislain et al. 1990) in addition to reduced Ca2+ in intracellular organelles (Halachmi et al. 1992), suggesting that this pump is involved in removing Ca2+ from the cytosol to intracellular stores. Since Na+ tolerance in S. cerevisiae is regulated by Ca2+ (Mendoza et al. 1994; Hirata et al. 1995) it remains a theoretical possibility that the ENA genes encode Ca2+-ATPases essential for regulating intracellular Ca2+ during Na+ adaptation. A biochemical characterization of Type IID ATPases is needed in order to clarify this question.

It is uncertain whether P-type H+- or Na+-ATPases are found in bacterial species. H+-ATPase activity sensitive to the P-type ATPase inhibitor vanadate has been characterized in a cyanobacterium (Fresneau et al. 1993) and at least the solitary P-type ATPase in the archaeon M. jannaschii is very similar to H+-ATPases (Fig. 6). A single Synechocystis PCC6803 sequence (seq. 96; Fig. 6; Table 1) has some similarity to Type IIC ATPases although it may also be an unusual Type IIA Ca2+-ATPase.

KdpB ATPases May Represent an Ancient Group of P-type ATPases

No bacterial P-type ATPases have been cloned that group with Type IIB, Type IV, or Type V ATPases. Therefore, these ATPases most likely represent more recent evolutionary achievements. Most of the bacterial P-type ATPases are found in the branch of Type IB ATPases, in the related family of KdpB (Type IA) ATPases, and in the branch of Type IIA Ca2+-ATPases (Fig. 6). KdpB ATPases have an unusual trans-membrane organization compared to other P-type ATPases in having a reduced number of trans-membrane spans (Lutsenko and Kaplan 1995). In addition, unlike all other P-type ATPases, these ATPases have three subunits (Hesse et al. 1984). KdpB is the catalytic subunit, whereas KdpA is involved in the binding of K+, and KdpC seems to stabilize the complex between KdpB and KdpA (Buurman et al. 1995). Fusion of genes occurred during the evolution of bacteria (Miozzari and Yanofsky 1979), and KdpB ATPases may represent an ancestral type of P-type ATPase that subsequently acquired new trans-membrane helices and the ability to bind the transported substrate, for example, by fusion of the catalytic subunit with substrate binding proteins such as KdpA encoded within the same operon. The β subunit of the heterodimeric Na+/K+- and H+/K+-ATPases (Type IIC), which plays a role in stabilization of the catalytic α subunit (Møller et al. 1996), may be reminiscent of the KdpC subunit, whereas at least H+- (Type IIIA) and Ca2+- (Type IIA and Type IIB) ATPases seem to have eliminated this subunit.

P-type ATPases Group in a Phylogenetic Tree According to Substrate Specificity

The substrate specificities of most P-type ATPases cloned so far are not known. Obtaining this information requires, for example, demonstration of amino acid identity to biochemically well-characterized proteins or expression of the genes in heterologous systems followed by biochemical characterization of the gene products. However, in each of the described branches of ATPases presented in Fig. 6 it is noteworthy that only single ion specificities have been found for those proteins in the branch that have been characterized with respect to function. The only exception to this rule so far is Type IID ATPases (see above).

The various P-type ATPases present in single organisms are often more closely related to P-type ATPases in evolutionarily distant species than they are to endogenous P-type ATPases in other branches. For example, plasma membrane H+-ATPases of S. cerevisiae (seq. 86 and 87; Fig. 6) and the plant A. thaliana (seq. 68; Fig. 6) show high similarity to each other and are analogous enzymes with respect to function (Palmgren and Christensen 1993; Supply et al. 1993). By contrast, all the remaining 14 P-type ATPases in S. cerevisiae (Tables 1 and 2) have significantly weaker similarity to the H+-ATPase sequences. The H+-ATPase sequences are closely related to the Type IIIA P-type ATPase of the archaeon M. jannaschii (60% sequence identity within core sequences). The divergence between the M. jannaschii P-type ATPase and the H+-ATPases most likely took place about one and a half billion years ago, which is an estimated date for the divergence of eukarya and archaea (Doolittle et al. 1996). In contrast, there is less than 25% identity within core sequences between, for example, any Type IV and any Type I ATPase. Assuming that two diverging protein sequences change in a more or less stochastic manner consistent with an exponential decay (Doolittle 1995), the duplication giving rise to Type IV and Type V ATPases thus occurred several billion years ago and clearly preceded the divergence of eukarya and bacteria (or life for that matter). Still, we do not find any genes with detectable similarity to Type IV and Type V ATPases in the completed genomes of four bacteria and one archaeon. Although in theory it is possible that these organisms have eliminated, for example, Type IV and Type V ATPases from their genomes relatively recently, the complete absence of these ATPases in prokaryotes argues against the proposition that Type IV ATPases represent an ancient group of P-type pumps (Tang et al. 1996). A plausible explanation would be that P-type ATPases have not evolved at a constant rate, and that Type IV and Type V ATPases are relatively recent evolutionary achievements that evolved more rapidly than other P-type ATPases.
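The constant-rate estimate discussed above can be reproduced approximately with the simplest possible decay model, identity = exp(-k t). This is only a rough sketch under stated assumptions: the model ignores the residual identity expected between unrelated sequences, it is not necessarily the exact procedure of Doolittle (1995), and the function names are hypothetical. The only inputs taken from the text are 60% identity at about 1.5 billion years and the 25% identity limit.

```python
import math


def decay_rate(identity, time_gyr):
    """Rate k in identity = exp(-k * t), calibrated from a single data point."""
    return -math.log(identity) / time_gyr


def divergence_time(identity, rate):
    """Time in billion years at which identity has decayed to the given value."""
    return -math.log(identity) / rate


# Calibration from the text: 60% core identity after about 1.5 billion years.
rate = decay_rate(0.60, 1.5)

# Identity below 25% then implies a divergence at least this long ago.
min_time = divergence_time(0.25, rate)
print(f"rate = {rate:.3f} per billion years, minimum time = {min_time:.2f} billion years")
```

Under these assumptions the minimum time comes out at roughly four billion years, in line with "several billion years". The figure is what a constant rate would imply, not a claim about the actual age of these families, and the implausibility of such an early date is what motivates the alternative of accelerated evolution.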

Several Amino Acid Residues May be Involved in Determining Ion Specificity of P-type ATPases

It is generally believed that changing only a few amino acid residues, or even a single one, may be sufficient for an ion pump to develop a new ion specificity (Sasaki et al. 1995; Zhang and Fillingame 1995). If a single or very few residues are sufficient determinants of ion specificity in P-type ATPases, we would expect to find many examples of closely related pumps having different ion specificity. However, this is not the case. Several amino acid residues may be required in determining the shape and specificity of the ion binding site(s) near or inside the proposed ion channel (Møller et al. 1996) of P-type ATPases. It is thus possible that several, if not all, of the residues characteristic for each of the closely related Na+/K+- and H+/K+-ATPase families (indicated by boxes in Fig. 2) are required in order to obtain Na+ and H+ specificities, respectively. In accordance with this observation, mutagenesis of the conserved Na+/K+-ATPase glutamate-961 into glutamine (which is present in all H+/K+-ATPases), aspartate, alanine, or leucine slightly reduces the Na+ affinity of the ATPase but does not abolish the transporting capabilities of the enzyme (Van Huysse and Lingrel 1993). Substitutions of serine-782 (present in all Na+/K+-ATPases but not in H+/K+-ATPases) do not result in any detectable changes in Na+ affinity (Argüello and Lingrel 1995). A concerted action between several residues as a requirement for obtaining absolute ion specificity would explain the large divergence in single species between P-type ATPases having different ion specificity.

In conclusion, P-type ATPases cluster in the phylogenetic tree according to substrate specificity despite the evolutionary distance between the parental species. This suggests that, firstly, within a given substrate specificity of P-type ATPases there are limits for evolutionary changes. Secondly, in order for a change in substrate specificity to occur, any P-type ATPase has to be released from a structural constraint, a release that is subsequently followed by a dramatic change in primary structure. Actins are among the slowest-changing proteins known, probably because actin evolution is limited by structural constraints that have to do with the large number of protein-protein interactions they must preserve (Doolittle 1995). Although actins evolve extremely slowly, they show little sequence identity to their closest relatives, eukaryotic centractins and bacterial ftsA proteins. This suggests abrupt changes in the rate of sequence evolution immediately following the divergence from common ancestors (Doolittle 1995). In this respect, the evolution of actins and the P-type ATPase superfamily resemble each other.

Acknowledgements

This work was supported by the European Union Biotechnology Program BIO4-CT-96-0775. We are indebted to Jeff Harper for useful suggestions and stimulating discussions. We would like to thank Hans Ullitz Møller and Bo Mikkelsen at Biobase for assistance with computational problems, Henrik Nielsen at the Centre for Biological Sequence Analysis for help with the helix predictions and Lone Baunsgaard, Thomas Jahn, Kirk Schnorr, and Kees Venema for critically reading the manuscript.

References

Argüello JM, Lingrel JB (1995) Substitutions of serine 775 in the α subunit of the Na,K-ATPase selectively disrupt K+ high affinity activation without affecting Na+ interaction. J Biol Chem 270:22764-22771

Auland ME, Roufogalis BD, Devaux PF, Zachowski A (1994) Reconstitution of ATP-dependent aminophospholipid translocation in proteoliposomes. Proc Natl Acad Sci USA 91:10938-10942

Bull PC, Thomas GR, Rommens JM, Forbes JR, Cox DW (1993) The Wilson disease gene is a putative copper transporting P-type ATPase similar to the Menkes gene. Nat Genet 5:327-337

Bult CJ, White O, Olsen GJ, Zhou L, Fleischmann RD, Sutton GG, Blake JA, FitzGerald LM, Clayton RA, Gocayne JD, Kerlavage AR, Dougherty BA, Tomb JF, Adams MD, Reich CI, Overbeek R, Kirkness EF, Weinstock KG, Merrick JM, Glodek A, Scott JL, Geoghagen NSM, Weidman JF, Fuhrmann JL, Nguyen D, Utterback TR, Kelley JM, Peterson JD, Sadow PW, Hanna MC, Cotton MD, Roberts KM, Hurst MA, Kaine BP, Borodovsky M, Klenk H-P, Fraser CM, Smith HO, Woese CR, Venter JC (1996) Complete genome sequence of the methanogenic Archaeon, Methanococcus jannaschii. Science 273:1058-1073

Buurman ET, Kim KT, Epstein W (1995) Genetic evidence for two sequentially occupied K+ binding sites in the Kdp transport ATPase. J Biol Chem 270:6678-6685

Cunningham KW, Fink GR (1994) Calcineurin-dependent growth control in Saccharomyces cerevisiae mutants lacking PMC1, a homolog of plasma membrane Ca2+ ATPases. J Cell Biol 124:351-363

Doolittle RF (1995) The origins and evolution of eukaryotic proteins. Philos Trans R Soc Lond B Biol Sci 349:235-240

Doolittle RF, Feng DF, Tsang S, Cho G, Little E (1996) Determining divergence times of the major kingdoms of living organisms with a protein clock. Science 271:470-477

Fagan MJ, Saier MH Jr (1994) ATPases of eukaryotes and bacteria: Sequence analyses and construction of phylogenetic trees. J Mol Evol 38:57-99

Felsenstein J (1989) PHYLIP - phylogeny inference package (Version 3.2). Cladistics 5:164-166

Ferrol N, Bennett AB (1996) A single gene may encode differentially localized Ca2+-ATPases in tomato. Plant Cell 8:1159-1169

Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM, McKenney K, Sutton G, Fitzhugh W, Fields C, Gocayne JD, Scott J, Shirley R, Liu LI, Glodek A, Kelley JM, Weidman JF, Phillips CA, Spriggs T, Hedblom E, Cotton MD, Utterback TR, Hanna MC, Nguyen DT, Saudek DM, Brandon RC, Fine LD, Fritchman JL, Fuhrmann JL, Geoghagen NSM, Gnehm CL, McDonald LA, Small KV, Fraser CM, Smith HO, Venter JC (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae. Science 269:496-512

Fraser CM, Gocayne JD, White O, Adams MD, Clayton RA, Fleischmann RD, Bult CJ, Kerlavage AR, Sutton G, Kelley JM, Fritchman JL, Weidman JF, Small KV, Sandusky M, Fuhrmann J, Nguyen D, Utterback TR, Saudek DM, Philips CA, Merrick JM, Tomb J-F, Dougherty BA, Bott KF, Hu P-C, Lucier TS, Peterson SN, Smith HO, Hutchison CA, III, Venter JC (1995) The minimal gene complement of Mycoplasma genitalium. Science 270:397-403

Fresneau C, Riviere ME, Arrio B (1993) Characterization of the plasmalemma ATPase from the cyanobacteria Synechococcus PCC 6311 and PCC 7942. Arch Biochem Biophys 306:254-260

Garciadeblas B, Rubio F, Quintero FJ, Banuelos MA, Haro R, Rodriguez-Navarro A (1993) Differential expression of two genes encoding isoforms of the ATPase involved in sodium efflux in Saccharomyces cerevisiae. Mol Gen Genet 238:363-368

Gerencser GA (1996) The chloride pump: a Cl--translocating P-type ATPase. Crit Rev Biochem Mol Biol 31:303-337

Gerencser GA, Purushotham KR (1996) Reconstituted Cl- pump protein: a novel ion(Cl-)-motive ATPase. J Bioenerg Biomembr 28:459-469

Ghislain M, Goffeau A, Halachmi D, Eilam Y (1990) Calcium homeostasis and transport are affected by disruption of cta3, a novel gene encoding Ca2+-ATPase in Schizosaccharomyces pombe. J Biol Chem 265:18400-18407

Goffeau A et al. (1997) The yeast genome directory. Nature (supplement) 387:1-105

Gradmann D, Tittor J, Goldfarb V (1982) Electrogenic Cl- pump in Acetabularia. Philos Trans R Soc Lond B 299:447-457

Halachmi D, Ghislain M, Eilam Y (1992) An intracellular ATP-dependent calcium pump within the yeast Schizosaccharomyces pombe, encoded by the gene cta3. Eur J Biochem 207:1003-1008

Haro R, Garciadeblas B, Rodriguez-Navarro A (1991) A novel P-type ATPase from yeast involved in sodium transport. FEBS Lett 291:189-191

Harper JF, Manney L, Sussman MR (1994) The plasma membrane H+-ATPase gene family in Arabidopsis: genomic sequence of AHA10 which is expressed primarily in developing seeds. Mol Gen Genet 244:572-587

Hesse JE, Wieczorek L, Altendorf K, Reicin AS, Dorus E, Epstein W (1984) Sequence homology between two membrane transport ATPases, the Kdp-ATPase of Escherichia coli and the Ca2+-ATPase of sarcoplasmic reticulum. Proc Natl Acad Sci USA 81:4746-4750

Himmelreich R, Hilbert H, Plagens H, Pirkl E, Li BC, Herrmann R (1996) Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res 24:4420-4449

Hirata D, Harada SI, Namba H, Miyakawa T (1995) Adaptation to high-salt stress in Saccharomyces cerevisiae is regulated by Ca2+/calmodulin-dependent phosphoprotein phosphatase (calcineurin) and cAMP-dependent protein kinase. Mol Gen Genet 249:257-264

Lingrel JB, Kuntzweiler T (1994) Na+,K+-ATPase. J Biol Chem 269:19659-19662

Lutsenko S, Kaplan JH (1995) Organization of P-type ATPase: Significance of structural diversity. Biochemistry 34:15607-15612

Malmström S, Askerlund P, Palmgren MG (1997) A calmodulin-stimulated Ca2+-ATPase from plant vacuolar membranes with a putative regulatory domain at its N-terminus. FEBS Lett 400:324-328

Mendoza I, Rubio F, Rodriguez-Navarro A, Pardo JM (1994) The protein phosphatase calcineurin is essential for NaCl tolerance of Saccharomyces cerevisiae. J Biol Chem 269:8792-8796

Miozzari GF, Yanofsky C (1979) Gene fusion during the evolution of the tryptophan operon in Enterobacteriaceae. Nature 277:486-489

Moniakis J, Coukell MB, Forer A (1995) Molecular cloning of an intracellular P-type ATPase from Dictyostelium that is up-regulated in calcium-adapted cells. J Biol Chem 270:28276-28281

Møller JV, Juul B, Le Maire M (1996) Structural organization, ion transport, and energy transduction of P-type ATPases. Biochim Biophys Acta 1286:1-51

O'Brien C (1997) Entire E. coli genome sequenced - at last. Nature 385:472

Olsen GJ, Woese CR, Overbeek R (1994) The winds of (evolutionary) change: Breathing new life into microbiology. J Bacteriol 176:1-6

Palmgren MG, Christensen G (1993) Complementation in situ of the yeast plasma membrane H+-ATPase gene pma1 by an H+-ATPase gene from a heterologous species. FEBS Lett 317:216-222

Program Manual for the Wisconsin Package, Version 8, September 1994. Genetics Computer Group, 575 Science Drive, Madison, Wisconsin, USA 53711 (1994)

Rounsley SD, Glodek A, Sutton G, Adams MD, Somerville CR, Venter JC, Kerlavage AR (1996) The construction of Arabidopsis expressed sequence tag assemblies. A new resource to facilitate gene identification. Plant Physiol 112:1177-1183

Sasaki J, Brown LS, Chon YS, Kandori H, Maeda A, Needleman R, Lanyi JK (1995) Conversion of bacteriorhodopsin into a chloride ion pump. Science 269:73-75

Serrano R (1988) Structure and function of proton translocating ATPase in plasma membranes of plants and fungi. Biochim Biophys Acta 947:1-28

Skulachev VP (1994) Bioenergetics: the evolution of molecular mechanisms and the development of bioenergetic concepts. Antonie van Leeuwenhoek 65:271-284

Supply P, Wach A, Thines-Sempoux D, Goffeau A (1993) Proliferation of intracellular structures upon overexpression of the PMA2 ATPase in Saccharomyces cerevisiae. J Biol Chem 268:19744-19752

Tang X, Halleck MS, Schlegel RA, Williamson P (1996) A subfamily of P-type ATPases with aminophospholipid transporting activity. Science 272:1495-1497

Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673-4680

Van Huysse JW, Lingrel JB (1993) Nonpolar amino acid substitutions of potential cation binding residues Glu-955 and Glu-956 of the rat alpha 1 isoform of Na+, K+-ATPase. Cell Mol Biol Res 39:497-507

Von Heijne G (1992) Membrane protein structure prediction. Hydrophobicity analysis and the positive-inside rule. J Mol Biol 225:487-494

Zhang Y, Fillingame RH (1995) Changing the ion binding specificity of the Escherichia coli H+-transporting ATP synthase by directed mutagenesis of subunit c. J Biol Chem 270:87-93

FIGURES

Fig. 1.

Graphic presentation of various P-type ATPase sequences.

In the upper part of the protein the conserved segments are shown (in white), while in the lower part, predicted heavy metal binding sites are shown (in grey) together with predicted transmembrane segments (in white). The proteins are aligned at the conserved segment A (shown in white) common to all P-type ATPases. The length of each ATPase is given at the end. At the bottom an example shows how the core sequences were extracted. These were subsequently used for the phylogenetic analysis. The numbers to the left correspond to the numbers used in Table 1. The prediction of transmembrane segments was performed according to von Heijne (1992) but has been verified by other methods in only a few cases.

Core sequences in 20 P-type ATPases

Fig. 2.

A model of Na+/K+-ATPase showing the localization of core segments relative to transmembrane segments and conserved amino acid positions in Na+/K+- and H+/K+-ATPases.

Core segments are shown in black. The sequences of 26 cloned Na+/K+-ATPases and 11 cloned H+/K+-ATPases were aligned and amino acid identities between and within the two groups were determined. The empty circles represent amino acid positions that are variable. The amino acids shown in circles are identical among all Na+/K+- and H+/K+-ATPases. The squares show positions where amino acids are identical within one of the two groups, with the Na+/K+-ATPases shown first. A dash is shown in the squares if the position is identical in only one of the two groups and the same amino acid or a conservative substitution is not present at that position in more than 20% of the sequences of the other group. A dot is shown if all sequences in the group have a gap at this position compared to the other group. The amino acid stretches where several squared positions are present are placed in frames. The membrane topology and numbering of residues correspond to the human Na+/K+-ATPase alpha 1 isoform (seq. 104 in Table 1). The prediction of transmembrane segments was performed according to von Heijne (1992) and agrees, with respect to the number of transmembrane spans, with the prediction by Lingrel and Kuntzweiler (1994).

Schematic figure of a type IIC ATPase

Fig. 3.

Alignment of conserved segments of the 20 P-type ATPases shown in Fig. 1.

Repetitive lowercase letters on top indicate name of conserved segment. Amino acids present in at least 135 out of 159 analyzed sequences (Table 1) are in bold. Before and after each of the conserved segments the position in the original sequence is given. Numbers at top indicate the position of amino acids when conserved segments are arranged in a linear core sequence. Sequence numbers correspond to the numbers used in Table 1.

                  1          13             25          26                45          46                      71          72             
Type Segments     aaaaaaaaaaaaa   aaaaaaaaaaaa          bbbbbbbbbbbbbbbbbbbb          cccccccccccccccccccccccccc          ddddddddddddddd
IA   Seq.  20 122 PADQLRKGDIVLV   EAGDIIPCDGEV 146..153 VDESAITGESAPVIRESGGD 172..177 TGGTRILSDWLVIECSVNPGETFLDR 202..252 VTVLVALLVCLIPTT 
IB   Seq.  27 153 PISELKMGDQVLV   KPGELVPTDGLV 177..184 LNQASITGESMPVEKAIGDE 203..205 FAGTINGNGVLRLKIHQPPESSLIQR 230..283 IYRALIFLVVASPCA 
IB   Seq.  41 262 PASELKKRQRFVT   RPGETIAADGVV 286..293 IDMSAMTGEAKPVRAYPAAS 312..314 VGGTVVMDGRLVIEATAVGADTQFAA 339..391 FSVTLGVLVIACPCA 
IB   Seq.  48 427 PIELLQVNDIVEI   KPGMKIPADGII 451..458 IDESLMTGESILVPKKTGFP 477..479 IAGSVNGPGHFYFRTTTVGEETKLAN 504..572 LQTATSVVIVACPCA 
IB   Seq.  49 838 DVELVQRGDIIKV   VPGGKFPVDGRV 862..869 VDESLITGEAMPVAKKPGST 888..890 IAGSINQNGSLLICATHVGADTTLSQ 915..989 FQASITVLCIACPCS 
IIA  Seq. 154 149 PVAGLVPGDLILL   EAGDQVPADARL 173..181 VKESALTGEAEAVQKLADQQ 200..215 FQGTEVLQGRGQALVYATGMNTELGR 240..292 LSVGLSMAVAIVPEG 
IIA  Seq. 141 141 KAKDIVPGDIVEI   AVGDKVPADIRL 165..175 VDQSILTGESVSVIKHTDPV 194..209 FSGTNIAAGKAMGVVVATGVNTEIGK 234..296 FKIAVALAVAAIPEG 
IIB  Seq. 123 202 PVAEIVVGDIAQV   KYGDLLPADGIL 226..234 IDESSLTGESDHVKKSLDKD 253..257 LSGTHVMEGSGRMVVTAVGVNSQTGI 282..410 FIIGITVLVVAVPEG 
IIC  Seq. 104 181 NAEEVVVGDLVEV   KGGDRIPADLRI 205..213 VDNSSLTGESEPQTRSPDFT 232..244 FFSTNCVEGTARGIVVYTGDRTVMGR 269..321 VIFLIGIIVANVPEG 
IIIA Seq.  61 127 PAKELVPGDVVRI   RIGDIVPADIIL 151..159 VDESALTGESLPVEKKIGDI 178..180 YSGSIVKKGEMTGIVKATGLNTYFGK 205..256 AQFALVLAVSAIPAA 
IIIA Seq.  86 192 PANEVVPGDILQL   EDGTVIPTDGRI 216..225 IDQSAITGESLAVDKHYGDQ 244..246 FSSSTVKRGEGFMVVTATGDNTFVGR 271..323 LRYTLGITIIGVPVG 
IIIA Seq.  68 144 EAAILVPGDIVSI   KLGDIIPADARL 168..176 VDQSALTGESLPVTKHPGQE 195..197 FSGSTCKQGEIEAVVIATGVHTFFGK 222..274 IDNLLVLLIGGIPIA 
IIIB Seq.  59 175 PIDQLVPGDIIKL   AAGDMIPADLRI 199..207 VAQASLTGESLPVEKAATTR 226..241 FMGTTVVSGTAQAMVIATGANTWFGQ 266..318 ALFALSVAVGLTPEM 
IV   Seq.   9 266 PSKDLKVGDLIKV   HKGDRIPADLVL 290..302 IKTDQLDGETDWKLRVACPL 321..364 VDNTLWANTVLASSGFCIACVVYTGR 389..444 ILRYLILFSTIIPVS 
IV   Seq.  14 147 HWEKVNVGDIVII   KGKEYIPADTVL 171..183 IETSNLDGETNLKIRQGLPA 202..247 ADQILLRGAQLRNTQWVHGIVVYTGH 272..344 FLTFIILFNNLIPIS 
IV   Seq.  11 164 EWRYILVGDFVHI   SNNQDVPADIIL 188..200 IETCNLDGETSLKQRMVPAK 219..267 KENMLLRGSRIKNTTFVEGIVVYAGH 292..367 IGAFFINYQLLVPIS 
V    Seq.   4 112 SSSELVPGDIYEI   KNNMTIPCDTII 136..143 MSEHMLTGESVPIHKERLPF 162..363 IKYNNKEENRILGLVIKTGFITTKGK 388..440 IIKCLDIITDAIPPA 
V    Seq.   2 258 GSDQLVPGDILLI+  PHGCLMQCDSVL 283..290 VNESVLTGESVPITKVALTD 308..332 LQTRFYRGKKVKAIVLRTAYSTLKGQ 357..409 IVRSLDIITITVPPA 
V    Seq.   1 565 SSSELVPGDIYEV++ PNITILPCDSIL 591..598 VNESMLTGESVPVSKFPATE 617..649 RARIAPGQTAALAMVVRTGFSTTKGS 672..724 ILRALDIITIVVPPA 
V    Seq.   6 268 QTNELLPMDLVSI+++AEESAIPCDLIL 295..302 VNEAMLSGESTPLLKESIKL 321..355 SDIPPPPDGGALAIVTKTGFETSQGS 380..432 ILDCILIITSVVPPE 

                      95            96                               131            132                             166                   
Type Segments  ddddddddd            eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee            fffffffffffffffffffffffffffffffffff ffffffffffffffffff
IA   Seq.  20  IGGLLSAIG  275...286 NVIATSGRAVEAAGDVDVLLLDKTGTITLGNRQASE  321...432 VVVEGSRVLGVIALKDIVKGGIKEAFAQLRKMGIK TVMITGDNRLTAAAIAAE
IB   Seq.  27  LMASIMPAL  306...317 GILFKNGAQLERIGRVRVIAFDKTGTLTTGKPEVVN  352...451 WVAYAGEILGLIAVADTVRPTAAQAIARLKRLGIE+IVMLTGDNSRTAHSIAQQ
IB   Seq.  41  LGLATPTAM  414...425 GIFIKGYRALETIRSIDTVVFDKTGTLTVGQLAVST  460...560 FVSVDGVVRAALTIADTLKDSAAAAVAALRSRGLR TILLTGDNRAAADAVAAQ
IB   Seq.  48  LGLATPTAI  595...606 GVLIKGGEVLEKFNSITTFVFDKTGTLTTGFMVVKK  641...748 YVSVNGHVFGLFEINDEVKHDSYATVQYLQRNGYE TYMITGDNNSAAKRVARE
IB   Seq.  49  LGLATPTAV 1012..1023 GILIKGGEPLEMAHKVKVVVFDKTGTITHGTPVVNQ 1058..1215 LVAVDDELCGLIAIADTVKPEAELAIHILKSMGLE VVLMTGDNSKTARSIASQ
IIA  Seq. 154  LPAVITVAL  315...326 ESLIRRLPAVETLGSVTTICSDKTGTLTQNKMVVQQ  361...548 DAETDLTWLGLMGQIDAPRPEVREAVQRCRQAGIR TLMITGDHPLTAQAIARD
IIA  Seq. 141  LPAVITTCL  319...330 NAIVRSLPSVETLGCTSVICSDKTGTLTTNQMSVCR  365...585 KYETNLTFVGCVGMLDPPRIEVASSVKLCRQAGIR VIMITGDNKGTAVAICRR
IIB  Seq. 123  LPLAVTISL  433...444 NNLVRHLDACETMGNATAICSDKTGTLTMNRMTVVQ  479...657 EILTELTCIAVVGIEDPVRPEVPDAIAKCKQAGIT VRMVTGDNINTARAIATK
IIC  Seq. 104  LLATVTVCL  344...355 NCLVKNLEAVETLGSTSTICSDKTGTLTQNRMTVAH  390...578 FPIDNLCFVGLISMIDPPRAAVPDAVGKCRSAGIK VIMVTGDHPITAKAIAKG
IIIA Seq.  61  MPAVLSITM  279...290 DAIVKKLVAIEELAGVDILCSDKTGTLTKNQLVCGE  325...442 YKNGRWHFAGIIPLYDPPREDAPLAVKKIKELGVI IKMVTGDHVAIAKNIARM
IIIA Seq.  86  LPAVVTTTM  346...357 QAIVQKLSAIESLAGVEILCSDKTGTLTKNKLSLHE  392...519 RGEGHWEILGVMPCMDPPRDDTAQTVSEARHLGLR VKMLTGDAVGIAKETCRQ
IIIA Seq.  68  MPTVLSVTM  297...308 GAITKRMTAIEEMAGMDVLCSDKTGTLTLNKLSVDK  343...472 SPGAPWEFVGLLPLFDPPRHDSAETIRRALNLGVN VKMITGDQLAIGKETGRR
IIIB Seq.  59  LPMIVTSTL  341...352 KVIVKHLDAIQNFGAMDILCTDKTGTLTQDKIVLEN  387...530 ADESDLILEGYIAFLDPPKETTAPALKALKASGIT VKILTGDSELVAAKVCHE
IV   Seq.   9  LRVNLDLAK  467...482 ETIVRTSTIPEDLGRIEYLLSDKTGTLTQNDMQLKK  517...735 YLEHDLELLGLTGVEDKLQKDVKSSIELLRNAGIK IWMLTGDKVETARCVSIS
IV   Seq.  14  LLVTLEVVK  367...388 AAMARTSNLNVELGQVKYIFSDKTGTLTCNVMQFKK  423...630 LIEKNLQLLGATAIEDKLQDQVPETIETLMKADIK IWILTGDKQETAINIGHS
IV   Seq.  11  LYITVEIIK  390...411 AIDCRSLSIPEELGTVTHVLSDKTGTLTENMMIFRN  446...798 ELETNLKLSGVTGIEDRLQDGVPDTLRALRDAGIQ VWVLTGDKLETAQNIATS
V    Seq.   4  LPTTLTVGI  463...475 SISCLCPHKINIAGQINTMVFDKTGTLTENNLQFIG  510..1004 FVESNLHFLGFLIFTNNMKKNAPDIIHNLQTSGCQ CIMSTGDNVLTSIHVAKK
V    Seq.   2  LPAAMSVGI  433...444 EIFCISPSTINTCGAINVVCFDKTGTLTEDGLDFHV  479...680 AVECDLEMLGLIVMENRVKPVTLGVINQLNRANIR TVMVTGDNLLTGLSVARE
V    Seq.   1  LPATLTIGT  749...760 GIFCISPTRLNISGKIDVMCFDKTGTLTEDGLDVLG  795..1014 EVESNLEFLGFIIFQNKLKKETSETLKSLQDANIR TIMCTGDNILTAISVGRE
V    Seq.   6  LPMELTMAV  455...466 YVYCTEPFRIPFAGRIDVCCFDKTGTLTGEDLVFEG  501...658 DVESELTFNGFLIFHCPLKDDAIETIKMLNESSHR SIMITGDNPLTAVHVAKE

                188            189             207                             234            235                          265     
Type Segments  ffff            ggggggggggggggggggg     ggggggggggggggggggggggggggg            hhhh hhhhhhhhhhhhhhhhhhhhhhhhhhh       Total
IA   Seq.  20  AGVD  488...490 FLAEATPEAKLALIRQYQA     EGRLVAMTGDGTNDAPALAQADVAVAM  535...538 GTQA AKEAGNMVDLDSNPTKLIEVVHIGKQM  568...682
IB   Seq.  27  VGVN  508...509 VYAELLPEDKVDVIRQLQK+    QYQSVAMVGDGINDAPALAQASVGIAM  555...559 GSDV ALETADIVLMADRLERLEHAIRLGRRA  589...642
IB   Seq.  41  VGID  616...618 AVADMLPEGKVDVIQRLRE     EGHTVAMVGDGINDGPALVGADLGLAI  663...666 GTDV ALGAADIILVRDDLNTVPQALDLARAT  696...752
IB   Seq.  48  VGIS  804...808 VYSDVSPTGKCDLVKKIQD++   GNNKVAVVGDGINDAPALALSDLGIAI  855...858 GTEI AIEAADIVILCGNDLNTNSLRGLANAI  888..1004
IB   Seq.  49  VGIT 1271..1273 VFAEVLPSHKVAKVKQLQE     EGKRVAMVGDGINDSPALAMANVGIAI 1318..1321 GTDV AIEAADVVLIRNDLLDVVASIDLSRKT 1351..1500
IIA  Seq. 154  LGIT  604...633 VYARVAPEHKLRIVESLQR     QGEFVAMTGDGVNDAPALKQANIGVAM  678...682 GTDV SKEASDMVLLDDNFATIVAAVEEGRIV  712...926
IIA  Seq. 141  IGIF  641...674 CFARVEPSHKSKIVEFLQS     FDEITAMTGDGVNDAPALKKAEIGIAM  719...722 GTAV AKTASEMVLADDNFSTIVAAVEEGRAI  752..1042
IIB  Seq. 123  CGIL  713...752 VLARSSPTDKHTLVKGIID+++++HRQVVAVTGDGTNDGPALKKADVGFAM  802...806 GTDV AKEASDIILTDDNFTSIVKAVMWGRNV  836..1205
IIC  Seq. 104  VGII  634...689 VFARTSPQQKLIIVEGCQR     QGAIVAVTGDGVNDSPALKKADIGVAM  734...738 GSDV SKQAADMILLDDNFASIVTGVEEGRLI  768..1023
IIIA Seq.  61  LGIG  498...530 GFAEVFPEHKYKIVDSLQK     RGHLVAMTGDGVNDAPALKKADCGIAV  575...578 ATDA ARAAADIVLLSPGISVIVDAIQEARRI  608.. 805
IIIA Seq.  86  LGLG  575...606 GFAEVFPQHKYRVVEILQN     RGYLVAMTGDGVNDAPSLKKADTGIAV  651...654 ATDA ARSAADIVFLAPGLSAIIDALKTSRQI  684...919
IIIA Seq.  68  LGMG  528...560 GFAGVFPEHKYEIVKKLQE     RKHIVGMTGDGVNDAPALKKADIGIAV  605...608 ATDA ARGASDIVLTEPGLSVIISAVLTSRAI  638...948
IIIB Seq.  59  VGLD  586...613 LFARLTPMHKERIVTLLKR     EGHVVGFMGDGINDAPALRAADIGISV  658...661 AVDI AREAADIILLEKSLMVLEEGVIEGRRT  691...898
IV   Seq.   9  AKLI  791...853 IACRCTPQQKADVALVIRK+    TGKRVCCIGDGGNDVSMIQCADVGVGI  899...902 KEGK+ASLAADFSITQFCHLTELLLWHGRNSY  933..1151
IV   Seq.  14  CKLR  686...756 ICCRVSPLQKSEVVEMVKK+    VKVITLAIGDGANDVSMIQTAHVGVGI  802...805 NEGL+AANSSDYSIAQFKYLKNLLMVHGAWNY  836..1149
IV   Seq.  11  SGLF  854...908 LCYRMTPSEKATIVNTVKK+    IKGNVLAIGDGANDVPMIQAAHVGIGI  954...957 KEGL+AAMACDFAIARFKFLSRLLLVHGHWSY  988..1454
V    Seq.   4  CGII 1060..1730 VYARMKPKDKSDLILSLKK++   NNSYVGMCGDGANDCLALSCADIGISL 1777..1779 NNNE SSICSSFTSNKLCLHSIVHILIEGRAS 1809..1956
V    Seq.   2  CGII  736...852 VFARMAPDQKQSLVEQLQQ     IDYTVAMCGDGANDCAALKAAHAGISL  897...898 SDAE ASIAAPFTSKVPDIRCVPTVISEGRAA  928..1187
V    Seq.   1  AGLI 1070..1159 IYARMSPDEKHELMIQLQK     LDYTVGFCGDGANDCGALKAADVGISL 1204..1205 SEAE ASVAAPFTSKIFNISCVLDVIREGRAA 1235..1472
V    Seq.   6  VGIV  714...788 VYARVSPSQKEFLLNTLKD     MGYQTLMCGDGTNDVGALKQAHVGIAL  833...954 KLGD ASCAAPFTSKLANVSAVTNIIRQGRCA  984..1215
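
The position numbers printed above the alignment can be reproduced by joining the conserved segments of one sequence end to end. The following is a minimal sketch, not the procedure used by the authors: it takes the segments of Seq. 104 as printed in the alignment above, joins the pieces of each segment that appear in separate columns, and records where each segment starts and ends in the resulting linear core sequence. The names `SEGMENTS_SEQ104` and `build_core` are hypothetical and chosen for illustration only.

```python
# Segments a-h of Seq. 104 (type IIC), copied from the Fig. 3 alignment.
# Pieces of one segment that are printed in separate columns are joined here.
SEGMENTS_SEQ104 = {
    "a": "NAEEVVVGDLVEV" + "KGGDRIPADLRI",
    "b": "VDNSSLTGESEPQTRSPDFT",
    "c": "FFSTNCVEGTARGIVVYTGDRTVMGR",
    "d": "VIFLIGIIVANVPEG" + "LLATVTVCL",
    "e": "NCLVKNLEAVETLGSTSTICSDKTGTLTQNRMTVAH",
    "f": "FPIDNLCFVGLISMIDPPRAAVPDAVGKCRSAGIK" + "VIMVTGDHPITAKAIAKG" + "VGII",
    "g": "VFARTSPQQKLIIVEGCQR" + "QGAIVAVTGDGVNDSPALKKADIGVAM",
    "h": "GSDV" + "SKQAADMILLDDNFASIVTGVEEGRLI",
}


def build_core(segments):
    """Concatenate segments in alphabetical order.

    Returns the linear core sequence and, for each segment, its
    1-based (start, end) position within that core sequence.
    """
    core = ""
    bounds = {}
    for name in sorted(segments):
        start = len(core) + 1
        core += segments[name]
        bounds[name] = (start, len(core))
    return core, bounds


core, bounds = build_core(SEGMENTS_SEQ104)
for name, (start, end) in bounds.items():
    print(name, start, end)
print("core length:", len(core))
```

For this sequence the segment boundaries coincide with the numbers shown above the alignment (1, 25, 26, 45, 46, 71, 72, 95, 96, 131, 132, 188, 189, 234, 235, 265).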

Fig. 4.

Different phylogenetic trees produced from the dataset.

A total of 14 trees were constructed, in each case with a different input order of sequences into the Protdist and Fitch programmes. The number above the trees indicates how many times each tree occurred. Only major families are shown, indicated by hatched areas.
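
Counting how often each tree occurs requires a way to decide whether two trees have the same branching order regardless of how they are drawn. The sketch below is one common way to do this, offered as an illustration and not as the method used for Fig. 4: each tree is reduced to its set of non-trivial bipartitions of the leaf labels, and trees with equal sets are counted as the same topology. The nested-tuple trees and the function names are hypothetical, and the example topologies are invented rather than taken from the dataset.

```python
from collections import Counter


def leaves(tree):
    """Return the set of leaf labels below a node (a label or a tuple of children)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of an unrooted tree.

    Each bipartition is stored as the side that does NOT contain a fixed
    reference leaf, so the result is independent of rooting and child order.
    """
    all_leaves = leaves(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = all_leaves - clade if reference in clade else clade
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return frozenset(found)


# Hypothetical trees over the five major families (illustration only).
tree_a = ((("I", "II"), "III"), ("IV", "V"))
tree_b = (("IV", "V"), ("III", ("II", "I")))   # same topology, drawn differently
tree_c = ((("I", "III"), "II"), ("IV", "V"))   # different topology

runs = [tree_a, tree_b, tree_c, tree_a]
occurrences = Counter(splits(tree) for tree in runs)
for topology, count in occurrences.items():
    print(count, sorted(sorted(side) for side in topology))
```

Comparing bipartition sets avoids treating two drawings of the same tree as different results merely because the branches are listed in another order.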

Two different phylogenetic trees

Fig. 5.

Consensus sequences in core segments of Type I - V P-type ATPases.

Repetitive lowercase letters on top indicate name of conserved segment. Amino acids identical in all members of a given Type of P-type ATPases (Table 1) are in bold. The consensus amino acids shown in italic lowercase letters mark positions where the sequences have one of two conserved amino acids: I or V; D or E; F or Y. Numbers at top indicate the position of amino acids when conserved segments are arranged in a linear core sequence. The consensus sequences at the top show amino acids present in 158, 135, and 90 sequences, respectively.

            1                      25                   45                         71                       95
Segments    aaaaaaaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbb cccccccccccccccccccccccccc dddddddddddddddddddddddd
All 158/159 ---------------------D--- -------GE----------- -------------------------- ------------P----------- 
All 135/159 -------GD------G--iP-D--- ------TGE----------- --------G---------G------- ------------P----------- 
All 090/159 -----V-GDIv-v--Gd-iPAD--- vD-S-LTGES-PV-K----- f-G-----G-----V---G--T--G- ----v-i-V--iP--L---v---- 
IA     3/ 3 ----L---D---V--G--IP-DGE- VDESAITGESAPV--E-G-D TGGT---SD----------G--F-DR ---L-ALLV-LIPTTIG-LLSAIG 
IB    23/35 -------GD---V-PG--i--DG-v -D-S--TGE--PV-K--G-- --G--N--G---v--------T---- ------VLvIACPCALGLATP--- 
IB    28/35 ------------v--G-----DG-v ------TGE--Pv------- --G-----G----------------- ------VLvi-CPC-L----P--- 
IB    35/35 ---------------------DG-- -------GE----------- -------------------------- ------------P----------- 
IIA   17/26 -A-eLVPGDIV---VGD-VPAD-R- vd-S-LTGES--V-K--e-- F-GT----G----vVv-TG--TE-G- F-IAVALAVAAIPEGLPAViT-CL 
IIA   21/26 -----VPGD-V----G--vPAD-R- v--S-LTGE---V-K----- f-GT----G----vV--TG--T--G- --i-V--AVA-IPEGLP-ViT--L 
IIA   26/26 -------GD-v----G----AD--- -----L-GE---v-K----- f-------G---------------G- -------AV--iPEGLP--iT--L 
IIB    8/12 -v-d--VGDI-----GD--PADGv- IDESSLTGESd-v-K--d-- LSGT-V-EGSG-M-VTAVGvNS--G- FII-VTv-VVAVPEGLPLAVT-SL 
IIB   10/12 -v----VGDi-----GD--PADG-- iDESS-TGES--v-K----- -SGT-V-eG-G-M-vT-VG--S--G- FI--vTv-VVAVPEGLPLAVT--L 
IIB   12/12 -------GDi-----G----AD--- ----S--GE----------- -SG-----G-----v--VG-----G- ----v-v--v-vPE-LPL-v---L 
IIC   14/21 -AEieVVGD-VEvKGGDRiPAD-Ri VDNSSLTGESEPQ-RS-efT FFST---EGTA-GiVI-TGD-Tv-GR vIFLIGIIVANVPEGLLATVTVCL 
IIC   17/21 -----V-GD-VEvK-GD-iPAD-Ri VDNSSLTGESEPQ-R--e-T FfST---EGT--GiVI--GD-Tv-GR viF-IGIIVANVPEGLLATVTV-L 
IIC   21/21 -------G--V----G--iP---R- VD-SSLTGESe-Q------- -------e----G-vi--G-----G- -i-----iVA-VPEG---TvT--- 
IID    3/ 3 -S---V-GD------GD--PADLRL TDE-LLTGESLPV-KD---- -SSS-V-KGRA-GI---T-----IG- -IYA-----S-IP-SL--VL-ITM 
IIIA  21/32 eA--LVPGDIi-i--G-IiPAD-RL iDQSALTGESLPV-K--GD- fSGST-K-GE---VV-ATG--TFFG- ----LV-LI-GIPIA-P-V---TM 
IIIA  26/32 -A---VPGDI-----G-iiPAD-R- iDQSA-TGESL-V-K--Gd- fS-S--K-GE---vV-ATG--TF-G- ----L---I-GiPi--P-V---TM 
IIIA  32/32 -----V-Gd------G-----D--- iD----TGES---------- ---S----GE----v--T---Tf-G- --------i---P-----V---T- 
IIIB   3/ 3 PI--LVPGD---LAAGD--PAD-R- --Q--L-GESLPVEK----- -MGT-V-SG-AQA-V-ATG--TWFG- -LFAL-VAVGLTPEMLPMIV-S-L 
IV     7/11 -W--v-VGDiV-i---d-IPAD-iL IET--LDGETNLK-R----- -d--LLRG--L-NT--v-GiVv-TG- ---fvILf---vPISL-V--E-iK 
IV     9/11 ----v-VGD-v-------iPAD-iL i-T--LDGET--K------- ----------L-------G-Vv--G- -----I-f---vPISL-v--e--K 
IV    11/11 -------GD-v---------AD--- i-T--LDGET--K------- --------------------V---G- -----------vP--L-v-----K 
V      5/ 7 ---ELVPGDI---------PCD-iL V-E-MLTGESVPi-K----- --------------ViRTGF-T-KG- I---LDIIT--vPP-LP----v-i 
V      6/ 7 ------P-DI---------PCD--- V-E--LTGESVP--K----- --------------V--TGF-T--G- ------I-T--vPP-LP------- 
V      7/ 7 --------D------------D--- --E----GE-----K----- --------------V--T-f----G- ------I-----PP--P------- 

            96                               131                                                       188
Segments    eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee fffffffffffffffffffffffffffffffffffffffffffffffffffffffff
All 158/159 ---------------------DKTGT---------- ----------------------------------------GD--------------- 
All 135/159 ----------E----------DKTGTLT-------- ---------------D---------i------G------TGD----A-------G-- 
All 090/159 ---V----AvE-LG----iCSDKTGTLT-N---V-- ---------G-----DPPR------i------GI-V-MiTGD---TA-AIA---Gi- 
IA     3/ 3 NV-ATSGRAVEA-GD---L-LDKTGTITLGNR-A-- ---------GV--LKDI-K-GI-E-F---R-M---TVM-TGDN--TA--IA-EAGVD 
IB    23/35 GiLiK----LE-------v-FDKTGTLT-G---V-- ------------Av-D---------i--L---G-----LTGDN---A-AiA---GI- 
IB    28/35 GiL-K----LE-------v--DKTGTLT-G---V-- ---------------D---------i--L---G------TGD----A--iA---G-- 
IB    35/35 ---------------------DKTGT---------- ----------------------------------------GD--------------- 
IIA   17/26 NAiVR-LPSVETLGC--VICSDKTGTLTTN-M-V-- --E--L---G-vG--DPPR-EV--AI--C--AGIRV-MITGD---TA-AI---IG-- 
IIA   21/26 -A--R-LP-VETLG---VICSDKTGTLT-N-M-V-- -----L---G--G--DPPR--V---i--C--AGI-v--ITGD---TA-AI----G-- 
IIA   26/26 ------L--VE-LG----iC-DKTGTLT---M---- ---------G-----D--R--------------i-----TGD---TA--i------- 
IIB    8/12 NNLVRHL-ACETMG-AT-ICSDKTGTLT-N-MTVV- -----LT-I--VGI-DPVRPEVP-A---C--AGITVRMVTGDNI-TARAIA--CGIL 
IIB   10/12 -NLVR-L--CETMG-AT-ICSDKTGTLT-N-M-VV- ------------GI-DP-R--V--A---C--AGI-VRMVTGDN--TA-AIA--C-I- 
IIB   12/12 --LVR----CETM------C-DKTGTLT-N-M---- ------------GI-D--R--V------C--AG--VRMV-GDN--TA--IA--C-I- 
IIC   14/21 NCLVKNLEAVETLGSTS-ICSDKTGTLTQNRMTVAH FP---LCFVGL-SMIDPPRA-VPDAV-KCRSAGIKVIMVTGDHPITAKAIAK-VGII 
IIC   17/21 NCLVKNLEAVETLGSTS-ICSDKTGTLTQNRMTVAH FP---L-F-GL-SMIDPPR--VPDAV-KCRSAGIKVIMVTGDHPITAKAIA--VGII 
IIC   21/21 ------L--vETLGS---I-SDKTGTLTQNRMTV-H F------F--L-S---PPR--V--AV--C---GI-ViMVTGDHPITA-AIA--V-II 
IID    3/ 3 -V-VR-L--LEALG-V-DICSDKTGT-TQG-M--R- --E--L-F--L-GIYDPPR-E--GAV---H-AGI-VHMLTGD---TAKAIA-EVGI- 
IIIA  21/32 -AI-----AIEE-AG-dvLCSDKTGTLTLNKLSv-- ---G-W---G--P--DPPRHD-AeTI--A--LGv-VKMiTGD---I-KET-R-LGMG 
IIIA  26/32 -AI-----AIE--AG-dvLCSDKTGTLT-NKL---- -----W---G--P--DPPR-D---TI--A--LG--VKM-TGD---I-KET-R-LG-G 
IIIA  32/32 --I------iE-------LC-DKTGTLT-N------ -----W---------DPPR-D----i------G--vKM-TGD---I-------L--- 
IIIB   3/ 3 KVIVK-L-AIQNFGAMD-LCTDKTGTLTQD-I-LE- -DE--L--EG---FLDPPKE----A--AL---G--VK-LTGD---V-A--C-EVG-D 
IV     7/11 ----R---I-EELGQ-EYIFSDKTGTLT-N-M-FKK -IE--L-LLG-T-IED-LQDGV-d-IE-L--AGIKiW-LTGDK-ETAINIG-S--L-
IV     9/11 ----------eeLG---Yi-SDKTGTLT-N-M-F-K --E--L-L-G-T-iED-LQ--V------L--AGI--W-LTGDK-ETA--I--S--L-
IV    11/11 -----------eLG-------DKTGTLT-N-M---- --E----L---T-iED-LQ--V---------A-I--W-LTGD--ETA--i-------
V      5/ 7 -IfC--P--i---G-I---CFDKTGTLTED-L---G --E--L-FLGF-i--N-LK--T---I--L-------iMiTGDN-LT---V--E-GIi 
V      6/ 7 -I-C--P------G-I---CFDKTGTLT-d-L---- --E--L---G--i-----K------I-----------M-TGDN-LT---V------i 
V      7/ 7 -i----P------G-i----FDKTGTLT-------- --E------G--i-----K---------------------GDN--T---v------- 

                                                       234                             265
Segments    gggggggggggggggggggggggggggggggggggggggggggggg hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
All 158/159 ------P--K-----------------GDG-ND------------- -------------------------------
All 135/159 ------P--K---v-------------GDG-ND-P-L--A-iG--- ---------d---------------------
All 090/159 vFA---P--K--iV--LQ-----VAMTGDGVNDAPALK-ADiGIA- GTDvA--AAD-vL-d-----Iv-Av---R-i
IA     3/ 3 F-AEATPE-K---I---Q--G-LVAMTGDGTNDAPALAQA-V--AM GTQAA-EA-NMVDLDS-PTKLI--V-IGKQ-
IB    23/35 v-A---P-dK---i--L---G--VAMVGDGINDAPALA-A--GiA- GTDvA---ADi-L----L--v-----L---T
IB    28/35 ------P--K---i---------V-MVGDGiNDAP-LA-A--G-A- G-DvA---AD--L----L-------------
IB    35/35 ------P--K-----------------GDG-ND------------- -------------------------------
IIA   17/26 -F-R--P-HK--IV--L----ei-AMTGDGVNDAPALK-AdIGIAM GT-VAK-ASdMVL-DDNF-TIV-AV-EGR-I
IIA   21/26 -F-R--P-HK---V--L-------AMTGDGVNDAPALK-AdIGiAM GT-VAK-A-dMvL-DDNF-TIV-Av-EGR-I
IIA   26/26 ---R-----K---v----------AMTGDG-N---AL--------- ---V--------L-DD-F--i--A---G--i
IIB    8/12 VLARSSP-DK-TLV----d---VVAVTGDGTND-PALK-ADVGFAM GTdVAKEASDIIL-DDNF-SIV-AV-WGRNV
IIB   10/12 V-ARS-P-DK--LV--------VVAVTGDGTND-PALK-ADVG-AM GTdVAKeASDiI--DDNF--IV--v-WGR-V
IIB   12/12 V-AR--P-DK------------vVAVTGDG-ND-PAL-----G--M GT-VA-----ii---D-F--IV------R-v
IIC   14/21 VFARTSPQQKLIIVEGCQR-GAIVAVTGDGVNDSPALKKADIGVAM GSDVSKQAADMILLDDNFASIVTGVEEGRLI
IIC   17/21 VFARTSPQQKLIIVEGCQR---iVAVTGDGVNDSPALKKADIGvAM GSD--K-AADMILLDDNFASIVTGVEEGRLI
IIC   21/21 VFARTSP-QK--IVE--Q-----V-VTGDG-ND-PAL-KADIGvAM G-D--K--AD-iLL-DNFAS-V-GvE-GR-I
IID    3/ 3 VI-RC-PQTKV-MIEALHRR--F--MTGDGVNDSPSLK-ANVGI-M GSDV-K-ASDIVL-DDNF-SI-NA-EEGRRM
IIIA  21/32 GFA-VFPEHKY-IV--LQ-R---V-MTGDGVNDAPALKKAD-GIAV ATDAAR-A-DIVLT-PGLS-II-A--TSR-I
IIIA  26/32 GFA-VFP-HKY-iV--LQ-------MTGDGVNDAP-LKKAD-GIAV ATDAAR-A-DIV---PGLS-II-A---SR-I
IIIA  32/32 GF----P--Ky--------------MTGDGVND-P-LK-A--GiA- --DAAR---D-V----G---ii-A----R-i
IIIB   3/ 3 -FARLTP--K-RI---L---GH-VGF-GDGINDAPALR-AD-GISV A-DIA-E--DIILLEK-LMVLEEGVI-GR-T
IV     7/11 iCCR-SP-QKA-vV--v-------LAIGDG-NDV-MIQ-A-VGvGI -EG-A----DY-I-QF--L--LLLVHGR--Y
IV     9/11 -CCR--P-QKA-vV--v--------AIGDG-NDV-MIQ-A-vGvGI -EG-A----Dy-I-QF-----L-LVHG---Y
IV    11/11 ---R--P--K----------------I--G-ND--MI--A--GvGI -EG-A----D-----F-----L---HG---Y
V      5/ 7 VyARM-P-QK--Li--L------V-MCGDGANDC-ALK-A-vGISL ---eASiAAPFTS----i--V--VI-EGR--
V      6/ 7 VyAR--P--K------L--------MCGDG-ND--AL--A--GI-L ---e-S----FTS-------v--v---GR--
V      7/ 7 vyAR--P--K------L----------GDG-ND--A---A--Gi-- ---e------f-S-------v--v---GR--
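
A consensus line of the kind shown above can be derived column by column from aligned sequences. The following is a minimal sketch under stated assumptions, not the authors' implementation: a residue is written in uppercase when at least `min_count` sequences share it, a lowercase letter is written when one of the pairs I/V, D/E or F/Y jointly reaches `min_count`, and a dash is written otherwise. Which member of a pair is printed in lowercase is not stated in the legend; the sketch assumes the more frequent one. The function name `consensus` and the toy input are hypothetical, while the three 20-residue strings are segment b of the Type IV sequences 9, 14 and 11 from Fig. 3.

```python
from collections import Counter

# Residue pairs treated as interchangeable in the figure legend.
PAIRS = [("I", "V"), ("D", "E"), ("F", "Y")]


def consensus(seqs, min_count):
    """Build a consensus string from equal-length aligned sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    out = []
    for column in zip(*seqs):
        counts = Counter(column)
        residue, n = counts.most_common(1)[0]
        if n >= min_count:
            out.append(residue)          # one residue reaches the threshold
            continue
        symbol = "-"
        for a, b in PAIRS:
            if counts[a] + counts[b] >= min_count:
                # Assumption: print the more frequent member in lowercase.
                symbol = max((a, b), key=lambda r: counts[r]).lower()
                break
        out.append(symbol)
    return "".join(out)


# Segment b of the three Type IV sequences shown in Fig. 3.
type_iv_b = [
    "IKTDQLDGETDWKLRVACPL",  # Seq. 9
    "IETSNLDGETNLKIRQGLPA",  # Seq. 14
    "IETCNLDGETSLKQRMVPAK",  # Seq. 11
]
print(consensus(type_iv_b, 3))
print(consensus(type_iv_b, 2))
print(consensus(["IDF", "VEY", "IDF"], 3))  # toy input showing the pair rule
```

Because only three of the eleven Type IV sequences are used here, the output is expected to be more conserved than the IV 11/11 line in the figure.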

Fig. 6.

Phylogenetic tree based on core sequences of 159 P-type ATPases.

The tree was constructed using the neighbour joining method. Some areas (which are not connected to the rest of the tree) have been enlarged 40 % to clarify the distribution of species. When the substrate specificity of the ATPases present in each family is known, it corresponds in all cases to the name of the family. The numbers of the sequences correspond to the numbers used in Table 1. The black branches show ATPases originating from bacteria and archaea, and the grey branches show ATPases originating from eukarya. The P-type ATPases from the following fully sequenced organisms are marked with symbols: Escherichia coli; Haemophilus influenzae; Methanococcus jannaschii; Mycoplasma genitalium; Mycoplasma pneumoniae; Saccharomyces cerevisiae; Synechocystis PCC6803. The abbreviations are HM: heavy metals; NAS: no assigned specificity; PL: phospholipids.

Phylogenetic tree of 159 P-type ATPase sequences
