\bibitem{evo001} Romain A. Studer and Marc Robinson-Rechavi

How confident can we be that orthologs are similar, but paralogs differ?

Trends in Genetics, Volume 25, Issue 5, May 2009, Pages 210-216, doi:10.1016/j.tig.2009.03.004

Opinion

Romain A. Studer^a and Marc Robinson-Rechavi^a

^aDepartment of Ecology and Evolution, Biophore, Lausanne University, CH-1015 Lausanne, Switzerland and Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland

Available online 14 April 2009.

Homologous genes are classified into orthologs and paralogs, depending on whether they arose by speciation or duplication. It is widely assumed that orthologs share similar functions, whereas paralogs are expected to diverge more from each other. But does this assumption hold up on further examination? We present evidence that orthologs and paralogs are not so different in either their evolutionary rates or their mechanisms of divergence. We emphasize the importance of appropriately designed studies to test models of gene evolution between orthologs and between paralogs. Thus, functional change between orthologs might be as common as between paralogs, and future studies should be designed to test the impact of duplication against this alternative model.

Article Outline

The relationship between gene duplication and gene function

Asymmetric rates of sequence evolution within and between species

Divergence in gene expression

Changes in protein function

Large-scale amino acid studies

Positive selection

The importance of study design in evolutionary genomics

Which study to test which evolutionary model?

The relationship between gene duplication and gene function

Understanding how genes acquire new functions is necessary if we are to have a more complete understanding of molecular evolution. Particular attention has been given to gene duplication because it is often assumed that changes in gene function are preferentially associated with duplication [1]. This means that it is important to distinguish orthologs (see Glossary) from paralogs [2] and [3] (Figure 1) because, under this assumption, experimental information concerning one gene can be readily generalized to all its orthologs. The distinction between orthology and paralogy has been emphasized in recent years through the revival of interest in gene duplication [1] and [4] and the emergence of comparative approaches as a major tool of genomics [5]. The idea that orthologs share similar functions, whereas paralogs have different functions, has thus become accepted by many and is the standard textbook model, as exemplified by the ‘Phylogenetics Factsheet’ of the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/About/primer/phylo.html). Of note, only one-to-one orthologs should be expected to conserve function if duplication has an important impact, a distinction made in some resources [6] and [7].

Figure 1. Formation of orthologs and paralogs. The evolutionary tree shows six homologous genes from three species designated A, B and C. Genes are represented by circles and each color represents a different species; genes with paralogs are circled by a thicker line (only the gene in the A lineage does not have a paralog). Boxes at nodes represent duplication events. Duplication 1 produced paralogs α and β in the ancestor of B and C, whereas duplication 2 produced paralogs β₁ and β₂ in the C lineage. All genes from B and C are co-orthologs to the gene from A. Genes α and β are in-paralogs relative to speciation 1, but are out-paralogs relative to speciation 2. Genes β₁ and β₂ are in-paralogs relative to both speciations in the tree. Genes Bα and Cα are one-to-one orthologs.
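
The relationships listed in the caption of Figure 1 all follow from one rule: two homologs are orthologs if their last common ancestor is a speciation node, and paralogs if it is a duplication node. The following is a minimal Python sketch of that rule, assuming the tree topology described in the caption. The nested-tuple encoding, the gene labels and the names `TREE`, `leaves` and `relation` are illustrative choices, not part of the article.

```python
# Internal nodes are (event, left, right); leaves are gene names.
# Topology as described in the Figure 1 caption (an assumed encoding).
TREE = (
    "speciation", "A",                       # speciation 1: A versus (B, C)
    ("duplication",                          # duplication 1: alpha and beta
     ("speciation", "B_alpha", "C_alpha"),   # speciation 2 in the alpha lineage
     ("speciation", "B_beta",                # speciation 2 in the beta lineage
      ("duplication", "C_beta1", "C_beta2"))),  # duplication 2 in lineage C
)


def leaves(node):
    """Return the set of gene names below a node."""
    if isinstance(node, str):
        return {node}
    _, left, right = node
    return leaves(left) | leaves(right)


def relation(node, g1, g2):
    """Classify two distinct genes by the event at their last common ancestor."""
    event, left, right = node
    for child in (left, right):
        # Descend while a single subtree still contains both genes.
        if not isinstance(child, str) and {g1, g2} <= leaves(child):
            return relation(child, g1, g2)
    return "ortholog" if event == "speciation" else "paralog"
```

With this encoding, `relation(TREE, "B_alpha", "C_alpha")` gives `"ortholog"` and `relation(TREE, "C_beta1", "C_beta2")` gives `"paralog"`, in line with the caption. The sketch assumes both genes are distinct leaves of the tree and does not distinguish in-paralogs from out-paralogs, which would also require a reference speciation.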

The focus on duplication has also led to the elaboration of theoretical models of evolution after duplication and their testing in genome wide studies [1] and [8]. It is recognized that duplication does not always lead to changes in function. But the assumption that changes in function are commonly associated with duplication has rarely been explicitly tested. Although there have been many studies of comparative genomics focused on the role of duplication (for a review, see Ref. [1]), few have compared the evolution of paralogs with the evolution of orthologs. However, these studies repeatedly find little, if any, specific impact of duplication. This pattern is surprising if the standard model is correct.

This ‘standard model’ makes two predictions. First, paralogs are expected to diverge more per unit of time than orthologs. Second, paralogs are expected to diverge frequently in ways that are rarely observed between orthologs; for example, different substrate specificities. Divergence can concern different aspects of gene function [3], such as constraints on protein sequence or structure, patterns of expression, or participation in molecular networks. We contrast this to an ‘alternative model’, in which all homologs diverge approximately proportionally to time, whether they are paralogs or orthologs. First, distant homologs are expected to differ more than close homologs; notably, recent paralogs should have more similar function than ancient orthologs. Second, radical changes, for example in substrate specificity, are expected to be rare both between paralogs and between orthologs, but increasingly probable with time of divergence. This view is partially captured by the concept of in-paralogs [3] and [9] (Figure 1); according to this model, paralogs that have diverged recently are expected to share similar functions. But no change is still expected between orthologs. And continuous divergence of paralogs over time is not fully represented by two discrete classes, defined by a unique speciation event.
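
The first prediction of each model can be phrased as a comparison of divergence per unit of time between the two classes of homologs. The sketch below is only an illustration of that comparison: the function name `mean_rate_by_class`, the tuple layout and the numbers in `toy_pairs` are invented for this example and are not data from any study.

```python
from statistics import mean


def mean_rate_by_class(pairs):
    """Mean divergence per unit time for each class of homolog pair.

    pairs: iterable of (kind, divergence, time), where kind is
    'ortholog' or 'paralog' and time is the age of the split.
    """
    rates = {}
    for kind, divergence, time in pairs:
        rates.setdefault(kind, []).append(divergence / time)
    return {kind: mean(values) for kind, values in rates.items()}


# Made-up numbers for illustration only (not measurements).
toy_pairs = [
    ("ortholog", 0.10, 100.0),
    ("ortholog", 0.30, 300.0),
    ("paralog", 0.05, 50.0),
    ("paralog", 0.40, 400.0),
]
rates = mean_rate_by_class(toy_pairs)
```

Under the ‘standard model’, the paralog rate would be expected to exceed the ortholog rate; under the ‘alternative model’, the two would be expected to be similar, as in these toy numbers, which were chosen so that divergence is proportional to time in both classes.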

We examine the evidence that the evolution of gene function is different during the divergence of paralogs and orthologs. Our aim here is to highlight open questions, not to provide a comprehensive review on the nature of gene duplication. For recent reviews, please see Refs [1], [8] and [10]. We emphasize that the design of comparative genomics studies has a major impact on the conclusions that can be drawn. Comparing paralogs within a species can provide a measure of divergence after duplication, but cannot prove a specific role for duplication. Thus, although many studies report divergence between paralogs, their design does not usually enable the standard model to be tested.

Asymmetric rates of sequence evolution within and between species

One of the main themes of genomic studies of duplicate genes is that the copies do not evolve symmetrically [8]; that is, they do not evolve at the same rate. Protein sequences are known to evolve at different rates between paralogs [11], [12] and [13], providing a rough indication of the evolution of biochemical function. The rate of evolution of protein sequences has also been abundantly studied between orthologs, in which asymmetry is frequent. For example, proteins evolve faster in rodents than in other mammals [14], and rates are variable among insects [15]. More broadly, the hypothesis of a constant rate of evolution rarely seems to hold [16], which provides indirect evidence for a widespread asymmetry of rates of evolution. To our knowledge, the level of asymmetry has not been directly compared between paralogs and orthologs in a unified study.
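
One way to put paralogs and orthologs on the same footing is to measure asymmetry with a single statistic that needs only a pair of genes and an outgroup. The sketch below uses the standard three-point estimate of branch lengths from pairwise distances. It is an illustrative assumption, not the method of any of the studies cited here, and the names `branch_lengths` and `asymmetry` are hypothetical.

```python
def branch_lengths(d12, d1o, d2o):
    """Three-point estimate of the branches leading to genes 1 and 2.

    d12: distance between the two genes being compared
    d1o, d2o: distance from each gene to the outgroup gene
    """
    b1 = (d12 + d1o - d2o) / 2.0
    b2 = (d12 + d2o - d1o) / 2.0
    return b1, b2


def asymmetry(d12, d1o, d2o):
    """Relative difference between the two branches (0 means symmetric)."""
    b1, b2 = branch_lengths(d12, d1o, d2o)
    total = b1 + b2  # equals d12 by construction
    if total <= 0:
        raise ValueError("the two genes must have diverged (d12 > 0)")
    return abs(b1 - b2) / total
```

The same function can be applied to a pair of paralogs with a pre-duplication outgroup or to a pair of orthologs with an outgroup species, so that the two distributions of asymmetry can be compared directly.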

The explanations for differences in the rates of evolution between species (i.e. when examining orthologs) have focused on life history traits, such as generation time and population size [14], whereas comparisons of paralogs have focused on functional change [1]. Although paralogs in the same species are not expected to differ in life history traits, asymmetry can be affected by non-functional differences between paralogs, such as differences in recombination [17]. And importantly, functional change could affect orthologs.

To take a ‘gene's eye’ view, the important distinction is between the following two mechanisms of increase in the rate of evolution: (i) a relaxation of negative selection; and (ii) an increase in positive selection. Both mechanisms can cause changes in gene function. And both can result from changes in population size, functional constraints, or other causes. Thus, from this perspective, the evolution of paralogs and orthologs can be affected by the same mechanisms.

Divergence in gene expression

The expression levels of paralogs also evolve asymmetrically, as reported in yeast [18] and [19], Arabidopsis [11] and [20] and Xenopus [21]. Comparing gene expression levels between different species is difficult because of differences in established experimental conditions, and in their anatomy or life cycles. For this reason, the putative regulatory elements are also often examined: if the regulatory sequence changes, then it is likely that the expression of the gene is also altered [22]. Thus, asymmetric divergence of cis-elements has been reported in teleost fish [23] and in yeast [19].

The extent of divergence in expression after duplication has rarely been compared with divergence in expression after speciation on a large scale. In general, duplicate genes show faster divergence of expression than singleton genes, as reported in mammals [24] and Drosophila [25]. Faster divergence for duplicate genes is also found for the divergence of cis-regulatory sequences in nematodes [22]. But the design of these studies cannot distinguish whether: (i) genes in which expression evolves rapidly are retained in duplicate; or (ii) gene expression evolves faster after duplication. This also highlights the importance of distinguishing one-to-one orthologs from orthologs with secondary duplication, that is, in-paralogs (Figure 1).

It has been suggested that functional divergence after duplication is mostly due to changes in expression [19]. Curiously, the same claim has been made for divergence between species [26], and genomic studies have reported significant divergence of expression patterns between orthologs. One of the most conservative studies, a comparison of human and mouse correcting for experimental differences between species and for estimation error [27], reports 16% of orthologs in which expression seems to diverge neutrally, whereas another third diverge in a detectable manner (d > 0.02 in Figure 4 of Ref. [27]) despite purifying selection.

Changes in protein function

Although global changes in rates of protein evolution inform us that the process is not constant, they are not very informative about specific changes in function [3], [28] and [29]. For that, either specific proteins must be investigated in detail, or more specific classes of amino acid changes need to be compared. Small-scale studies have shown that functional divergence occurs both between orthologs and between paralogs, although they cannot provide a test for the relative importance of these events.

The nuclear hormone receptor superfamily provides classical examples of functional divergence between paralogs, but divergence between orthologs can also be found. For example, steroid receptors seem to have developed specialized functions after duplication [30]. However, in amphioxus, the ortholog of vertebrate estrogen receptor does not bind to estrogen, whereas the ortholog of other steroid receptors does [31]; in this case, paralogs share function, not orthologs. Nuclear receptor function can also change in the absence of any duplications. For example, the Drosophila ecdysone receptor differs from other insect orthologs in ligand binding and in dimerization pattern [32]. More broadly, there is mounting evidence that orthologous transcription factors are not always functionally equivalent [33].

Other well-studied protein families show a similar pattern, with some functional changes between paralogs, and others between orthologs. This is, for example, the case of the remarkable shifts in wavelength sensitivity between middle/long-wavelength-sensitive (M/LWS) pigments [34]. Changes in enzymatic activity have also been reported between orthologs, for example between lysozymes, with adaptation to herbivorous diets [35], or between RubisCO enzymes in plants, with convergent adaptation to dry conditions [36].

Large-scale amino acid studies

Most small-scale studies show that changes in a few key protein sites have resulted in a change in function. More broadly, such changes in biochemical function can result from either: (i) a rare change in amino acid at a site that remains constrained under its new form; or (ii) a change in selective pressure (‘covarions’), that is, sites with different evolutionary rates in different parts of the phylogeny, owing to changes in functional role [28], [37] and [38]. For example, residue Trp348 is invariant in CED-3 caspases, where it is crucial for substrate specificity, whereas it is highly variable in paralogous ICE caspases [39]. Several methods to detect such changes have been developed explicitly to study duplication [37] and [40]. But studies of orthologs have also found evidence for ‘covarions’, for example, in functionally important sites between HIV-1 subtypes [41].

In the vertebrate hemoglobin family, similar proportions of ‘covarion’ sites were reported between orthologs and paralogs, but an excess of changes in constrained amino acids was found between paralogs [42]. This excess of rare changes was confirmed in a larger study of protein domains [43], and provides some evidence for specific divergence after duplication. No large-scale study of ‘covarions’ has been carried out, to our knowledge.

Amino acid changes can also be classified as radical (e.g. a change in physicochemical properties) or conservative. Applying a simple model of evolution to 1821 protein domains, the excess of radical replacement after duplication was found to be non-significant relative to speciation [43]. Similarly, no difference in the type of amino acid changes was found after whole genome duplication in yeast [13]. The application of more elaborate models to estimate radical changes in amino acids in five families of orthologous genes and five of paralogous genes found high variability in the substitution process but, again, no difference between paralogs and orthologs was found [44]. This similarity of patterns of amino acid substitutions between paralogs and orthologs implies that evolution of the molecular function of proteins might not follow the standard model of divergence after duplication.
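
The radical versus conservative distinction can be made concrete by counting, in a pairwise alignment, the differences that cross a physicochemical class boundary. The sketch below is a deliberately crude illustration: the four-class grouping in `CLASSES` is one common simplification chosen here as an assumption, and it is not the model used in Refs [13], [43] or [44]. The name `count_changes` is hypothetical.

```python
# One common simplification of amino acid physicochemical classes.
# This grouping is an assumption; other schemes exist.
CLASSES = {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
}
CLASS_OF = {aa: name for name, members in CLASSES.items() for aa in members}


def count_changes(seq1, seq2):
    """Count (radical, conservative) differences between aligned sequences.

    Columns with a gap or an unrecognized residue are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    radical = conservative = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in CLASS_OF or b not in CLASS_OF:
            continue
        if CLASS_OF[a] == CLASS_OF[b]:
            conservative += 1  # change within a class
        else:
            radical += 1       # change across classes
    return radical, conservative
```

Applying the same count to paralog pairs and to ortholog pairs of comparable divergence would give two proportions of radical changes that can then be compared; the statistical treatment is left out of this sketch.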

Positive selection

A reasonable simplifying assumption to detect neofunctionalization is that it is driven, at least in part, by positive selection [1]. Thus, the detection of branch-specific episodes of positive selection could be indicative of changes in protein function after duplication [45] and [46]. In snake venom PLA₂ genes, positive selection has been linked with the evolution of toxin function both after duplication and after speciation [47]. We have recently reported a scan for such positive selection in vertebrate genes [48]. Although we found that evolution of one-third of paralogs has been shaped by positive selection after duplication, this was not more than that detected in the absence of duplication. We did find an excess of relaxed purifying selection after duplication. This relaxation might explain the patterns of divergence of constrained amino acid sites between paralogs [42] and [43], and of high asymmetry between paralogs [13].

Interestingly, positive selection has been increasingly detected between orthologs with the recent progress of data and methods [49], [50] and [51]. The functional implications of these observations are not yet clear, but such selection seems weak, although frequent [48], [51] and [52]. It could thus correspond to an accumulation of small changes, which might or might not result in large functional change over time [50]. An integration of the role of frequent positive selection between orthologs into the standard model (i.e. that most functional innovations are associated with duplication) remains to be performed.

The importance of study design in evolutionary genomics

An important issue for evaluating the impact of duplication on gene function is study design. Obviously, if only divergence between paralogs is examined, then divergence between orthologs cannot be reported (Figure 2i–iii; Table 1). The major role of design in the study of paralogs has previously been highlighted by the demonstration that a biased subset of genes is retained in duplicate [53]. This bias in gene retention (‘Davis and Petrov effect’ [13]) can lead to confusion between the effect of duplication on genes and the effect of gene retention. Thus, we need to control for biased retention to conclude that there is an effect of duplication (e.g. see Refs [12] and [13]; Figure 2iv,v). Furthermore, we need to control for evolution in the absence of duplication (e.g. see Refs [43] and [48]; Figure 2viii,ix) to conclude that duplication has a specific effect. Because of biased gene retention, this control should include singleton orthologs of the duplicated genes.

Figure 2. Some designs for the study of gene duplication. Designs i–iii represent three strategies to study paralogs using information from a single species: (i) measuring the divergence between paralogs; (ii) contrasting the divergence of paralogs to the divergence of random gene pairs; (iii) contrasting the characteristics of paralogs to those of single copy genes (genes without a paralog detected in the same species). Designs iv–vii represent four ways of using outgroup species to determine more accurately the divergence of paralogs; note that, in all cases, these are in-paralogs relative to speciation with the outgroup: (iv) using singleton orthologs to determine the characteristics of genes retained as paralogs; (v) using pairs of singleton orthologs to determine the evolutionary rate of genes retained as paralogs; (vi) determining asymmetry of paralog divergence; (vii) contrasting divergence from the outgroup of paralogs versus singletons. Designs viii,ix represent a complete phylogenetic analysis, contrasting evolution of orthologs and paralogs in outgroups and ingroups; note that defining in-paralogs and out-paralogs in this case depends on the speciation used as reference, and that the paralogs are all co-orthologs to the singletons: (viii) comparison of branch specific evolutionary rates; (ix) comparison of functional characteristics of genes. The symbols used are the same as in Figure 1. Genes are represented by circles; each color represents a different species; genes with paralogs are circled by a thicker line. Boxes at nodes represent duplication events. The thick dashed arrows indicate which elements are compared to study the effect of duplication, whereas the thin dashed arrows indicate other comparisons included in this design. These examples of design show the importance both of using more species, and of defining the phylogenetic relationships between genes under study.

Table 1.

The impact of study design on tests of evolution after duplication

^a The numbering of study designs follows that outlined in Figure 2. All predictions must be understood as statistical (i.e. applicable to large datasets only).

^b ‘Functional’ indicates comparison of functional genomics data (e.g. microarrays, protein–protein interactions) or of functional annotations (e.g. Gene Ontology); ‘Sequence’ indicates sequence-based comparisons (e.g. dN/dS).

^c Limited here to DDC subfunctionalization; see Ref. [1].

^d Corrected for divergence time (i.e. per million years, or per dS).

^e In principle, positive selection should be expected, but this design provides very little power to detect it.

^f The outgroup also might have changed under this model.

^g Without an excess on branches after duplication.

^h For example, new expression domain, new interaction partner, not found in other homologs. The strength of inference depends on the number of homologs with functional data.

Which study to test which evolutionary model?

In Table 1 and Figure 2, we propose a classification of the design of studies of duplicate genes, with the predictions they enable being classified under three simplified models of evolution: (i) subfunctionalization after duplication; (ii) neofunctionalization after duplication; and (iii) the ‘alternative model’ of equal change after duplication or speciation. In this cartoon version of neofunctionalization, one copy evolves strictly in the same manner as a singleton, whereas the other acquires a new function by positive selection. Similarly, in this version of subfunctionalization, any gain of function is excluded. It is clear that more complex evolutionary fates are possible and probable [1] and [54], but our point here is that even such simplistic models, with very different expectations, will not be distinguished by inadequate study design. We represent the ‘alternative model’ by a version of neofunctionalization without the assumption that it occurs preferentially after duplication. It should be noted that this alternative model is deliberately simplistic, for illustration purposes, and that we do not propose that it effectively describes gene evolution.

It is apparent that some designs do not enable any differential predictions, including the practice of comparing pairs of paralogs to random pairs within the same species (Figure 2ii). This does not mean of course that such studies are not useful. For example, protein–protein interaction data are available only for a handful of distantly related model organisms, which limits possible study designs. Although this has led to difficulty in defining clear evolutionary trajectories [55], notable results include biased retention of paralogs relative to network position [19], and the discovery that paralogs interact frequently with each other [55]. Such interactions could facilitate the evolution of new functions, for example if one member of a heterodimer pair loses the original function but retains the ability to interact, thereby becoming a specific repressor [31].

In a more complex design, a small but significant excess of amino acid changes on duplication branches was found in a phylogenomic study of Chordate genes [56]. But, curiously, the acceleration concerned both branches preceding and following the duplication. This suggests that, in the Chordate dataset analyzed, there could be simultaneous pressure for substitutions and duplication in some ‘diversifying genes’ [56]. That would make it difficult to disentangle retention bias and the effects of duplication in a design in which speciation branches were analyzed without taking into account the rest of the evolutionary history of each gene (i.e. frequent or rare duplications).

Not born equal

But what about the duplication events themselves? Different mechanisms result in the duplication of different functional categories of genes [57]. Moreover, a distinction should be made between mechanisms that are symmetric or not ‘at birth’ [58]: at one extreme, paralogs formed as a result of whole genome duplication are initially redundant in every aspect of their genomic context and organization, whereas at the other extreme, retrotransposed genes differ profoundly from their parent genes as soon as they are generated [10]. This has an obvious impact on the expectations of evolution after duplication.

A more subtle bias is that singleton genes in a lineage that experienced whole genome duplication have undergone a period of evolution as one member of a redundant pair; this might distinguish their evolutionary trajectory from that of orthologs, which did not experience such an event (e.g. mammalian versus teleost orthologs [59]). Finally, young and old paralogs can differ because evolutionary pressure can change over time [1], and because different types of genes can be retained in duplicate for different lengths of time. Thus, it is to be expected that different models explain the evolution of paralogs which have been generated by different mechanisms.

Concluding remarks

Few large-scale studies have been conducted that enable an explicit testing of all three models presented in Table 1, let alone more elaborate ones. It is clear that, as often occurs in evolutionary biology, specific cases of all scenarios can be found. The questions are thus: what is the most common mode of evolution? And do minority modes of evolution also have an important role, or are they of negligible impact?

Despite limitations, we are struck by the number of small- or large-scale studies that report less difference than expected under the ‘standard model’ between the evolution of paralogs and orthologs [2] and [3]. In addition, orthologs are increasingly found to diverge without duplication, in sequence [49], [50] and [51], in expression [27], and even in knockout phenotypes [60]. But the existence of these studies does not seem to modify the standard view of phylogenomics, as summarized in the NCBI factsheet for example. Perhaps this reflects the lack of interest in negative results: if a study did not find a difference, and we ‘know’ that there is one, then surely the authors did not look well enough, or in the right place? There might be something to this view, insofar as it is possible that protein sequence evolution, for which we have the most data, is less impacted by duplication than other features [19]. In addition, there seems to be consistent support for a relaxation of purifying selection on sequences after duplication, although its impact on function remains to be established.

Whether changes in gene function occur preferentially after duplication or not is important for our understanding of evolution because duplication is frequently viewed as the preferred mechanism to generate novelty in genomes [1]. It is also important to evaluate the relevance of transferring annotations between orthologs. In this context, an intriguing recent result is that sequence similarity seems to be a better predictor of common Gene Ontology terms than orthology [61]. This is expected under the ‘alternative model’ of functional divergence with time, but not under the ‘standard model’ of preferential divergence after duplication.

In conclusion, we would like to emphasize two points: one methodological, indicating that the design of studies of comparative genomics imposes strong limitations on the questions that can be answered; and one biological, suggesting that changes in function might be as common between orthologs as between paralogs. Future work should focus on testing the role of duplication versus speciation with appropriate designs and data.

Acknowledgements

We acknowledge funding from Etat de Vaud and the Swiss National Science Foundation grant 116798. We thank C. Dessimoz, L. Duret, G.V. Markov, J.-N. Volff, A. Wagner, K.H. Wolfe, and all members of the Robinson-Rechavi laboratory for helpful discussions, and the three reviewers and the Editor for constructive remarks.

References

1 G.C. Conant and K.H. Wolfe, Turning a hobby into a job: how duplicated genes find new functions, Nat. Rev. Genet. 9 (2008), pp. 938–950.

2 E.V. Koonin, Orthologs, paralogs, and evolutionary genomics, Annu. Rev. Genet. 39 (2005), pp. 309–338.

3 R. Rentzsch and C.A. Orengo, Protein function prediction – the power of multiplicity, Trends Biotechnol. DOI:10.1016/j.tibtech.2009.01.002.

4 J.S. Taylor and J. Raes, Duplication and divergence: the evolution of new genes and old ideas, Annu. Rev. Genet. 38 (2004), pp. 615–643.

5 E.H. Margulies and E. Birney, Approaches to comparative sequence analysis: towards a functional view of vertebrate genomes, Nat. Rev. Genet. 9 (2008), pp. 303–313.

6 A.C. Berglund et al., InParanoid 6: eukaryotic ortholog clusters with inparalogs, Nucleic Acids Res. 36 (2008), pp. D263–D266.

7 T.J. Hubbard et al., Ensembl 2009, Nucleic Acids Res. 37 (2009), pp. D690–D697.

8 M. Semon and K.H. Wolfe, Consequences of genome duplication, Curr. Opin. Genet. Dev. 17 (2007), pp. 505–512.

9 E.L.L. Sonnhammer and E.V. Koonin, Orthology, paralogy and proposed classification for paralog subtypes, Trends Genet. 18 (2002), pp. 619–620.

10 H. Kaessmann et al., RNA-based gene duplication: mechanistic and evolutionary insights, Nat. Rev. Genet. 10 (2009), pp. 19–31.

11 G. Blanc and K.H. Wolfe, Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution, Plant Cell 16 (2004), pp. 1679–1691.

12 F.G. Brunet et al., Gene loss and evolutionary rates following whole-genome duplication in teleost fishes, Mol. Biol. Evol. 23 (2006), pp. 1808–1816.

13 D.R. Scannell and K.H. Wolfe, A burst of protein sequence evolution and a prolonged period of asymmetric evolution follow gene duplication in yeast, Genome Res. 18 (2008), pp. 137–147.

14 S.I. Nikolaev et al., Life-history traits drive the evolutionary rates of mammalian coding and noncoding genomic elements, Proc. Natl. Acad. Sci. U. S. A. 104 (2007), pp. 20443–20448.

15 E.M. Zdobnov and P. Bork, Quantification of insect genome divergence, Trends Genet. 23 (2007), pp. 16–20.

16 D. Graur and W. Martin, Reading the entrails of chickens: molecular timescales of evolution and the illusion of precision, Trends Genet. 20 (2004), pp. 80–86.

17 Y. Clement et al., Does lack of recombination enhance asymmetric evolution among duplicate genes? Insights from the Drosophila melanogaster genome, Gene 385 (2006), pp. 89–95.

18 I. Tirosh and N. Barkai, Comparative analysis indicates regulatory neofunctionalization of yeast duplicates, Genome Biol. 8 (2007), p. R50.

19 I. Wapinski et al., Natural history and evolutionary principles of gene duplication in fungi, Nature 449 (2007), pp. 54–61.

20 E.W. Ganko et al., Divergence in expression between duplicated genes in Arabidopsis, Mol. Biol. Evol. 24 (2007), pp. 2298–2309.

21 M. Semon and K.H. Wolfe, Preferential subfunctionalization of slow-evolving genes after allopolyploidization in Xenopus laevis, Proc. Natl. Acad. Sci. U. S. A. 105 (2008), pp. 8333–8338.

22 C.I. Castillo-Davis et al., cis-Regulatory and protein evolution in orthologous and duplicate genes, Genome Res. 14 (2004), pp. 1530–1536.

23 A. Woolfe and G. Elgar, Comparative genomics using Fugu reveals insights into regulatory subfunctionalization, Genome Biol. 8 (2007), p. R53.

24 L. Huminiecki and K.H. Wolfe, Divergence of spatial gene expression profiles following species-specific gene duplications in human and mouse, Genome Res. 14 (2004), pp. 1870–1879.

25 Z. Gu et al., Duplicate genes increase gene expression diversity within and between species, Nat. Genet. 36 (2004), pp. 577–579.

26 B. Prud’homme et al., Emerging principles of regulatory evolution, Proc. Natl. Acad. Sci. U. S. A. 104 (Suppl 1) (2007), pp. 8605–8612.

27 B.Y. Liao and J. Zhang, Evolutionary conservation of expression profiles between human and mouse orthologous genes, Mol. Biol. Evol. 23 (2006), pp. 530–540.

28 H. Philippe et al., Heterotachy and functional shift in protein evolution, IUBMB Life 55 (2003), pp. 257–265.

29 D.M. Kristensen et al., Prediction of enzyme function based on 3D templates of evolutionarily important amino acids, BMC Bioinformatics 9 (2008), p. 17.

30 J.T. Bridgham et al., Evolution of hormone-receptor complexity by molecular exploitation, Science 312 (2006), pp. 97–101.

31 J.T. Bridgham et al., Evolution of a new function by degenerative mutation in cephalochordate steroid receptors, PLoS Genet. 4 (2008), p. e1000191.

32 T. Iwema et al., Structural and evolutionary innovation of the heterodimerisation interface between USP and the ecdysone receptor ECR in insects, Mol. Biol. Evol. 26, pp. 753–768.

33 V.J. Lynch and G.P. Wagner, Resurrecting the role of transcription factor change in developmental evolution, Evolution 62 (2008), pp. 2131–2154.

34 S. Yokoyama, Evolution of dim-light and color vision pigments, Annu. Rev. Genomics Hum. Genet. 9 (2008), pp. 259–282.

35 W. Messier and C.-B. Stewart, Episodic adaptive evolution of primate lysozymes, Nature 385 (1997), pp. 151–154.

36 P.A. Christin et al., Evolutionary switch and genetic convergence on rbcL following the evolution of C4 photosynthesis, Mol. Biol. Evol. 25 (2008), pp. 2361–2368.

37 X. Gu, Statistical methods for testing functional divergence after gene duplication, Mol. Biol. Evol. 16 (1999), pp. 1664–1674.

38 M. Anisimova and D.A. Liberles, The quest for natural selection in the age of comparative genomics, Heredity 99 (2007), pp. 567–579.

39 Y. Wang and X. Gu, Functional divergence in the caspase gene family and altered functional constraints: statistical analysis and prediction, Genetics 158 (2001), pp. 1311–1320.

40 R.J. Edwards and D.C. Shields, BADASP: predicting functional specificity in protein families using ancestral sequences, Bioinformatics 21 (2005), pp. 4190–4191.

41 O. Penn et al., Evolutionary modeling of rate shifts reveals specificity determinants in HIV-1 subtypes, PLOS Comput. Biol. 4 (2008), p. e1000214.

42 S. Gribaldo et al., Functional divergence prediction from evolutionary analysis: a case study of vertebrate hemoglobin, Mol. Biol. Evol. 20 (2003), pp. 1754–1759.

43 C. Seoighe et al., Significantly different patterns of amino acid replacement after gene duplication as compared to after speciation, Mol. Biol. Evol. 20 (2003), pp. 484–490.

44 G.C. Conant et al., Modeling amino acid substitution patterns in orthologous and paralogous genes, Mol. Phylogenet. Evol. 42 (2007), pp. 298–307.

45 A. Levasseur et al., Tracking the connection between evolutionary and functional shifts using the fungal lipase/feruloyl esterase A family, BMC Evol. Biol. 6 (2006), p. 92.

46 J.A. Tennessen, Positive selection drives a correlation between non-synonymous/synonymous divergence and functional divergence, Bioinformatics 24 (2008), pp. 1421–1425.

47 V.J. Lynch, Inventing an arsenal: adaptive evolution and neofunctionalization of snake venom phospholipase A2 genes, BMC Evol. Biol. 7 (2007), p. 2.

48 R.A. Studer et al., Pervasive positive selection on duplicated and nonduplicated vertebrate protein coding genes, Genome Res. 18 (2008), pp. 1393–1402.

49 A.G. Clark et al., Evolution of genes and genomes on the Drosophila phylogeny, Nature 450 (2007), pp. 203–218.

50 A. Eyre-Walker, The genomic rate of adaptive evolution, Trends Ecol. Evol. 21 (2006), pp. 569–575.

51 C. Kosiol et al., Patterns of positive selection in six mammalian genomes, PLoS Genet. 4 (2008), p. e1000144.

52 P. Andolfatto, Hitchhiking effects of recurrent beneficial amino acid substitutions in the Drosophila melanogaster genome, Genome Res. 17 (2007), pp. 1755–1762.

53 J.C. Davis and D.A. Petrov, Preferential duplication of conserved proteins in eukaryotic genomes, PLoS Biol. 2 (2004), p. E55.

54 X. He and J. Zhang, Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution, Genetics 169 (2005), pp. 1157–1164.

55 A. Presser et al., The evolutionary dynamics of the Saccharomyces cerevisiae protein interaction network after duplication, Proc. Natl. Acad. Sci. U. S. A. 105 (2008), pp. 950–954.

56 C.R. Johnston et al., Evaluation of whether accelerated protein evolution in chordates has occurred before, after, or simultaneously with gene duplication, Mol. Biol. Evol. 24 (2007), pp. 315–323.

57 J.C. Davis and D.A. Petrov, Do disparate mechanisms of duplication add similar genes to the genome?, Trends Genet. 21 (2005), pp. 548–551.

58 B.P. Cusack and K.H. Wolfe, Not born equal: increased rate asymmetry in relocated and retrotransposed rodent gene duplicates, Mol. Biol. Evol. 24 (2007), pp. 679–686.

59 V. Ravi and B. Venkatesh, Rapidly evolving fish genomes and teleost diversity, Curr. Opin. Genet. Dev. 18 (2008), pp. 544–550.

60 B.-Y. Liao and J. Zhang, Null mutations in human and mouse orthologs frequently result in different phenotypes, Proc. Natl. Acad. Sci. U. S. A. 105 (2008), pp. 6987–6992.

61 A.M. Altenhoff and C. Dessimoz, Phylogenetic and functional assessment of orthologs inference projects and methods, PLOS Comput. Biol. 5 (2009), p. e1000262.

62 I.K. Jordan et al., Duplicated genes evolve slower than singletons despite the initial rate increase, BMC Evol. Biol. 4 (2004), p. 22.

Glossary

Bias in gene retention

the retention of copies of genes after duplication is not random, relative to gene function or evolutionary rate. This bias can be a confounding factor in large-scale analyses of paralogs: the functions of duplicate genes in a genome result both from the bias in retention and from evolution after duplication.

Homologs

genes descending from a common ancestor.

In-paralogs

paralogs resulting from a duplication after a speciation event of reference.

Negative selection

selection that decreases the chance of fixation of a mutation because it is detrimental; it results in a decrease in the rate of evolution of selected mutations.

Neofunctionalization

the process in which one paralog gains a new function, which is selectively advantageous.

One-to-one orthologs

orthologs which are present in a single copy in each genome of interest.

Orthologs

homologs which have diverged since a speciation event.

Out-paralogs

paralogs resulting from a duplication before a speciation event of reference.

Paralogs

homologs which have diverged since a duplication event.

Positive selection

selection that increases the chance of fixation of a mutation because it is beneficial; it results in an increase in the rate of evolution of selected mutations.

Subfunctionalization

the process in which paralogs partition the ancestral function, so that each performs only part of this function. Subfunctionalization can happen by reciprocal degenerative mutations (the duplication-degeneration-complementation, or DDC, model), or by positive selection for specialization (escape from adaptive conflict).
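
The definitions of orthologs, paralogs, in-paralogs and out-paralogs above can be read as rules applied to a gene tree. The following is a minimal Python sketch, assuming a gene tree whose internal nodes have already been labeled as speciation or duplication events. The `Node` class, the function names and the toy human/mouse tree are all hypothetical illustrations and are not taken from the article or from any orthology resource.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # eq=False: nodes are compared by identity, not by content
class Node:
    name: str
    event: Optional[str] = None  # "speciation" or "duplication"; None for a gene (leaf)
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, *kids: "Node") -> "Node":
        """Attach child nodes and return self so calls can be chained."""
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def ancestors(node: Node) -> List[Node]:
    """Path from the node up to the root, starting with the node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def last_common_ancestor(a: Node, b: Node) -> Node:
    seen = {id(n) for n in ancestors(a)}
    for n in ancestors(b):
        if id(n) in seen:
            return n
    raise ValueError("genes are not in the same tree")


def classify(a: Node, b: Node) -> str:
    """Orthologs diverged at a speciation node, paralogs at a duplication node."""
    event = last_common_ancestor(a, b).event
    return "orthologs" if event == "speciation" else "paralogs"


def paralog_kind(a: Node, b: Node, reference: Node) -> str:
    """In- or out-paralogs relative to a reference speciation node."""
    dup = last_common_ancestor(a, b)
    if dup.event != "duplication":
        raise ValueError("the two genes are not paralogs")
    if reference in ancestors(dup)[1:]:   # duplication happened after the speciation
        return "in-paralogs"
    if dup in ancestors(reference)[1:]:   # duplication happened before the speciation
        return "out-paralogs"
    return "undefined for this reference"


# Toy tree (hypothetical): an old duplication gives copies A and B; each copy
# then splits at the human/mouse speciation; copy B duplicates again in human.
human_a, mouse_a = Node("humanA"), Node("mouseA")
human_b1, human_b2, mouse_b = Node("humanB1"), Node("humanB2"), Node("mouseB")
spec_a = Node("specA", "speciation").add(human_a, mouse_a)
dup_h = Node("dupHuman", "duplication").add(human_b1, human_b2)
spec_b = Node("specB", "speciation").add(dup_h, mouse_b)
root = Node("dupRoot", "duplication").add(spec_a, spec_b)

print(classify(human_a, mouse_a))                 # orthologs
print(classify(human_a, human_b1))                # paralogs
print(paralog_kind(human_b1, human_b2, spec_b))   # in-paralogs
print(paralog_kind(human_a, mouse_b, spec_b))     # out-paralogs
```

In this toy tree, humanB1 and mouseB are also classified as orthologs, although not one-to-one orthologs, because the human lineage carries two copies. The sketch says nothing about function: it only encodes the tree-based definitions.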
