Monday, October 31, 2016

This is the forest primeval: each tree an evolution

This is the forest primeval. 
The murmuring pines and the hemlocks,
Bearded with moss, and in garments green,
indistinct in the twilight,
Stand like Druids of old,
with voices sad and prophetic,
Stand like harpers hoar,
with beards that rest on their bosoms.
Loud from its rocky caverns,
the deep-voiced neighboring ocean Speaks,
and in accents disconsolate
answers the wail of the forest.

These famous lines, from Longfellow's 1847 epic poem Evangeline, spoke of a sad human tale in the days of early European settlement in the New World. The story was about people, but there is much to tell about the Druids of old themselves: the lives and evolution of trees can be quite surprising.

This post was motivated by a recent trade book The Hidden Life of Trees, by Peter Wohlleben, a German forester, who described what his life in the woods has taught him about trees, their nature, evolution, and biology.  It's written at a pop sci level, and is often quite subjective and evocative, but it's laden with important facts when it comes to trying to understand the evolution of these terrestrial beasts. And, in a sense, these facts generalize in many ways.

The author discusses all sorts of observations that have been made about the responses of different parts of trees (bark, vessels, wood, leaves, roots) to their environment (sunlight, presence of trees of their own species, or of other species, of insect, fungal and other parasites), even going so far as to describe the sociology of trees and their responses to being isolated vs being in a forest of their friends and relatives. Trees interact with their own detected relatives, connected via communication through the air and underground via fungal networks, to the point that they even assist each other, when in trouble, with nutrients. It is a remarkable picture of interactions between organisms in organized, positively coordinated ecosystems.

The book is very selectionist, in that every trait is described as an adaptation to this or that condition, but trees that seem very similar can be different in these respects, so there is the assumption (very hard to prove, if even possible) that each trait evolved 'for' its current function. This is a more deterministically selectionist or even determinist viewpoint than we think is justified by actual fact, even if the functional aspects are as described (which we have no reason to doubt). Indeed, many examples are given of ways trees respond differently to different environments, and hence are not rigidly programmed to live in one particular way.

In any case, our point here, aside from recommending an interesting and informative book, is to muse over some aspects of trees that we think are rather widely missed: their lives, and how they manage to survive and evolve.

While the author is a very strong selectionist when it comes to explaining who does what among trees or among woodsy species, I think he--and for all I know the vast majority of botanists--overlooks what is likely a very major aspect of arboreal evolution.

One major problem that seems to need to be more widely considered (maybe it is by botanists, but we haven't seen much that refers to this particular issue) relates to the implications of time scales (a matter that Wohlleben discusses in detail). Trees can live for decades, centuries, or even millennia.
Wohlleben very clearly and repeatedly stresses that trees live on a time scale so different from ours that it can be hard for us to fathom how their lives evolve--and evolve is the appropriate word. If trees are, so to speak, rooted in their origins for hundreds or even thousands of years, while insects, fungi, and other plants and animals (not to mention microbes) have generation times of years or even minutes, how can trees ever adapt or survive? By the time a tree has reached a venerable age, hasn't it been out-evolved by almost every other species that lives in or that is blown into its neighborhood? By the time it dies, any of its seeds that germinate must already be obsolete, ready to fight the last war--or the last war minus 10 or 100 or 1000.

One answer, in my view, is the largely overlooked fact of the evolution of tree--of each individual tree--during its lifetime.

The evolution of tree (not trees)
Unless my feeble knowledge of botany totally fails me, I think there is a lot going on, even at the normal pace of things, within an individual tree. That is, each tree is a remarkable micro-example of evolution in itself.

Each tree starts life as a single fertilized egg (its seed). During its life, that little cell divides into billions, probably trillions, of descendant cells. These make up its roots and, important for us, its trunk, branches, leaves, and flowers. While there are various aspects of communication among these cells, they are essentially independent.

At each cell division along the way from the root tip to the branch tip (or 'meristem'), mutations can occur. This happens in humans, too; such mutations are called somatic because they don't occur in the individual's germ line (that is, the cell lineage that leads to sperm or egg), and hence while the mutation carried by the original cell and its descendants may affect the local tissue, the change isn't inherited by the next generation. Only mutations in the germ line are, and indeed that's where the idea of 'mutation' historically arose. Most somatic mutations will have no effect on the gene-usage of the cell involved, but if they do, the effect might be negative, and the cell will die or just misbehave in a way that has no consequences because it's surrounded by countless healthy cells. Sometimes, such as with cancer, somatic mutations can be devastating.

Trees are different. They have no separate somatic and germ lines. Mutations occurring along the way from the seed to the roots and limbs may lead to dead cells, or do nothing, or they may be screened for their 'fitness', that is, their ability to generate the bark, vascular, leaf, or other tissues in their local time and place. Cells carrying harmful mutations are, relative to other cells in the tree, removed by what we could call a version of natural selection. Those mutations that survive will be passed down the line or, rather, up the line as the trunk, branches, and leaves grow.

Here is a photo of an oak tree and (metaphorically) its single starting genome:

At the end of the countless stems in a tree, over its long lifetime, would be meristem cells each carrying a wide but individually unique variety of mutational differences from what was in the founding acorn. At the meristem, in the appropriate time of year, cells differentiate into pollen and ovule cells. These are many generations of selection away from their founding acorn, and on a given tree there must be a great variety of genotypes, whose sequences would form a tree (a phylogeny), much as we find when we compare DNA sequences from dog species, or from individual humans.
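A toy simulation can make this picture concrete. The sketch below is only an illustration under simple assumptions of ours, not a botanical model: every branch forks in two, each branch segment involves a fixed number of cell divisions, each division has the same small chance of adding one new mutation, and there is no selection at all. The function name grow_tree and every parameter value are hypothetical.

```python
import random


def grow_tree(depth, divisions_per_segment=50, mutation_rate=0.01, seed=0):
    """Return one genotype per branch tip after `depth` rounds of forking.

    A genotype is a frozenset of mutation IDs. The founding acorn carries
    none, so every ID marks a somatic mutation that arose during growth.
    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    next_id = 0
    tips = [frozenset()]  # the single founding genome
    for _ in range(depth):
        new_tips = []
        for genotype in tips:
            for _child in range(2):  # assumed: each branch forks in two
                mutations = set(genotype)  # inherit everything from the parent branch
                for _division in range(divisions_per_segment):
                    if rng.random() < mutation_rate:
                        mutations.add(next_id)  # a brand-new, unique mutation
                        next_id += 1
                new_tips.append(frozenset(mutations))
        tips = new_tips
    return tips


tips = grow_tree(depth=8)
print("branch tips:", len(tips))
print("distinct genotypes among tips:", len(set(tips)))
print("mutations shared by two neighboring tips:", len(tips[0] & tips[1]))
print("mutations shared by two distant tips:", len(tips[0] & tips[-1]))
```

Tips that forked from each other recently share the mutations of their common parent branch, while tips on opposite sides of the crown share few or none. That nested sharing is the phylogeny-like pattern described above, here without any screening for fitness.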

A single tree is a very large evolutionary 'experiment'. Branches affected by harmful mutations simply aren't there, so to speak. They and their genomic lineage are 'extinct'. A single tree, and its lifetime, comprise such a large 'experiment' that they are comparable in numbers to whole species of shorter-lived, germ-line-dependent organisms.

Here is a photo of a tree from our yard that may illustrate the point. Why are only the leaves on this one branch turning to fall colors so much earlier than the others on the same tree? There may be local environmental reasons, such as different sunlight or water supply or parasite effects, but this seems rather unlikely because other branches in similar positions, even on this same tree, are still green.

And now here is another photo, of a different tree in our front yard that we think illustrates the points we're making. This red oak loses its leaves in the usual way....except for the one major branch shown. Its leaves do not fall until the following spring, but the remaining branches on this tree drop in fall as would be normal. This happens every year and is not a fluke of some particular season.

A forester might have a local explanation, that there is some connection between the location of roots supplying these particular branches, relative to the underground water or soil conditions, but one possible explanation is somatic mutation. That is, some mutational effect, arising when the branch was early in its formation, led to a difference in the abscission  layers of the leaves to be produced by that branch, that retained those leaves through the winter.   If the explanation is local physical conditions, of course, that means the tree cannot be predicted from its founding acorn's sequence. But it is rather difficult to believe that somatic mutation doesn't have at least the kinds of effects seen.  A good experiment would be to take an acorn from this part of the tree and plant it next to one from another part of the tree and see what happens. Unfortunately, the answer wouldn't be available for many years....

Our point here is that among the countless cells in a tree's life, between its origin as a single cell and the also countless generations of acorns descended from its founding genome through its long life, there simply must have been countless somatic mutations, occurring all along the roots and trunk and branches, cell division by cell division. Their descendants, down the root network, and up the trunk and into the branches must have been screened for the viability of any phenotypic effects, which many must have had. If insects, bacteria, or animal predators attack, or the climate changes, some parts of a tree may be better able to survive than others. Cells in the tree's future life will have the benefit of these changes. The changes may be small, but they may accumulate over the decades. The branches affected by less helpful changes would flower less, or lead to branches that die or fall to predators, and so on--ones we never see later on, when we look at the tree. Among the countless meristems, every generation will be a population of differing genotypes to be passed on to its season's thousands and thousands of seeds.

In this way, at meristems everywhere (above ground) on the tree, cells with new genotypes are screened for suitability in the tree's environment at each time during its life. A tree is not a single organism, but a population of descendants of a founder. The acorn was primeval perhaps, but not the forest. It is this kind of within-life evolution that may, or perhaps must, explain how a single, immobile organism can survive for so long in the dynamics of local ecosystems.

That is, it's the tree itself, in its ever-renewing parts from root to twig, not just its evolving population of annual seeds, that must be evolving.  Decades, centuries, or millennia must often encompass changes in the biota around each primeval individual, and would destroy it, if it, too, were not evolving.  Otherwise, it would seem like asking for doom to be fixed in a given location for hundreds or thousands of years, surrounded by junior, dynamically evolving predators and competitors.  

The forest is always primeval: Each individual tree, in this view, is an evolving population, always adapting in its unchanging location to its locally changing conditions.

Thursday, October 27, 2016

Causal complexity in life

Evolution is the process that generates the relationships between genomes and traits in organisms.  Although we have written extensively and repeatedly about the issues raised by causal complexity,  we were led to write this post by a recent paper, in the 21 October 2016 issue of Science, which discusses molecular pathways to hemoglobin (Hb) gene function.  Although one might expect this to be rather simple and genomically direct, it is in fact complex and there are many different ways to achieve comparable function.

The authors, C Natarajan et al., looked at the genetic basis of adaptation to habitats at different altitudes, focusing on genes coding for Hb molecules, which transport oxygen in the blood to provide the body's tissues with this vital fuel. As a basic aspect of our atmosphere, oxygen concentrations differ at different altitudes, being low in mountainous regions compared to lowlands. Species must somehow adapt to their localities, and at least one way to do this is for oxygen transport efficiency mechanisms to differ at different elevations. Bird species have moved into and among these various environments on many independent occasions.

The affinity of Hb molecules for oxygen, that is, their ability to bind it, depends on their amino acid sequence, and the authors found that this varies by altitude. The efficiency is similar among species at similar altitudes, even if due to independent population expansions. But when they looked at the Hb coding sequences in different species, they found a variety of species-specific changes. That is, there are multiple ways to achieve similar function, so that parallel evolution at the functional level, which is what Nature detects, is achieved by many different mutational pathways. In that sense, while an adaptation can be predicted, a specific genetic reason cannot be.

The authors looked only at coding regions, but of course evolution also involves regulatory sequences (among other functional regions in DNA), so there is every reason to expect that there is even more complexity to the adaptive paths taken.

Important specific documentation....but not conceptually new, though unappreciated
The authors also looked at what they call 'resurrected ancestral' proteins, by experimentally testing the efficacy of some specific Hb mutations, and they found that genomic background made a major difference in how, or whether, a specific change would affect oxygen binding.  This shows that evolution is contingent on local conditions, and that a given genomic change depends on the genomic background.  The ad hoc, locally contingent nature of evolution is (or should be) a central aspect of evolutionary world views, but there is a widespread tendency to think in classical Mendelian terms, of a gene for this and a gene for that, so that one would expect similar results in similar, if independent areas or contexts.  This is a common, if often tacit, view underlying much of genome mapping to find genes 'for' some human trait, like important diseases.  But it is quite misleading, or more accurately, is very wrong.

In 2008 we wrote about this in Genetics, as we've done before and since here on MT and in other papers.  In the 2008 article we used the following image to suggest metaphorically the nature of this complex causation, with its alternative pathways and the like, where the 'trait' is the amount of water passing New Orleans on the Mississippi River.  The figure suggests how difficult it would be to determine 'the' causal source of the water, how many different ways there are to get the same river level.

Drainage complexity as a metaphor for genomic causal complexity.  Map by Richard Weiss and ArcInfo
One can go even further, and note that these are exactly the kinds of findings that are to be expected from, and are documented by, the huge list of association studies done of human traits. These typically find a great many genome regions whose variation contributes to the trait, usually each with a small individual effect, and mainly at low frequency in the population. That means that individuals with similar trait values (say, diabetes, obesity, tall or short stature, etc.) have different genotypes, that overlap in incomplete and individually unique ways.

We have written about this aspect of life, in what we called evolution by phenotype, in various places. Nature screens on traits directly and only on genes very indirectly in most situations in complex organisms. This means that many genotypes yield the same phenotype, and these will be equivalent in the face of natural selection and will experience genetic drift among them even in the face of natural selection, again because selection screens the phenotype. This is the process we called phenogenetic drift. These papers were not 'discoveries' of ours but just statements of what is pretty obvious even if inconvenient for those seeking simple genetic causation.
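How many genotypes can hide behind one phenotype is easy to count in a deliberately oversimplified case. The sketch below assumes, purely for illustration, a haploid genome of a few loci, each carrying allele 0 or 1, and a trait value that is just the sum of the alleles. Neither the function name genotypes_per_phenotype nor this additive model comes from the papers mentioned.

```python
from collections import Counter
from itertools import product


def genotypes_per_phenotype(n_loci):
    """Count how many genotypes produce each trait value.

    Assumptions (ours, for illustration): haploid genotypes, two alleles
    (0 or 1) per locus, and a purely additive trait equal to the allele sum.
    """
    counts = Counter()
    for genotype in product((0, 1), repeat=n_loci):
        counts[sum(genotype)] += 1
    return counts


counts = genotypes_per_phenotype(10)
for trait_value in sorted(counts):
    print(trait_value, counts[trait_value])
```

With only 10 loci, 252 of the 1,024 possible genotypes share the middle trait value of 5. Selection acting on the trait cannot tell those 252 apart, so their relative frequencies are free to drift, which is the situation described above.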

The Science paper on altitude adaptation shows this using stereotypical sequences from one individual each from a variety of different species, rather than from different individuals within each species, though one can expect that the same kind of variation must also exist within species. The point is that a priori prediction of how hemoglobin adaptation will occur is problematic, except that each species must have some adaptation to available oxygen. Parallel phenotype evolution need not be matched by parallel genotypic evolution because selection 'sees' phenotypes and doesn't 'care' about how they are achieved.

The reason for this complexity is simple: it is that this is how evolution working via phenotypes rather than genotypes molds the genetic aspects of causation.

Thursday, October 13, 2016

Genomic causation....or not

By Ken Weiss and Anne Buchanan

The Big Story in the latest Nature ("A radical revision of human genetics: Why many ‘deadly’ gene mutations are turning out to be harmless," by Erika Check Hayden) is that genes thought to be clearly causal of important diseases aren't always (the link is to the People magazine-like cover article in that issue.)  This is a follow-up on an August Nature paper describing the database from which the results discussed in this week's Nature are drawn.  The apparent mismatch between a gene variant and a trait can be, according to the paper, the result of technical error, a mis-call by a given piece of software, or due to the assumption that the identification of a given mutation in affected but not healthy individuals means the causal mutation has been found, without experimentally confirming the finding--which itself can be tricky for reasons we'll discuss.  Insufficient documentation of 'normal' sequence variation has meant that the frequency of so-called causal mutations hasn't been available for comparative purposes.  Again we'll mention below what 'insufficient' might mean, if anything.

People in general and researchers in particular need to be more than dismissively aware of these issues, but the conclusion that we still need to focus on single genes as causal of most disease, that is, do MuchMoreOfTheSame, which is an implication of the discussion, is not so obviously justified.   We'll begin with our usual contrarian statement that the idea here is being overhyped as if it were new, but we know that except for its details it clearly is not, for reasons we'll also explain.  That is important because presenting it as a major finding, and still focusing on single genes as being truly causal vs mistakenly identified, ignores what we think the deeper message needs to be.

The data come from a mega-project known as ExAC, a consortium of researchers sharing DNA sequences to document genetic variation and further understand disease causation, and now including data from approximately 60,000 individuals (in itself, rather small compared to what the purpose requires). The data are primarily exome sequences, that is, from protein-coding regions of the human genome, not from whole genome sequences, again a major issue. We have no reason at all to critique the original paper itself, which is large, sophisticated, and carefully analyzed as far as we can tell; but the claims about its novelty are, we think, very much hyperbolized, and that needs to be explained.

Some of the obvious complicating issues
We know that a gene generally does not act alone.  DNA in itself is basically inert.  We've been and continue to be misled by examples of gene causation in which context and interactions don't really matter much, but that leads us still to cling to these as though they are the rule.  This reinforces the yearning for causal simplicity and tractability.  Essentially even this ExAC story, or its public announcements, doesn't properly acknowledge causal context and complexity because it is critiquing some simplistic single-gene inferences, and assuming that the problems are methodological rather than conceptual.

There are many aspects of causal context that complicate the picture, that are not new and we're not making them up, but which the Bigger-than-Ever Data pleas don't address:
1.  Current data are from blood-samples and that may not reflect the true constitutive genome because of early somatic mutation, and this will vary among study subjects,
2.  Life-long exposure to local somatic mutation is not considered nor measured, 
3.  Epigenetic changes, especially local tissue-specific ones, are not included, 
4.  Environmental factors are not considered, and indeed would be hard to consider,
5.  Non-Europeans, and even many Europeans are barely included, if at all, though this is  beginning to be addressed, 
6.  Regulatory variation, which GWAS has convincingly shown is much more important to most traits than coding variation, is not included. Exome data have been treated naively by many investigators as if that is what is important, and exome-only data have been used as a major excuse for Great Big Grants that can't find what we know is probably far more important, 
7.  Non-coding regions, non-regulatory RNA regions are not included in exome-only data,
8.  A mutation may be causal in one context but not in others, in one family or population and not others, rendering the determination that it's a false discovery difficult,
9.  Single gene analysis is still the basis of the new 'revelations', that is, the idea being hinted at that the 'causal' gene isn't really causal....but one implicit notion is that it was misidentified, which is perhaps sometimes true but probably not always so,
10.  The new reports are presented in the news, at least, as if the gene is being exonerated of its putative ill effects.  But that may not be the case, because if the regulatory regions near the mutated gene have no or little activity, the 'bad' gene may simply not be being expressed.  Its coding sequence could falsely be assumed to be harmless, 
11. Many aspects of this kind of work are dependent on statistical assumptions and subjective cutoff values, a problem recently being openly recognized, 
12.  Bigger studies introduce all sorts of statistical 'noise', which can make something appear causal or can weaken its actual apparent cause.  Phenotypes can be measured in many ways, but we know very well that this can be changeable and subjective (and phenotypes are not very detailed in the initial ExAC database), 
13.  Early reports of strong genetic findings have well known upward bias in effect size, the finder's curse that later work fails to confirm.

Well, yes, we're always critical, but this new finding isn't really a surprise
To some readers we are too often critical, and at least some of us have to confess to a contrarian nature.  But here is why we say that these new findings, like so many that are by the grocery checkout in Nature, Science, and People magazines, while seemingly quite true, should not be treated as a surprise or a threat to what we've already known--nor a justification of just doing more, or much more of the same.

Gregor Mendel studied fully penetrant (deterministic) causation.  That is what we now know to be 'genes', in which the presence of the causal allele (in 2-allele systems) always caused the trait (green vs yellow peas, etc.; the same is true of recessive as dominant traits, given the appropriate genotype). But this is generally wrong, save at best for the exceptions such as those that Mendel himself knowingly and carefully chose to study.  But even this was not so clear!  Mendel has been accused of 'cheating' by ignoring inconsistent results. This may have been data fudging, but it is at least as likely to have been reacting to what we have known for a century as 'incomplete penetrance'.  (Ken wrote on this a number of years ago in one of his Evolutionary Anthropology columns.)  For whatever reason--and see below--the presence of a 'dominant' gene or 'recessive' homozygosity at a 'causal' gene doesn't always lead to the trait.

In most of the 20th century the probabilistic nature of real-world as opposed to textbook Mendelism has been completely known and accepted.  The reasons for incomplete penetrance were not known and indeed we had no way to know them as a rule.  Various explanations were offered, but the statistical nature of the inferences (estimates of penetrance probability, for example) were common practice and textbook standards.  Even the original authors acknowledge incomplete penetrance, but this essentially shows that what the ExAC consortium is reporting are details but nothing fundamentally new nor surprising.  Clinicians or investigators acting as if a variant were always causal should be blamed for gross oversimplification, and so should hyperbolic news media.

Recent advances such as genomewide association studies (GWAS) in various forms have used stringent statistical criteria to minimize false discovery.  This has led to mapped 'hits' that satisfied those criteria only accounting for a fraction of estimated overall genomic causation.  This was legitimate in that it didn't leave us swamped with hundreds of very weak or very rare false positive genome locations.  But even the acceptable, statistically safest genome sites showed typically small individual effects and risks far below 1.0. They were not 'dominant' in the usual sense.  That means that people with the 'causal' allele don't always, and in fact do not usually, have the trait.  This has been the finding for quantitative traits like stature and qualitative ones like presence of diabetes, heart attack-related events, psychiatric disorders and essentially all traits studied by GWAS. It is not exactly what the ExAC data were looking at, but it is highly relevant and is the relevant basic biological principle.

This does not necessarily mean that the target gene is not important for the disease trait, which seems to be one of the inferences headlined in the news splashes.  This is treated as a striking or even fundamental new finding, but it is nothing of that sort.  Indeed, the genes in question may not be falsely identified, but may very well contribute to risk in some people under some conditions at some age and in some environments.  The ExAC results don't really address this because (for example) to determine when a gene variant is a risk variant one would have to identify all the causes of 'incomplete penetrance' in every sample, but there are multiple explanations for incomplete penetrance, including the list of 1 - 13 above as well as methodological issues such as those pointed out by the ExAC project paper itself.

In addition, there may be 'protective' variants in the other regions of the genome (that is, the trait may need the contribution of many different genome regions), and working that out would typically involve "hyper astronomical" combinations of effects using unachievable, not to mention uninterpretable, sample sizes--from which one would have to estimate risk effects of almost uncountable numbers of sequence variants.  If there were, say, 100 other contributing genes, each with their own variant genotypes including regulatory variants, the number of combinations of backgrounds one would have to sort through to see how they affected the 'falsely' identified gene is effectively uncountable.
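The arithmetic behind "effectively uncountable" can be checked directly. The sketch below assumes, only for illustration, that each of the 100 other genes has just three possible genotypes (as with two alleles: AA, Aa, aa); real genes, with regulatory variants included, would have many more. The function name and the rough world-population figure are our own assumptions.

```python
def background_combinations(n_genes, genotypes_per_gene=3):
    """Number of distinct multi-gene backgrounds, assuming genes combine freely."""
    return genotypes_per_gene ** n_genes


combos = background_combinations(100)
world_population = 8_000_000_000  # rough figure, used only for scale

print(f"possible backgrounds: {combos:.3e}")
print(f"backgrounds per living person: {combos // world_population:.3e}")
```

Even under this minimal assumption the count is roughly 5 x 10^47, vastly more than the number of people who have ever lived, so no sample could observe more than a vanishing fraction of the backgrounds.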

Even the most clearly causal genes such as variants of BRCA1 and breast cancer have penetrance far less than 1.0 in recent data (here referring to lifetime risk; risk at earlier ages is very far from 1.0). The risk, though clearly serious, depends on cohort, environmental and other mainly unknown factors.  Nobody doubts the role of BRCA1 but it is not in itself causal.  For example, it appears to be a mutation repair gene, but if no (or not enough) cancer-related mutations arise in the breast cells in a woman carrying a high-risk BRCA1 allele, she will not get breast cancer as a result of that gene's malfunction.

There are many other examples of mapping that identified genes that even if strongly and truly associated with a test trait have very far from complete penetrance.  A mutation in HFE and hemochromatosis comes to mind: in studies of some Europeans, a particular mutation seemed always to be present, but if the gene itself were tested in a general data base, rather than just in affected people, it had little or no causal effect.  This seems to be the sort of thing the ExAC report is finding.
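The gap between "the mutation is nearly always present in affected people" and "the mutation has little causal effect in the general population" follows from Bayes' rule. The sketch below uses invented numbers, not HFE or ExAC data, and the function name penetrance_from_frequencies is hypothetical.

```python
def penetrance_from_frequencies(p_carrier_given_affected, p_affected, p_carrier):
    """P(affected | carrier) = P(carrier | affected) * P(affected) / P(carrier)."""
    if not 0 < p_carrier <= 1:
        raise ValueError("p_carrier must be in (0, 1]")
    joint = p_carrier_given_affected * p_affected  # P(carrier and affected)
    if joint > p_carrier:
        raise ValueError("inputs are inconsistent: more affected carriers than carriers")
    return joint / p_carrier


# Invented illustrative values: 90% of affected people carry the variant,
# the disease affects 0.2% of the population, and 5% of people are carriers.
risk = penetrance_from_frequencies(0.90, 0.002, 0.05)
print(f"P(affected | carrier) = {risk:.3f}")
```

With these made-up values, a variant found in nine of ten patients still leaves a carrier with only a few percent chance of being affected. Studying affected people alone estimates the first quantity; only a general database can estimate the second.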

The generic reason is again that genes, essentially all genes, work only in their context. That context includes 'environment', which refers to all the other genes and cells in the body and the external or 'lifestyle' factors, and also age and sex as well.  There is no obvious way to identify, evaluate or measure the effects of all possibly relevant lifestyle effects, and since these change, retrospective evaluation has unknown bearing on future risk (the same can be said of genomic variants for the same reason).  How could these even be sampled adequately?

Likewise, volumes of long-existing experimental and highly focused results tell the same tale. Transgenic mice, for example, in which the same mutation is introduced into their 'same' gene as in humans, very often show little or no, or only strain-specific effects.  This is true in other experimental organisms. The lesson, and it's by far not a new or very recent one, is that genomic context is vitally important, that is, it is person-specific genomic backgrounds of a target gene that affect the latter's effect strength--and vice versa: that is, the same is true for each of these other genes. That is why to such an extent we have long noted the legerdemain being foisted on the research and public communities by the advocates of Big Data statistical testing.  Certainly methodological errors are also a problem, as the Nature piece describes, but they aren't the only problem.

So if someone reports some cases of a trait that seem too often to involve a given gene, such as the Nature piece seems generally to be about, but searches of unaffected people also occasionally find the same mutations in such genes (especially when only exomes are considered), then we are told that this is a surprise.  It is, to be sure, important to know, but it is just as important to know that essentially the same information has long been available to us in many forms.  It is not a surprise--even if it doesn't tell us where to go in search of genetic, much less genomic, causation.

Sorry, though it's important knowledge, it's not 'radical' nor dependent on these data!
The idea being suggested is that (surprise, surprise!) we need much more data to make this point or to find these surprisingly harmless mutations.  That is simply a misleading assertion, or attempted justification, though it has become the intentional industry standard closing argument.

It is of course very possible that we're missing some aspects of the studies and interpretations that are being touted, but we don't think that changes the basic points being made here.  They're consistent with the new findings but show that for many very good reasons this is what we knew was generally the case, that 'Mendelian' traits were the exception that led to a century of genetic discovery but only because it focused attention on what was then doable (while, not widely recognized by human geneticists, in parallel, agricultural genetics of polygenic traits showed what was more typical).

But now, if things are being recognized as being contextual much more deeply than in Francis Collins' money-strategy-based Big Data dreams, or 'precision' promises, and our inferential (statistical) criteria are properly under siege, we'll repeat our oft-stated mantra: deeply different, reformed understanding is needed, and a turn to research investment focused on basic science rather than exhaustive surveys, and on those many traits whose causal basis really is strong enough that it doesn't really require this deeper knowledge.  In a sense, if you need massive data to find an effect, then that effect is usually very rare and/or very weak.

And by the way, the same must be true for normal traits, like stature, intelligence, and so on, for which we're besieged with genome-mapping assertions, and this must also apply to ideas about gene-specific responses to natural selection in evolution.  Responses to environment (diet etc.) manifestly have the same problem.  It is not just a strange finding of exome mapping studies for disease. Likewise, 'normal' study subjects now being asked for in huge numbers may get the target trait later on in their lives, except for traits basically present early in life.  One can't doubt that misattributing the cause of such traits is an important problem, but we need to think of better solutions than Big Big Data, because failing to confirm a gene doesn't help, and finding that 'the' gene is only 'the' gene in some genomic or environmental backgrounds is the proverbial and historically frustrating needle-in-the-haystack search.  So the story's advocated huge samples of 'normals' (random individuals) cannot really address the causal issue definitively (except to show what we know, that there's a big problem to be solved).  Selected family data may--may--help identify a gene that really is causal, but even they have some of the same sorts of problems.  And they may apply only to that family.

The ExAC study is focused on severe diseases, which is somewhat like Mendel's selective approach, because it is quite obvious that complex diseases are complex.  It is plausible that severe, especially early onset diseases are genetically tractable, but it is not obvious that ever more data will answer the challenge.  And, ironically, the ExAC study has removed just such diseases from their consideration! So they're intentionally showing what is well known, that we're in needle in haystacks territory, even when someone has reported big needles.

Finally, we have to add that these points have been made by various authors for many years, often based on principles that did not require mega-studies to show.  Put another way, we had reason to expect what we're seeing, and years of studies supported that expectation.  This doesn't even consider the deep problems about statistical inference that are being widely noted and the deeply entrenched nature of that approach's conceptual and even material vested interests (see this week's Aeon essay, e.g.).  It's time to change, but doing so would involve deeply revising how resources are used--of course one of our common themes here on the MT--and that is a matter almost entirely of political economy, not science.  That is, it's as much about feeding the science industry as it is about medicine and public health.  And that is why it's mainly about business as usual rather than real reform.

Friday, October 7, 2016

Science journals: Anything for a headline

Well, this week's sensational result is reported in the Oct 5 Nature in a paper about limits to the human lifespan. The unsensational nature of this paper shows yet again how Nature and the other 'science' journals will take any paper that they can use for a cheap headline.  This paper claims that the human life span cannot exceed 115 (though the cover picture in a commentary in the same issue is of a woman--mentioned in the paper itself--who lived to be substantially older than that!).  The Nature issue has all the exciting details of this novel finding, which of course have been trumpeted by the story-hungry 'news' media.

In essence the authors argue that maximum longevity on a population basis has been increasing only very slowly or not at all over recent decades.  It is, one might say, approaching an asymptote of strong determination. They suggest that there is, as a result of many complex contributing factors-of-decline, essentially a limit to how long we can live, at least as a natural species without all sorts of genetic engineering.  In that sense, dreams of hugely extended life, even as a maximum (that is, if not for everyone), are just that: dreams.

This analysis raises several important issues, but largely ignores others.  First, however, it is important to note that virtually nothing in this paper, except some more recent data, is novel in any way.  The same issues were discussed at very great length long ago, as I know from my own experience.  I was involved in various aspects of the demography and genetics of aging, as far back as the 1970s.  There was a very active research community looking at issues such as species-specific 'maximum lifespan potential', with causal or correlated factors ranging from basic metabolism to body or brain size.  Here's a figure from 1978 that I used in a 1989 paper.

There was experimental research on this, including life-extension studies (e.g., dietary restriction), as well as comparison of data over time, much like (for its time) what the new paper does.  The idea that there was an effective limit to human lifespan (and likewise for any species) was completely standard at that time, and how much this could be changed by modern technologies and health care etc. was debated. In 1975, for example (and that was over 40 years ago!), Richard Cutler argued in PNAS that various factors constrained maximum lifespan in a species-related way.  The idea, and one I also wrote a lot about in the long-ago past, is that longevity is related to surviving the plethora of biological decay processes, including mutation, and that would lead to a statistical asymptote in lifespan.  That is, lifespan was largely a statistical result rather than a deterministically specified value.  The mortality results related to lifespan were not about 'lifespan' causation per se, but were just the array of diseases (diabetes, cancer, heart disease, etc.) that arose as a result of the various decays that led to risk increasing with duration of exposure, wear and tear, and so on, and hence were correlated with age.  Survival to a given age was the probability of not succumbing to any of these causes by that age.
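That last statement can be written down directly. The following is a minimal sketch, not a model from any of the papers mentioned: it assumes the causes act independently and that each cause's risk rises exponentially with age, and the cause names and numbers in `CAUSES` are made up purely for illustration.

```python
from math import exp, prod

# Hypothetical causes of death: (hazard per year at age 0, yearly rate of increase).
# These values are illustrative assumptions, not estimates from real data.
CAUSES = {
    "heart disease": (2e-5, 0.09),
    "cancer": (3e-5, 0.08),
    "diabetes": (1e-5, 0.08),
}


def cause_survival(age, a, b):
    """Probability of not dying of one cause by `age`,
    when that cause's hazard at age t is a * exp(b * t)."""
    cumulative_hazard = (a / b) * (exp(b * age) - 1)
    return exp(-cumulative_hazard)


def survival_to(age, causes=CAUSES):
    """Probability of surviving to `age`: the product of escaping
    every cause, assuming the causes act independently."""
    return prod(cause_survival(age, a, b) for a, b in causes.values())


for age in (60, 80, 100):
    print(f"P(alive at {age}) = {survival_to(age):.3f}")
```

Nothing in this sketch specifies a lifespan. The distribution of ages at death simply falls out of the separate risks, which is the sense in which lifespan is a statistical result rather than a built-in value.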

This paper of mine (mentioned above) was about the nature of arguments for a causally rather than statistically determined lifespan limit.  If that were so, then all the known diseases, like heart disease, diabetes, cancer, and so on, were irrelevant to our supposed built-in lifespan limit!  That makes no evolutionary sense, since evolution would not be able to work on such a limit (nobody's still reproducing anywhere near that old).  It would make no other kind of sense, either.  What would determine such a limit and how could it have evolved?  On the other hand, if diseases--the real causes that end individual lives--were, together, responsible for the distribution of lifespan lengths, then a statistical rather than deterministic end is what's real.  The new paper doesn't deal with these, but by arguing that there is some sort of asymptotic limit, it implicitly invokes some sort of causal, evolutionarily determined value, and that seems implausible.

Indeed, evolutionary biologists have long argued that evolution would produce 'negative pleiotropy' (more commonly called antagonistic pleiotropy), in which genomes would confer greater survival at young ages, even if the result was at the expense of greater mortality later on.  That way, the species' members could live to reproduce (at least, if they survived developmentally-related infant mortality), and they were dispensable at older ages so that there was no evolutionary pressure to live longer.  But that would leave old-age longevity to statistical decay processes, not some built-in limit.

Of course, with very large data sets and mortality a multicausal statistical process, rare outliers would be seen, so that more data meant longer maximum survival 'potential' (assuming everyone in a species somehow had that potential, clearly a fiction given genetic diseases and the like that affect individuals differently).  There were many problems with these views, and many have since tried to find single-cause lifespan-determining factors (like telomere decay, in our chromosomes), an active area of research (more on that below).  We still hunger for the Fountain of Youth--the single cause or cure that will immortalize us!
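The dependence of the observed maximum on sample size can be sketched with a standard mortality curve. This is an illustration under loud assumptions: overall mortality is taken to follow a Gompertz curve (risk rising exponentially with age), the parameters `A` and `B` are invented round numbers rather than fitted values, and `typical_max_age` is a name chosen here for the age at which a cohort of n people would be expected to have a single survivor left.

```python
from math import log

# Hypothetical Gompertz parameters (not fitted to any real population):
A = 2e-5   # hazard per year at age 0
B = 0.09   # yearly exponential rate of increase in hazard


def typical_max_age(n, a=A, b=B):
    """Age t at which a cohort of n people is expected to have one
    survivor left, i.e. the solution of S(t) = 1/n where
    S(t) = exp(-(a/b) * (exp(b*t) - 1)) is Gompertz survival."""
    return log(1 + (b / a) * log(n)) / b


# Cohorts a thousand times larger at each step
for n in (10**3, 10**6, 10**9):
    print(f"cohort of {n:>13,}: typical maximum about {typical_max_age(n):.1f} years")
```

Under these assumptions there is no hard ceiling, since a bigger cohort always yields a somewhat older record holder, but each thousandfold increase in cohort size buys only a few extra years, and fewer each time. That is the kind of slowly rising, statistical 'limit' described here, as opposed to a deterministically specified one.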

The point here is that the new paper is at most a capable but modest update of what was already known long ago.  It doesn't really address the more substantive issues, like those I mention above.  It is not a major finding, and its claims are also in a sense naive, since future improvements in health and lifestyles that we don't have now could, if applied to our whole population, extend life expectancy--the average age at death--and hence the maximum to which anyone would survive. After all, when we had huge infectious disease loads, hardly anybody lived to 115, and in the old days of research, to which the authors seem oblivious, something like 90-100 was assumed to be our deadline.

The new paper has been criticized by a few investigators, as seen in reports in the news media coverage.  But the paper's authors probably are right that nothing foreseeable will make a truly huge change in maximum survival, nor will many survive to such an extended age.  Nor--importantly--does this mean that those who do luck out are actually very lucky: the last few years or decades of decrepitude may not be worth it to most who last to the purported limit. To think of this as more than a statistical result is a mistake.  Not everyone can live to any particular age, obviously.

The main fault in the paper, in my view, is that it in essence portrays the result as a new finding, and that it was published in a purportedly major journal, with the typical media ballyhoo suggesting as much.

On the other hand....
On the other hand, investigators who were interviewed about this study (to give it 'balance'!) denigrated it, saying that novel medical or other (genetic?) interventions could make major changes in human longevity.  This has of course happened in the past century or two.  More medical intervention, antibiotics and vaccines and so on have greatly increased average lifespan and, in so doing in large populations, increased the maximum survival that we observe.  This latter is a statistical result of the probabilistic nature of degenerative processes like accumulating wear and tear or mutations, as I mentioned earlier.  There is no automatic reason that major changes in life-extending technologies are in the offing, but of course it can't be denied as a possibility either. Similarly, if, say, antibiotic resistance becomes so widespread that infectious diseases are once again a major cause of death in rich countries, our 'maximum lifespan' will start to look younger.

Those who argue against this paper's assertions of a limit must be viewed just as critically as they judged the new paper.  The US National Institute on Aging, among other agencies, spends quite a lot of your money on aging, including decades (I know because I had some of it) on lifespan determination.  If someone quoted as dissing the new 'finding' is heavily engaged in the funding from NIA and elsewhere, one must ask whether s/he is defending a funding trough: if it's hopeless to think we'll make major longevity differences, why not close down their labs and instead spend the funding on something that's actually useful for society?

There are still many curious aspects of lifespan distributions, such as why rodents, whose small bodies should be less vulnerable per year to cancer, telomere degradation, and other processes that relate to the number of at-risk cells, live only a few years.  Why hasn't evolution led us to be in prime health for decades longer than we are?  There are potential answers to such questions, but mechanisms are not well understood, and the whole concept of a fixed lifespan (rather than a statistical one) is poorly constructed.

Still, everything suggests that, without major new interventions that probably will, at best, be for the rich only, there are rough limits to how long anyone can statistically avoid the range of independent risks our various organ systems face, not to mention surviving in a sea of decrepitude.

One thing that does seem to be getting rather old is the relentless hyperbole of the media, including pop-culture journals like Nature and Science, selling non-stories as revolutionary new findings.  If we want to make life better for everyone, not just researchers and journals, we could spend our resources more equitably on quality of life, and our research resources on devastating diseases that strike early in the lives we already are fortunate to have.