Is it possible for an active scientist to communicate esoteric research findings in plain everyday language? The short answer is no. This may only be possible for stories on more popular topics such as dinosaurs, the "Big Bang" and extraterrestrial life - they readily capture the public's attention and numerous definitions and qualifications are not necessary. Nevertheless I have agreed to do so following a kind request from the editor for a "plain language version of our work re Lamarckianism". While this is difficult, we have already tried to do just this in our recent book Lamarck's Signature.
The proposition that characteristics, both physical and mental, acquired during an individual's lifetime may be passed on genetically to offspring is no doubt part of the popular imagination. As such it should be easily communicable to a wider non-scientific audience. However, such knowledge is often rudimentary and distorted ("the sins of the father" and such) or contaminated with a vague feeling that it does not "smell right" scientifically. One can also hear the more erudite utterances: "Does this not smack of that discredited fellow Lamarck who Darwin showed was wrong?" and "Samuel Butler, George Bernard Shaw and Arthur Koestler were delusional romantics, all brilliant humanists but hopeless scientists." Here I will show that these hopeless romantics were probably right. I will attempt to condense the main features of our work to encourage those interested in the topic to delve further and confront the large body of evidence.
After two decades of research my colleagues and I now have good evidence that the tell-tale signs of "soma-to-germline genetic impact events" have been etched into the very fabric of our chromosomes. This conclusion is quite the opposite of that expected under the ruling neo-Darwinian genetic paradigm based on Weismann's Doctrine. The data have arisen from our research on the molecular genetics of the immune system, the system which allows our body to produce disease-fighting antibodies in the bloodstream. The quality of this evidence is now as strong as our confidence that the craters on the surface of the moon or earth are the impact sites of large cosmic bolides such as comets and asteroids. Thus the molecular genetic evidence derived from the immune systems of higher animals points to "Lamarck's Signature", identified as the imprint of numerous soma-to-germline genetic impact events written into the DNA of our chromosomes encoding antibody genes. Such events have occurred repeatedly over 400 million years of evolutionary time.
Before I get into this story let me express a special debt of gratitude to my colleague and active collaborator Professor Bob Blanden (of the ANU's John Curtin School of Medical Research) whose scientific support and friendship have been unstinting over almost three decades. In recent years Blanden's intellectual input has been both incisive and decisive in the development of our current ideas on how somatic cell-to-germ cell flow of genetic information may be effected.
Lamarck the pariah
The conclusions of the nineteenth-century German biologist August Weismann are known to every first-year biology student, dogmatically embedded in evolutionary textbooks. Weismann worked in the immediate post-Darwin period, yet just prior to the emergence of modern genetics. He erected a conceptual barrier which was assumed to protect the genes in germ cells from any type of genetic change within the body (soma) of the organism. Weismann was responding to Darwin's theory of the inheritance of acquired characteristics, and the bulk of his experimental evidence came from testing whether acquired parental mutilations could be inherited. He showed in breeding experiments with mice extending over many generations that tail chopping at birth never produced a tailless offspring. We have discussed the pointless nature of these mutilation experiments elsewhere in our book Lamarck's Signature. It seems inexplicable to us now that he did not foresee this outcome, given that the Jewish custom of circumcision of young boys has never resulted in a baby boy born without a foreskin.
But Weismann was influential. All somatic adaptive modifications within a parent's body were forbidden to cross into germ cells to appear in the offspring. By the early years of the twentieth century the Weismann Barrier became one of the key supporting pillars of the rise of neo-Darwinism, the modern theory of evolutionary genetic change. Neo-Darwinism sustained by Weismann's genetic chastity belt has thus reigned supreme throughout much of the twentieth century.
Evolution, so the theory goes, proceeds only by the natural selection of chance events. Actively acquired somatic influences that might reduce or eliminate the role of chance, as instanced by the antigen-driven mutation and selection of antibody genes that occurs within each immune system when subjected to infectious agents, do not contribute in any way to the genetic constitution of the next generation.
In 1809 the French biologist Jean-Baptiste de Lamarck published his comprehensive theories of biology and evolution in his Zoological Philosophy, fifty years before Charles Darwin's Origin of Species. On the continent, particularly in Italy, Lamarck is still revered as the father of modern evolutionary thought, but he has been demonised by Darwin's nineteenth- and twentieth-century followers (neo-Darwinists) and by political and social events he could not have foreseen. I do not want to get too deeply embroiled in this historical narrative - because it is beyond my brief - except to note several defining moments pivotal to our perceptions of the man and his work.
The antecedents for his pariah status may well have begun in France during his lifetime when he lost his scientific-political battle with the palaeontologist Georges Cuvier (a younger rising star of the 'establishment'). The historical record suggests he was of the scientific mainstream but "from the wrong side of the tracks". His devastating scientific criticisms of Cuvier's anti-evolutionary stance won no friends within the French establishment. Lamarck's core idea, widely accepted in folklore and by his generation on both sides of the English Channel, was that organisms adapting to a changing environment altered their bodily and behavioural characteristics and passed these acquired characteristics to their progeny. Whilst the proposition was reasonable and respectable for most of the nineteenth century, by the early years of the twentieth century it became one of the most emotive issues in contemporary science (apart from the metaphysical issue of whether an all-powerful "God" exists). As is commonplace in far-reaching intellectual developments, historical ironies abound. Both Charles Darwin and his grandfather Erasmus (a contemporary of Lamarck) accepted the proposition that acquired characters could be inherited. The alert reader will note that the concept is overt in all of Charles Darwin's evolutionary analyses, particularly in his Origin of Species. For example, in a section on "Effects of the Increased Use and Disuse of Parts, as Controlled by Natural Selection" Darwin begins by confidently asserting:
"From the facts alluded to .. I think there can be no doubt that use in our domestic animals has strengthened and enlarged certain parts, and disuse diminished them, and that such modifications are inherited." Darwin found this necessary (I think) because his core evolutionary idea depended on a reasonably large repertoire of natural genetic variants pre-existing in the population prior to the "natural selection" of the fittest parents to produce the next generation.
How did this natural variation arise? Darwin considered this of major importance and hence used the Lamarckian theory of the inherited effects of organ use and disuse throughout his work. Thus in 1868 he published his detailed theory "Pangenesis" to explain the origin of genetic variations. He considered that, during a somatic, or bodily, change necessary for a particular adaptation, the body cells of the excited target organ would emit genetic material or gemmules (also termed 'pangenes') which were considered to be minute representations of each normal or altered bodily component. These were discharged from the active organ into the bloodstream, thus allowing them to enter the germ cells and be genetically transmitted to the next generation. For obvious reasons I personally consider Darwin's Pangenesis theory prescient and his major scientific achievement.
Thus Darwin was certainly no neo-Darwinian and he certainly was not a Weismannian. His position on these crucial issues is not widely known and not part of the modern orthodoxy which has demonised Lamarck. Indeed developments this century, I am sure, would have mystified if not horrified Darwin. Many neo-Darwinists, particularly in Britain, have been acutely embarrassed by Darwin's pangenesis speculations and have, where possible, expunged them from the scientific record. As a consequence many interesting acquired inheritance phenomena have been suppressed or deliberately misrepresented. Big reputations have been laid on the line over the matter and these passions have prevented generations of scientists from rationally considering the issues and the data.
However, during the first half of the twentieth century two other developments consolidated the demonisation of Lamarck and heightened the passions against his core assumption. The first is the tragic story of the Austrian biologist Paul Kammerer, powerfully told by Arthur Koestler in his 1971 book The Case of the Midwife Toad. Kammerer was accused of using fraudulent data in his acquired inheritance experiments. These charges were never proven. The second is the bizarre aberration of the so-called Lamarckian theory promulgated by Joseph Stalin's head of Soviet agriculture, T.D. Lysenko, who, in the course of a destructive thirty-year career, ruined Soviet agriculture, biology and genetics. So at the end of a turbulent and violent century neo-Darwinism remains apparently impregnable. This I believe is an illusion, much as Russian and Eastern European Communism appeared impregnable just prior to their dramatic implosion a decade ago. In my view neo-Darwinism and Weismannism, both of which reached their zenith during the Cold War, are on the verge of a similar collapse.
There is still more dogma to consider, however, and this is embedded in modern molecular biology which has been used by some to prop up the orthodox view. To explain this we need to briefly divert and explain some necessary biological terms.
Language of life
The evidence I will recount is basically "informational" in nature. Our research program may in fact be considered to be "genetic archaeology" as we are systematically deciphering complex informational molecular fossils, "genetic Rosetta Stones" if you will. This necessitates the use of terminology and definitions. To highlight a term used for the first time bold font is most often used, whilst in some cases new words are italicized or bounded by quotation marks. The data we have gathered and analysed are encrypted in the molecular language of the long strings of base sequences - abbreviated as A, G, C, T and U - making up the DNA and RNA molecules (termed nucleic acids) of our genetic apparatus. As I will discuss shortly genetic information flows in one direction, from nucleic acid base sequences into strings of amino acid sequences which comprise the proteins.
Genetic information in its stored and thus dormant form comprises long molecular polymers of DNA base sequences. A single chromosome is a DNA sequence millions of bases in length. The human genome comprises 23 chromosomes in a gametic reproductive cell (egg, sperm) and 46 in a somatic or body cell (23 from each parent). In its expressed form the DNA sequence for each protein-encoding gene is copied into a shorter stretch of bases termed messenger RNA, which instructs the translation of its base sequence into the sequence of amino acids which makes up the protein. A protein is roughly equivalent to a sentence with a stand-alone meaning. Moreover the base sequence data I will discuss have been analysed using familiar terminology and logic, akin to the rules of grammar and sentence structure; and the copy, cut and paste functions of the modern word processor often apply. Thus the copying of gene sequences can be either high fidelity or error-prone. And when a genetic sentence needs to be replaced by one of improved meaning, then a copy-cut-and-paste function is often executed - termed homologous genetic recombination. The DNA sequences specifying proteins are termed open reading frames: stretches devoid of the stop codons that terminate the amino acid sequence during translation (these are normally located at the end of the protein coding region and are roughly equivalent to a full stop). When DNA is copied into a complementary DNA molecule, it is said to be replicated and such high fidelity copies are passed on to progeny cells or organisms. When DNA is copied into RNA it is said to be transcribed; and said to be reverse transcribed when the RNA is copied back into a DNA sequence. Both transcription and reverse transcription are low fidelity copying processes (about a million times less accurate than DNA replication). They are error-prone because the copies are not subject to meticulous proof-reading, as they are in DNA replication.
Thus incorrect DNA sequences can be repaired by the replicator's intrinsic proof-reader, and the ends of shorter stretches of DNA sequences, involved in genetic recombination processes, can display base trimmings or additions (that is, the ends can be tidied up). Precursor RNA messages are extensively processed, bracketed by so-called molecular caps and tails; and intervening sequences which do not code for proteins (called introns) in the RNA are edited or spliced out before they are able to instruct the correct sequence of amino acids when translated into the protein. All these features represent natural information copying and editing processes involving large strings of the genetic letters A, G, C, T (and U) which have been evolving on our planet for perhaps four billion years. We have used these features to interpret and analyse the DNA sequence structures of the diversified antibody genes which have been evolving for about 400 million years in animals with backbones (the vertebrates).
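For readers who find it easier to follow the copying steps as a program, here is a minimal sketch in Python of replication, transcription, reverse transcription and translation as described above. It is a toy model under stated assumptions: the short gene sequence is invented, the codon table is a small subset of the standard genetic code, transcription is simplified to copying the coding strand with T replaced by U, and the copying errors discussed above are not modelled at all.

```python
# Toy model of the copying steps described in the text. The gene sequence
# is invented for illustration and the codon table is deliberately partial.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Small subset of the standard genetic code (RNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala",
    "AAA": "Lys", "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def replicate(dna):
    """Copy a DNA strand into its complementary strand (reverse complement)."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def transcribe(coding_dna):
    """Simplified transcription: the messenger RNA carries the same
    sequence as the coding strand, with U in place of T."""
    return coding_dna.replace("T", "U")


def reverse_transcribe(mrna):
    """Copy a messenger RNA sequence back into DNA (U becomes T)."""
    return mrna.replace("U", "T")


def translate(mrna):
    """Read the message three bases at a time until a stop codon,
    the 'full stop' at the end of the open reading frame."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


gene = "ATGTTTGGCGCTAAATAA"        # hypothetical open reading frame
message = transcribe(gene)         # DNA -> RNA
protein = translate(message)       # RNA -> protein
retro_copy = reverse_transcribe(message)  # RNA -> DNA

print(message)
print(protein)
print(retro_copy == gene)
```

Note that there is no function running from the protein back to a base sequence; the sketch has arrows only in the directions the text describes.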
Central dogma of molecular biology
James Watson and Francis Crick's "Central Dogma" specifies that the chemical language of genetic information in the form of DNA sequences can be directly copied into complementary nucleic acid base sequences termed RNA which in turn can then be translated into a protein sequence of amino acids, a quite different chemical language. Genetic information never flows in reverse from a sequence string of amino acids into a complementary sequence of DNA or RNA bases. Protein molecules are of two categories: on the one hand structural, supplying the primary building blocks of tissues in the body; and on the other catalytic, facilitating the myriad functional properties of the different cells within our bodies.
This Central Dogma was formulated in the early 1950s but underwent a subtle yet far-reaching modification at the end of that decade when the American virologist Howard Temin first introduced the concept of reverse transcription, or the back-copying of RNA base sequence information into DNA. The Central Dogma can be written symbolically as:
DNA <-----> RNA ----> PROTEIN
Despite Temin's important modification in the first decade of the rise of modern molecular biology the mainstream mantra has always been "DNA makes RNA makes Protein" or:
DNA -----> RNA ----> PROTEIN.
This is obviously misleading and conveniently overlooks the all-pervasive copying of RNA sequences back into DNA sequences for genes throughout many biological systems, from viruses to bacteria through to man. I suspect that reverse transcription, whilst central to many gene cloning manipulations both in academic research and in the biotechnology industry (for example, the routine production of complementary DNA copies of RNA transcripts termed cDNAs), is systematically dropped from the mantra because of its deep Lamarckian overtones. Indeed the subtitle of Lamarck's Signature is "How retrogenes are changing Darwin's natural selection paradigm" and was chosen deliberately to highlight this fact.
Antibody genes are special. First, they are the only known genes within our bodies which undergo rapid somatic mutation as a consequence of stimulation by an identifiable environmental stimulus viz. invading foreign antigens (viruses and other infectious microbes). The immune white blood cells expressing these somatic mutant antibodies are "naturally selected" by the invasive antigen and form part of the "memory" response when we encounter the same disease again - a principle underlying all immunological vaccinations. Furthermore, in contrast to all other genes expressed within our body these genes undergo a unique type of genetic cutting and pasting called a DNA rearrangement creating an expression site for the gene. This means that when expressed in a white cell, antibody genes have a somatic configuration which is distinct from the dormant unrearranged form of the antibody genes said to be in the germline configuration (the chromosomal form in all cells of tissues which do not make antibodies, such as kidney, heart and muscle including the reproductive germ cells).
We have shown that the characteristic base sequence pattern of somatically mutated and processed antibody genes typical of the somatic configuration is actually present in the germline configuration. The simplest interpretation of all the sequence data from sharks through birds to mammals including humans, is that antigen-experienced somatic forms of the antibody genes have been reverse transcribed, physically transported to reproductive tissues and integrated into the chromosomes of germ cells, that is, soma-to-germline transmission of genetic information.
Penetration of Weismann's Barrier
Arthur Koestler was an intellectual giant of the twentieth century. Not only was he author of the hauntingly devastating Darkness at Noon and a founder, with Sidney Hook and others, of The Congress for Cultural Freedom, but his work on the history and philosophy of science has been seminal through works such as The Sleepwalkers, The Act of Creation and The Ghost in the Machine.
When I first ventured onto this controversial turf Arthur Koestler had just wrapped up his own conclusions on the matter in Janus: A Summing Up. His rational arguments and eloquent prose inspired me as a young man to get involved. On a flight between Toronto and Frankfurt sometime in early June 1978 I vividly recall reading his chapter entitled "Lamarck Revisited". I remember eagerly writing in the margins the elements of the retroviral soma-to-germline idea (later to become the more formal Somatic Selection Hypothesis). In my view Koestler, in his effort to discredit the Weismann Barrier, had logically confused this nineteenth-century idea with the more modern propositions of the Central Dogma. As we have seen, the latter specifies the molecular rules of the direction of genetic information flow, whilst the former is a cellular theory of inheritance simply denying that any type of genetic information can be passed from cells of the body (somatic cells) to the cells of the reproductive organs, the sperm or eggs (germ cells).
Thus Koestler had directly equated Weismann's dogma with the Central Dogma. But, in my view, this only confused the matter, as they are distinct biological theories. At the time I was at the Ontario Cancer Institute in Toronto on my first immunological post-doctoral stint, supervised by the New Zealand immunologist Alistair Cunningham. Working at the John Curtin School of Medical Research in the early 1970s Al Cunningham was the first to introduce the now pivotal concept of antigen-driven somatic mutation in antibody genes. I was thoroughly immersed in this field and the subtle confusion generated by Koestler's analysis deeply affected my thinking.
It seemed to me that Koestler's problem could be solved and Weismann's Barrier breached by invoking somatic gene mutations in the form of RNA being copied back into DNA and integrated into the genes of the chromosomes in the germ cells for transmission to progeny. It was also envisaged that not any type of random somatic mutation would be inherited, but only those that had been "somatically selected", or successful, within the body of the organism. Because the immune system was our biological prototype, this hypothesis immediately incorporated the Darwinian "natural selection" concepts of the late Sir Macfarlane Burnet - the idea that the foreign antigen binds to and thus selects its complementary antibody-producing white cell, making only one particular antibody protein, from a large repertoire of cells each producing their own specific antibody. Burnet, Director of the Walter and Eliza Hall Institute in Melbourne, is widely perceived as the father of modern immunology. He developed his "Clonal Selection Theory of Acquired Immunity" during the late 1950s and it remains the central idea underpinning all modern views of the functioning of the immune system.
Thus as a young immunologist deeply influenced by both Burnet's and Cunningham's ideas, I viewed the solution of "Koestler's problem" as a synthesis of Burnetian/Darwinian natural selection principles (operating within the body of a multicellular organism) coupled to a reverse transcription-based genetic feedback loop. I imagined both cellular selection (mutant antibody-producing cells selected by the foreign antigen) and, following Howard Temin's seminal work, molecular selection (at the level of gene expression or RNA) coupled to an RNA-to-DNA reverse transcription step involving harmless retroviruses intrinsic to the body acting as somatic gene shuttles. No conventional genetic or "natural selection" dogmas had been violated by the hypothesis, except, of course, that the Weismann Barrier could now be easily breached and would henceforth have to be seen as selectively permeable to somatic genetic information.
After twenty years of research a large body of molecular evidence is now consistent with this "Somatic Selection" hypothesis. We have tried to explain this in the book Lamarck's Signature and in other technical and review papers published in the recent refereed scientific literature (1993-1998). No alternative explanation for our data or genetic analyses has yet been published by other scientists. Before I discuss this in more detail it is important briefly to review the main tenets of the reigning neo-Darwinian paradigm and how this explanation of evolutionary genetic change has developed during the twentieth century.
The traditional neo-Darwinian view of evolution has developed alongside the mathematical field of population genetics. By analogy with Adam Smith's economics, the overarching assumption is the operation of "the invisible hand" of natural selection operating on a substratum of random variability and chance. Within its prescribed domain population genetics has made a valuable contribution, but it says nothing about the origin of new variation which so troubled Charles Darwin. Thus, genetic variation pre-exists in the gene populations of a species. The frequencies of variant forms of particular genes (the alternatives of a particular gene are termed alleles) can be described by statistical formulae and these frequencies can be monitored and predicted under different environmental selection conditions. But to many critics this is akin to shuffling the deck chairs on the Titanic. One great advantage of a Lamarckian theory of gene use and disuse is that it can account for the emergence of new allelic forms of genes, and it goes a little way to actually explaining some aspects of species emergence and diversity, particularly closely related ones (although it has to be admitted that such a genetic process is only one part of the answer to the great evolutionary mystery of how major transitions in physical form and body plan take place).
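To make the phrase "statistical formulae" concrete, the following minimal Python sketch follows the frequency of one allele under selection. The model is an assumption chosen for brevity (two alleles, a haploid population, constant fitnesses, no mutation or drift), and the function names and fitness values are hypothetical; it is not drawn from our own work.

```python
def next_frequency(p, fitness_a, fitness_b):
    """Frequency of allele A after one generation of selection.

    p is the current frequency of allele A; allele B has frequency 1 - p.
    Each allele contributes to the next generation in proportion to its
    frequency multiplied by its fitness.
    """
    mean_fitness = p * fitness_a + (1.0 - p) * fitness_b
    return p * fitness_a / mean_fitness


def track(p0, fitness_a, fitness_b, generations):
    """Return the list of allele A frequencies, generation by generation."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], fitness_a, fitness_b))
    return freqs


# Hypothetical values: allele A starts rare with a 5% fitness advantage.
history = track(0.01, 1.05, 1.00, 200)
print(round(history[0], 4), round(history[100], 4), round(history[-1], 4))

# An allele that is absent stays absent, whatever its fitness would be.
print(track(0.0, 1.05, 1.00, 10)[-1])
```

The second call illustrates the limitation noted above: a model of this kind can only shift the frequencies of alleles that already exist, and is silent on where a new allele comes from.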
So just what are the genetic expectations of the traditional theory? First, the genetic variants pre-exist before the selection force acts. These variants are never expected to be created by stimuli evoked as the organism rubs up, in an adaptive way, against the environment. The expectation is that these genetic variations appear randomly in the population. Lethal mutations are promptly deleted from the population of organisms. In animals the mutant embryo will likely die at some stage of early development causing spontaneous abortion. Those non-lethal mutant survivors stay in the gene pool and flourish in relative frequency depending on their Darwinian fitness in providing parents for the next generation. Second, the theory pertains mainly to the population genetic behaviour of single-copy housekeeping genes. These are the genes which specify the individual proteins which carry out the various step-wise functions absolutely necessary for the survival of all living cells. All of the housekeeping genes have ancient pedigrees dating to the first cellular life forms (on this planet about four billion years ago). Thus, they are said to be highly conserved. Cells and organisms live or die depending on whether they conserve the function of these genes - again Darwinian fitness of the cell or whole multicellular organism is the governing selection criterion. Richard Dawkins's phrase "selfish genes" applies particularly to these types of genes.
There are, however, other classes of highly conserved housekeeping genes which exist as multiple copies of the same DNA sequence. Here the traditional population genetic explanation has to be modified and new concepts such as concerted evolution and molecular drive have been grafted onto the neo-Darwinian superstructure. These concepts, championed by Oxford's Gabriel Dover amongst others, are devoid of mechanistic detail spelling out exactly how these molecular processes occur. They merely posit the existence of mechanisms intrinsic to the nucleus of the cell which homogenise the gene copies, thus maintaining them all with identical base sequences. (The term gene conversion is used in this context to explain the homogenisation but the term is used descriptively. Like many concepts in population genetics, it is devoid of detail about specific mechanisms.) These are the genes involved in allowing the mass production of (say) proteins within the cell (the ribosomal RNA genes - akin to multiple identical robotic assembly platforms in a manufacturing plant producing copies of a widget). Most of these genes exist in large copy-number per chromosome, of fifty to a hundred or more members, and are lined up in tandem. As a cell or multicellular organism grows and expands its cell numbers by cell division to produce daughter cells, the very large numbers of "assembly platforms" required are quickly constructed by the simultaneous expression of all these identical gene copies. So the maintenance of a large number of identical genes is understood in terms of Darwinian fitness for survival. It makes perfect sense. For any given species there is a critical number of identical copies that must be maintained within the cell. Variant cells (or germ cells) will die if an accidental chromosome mutation deletes a large slab of gene copies taking it below the threshold copy-number.
There is also another large class of non-functioning gene which has buttressed the idea of "random gene mutation" and has led to an outgrowth of neo-Darwinian population genetics termed by M. Kimura the "Neutral Theory of Molecular Evolution". This is a useful mathematical-genetic theory within its limited domain. It has provided us with a measure of the basal background mutation rate over evolutionary time and it has fleshed out the concept of genetic drift. Here the main gene sequences are called pseudogenes. There are many in the genome, probably tens of thousands. As a rule of thumb there is at least one pseudogene for every functioning gene (more on this shortly). All pseudogenes are dead genes, true molecular fossils. They can never express the functional proteins or RNA molecules they encode. These crippled genes are presumably the result of the copy-duplication of a functional gene sequence followed by a crippling mutation in the DNA sequence of the copy. Usually they are located close to the functioning gene on the same chromosome, but often they are also dispersed elsewhere in the genome, and located on other chromosomes.
Pseudogenes are replicated along with the rest of the genome. They are dead genetic hitch-hikers and in many ways are the ultimate manifestation of Dawkins' "selfish genes" in that they are immortal replicators with no apparent function. The DNA encoding them is slavishly replicated along with the rest of the genome. In Kimura's theory such genes "soak up" random base changes in the DNA sequence and have no effect on the Darwinian fitness of the organism or species. And indeed Kimura saw how useful such supposed "junk DNA" might be. They are ideal "molecular clocks" measuring the rate of genetic drift as the genome replicates and slowly mutates over aeons of time, first diversifying and branching into a closely related species (e.g. the separation of mice and rats), and then into species that look quite different (e.g. rodents and humans) and so on.
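The "molecular clock" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Kimura's own formulation: the two aligned pseudogene sequences are invented, the neutral substitution rate is a hypothetical round number, and the Jukes-Cantor correction used for multiple changes at the same site is one common textbook choice among several.

```python
import math


def proportion_different(seq_a, seq_b):
    """Fraction of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def jukes_cantor_distance(p):
    """Estimated substitutions per site, correcting the observed proportion
    of differences p for repeated changes at the same site."""
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def divergence_time(distance, rate_per_site_per_year):
    """Years since two lineages split. Both lineages accumulate changes
    independently, hence the factor of two."""
    return distance / (2.0 * rate_per_site_per_year)


# Invented pseudogene fragments from two species, differing at two sites.
species_1 = "ATGCCGTTAGCTAGGCTAAC"
species_2 = "ATGTCGTTAGCTAGGTTAAC"

p = proportion_different(species_1, species_2)
distance = jukes_cantor_distance(p)
neutral_rate = 5e-9  # hypothetical substitutions per site per year

print(p)
print(distance)
print(divergence_time(distance, neutral_rate))
```

The calculation is only as good as the assumption that the changes are neutral and accumulate at a steady rate, which is exactly why functionless pseudogenes were regarded as ideal clocks.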
But neutral mutation theory and molecular drive are concepts on the periphery. We can now categorize them as such because of far more significant developments, which have established the "fluidity" of the genome. This has enabled us to envisage far more flexible evolutionary genetic strategies.
The paradigm collapses
As World War II came to its close, and in its immediate aftermath, the plant geneticist Barbara McClintock began documenting abnormal genetic patterns in maize. In a few years her evidence was overwhelmingly showing that parts of the maize genome were far more fluid than previously thought. Genes could move between and within chromosomes, they were no longer in a fixed order as "beads on a string". But then in 1953 Watson and Crick elucidated the double helix structure of DNA and the revolution in molecular biology was well and truly underway. For thirty or so years this overwhelmed McClintock's achievement, but her "jumping genes" persisted popping up all over the place, becoming common features of all genomes. They were subsumed under the general name of transposon and became a fact of genetic life. But genes that could jump and thus rearrange the genome were profoundly disturbing. Allelic variants no longer could be assumed to maintain a constant position on the chromosome and the mathematical theory of population genetics could not really handle them at all.
In my view the lethal blow to neo-Darwinism, population genetics and Weismann's apparently impregnable barrier were the unexpected exocet missiles which arrived in the form of retrogenes. At first they were not readily apparent, elluding the radar of the self appointed neo-Darwinian censors.
In 1970 Temin and Baltimore first reported on reverse transcription in viruses that could cause tumours in chickens and mice. These C-type RNA tumour viruses were then renamed retroviruses (of which HIV is a family member). But within a decade it was apparent that reverse transcription was not restricted to a special class of viruses at all. Indeed retrogenes and retrosequences began cropping up all over the genome from bacteria through to man. It is now conservatively estimated that perhaps a third or more of the genomic DNA has been made first into RNA and then copied back into DNA. There are probably hundreds of thousands of copies of these retrotransposons. How many of these are functioning genes given that we detect them because they are non-functioning pseudogenes ? The suspicion arises that maybe the DNA of the entire genome has been processed through RNA before copying back into DNA (the tell-tale features of many of these retro DNA sequences is that they no longer have introns and possess "tails" as typically seen in mature RNA messages).
In my opinion the game is not only over but far worse than the comfortable conservative can possibly imagine. In the memorable words of Sylvester Stallone, "It is their worst nightmare come true". For example, by 1991 it was clear that most, if not all, of the pseudogene copies of functional single-copy genes have been processed through an RNA intermediate - so their correct name should now be prefixed as retropseudogenes. Many of the pseudogenes at the heart of Kimura's molecular clock are therefore retrogenes. And when you looked at the numbers (even in the gene database accumulated at that time) no self-respecting scientist could ignore the implications - for every functional gene there appeared to be on average five or more retropseudogene copies, most of which were dispersed to other sites as expected by retrotransposition. Indeed James Watson's HUGO, the human genome sequencing project, due for completion with the start of the new millennium, should be given a new corporate logo - RETROHUGO!
So how do the traditionalists explain this? Answer: by inserting their collective heads in the sand. All of these events, they insist, occur only in germ cells. And you hear the defeated sigh "They are the detritus of evolutionary mistakes."
But what is the Darwinian fitness value of bombarding the genome, apparently willy-nilly and at random, with hundreds of thousands of retrosequences? A retrogene has to begin life not as a DNA sequence but as an RNA copy of a DNA sequence. Why should germ cells be rich in the activity of the enzyme reverse transcriptase which allows the copying of RNA into DNA? Genetic recombination, or the exchange and rejoining of gene sequences, should only occur during genesis of the sperm and eggs in the reproductive tissue in a process called meiosis. So why then is there an apparent role for reverse transcription in such widespread genetic recombination processes?
Indeed the list of questions recedes into the distance. It is a can of worms. And one authoritative text on molecular evolution conveniently packages the entire phenomenon under the Vesuvian model - the invisible hand of natural selection is at it again as "the germ cell eruption of retrogenes is good for evolution, it is the source of genetic mistakes". Would it not make more sense to make the lateral leap and advance the proposition, however heretical and thus dangerous to one's scientific career, that many of these retrosequences are somatically derived? That is, Lamarck's invisible hand is at it again.
Here is how we now view these phenomena. Retrogenes are emitted, much as Darwin suggested, by somatic cells excited or stressed by some environmental stimulus. They truly can be the detritus of evolutionary mistakes, but much of the time the incoming new somatic gene (as cDNA or retrotranscript) combines with and converts its normal gene to a subtly different if not improved version of the gene (by the copy-cut-paste stratagem). If this is a cDNA copy of a necessary housekeeping gene and it has occurred in a circulating somatic cell (for example, an antibody-producing white blood cell) the cell only survives if the mutant sequence is compatible with the life of the cell, otherwise the cell dies - it will be naturally selected against. Such a surviving cell could migrate to the reproductive tissues (the Harald Rothenfluh model of somatic cell-to-germ cell migration) and effect a clean soma-to-germline conversion event giving rise to a new allele of the gene. If the somatic retrogene lands elsewhere in the genome then it will of course light up as a dispersed retropseudogene, an evolutionary mistake (a harmless mistake if it does not land within and disrupt another different functioning gene).
Somatic selection theory
As Bob Blanden and I have experienced it, the scientific mainstream within the field of immunology and in wider biomedical disciplines concedes no functional physiological significance attached to the existence of this rich and ample forest of germline retrogenes. It has been my experience in many personal professional encounters that the words "reverse transcriptase" and "reverse transcription" cause emotional shudders and hyperventilation if applied outside their "proper context", that is, the routine test tube, or technical, production of cDNA during the artificial cloning of genes.
For over twenty years my colleagues and I have been engaged in a research program focused very specifically on the role of reverse transcription in shaping both the DNA sequences of the germline genes of germ cells and the genes of somatic cells. In both cases the genes which encode the antibody molecules have been our prototype in establishing the retrogene concept for functional genes. In 1979 the retroviral-based "Somatic Selection Hypothesis" was first published as a short book in Toronto entitled Somatic Selection and Adaptive Evolution. A set of specific predictions was made, the most important being that the antibody variable genes (V-genes) of the germline - in the "germline configuration" - were diversified and updated by reverse transcription of antigen-selected somatic V-gene mutants (in the "somatic configuration") in circulating white blood cells followed by integration of such somatic sequences into an homologous V-gene in the germline.
In 1987, together with my long-standing friend and collaborator Jeff Pollard, I published the "Reverse Transcriptase Model" of somatic hypermutation. There is in fact an intimate connection between both molecular models. At the time both models emerged into the public domain they were rich in speculation, specific predictions and promise, yet short on supportive data. Over the intervening ten to twenty years molecular data from both our laboratory and many other laboratories have turned these hypotheses into robust functioning theories and mechanisms. They have well and truly run the gauntlet of Popperian criticism and error-elimination (although I can hear over my shoulder as I type this confident progress report, the indignant sighs if not angry snorts of the army of silent armchair protagonists both within and outside science).
Antibody genes are special retrogenes
With the above backdrop to our thinking completed, I am now going to recount the main features of the supportive evidence as best I can. This molecular evidence comes from the DNA sequence structure of the antibody V-genes both in their somatic and germline configurations. It is this distinction, unique amongst all known genes, which allows us to detect the directional flow of genetic information from the soma to the germline.
A most striking feature of the large tandem array of V-genes in the germline configuration is the fact that they can exist in large numbers (several hundred) yet, unlike the large families of identical housekeeping genes described above, the germline V-genes of a particular large family are very similar (say seventy percent or more similar in base sequence) yet distinctly different. How did this pattern of sequence diversity evolve and how has it been maintained over evolutionary time? This is a very different problem to maintaining absolute sequence identity in a large set of conventional housekeeping genes. Indeed the more the problem is pondered upon, the more profound the problem becomes. Thus the differences between sequences are not randomly distributed - they are located in places typical of somatic mutation and selection events (that a researcher can readily document in experiments where mice have been vaccinated with a foreign microbe or antigen).
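To make the "seventy percent or more similar" measure concrete, here is a minimal Python sketch of how base-sequence similarity between two already-aligned V-gene sequences could be scored, together with the positions at which they differ. The function names and the two ten-base sequences are made up purely for illustration; they are not real V-gene data, and this is not the analysis pipeline used in the work described here.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, non-gap positions at which two sequences match.

    Assumes the sequences are already aligned (equal length, '-' for gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def difference_positions(seq_a, seq_b):
    """Zero-based positions where two aligned sequences differ (gaps ignored)."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b))
            if a != "-" and b != "-" and a != b]


# Hypothetical toy sequences, not real V-gene data.
toy_v1 = "ACGTACGTAC"
toy_v2 = "ACGTTCGTAA"

print(percent_identity(toy_v1, toy_v2))
print(difference_positions(toy_v1, toy_v2))
```

Listing the differing positions, rather than only the overall score, is what would let one ask whether the differences cluster in particular regions of the gene instead of being scattered at random.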
So this presents a conundrum. At first sight this implies that the somatic mutation and selection pattern appears written into the germline configuration. We have established that the probability of these patterns emerging by chance is statistically remote. But it gets worse. Recall that when a V-gene is in the germline configuration it cannot possibly be expressed (that is, first into RNA then on translation into a functioning antibody protein). Yet all the non-random V-gene features in the germline DNA could only arise if the Darwinian somatic selection process mediated by the invading antigen took place in a somatic white cell with an antibody expressed as a protein on its surface (that is, on a V-gene in the somatic configuration). So this is the first big problem. Trying to solve it using the traditional mode of thinking, the neo-Darwinian mode, gets you nowhere. Indeed the data look like what they imply, soma-to-germline flow of mutated yet selected V-genes involving error-prone reverse transcription.
When we began to compare our sets of mouse-derived V-gene sequences with those sets in other species, this pattern kept repeating itself. But then another feature became apparent. The pseudogenes amongst the V-gene germline families of mouse and man were in an apparently pristine condition! Indeed this exposed another feature of the data. Under Kimura's random genetic drift model of mutational change a certain proportion of "stop" codon mutations would be expected on statistical grounds. But on inspection germline V-gene families are largely devoid of stop codon mutations. They are dead V-pseudogenes and should be accumulating them! Kimura's clock has stopped for these genes; they look as if they have been recently created albeit with minor mistakes.
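The statistical expectation of stop codons under unconstrained drift can be illustrated with a rough back-of-envelope sketch in Python. It assumes the standard genetic code, equal usage of all sense codons, equally likely single-base substitutions, and independence between substitutions. These are simplifying assumptions made for illustration only, not the model or the data analysis used in our work, and the figure of 30 substitutions is an arbitrary example value.

```python
from itertools import product

BASES = "TCAG"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def single_base_neighbours(codon):
    """Yield every codon that differs from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def stop_creating_counts():
    """Count single-base changes of sense codons, and how many create a stop."""
    sense_codons = ["".join(c) for c in product(BASES, repeat=3)
                    if "".join(c) not in STOP_CODONS]
    total = 0
    to_stop = 0
    for codon in sense_codons:
        for mutant in single_base_neighbours(codon):
            total += 1
            if mutant in STOP_CODONS:
                to_stop += 1
    return to_stop, total


def prob_orf_intact(n_substitutions, p_stop):
    """Chance that none of n independent substitutions created a stop codon."""
    return (1.0 - p_stop) ** n_substitutions


to_stop, total = stop_creating_counts()
p_stop = to_stop / total  # fraction of substitutions that create a stop codon
print(to_stop, total, round(p_stop, 4))

# Arbitrary example: an unconstrained sequence that has taken 30 substitutions.
print(prob_orf_intact(30, p_stop))
```

Under these assumptions roughly four percent of random single-base substitutions create a stop codon, so the chance that a drifting, unselected sequence keeps an open reading frame falls steadily as substitutions accumulate.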
So the problem now was to explain how multiple different open reading frames could be maintained in the germline in the face of random genetic drift. Conventional neo-Darwinian population genetics will not handle this one, nor will Gabriel Dover's molecular drive concepts. Again, the simplest explanation was that the germline repertoire is maintained in its diversified and functioning (open reading frame) state by a soma-to-germline feedback of somatically mutated and antigen-selected V-gene sequences. And Bob Blanden and I now think this is the raison d'etre of the somatic hypermutation process (where all the current molecular evidence points very strongly to an involvement of error-prone reverse transcription).
The coup de grace, however, comes from two other areas. First, chicken germline V-genes, apart from the features just mentioned, also show overwhelming evidence for remnants or gene-bits typical of the somatic configuration physically linked into the germline V-genes. This says that information must have flowed systematically during evolutionary time from the somatic "rearranged" configuration into the germline V-genes. Second, when our ANU colleague Georg Weiller subjected both our mouse data and other published human V-gene data sets to a specially developed genetic algorithm which detects hot-spots of homologous genetic recombination, the hot-spots were located in the gene exactly as predicted by somatic retrogene theory.
I vividly recall the moment several years ago when Bob Blanden saw this critical feature and enunciated excitedly how it would explain the recombination signature. Weiller's recombination pattern and Blanden's analyses show that the somatic RNA processing signature - typical of somatically expressed V-genes that have undergone RNA splicing - is now imprinted as a recombination signature on the germline DNA! The Weiller recombination pattern is prominently displayed on the jacket cover of the first Allen & Unwin edition of Lamarck's Signature.
Is there a more visible outcome of our pure research program, with more direct medical relevance? There are in fact observations, outside of the immune system, which demonstrate transgenerational effects of chemically induced conditions such as diabetes in rats and mice. I believe these types of induced metabolic disorders, which are transmitted to offspring, could be relevant to a better understanding of the significant rise in diabetes throughout modern society, particularly in remote indigenous communities in Australia. Do such phenomena arise first in damaged genes in target organs, such as the pancreas, which then directly or indirectly affect the same genes in the germ cells? At the very least, modern DNA technology can now be rapidly applied to unravel the genetic basis of those well-documented apparent acquired inheritance phenomena in laboratory rodents.
However, our main conclusions reside in the immune system. We find that there are many features of the "somatic mutation pattern" literally written into the "germline configuration", at least for antibody genes. The simplest interpretation is that all these genetic patterns have arisen by the reverse transcriptase-mediated soma-to-germline flow of genetic information. Lamarck's legacy in modern biology, we believe, is now quite pervasive. It is written as "soma-to-germline genetic impact signatures" into the DNA of our chromosomes. We do not think it is restricted only to the genes of the immune system. As project HUGO uncovers the base sequence of many more of our genes, made up largely of the highly conserved housekeeping genes, we can expect to find Lamarck's signature written all over them. Indeed, how far into human culture would this tentative conclusion extend? In the epilogue of Lamarck's Signature we posed a non-trivial speculative question linking physiological processes and ontogenetic responses to the environment via conscious action and habits: "Ethical beliefs are after all, a social construct based on many factors. But what if our genetic make-up is in the final analysis also a 'social construct' arising from our consciousness?"
What is the next scientific step? On Saturday May 8 1999 Bob Blanden, Robyn Lindley and I published a short "Comment" piece in London's Independent newspaper and I quote here its conclusion: "The challenge (now) is for those other scientists, who really understand these data, viz. molecular immunologists, to come up with a better scientific explanation. We don't think there is one - outside of course those non-scientific propositions evoking an 'intelligent gene manipulator' or, if you will, a 'divine intervener' as playing a role in evolution."
Further commentary can be found at other Websites (each endlinked to other relevant sites):-
- "What is Lamarck's Signature?" by EJ Steele & RV Blanden, an article published in the webzine "HMS Beagle" at: http://www.biomednet.com/hmsbeagle/56/viewpts/op_ed
- Newspaper commentary: "Genetic Notes - Embarrassment of the neo-Darwinists" by EJ Steele, RA Lindley, RV Blanden, published in The Independent (UK) May 8 1999. Full text at: [tba]