Abiogenesis - the Chemical origins of life

I've been interested in Abiogenesis, or the chemical origins of life, for quite some time. It was something I struggled with, but the explanations I heard around the traps just didn't cut it - there wasn't enough detail, yet I'd struggle with the advanced chemical papers.

But over time, I did develop some familiarity with the ideas. I gave a talk to the Sydney Futurists on the subject here. My views have developed a bit since then.

I was originally overwhelmed by the Wikipedia reference on the subject - it is in fact good stuff, though reading Wikipedia can be a bit like trying to drink from a fire hydrant. But I didn't rely on Wikipedia ( Ian Woolf tells me that's very naughty ). Still, after I'd struggled with the ideas for a while, I stumbled over the pages origins and probabilities, which do help a lot in putting things in perspective.

Still, I hope I'm bringing things together in an interesting way, one that makes this article worth reading if you're trying to get your head around ideas in abiogenesis.

While I'm not setting out to argue with the creationists, some of the material is a rejoinder to their ideas. At times the commentary by creationists - while wrong - is considered and shows a degree of finesse in its chemical understanding.

The main point I make against creationists is this. It is difficult to imagine natural conditions which make all the amino acids, all DNA/RNA base pairs, and long chain proteins. But it is possible that a more limited chemistry could arise naturally - using not all the currently used amino acids, not all the currently used base pairs, and shorter chain proteins than are currently in use - which would be able to sustain the generation of information bearing molecules of increasing complexity. At some later stage the developed chemistry ( no longer "natural" in a constraining fashion ) might then support the difficult facets of current life forms ( all the amino acids, all the base pairs, and long chain proteins ).

I'm assuming that you know something of DNA, amino acids, proteins and catalysts - but not that much - hopefully it will make sense as time goes on. I'm particularly appealing to you if you're a science fiction enthusiast or generally interested in science and the origins of life. I'm trying to fill in the details of the story, not particularly "prove" a case. I'm not seriously trying to "argue" with the Creationists, but I did use some of their arguments as a springboard to develop my understanding.

In putting this together, I've mostly embraced mainstream biochemistry, though I do go out on a limb in speculating that single strand DNA was able to help in the catalysis of reactions which store information, but did not store information by itself. Of course, my ideas might well be blown out of the water by credible recent research - who knows - but you've gotta start somewhere!

I do emphasise talking about catalysis and the ATP molecule. To my way of thinking, we should not just think of "DNA and amino acids" as important to cells, but we should think of "DNA, amino acids and ATP" as having similar importance - and some mainstream researchers might even agree with me there.

But, before we go into these ideas, my plan is to look at some of the important things current life does, and how early life might have done them.


  1. Cellular life as we know it
  2. Catalysis : Concentration alteration
  3. Catalysts : higher energy states
  4. Energy for reactions. Food for the city.
  5. The polymerisation challenge
  6. The Dynamic Selective System - reducing the effects of formic acid
  7. Early life - a metabolism first approach
  8. Chemical innovation and the "Wheel of Fate"
  9. References / Inspirations / Further Reading

Cellular life as we know it

I define life as "The duplication and maintenance of information-bearing molecules, and the related supporting chemical processes and infrastructure."

Cells currently set up a very specific chemical environment to support this process, and make use of energy in the environment. In imagining early life, we think about reactions that may not have taken place in cells, but nevertheless led to these reactions.

The cell may thus be seen as something which :

  1. Extracts energy from the environment
  2. Maintains a cell wall and chemical environment separate from the surrounds
  3. Duplicates information

... And needless to say a lot goes on to maintain this.

Normally, we think of a "cell" providing these functions. However, depending on the chemical environment, it might be possible to "lean on" chemicals outside to provide energy, not bother with a cell wall, and just be a molecule ( or set of molecules ) which duplicates itself, relying on the surrounding chemical environment for energy and raw materials.

Current cells duplicate DNA and use information in the DNA to synthesise proteins which catalyse reactions. However, this process takes energy. Energy is extracted from the environment to make ATP molecules, which do the "heavy lifting". Carbohydrates are a "store" of energy, but the energy must be converted to ATP for the cell to use it.

One approach is to take carbohydrates from the environment. Another is to use photosynthesis to generate carbohydrates from carbon dioxide and water in the environment. A third is chemosynthesis, where I understand energy molecules other than carbohydrates are used as an energy source to form ATP.

A central part of the conversion of carbohydrates to ATP is the Citric Acid or Krebs Cycle. Depending on the microbe and the environment, a microbe may use fermentation, reacting the carbohydrate to alcohol, lactic acid or some other compound; or it may combine the carbohydrate with oxygen in "respiration", generating carbon dioxide, water and rather more energy.

There are three pairs of reactions in cells : condensation - lengthening of carbon chains, and splitting - the reverse; polymerisation - joining a chain through oxygen or nitrogen, and hydrolysis - the reverse; and also oxidation and reduction.

Catalysts are an important part of the picture - there's so many possible reactions, and you want to focus on a few which contribute to the cell. However, there's more to the picture - a chain of DNA base pairs, or a chain of amino acids has more energy than the individual components. So, while we normally think of catalysts as "helping the chemicals to a lower energy state", there's a lot more to the picture - we have to think about catalysts a bit differently.

So, let's first look more closely at catalysts.

Catalysis : Concentration alteration

You may have heard that catalysts speed up chemical reactions - that's true, but it's also misleading. Importantly, catalysts change the relative concentrations of different chemicals. We're used to the idea that chemicals react together, with only one way to react. But, if you have a lot of chemicals together, they'll form a random distribution of products, based on what chemical reactions are available.

We see this in the Miller-Urey experiment, where under the right conditions, amino acids will form from their precursors. Here, molecules randomly collide with each other and will build up into more complex molecules - including eventually amino acids.

What you'll have is a mix of all the random chemical combinations that can be formed - as a child might assemble lego blocks randomly together. They can be connected together in a multitude of different ways. More obviously, different sequences of the same set of amino acids will behave differently. One might be an effective protein, others will just be a random string of amino acids.

Rather than a random combination of molecules, catalysts will increase the concentration of a particular product. Yes, we've increased the speed of the reaction. But metabolic reactions like fixing nitrogen involve the hydrolysis of the equivalent of 16 ATP molecules. The original rate of reaction would be so close to zero as to be pretty much zero. While technically we might say the catalyst "speeds up" the reaction, in this case it is pretty much a meaningless notion. Here the catalyst "enables" the reaction.

It's worthwhile comparing this situation to a "regular" chemical synthesis, at least as happens in laboratories and industry. First, we purify the reagents. Then, we react a few reagents together. We might just mix them, we might heat them to help the reaction overcome activation energy barriers, and we might also add a catalyst to help this along. Then, we have the products. Sometimes there are impurities. We remove them.
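One hedged way to put numbers on "enables rather than speeds up" is the Arrhenius relation, where the rate constant scales as exp(-Ea/RT). The sketch below is only an illustration: the two activation energies are assumed values picked for the example, not measurements of nitrogen fixation or of any other reaction discussed here, and the function name is made up for this article.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def rate_enhancement(ea_uncatalysed_kj, ea_catalysed_kj, temp_k=298.0):
    """Ratio of catalysed to uncatalysed rate constants from the Arrhenius
    relation, assuming the same pre-exponential factor for both paths."""
    delta_ea_j = (ea_uncatalysed_kj - ea_catalysed_kj) * 1000.0
    return math.exp(delta_ea_j / (R * temp_k))

# Assumed, illustrative barriers: 150 kJ/mol without a catalyst, 60 kJ/mol with one.
factor = rate_enhancement(150.0, 60.0)
print(f"rate enhancement at 298 K: about 10^{math.log10(factor):.0f}")
```

With a barrier drop of this size the ratio runs to many orders of magnitude, which is the sense in which the uncatalysed rate is "pretty much zero" by comparison.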

This is in contrast to a cell, which is not a collection of "pure" reagents in the way that we understand the notion, but rather a mixture of lots of potentially reactive components. Rather than "purifying" the mixture, the catalysts, through selecting the components of the chemical reaction, perform a "virtual purification" role.

How do catalysts work ? One way of looking at it is that the catalysts have depressions which match the molecules in the reactions; the reactants fit into these depressions, and then react to form a new compound because they are in close proximity.

So, over time our chemical soup will form lots of chemical compounds. Over time, it might randomly put together compounds whose shape is able to catalyse another reaction. It is this process that may have been one of the main steps towards life. Our reactive pre-life soup is searching out the information space of available compounds, looking for compounds that have a catalytic reactive linkage.

We can imagine going one step further, that catalysts catalyse other reactions which eventually form more of that catalyst, and we have a runaway loop. The researchers Manfred Eigen and Stuart Kauffman investigated the idea of "catalytic loops" in more detail, suggesting they were important to the formation of life.

Of course, there's a bit of a step between molecules involved in autocatalytic reactions and cellular life as we know it. But one thing at a time.

Catalysts : higher energy states

One issue with the whole "catalyst" picture is that you're trying to build molecules which have higher energies than the original components. How does that work, if catalysts normally work to rearrange atoms into lower energy states?

Consider "normal" chemical reactions as we understand them. "Normal" endothermic chemical synthesis involves increasing the temperature, with the equilibrium at this higher temperature including more higher-energy products. Because the molecules are moving more rapidly, they can collide with each other, overcoming the repulsive force and have surplus energy to feed into bond formation. Once we lower the temperature, the high-energy state is locked into the chemical structure, because of the activation energy needed to change the state.

The problem with this "thermal" approach is that it is a very blunt instrument. Much as you may make some products, that's a statistical result - you'll have a lot of products you don't want. Further, at the same time as worthwhile chemicals are being made, existing chemicals you've made so far are being degraded. If you have a set of pure reagents and a narrow range of reaction paths, not a problem. However, with a cell you have a lot of things you want to keep at the same time as you want to push other molecules into a higher energy state.

Now, I've told a story about molecules being "randomly put together like a child putting lego blocks together randomly". And indeed, this sort of thing can happen. The problem is, under normal conditions it will only happen when moving molecules have thermal energy corresponding to bond energy - there needs to be a "high temperature". There are also some other issues, about removing the resultant water molecule and similar ( it is called a "condensation" reaction, after all ). There might be some catalytic aspects helping things along, but most reactions will be of this "spontaneous random" variety.

This "spontaneous random" reaction environment will generate some strings of molecules - so called peptides, short chain amino acid sequences. And, yes, the problem is that along the way, those same reactions which create also destroy. ( Sorta like the two aspects of the Hindu God Shiva. ) However, all is not lost. It is a statistical game, and we're fighting against a steeper incline. At some stage, we can expect there to be a combination of molecules we want - it is just they will be in shorter supply.

However, in conjunction with a catalyst, at low temperatures we can use mechanical "pushes" to overcome the repulsion and provide energy to make up the bond energy. How is it that energy is "used" to "push" molecules together into a higher energy state ? You have what is effectively a "mechanical explosive" element. The enzyme has to hold the two precursors and an ATP molecule, which then "explodes", pushing the precursors together in a way which overcomes the repulsion, forming a higher energy molecule.

So, we're now talking about a 4 way convergence - the shape of the two precursors, and the shape of a molecule which decays to provide energy, and the enzyme which has a "reverse template" matching the shape of all three molecules.

The ATP "energy-bearing" molecule is the one I've been making all the fuss about so far. Other molecules, like carbohydrates, may well store energy - but ATP is one of the few molecules ( maybe the only one?) which can be used to help other chemicals increase their energy without heating.

While the reaction increases the energy of the precursor chemicals, joining them together, it does in fact reduce the chemical energy of the whole set of molecules including the ATP molecule. The catalyst is still facilitating progress towards a lower energy state overall - the ATP molecule loses more energy than the precursors gain - so reducing the overall energy ( Well, actually it converts some of the chemical energy into thermal energy - total energy is still conserved ).
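The bookkeeping can be sketched as a simple sum of free-energy changes. This is a minimal illustration, not a model of any particular enzyme: the ATP figure is the approximate textbook standard value for hydrolysis, and the bond-formation cost is an assumed round number chosen only to show the sign of the result.

```python
# Illustrative free-energy bookkeeping for an ATP-coupled condensation.
# Real values depend on concentrations and conditions.
DG_ATP_HYDROLYSIS = -30.5   # kJ/mol, ATP + H2O -> ADP + Pi (approximate standard value)
DG_BOND_FORMATION = +15.0   # kJ/mol, assumed cost of joining two precursors (hypothetical)

def coupled_delta_g(dg_uphill, dg_downhill):
    """Net free-energy change when an uphill step is coupled to a downhill one."""
    return dg_uphill + dg_downhill

net = coupled_delta_g(DG_BOND_FORMATION, DG_ATP_HYDROLYSIS)
print(f"bond formation alone: {DG_BOND_FORMATION:+.1f} kJ/mol (uphill)")
print(f"coupled to ATP:       {net:+.1f} kJ/mol ({'downhill' if net < 0 else 'uphill'})")
```

The precursors end up at higher energy, but the sum for the whole set of molecules is negative, so the coupled reaction still runs "downhill".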

More information on how the catalyst is more of a "jig", holding together ATP molecules alongside the precursors is contained in : http://physics.stackexchange.com/questions/8076/how-does-atp-transfer-energy-to-a-reaction

Energy for reactions. Food for the city.

By some calculations, it takes a lot of energy to replicate a cell. However, you'd need less energy to reproduce a few molecules. There are some approaches hypothesised which "actively" pull energy out of the environment. In between though, molecules might "scavenge" energy from their environment. One possibility is a "convection current" which "recharges" ATP in a thermally hot area, then directing them to a colder area where the ATP is used to make molecules.

If we know enough about physical chemistry, maybe we can label something as "impossible". However, it does seem to me that current chemistry focuses on enzyme mediated reactions in life as we know it now - it does not consider how it might be for life in times past. ATP is known to be made biologically in two ways, involving enzymes. There's little consideration on how ATP might be made naturally without enzymes.

Certainly, if you've demonstrated a reaction approach in the laboratory, that's good. But it is important to recognise that the reduction of carbon dioxide with hydrogen was originally thought impossible by some - but eventually it was found that some microbes performed just this reaction.

It is a matter of taking energy bearing molecules, and using that energy to store information in higher energy information bearing molecules. There's information in a particular sequence of molecules - and then there's "energy" - getting that particular sequence of molecular units joined together. When you have your own energy processing, rather than relying on background energy, you can increase the reaction rate.

A recent paper by Lane and Martin goes into how ATP molecules might be generated, not through "heat" and "convection currents" in the way I imagine, but rather through proton gradients which reside in naturally occurring chemical complexes. This could be an important part of the picture - but there's still the story of how the reaction which utilises ATP develops - assuming we have sufficient supplies of ATP. Do ATP based reactions only develop when we have a Lane-Martin setup generating the ATP? Or could there be other approaches developing ATP before Lane-Martin synthesis becomes dominant?

The polymerisation challenge

There are three challenges in polymerising amino acids - the first is getting the energetics and chemical environment supporting polymerisation. The second is to reduce the impact of the generated water molecules in "blocking" or "reversing" the reaction. The third is to reduce the impact of terminating reactions.

Once you have enzymes forming other proteins, and ATP molecules helping this along, all well and good. But how about before this? How do we get things started? Where did the proteins come from that we could then use ATP to make other proteins? They must have come from another source.

These can form thermally, but there are other approaches to help things along - there's super-critical water, which at high pressures and temperatures behaves differently. There are ice based reactions, where surfaces behave differently. There are catalysts like clays and pyrites, which facilitate condensation reactions. You have shock synthesis, the result of asteroids colliding together, which has been investigated in Nature Geoscience. And there might be others out there too, which may or may not have been written about.

The polymerisation reaction generates water, which can potentially block the further reaction. ATP facilitated synthesis does not have this problem, but it's otherwise an issue. Clays and pyrites can help to remove the influence of water, but importantly, colloidal reaction environments might also do this. The point is that this is a challenge, not a show-stopper.

Amongst these "natural" peptides, you need either one or more which form as "initiators". These peptides catalyse either copies of themselves or other peptides which end up catalysing more copies of themselves. But, at the same time, these peptides also create imperfect copies - fine so long as the concentration of the "unaltered" peptide continues to grow - with these imperfect copies searching out the space of possible relationships between catalysing molecules and reaction products, in order to find molecules which act to create more of themselves, and hopefully able to do something differently or more rapidly than the original set of molecules.

I assume that - perhaps with the help of ATP molecules - protein can catalyse reactions which result in more of that same protein. There's been a demonstration of such effects by Ghadiri and others back in 96/97.

Eventually, you would find a catalyst which could make use of ATP molecules to create molecules which form part of the catalytic loop. On doing this, you'd then be able to continue the reactions in a less "exotic" realm, where a supply of ATP facilitates progress, rather than needing surface catalysts, supercritical water, high temperatures, or what have you.

Having reached this point, you'd then be able to search out the protein space far more effectively, with longer molecules approaching the length of current proteins, and much more "efficient" chemical reactions.

The Dynamic Selective System - reducing the effects of formic acid

Via the Miller-Urey experiment, we know that - at least in a reducing environment - it is possible to make a good quantity of amino acids. There's some controversy over just what our historical chemical environment was like, and whether there was a "primordial reducing ocean with lots of lightning". Well, maybe not. I'll leave the details of that argument to others. But what I claim is that somewhere on planet earth there was an environment which made amino acids naturally, with the Miller-Urey experiment providing some insight into that environment, though it was not necessarily an exact match.

One issue, however, is that the Miller-Urey experiment generates chemicals like formic acid, which are said to terminate peptide chains. This may be an issue if the chemistry is much like the Miller-Urey environment. However, it could be different in a way which makes the formic acid less relevant. Being acidic ( well it is called an acid after all, you know ), it could react with other molecules in the environment, taking itself out of the picture. Or, the environment could prefer reactions which do not involve the generation of formic acid. But, anyway, let's assume there's a decent amount of formic acid hanging around.

We normally think of a reaction using up the reactants, reaching an endpoint, and that's it. If some reactants are favoured, they will dominate the system. For this reason, you might think terminating species like formic acid will stop interesting reactions. However, we can imagine a more complex early chemical reaction taking place around a sea vent, near catalytic minerals, with colloids etc. etc. and with continuous circulation and recycling.

We need something which decomposes some of the reaction products ( say, amino acids and things of similar complexity ), so things can "recycle".

Essentially there are two components :

  1. A condensation site, where molecules are joined together , and;
  2. A destruction site, where molecules are broken down.

Sarfati claims that prebiotic experiments generate at least three times more unifunctional molecules than bifunctional molecules. Not a problem. It becomes a statistical game. Because while "terminating" molecules represent the majority of molecules, once we have some catalysing molecules, their ability to generate more is related to their concentration, so they can eventually get a foothold - even with the generation of a single molecule.

Imagine a system with 20 different polymerisation molecules. 15 of them are unifunctional molecules, which stop chain development, and 5 are amino acids of interest. Assume we want a particular 10 amino acid peptide chain which can catalyse itself. The chance of it forming is (1/20)^10, or roughly 10^-13. Parts-per-trillion ( 10^-12 ) is about the lowest concentration people are able to measure readily, and this would be below that limit, but it will still be relevant. We'll have this concentration, assuming all reactions are equally possible.

Assume we have 10^20 molecules in our system - a small fraction of a mole - that's 10^7 of the catalytic peptide. Lots of molecules, but a tiny proportion of the whole.
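The arithmetic is easy to check. The sketch below simply reproduces the numbers in the text, under the same simplifying assumption that every one of the 20 molecule types is equally likely at each of the 10 positions; the variable names are made up for this example.

```python
from fractions import Fraction

n_monomer_types = 20      # 15 chain terminators + 5 amino acids, as in the text
chain_length = 10
total_molecules = 10**20  # a small fraction of a mole

# Probability that a randomly assembled chain matches one specific sequence,
# assuming every monomer type is equally likely at each position.
p_sequence = Fraction(1, n_monomer_types) ** chain_length
expected_copies = total_molecules * p_sequence

print(f"probability     ~ {float(p_sequence):.2e}")
print(f"expected copies ~ {float(expected_copies):.2e}")
```

Both figures land within rounding of the 10^-13 and 10^7 quoted above.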

Next, assume we continuously decompose 10% of the molecules, and then the process restarts with these new molecules. While the terminating molecules are generated in accordance with their original proportion - blocking 3/4 of the possible reactions - if 90% of the catalytic molecules survived, they have the chance of operating on the molecules just released, with the catalyst concentration being a multiplier, to the extent they will collide with existing molecules and provide a faster reaction than the "natural" connection rate ( it is a catalyst, after all). So, as the result of this, while 3/4 of the reactions are blocked by the formic acid, the concentration of the catalyst might perhaps double. Eventually, you might get to the point where the catalyst takes up an eighth of released reactants, and the concentration in the whole collection of molecules might rise to perhaps 10%.
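To see how such a cycle might play out, here is a toy simulation. It is a sketch under loudly stated assumptions, not a chemical model: the catalyst fraction roughly doubles each cycle while it is rare ( 90% survive, and a hypothetical gain factor of 1.1 makes up the rest ), and it can never claim more than an eighth of the 10% of material released in a cycle. The function and parameter names are invented for the illustration.

```python
def simulate_recycling(x0=1e-13, cycles=200, decay=0.10,
                       gain_per_catalyst=1.1, max_share_of_released=1/8):
    """Toy model of the catalyst fraction x under continuous recycling.

    Each cycle a fraction `decay` of all molecules is broken down and the
    released material is reassembled. The catalyst turns some of that
    material into copies of itself, in proportion to its concentration at
    the start of the cycle, but never more than `max_share_of_released`
    of what was released. All parameter values are assumptions.
    """
    x = x0
    history = [x]
    for _ in range(cycles):
        survivors = (1.0 - decay) * x
        new_copies = min(gain_per_catalyst * x, max_share_of_released * decay)
        x = survivors + new_copies
        history.append(x)
    return history

history = simulate_recycling()
for cycle in (0, 10, 20, 30, 40, 100, 200):
    print(f"cycle {cycle:3d}: catalyst fraction ~ {history[cycle]:.3e}")
```

In this toy model the fraction doubles for a few dozen cycles and then levels off at about an eighth of the whole collection, the same ballpark as the "perhaps 10%" guessed at above. Change the assumed numbers and the plateau moves, but the shape of the curve stays the same.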

Of course, along with the "perfect copies", contributing to the cycle, there will be "imperfect copies", searching out the solution space. Continuous "recycling" means that even with one molecule ( it has a 90% chance of surviving the "convection cull" ) it could get past the formic acid blockage.

This "recycling" approach is one way of selecting for self-reproducing chemicals. "Spiegelman's Monster", RNA based, shows an analogous approach. Here, a molecule in a "reactive combinatorial environment" was able to explore the solution space and change its chemistry to something more effective. Here, samples which represent the "endpoint" of a reaction are introduced to batches of "new reactant" successively. There was a very particular chemical environment that was set up, giving the RNA strand the "raw materials" ( and presumably energy ) it needed to reproduce. However, it is possible to imagine my "recycling approach" might also have provided Spiegelman's monster with the raw materials it needed on an ongoing basis.

The point is that when you have the chemical environment set up properly, reactions can progress which increase informational complexity and allow the exploration of the space of available compounds and possible catalytic relationships. You just have to have that chemical environment. We can do it artificially - we just have to imagine it developing naturally.

Anyway, the point of this analysis is to show that even with a large number of other molecules generated, a selective environment can increase the number of catalytic peptides formed - and the recycling would mean mutation could allow the formation of other interesting

So - we imagine an autocatalytic set of proteins, operating to preserve the information stored. Further, however rapidly the system cycles through, even if the reactions took a few hundred years to develop - not a problem, there's a sterile environment. The crucial element for effectiveness of the catalysis is that the catalyst forms more than one copy of itself before it decomposes. This means its "efficiency" can be ridiculously low compared to catalysts used in current living cells - so long as there is an increase over time.
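The "more than one copy before it decomposes" condition can be written down directly. In this minimal sketch, each catalyst molecule is assumed to make `copy_rate` copies per cycle and to decompose with probability `decay_rate` per cycle, so it lasts 1/decay_rate cycles on average. Both names and the sample rates are hypothetical, chosen to sit either side of the threshold.

```python
def copies_per_lifetime(copy_rate, decay_rate):
    """Expected copies one catalyst makes before it decomposes
    (mean lifetime is 1/decay_rate cycles)."""
    return copy_rate / decay_rate

def population_after(n0, copy_rate, decay_rate, cycles):
    """Catalyst count after `cycles`, multiplying by
    (1 - decay_rate + copy_rate) each cycle."""
    return n0 * (1.0 - decay_rate + copy_rate) ** cycles

decay = 0.10  # 10% decomposed per cycle, as in the recycling example
for copy_rate in (0.05, 0.10, 0.11):  # assumed, deliberately tiny copy rates
    r = copies_per_lifetime(copy_rate, decay)
    n = population_after(1.0, copy_rate, decay, cycles=1000)
    print(f"copies per lifetime {r:.2f} -> relative population after 1000 cycles: {n:.3g}")
```

Even a catalyst that makes only slightly more than one copy per lifetime grows without limit given enough cycles, while one that falls slightly short dies away, however slow the cycle is.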

Early life - a metabolism first approach

Life nowadays is very complex. We can speculate on the overlay of past and present chemical approaches. For example, some RNA based catalysts are thought to have very early origins. Many forms of life exhibit the same "body plan", and all forms of life use the citric acid cycle. We see history before us, in the chemical record.

But as stepping stones towards current life, we can imagine chemical approaches which are no longer seen anywhere in current life, and do not exist in the fossil record. These stepping stones were superseded by more efficient processes now used in current life. While these stepping stones are no longer in evidence, the important thing is that they were closer to reactions which could have developed from naturally available compounds.

I see early life dominated by protein catalysts. Here, protein preserves information, and causes the continued generation of information storing protein. It does not push information onto DNA/RNA, though at some later stage DNA/RNA would cooperate in the process, building new information.

So, my preference is for a "metabolism first" approach, with RNA and DNA added later. Keep in mind, though, this "metabolism" involves information processing - you're not talking about a "blind undirected" metabolism. At a later point DNA/RNA did obtain a monopoly on information storage, and this approach became obsolete.

As you look into the details, you can find details of an "RNA world", a metabolism first world, an Iron-Sulfur world - the list goes on. But I'll leave you to look into that - hopefully this article has set you up to explore the information space yourself.

Chemical innovation and the "Wheel of Fate"

We borrow the idea from Stephen Jay Gould of the "Wheel of Fate" - namely, that some approach developed for some purpose might find itself adapted to a different purpose when the opportunity arises. Here, we apply it to chemical processes. The idea is that some current metabolic process might have its origins in the use of those chemicals for a quite different purpose.

Take DNA. Currently, it is used to store information and as an intermediary towards protein production. Part of the reason why it stores information is because it is readily duplicated. However, we can imagine it being used for some other purpose before its ability to duplicate came into prominence.

Its use might have been as a catalytic intermediary and template for the production of information bearing molecules. I understand the RNA base units cytosine and ribose are very difficult to make naturally. The mistake is to assume that we could not make an effective precursor RNA system with just guanine and uracil. Cytosine and ribose could come later. The point is, with just two molecules, you can have a single strand molecule with a "zig-zag" profile.

The idea that a single strand of DNA can catalyse a reaction deviates from current thinking about catalysts. This isn't known, but there does not seem to be much research into single strands of DNA either. Most of the time DNA is paired in a spiral.

We know that a single strand of DNA has "active" edges - that are able to bond to the "opposite" base pair. However, if it has such "binding ability", then we can imagine it could bind to protein fragments. Things would not "lock into position" as with current catalysts, but rather they would "flap about" on the DNA chain, eventually colliding with each other and completing the reaction. You could even have exploding ATP-like molecules giving one "lobe" velocity compared to the other, providing kinetic energy to increase the bond energy of the target molecule. Compared to "regular" catalysts, this would be slow - but that's not the issue. The issue is whether this would contribute to a net increase in information bearing molecules over time.

References / Inspirations / Further Reading

I acknowledge the ideas of Predator, a Sydney activist who first put me onto the idea that chemical compounds could contain information - something that started me thinking about "proteins searching the space of available reactions". I also acknowledge discussions with RH, which helped me develop my perspective on abiogenesis.

I read the books "Origin of Life" by Leslie Orgel, and also Microbiology and Infection Control, Gary Lee & Penny Bishop. I've also read and absorbed lots of other stuff all over the place; I can't really articulate where from.

The Wikipedia article on abiogenesis is good stuff. Also, the two pages origins and probabilities represent a good outline of possibilities.