Novel AI Technique Reveals Sample’s Biological or Non-Biological Origin with 90% Accuracy

The search for definitive biosignatures (unambiguous markers of past or present life) is a central goal of paleobiology and astrobiology. A team of researchers led by the Carnegie Institution for Science has developed a robust method that combines pyrolysis gas chromatography-mass spectrometry (GC-MS) measurements of a wide variety of terrestrial and extraterrestrial carbonaceous materials with machine-learning-based classification. The method distinguishes samples of abiotic origin from biotic specimens, including highly degraded, ancient, biologically derived samples, with 90% accuracy.

An artist’s impression of the Gliese 414 system. Image credit: Sci.News.

Since the early 1950s, scientists have known that, given the right conditions, mixing simple chemicals can form some of the more complex molecules required for life, such as amino acids.

Since then, many more of the components necessary for life, such as the nucleotides needed to make DNA, have been detected in space.

But how do we know whether these are of biological origin or were made by an abiotic process over time? Without knowing that, we don’t know if we have detected life.

“We are asking a fundamental question: Is there something fundamentally different about the chemistry of life compared to the chemistry of the inanimate world?” said Carnegie Institution for Science’s Professor Robert Hazen.

“Are there ‘chemical rules of life’ that influence the diversity and distribution of biomolecules?”

“Can we deduce those rules and use them to guide our efforts to model life’s origins or to detect subtle signs of life on other worlds? We found that there is.”

“From an evolutionary point of view, life is not an easy thing to sustain, and so there are certain pathways which work and certain which don’t.”

“Our analysis does not rely on absolute identification of a compound but determines biological/non-biological origins by looking at the compound in relation to the sample context.”

Professor Hazen and colleagues employed pyrolysis GC-MS methods to analyze 134 varied carbon-rich samples from living cells, age-degraded samples, geologically processed fossil fuels, carbon-rich meteorites, and laboratory-synthesized organic compounds and mixtures.

“59 of these were of biological origin (biotic), such as a grain of rice, a human hair, crude oil, etc.,” they said.

“75 were of non-biological origin (abiotic), such as lab-synthesised compounds like amino acids, or samples from carbon-rich meteorites.”

“The samples were first heated in an oxygen-free environment, which causes the samples to break down (a process known as pyrolysis).”

The treated samples were then analyzed in a GC-MS instrument, which separates the mixture into its component parts and then identifies them.
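To make the shape of such measurements concrete, here is a minimal Python sketch of one way GC-MS readings could be arranged into a grid of intensities indexed by retention time and mass-to-charge ratio. The function name `bin_scans`, the bin widths, and the example readings are all assumptions made for illustration; they are not taken from the study.

```python
from collections import defaultdict


def bin_scans(peaks, time_step=1.0, mass_step=1.0):
    """Sum peak intensities into a sparse (time bin, mass bin) grid.

    peaks: iterable of (retention_time_min, mz, intensity) tuples.
    The bin widths are arbitrary illustrative choices.
    """
    grid = defaultdict(float)
    for rt, mz, intensity in peaks:
        key = (int(rt // time_step), int(mz // mass_step))
        grid[key] += intensity
    return dict(grid)


# Made-up readings for illustration only; not data from the study.
example_peaks = [
    (5.2, 44.0, 120.0),
    (5.7, 44.3, 30.0),
    (12.1, 91.1, 75.0),
]
grid = bin_scans(example_peaks)
print(grid)
```

With these made-up readings, the first two peaks fall into the same time and mass bin, so their intensities are summed into a single cell. Each sample then becomes one such grid, which can be flattened into a feature vector for a classifier.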

Using a suite of machine-learning methods, the team split the three-dimensional (time/intensity/mass) data from the abiotic and biotic samples into training and testing subsets, producing a model that predicts whether a sample is abiotic or biotic with greater than 90% accuracy.
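For context on that figure: with 59 biotic and 75 abiotic samples, always guessing "abiotic" would already be right 75 times out of 134, or about 56% of the time. The Python sketch below computes that baseline and then runs a simple train/test workflow. It is only an illustration under stated assumptions: the features are synthetic random numbers, the nearest-centroid classifier is a stand-in rather than the team's method, and the 80/20 split and all function names are hypothetical choices.

```python
import random

N_BIOTIC, N_ABIOTIC = 59, 75  # sample counts reported in the article


def majority_baseline(n_a, n_b):
    """Accuracy of always guessing the larger class."""
    return max(n_a, n_b) / (n_a + n_b)


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def fit(train):
    """train: list of (features, label). Returns {label: centroid}."""
    by_label = {}
    for x, y in train:
        by_label.setdefault(y, []).append(x)
    return {y: centroid(rows) for y, rows in by_label.items()}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda y: sq_dist(model[y], x))


def accuracy(model, test):
    return sum(predict(model, x) == y for x, y in test) / len(test)


# Synthetic stand-in features (assumption): NOT real GC-MS data.
rng = random.Random(0)


def fake_sample(center, n_features=8):
    return [rng.gauss(center, 1.0) for _ in range(n_features)]


data = [(fake_sample(0.0), "biotic") for _ in range(N_BIOTIC)]
data += [(fake_sample(1.0), "abiotic") for _ in range(N_ABIOTIC)]
rng.shuffle(data)

split = int(0.8 * len(data))  # 80/20 split is an arbitrary choice
train, test = data[:split], data[split:]

model = fit(train)
baseline = majority_baseline(N_BIOTIC, N_ABIOTIC)
held_out_accuracy = accuracy(model, test)
print(f"majority-class baseline: {baseline:.3f}")
print(f"held-out accuracy on synthetic data: {held_out_accuracy:.3f}")
```

The accuracy printed for the synthetic data says nothing about the real samples. The point is the workflow: fit on one subset, score on samples the model has not seen, and compare the result against the majority-class baseline.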

“From a chemical standpoint, the differences between biotic and abiotic samples relate to things like water solubility, molecular weights, volatility and so on,” said Carnegie Institution for Science’s Dr. Jim Cleaves.

“The simple way I would think about this is that a cell has a membrane and an interior, called the cytosol; the membrane is pretty water-insoluble, while the cell’s content is pretty water-soluble.”

“That arrangement keeps the membrane assembled as it tries to minimize its components’ contacts with water and also keeps the ‘inside components’ from leaking across the membrane.”

“The inside components can also stay dissolved in water despite being extremely large molecules like chromosomes and proteins.”

“So, if one breaks a living cell or tissue into its components, one gets a mix of very water-soluble molecules and very water-insoluble molecules spread across a spectrum. Things like petroleum and coal have lost most of the water-soluble material over their long histories.”

“Abiological samples can have unique distributions across this spectrum relative to each other, but they are also distinct from the biological distributions.”

The new technique may soon resolve a number of scientific mysteries on Earth, including the origin of 3.5-billion-year-old black sediments from Western Australia. Some researchers contend that these hotly debated rocks hold Earth’s oldest fossil microbes, while others claim they are devoid of life signs.

Other samples from ancient rocks in Northern Canada, South Africa, and China evoke similar debates.

“We’re applying our methods right now to address these long-standing questions about the biogenicity of the organic material in these rocks,” Professor Hazen said.

The approach has also prompted new ideas about its potential contributions to other fields, such as biology, paleontology and archaeology.

“If AI can easily distinguish biotic from abiotic, as well as modern from ancient life, then what other insights might we gain? For example, could we tease out whether an ancient fossil cell had a nucleus, or was photosynthetic?” Professor Hazen said.

“Could it analyze charred remains and discriminate different kinds of wood from an archeological site? It’s as if we are just dipping our toes in the water of a vast ocean of possibilities.”

The team’s work appears in the Proceedings of the National Academy of Sciences.

_____

H. James Cleaves II et al. 2023. A robust, agnostic molecular biosignature based on machine learning. PNAS 120 (41): e2307149120; doi: 10.1073/pnas.2307149120
