BOTANIC Was Trained on 1,600 Plant Genomes — and It Can Find Genetic Variants in Weeks Instead of Years

Marco Lanz

3 days ago

BOTANIC was trained on over 1,600 plant genomes using a transformer architecture, enabling genetic variant discovery in weeks instead of years.
Living Models adopted an open-weights strategy, allowing seed companies to fine-tune the model on proprietary data while retaining ownership of results.
Unlike statistical genomic tools, BOTANIC learns biological structure, reducing failed candidates and cutting experimental validation cycles significantly.

On April 10, 2026, Living Models, a biotechnology startup, unveiled BOTANIC, an AI foundation model trained on over 1,600 plant genomes that could fundamentally reshape how agricultural companies develop new crop varieties.

According to TechRadar, which published an exclusive interview with the company’s leadership, the technology applies the same transformer architecture behind large language models to biological sequences, enabling researchers to identify promising genetic variants in weeks rather than years.

The initiative addresses what scientists describe as an urgent timeline: crop varieties suited for the climate expected in 2050 must be bred today, yet traditional methods typically require five to twelve years from initial hypothesis to commercial release. By analyzing patterns across genomic data, BOTANIC learns biological structure rather than mere statistical correlations, potentially cutting experimental validation cycles dramatically while reducing the number of failed candidates that waste resources.

How AI Foundation Models Could Transform Plant Biology

Bertrand Gakière, VP of Biology at Living Models, stated that “every living thing on Earth runs on the same programming language: DNA codes for RNA codes for proteins codes for phenotype.”

The company positions its technology as a tool that can read and interpret this code, arguing that predicting biological outcomes is “infinitely more useful than predicting the next word in a sentence.” The distinction matters because existing genomic selection tools like GBLUP and BayesC learn statistical patterns within specific training populations but degrade when applied to different environments or genetic backgrounds, remaining essentially “blind to biological mechanism.”

The open-weights release of BOTANIC reflects what the company describes as a “DeepSeek of their category” strategy, prioritizing speed of iteration over capital advantages. While incumbents like Bayer CropScience, Corteva, Syngenta, BASF, and Limagrain dominate the market, Living Models believes teams closest to the problem move faster. The company says its durable competitive advantage comes from proprietary fine-tuning data gathered through customer partnerships and feedback loops from real breeding programs. Major seed groups, in turn, can adapt the model on their own phenotypic data while retaining ownership of the resulting weights.

So, a chatbot that reads DNA instead of predicting your next autocorrect disaster? Honestly, the bar was pretty low for “more useful than autocomplete.” The agricultural sector has been waiting for tools that actually understand why certain genetic combinations fail, not just that they statistically tend to fail. Living Models claims its model explicitly flags low-coverage regions of genomic space and includes uncertainty quantification in every output, which sounds suspiciously like the company is aware that hallucinating in biology could mean accidentally creating a superweed or a tomato that tastes like cardboard.

Why Plants Before Humans: The Strategic Calculus

The decision to focus initially on agriculture rather than human health stems from practical factors beyond scientific priority. No HIPAA, GDPR, or patient consent frameworks apply to plant data, eliminating months of regulatory negotiation. Agricultural applications face plant variety registration requirements rather than FDA or EMA approval processes, and predictions in crop development can be tested, and failures caught, within a single growing season rather than over years of clinical trials.

The climate urgency argument carries particular weight: agriculture remains the sector most immediately affected by climate disruption, creating demand for varieties that must perform in conditions not yet observed.

The competitive dynamics favor this approach for startups unable to match the resources of established agrochemical giants. Major seed companies can fine-tune BOTANIC on their proprietary field trial results and environment-specific performance records without surrendering data ownership. Customer data stays in their environment, and the resulting model weights remain their property, creating a hybrid deployment model that aligns incentives while bypassing the need for Living Models to fund its own extensive breeding programs.

Plant breeding operates on four-to-eight-year timescales from genomic hypothesis to commercial variety, meaning the technology’s impact on global food security could emerge well before comparable applications in human medicine reach patients.

The Human Genome Project proved humanity could “read the code,” and Living Models describes its work as the infrastructure that makes that journey routine and scalable. For an industry burning through billions on candidates that fail late in development cycles, even modest improvements in hit rate could translate into crops that actually reach dinner tables before the climate makes growing anything significantly harder.