Illumina Launches Billion Cell Atlas for AI-Driven Drug Discovery 

By Team VOH

Illumina has unveiled the Billion Cell Atlas, the world’s largest genome-wide genetic perturbation dataset, designed to accelerate drug discovery and precision medicine using artificial intelligence. Announced on January 13, 2026, the Atlas is the first phase of a three-year program to build a five-billion-cell reference resource, intended to become one of the most comprehensive maps of human disease biology ever assembled.

The initiative is being developed in collaboration with AstraZeneca, Merck, and Eli Lilly and Company as founding partners. The first tranche focuses on a curated set of disease-relevant human cell lines and is aimed at validating genetic drug targets, training large-scale AI models, and revealing disease mechanisms that have been difficult to study with existing datasets.

At full scale, the Atlas will capture how one billion individual cells respond to CRISPR-based genetic perturbations across more than 200 disease-relevant cell lines. By switching genes on and off across the roughly 20,000 human genes, the platform enables researchers to map how genetic changes alter cellular behavior in areas including cancer, immune disorders, cardiometabolic disease, neurological conditions, and rare genetic disorders. The dataset is designed to support mechanism-of-action studies, new indication discovery, and target validation grounded in human genetics.

The Atlas is the first data product from Illumina’s newly formed BioInsight business, which is focused on building large-scale biological datasets and analytics to underpin next-generation AI in biopharma. Data generation is powered by the Illumina Single Cell 3’ RNA prep platform, allowing millions of cells to be captured per experiment. The project is expected to produce around 20 petabytes of single-cell transcriptomic data within a year, processed through Illumina’s DRAGEN pipeline with hardware acceleration and delivered via the Illumina Connected Analytics cloud for scalable analysis.

Merck plans to use the Atlas to train proprietary AI and machine-learning foundation models and build virtual cell models to improve disease prediction and target selection across its pipelines. AstraZeneca and Eli Lilly are applying the resource to translate genetic signals into mechanistic biology and to generate cell-type-specific insights critical for AI-enabled discovery, particularly in complex and historically hard-to-decode diseases.

The Billion Cell Atlas builds on Illumina’s February 2025 announcement of its plan to create a five-billion-cell single-cell resource and marks a shift for the company toward delivering data-centric products alongside sequencing platforms. Illumina said it will continue expanding multi-billion-cell atlases with partners to create disease-specific perturbation datasets paired with advanced AI models, forming the backbone of future AI-driven drug development.
