Resources
Code, models, and datasets produced by our research efforts.
Model
Evo 2
A generalist biological sequence model trained on genomes from all domains of life
Brixi et al., bioRxiv (2025)
Model
Evo
A biological sequence model trained on prokaryotic genomes that enables prediction and design tasks from the molecular to genome scale
Nguyen et al., Science (2024)
Dataset
SynGenome
A first-of-its-kind genomic database containing over 120 billion base pairs of AI-generated DNA sequences
Merchant et al., bioRxiv (2024)
Dataset
OpenGenome2
8 trillion base pairs of genomes, spanning all observed evolution, used to train the Evo 2 model
Brixi et al., bioRxiv (2025)
Dataset
OpenGenome
300 billion base pairs of prokaryotic and phage genomes used to train the Evo model
Nguyen et al., Science (2024)