Programmable AI Platform for Lipid Nanoparticle Design and Optimization
Guy D. Rosin; Igor Nudelman; Lee Goldfryd; Nir Suissa; Amit Benazraf; Alex Lagadinos; Yogev Debbi
Lipid nanoparticles (LNPs) are a preferred delivery vehicle for RNA therapeutics, but extrahepatic delivery remains a challenge. Developers often under-utilize the formulation data they accumulate and rely instead on empirical trial and error. Machine learning (ML) can analyze these datasets, predict optimal formulations, and accelerate development.
To address these challenges, Mana.bio has developed an ML platform for optimizing particle composition and formulation. The platform uses proprietary ML algorithms to predict key LNP characteristics, including physicochemical properties, cell/tissue targeting, and toxicity. It operates in an iterative Learn-Design-Make-Generate cycle, continuously improving LNP design through data-driven insights.
In the "Learn" phase, Mana.bio collects relevant data from public literature, using proprietary software tools to automate data extraction. This dataset forms the foundation for the initial prediction models. During the "Design" phase, the platform defines and generates a theoretical LNP space encompassing up to hundreds of millions of in-silico compositions. Screening then proceeds in two steps: a filter first removes LNPs that fail predefined criteria (e.g., size below 90 nm, PDI below 0.2, and positive zeta potential), and a multi-objective optimization then ranks the remaining LNPs by predicted performance (e.g., high transfection efficiency in target cell lines, low cytokine levels, or reduced liver toxicity). Mana.bio’s Freedom-to-Operate (FTO) tool integrates data from all relevant patents to ensure the selection of novel LNP components. Top-ranked candidates are selected for empirical testing in the lab based on component diversity. In the "Make" phase, LNPs are synthesized and encapsulated. During the "Generate" phase, physicochemical parameters and in-vitro activities are measured. Experimental results are stored in Mana.bio’s database and used to retrain the models, improving performance in each iteration.
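The two-step screening in the "Design" phase can be illustrated with a minimal sketch. It is not Mana.bio's implementation: the `Candidate` fields, the function names, and the toy values are hypothetical, the thresholds are the example criteria quoted above, and the multi-objective step is approximated here by a simple Pareto front. The sketch assumes that higher predicted transfection and lower predicted cytokine levels are preferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record of model predictions for one in-silico LNP."""
    name: str
    size_nm: float        # predicted particle size
    pdi: float            # predicted polydispersity index
    zeta_mv: float        # predicted zeta potential
    transfection: float   # predicted transfection (assumed: higher is better)
    cytokine: float       # predicted cytokine level (assumed: lower is better)


def passes_filters(c, max_size_nm=90.0, max_pdi=0.2):
    """Step 1: hard criteria taken from the examples in the text."""
    return c.size_nm < max_size_nm and c.pdi < max_pdi and c.zeta_mv > 0


def dominates(a, b):
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.transfection >= b.transfection and a.cytokine <= b.cytokine
    better = a.transfection > b.transfection or a.cytokine < b.cytokine
    return no_worse and better


def pareto_front(cands):
    """Step 2: keep candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


def screen(cands):
    """Apply the hard filter, then the Pareto selection."""
    return pareto_front([c for c in cands if passes_filters(c)])


# Toy pool with made-up predictions, for illustration only.
pool = [
    Candidate("A", 80, 0.15, 5, 0.70, 0.30),
    Candidate("B", 85, 0.10, 3, 0.60, 0.10),
    Candidate("C", 70, 0.18, 2, 0.50, 0.40),   # dominated by A
    Candidate("D", 120, 0.10, 4, 0.90, 0.05),  # fails the size criterion
    Candidate("E", 75, 0.25, 4, 0.95, 0.05),   # fails the PDI criterion
]
selected = [c.name for c in screen(pool)]
```

In this toy pool, D and E are removed by the hard filter despite strong predicted performance, C is dropped because A is better on both objectives, and A and B remain as a trade-off between transfection and cytokine response.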
A recently developed capability enhances the "Design" phase by enabling autonomous iteration of lipid design. Inspired by evolutionary methods, the model systematically mutates lipid structures, ranks the generated mutants, and advances the most promising candidates to subsequent cycles. This innovation accelerates the discovery of high-performance LNPs tailored for specific therapeutic applications.
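As a rough illustration of the mutate, rank, and advance loop described above, the following sketch runs a generic evolutionary search. Everything in it is an assumption made for illustration: lipids are reduced to a tuple of integer descriptors, `toy_score` is a stand-in for a trained prediction model, and the parameter names and defaults are hypothetical rather than taken from the platform.

```python
import random


def mutate(lipid, rng):
    """Return a copy with one descriptor nudged by +/-1 (clamped at 0)."""
    i = rng.randrange(len(lipid))
    child = list(lipid)
    child[i] = max(0, child[i] + rng.choice((-1, 1)))
    return tuple(child)


def toy_score(lipid):
    """Stand-in for a predictive model: closeness to an arbitrary optimum."""
    target = (12, 2, 3)  # hypothetical "ideal" descriptor values
    return -sum((a - b) ** 2 for a, b in zip(lipid, target))


def evolve(seeds, score, cycles=20, mutants_per_parent=8, keep=4, seed=0):
    """Mutate each parent, rank parents and mutants, advance the top `keep`."""
    rng = random.Random(seed)
    population = list(seeds)
    for _ in range(cycles):
        mutants = [mutate(p, rng)
                   for p in population
                   for _ in range(mutants_per_parent)]
        # Parents stay in the pool, so the best score never regresses.
        pool = set(population) | set(mutants)
        ranked = sorted(pool, key=lambda l: (score(l), l), reverse=True)
        population = ranked[:keep]
    return population


start = (8, 0, 1)
best = evolve([start], toy_score)
```

Keeping the parents in the ranked pool is one simple design choice (elitism); a real system would also need chemistry-aware mutation operators and validity checks, which this sketch omits.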
So far, across 120 cycles and more than 4,300 LNP formulations tested, Mana.bio has collected over 45,000 experimental data points and generated over 80,000 total data points. We evaluate our models with 5-fold cross-validation, using MAE and RMSE for regression models and ROC AUC for classification models. Notable examples include 1) PDI (polydispersity index), where a classification model achieved a 0.79 ROC AUC with a threshold of 0.2 for "good" LNPs, 2) particle size, where a regression model achieved an MAE of 27.95 and an RMSE of 43.79, 3) transfection in non-activated primary T cells, where the model achieved an MAE of 4.32 and an RMSE of 6.54, and 4) ALT and AST toxicity models, which achieved 0.8 and 0.82 ROC AUC, respectively, with classification thresholds tailored by mouse strain and sex. These results underscore the platform's capacity to generate accurate, data-driven predictions, accelerating the discovery of safe and effective LNPs for diverse therapeutic applications.
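For readers unfamiliar with the evaluation protocol, the sketch below shows how 5-fold cross-validation, MAE, RMSE, and ROC AUC fit together. It uses only the Python standard library. The single-feature linear model and the synthetic data are assumptions chosen to keep the example short, not the platform's models or measurements; only the PDI threshold of 0.2 comes from the text.

```python
import math
import random


def kfold_indices(n, k=5, seed=0):
    """Shuffle indices once and yield (train, test) index lists for k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [(sorted(set(idx) - set(f)), sorted(f)) for f in folds]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def roc_auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_line(xs, ys):
    """Hypothetical model: least-squares line on a single feature."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


def cross_validate(xs, ys, fit, k=5):
    """Fit on k-1 folds, predict the held-out fold, pool the predictions."""
    preds = [None] * len(xs)
    for train, test in kfold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = model(xs[i])
    return preds


# Synthetic, noise-free PDI-like data; real measurements would be noisy.
xs = [float(i) for i in range(20)]
ys = [0.105 + 0.01 * x for x in xs]
preds = cross_validate(xs, ys, fit_line)

pdi_mae, pdi_rmse = mae(ys, preds), rmse(ys, preds)
good = [1 if y < 0.2 else 0 for y in ys]       # "good" LNP: PDI below 0.2
pdi_auc = roc_auc(good, [-p for p in preds])   # lower predicted PDI ranks higher
```

Because every prediction is made by a model that never saw that sample, the pooled out-of-fold metrics estimate performance on unseen formulations. The negated prediction is used as the classification score so that a lower predicted PDI corresponds to a higher likelihood of a "good" LNP.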