In silico fragment synthesis with combinatorial chemistry

cheminformatics
LNP
Revisiting LNP libraries for rational design
Author

Akshay Balsubramani

Introduction

In recent years, it has become increasingly popular to design libraries of drug discovery molecules combinatorially, using well-defined reaction schemas to combine a relatively small set of fragments. Such approaches play well with modern high-throughput assays and AI/ML algorithms, allowing larger swathes of chemical space to be investigated in a broader variety of ways.

This has become increasingly important for LNP discovery, where the primary focus of engineering is the ionizable lipid component. The headgroup contains the ionizable amine that confers pH-sensitive cationic character: because it is selectively protonated only in mildly acidic conditions (pH ∼5–6), such as those inside the endosome, the lipid helps the LNP escape the endosome and deliver its mRNA payload to the cytosol. The structure of the ionizable lipid is therefore critical to the success of the LNP, and ionizable lipid engineering is of growing interest for various uses.

Ionizable lipids (ILs) are typically viewed as modular compositions of head, linker, and tail groups (Jörgensen, Wibel, and Bernkop-Schnürch 2023). This is a useful way to think about the design of ILs, as it allows for the systematic exploration of chemical space by varying each of these components independently, a common principle for designing studies and libraries in the field. The combinatorial nature of this process lends itself well to computational support.

To illustrate this, let’s look at the best-known examples of ionizable lipids: those used in the Pfizer-BioNTech and Moderna mRNA vaccines for COVID-19. These formulations use different ionizable lipids, known as ALC-0315 (Pfizer-BioNTech) and SM-102 (Moderna), both of which are often described in this modular head/linker/tail manner.

By varying these components of the IL, a library of other ILs can be designed. It turns out this strategy is very generally applicable to the design of ILs and beyond. Here are some examples of how to implement these combinatorial chemistry strategies in silico. Approaches like these are the backbone of LNP discovery pipelines, and stand to dramatically aid drug discovery.

Two combinatorial lipid syntheses

The recent paper (Han et al. 2025) is a perfect example of how combinatorial chemistry can be used to design a lipid library. The authors summarize their approach with the following figure, showing how a primary amine head group can be combined with a dialkyl maleate group to form an ionizable lipid.

This is presented as an alternative to a seminal line of papers (Akinc et al. 2008; Whitehead et al. 2014) that share a combinatorial synthesis scheme based on a Michael addition to acrylates, in which an amine acts as the nucleophile and adds across a carbon-carbon double bond. The comprehensive, concrete example we focus on here is (Whitehead et al. 2014), which investigates a combinatorial library of lipidoids with significant head diversity. This library is quite different from the dialkyl maleate library of (Han et al. 2025).

We can compare these libraries in silico.

(a) Han et al. 2025

(b) Akinc et al. 2008, Whitehead et al. 2014

Figure 1: Two synthesis reactions ligating amine heads to tails. The reaction at left is presented as an alternative to the one at right.

Reaction 1: Akinc et al. 2008, Whitehead et al. 2014, etc.

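We start with the earlier reaction, a classic Michael addition, where the amine head group acts as a nucleophile and attacks the double bond of the acrylate tail. We can express this reaction as a reaction SMARTS, a string-based notation that defines the reaction rule in silico: reactant patterns on the left of ">>", product patterns on the right, with atom-map numbers tying the two sides together.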

CODE
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D
from rdkit.Chem import rdChemReactions
import io
from PIL import Image


def draw_reaction(smarts_in):
    rxn = rdChemReactions.ReactionFromSmarts(smarts_in)
    drawer = rdMolDraw2D.MolDraw2DCairo(700,300)
    drawer.DrawReaction(rxn)
    drawer.FinishDrawing()
    return drawer.GetDrawingText()


smarts_michael_addition_2o = "[CX3H2:1]=[CH1:2][C:3](=[O:4])[O:5].[NX3;H1:6][C:7]>>[C:7][NX3;H0:6][CH2:1][CH2:2][C:3](=[O:4])[O:5]"
bio = io.BytesIO(draw_reaction(smarts_michael_addition_2o))
print("SMARTS: {}".format(smarts_michael_addition_2o))
Image.open(bio)
SMARTS: [CX3H2:1]=[CH1:2][C:3](=[O:4])[O:5].[NX3;H1:6][C:7]>>[C:7][NX3;H0:6][CH2:1][CH2:2][C:3](=[O:4])[O:5]

In fact, there is a little more complexity here. We have defined the reaction for secondary amines specifically, and need to define it for primary amines as well. We can do this by defining a second reaction rule, which is identical to the first except in the amine group specification.

It would be more straightforward to define a single reaction rule that captures both primary and secondary amines, but it then becomes more difficult to iteratively apply such a rule to a structure, as we will do for various decomposition purposes.

CODE
smarts_michael_addition_1o = "[CX3H2:1]=[CH1:2][C:3](=[O:4])[O:5].[NX3;H2:6][C:7]>>[C:7][NX3;H1:6][CH2:1][CH2:2][C:3](=[O:4])[O:5]"
bio = io.BytesIO(draw_reaction(smarts_michael_addition_1o))
print("SMARTS: {}".format(smarts_michael_addition_1o))
Image.open(bio)
SMARTS: [CX3H2:1]=[CH1:2][C:3](=[O:4])[O:5].[NX3;H2:6][C:7]>>[C:7][NX3;H1:6][CH2:1][CH2:2][C:3](=[O:4])[O:5]
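To make the point about iterative application concrete, here is a minimal sketch that reuses the two rules above. The helper names (`AMINE_RULES`, `amine_classes`, `first_product`) and the choice of methyl acrylate and propylamine are illustrative assumptions, not part of either paper. It checks which rule applies to a molecule, applies it once, and then re-checks the product.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# The two rules from the text: primary (1o) and secondary (2o) amine additions.
SMARTS_1O = "[CX3H2:1]=[CH1:2][C:3](=[O:4])[O:5].[NX3;H2:6][C:7]>>[C:7][NX3;H1:6][CH2:1][CH2:2][C:3](=[O:4])[O:5]"
SMARTS_2O = "[CX3H2:1]=[CH1:2][C:3](=[O:4])[O:5].[NX3;H1:6][C:7]>>[C:7][NX3;H0:6][CH2:1][CH2:2][C:3](=[O:4])[O:5]"

# Hypothetical lookup table: (amine query as used in each rule, label, rule).
AMINE_RULES = [
    (Chem.MolFromSmarts("[NX3;H2][C]"), "primary", SMARTS_1O),
    (Chem.MolFromSmarts("[NX3;H1][C]"), "secondary", SMARTS_2O),
]


def amine_classes(smiles):
    """Return the labels of the rules whose amine query matches the molecule."""
    mol = Chem.MolFromSmiles(smiles)
    return [label for query, label, _ in AMINE_RULES if mol.HasSubstructMatch(query)]


def first_product(smarts, acrylate, amine):
    """Apply one rule once and return the first product as canonical SMILES."""
    rxn = AllChem.ReactionFromSmarts(smarts)
    reactants = (Chem.MolFromSmiles(acrylate), Chem.MolFromSmiles(amine))
    product = rxn.RunReactants(reactants)[0][0]
    Chem.SanitizeMol(product)  # recompute valences and implicit hydrogens
    return Chem.MolToSmiles(product)


acrylate = "COC(=O)C=C"  # methyl acrylate, chosen only to keep the output short
mono = first_product(SMARTS_1O, acrylate, "CCCN")  # primary amine -> secondary amine
di = first_product(SMARTS_2O, acrylate, mono)      # secondary amine -> tertiary amine

for name, smi in [("propylamine", "CCCN"), ("mono-adduct", mono), ("di-adduct", di)]:
    print(name, smi, amine_classes(smi))
```

Keeping the rules separate means each application consumes exactly one N-H, so the number of tails added is explicit at every step. Note that these amine queries, like the ones in the rules themselves, would also match amide nitrogens.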

Reaction 2: Han et al. 2025

We can express this reaction in SMARTS as well. But because the amine can add to either carbon of the maleate double bond, the product is not uniquely determined, and we need two separate strings, one for each possible attachment point.

CODE
from PIL import Image


smarts_michael_addition_maleate_1 = "[#6:10][O:9][C:7](=[O:8])[CH1:1]=[CH1:2][C:3](=[O:4])[O:5][#6:11].[NX3;H1:6][C:61]>>[#6:10][O:9][C:7](=[O:8])[CH1:1]([NX3;H0:6][C:61])[CH2:2][C:3](=[O:4])[O:5][#6:11]"
smarts_michael_addition_maleate_2 = "[#6:10][O:9][C:7](=[O:8])[CH1:1]=[CH1:2][C:3](=[O:4])[O:5][#6:11].[NX3;H1:6][C:61]>>[#6:10][O:9][C:7](=[O:8])[CH2:1][CH1:2]([NX3;H0:6][C:61])[C:3](=[O:4])[O:5][#6:11]"
CODE
bio_1 = io.BytesIO(draw_reaction(smarts_michael_addition_maleate_1))
print("SMARTS: {}".format(smarts_michael_addition_maleate_1))
Image.open(bio_1)
SMARTS: [#6:10][O:9][C:7](=[O:8])[CH1:1]=[CH1:2][C:3](=[O:4])[O:5][#6:11].[NX3;H1:6][C:61]>>[#6:10][O:9][C:7](=[O:8])[CH1:1]([NX3;H0:6][C:61])[CH2:2][C:3](=[O:4])[O:5][#6:11]

CODE
bio_2 = io.BytesIO(draw_reaction(smarts_michael_addition_maleate_2))
print("SMARTS: {}".format(smarts_michael_addition_maleate_2))
Image.open(bio_2)
SMARTS: [#6:10][O:9][C:7](=[O:8])[CH1:1]=[CH1:2][C:3](=[O:4])[O:5][#6:11].[NX3;H1:6][C:61]>>[#6:10][O:9][C:7](=[O:8])[CH2:1][CH1:2]([NX3;H0:6][C:61])[C:3](=[O:4])[O:5][#6:11]
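When do the two attachment points actually give different molecules? A rough sketch, under the assumption that we simply pool the products of both rules and deduplicate them by canonical SMILES, is below. The helper `unique_products` and the three example molecules (written without double-bond stereochemistry for brevity) are illustrative choices, not taken from the paper.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Shared reactant side of the two maleate rules from the text.
REACTANTS = ("[#6:10][O:9][C:7](=[O:8])[CH1:1]=[CH1:2][C:3](=[O:4])[O:5][#6:11]"
             ".[NX3;H1:6][C:61]")
MALEATE_1 = REACTANTS + (">>[#6:10][O:9][C:7](=[O:8])[CH1:1]([NX3;H0:6][C:61])"
                         "[CH2:2][C:3](=[O:4])[O:5][#6:11]")
MALEATE_2 = REACTANTS + (">>[#6:10][O:9][C:7](=[O:8])[CH2:1][CH1:2]"
                         "([NX3;H0:6][C:61])[C:3](=[O:4])[O:5][#6:11]")


def unique_products(smarts_list, maleate_smiles, amine_smiles):
    """Pool products over all rules and all matches; dedupe by canonical SMILES."""
    reactants = (Chem.MolFromSmiles(maleate_smiles), Chem.MolFromSmiles(amine_smiles))
    found = set()
    for smarts in smarts_list:
        rxn = AllChem.ReactionFromSmarts(smarts)
        for product_set in rxn.RunReactants(reactants):
            for product in product_set:
                Chem.SanitizeMol(product)
                found.add(Chem.MolToSmiles(product))
    return sorted(found)


# Illustrative inputs (assumptions): dibutyl maleate, a butyl/octyl mixed
# diester, and diethylamine as a simple secondary amine.
symmetric = "CCCCOC(=O)C=CC(=O)OCCCC"
asymmetric = "CCCCOC(=O)C=CC(=O)OCCCCCCCC"
amine = "CCNCC"

print(len(unique_products([MALEATE_1, MALEATE_2], symmetric, amine)), "product(s), symmetric tails")
print(len(unique_products([MALEATE_1, MALEATE_2], asymmetric, amine)), "product(s), asymmetric tails")
```

With identical tails the two attachment points are equivalent by symmetry and collapse to a single structure; with two different tails they are distinct regioisomers, which is where the second rule matters.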

To proceed further, we need to examine the structures in the libraries themselves.

  • Taking a look at each library, we can clearly demonstrate how to deal with these combinatorics in silico.

  • Since the two reactions are analogous and both involve an amine head group reactant, we can create new libraries, unexplored in the individual papers, by combining fragments from one with fragments from the other where applicable - a “Frankenstein” approach.
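As a sketch of the “Frankenstein” idea, the snippet below reacts two amine heads of the kind found in the Han et al. library with two acrylate tails of the kind found in the Whitehead et al. library, using the secondary-amine Michael addition rule defined earlier. The particular four fragments are a small subset picked for illustration, and only the first product of each match is kept.

```python
import itertools

from rdkit import Chem
from rdkit.Chem import AllChem

# Secondary-amine Michael addition rule from earlier in the text.
SMARTS_2O = "[CX3H2:1]=[CH1:2][C:3](=[O:4])[O:5].[NX3;H1:6][C:7]>>[C:7][NX3;H0:6][CH2:1][CH2:2][C:3](=[O:4])[O:5]"

# Illustrative subset (assumption): two secondary-amine heads, two acrylate tails.
han_heads = ["CNCCO", "CN1CCNCC1"]
whitehead_tails = ["C" * 10 + "OC(=O)C=C", "C" * 12 + "OC(=O)C=C"]

rxn = AllChem.ReactionFromSmarts(SMARTS_2O)
hybrids = set()
for tail, head in itertools.product(whitehead_tails, han_heads):
    reactants = (Chem.MolFromSmiles(tail), Chem.MolFromSmiles(head))
    for product_set in rxn.RunReactants(reactants):
        mol = product_set[0]
        Chem.SanitizeMol(mol)
        # Equivalent matches give the same molecule; the set removes them.
        hybrids.add(Chem.MolToSmiles(mol))

for smi in sorted(hybrids):
    print(smi)
```

Each head here has a single reactive N-H, so the four head/tail pairs give four distinct hybrid lipidoids. Heads with several reactive sites would need a decision about which sites, and how many, to alkylate.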

Library 1: Michael addition to acrylates

As stated earlier, these are from a series of papers including (Akinc et al. 2008) and (Whitehead et al. 2014).

Amine head groups

CODE
import mols2grid

frags_whitehead_amines = ['NCCCCCO', 'CCNCCCNCC', 'CNCCOCCNC', 'CN(C)CCCN', 'CCN(CC)CCCN', 'NCCN(CCO)CCO', 'NCCN', 'NCCNCCO', 'NCCN(CCN)CCN', 'CN(CCN)CCN', 'CCN(CC)CCNCCN', 'N/C=C1\\\\CCCN1', 'NCCNCCNCCNCCNCCN', 'NCCCOCCOCCOCCCN', 'NCCOCCO', 'CC(C)C[C@H](N)CO', 'NCCNCCN1CCN(CCN)CC1', 'NCCN1CCNCC1', 'CC(N)CNCC(C)N', 'CCN(CCCNC)CCCNC', 'CN(C)CCCNCCCN', 'CNCCN(CCNC)CCNC', 'CC(CN)N', 'CN(CCCN)CCCN', 'C(CN)CN1CCN(CCCN)CC1', 'C1=CC(=CC(=C1)CN)CN', 'C1=CC2=C(C=C1N)OCCO2', 'COCCCN', 'C1COC(O1)CCN', 'C1OC(C)(C)OC1CN', 'COCCN', 'CCOCCN', 'COC(OC)CN', 'C1CC(OC1)CN', 'C(CO)CN', 'C[C@@H](O)CN', 'C[C@H](O)CCN', 'CC(CO)(CO)N', 'C(CCO)CN', 'C(COCCO)N', 'C(O)C(C)(C)CN', 'CCC(CO)(CO)N', 'C(CCCO)CCN', 'C1C[C@H](O)CC[C@H]1N', 'N(C)CCNC', 'N(CC)CCNCC', 'N(C(C)C)CCNC(C)C', 'N(CC)CCCNCC', 'N(C)CCOCCOCCNC', 'N1CCCNCC1', 'CCCN(CC)CN', 'C1CCN(C1)CCN', 'CN(C)CCN', 'CC(CN(C)C)N', 'C1CCN(CC1)CCN', 'C1COCCN1CCN', 'C1COCCN1CCCN', 'C1=CN(C=N1)CCCN', 'CNCCN', 'CNCCCN', 'CCCNCCN', 'C(CNCCNCCN)N', 'C(CN)CN', 'C(CCN)CN', 'CCCCNCCN', 'N(CCNCCO)CCO', 'CCCCCCCCCCCCCN', 'C(CN)CNCCN', 'CCCNCCCN', 'C(CCNCCCN)CNCCCN', 'CC(C)N(C(C)C)CCN', 'C(CNCCNCCNCCN)N', 'C(CN)CNCCN', 'C(CNCCN)N', 'CCCCCCCCCCCCN(CCN)CCN', 'C1CC[C@@H]([C@H](C1)N)N', 'C1CC[C@@H]([C@@H](C1)N)N', 'C1CCN(CC1)N', 'C1C=CC[C@H]([C@@H]1N)N', 'C(CNCCN)CNCCN', 'CCNCCN', 'C=CCCN', 'C1CNC[C@H]1N', 'C1CC(CC(C1)N)N', 'C1CNCCC1N', 'N1CCNCC1', 'N1CCNC[C@H]1C', 'CC1CCC(CC1)CCN', 'CCC1CCC(CC1)CCN', 'N1CCNCC1c1ccccc1', 'N1CCCNCCCNCCC1', 'N1CCNCCNCCNCC1', 'N1CCCNCCNCCNCC1', 'N1CCCNCCNCCCNCC1', 'N1(C)CCCNCCN(C)CCCNCC1', 'N1CCCNCCCNCCCNCC1', 'N1CCNCCNCCOCC1', 'C1C[C@H](CC[C@@H]1N)N', 'C1CC(CC(C1)CN)CN', 'C1=CC(=CC(=C1)N)N', 'C1=CC(=CC=C1N)N', 'CC1=CC=CC(=C1N)N', 'CC1=C(C=C(C=C1)N)N', 'CC1=C(C=CC=C1N)N', 'CC1=CC(=C(C=C1)N)N', 'C1=CC2=C(C(=C1)N)C(=CC=C2)N', 'N1C(C)NC(C)NC1C', 'C1(=NC(=NC(=N1)N)N)N', 'C1CC(CCC1CN)CN', 'C(CCCCN)CCCN', 'C(COCCOCCN)N', 'C(COCCOCCOCCN)N', 'C(CCOCCCN)COCCCN', 'COCCCN', 'N(C)CCOCCOCCNC', 'C(CO)N', 'N(CCNCCO)CCO', 'CCCN', 'CCCCN', 
'CCCCCN', 'CCCCCCN', 'CNCCCCCCNC', 'CCCNCCCN', 'C(CN)CNCCCN', 'C(CCCNCCCCCCN)CCN', 'C(CN)CNCCCNCCCN', 'CC(C)[NH]', 'CC(C)(C)[NH]', 'CCC(CC)N', 'CCC(C)(C)N', 'CC(C)CN', 'CCC(C)N', 'CC(C)C(C)N', 'CCCC(C)N', 'CC(C)CCN', 'CC(C)(C)CCN', 'CC(C)(C)CC(C)(C)N', 'CCC(C)CN', 'CCCCC(CC)CN', 'CC(CO)N', 'CC(C)C[C@@H](CO)N', 'CC(C)(C)C(CO)N', 'C(CO)(CO)(CO)N', 'CC(C)OCCCN', 'N(C(C)(C)C)CCNC(C)(C)C', 'C(CCN)CCN', 'C(CCCN)CCN', 'C(CCCCN)CCN', 'C(CCCCCN)CCN', 'C(CCCCCCCN)CCN', 'C(CCCCCCCCCN)CCN', 'CC(CC(C)(C)CCN)CN', 'C(CCCN(C)Cc1ccccc1)N', 'CN1CCN(CC1)CCCCN', 'CCN1CCC(CC1)N', 'CN1CCN(CC1)CCN', 'CN1CCOCC1CN', 'CCCN1CCC(CC1)N', 'C(CN(C)Cc1ccccc1)N', 'CN(C)CCOCCN', 'C1CCN(CC1)CCOCCN', 'C1CCN(C1)CCOCCN', 'CN(C1=CC=CC=C1)C(=O)CN', 'CN1C=CN=C1CN', 'CN1CCCC1CN', 'CN1CCCC1C(=O)N', 'CN1CCC(CC1)CN', 'CN1CCC(CC1)N', 'CC(C)N1CCN(CC1)CCCN', 'C1CN(CCN1)C(=O)CCN', 'C(C)(CN(C)C)(C)CN', 'CC(C)NCCN', 'C1[CH]C1N', 'C1CC(C1)N', 'C1CCC(C1)N', 'CCN1CCC[C@H]1CN', 'CCN1CCC[C@@H]1CN', 'C1CCC(CC1)N', 'C1CCC(CC1)CN', 'CC(C)(CN(C)C)CN', 'C1CC2CC1CC2N', 'C1CCCC(CCC1)N', 'C1C2CC3CC1CC(C2)(C3)N', 'C1C2CC3CC1CC(C2)(C3)CN', 'C1=CC=C(C=C1)CCCN', 'C1=CC=C(C=C1)CCCCN', 'CC(C)(C)C1=CC=C(C=C1)N', 'CC(C)(C)C1=CC=CC=C1N', 'CCC1=C(C(=CC=C1)CC)N', 'COC1=CC=CC=C1N', 'COC1=CC=CC=C1CN', 'CC1=CC=C(C=C1)C(C)N', 'CC(C)C1=CC=CC=C1N', 'CC(C)C1=CC=C(C=C1)N', 'CC(C)OC1=CC=CC(=C1)N', 'COC1=CC=C(C=C1)CCN', 'CCOC1=C(C=CC=C1)N', 'CCCCCCCCOC1=CC=C(C=C1)N', 'CCOC1=CC=CC=C1CN', 'CC(C)(C)C1=CC(=C(C=C1)C(C)(C)C)N', 'COC1=CC(=C(C=C1)OC)N', 'CC(C)(C)c1ccc(OC)c(c1)N', 'C(Cc1cc(OC)c(OC)cc1)N', 'C(c1cc(C)c(c(OC)c1)OC)N', 'C1=CC=C(C=C1)NCCN', 'C1=CC=C(C=C1)CNCCN', 'CNC1=CC=CC=C1N', 'C1=CC(=CC(=C1)N)CN', 'Cc1c(N)c(C)c(C)c(c1C)N', 'C1=CC=C(C(=C1)CN)N', 'CC1=CC(=C(C=C1C)N)N', 'c1(F)cc(ccc1)N', 'c1(F)ccc(cc1)N', 'C1=CC(=C(C=C1)F)CN', 'C(c1cc(F)ccc1)N', 'C1=CC(=C(C=C1)F)CCN', 'c1(cc(F)ccc1F)N', 'C1=C(C=C(C(=C1)F)F)N', 'C1=C(C=C(C=C1F)F)N', 'C(c1cc(F)ccc1F)N', 'C1=C(C=C(C(=C1F)F)F)N', 'c1(F)cc(F)c(c(F)c1)N', 'C(F)(F)(F)c1cc(ccc1)N', 
'C(F)(F)(F)c1c(ccc(c1)F)N', 'CC1=CC=C(C=C1F)N', 'COC1=CC=C(C=C1F)N', 'c1(OC(F)F)ccc(cc1)N', 'C(F)(F)(F)Oc1ccc(cc1)N', 'CC1CC(CCC1N)CC2CCC(C(C2)C)N', 'N1C2NCCNC2NCC1', 'C1CC(C2=C1C=CC=C2)N', 'C1=CC2=C(C=C1)CC(C2)N', 'C1CC2=C(C1)C(=CC=C2)N', 'C1OC2=CC=C(C=C2O1)N', 'C1CCC2=C(C1)C=CC(=C2)N', 'C1C2=C(C=CC(=C2)N)C3=C1C=C(C=C3)N', 'c1(cc2cc3ccccc3cc2cc1)N', 'C1=CC=C2C(=C1)C3=C(C=CC=C3)C(=C2N)N', 'C1=CC=C(C=C1)C2=CC(=CC=C2)N', 'C1=CC=C(C=C1)C2=CC=C(C=C2)N', 'Cc1c(N)ccc(c1)-c1ccc(c(C)c1)N', 'C1=C(C=C(C(=C1)N)N)C2=CC=C(C(=C2)N)N', 'Cc1c(N)c(C)cc(c1)-c1cc(C)c(c(c1)C)N', 'C1=CC=C(C=C1)CC2=CC=CC=C2N', 'C1=CC(=CC=C1CC2=CC=C(C=C2)N)N', 'CNC1=CC=C(C=C1)CC2=CC=C(C=C2)N', 'C1=CC=C(C=C1)C(C2=CC=CC=C2)N', 'C(CC(c1ccccc1)c1ccccc1)N', 'C1=CC=C(C=C1)OC2=CC=CC=C2N', 'C1=CC=C(C=C1)OC2=CC=C(C=C2)N', 'c1(Oc2ccc(F)cc2)ccc(cc1)N', 'C1=CC(=CC=C1N)OC2=CC=C(C=C2)N', 'C1=CC=C(C=C1)COC2=CC(=CC=C2)N', 'C1=CC=C(C=C1)NC2=CC=C(C=C2)N', 'C1=CC=C(C=C1)NC2=CC=CC=C2N', 'C1=CC=C(C=C1)[C@H]([C@H](C2=CC=CC=C2)N)N', 'C1=CC=C(C=C1)C(C2=CC=CC=C2)(C3=CC=CC=C3)N', 'Nc1c(cc(cc1-c1ccccc1)-c1ccccc1)-c1ccccc1', 'C1=CC=C2C(=C1)C=CC(=C2C3=C(C=CC4=CC=CC=C43)N)N', 'N(c1ccccc1)c1ccc(cc1)-c1ccc(cc1)Nc1ccccc1', 'C1(c2c(cccc2)-c2ccccc21)(c1ccc(N)cc1)c1ccc(cc1)N', 'CC(NCCN(CCNC(C)C)CCNC(C)C)C']

print("{} heads from Whitehead et al. 2014 library.".format(len(frags_whitehead_amines)))
mols2grid.display([Chem.MolFromSmiles(x) for x in frags_whitehead_amines],mol_col="mol", n_cols=7, n_rows=4)
262 heads from Whitehead et al. 2014 library.

Acrylate tail fragments of different lengths

Each of these heads is combined with acrylate tails of different lengths, which form the hydrophobic part of the lipidoid. The length of the tail is an important design parameter, as it affects the ability of the lipidoid to form nanoparticles with mRNA, but precisely how tail length exerts this effect is not well understood.

So the goal of many studies like this is to systematically try out different tail lengths in conjunction with the other design parameters, to see which combinations work best. In silico capabilities can really help with this type of search.

CODE
frags_whitehead_acrylate_tails = [
    'CCCCCCCCCCOC(=O)C=C', 
    'CCCCCCCCCCCOC(=O)C=C', 
    'CCCCCCCCCCCCOC(=O)C=C', 
    'CCCCCCCCCCCCCOC(=O)C=C', 
    'CCCCCCCCCCCCCCOC(=O)C=C'
]

print("{} tails from Whitehead et al. 2014 library.".format(len(frags_whitehead_acrylate_tails)))
mols2grid.display([Chem.MolFromSmiles(x) for x in frags_whitehead_acrylate_tails],mol_col="mol", n_cols=7, n_rows=4)
5 tails from Whitehead et al. 2014 library.
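Because these tails differ only in chain length, they can also be generated rather than typed out, which makes it easy to widen the search later. The sketch below is plain Python; the helper `acrylate_tail`, the chain-length range of 10 to 14 (read off the SMILES above), and the three example heads are assumptions for illustration.

```python
import itertools

ACRYLATE = "OC(=O)C=C"  # acrylate ester group, written as in the tails above


def acrylate_tail(n_carbons):
    """SMILES for a linear alkyl acrylate with n_carbons in the alkyl chain."""
    if n_carbons < 1:
        raise ValueError("the alkyl chain needs at least one carbon")
    return "C" * n_carbons + ACRYLATE


# Chain lengths 10 through 14 (assumed range, matching the five tails listed).
tails = [acrylate_tail(n) for n in range(10, 15)]

# A few heads taken from the amine list above, just to show the pairing.
heads = ["NCCN", "CNCCN", "CCCN"]

# Every (tail, head) pair is one candidate reaction to run.
pairs = list(itertools.product(tails, heads))
print("{} tails x {} heads = {} candidate pairs".format(len(tails), len(heads), len(pairs)))
```

At full scale, the 262 heads and 5 tails printed above give 262 × 5 = 1310 head/tail pairs before any reaction rule decides which of them can actually react.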

In order to run the combinatorial synthesis, we need to use the reaction SMARTS we defined above. This task – running a given reaction – is general enough to be well worth wrapping separately. We define it below.

CODE
from rdkit import Chem
from rdkit.Chem import AllChem

def run_reaction(
    smarts_rxn, 
    reactants
):
    """Run a chemical reaction on a set of reactants.

    Parameters
    ----------
    smarts_rxn : :obj:`str`
        SMARTS string of the reaction.
    reactants : :obj:`list` of :obj:`str`
        List of SMILES strings of the reactants.

    Returns
    -------
    :obj:`list` of :obj:`str` : List of SMILES strings of the products.
    """
    rxn = AllChem.ReactionFromSmarts(smarts_rxn)
    # RunReactants expects a tuple of molecules, one per reactant template.
    arglist = [Chem.MolFromSmiles(arg) for arg in reactants]
    product_sets = rxn.RunReactants(tuple(arglist))
    toret = []
    # Each product set corresponds to one way of matching the reactants.
    # Only the first match is kept, so a reactant with several reactive
    # sites contributes a single product here.
    for product_set in product_sets[:1]:
        for product in product_set:
            # Sanitize every product before writing it out as SMILES.
            Chem.SanitizeMol(product)
            smi = Chem.CanonSmiles(
                Chem.MolToSmiles(Chem.RemoveHs(product))
            )
            toret.append(smi)
    return toret

Running a reaction is then as simple as calling the function with the reactants and the reaction SMARTS.

CODE
import numpy as np

list_flatten = lambda lst: [item for sublist in lst for item in sublist]
prod_lst = []
reactant_lst = []
for tail_frag in frags_whitehead_acrylate_tails:
    for head_frag in frags_whitehead_amines:
        prods = run_reaction(
            smarts_michael_addition_2o, 
            [tail_frag, head_frag]
        )
        prod_lst.append(prods)
        reactant_lst.append(['.'.join([tail_frag, head_frag])]*len(prods))
comb_products = np.unique(list_flatten(prod_lst))
comb_reactants = np.unique(list_flatten(reactant_lst))
print("{} products from Whitehead et al. 2014 library.".format(len(comb_products)))
mols2grid.display([Chem.MolFromSmiles(x) for x in comb_products],mol_col="mol", 
                  n_cols=7, n_rows=4)
280 products from Whitehead et al. 2014 library.

Library 2: Michael addition to dialkyl maleates

As stated earlier, these are from the paper (Han et al. 2025).

Amine head groups

CODE
frags_han_amines = [
    'NCCN', 'NCCCN', 'NCCCCCCN', 'CNCCNC', 'CN(C)CCCNCCCN', 'CN(CCN)CCN', 'CNCCN(C)CCNC', 'CN(CCCN)CCCN', 
    'CNCCCN(C)CCCNC', 'NCCNCCO', 'OCCNCCNCCO', 'NCCCN1CCN(CCCN)CC1', 'NCCOCCOCCN', 'NCCCCC1NC(=O)C(CCCCN)NC1=O', 'NCCNCCN', 
    'NCCCN(CCCN)CCCN', 'NCCN(CCN)CCN', 'NCCNCCN1CCN(CCN)CC1', 'NCCCN(CCCN)CCCCN(CCCN)CCCN', 'NCCNC(=O)CCN(CCC(=O)NCCN)CCN(CCC(=O)NCCN)CCC(=O)NCCN', 
    'C[N]C', 'CCNCC', 'NCCO', 'CNCCO', 'NCCCCO', 'CNCCCCO', 'CN(C)CCCN', 'CNCCCN(C)C', 'CCN(CC)CCN', 'CCN(CC)CCNC', 'CN(C)[C@H]1CCNC1', 
    'OCCN1CCNCC1', 'CCN(CC)CCCN', 'CCN(CC)CCCNC', 'NCCN1CCCC1', 'CNCCN1CCCC1', 'NCCCN1CCCC1', 'CNCCCN1CCCC1', 'CN(C)CCN', 
    'CNCCN(C)C', 'CN1CCN(CCCN)CC1', 'CN1CCCNCC1', 'CN1CCNCC1', 'CN(C)C1CCNCC1', 'CCN(CC)C1CCNCC1', 'CCN(CC)C1CCNC1', 'CCN(CC)CCN1CCNCC1'
]
frags_han_amines = np.unique(frags_han_amines)

print("{} heads from Han et al. 2025 library.".format(len(frags_han_amines)))
mols2grid.display([Chem.MolFromSmiles(x) for x in frags_han_amines],mol_col="mol", n_cols=7, n_rows=4)
47 heads from Han et al. 2025 library.

Alkyl tails from dialkyl maleates

CODE
frags_han_maleates = [
    # Raw strings keep the SMILES backslash (cis/trans bond marker) from being
    # read as a Python escape sequence.
    r'CCCCOC(=O)/C=C\C(=O)OCCCC', 
    r'CC.CCCCC(CC)COC(=O)/C=C\C(=O)OCC(CC)CCCC', 
    r'CCCCCCCCOC(=O)/C=C\C(=O)OCCCCCCCC', 
    r'CC(C)CCCCCOC(=O)/C=C\C(=O)OCCCCCC(C)C', 
    r'CCCCC/C=C/C/C=C/CCCCCCCCOC(=O)/C=C\C(=O)OCC(CC)CCCC', 
    r'CCCCCCCCC(CC)OC(=O)/C=C\C(=O)OCC(CC)CCCC', 
    r'CCCCCCCCC(CCCCCCCC)OC(=O)/C=C\C(=O)OCC(CC)CCCC', 
    r'CCCCCCCCCCC(C)OC(=O)/C=C\C(=O)OCC(CC)CCCC', 
    r'CCCCCCCCCCC(CCCCCCCCCC)OC(=O)/C=C\C(=O)OCC(CC)CCCC', 
    r'CCCCCCCCCCC(CCCCCCCC)COC(=O)/C=C\C(=O)OCC(CC)CCCC'
]

print("{} tails from Han et al. 2025 library.".format(len(frags_han_maleates)))
mols2grid.display([Chem.MolFromSmiles(x) for x in frags_han_maleates],mol_col="mol", n_cols=7, n_rows=4)
10 tails from Han et al. 2025 library.
CODE
prod_lst = []
reactant_lst = []
for tail_frag in frags_han_maleates:
    for head_frag in frags_han_amines:
        prods = run_reaction(
            smarts_michael_addition_maleate_1, 
            [tail_frag, head_frag]
        )
        prod_lst.append(prods)
        reactant_lst.append(['.'.join([tail_frag, head_frag])]*len(prods))
comb_products = np.unique(list_flatten(prod_lst))
comb_reactants = np.unique(list_flatten(reactant_lst))
print("{} products from Han et al. 2025 library.".format(len(comb_products)))
mols2grid.display([Chem.MolFromSmiles(x) for x in comb_products],mol_col="mol", 
                  n_cols=7, n_rows=4)
270 products from Han et al. 2025 library.

References

Akinc, Akin, Andreas Zumbuehl, Michael Goldberg, Elizaveta S Leshchiner, Valentina Busini, Naushad Hossain, Sergio A Bacallado, et al. 2008. “A Combinatorial Library of Lipid-Like Materials for Delivery of RNAi Therapeutics.” Nature Biotechnology 26 (5): 561–69.
Han, Xuexiang, Ying Xu, Adele Ricciardi, Junchao Xu, Rohan Palanki, Vivek Chowdhary, Lulu Xue, et al. 2025. “Plug-and-Play Assembly of Biodegradable Ionizable Lipids for Potent mRNA Delivery and Gene Editing in Vivo.” bioRxiv, 2025–02.
Jörgensen, Arne Matteo, Richard Wibel, and Andreas Bernkop-Schnürch. 2023. “Biodegradable Cationic and Ionizable Cationic Lipids: A Roadmap for Safer Pharmaceutical Excipients.” Small 19 (17): 2206968.
Whitehead, Kathryn A, J Robert Dorkin, Arturo J Vegas, Philip H Chang, Omid Veiseh, Jonathan Matthews, Owen S Fenton, et al. 2014. “Degradable Lipid Nanoparticles with Predictable in Vivo siRNA Delivery Activity.” Nature Communications 5 (1): 4277.