CONJUGATED MOLECULES

Field of the Invention

This invention relates to conjugated molecules such as antibody drug conjugates. In particular, the invention relates to methods of identifying multivalent antibody conjugates, for example multispecific antibody conjugates such as bispecific antibody drug conjugates and bispecific antibody dye conjugates, and the conjugated multivalent molecules that are identified by the method. Intermediates, kits and uses are also provided.

Background of the Invention

Antibody drug conjugates (ADCs) are therapeutic agents comprising an antibody (or antigen binding fragment) and a drug. The drug is often referred to as the payload. The drug and antibody are usually connected by a linker. Linkers are usually based on chemical motifs including disulfides, hydrazones or peptides (cleavable), or thioethers (noncleavable) and control the distribution and delivery of the cytotoxic agent to the target cell. Over a dozen ADCs have now been approved as cancer treatments, with many more in preclinical and clinical development. More generally, drug conjugates can also be derived from non-antibody sources, such as binding proteins or peptides, for example cytokines or antibody mimetics such as affibodies.

Another useful conjugate is an antibody dye conjugate, wherein an antibody is labelled to permit easy detection such as in a diagnostic setting or during preclinical drug discovery.

Bispecific antibodies are a common engineered antibody format, wherein a single antibody-like molecule is able to bind two different targets. For example, blinatumomab targets CD19 and CD3 and was approved by the US FDA in 2014 for treating B cell acute lymphoblastic leukaemia. Another approved therapeutic bispecific antibody is emicizumab, which binds to clotting factors IXa and X and is used in the treatment of hemophilia A. Bispecific antibodies, ADCs and bispecific ADCs are reviewed by

Mar; 10(3): 360.
Bispecific ADCs (BsADCs) are also discussed at (accessed on 6 June 2023), which includes in Figure 1 a discussion of fragment expression, split intein trans-splicing, then drug conjugation to reactivated BsADCs.

Duckworth et al (bioRxiv 2022.07.17.500350; doi: https://doi.org/10.1101/2022.07.17.500350) describe the development of a bispecific Antibody–Drug Conjugate (BsADC) targeting CD7 and CD33 to treat Acute Myeloid Leukaemia, by screening different bispecific ADC Fab assemblies targeting CD7 and CD33 against several cell lines. The bispecific constructs were created by linking two Fab fragments by click chemistry and attaching the toxic payload to the linked Fab fragments. The authors demonstrate that the bispecific ADCs are cytotoxic to AML cells in vitro and are more selective than single-antigen targeting ADCs in discriminating tumour from healthy cells (either myeloid or lymphoid).

Siegmund et al (December 2016 Scientific Reports 6(1):39291) note that spontaneous isopeptide bond formation has been used to develop a peptide-peptide ligation technology that enables the polymerisation of tagged proteins catalyzed by SpyLigase. The authors adapted this technology to establish a modular antibody labeling approach which is based on isopeptide bond formation between two recognition peptides, SpyTag and KTag. The labeling strategy allows the attachment of a reporting cargo of interest to an antibody scaffold by fusing it chemically to KTag, available via semi-automated solid-phase peptide synthesis (SPPS), while equipping the antibody with SpyTag. The authors report that this strategy was successfully used to engineer site-specific antibody-drug conjugates (ADCs) that exhibit cytotoxicities in the subnanomolar range. A SpyTag-based system was also used by Kausar Alam et al (Molecular Imaging and Biology volume 21, pages 54–66 (2019)), who describe the site-specific fluorescent labelling of antibodies and diabodies using the SpyTag/SpyCatcher system for in vivo optical imaging.
WO-A-2022/200804 describes polypeptides comprising a first binding domain at the N terminus and a second binding domain at the C terminus, wherein the first and second binding domains are separated by a structural domain, and wherein the first binding domain and the second binding domain are catcher domains that are each able to form an isopeptide linkage with a cognate peptide. The first catcher domain is linked to its cognate peptide tag by an isopeptide bond and the second catcher domain is linked to its cognate peptide tag by an isopeptide bond. Each peptide tag is typically attached to an antigen binding domain.

WO-A-2024/069180 describes multivalent protein scaffolds useful as therapeutics, and useful in identifying new therapeutic compounds. The invention described therein also relates to multi-domain polypeptide constructs having multiple binding domains and a structural domain. Also described are methods of using the provided multivalent protein scaffolds to identify new candidate therapeutics, and new therapeutics thereby identified. One described aspect provides a polypeptide comprising a first binding domain at the N terminus and a second binding domain at the C terminus, wherein the first and second binding domains are separated by a structural domain. The first and second binding domains may be the same or different. In some embodiments, the first binding domain and second binding domain are different antigen-binding domains. The construct is then a bispecific construct. The structural domain may comprise or consist of a CutA1 protein. The CutA1 structural domain may be human CutA1. The CutA1 structural domain may contain one or more substitutions or deletions relative to wild-type CutA1. In other aspects, the structural domain comprises or consists of a cytokine, for example from the TNF superfamily such as TNF, OX40L, CD40L or TL1A.
The huge potential of bispecific antibodies and other multivalent and multispecific antibody formats is only beginning to be explored from a clinical perspective, while bispecific-Drug Conjugates (e.g. ADCs) and multispecific-Drug Conjugates (e.g. ADCs) are an even newer area of research. There is a great need for improvements in the identification and development of multivalent, bispecific and multispecific antibody molecules (and other targeting protein domains) conjugated to therapeutic or diagnostic molecules.

Summary of the Invention

The invention relates generally to improvements in the identification and development of conjugated molecules such as antibody conjugates and related antibody-based conjugates, in particular multivalent antibody conjugates such as multispecific antibody conjugates, for example bispecific antibody drug conjugates or bispecific antibody dye conjugates.

The invention is based, in part, on the realisation that a scalable, combinatorial system can be provided by a “conjugation first” approach that performs the conjugation step before combining the specificities.

The existing technologies for generating conjugated molecules such as ADCs typically comprise a first step of generating the antibody before a second step of conjugating the molecule of interest (e.g. payload, drug or dye) to the antibody. In contrast, the inventors have realised that much greater scale and throughput can be achieved by first conjugating the molecule of interest to part of the antibody before forming the complete antibody. This approach is further compatible with storage of the pre-conjugated part of the construct (e.g. antibody), facilitating pre-conjugation in bulk followed by a multitude of smaller-scale reactions to form a set of complete constructs, e.g. antibodies. This has particular advantages in screening and identifying useful multivalent-conjugate molecules such as bispecific ADCs (BsADCs).
The quadratic nature of bispecific generation means that the approach described herein can provide improvements of multiple orders of magnitude, with further gains if a trispecific antibody or higher-order specificity is provided.

In a first aspect, the invention provides a method of preparing a multivalent binding polypeptide conjugated to a molecule of interest, wherein the method comprises combining a first polypeptide that is conjugated to a molecule of interest, with at least one other polypeptide to form a multivalent binding polypeptide conjugated to the molecule of interest.

In some embodiments, the first polypeptide that is conjugated to a molecule of interest is combined with at least two other polypeptides, or at least three other polypeptides, or at least four other polypeptides, to form the multivalent binding polypeptide conjugated to the molecule of interest. The combination of the first polypeptide with the other polypeptide or polypeptides may be covalent or non-covalent.

The method of preparing a multivalent binding polypeptide conjugated to a molecule of interest may, in some embodiments, comprise the steps of:
(a) conjugating the molecule of interest to a first polypeptide of the multivalent binding polypeptide; and
(b) combining the conjugated first polypeptide of the multivalent binding polypeptide that results from step (a), with the at least one other polypeptide of the multivalent binding polypeptide, thereby forming the multivalent binding polypeptide conjugated to the molecule of interest.
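As a rough illustration of the quadratic scaling noted above, the following Python sketch counts the distinct constructs obtainable from a panel of binders and compares the number of conjugation reactions needed under each approach. It rests on assumptions that are not taken from the specification: binder order within a construct is ignored, each binder is used at most once per construct, the conventional route needs one conjugation reaction per assembled construct, and the conjugation-first route needs a single bulk conjugation of a shared first polypeptide. The function names are hypothetical.

```python
from math import comb


def distinct_constructs(n_binders: int, specificities: int = 2) -> int:
    """Count unordered combinations of binders.

    Assumption: binder order is ignored and no binder is repeated.
    """
    return comb(n_binders, specificities)


def conjugation_reactions(n_binders: int, specificities: int = 2,
                          conjugation_first: bool = True) -> int:
    """Conjugation reactions needed to screen every construct.

    Assumption: the conjugation-first route uses one bulk conjugation of a
    shared first polypeptide, whereas the conventional route conjugates
    each assembled construct separately.
    """
    if conjugation_first:
        return 1
    return distinct_constructs(n_binders, specificities)


if __name__ == "__main__":
    for n in (10, 100):
        print(n, distinct_constructs(n, 2), distinct_constructs(n, 3))
```

Under these assumptions, a panel of 100 binders gives 4,950 bispecific pairs and 161,700 trispecific triples, each of which would need its own conjugation reaction in the conventional route.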
The method of preparing a multivalent binding polypeptide conjugated to a molecule of interest may, in some embodiments, comprise the steps of:
(a) conjugating the molecule of interest to a first polypeptide of the multivalent binding polypeptide; and
(b) combining the conjugated first polypeptide of the multivalent binding polypeptide that results from step (a), with the at least two other polypeptides of the multivalent binding polypeptide, thereby forming the multivalent binding polypeptide conjugated to the molecule of interest.

The molecule of interest typically comprises or consists of a drug, a label or a dye. In some embodiments, the molecule of interest is not a protein. In many drug conjugate embodiments, the drug may be a non-protein drug, for example a small molecule drug such as a chemotherapeutic. Cytotoxic drugs, typically cytotoxic small molecule drugs, are a typical drug component of ADCs in the art, and are also typical molecules of interest in the current invention. As described elsewhere herein, examples of cytotoxic drugs include, but are not limited to, mertansine (also called DM1), monomethyl auristatin F (MMAF), monomethyl auristatin E (MMAE) and deruxtecan. In certain embodiments, the small molecule drug is exatecan.

In some embodiments, the molecule of interest comprises or consists of a nucleic acid, polynucleotide or oligonucleotide. The nucleic acid, polynucleotide or oligonucleotide may be DNA, RNA, XNA, LNA, or a mixture thereof. Typically it is DNA or RNA. The nucleic acid, polynucleotide or oligonucleotide may be single-stranded or double-stranded. In some embodiments, the oligonucleotide comprises or consists of 3 to 50 nucleotides (or pairs of nucleotides when double-stranded), for example 5 to 30 nucleotides (or pairs). Antibody-oligonucleotide conjugates (AOCs) are often regarded in the art as a subset of ADCs.
In some embodiments, the molecule of interest is conjugated covalently to the first polypeptide, as is well-known in the art. In other embodiments, the molecule of interest is conjugated non-covalently to the first polypeptide, as is also known in the art. One non-limiting example of this approach is protamine (a positively charged polypeptide) functionalised with a maleimide group and covalently attached to the protein via a Cys thiol group. Subsequently, a nucleic acid (e.g. DNA) is added to the protein-protamine conjugate and the nucleic acid binds strongly to the protamine through electrostatic interactions. Thus, the nucleic acid is not directly attached to the protein and is not covalently attached to the protein. This general approach is also shown, for example, in Figure 1 of Dugal-Tessier et al, J Clin Med. 2021 Feb; 10(4): 838.

The molecule of interest may be conjugated to any part of the first polypeptide. Typically, the conjugation occurs in step (a) of the method. Particular benefits are provided when the molecule of interest is conjugated to the structural domain or to a linker region between the structural domain and a binding domain. In particular, conjugation to the structural domain or to the linker region enables conjugation to a corresponding position on a drug candidate or drug molecule in which a binding domain (such as SpyCatcher003) is replaced with a different binding domain (such as an antibody fragment) or removed, as described in WO-A-2022/200804. Accordingly, screening data on potential target combinations (e.g. as identified using a catcher-based format) can be utilised to inform the design of drug candidates (e.g. wherein the identified binders [such as antigen-binding regions] are directly connected to the structural domain), as demonstrated in WO-A-2022/200804. In one embodiment, conjugation of a molecule of interest to a CutA1 structural domain is provided.
The multivalent binding polypeptide is able to bind to two or more epitopes. The multivalent binding polypeptide typically comprises two or more target binding regions, typically antigen-binding domains, antigen-binding regions or antigen-binding constructs, each of which binds specifically to an epitope. Each antigen-binding domain can be any antibody-based domain that is capable of binding immunospecifically to an epitope. Each antigen-binding domain typically comprises six complementarity-defining regions (CDRs). Typically, each antigen-binding domain comprises six CDRs and defined framework regions, as is known in the art. Typically, each antigen-binding domain comprises an immunoglobulin variable region comprising CDRs and framework regions. Each antigen-binding domain may, as is known in the art, comprise two immunoglobulin variable regions, typically a heavy chain variable region (VH) and a light chain variable region (VL).

The multivalent binding polypeptide typically comprises binding domains (also referred to as binding regions) that are based on the epitope binding domains (i.e. the variable domains) of an antibody and comprise a CDR-defined paratope. Many antibody-based domains, constructs and formats are known in the art, and include a full-length antibody, an antigen-binding fragment of an antibody and engineered antibody constructs. Antibody-based binding domains within the scope of the invention include, but are not limited to, Fab fragments, F(ab’)2, scFv fragments, scFv tandem, scFv-Fc, diabodies, scFv-CH3 (minibodies), scFab, human antibodies and humanised antibodies. The multivalent binding polypeptide may comprise antibody-like scaffolds and non-scaffold binders, including but not limited to affibodies, nanobodies, or DARPins; or cytokines including but not limited to TNF, TNFR and receptor fragments or fusions such as etanercept or a target-binding fragment thereof.

A multivalent polypeptide that binds to two epitopes is bivalent.
A standard IgG antibody is bivalent. In some embodiments, the multivalent binding polypeptide binds to three or more epitopes. A multivalent polypeptide that binds three epitopes is trivalent. In some embodiments, the multivalent binding polypeptide binds to four or more epitopes. A multivalent polypeptide that binds to four epitopes is tetravalent. In some embodiments, the multivalent binding polypeptide binds to six epitopes. A multivalent polypeptide that binds six epitopes is hexavalent.

The epitopes to which the multivalent binding polypeptide binds may be the same or may be different. A multivalent binding polypeptide that has multiple binding domains that bind multiple copies of the same epitope is multivalent monospecific, for example binding to two copies of the same epitope is bivalent monospecific. A multivalent binding polypeptide that binds to two different epitopes is bivalent bispecific, although the dual nomenclature is redundant and these molecules are typically referred to as bispecific. For the avoidance of doubt, such a bispecific polypeptide targets each of two different epitopes with a valency of one (“1+1”, such as a bispecific antibody derived from hybridisation of two IgG species) and is therefore bivalent; similarly, a bispecific polypeptide targeting each of two different epitopes with a valency of two (“2+2”) is tetravalent; a bispecific polypeptide targeting each of two different epitopes with a valency of three (“3+3”) is hexavalent. Similarly, a multivalent polypeptide may target each of its different target epitopes with different valency or with the same valency, such as a trivalent polypeptide targeting a first epitope with a valency of two and a second epitope with a valency of one (“2+1”), or a trivalent polypeptide targeting three different epitopes each with a valency of one (“1+1+1”). Most natural antibodies are bi- or multivalent molecules comprising identical antigen binding sites, i.e.
are bivalent (or multivalent) monospecific, such as IgG, IgA or IgM. The exception is some IgG4 molecules which, due to an unstable hinge region, are capable of exchanging Fab arms (half-antibody association).

When a multivalent binding polypeptide binds to two or more different epitopes, it is multispecific. In some embodiments, the multivalent binding polypeptide is bispecific, i.e. it binds to two different epitopes. Bispecific antibodies are well-known in the art. Bispecific antibodies with defined specificities are artificial molecules that are not found in nature. In some embodiments, the multivalent binding polypeptide is trispecific or has a higher order of specificity, for example has multiple binding regions that bind specifically to 4, 5, 6 or more different epitopes.

In some embodiments, the multivalent binding polypeptide is a bispecific polypeptide. In some embodiments, the multivalent binding polypeptide conjugated to a molecule of interest is a bispecific drug conjugate, for example a bispecific antibody drug conjugate (bsADC). In some embodiments, the bispecific drug conjugate does not feature one, some or all domains of a traditional antibody (such as substitution of the dimerising IgG Fc region for a multimerising domain, or substitution of the Fab region with a different binding domain, such as a nanobody, affibody, DARPin, cytokine, naturally occurring ligand or receptor fragment). In some embodiments, the bispecific drug conjugate comprises only one or two domains of a traditional antibody (such as substitution of the dimerising IgG Fc region for a multimerising domain, or substitution of the Fab region with a different binding domain, such as a nanobody, affibody, DARPin, cytokine, naturally occurring ligand or receptor fragment). In some embodiments, the drug component is not a protein, and may for example be a cytotoxic drug having a molecular weight under 1000 Da (i.e. a “small molecule” drug).
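The valency notation introduced above (“1+1”, “2+2”, “2+1”, “1+1+1”) can be read mechanically: each number is the valency towards one distinct epitope, the sum gives the overall valency, and the count of numbers gives the specificity. The following Python sketch illustrates that reading only; the function name and the fallback labels for values outside the named range are hypothetical and are not terms used in this specification.

```python
VALENCY_NAMES = {1: "monovalent", 2: "bivalent", 3: "trivalent",
                 4: "tetravalent", 5: "pentavalent", 6: "hexavalent"}
SPECIFICITY_NAMES = {1: "monospecific", 2: "bispecific", 3: "trispecific"}


def describe_format(notation: str) -> tuple:
    """Return (valency, specificity) for a notation such as '2+1'.

    Each '+'-separated number is the valency towards one distinct epitope.
    """
    per_epitope = [int(part) for part in notation.split("+")]
    if any(v < 1 for v in per_epitope):
        raise ValueError("each epitope needs a valency of at least one")
    total = sum(per_epitope)        # overall valency
    n_epitopes = len(per_epitope)   # number of distinct epitopes
    # Fallback labels below are illustrative, not standard nomenclature.
    valency = VALENCY_NAMES.get(total, f"{total}-valent")
    specificity = SPECIFICITY_NAMES.get(n_epitopes, f"{n_epitopes}-specific")
    return valency, specificity


if __name__ == "__main__":
    for fmt in ("1+1", "2+2", "3+3", "2+1", "1+1+1"):
        print(fmt, describe_format(fmt))
```

For a standard monospecific IgG, written here as "2" by the same convention, the sketch returns bivalent and monospecific.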
When the multivalent antigen-binding polypeptide is a bispecific polypeptide, for example a bispecific antibody, step (a) of the method may comprise conjugating the molecule of interest to a first polypeptide chain of the bispecific polypeptide. This may be, for example, one half of a bispecific antibody (e.g. one heavy chain and one light chain of a traditional full-length antibody). In this embodiment, step (b) may comprise combining the conjugated first polypeptide chain of the bispecific antibody with at least a second chain of the bispecific antibody, which may be the second half of the antibody (e.g. the second heavy chain and second light chain in a traditional full-length antibody), to form the bispecific antibody. Methods of in vitro association of parts of antibodies are well-known in the art.

In one embodiment of the preceding paragraph, the known technique of controlled Fab-arm exchange (cFAE), as described by Labrijn et al, Nature Protocols volume 9, pages 2450–2463 (2014), is used to generate a bispecific antibody, typically a bispecific IgG1 such as a bispecific human IgG1 antibody. This method comprises (i) separate expression of two parental IgG1s containing single matching point mutations in the CH3 domain; (ii) mixing of parental IgG1s under permissive redox conditions in vitro to enable recombination of half-molecules; (iii) removal of the reductant to allow reoxidation of interchain disulfide bonds; and optionally (iv) analysis of exchange efficiency and final product typically using chromatography-based or mass spectrometry (MS)–based methods. According to the present invention, one or both of the half-molecules (that result from step (i) and are mixed in step (ii)) are the first polypeptide chain of the bispecific polypeptide.
Therefore, according to one embodiment of the present invention, the cFAE method is used to generate ADCs and comprises the steps of (i) separate expression of two parental IgG1s containing single matching point mutations in the CH3 domain; (ii) conjugating a molecule of interest to at least one of the parental IgG1s; (iii) mixing of parental IgG1s (at least one of which comprises a conjugated molecule of interest) under permissive redox conditions in vitro to enable recombination of half-molecules; (iv) removal of the reductant to allow reoxidation of interchain disulfide bonds; and optionally (v) analysis of exchange efficiency and final product typically using chromatography-based or mass spectrometry (MS)–based methods.

One application of the cFAE embodiment is demonstrated by Barron et al, Int J Mol Sci. 2024 Feb; 25(4): 2097. In this embodiment, the method comprises the formation of bispecific ADCs by applying cFAE to a first parental ADC that binds to a first epitope (e.g. wherein the antibody is conjugated with a cytotoxic small molecule such as MMAE) and a second parental unconjugated antibody that binds to a second epitope. cFAE results in an antibody comprising one conjugated half (from the conjugated parental antibody) and one unconjugated half (from the unconjugated parental antibody). Typically, this results in a biparatopic ADC, for example as shown in Figure 3 of Barron et al, which is hereby specifically and expressly incorporated by reference.

In certain embodiments, the first polypeptide of the multivalent antigen-binding polypeptide comprises or consists of a polypeptide comprising a first binding domain, a second binding domain and a structural domain. Each of these domains may comprise more than one domain, for example the structural domain may be an Fc region comprising CH2 and CH3 domains.
The first binding domain, second binding domain and structural domain may also be referred to as a first binding region, a second binding region and a structural region, respectively. In some embodiments, therefore, the first polypeptide of the multivalent antigen-binding polypeptide can be said to comprise or consist of a polypeptide comprising a first binding region, a second binding region and a structural region.

In these embodiments, the first polypeptide is not naturally occurring, but is an engineered construct, having a structural domain and binding domains, that does not occur in nature. Such constructs are typically expressed as recombinant multi-domain polypeptides. The structural domain provides a defined structural support for the binding domains. Advantageously, in some embodiments the structural domain can ensure that the binding domains have the desired orientation so that they can bind their targets, typically with both binding domains in the cis orientation. The constructs can therefore present a single binding surface, in some embodiments.

An exemplary structural domain is a CutA1 protein, for example human CutA1. CutA1 has a number of beneficial features, including ease of expression, stability, oligomerisation, and favourable orientation of binding domains attached to the N and/or C terminus. CutA1 as a structural domain is described in WO-A-2022/200804 and in WO-A-2024/069180 (which claims priority to UK patent application number GB2214235.0).

Another exemplary structural domain is an Fc domain (also called an Fc region) of an antibody, for example a human Fc, such as a human IgG Fc domain. Human IgG domains may be of the IgG1, IgG2, IgG3 or IgG4 Fc subtype. An Fc region comprises CH2 and CH3 domains.

The CutA1 protein or Fc region may be a variant that comprises one or more modifications, typically substitutions, insertions or deletions, from the wildtype sequence. Typically, the modification is the substitution of 1 to 20 amino acid residues.
A substitution may replace one or more cysteine residues with a non-cysteine residue. A substitution may replace one or more non-cysteine residues with a cysteine residue. The selection of cysteine residues at certain positions can be useful in optimising conjugation of a molecule of interest. A substitution may introduce an unnatural amino acid. An unnatural amino acid (UAA) is also known as a non-proteinogenic amino acid. Unnatural amino acids can be useful to enable conjugation of a molecule of interest via bioorthogonal chemistry, for instance via click chemistry. Unnatural amino acids are known in the art and include D-amino acids, homo amino acids, N-methyl amino acids, hydroxyproline (Hyp), beta-alanine, citrulline (Cit), ornithine (Orn), norleucine (Nle), 3-nitrotyrosine, nitroarginine, and pyroglutamic acid (Pyr). For example, propargyllysine is an unnatural amino acid which, when incorporated into a protein, can be exploited to attach commercially available fluorescent azide dyes through the copper-catalyzed alkyne-azide cycloaddition reaction (also known as a click reaction).
Other UAAs suitable for site-specific modification of a polypeptide sequence include 1: 3-(6-acetylnaphthalen-2-ylamino)-2-aminopropanoic acid (Anap), 2: (S)-1-carboxy-3-(7-hydroxy-2-oxo-2H-chromen-4-yl)propan-1-aminium (CouAA), 3: 3-(5-(dimethylamino)naphthalene-1-sulfonamide) propanoic acid (Dansylalanine), 4: Nɛ-p-azidobenzyloxycarbonyl lysine (PABK), 5: Propargyl-L-lysine (PrK), 6: Nɛ-(1-methylcycloprop-2-enecarboxamido) lysine (CpK), 7: Nɛ-acryllysine (AcrK), 8: Nɛ-(cyclooct-2-yn-1-yloxy)carbonyl)L-lysine (CoK), 9: bicyclo[6.1.0]non-4-yn-9-ylmethanol lysine (BCNK), 10: trans-cyclooct-2-ene lysine (2′-TCOK), 11: trans-cyclooct-4-ene lysine (4′-TCOK), 12: dioxo-TCO lysine (DOTCOK), 13: 3-(2-cyclobutene-1-yl)propanoic acid (CbK), 14: Nɛ-5-norbornene-2-yloxycarbonyl-L-lysine (NBOK), 15: cyclooctyne lysine (SCOK), 16: 5-norbornen-2-ol tyrosine (NOR), 17: cyclooct-2-ynol tyrosine (COY), 18: (E)-2-(cyclooct-4-en-1-yloxyl)ethanol tyrosine (DS1/2), 19: azidohomoalanine (AHA), 20: homopropargylglycine (HPG), 21: azidonorleucine (ANL), 22: Nɛ-2-azidoethyloxycarbonyl-L-lysine (NEAK).

In some embodiments, the first binding domain is at the N terminus of the multivalent binding polypeptide and the second binding domain is at the C terminus, wherein the first and second binding domains are separated by a structural domain. In some embodiments, the first binding domain is connected to the second binding domain, and the second binding domain is connected to the structural domain, for example a “Catcher-Catcher-Structural” format such as a Catcher-Catcher-Fc format (as shown for example as SpyCatcher003-DogCatcher-Fc in Example 8 and Figure 24). Typically, the second binding domain is connected to the N-terminus or C-terminus of the structural domain.

Various embodiments of the invention therefore provide binding (e.g.
catcher) domains separated by a structural domain (also sometimes referred to as a core protein), such as SpyCatcher003-Fc-DogCatcher, and also alternative constructs such as SpyCatcher003-DogCatcher-Fc (as illustrated in Figures 1 and 24). Such platforms with alternate geometries provide a useful approach to evaluate the impact of binder arrangement or valency on drug candidate behaviour (as similarly demonstrated for comparison across different structural domains in Figures 11-14). Figure 24 shows that SpyCatcher003-DogCatcher-Fc is suitable for assembly with proteins featuring recombinantly introduced SpyTag/SpyTag003 or DogTag variants (herein also called “SpT proteins” or “DgT proteins”, respectively), providing a useful new protein platform for this and related screening approaches. This format is also suitable for pre-conjugation to small molecules (such as drug conjugation, demonstrated below via fluorophore conjugation), followed by SpyCatcher/SpyTag-based and DogCatcher/DogTag-based protein assembly, making it a suitable multivalent binding polypeptide in the context of this invention. In Figure 25, a drug screen based on a set of bispecific drug candidates assembled on SpyCatcher003-Fc-DogCatcher and SpyCatcher003-DogCatcher-Fc with no payload is provided to exemplify the difference in cell cytotoxicity upon a change in geometry.

Similarly, knob-into-holes (“KiH”) or related Fc-heterodimerisation technologies can be utilised to create complexes for subsequent assembly (e.g. from fusion proteins of SpyCatcher or SpyCatcher003 [herein both also “SpC”] or DogCatcher [herein also “DgC”] to an antibody Fc-region, such as SpC-Fc and DgC-Fc to make a SpC-Fc/DgC-Fc heterodimer). Herein, payload conjugation before Fc-heterodimerisation can be utilised to create pre-conjugated multivalent antigen-binding polypeptides (such as “Catcher-core” or “CC” constructs, i.e.
constructs featuring at least one catcher domain and a core protein) at reduced DAR (e.g. payload conjugation of a single SpC-Fc or DgC-Fc prior to hybridisation, resulting in Fc hybrids with conjugation at one half of the construct) or with two different linker-payload species (e.g. conjugation of different linker-payload species to SpC-Fc and DgC-Fc prior to hybridisation). Similarly, payload conjugation after Fc-heterodimerisation is also suitable to provide pre-conjugated complexes ready for assembly. Where catcher technologies are used as binding domains, certain protein-protein assembly approaches may be preferred for knob-into-hole Fc heterodimers, such as SpC-Fc/SpC-TEV-Fc or related technologies for sequential assembly of SpT binders, e.g. in cases where it is desired to retain the same positioning of protein tags relative to binders and Fc.

In some embodiments, KiH generation can be used as a step to introduce additional screening dimensions, for example by conjugating Catcher-Fc and Catcher-TEV-Fc (or similar, e.g. DogCatcher) with matching Fc mutations to different payloads, e.g. to generate Catcher-Fc conjugated to MMAE and Catcher-TEV-Fc conjugated to exatecan, followed by stepwise assembly, to make bispecific dual-payload KiH constructs. Multiple assembly approaches could be used in combination to create further diversity, such as a combination of catcher technologies (including conditionally-activatable catcher technologies, e.g. via protease digestion or light, or orthogonal catcher technologies such as SpyCatcher and DogCatcher), Split-Inteins, Sortase, or other technologies as described further below.

A structural domain can also be modified to include useful mutations or modifications, such as “silenced” Fc mutations known in the art (e.g. “LALA”, “LAGA”, “LALAPG”) or to introduce or remove cysteine residues or other residues for linker-payload attachment, allowing the generation of Catcher-core molecules with varying DAR.
For instance, assessment of assemblies based on silenced Fc instead of or in comparison to assemblies based on wild-type Fc may be desired for in vitro assays to eliminate potential influences of Fcγ receptors on drug conjugate internalisation. The inventors demonstrate variations with additional cysteines to provide higher levels of small molecule conjugation, e.g. 8 Cys residues per dimer of Fc-based Catcher-core, facilitating a drug-to-antibody ratio of up to 8. These molecules are CC080 (SEQ ID NO: 53) and CC086 (SEQ ID NO: 54).

In general, the “conjugation-first” approach described herein can leverage a variety of technologies, either sequentially or in addition to any single step, to generate a broad diversity of bispecificity, multispecificity, geometries, formats, small molecule conjugation, combinations of small molecule conjugation, or other features.

In some embodiments, the first and second binding domains may be antigen-binding domains, for example antibody-based domains comprising six CDRs that form a paratope for binding to the epitope on the target. Antigen-binding domains, regions and constructs are well-known in the art, as described elsewhere herein, and include for example, immunoglobulin-based formats such as a Fab fragment, a Fab’ fragment, an scFv, an sdAb, or non-immunoglobulin-based antibody mimetic formats, such as an anticalin, a fynomer, an affimer, an alphabody, a DARPin or an avimer.

A linker may be included between any two domains, or between all domains in a construct, or between any selection of domains in a construct, for example a linker peptide of between 2 and 30 amino acid residues, typically between 5 and 25 amino acid residues.

In some embodiments, the first polypeptide of the multivalent antigen-binding polypeptide comprises or consists of a polypeptide comprising a first binding domain, a second binding domain and a structural domain, and the first binding domain and second binding domain are inteins.
Typically, the intein specificity of the first binding domain is different from the intein specificity for the second binding domain. Inteins are auto-processing domains found in organisms from all domains of life. These proteins carry out a process known as protein splicing, which is a multi-step biochemical reaction comprising both the cleavage and formation of peptide bonds. While the endogenous substrates of protein splicing are specific essential proteins found in intein-containing host organisms, inteins are also functional in exogenous contexts and can be used to manipulate virtually any polypeptide backbone.

In particular embodiments, the first polypeptide of the multivalent antigen-binding polypeptide comprises or consists of a polypeptide comprising a first binding domain, a second binding domain and a structural domain, and the first binding domain and second binding domain are catcher domains each able to form an isopeptide linkage with a cognate peptide. Typically, the cognate peptide for the first binding domain is different from the cognate peptide for the second binding domain.

In some embodiments, the first binding domain and second binding domain are catcher domains each able to form an isopeptide linkage with a cognate peptide, and the cognate peptide is the same for both first and second binding domains. In these embodiments where the cognate peptide is the same for both first and second binding domains, preferential, selective or sequential binding or conjugation can be achieved by temporal or sequential control, spatial control, or spatiotemporal control.

When the first polypeptide comprises two or more catcher domains and a structural domain, the at least one other polypeptide of the multivalent antigen-binding polypeptide may comprise an antigen-binding domain having a cognate peptide that is able to form an isopeptide bond with at least one of the catcher domains.
The isopeptide bond-forming peptide is typically connected to the antigen binding domain by a peptide linkage, more typically wherein the peptide tag and the antigen binding domain are expressed as a single polypeptide chain. Typically, the two catcher domains are different, and each forms an isopeptide bond with a different cognate peptide. Contacting the first polypeptide (comprising two or more catcher domains) with an antigen-binding domain having a cognate peptide that is able to form an isopeptide bond with at least one of the catcher domains therefore leads to formation of the isopeptide bond, linking the antigen-binding domain to the structural domain via the catcher domain.

In some embodiments, the at least one other polypeptide comprises two separate antigen-binding polypeptides each having a cognate peptide that is able to form an isopeptide bond with one of the catcher domains. The two separate antigen-binding polypeptides may be combined with the first polypeptide simultaneously or sequentially.

In the embodiments where the cognate peptide is the same for both first and second binding domains, preferential, selective or sequential binding or conjugation can be achieved by temporal or sequential control, spatial control, or spatiotemporal control. This may be, for example, by activation or inactivation of one binding domain and/or by means of competitive binding.
Without limitation, activation may for example be achieved by enzymatic cleavage of an inhibiting peptide (as described for example by Driscoll et al, September 2023, bioRxiv 2023.08.31.555700; doi: https://doi.org/10.1101/2023.08.31.555700, which is incorporated herein by reference in its entirety) or by light activation, for example using light triggering of covalent bond formation in a photocaged binding domain created by site-specific incorporation of an unnatural residue such as coumarin-lysine at the reactive site (as described for example, with reference to SpyCatcher003, by Rahikainen et al, 2023 “Visible light-induced specific protein reaction delineates early stages of cell adhesion”, bioRxiv 2023.07.21.549850; doi: https://doi.org/10.1101/2023.07.21.549850, which is incorporated herein by reference in its entirety). A further photoactivation approach, using photocaged glutamic acid analogues, is described by Yang et al, Angewandte Chemie Volume 62, Issue 40, October 2, 2023 (published online 16 August 2023) “Photoactivatable Protein with Genetically Encoded Photocaged Glutamic Acid”. Control over the availability of the reactive sites may alternatively be achieved by removal of any inhibiting domain, for example by means of steric hindrance, competition for binding, or blocking of the reactive or catalytic residue.

In some embodiments, two antigen-binding domains each containing the same isopeptide bond-forming tag can both be precisely and selectively conjugated onto a construct comprising two of the same isopeptide-bond-forming catcher domains (to which the tag binds), wherein one of the catcher domains is unreactive until unveiling of reactivity using light activation.
Typically, one of the catcher domains is unreactive due to it being caged by the presence of an unnatural light-reactive residue, for example a coumarin-lysine (7-hydroxycoumarin lysine, “HCK”) amino acid or a photocaged glutamic acid analogue, in the reactive isopeptide-bond-forming site. Once the first tagged peptide has been attached to the first catcher by formation of the isopeptide bond between first catcher and tag, application of the appropriate light (e.g. 405 nm light) releases the photocaging residue from the second catcher. The second catcher is then available for binding, and the second tagged peptide can be added.

As used herein, “SpyTag” or “SpT” may refer to any suitably reactive version of SpyTag, preferably with beneficial reaction properties (e.g. SpyTag003), and generally refers to SpyTag003 in the examples (with “L2” utilising original SpyTag and other SpT binders generally referring to SpyTag003). Similarly, SpyCatcher as used herein may refer to any suitably reactive version of SpyCatcher, and usually refers to SpyCatcher003 as used in the examples (further see provided SEQ IDs for details).

In some embodiments, two antigen-binding domains each containing the same isopeptide bond-forming tag can both be precisely and selectively conjugated onto a construct of the present invention comprising two of the same isopeptide-bond-forming catcher domains (to which the tag binds), wherein one of the catcher domains is unreactive until unveiling of reactivity using a site-specific protease. Typically, one of the catcher domains is unreactive due to the presence of a non-reactive Tag mutant sequence that is fused to one of the catcher domains via a flexible linker containing a protease cleavage site. Once the first tagged peptide has been attached to the first catcher by formation of the isopeptide bond between first catcher and tag, addition of the protease releases the non-reactive Tag mutant from the second catcher.
The second catcher is then available for binding, and the second tagged peptide can be added.

In certain embodiments, the non-reactive tag mutant is SpyTag003 D117A (SpyTag003DA) (Keeble et al., 2019 Proc. Natl. Acad. Sci. U.S.A. 116, 26523–26533, incorporated herein by reference in its entirety) that is fused to the C-terminus of a second SpyCatcher003 via a flexible linker containing a tobacco etch virus (TEV) protease cleavage site. Following cleavage at the TEV site, the SpyTag003DA peptide will be free to dissociate, unmasking the reactive Lys of SpyCatcher003 to enable reaction with a second supplied SpyTag-linked binder. Cleavage can be carried out with any suitable protease, for example superTEV protease.

In the embodiments where a first polypeptide comprising two (or more) catcher domains and a conjugated molecule of interest is combined with two (or more) separate antigen-binding polypeptides each having a cognate peptide able to form an isopeptide bond with at least one of the catcher domains, the result is a molecule comprising a structural domain, two catcher domains and two antigen binding domains, wherein the two antigen binding domains are connected to the catcher domains by an isopeptide bond. The two antigen binding domains may be identical (to form a monospecific bivalent molecule) or different (forming a bispecific molecule). If the two antigen-binding domains and conjugated molecule of interest are determined to work well in combination, for example in one or more functional assays, then the molecule can be further modified to remove the catcher domains and thereby provide an antigen-binding molecule consisting essentially of first and second antigen-binding domains separated by a structural domain. As noted elsewhere herein, the structural domain is optionally CutA1 or an Fc domain, or a variant thereof.
When the molecule of interest is conjugated to the structural domain or to a linker connected to the structural domain, the removal of the catcher domains retains the conjugated molecule of interest on the structural domain or linker, and typically does not change the position or orientation of the conjugated molecule of interest. The combination of binding domains and conjugated molecule of interest can then be developed further, where it can be expected to retain the function observed for the larger construct comprising catcher domains.

In the embodiment where a first polypeptide comprising two catcher domains and a conjugated molecule of interest is combined with two separate antigen-binding polypeptides each having a cognate peptide able to form an isopeptide bond with at least one of the catcher domains, a plurality of conjugated first polypeptides of the multivalent antigen-binding polypeptide may be generated in step (a) of the method. This generates multiple copies of the conjugated first polypeptide, which comprises two or more catcher domains. In step (b), these multiple copies of multi-catcher polypeptides each conjugated to a functional molecule of interest (e.g. a drug or a dye) can then usefully be exposed to a plurality of different pairs of antigen-binding polypeptides each having a cognate peptide able to form an isopeptide bond with one of the catcher domains. This results in the generation of a variety of different bispecific antigen-binding polypeptides each conjugated to a non-protein molecule of interest. This can be scaled to large numbers of combinations of antigen-binding domains with a single molecule of interest, which can each be assessed for useful function (e.g. cytotoxicity), with favourable compounds taken forward for further development.
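By way of illustration only, the scaling described above can be sketched in Python as a simple enumeration. The sketch assumes one batch of conjugated multi-catcher polypeptide (a single conjugation reaction) and two small, hypothetical binder libraries; the function name, the binder labels and the core label are placeholders introduced for this example and do not correspond to any construct described herein.

```python
from itertools import product


def assemble_panel(conjugated_core, spytag_binders, dogtag_binders):
    """List every assembly of one pre-conjugated Catcher-core with one
    SpyTag-linked binder and one DogTag-linked binder.

    All labels are hypothetical and used for illustration only.
    """
    panel = []
    for spy_binder, dog_binder in product(spytag_binders, dogtag_binders):
        # The same binder at both catcher sites gives a monospecific
        # molecule; two different binders give a bispecific molecule.
        fmt = "monospecific" if spy_binder == dog_binder else "bispecific"
        panel.append(
            {
                "core": conjugated_core,
                "spy_site": spy_binder,
                "dog_site": dog_binder,
                "format": fmt,
            }
        )
    return panel


# Hypothetical binder library, available with either tag.
binders = ["binder_1", "binder_2", "binder_3"]

# One conjugated core batch is shared by every member of the panel.
panel = assemble_panel("core_payload_conjugate", binders, binders)
n_bispecific = sum(1 for entry in panel if entry["format"] == "bispecific")

print(len(panel), n_bispecific)
```

With three binders available at each catcher site, the enumeration yields nine assemblies, six of them bispecific, all derived from the same conjugated core. The size of the panel grows with the product of the two binder library sizes while the number of conjugation reactions stays at one.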
Accordingly, some embodiments provide a method of preparing a multivalent binding polypeptide conjugated to a molecule of interest, comprising the steps of:

(a) conjugating the molecule of interest (e.g. drug or dye, or other payload) to a first polypeptide of the multivalent binding polypeptide, wherein the first polypeptide comprises or consists of a polypeptide comprising a first binding domain, a second binding domain and a structural domain, and the first binding domain and second binding domain are catcher domains each able to form an isopeptide linkage with a cognate peptide; and

(b) combining the conjugated first polypeptide of the multivalent binding polypeptide that results from step (a), with at least two other polypeptides, wherein each of the at least two other polypeptides comprises or consists of an antigen-binding domain having a cognate peptide that is able to form an isopeptide bond with one of the catcher domains.

The contacting in step (b) is done under conditions suitable for the isopeptide bond to form, thereby forming the multivalent binding polypeptide conjugated to the molecule of interest.

In a further embodiment, a non-covalent association of a protein and a tag, as described in Bhatta et al 2021, mAbs, 13:1, 1859049 “Bispecific antibody target pair discovery by high-throughput phenotypic screening using in vitro combinatorial Fab libraries” https://www.tandfonline.com/doi/pdf/10.1080/19420862.2020.1859049 (accessed June 16th, 2023), may be utilised instead of one or more catchers with cognate peptides forming isopeptide bonds. A non-covalent approach may be suitable for simpler multivalent binding polypeptides, e.g. bivalent binding polypeptides which are not subject to multimerisation.

The method of the invention, in all embodiments, comprises conjugating the molecule of interest to a first polypeptide of the multivalent binding polypeptide. This conjugated first polypeptide does not have to be used immediately, and can be stored prior to combination with the one or more other polypeptides. In some embodiments, the conjugated first polypeptide is stored for at least one hour, prior to step (b), optionally wherein the storage period is at least one day, at least one week, at least one month or at least six months. The storage may advantageously be at refrigerated or frozen temperatures. Typically, the storage is at between 10°C and minus 130°C. In some embodiments, the storage is at 10°C to minus 100°C. In some embodiments the conjugated first polypeptide is stored at 8°C to -80°C, or 4°C to -20°C, or 4°C to 0°C.

A second aspect of the invention provides a multivalent binding polypeptide conjugated to a molecule of interest obtained or obtainable by the method of the first aspect. In some embodiments, the multivalent binding polypeptide is bispecific, trispecific or has a higher multiple specificity.
A third aspect of the invention provides a method of preparing a population of multivalent antigen binding proteins wherein members of the population have different antigen binding domains from each other, wherein each multivalent antigen binding protein is conjugated to a molecule of interest, typically a non-protein molecule of interest, comprising the steps of:

(a) providing a plurality of polypeptides each comprising a molecule of interest conjugated thereto, wherein each polypeptide comprises a first binding domain, a second binding domain and a structural domain, wherein the first binding domain and second binding domain are catcher domains each able to form an isopeptide linkage with a cognate peptide, and wherein the cognate peptide for the first binding domain is different from the cognate peptide for the second binding domain; and

(b) contacting each of the plurality of polypeptides comprising a molecule of interest conjugated thereto, with a pair of antigen-binding polypeptides each having a cognate peptide that is able to form an isopeptide bond with one of the catcher domains, under conditions that allow the isopeptide bond to form between the catcher domains and the cognate peptides,

wherein different pairs of antigen-binding polypeptides are contacted with different first polypeptides comprising a small molecule of interest conjugated thereto, to provide a plurality of antigen-binding polypeptides with different antigen-binding characteristics and the same small molecule of interest conjugated thereto.

The references to “two”, “pair” and “first and second” can be increased accordingly for higher-order valencies, so that 3, 4, 5 or more binding domains are present.
In one embodiment of the method of the third aspect, the method further comprises the step of:

(c) assessing one or more characteristics of the plurality of antigen-binding polypeptides with different antigen-binding characteristics and the same small molecule of interest conjugated thereto.

In a variant of the third aspect, the population of multivalent antigen binding proteins comprises different molecules of interest conjugated to different members of the population. This allows for the screening of multiple different payloads in addition to multiple combinations of binding domains. In some embodiments, there are 2 or more different molecules of interest conjugated to different members of the population. In some embodiments, there are 3 or more different molecules of interest conjugated to different members of the population. In some embodiments, there are 4 or more, 5 or more, 10 or more, or 20 or more different molecules of interest conjugated to different members of the population.

In these embodiments, the method of the third aspect may further comprise the step of assessing one or more characteristics of the plurality of antigen-binding polypeptides with different molecules of interest attached and the same antigen-binding characteristics. Alternatively in these embodiments, the method of the third aspect may further comprise the step of assessing one or more characteristics of the plurality of antigen-binding polypeptides with different molecules of interest attached and different antigen-binding characteristics, i.e. a combinatorial comparison of multiple differences.

A fourth aspect of the invention provides a library of multispecific antigen-binding polypeptides with different antigen-binding characteristics and the same small molecule of interest conjugated thereto, obtained or obtainable by the method of the third aspect.
A fifth aspect of the invention provides a multivalent binding polypeptide conjugated to a molecule of interest, optionally a non-protein molecule of interest, wherein the multivalent binding polypeptide comprises a first binding domain, a second binding domain and a structural domain, wherein the first binding domain and second binding domain are catcher domains each able to form an isopeptide linkage with a cognate peptide, and wherein the cognate peptide for the first binding domain is different from the cognate peptide for the second binding domain.

In one embodiment of the fifth aspect, the first binding domain is at the N terminus and the second binding domain is at the C terminus, wherein the first and second binding domains are separated by a structural domain. In another embodiment of the fifth aspect, the first binding domain is connected to the second binding domain, and the second binding domain is connected to the structural domain.

In a further embodiment of the fifth aspect, the catcher domains are covalently linked by an isopeptide linkage to their cognate peptide. In one embodiment, the cognate peptide for the first catcher domain is different from the cognate peptide for the second catcher domain, and each cognate peptide is covalently attached to a different antigen-binding domain. The covalent attachment between the peptide tag and the antigen binding domain is typically a peptide linkage, more typically wherein the peptide tag and the antigen binding domain are expressed as a single polypeptide chain.

In certain embodiments, multivalent binding polypeptides suitable for conjugation to a molecule of interest, or conjugated to a molecule of interest, are provided. In some embodiments, a multivalent binding polypeptide is provided with at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identity to any exemplary multivalent binding polypeptide described herein.
In some embodiments, a multivalent binding polypeptide is provided with at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identity to SEQ ID NO: 47. In some embodiments, a multivalent binding polypeptide is provided with at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identity to SEQ ID NO: 52. In some embodiments, a multivalent binding polypeptide is provided with at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identity to SEQ ID NO: 53. In some embodiments, a multivalent binding polypeptide is provided with at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identity to SEQ ID NO: 54. In some embodiments, the % identity is at least 95%. In some embodiments, the % identity is at least 99%. Typically, the functions of the exemplary sequence are retained when the sequence is varied, in particular the functions relevant to the methods described herein.

A sixth aspect of the invention provides a population of multivalent binding polypeptides conjugated to a molecule of interest according to the second aspect or the fifth aspect.

The multivalent binding polypeptide conjugated to a molecule of interest of the second aspect or the fifth aspect, or the population according to the sixth aspect, can be formulated as a composition for storage. The storage may be at refrigerated or frozen temperature for at least one hour, at least one day, at least one week, at least one month or at least six months. The formulation may comprise one or more excipients. In some embodiments, the one or more excipients may comprise a cryoprotectant.
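By way of illustration only, the percentage identity thresholds recited above for the exemplary sequences can be pictured with a short Python sketch. It assumes that the two sequences have already been aligned to equal length with "-" as the gap character, and that identity is counted as identical residues divided by the number of aligned columns; other definitions of percentage identity are in use, and the definition applied in the specification governs. The function names and the toy sequences are hypothetical and are not SEQ ID NOs of this document.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding identical residues.

    Assumes the inputs are pre-aligned to equal length, with '-' for gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when the identity is at least the stated threshold (e.g. 85)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy ten-residue sequences differing at one position (hypothetical).
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQR"

identity = percent_identity(reference, variant)
print(identity, meets_threshold(reference, variant, 85.0))
```

In this toy case nine of ten aligned positions match, so the variant satisfies the 85% and 90% thresholds but not the 95% threshold.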
A seventh aspect of the invention provides a kit comprising a plurality of multivalent binding polypeptides conjugated to a molecule of interest according to the second aspect or the fifth aspect, and a plurality of antigen-binding polypeptides each having a cognate peptide that is able to form an isopeptide bond with one of the catcher domains.

Brief Description of the Figures

Figure 1: Illustration of ADC library generation. a) In current state-of-the-art, a bispecific is generated in a scalable hybridisation manner and subsequently conjugated with a payload. A payload conjugation step is conducted for each generated ADC. b) Our approach first introduces the payload to our 3+-part assembly platform, and then undergoes a scalable step to generate bispecific diversity. As a first step, the Catcher-core library is conjugated to a toxic payload producing the drug-conjugate Catcher-core library. Subsequently the drug-conjugate Catcher-core library is assembled with Tag A and Tag B Binder libraries to produce a drug conjugate / ADC panel. In comparison to the state-of-the-art approach, this approach requires only one payload conjugation reaction to generate a diverse panel of different drug conjugates / ADCs.

Figure 2 shows truncation of the HsCutA1 termini. a) Coding sequence schematic of differentially truncated HsCutA1 fused to SpyCatcher003 (SpC3) and DogCatcher (DgC) via example linkers L1 and L2. The termini truncations are noted as 44-179 (SEQ ID NO: 16) and 60-171 (SEQ ID NO: 19) respectively. The sequences are compared in the multiple sequence alignment of WT HsCutA1 (33-179) with HsCutA1 truncated to 44-179 and 60-171. b) Cartoon of sites of truncation on HsCutA1. At the N-terminus, residues 44-59 are shown and at the C-terminus, residues 172-179 are shown. Cartoon modelled from PDB ID: 2ZFH. c) Comparison of expression of HsCutA1(44-179) to HsCutA1(60-171). Cell lysates from E. coli were compared to show any changes imparted by the truncation on the expression of the proteins. d) Apoptosis induced by SpC3-(GGGGS)3-[C75V, C96S]HsCutA1(44-179)-(GGGGS)3-DgC (SEQ ID NO: 26) and SpC3-(GGGGS)3-[C75V, C96S]HsCutA1(60-171)-(GGGGS)3-DgC (SEQ ID NO: 29) variably assembled with L7 at either SpC3, DgC or both, forming trivalent or hexavalent monospecific molecules. Fractions of surviving Colo 205 cells were determined 48 h after the start of the treatment at indicated doses.

Figure 3 shows the removal of cysteines from HsCutA1. a) Cartoon representation of sites of cysteines natively present in HsCutA1 for removal. Cartoon modelled from PDB ID: 2ZFH. b) Table of amino acid frequency at sites similar to HsCutA1’s C75 and C96, following a multiple sequence alignment (MSA) of 406 homologs. Amino acids with frequency of zero are not shown. c) Comparison of expression of HsCutA1(44-179) to HsCutA1(60-171) with native WT cysteines (C75, C96), alanine mutated cysteines (C75A, C96A) and MSA-informed mutations (C75V, C96S). Cell lysates from E. coli were compared to show any changes imparted by the mutagenesis on the expression of the proteins. d) Apoptosis induced by SpC3-GGGGS-HsCutA1(44-179)-GGGGS-DgC (SEQ ID NO: 15), SpC3-(GGGGS)3-[C75V, C96S]HsCutA1(44-179)-(GGGGS)3-DgC (SEQ ID NO: 26), SpC3-(GGGGS)3-HsCutA1(60-171)-(GGGGS)3-DgC (SEQ ID NO: 27), SpC3-(GGGGS)3-[C75A, C96A]HsCutA1(60-171)-(GGGGS)3-DgC (SEQ ID NO: 28), and SpC3-(GGGGS)3-[C75V, C96S]HsCutA1(60-171)-(GGGGS)3-DgC (SEQ ID NO: 29) assembled with L7 at both SpC3 and DgC, forming hexavalent monospecific molecules. Fractions of surviving Colo 205 cells were determined 48 h after the start of the treatment at indicated doses.

Figure 4: Illustration of a general thiol-maleimide reaction with a monomeric protein, as well as various linker-payload structures. A) A thiol group on the monomeric protein reacts with the maleimide moiety of a payload-linker compound to produce a thiosuccinimide payload-linker adduct on the protein. B) Example linker-payload compounds, which have been conjugated to Catcher-core proteins: 1 – fluorescein-5-maleimide; 2 – deruxtecan; 3 – maleimidocaproyl-Val-Cit-PAB-DM1; 4 – maleimidocaproyl-Val-Cit-PAB-MMAF.

Figure 5: Positions of Cys mutations on a cartoon representation of the CutA1 structure (modelled from PDB ID: 2ZFH) and a model of a fluorescein-5-thiosuccinimide-CutA1 conjugate. A) Spatial locations of Cys mutations on one monomer of the CutA1 structure. Cys thiols are shown as a sphere. B) Model of a fluorescein-5-thiosuccinimide conjugate on the K82C mutant of CutA1 (SEQ ID NO: 31). All images were produced in PyMOL.

Figure 6: Fluorescein-5-maleimide conjugation of all Cys mutant CutA1 Catcher-core variants (SEQ ID NO: 40, 38, 41, 37, 39, 42) as well as wild-type (SEQ ID NO: 15; with wild type CutA1 provided as SEQ ID NO: 1 and the truncated version incorporated in CC1 provided as SEQ ID NO: 16) and no Cys (SEQ ID NO: 36) controls. A) Dye/protein ratio for all mutants calculated as described in methods for samples after PD-10 purification. B) 12% SDS-PAGE gel of unconjugated and conjugated CutA1 mutants. Little to no conjugation is observed for the CC041 negative control but all other mutants conjugate to fluorescein-5-maleimide. CC denotes Catcher-core protein gel bands.
Figure 7: Deconvoluted ESI mass spectra of CC042 (SEQ ID NO: 38) and fluorescein-5-maleimide, deruxtecan, maleimidocaproyl-Val-Cit-PAB-DM1 and maleimidocaproyl-Val-Cit-PAB-MMAF conjugates thereof. A) Unconjugated Catcher-core: the main peak is within one mass unit of the expected molecular weight. B) Catcher-core conjugated to fluorescein-5-maleimide: the main peak is approximately 18 Da larger than the expected molecular weight, possibly due to thiosuccinimide ring hydrolysis after conjugation. C) Catcher-core conjugated to deruxtecan: the main peak is within one mass unit of the expected molecular weight. D) Catcher-core conjugated to maleimidocaproyl-Val-Cit-PAB-DM1: the main peak is within one mass unit of the expected molecular weight. E) Catcher-core conjugated to maleimidocaproyl-Val-Cit-PAB-MMAF: the main peak is within one mass unit of the expected molecular weight.

Figure 8: SDS-PAGE (4-20% gradient) of fluorescein-5-maleimide conjugation and Catcher/Tag assembly of CC042 (SEQ ID NO: 38) and the CC041 no Cys negative control (SEQ ID NO: 36). Only CC042 efficiently conjugates to fluorescein-5-maleimide. A low level of background signal indicative of conjugation is observed for CC041, suggesting that maleimide conjugation is mostly specific to Cys residues. The conjugated CC042 shows a clear change in migration in the gel on reaction with a DogTag-binder (L7), SpyTag-binder (L8), or both binders simultaneously. A similar profile of bands is observed for the CC041 control, suggesting that maleimide conjugation does not impair assembly efficiency.

Figure 9: Fluorescein-5-maleimide conjugation and Catcher/Tag assembly of CC068 (SEQ ID NO: 47) and CC060 (SEQ ID NO: 48) (Fc-Catcher constructs) after one freeze/thaw cycle. In CC068 one of the three hinge region Cys residues is mutated to Ser (C230S), whereas the hinge region is deleted in CC060 so there are no surface-exposed Cys residues. Only CC068 conjugates to fluorescein-5-maleimide.
A low level of background conjugation is observed for CC060, suggesting that fluorescein-5-maleimide conjugation is mostly specific to Cys residues. The fluorescein-5-maleimide conjugated CC068 shows a clear and complete change in migration in the gel on reaction with a DogTagged-binder (L7), SpyTagged-binder (L8), or both binders simultaneously. A similar profile of bands is observed for CC060, suggesting that maleimide conjugation does not impair assembly efficiency.

Figure 10: Specific membrane binding and internalisation by RTK binders, single assemblies and full assemblies. HsCutA1 refers to CC7, while HsCutA1* refers to CC042 conjugated to fluorescein-5-maleimide. A) His-tagged binders L1 and L2, along with full assembly with His-tagged CC7 (SEQ ID NO: 29), demonstrate membrane binding and internalisation of all compounds when staining with an anti-His antibody. (Note differential ratio of His-tags.) B) CutA1 staining of full assembly of CC7 with L1 and L2 binders confirms membrane binding and internalisation. C) Fluorescein-5-maleimide (F-5-M) conjugated CC042 (SEQ ID NO: 38) alone, as single assemblies and as full assembly shows RTK-specific internalisation of all assemblies without off-target binding of HsCutA1*.

Figure 11: Cell survival data of HCT116 cells upon treatment with protein assemblies of CC042 (SEQ ID NO: 38) and CC042 conjugated with DM1 (CC042DM1) with various ligands (L2, L4, L5, L6) at 50 nM (A), 10 nM (B) and 2 nM (C) monomer concentrations. Cell viability is presented in a heat map. Left panel shows the difference in cell viability of drug-conjugated CC042DM1 assemblies compared to respective CC042 assemblies without drug conjugation. Middle panel shows cell viability of CC042 assemblies without drug conjugation, right panel shows cell viability of CC042DM1 assemblies. All viability data was normalised to mock-treated controls.
Figure 12: Cell survival data of HCT116 cells upon treatment with protein assemblies of CC068 (SEQ ID NO: 47) and CC068 conjugated to deruxtecan (CC068D) with various ligands (L2, L4, L5, L6) at 20 nM (A), 4 nM (B) and 0.8 nM (C) monomer concentrations. Cell viability is presented in a heat map. Left panel shows the difference in cell viability of drug-conjugated CC068D assemblies compared to respective CC068 assemblies without drug conjugation. Middle panel shows cell viability of CC068 assemblies without drug conjugation, right panel shows cell viability of CC068D assemblies. All viability data was normalised to mock-treated controls.

Figure 13: Cell survival data of HCT116 cells upon treatment with protein assemblies of CC042 (SEQ ID NO: 38) and CC042 conjugated to MMAF (CC042MMAF) with various ligands (L2, L4, L5, L6) at 50 nM (A), 10 nM (B) and 2 nM (C) monomer concentrations. Cell viability is presented in a heat map. Left panel shows the difference in cell viability of drug-conjugated CC042MMAF assemblies compared to respective CC042 assemblies without drug conjugation. Middle panel shows cell viability of CC042 assemblies without drug conjugation, right panel shows cell viability of CC042MMAF assemblies. All viability data was normalised to mock-treated controls.

Figure 14: Cell survival data of HCT116 cells upon treatment with protein assemblies of CC068 (SEQ ID NO: 47) and CC068 conjugated to MMAF (CC068MMAF) with various ligands (L2, L4, L5, L6) at 50 nM (A), 10 nM (B) and 2 nM (C) monomer concentrations. Cell viability is presented in a heat map. Left panel shows the difference in cell viability of drug-conjugated CC068MMAF assemblies compared to respective CC068 assemblies without drug conjugation. Middle panel shows cell viability of CC068 assemblies without drug conjugation, right panel shows cell viability of CC068MMAF assemblies. All viability data was normalised to mock-treated controls.
Figure 15 shows a schematic of the coupling of SpyCatcher003 and DogCatcher to magnetic beads to be used during the assembly purification pipeline. Glutathione (GSH)-conjugated magnetic beads are mixed with GST-SpC3 and GST-DgC, both at equimolar concentrations. The GST-Catchers are incubated with the beads for 1 hour at 25 °C to allow binding. The beads are then sedimented using a magnetic block and washed 3× using PBS, or until a baseline absorbance reading at 280 nm is achieved. The GST-bound beads containing both Catchers are retained for use during the assembly purification.

Figure 16 shows SDS-PAGE (4-20% gradient) of the preparation of glutathione-conjugated beads with GST-SpC3 and GST-DgC. MagneGST resin was prepared with SDS-loading buffer to check for the presence of contaminating proteins in the stock. W1-3: PBS washes.

Figure 17 shows a schematic of another variation of Catcher-based protein assembly and clean-up. SpC3-Core-DgC is combined with SpT-protein and DgT-protein with a 1.4x excess of tagged-protein to Catcher-core. The conjugation reaction is incubated for 1 hour at 25 °C to allow full capture of the tagged-proteins. Pre-prepared paramagnetic glutathione-conjugated beads bound to GST-SpC3 and GST-DgC as described in Figure 15 are added to give a 2x excess of each GST-Catcher to their corresponding tagged-protein. The sample is transferred to a magnetic block to capture the bead-GST-Catcher-Tag conjugates, and the supernatant containing the Core-Catcher-Tag conjugates is retained for further analysis and downstream assays.

Figure 18 shows SDS-PAGE (4-20% gradient) of protein assembly and MagneGST purification. L7-SpT:CC7:L2-DgT protein assembly prior to purification (L7-SpT:CC7:L2-DgT (pre)) shows excess of both binders, whereas L7-SpT:CC7:L2-DgT protein assembly after purification (L7-SpT:CC7:L2-DgT (post)) shows fully assembled molecule only. GST-Catchers + Resin demonstrate the binder excess has been captured during MagneGST purification. (CC7 corresponds to SEQ ID NO: 51).

Figure 19 shows cytotoxicity data of one cancer cell line upon treatment with assemblies of various binders assembled with unconjugated CC068 (SEQ ID NO: 47) at 0.5, 5 and 50 nM dimer concentration. Each concentration was drugged in triplicate. Binders were selected to cover a broad range of targets, and were made of different formats (including affibodies, DARPins, scFvs, and Fabs). The unit of the colour scale is cytotoxicity relative to the untreated negative control. Black denotes high cytotoxicity; lighter shades of grey denote low cytotoxicity (legend shown). Standard deviation is also shown in units of cytotoxicity. Black denotes low standard deviation, while lighter shades of grey denote higher standard deviation (legend shown). X: Data excluded. Labels denoting binders and targets are shared within Figures 19-21 (but may differ from other Figures), with differentially tagged binders of the same binder sequence assigned an identical label (e.g. Binder 1).

Figure 20 shows cytotoxicity data of one cancer cell line upon treatment with assemblies of various binders with CC068 (SEQ ID NO: 47) conjugated to MMAF (specifically MC-Val-Cit-PAB-MMAF) at 0.5, 5 and 50 nM dimer concentration. Each concentration was drugged in triplicate. Binders were selected to cover a broad range of targets, and were made of different formats (including affibodies, DARPins, scFvs, and Fabs). In comparison to Figure 19, payload-dependent cytotoxicity becomes apparent. The unit of the colour scale is cytotoxicity relative to the untreated negative control. Black denotes high cytotoxicity; lighter shades of grey denote low cytotoxicity (legend shown). Standard deviation is also shown in units of cytotoxicity.
Black denotes low standard deviation, while lighter shades of grey denote higher standard deviation (legend shown). X: Data excluded. Labels denoting binders and targets are shared within Figures 19-21 (but may differ from other Figures), with differentially tagged binders of the same binder sequence assigned an identical label (e.g. Binder 1).

Figure 21 shows cytotoxicity data of one cancer cell line upon treatment with assemblies of various binders with CC068 (SEQ ID NO: 47) conjugated to PEG-MMAF (specifically Mal-PEG8-Val-Cit-PAB-MMAF) at 0.5, 5 and 50 nM dimer concentration. Each concentration was drugged in triplicate. Binders were selected to cover a broad range of targets, and were made of different formats (including affibodies, DARPins, scFvs, and Fabs). In comparison to Figure 19, payload-dependent cytotoxicity becomes apparent. In comparison to Figure 20, this demonstrates the ability to rapidly test the impact of different linker-payload constructs. The unit of the colour scale is cytotoxicity relative to the untreated negative control. Black denotes high cytotoxicity; lighter shades of grey denote low cytotoxicity (legend shown). Standard deviation is also shown in units of cytotoxicity. Black denotes low standard deviation, while lighter shades of grey denote higher standard deviation (legend shown). X: Data excluded. Labels denoting binders and targets are shared across Figures 19-21 (but may differ from other Figures), with differentially tagged binders of the same binder sequence assigned an identical label (e.g. Binder 1).

Figure 22 consists of two parts, i.e. Figure 22A and Figure 22B (together referred to as Figure 22). Figure 22A shows that the “conjugation-first” approach enables rapid, large-scale screening across a broad range of binder and/or target combinations.
In a liquid handling setup, combinations of different binders across a broad target range were assembled with CC068 (SEQ ID NO: 47) with and without payload conjugation to PEG-MMAE (specifically Mal-PEG8-Val-Cit-PAB-MMAE) to generate a panel of ~800 drug candidates (of which ~400 were drug conjugates conjugated to PEG-MMAE). This was followed by high-throughput drugging in triplicate in a 384-well format of 4 different cell lines relevant to a single indication at 2.5 nM concentration of the assembly dimer, with a total time from drug candidate generation to drugging of <1 week. Experimental data were collected by high-throughput fluorescent cell imaging after 5 days. Left quadrant targets 2-5: Section in which SpT and DgT variants of the same binders are available, containing tetravalent monospecific assemblies, as well as “mirrored” assemblies (i.e. SpT Binder A + DgT Binder B compared to SpT Binder B + DgT Binder A). The unit of the colour scale is cytotoxicity relative to the untreated negative control. Black denotes high cytotoxicity; lighter shades of grey denote low cytotoxicity (legend shown). (n=1; legend shown). X: Data excluded. Labels denoting binders and targets are shared within Figure 22A/B (but may differ from other Figures), with differentially tagged binders of the same binder sequence assigned an identical label (e.g. Binder 1).

Figure 22B shows standard deviations corresponding to Figure 22A. Standard deviation is shown in units of cytotoxicity. Black denotes low standard deviation, while lighter shades of grey denote higher standard deviation (legend shown). X: Data excluded. Labels denoting binders and targets are shared with Figure 22A (but may differ from other Figures), with differentially tagged binders of the same binder sequence assigned an identical label (e.g. Binder 1).

Figure 23 shows that assembly clustering allows an automatic grouping of assemblies based on their activity across the four cell lines.
The values for cytotoxicity, bispecific cytotoxicity difference (difference in cytotoxicity between the bispecific assembly and the most cytotoxic corresponding single assembly comprising one of the binders in the bispecific assembly), and ADC cytotoxicity difference (difference in cytotoxicity between the PEG-MMAE-conjugated assembly and the unconjugated assembly) for all PEG-MMAE-conjugated assemblies across all four cell lines were collected (total of 12 features, shown as rows in the clustered heatmap). Min-max normalisation was applied to all 12 features, ensuring values are between 0 and 1 (see scale bar). Hierarchical clustering was then applied to the assemblies using the Euclidean distance metric and Ward’s minimum variance method to progressively merge clusters. The maximum number of clusters was set to 30 (alternating dark grey, grey and black bands on ‘Cluster’ row). Values close to 0 indicate higher cytotoxicity, and higher bispecific and ADC cytotoxicity differences. The first three clusters (left to right) contain assemblies which are active across three cell lines, whereas the last few clusters contain assemblies which only show strong activity in Cell Line 4.

Figure 24 shows SDS-PAGE of assemblies derived from different Catcher-core proteins with and without conjugation to the Alexa488 fluorophore molecule (specifically Alexa488 C5 maleimide). For CC076, CC080 and CC086, the left-side gel is Coomassie-stained, and the right-side gel is imaged using the absorbance/fluorescence of the Alexa488 molecule. The annotations for full assemblies, single assemblies and Catcher-core proteins are shown for the Alexa488-conjugated samples.

Figure 25 shows cytotoxicity data of one cancer cell line upon treatment with assemblies of various binders with unconjugated CC068 (SEQ ID NO: 47) and CC076 (SEQ ID NO: 52) at 0.25, 2.5 and 25 nM dimer concentration.
Comparing the two assemblies of varying orientation, differences in the pattern of cytotoxicity with the same binder combination are observed, highlighting the utility of screening with Catcher-cores of different geometries. The unit of the colour scale is cytotoxicity relative to the untreated negative control. Black denotes higher cytotoxicity, lighter shades of grey denote lower cytotoxicity (legend shown). For standard deviation, black denotes low standard deviation, lighter shades of grey denote higher standard deviation (n=3; legend shown).

Figure 26 shows in vitro assays of example bispecific antibodies, bAb001 and bAb003, either unconjugated or conjugated to PEG-MMAE (specifically Mal-PEG8-Val-Cit-PAB-MMAE) or PEG-Exatecan (specifically MC-PEG8-Val-Ala-PAB-Exatecan), against a Target A+/Target B+ cell line. bAb001 is a construct of binding domains directly fused to the Fc in a geometry similar to CC068 (binding_domainA-Fc-binding_domainB), whereas bAb003 is a construct of binding domains directly fused to the Fc in a geometry similar to CC076 (binding_domainA-binding_domainB-Fc). A,B) Cell binding assay. The cancer cell line was treated with increasing concentrations of the proteins to a maximum of 200 nM antibody (dimer) concentration, allowing for binding saturation (n=2). Level of cell binding was quantified through incubation with an anti-Fc secondary antibody conjugated to FITC, followed by flow cytometry quantification. A) Percentage of samples bound to the cell surface as quantified by the percent of cells within the positive gate set from the negative controls. B) Cell binding quantified as Mean Fluorescence Intensity (MFI), allowing relative quantification of the amount of antibody bound to the cell surface. Non-linear curve fitting was used to estimate the EC50 of the samples based on the MFI curve; the values were: bAb001-no payload, 0.085 nM; bAb003-no payload, 0.14 nM; bAb001-PEG-MMAE, 0.10 nM; bAb001-PEG-Exatecan, 0.077 nM.
C) In vitro cytotoxicity assay. The cancer cell line was treated with increasing concentrations of the proteins to a maximum of 100 nM antibody concentration and cells were incubated for 5 days. Cell viability was quantified through Hoechst nuclear staining and subsequent readout on the Celigo Imaging Cytometer. Non-linear curve fitting was used to estimate the EC50 of the samples; the values were: bAb001-no payload, 0.026 nM; bAb003-no payload, 0.014 nM; bAb001-PEG-MMAE, 0.00060 nM; bAb001-PEG-Exatecan, N.A.

Detailed Description of the Invention

The invention relates generally to the identification and development of conjugates such as antibody conjugates, in particular multivalent antibody conjugates such as multispecific antibody conjugates, for example bispecific antibody drug conjugates or bispecific antibody dye conjugates.

The invention is described with respect to particular embodiments and with reference to certain drawings but the invention is not limited thereto but only by the claims. Any reference signs in the claims shall not be construed as limiting the scope. Of course, it is to be understood that not necessarily all aspects or advantages may be achieved in accordance with any particular embodiment of the invention. Thus, for example, those skilled in the art will recognize that the invention may be embodied or carried out in a manner that achieves or optimizes one advantage or group of advantages as taught herein without necessarily achieving other aspects or advantages as may be taught or suggested herein.

In addition, as used in this specification and the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to “a scaffold” includes two or more scaffolds, reference to “an oligomer” includes two or more such oligomers and the like.
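By way of illustration only, the assembly clustering described above for Figure 23 (min-max normalisation of the 12 features, followed by hierarchical clustering with the Euclidean distance metric and Ward’s minimum variance method, capped at 30 clusters) can be sketched as follows. This is a hedged sketch rather than the procedure actually used: the use of NumPy and SciPy, the function names and the randomly generated feature matrix are all assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def minmax_normalise(features):
    """Scale each feature (column) to the range 0 to 1."""
    features = np.asarray(features, dtype=float)
    lowest = features.min(axis=0)
    span = features.max(axis=0) - lowest
    span[span == 0] = 1.0  # a constant feature maps to 0 instead of dividing by zero
    return (features - lowest) / span


def cluster_assemblies(features, max_clusters=30):
    """Cluster assemblies (rows) on their normalised features (columns)."""
    scaled = minmax_normalise(features)
    # Ward's minimum variance method on Euclidean distances.
    tree = linkage(scaled, method="ward", metric="euclidean")
    # Cut the tree so that at most `max_clusters` flat clusters are formed.
    labels = fcluster(tree, t=max_clusters, criterion="maxclust")
    return scaled, labels


# Hypothetical data: 400 assemblies x 12 features (3 features x 4 cell lines).
rng = np.random.default_rng(0)
demo_features = rng.normal(size=(400, 12))
scaled, labels = cluster_assemblies(demo_features)
```

Here each row is one assembly and each column one feature, which is the transpose of the heatmap layout described for Figure 23, where features are rows.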
All publications, patents and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entirety.

The invention typically utilises a multivalent binding polypeptide conjugated to a molecule of interest, optionally a non-protein molecule of interest, wherein the multivalent binding polypeptide comprises a first binding domain, a second binding domain and a structural domain, wherein the first binding domain and second binding domain are catcher domains each able to form an isopeptide linkage with a cognate peptide, and wherein the cognate peptide for the first binding domain is different from the cognate peptide for the second binding domain.

Structural Domains

The structural domain provides a defined structural support for the binding domains. In some embodiments, the structural domain can ensure that the binding domains have the desired orientation so that they can bind their targets, typically with both binding domains in the cis orientation. The constructs can therefore present a single binding surface, in some embodiments. In some embodiments, the structural domain provides such structural support via the tertiary structure of a monomer (e.g. suitable relative positioning of N and C termini of a single monomer). In some embodiments, the structural domain provides such structural support via association of monomers to a quaternary structure (e.g. suitable relative positioning of N and/or C termini between monomers). In some preferred embodiments, the structural domain provides such structural support via the tertiary structure of a monomer combined with association of monomers to a quaternary structure (e.g. suitable relative positioning of N and C termini within and between monomers).

The structural domain may be any polypeptide domain comprising a defined secondary structure, typically an alpha helix or a beta sheet.
In some particularly advantageous embodiments, the structural domain has its N and C termini in the same spatial region, for example substantially adjacent or adjacent to each other. Attaching the binding domains to the termini of the structural domain then provides the two binding domains substantially adjacent in the three-dimensional conformation. In some embodiments, the N and C termini are oriented to face in substantially the same direction. As described elsewhere herein, the provision of N and C termini that are adjacent in space can result in the binding domains being situated on the same face of the construct, or on the same face of the oligomer comprising multiple constructs. Accordingly, the constructs typically present a single binding surface. The constructs typically present the binding regions in cis orientation.

The structural domains may comprise a single polypeptide chain, or may be formed of two or more separate polypeptide chains that associate to form a single structural domain complex. In some embodiments, two or more polypeptide chains with appropriate characteristics are identified and then fused, typically by recombinant means to form a single polypeptide chain (i.e. a fusion protein), but also by chemical conjugation or bonding to form a single covalent molecule. Similarly, the structural domain may also comprise multiple domains or may be referred to as a structural region, such as an Fc domain or an Fc region (further described below).

The structural domain is different from the two binding domains. Therefore, when the binding domains are catcher polypeptides such as SpyCatcher, DogCatcher or SnoopCatcher, the structural domain is not a catcher polypeptide.

A number of suitable structural domains are described in WO-A-2022/200804 and WO-A-2024/069180.

CutA1

CutA1, typically human CutA1, is a suitable structural domain.
In some embodiments, a CutA1 protein is engineered to comprise one or more substitutions, insertions or deletions relative to wild-type CutA1.

In certain embodiments the structural domain is human CutA1:

MSGGRAPAVLLGGVASLLLSFVWMPALLPVASRLLLLPRVLLTMASGSPPTQPSPASDSGSGYVPGSVSAAFVTCPNEKVAKEIARAVVEKRLAACVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLP

In certain embodiments, the structural domain is a modified form of human CutA1 wherein at least one cysteine residue is substituted with a different amino acid residue. In certain embodiments, the structural domain is a modified and truncated form of human CutA1 wherein at least one N and/or C terminus amino acid residue is removed and wherein at least one cysteine residue is substituted with a different amino acid residue.

In certain embodiments, the CutA1, typically human CutA1, is engineered to remove one or more cysteine residues from the native sequence (presented above). Typically, this is a substitution of one or more cysteine residues with one or more non-cysteine residues. The removal of one or more cysteine residues beneficially allows for targeted cysteine conjugation at non-native sites or in fusion proteins. Without being bound by theory, unpaired cysteines routinely interfere with stability and downstream application, so it can be beneficial to remove unpaired cysteines. In some embodiments, it is beneficial to remove one or more unpaired cysteines and reintroduce one or more cysteine residues only at optimised targeted positions.

In some embodiments, one or more cysteine residues in CutA1 are substituted with one or more alanine residues. In some embodiments, one or more cysteine residues in CutA1 are substituted with one or more valine residues. In some embodiments, one or more cysteine residues in CutA1 are substituted with one or more serine residues.
In some embodiments, the substitution comprises or consists of two cysteines substituted with two alanines, which is referred to herein as a “CACA” substitution. In some embodiments, the substitution comprises or consists of one cysteine substituted with a valine and one cysteine substituted with a serine, which is referred to herein as a “CVCS” substitution. In some embodiments, the cysteine residues at positions 75 and 96 of wild-type human CutA1 (e.g. SEQ ID NO: 1) are substituted with different residues. In some embodiments, human CutA1 is engineered to substitute two cysteine residues, wherein the cysteine substitutions comprise or consist of (i) C75A, C96A or (ii) C75V, C96S. Accordingly, in some embodiments, the structural domain of the polypeptide construct (or the subunit monomer of the oligomeric core) is a human CutA1 protein engineered to substitute two cysteine residues, wherein the cysteine substitutions comprise or consist of (i) C75A, C96A or (ii) C75V, C96S. Figure 3 demonstrates the generation and biological effects of polypeptide constructs comprising such cysteine-substituted human CutA1 domains.

In some embodiments, it is beneficial to remove one or more unpaired cysteines and reintroduce one or more cysteine residues only at optimised targeted positions. The reintroduced cysteines can be useful, for example, as conjugation sites for a drug or dye, such as in the production of a labelled antibody (or antibody-type molecule) or antibody drug conjugate (“ADC”)-type molecule. Example 3 exemplifies the substitution of non-cysteine residues to cysteine residues, and observes that CutA1 with newly-introduced Cys residues has a higher conjugation efficiency when conjugating a dye or a drug to the CutA1, compared to the wild-type and the negative control (CC041, i.e. CutA1 CACA, engineered to remove native cysteines). The newly-introduced cysteine or cysteines can be substituted into the sequence at a favourable position.
As exemplified in Example 13, exemplary positions in human CutA1 include one or more of V64, E78, K79, K82, E83, K91, Q102, K110, E114, F136, S139, F158 and Q166. In some embodiments, human CutA1 (e.g. as listed above) has cysteine residues substituted-in at 2, 3, 4, 5 or 6 of residues E78, K82, Q102, E114, F136 and Q166.

In certain embodiments, the CutA1, typically human CutA1, is engineered to delete one or more residues from either or both termini of the native sequence, to form a truncated CutA1 domain that is incorporated into the construct of the invention. In some embodiments, one or more residues are deleted from the N terminus of CutA1, typically human CutA1. Typically, 5 to 70 residues are deleted from the N terminus of CutA1, typically human CutA1, for example 10 to 59 residues. In some embodiments, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 50, 55, 59, 60, 61, 62, 63, 64, 65 or 66 residues are deleted from the N terminus of CutA1. In some embodiments, the truncated CutA1 begins at residue 30 (i.e. 29 N-terminal residues are deleted). In some embodiments, the truncated CutA1 begins at residue 33 (i.e. 32 N-terminal residues are deleted). In some embodiments, the truncated CutA1 begins at residue 44 (i.e. 43 N-terminal residues are deleted). In some embodiments, the truncated CutA1 begins at residue 60 (i.e. 59 N-terminal residues are deleted). In some embodiments, the truncated CutA1 begins at residue 67 (i.e. 66 N-terminal residues are deleted), for example the “HsCutA1 67-171 CACA” direct fusion construct exemplified in Figure 23b of WO-A-2024/069180.

In some embodiments, one or more residues are deleted from the C terminus of CutA1, typically human CutA1. The C-terminal deletion may be instead of, or in addition to, the N-terminal deletion. Typically, 5 to 20 residues are deleted from the C terminus of CutA1, typically human CutA1, for example 6 to 12 residues are deleted.
In some embodiments, approximately 8 residues are deleted. In human CutA1 an 8-residue deletion leaves a truncated protein with a C-terminus at what is residue 171 (valine) in the wild-type sequence shown above. In some embodiments, the truncated protein has a C-terminus at what is residue 168 (threonine) in the wild-type sequence shown above. In some embodiments, the truncated protein has a C-terminus at what is residue 169, 170, 172, 173, 174, 175 or 176 in the wild-type sequence shown above.

In some embodiments, the truncated CutA1 starts at any of residues 30 to 67 of the human CutA1 sequence shown above. In some embodiments, the truncated CutA1 starts at any of residues 44 to 67. In certain embodiments, the truncated CutA1 consists of residues 44-179, residues 61-168 or residues 60-171. In some embodiments, the truncated CutA1 starts at any of residues 44 to 67 and ends at any of residues 168 to 179. In other embodiments, the truncated CutA1 starts at any of residues 30 to 65 and ends at any of residues 165 to 179. In some embodiments, the truncated CutA1 starts at any of residues 44 to 67 and ends at any of residues 171 to 179.

The truncation of CutA1 can improve the precision by which fusion constructs can be constructed.

In certain embodiments, the CutA1 is a human CutA1 that is engineered to remove one or more cysteine residues from the native sequence and that is truncated at the N and/or C termini. In some embodiments, the truncated CutA1 consists of residues 44-179 or residues 60-171, and has cysteine substitutions that comprise or consist of (i) C75A, C96A or (ii) C75V, C96S. In other embodiments, the truncated CutA1 starts at any of residues 30 to 65 and ends at any of residues 165 to 179, and also has cysteine substitutions that comprise or consist of (i) C75A, C96A or (ii) C75V, C96S.
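The residue numbering used above for truncations and cysteine substitutions can be checked against the wild-type human CutA1 sequence with a short script. The following is a minimal sketch under stated assumptions: the function name `make_variant` and the chosen example variant (residues 44-179 with C75A, C96A and a cysteine substituted-in at E78) are illustrative only and do not correspond to any particular SEQ ID NO.

```python
import re

# Wild-type human CutA1 as shown above (179 residues, 1-based numbering).
HUMAN_CUTA1 = (
    "MSGGRAPAVLLGGVASLLLSFVWMPALLPVASRLLLLPRVLLTMASGSPPTQPSPASDSGSGYVPGSVSAAFVTC"
    "PNEKVAKEIARAVVEKRLAACVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIA"
    "LPVEQGNFPYLQWVRQVTESVSDSITVLP"
)


def make_variant(wild_type, start, end, substitutions):
    """Apply substitutions such as 'C75A', then keep residues start..end (inclusive).

    Positions always refer to the wild-type numbering, as in the description.
    """
    residues = list(wild_type)
    for substitution in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
        if match is None:
            raise ValueError(f"cannot parse substitution {substitution!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not start <= position <= end:
            raise ValueError(f"{substitution} lies outside residues {start}-{end}")
        if residues[position - 1] != old:
            raise ValueError(f"residue {position} is not {old} in the wild type")
        residues[position - 1] = new
    return "".join(residues[start - 1:end])


# Hypothetical example: truncated 44-179 "CACA" variant with one cysteine substituted-in.
variant = make_variant(HUMAN_CUTA1, 44, 179, ["C75A", "C96A", "E78C"])
```

Because each substitution is validated against the wild-type residue, a mistyped position (for example C76A) raises an error rather than silently producing a wrong sequence.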
In some embodiments, the CutA1 is a human CutA1 that is engineered to remove one or more cysteine residues from the native sequence, is truncated at the N and/or C termini, and has at least one non-cysteine residue substituted to a cysteine residue. In some embodiments, the truncated CutA1 consists of residues 44-179 of human CutA1 (shown above), has cysteine residues substituted out of the native CutA1 sequence that comprise or consist of (i) C75A, C96A or (ii) C75V, C96S, and has at least one cysteine present at a residue position that is not a cysteine residue in the native sequence.

In some embodiments, the truncated CutA1 consists of residues 44-179 of human CutA1, has cysteine substitutions-out that comprise or consist of C75A, C96A, and has a cysteine residue substituted-in at one, two or three of residues K82, E114 and F136.

In some embodiments, the truncated CutA1 consists of residues 44-179 of human CutA1, has cysteine substitutions-out that comprise or consist of C75A, C96A, and has a cysteine residue substituted-in at one or more of residues V64, E78, K79, E83, K91, Q102, K110, S139, F158 and Q166. In some embodiments, this variant CutA1 has cysteine residues substituted-in at 2, 3, 4, 5, 6, 7, 8, 9 or 10 of residues V64, E78, K79, E83, K91, Q102, K110, S139, F158 and Q166.

In some embodiments, the truncated CutA1 consists of residues 44-179 of human CutA1, has cysteine substitutions-out that comprise or consist of C75A, C96A, and has a cysteine residue substituted-in at one or more of residues E78, Q102 and Q166. In some embodiments, this variant CutA1 has cysteine residues substituted-in at two or three of residues E78, Q102 and Q166.

In some embodiments, the truncated CutA1 consists of residues 44-179 of human CutA1, has cysteine substitutions-out that comprise or consist of C75A, C96A, and has a cysteine residue substituted-in at one or more of residues E78, K82, Q102, E114, F136 and Q166.
In some embodiments, this variant CutA1 has cysteine residues substituted-in at 2, 3, 4, 5 or 6 of residues E78, K82, Q102, E114, F136 and Q166.

In some embodiments, the variant CutA1 comprises or consists of a sequence as defined above, or any of the specific sequences described in the Examples, and comprises between 1 and 10 amino acid substitutions at residues that are not specified as being a cysteine residue substitution-in or substitution-out. For example, in some embodiments variant human CutA1 sequences can be provided that contain 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acid substitutions at positions other than 95, 64, 76, 78, 79, 82, 83, 91, 95, 102, 110, 114, 136, 139, 158 and 166. These 1 to 10 substitutions can be to any of the standard 20 amino acids, although typically they will not replace a non-cysteine residue with cysteine. Typically, these 1 to 10 substitutions will be conservative substitutions.

Fc domains

Fc domains of antibodies are well-known in the art.

Typically, the Fc domain (or Fc region) is a human Fc, more typically a human IgG Fc, for example IgG1, IgG2, IgG3 or IgG4. The canonical human IgG1 (secreted) heavy chain constant sequence has UniProtKB accession number P01857-1 (IGHG1). This contains the CH1, hinge (usually defined as comprising at least CPPCP), CH2 and CH3 regions as is well-known in the art.

In some embodiments, the structural domain is the IgG1 constant fragment (Fc) incorporating the hinge region, typically human. The hinge region natively contains Cys residues, allowing payload conjugation. In some embodiments, 1, 2, or 3 of these cysteines can be substituted to a non-cysteine residue, to control conjugation.

For example, the exemplified CC068 sequence (SEQ ID NO: 47) is a SpyCatcher-hinge-Fc-(G4S)2-DogCatcher construct with a C230S (Fc region PDB numbering) mutation.
As there is no flexible linker between the C-terminus of the SpyCatcher003 domain and the hinge region, the C230S mutation was selected to eliminate the most N-terminal disulfide bond between two hinge-Fc monomers and allow greater conformational freedom in the most N-terminal part of the hinge region. Thus, this construct contains 2 Cys residues per monomer and 4 per dimer. In contrast, in CC060 (SEQ ID NO: 48), the hinge region is deleted and longer linkers are used, resulting in a SpyCatcher-(G4S)3-Fc-(G4S)3-DogCatcher construct, which can be used as a negative control for payload conjugation.

Linkers as structural domains

In some embodiments, the structural domain comprises or consists of a polypeptide linker. The polypeptide linker typically comprises or consists of 3 to 30 amino acid residues, typically 5 to 20 residues, for example 7, 8, 9, 10, 11, 12, 13 or 14 residues. In some embodiments, the polypeptide linker comprises or consists of 9 amino acid residues.

The polypeptide linker may comprise a majority (i.e. more than 50%) glycine and/or serine residues, or may comprise only glycine and/or serine residues. An exemplary linker is GSGGSGGSG. A polypeptide linker can be rigid or flexible.

In some embodiments, additional structural support for the binding domains is provided by the presence of one or more intramolecular disulphide bonds that help to lock the orientation of the binding domains. For example, different domains of the construct can be connected by a disulphide linkage. This may be achieved by engineering a non-native cysteine residue into each of two binding domains, at a location that allows for disulphide bond formation between the engineered cysteines without destroying target binding.

Antigen Binding Domains

The multivalent binding polypeptides of the invention comprise multiple binding sites. Typical binding sites are antigen-binding domains, which are well-known in the art.
An antigen binding domain (sometimes also referred to as an antigen binding region) may be a full-length antibody, an antigen-binding fragment of an antibody or an engineered antibody construct. Antigen-binding domains within the scope of the invention include, but are not limited to, Fab fragments, F(ab’)2, scFv fragments, scFv tandem, scFv-Fc, diabodies, scFv-CH3 (minibodies), scFab, human antibodies, humanised antibodies, nanobodies / VHH fragments, humanized VHH, (engineered / stabilised) human VH domains. Antigen-binding domains further include, without limitation, suitable natural or engineered protein scaffolds not derived from antibodies such as affibodies (engineered from Protein A), designed ankyrin repeat proteins (DARPins), or knottins. Furthermore, antigen-binding domains may include naturally occurring antigen-binders, such as cytokines and natural ligands or fragments thereof (such as growth factors, tumour necrosis factors, interleukins; such as TNFα, TGFα; including membrane-bound or soluble components). As used herein, antigen-binding domains shall also refer to receptor fragments capable of binding a soluble or membrane-bound protein or peptide, as exemplified by existing therapeutic drugs such as Enbrel (featuring Fc fused to a TNFR2-receptor fragment capable of binding TNFα and TNFβ).

The term “antibody” is used herein in its broadest sense and includes certain types of immunoglobulin molecules comprising one or more antigen-binding domains that specifically bind to an antigen or epitope. An antibody includes intact full-length antibodies (e.g., intact immunoglobulins), antibody fragments, and multi-specific antibodies.

An scFv (single-chain fragment variable) typically has a variable domain of light chain (VL) connected from its C-terminus to the N-terminal end of a variable domain of heavy chain (VH) by a polypeptide chain.
Alternatively, the scFv comprises a polypeptide chain in which the C-terminal end of the VH is connected to the N-terminal end of the VL by a polypeptide chain.

A “Fab fragment” (also referred to as fragment antigen-binding) contains the constant domain (CL) of the light chain and the first constant domain (CH1) of the heavy chain along with the variable domains VL and VH on the light and heavy chains respectively. The variable domains comprise the complementarity determining loops (CDR, also referred to as hypervariable region) that are involved in antigen-binding. Fab′ fragments differ from Fab fragments by the addition of a few residues at the carboxy terminus of the heavy chain CH1 domain including one or more cysteines from the antibody hinge region.

“F(ab’)2” fragments contain two Fab’ fragments joined, near the hinge region, by disulfide bonds. F(ab’)2 fragments may be generated, for example, by recombinant methods or by pepsin digestion of an intact antibody. The F(ab’) fragments can be dissociated, for example, by treatment with β-mercaptoethanol.

“Fv” fragments comprise a non-covalently-linked dimer of one heavy chain variable domain and one light chain variable domain.

“Single-chain Fv” or “scFv” includes the VH and VL domains of an antibody, wherein these domains are present in a single polypeptide chain. In one embodiment, the Fv polypeptide further comprises a polypeptide linker between the VH and VL domains which enables the scFv to form the desired structure for antigen-binding. For a review of scFv see Pluckthun in The Pharmacology of Monoclonal Antibodies, vol. 113, Rosenburg and Moore eds., Springer-Verlag, New York, pp. 269-315 (1994). HER2 antibody scFv fragments are described in WO93/16185; U.S. Pat. No. 5,571,894; and U.S. Pat. No. 5,587,458.

“scFv-Fc” fragments comprise an scFv attached to an Fc domain. For example, an Fc domain may be attached to the C-terminal of the scFv.
The Fc domain may follow the VH or VL, depending on the orientation of the variable domains in the scFv (i.e., VH-VL or VL-VH). Any suitable Fc domain known in the art or described herein may be used. In some cases, the Fc domain comprises an IgG4 Fc domain. In some cases, the Fc domain comprises an IgG1 Fc domain.

A “single domain antibody” or “sdAb” refers to a molecule in which one variable domain of an antibody specifically binds to an antigen without the presence of the other variable domain. Single domain antibodies, and fragments thereof, are described in Arabi Ghahroudi et al., FEBS Letters, 1998, 414:521-526 and Muyldermans et al., Trends in Biochem. Sci., 2001, 26:230-245, each of which is incorporated by reference in its entirety. Single domain antibodies are also known as sdAbs or nanobodies. sdAbs are fairly stable and easy to express as a fusion partner with the Fc chain of an antibody (Harmsen MM, De Haard HJ (2007). "Properties, production, and applications of camelid single-domain antibody fragments". Appl. Microbiol Biotechnol. 77(1): 13-22).

The terms “full length antibody,” “intact antibody,” and “whole antibody” are used herein interchangeably to refer to an antibody having a structure substantially similar to a naturally occurring antibody structure and having heavy chains that comprise an Fc region. For example, when used to refer to an IgG molecule, a “full length antibody” is an antibody that comprises two heavy chains and two light chains.

Catcher domains and cognate tags

Site-specific isopeptide linkages

An isopeptide bond is an amide bond that is formed between the side chains of two amino acid residues, for example between the carboxyl group of one amino acid and the amino group of another.

A Tag/Catcher system to form an isopeptide bond is typically used according to the invention. Tag/Catcher systems have also been described utilising alternative chemistries, such as ester bonds instead of isopeptide bonds.
The SpyTag/SpyCatcher system is a well-known technique for forming isopeptide linkages between amino acid residues. Other known Tag/Catcher systems include SnoopTag/SnoopCatcher, SdyTag/Catcher and SpyLigase/SpyTag, and can be used according to the invention.

SpyTag-SpyCatcher system

The SpyCatcher/SpyTag system is known in the art, for example as described by Zakeri et al 2012 (Proc Natl Acad Sci U S A. 2012 Mar 20; 109(12): E690-E697).

Briefly, Streptococcus pyogenes fibronectin-binding protein FbaB contains a domain (CnaB2) with a spontaneous isopeptide bond between Lys (K31) and Asp (D117). By splitting this domain and rational engineering of the fragments, a peptide (SpyTag) is provided which forms an amide bond to its protein partner (SpyCatcher) in minutes. The catalytic lysine for isopeptide bond formation is in the SpyCatcher domain and the reactive aspartate is in the SpyTag peptide. The isopeptide bond formation reaction occurs spontaneously in high yield simply upon mixing and amidst diverse conditions of pH, temperature, and buffer/redox conditions. SpyTag can be fused to a target protein at either terminus or internally and reacts specifically with SpyCatcher forming a site-directed isopeptide bond. This isopeptide bond has been shown not to be reversed by boiling or competing peptide. Single-molecule dynamic force spectroscopy showed that SpyTag did not separate from SpyCatcher until the force exceeded 1 nN, where covalent bonds snap. The robust reaction conditions and irreversible linkage of SpyTag shed light on spontaneous isopeptide bond formation and provide a stable linkage for new protein architectures.

Multiple rounds of engineering have produced SpyCatchers/SpyTags with fast rates for isopeptide bond formation. This has been used for protein conjugation with many different proteins.
For example, the initial SpyTag/SpyCatcher system was subsequently evolved to increase the rate of isopeptide bond formation, through SpyTag/Catcher 002 to SpyTag/SpyCatcher 003, which saw a 400-fold rate improvement from the first iteration.

A range of SpyTag peptides and SpyCatcher polypeptides are known in the art, for example as described by Arnold et al (J Am Chem Soc. 2013 September 18; 135(37): 13988-13997), Tirrell et al (Proc Natl Acad Sci U S A. 2014 July 21; 111(31): 11269-11274), Howarth et al (J Am Chem Soc. 2014 August 21; 136(35): 12355-12363), Li et al (J Mol Biol. 2014 Jan 23; 426(2): 309–317) and Keeble et al (PNAS December 26, 2019 116 (52) 26523-26533). Any suitable sequences can be used in the invention as will be apparent to the skilled person.

The plasmids for basic SpyCatcher and SpyTag constructs are available from the Addgene plasmid repository (www.addgene.org): SpyCatcher (#35044); ΔN1ΔC2SpyCatcher (#87376); SpyTag-maltose binding protein (MBP) (#35050); AviTag-SpyCatcher (#72326); SpyCatcher002 (#102827); SpyTag002-MBP (#102831).

A database of Tag/Catcher sequences known as “SpyBank” is publicly available at https://www2.bioch.ox.ac.uk/howarth/info.htm and http://www.howarthgroup.org/info. In October 2022 this database contained over 1,000 Tag/Catcher amino acid sequences to download as an Excel file. SpyBank includes expression organism, Tag/Catcher location (N-terminal, internal, C-terminal) and annotated linkers, as a reference for groups creating their own fusions. SpyBank itself is described in: “Insider information on successful covalent protein coupling with help from SpyBank” by Keeble AH, Howarth M. Meth Enz 2019.

As used herein, the terms SpyCatcher and SpyTag refer to the diversity of SC and ST proteins and are not limited to one specific sequence.

The original SpyCatcher peptide is shown in SEQ ID NO:2 with the expressed N-terminal sequences and in SEQ ID NO:5 without those optional N-terminal sequences.
The original SpyTag peptide is 13 residues long as shown in SEQ ID NO:3. A truncated but still functional SpyTag is shown in SEQ ID NO:4 and a truncated but still functional SpyCatcher is shown in SEQ ID NO:6. SEQ ID NOs: 7 and 8, and SEQ ID NOs: 9 and 10, show two further SpyTag/SpyCatcher pairs, respectively. The SpyTag of SEQ ID NO:3 is typically paired with the SpyCatcher of SEQ ID NO:5 or SEQ ID NO:6. SpyTag002 of SEQ ID NO:7 is typically paired with SpyCatcher002 of SEQ ID NO:8. SpyTag003 (SEQ ID NO:9) is typically paired with SpyCatcher003 of SEQ ID NO:10, although each SpyTag generation can also bond efficiently with other SpyCatcher generations.

Original SpyCatcher Sequence (SEQ ID NO:2)
(Optional N-terminal protease site in italics; optional linker underlined)
MSYYHHHHHHDYDIPTTENLYFQGAMVDTLSG
LSSEQGQSGDMTIEEDSATHIKFSKRDEDGKE
LAGATMELRDSSGKTISTWISDGQVKDFYLYP
GKYTFVETAAPDGYEVATAITFTVNEQGQVTV
NGKATKGDAHI

Original SpyTag Sequence (SEQ ID NO:3)
(Aspartate residue that forms the isopeptide bond is underlined)
AHIVMVDAYKPTK

Minimal Reactive SpyTag Sequence (SEQ ID NO:4)
AHIVMVDA

Original SpyCatcher Sequence without optional sequences (SEQ ID NO:5)
VDTLSGLSSEQGQSGDMTIEEDSATHIKFSKR
DEDGKELAGATMELRDSSGKTISTWISDGQVK
DFYLYPGKYTFVETAAPDGYEVATAITFTVNE
QGQVTVNGKATKGDAHI

Minimal Reactive SpyCatcher “ΔN1ΔC2” (SEQ ID NO:6)
DSATHIKFSKRDEDGKELAGATMELRDSSGKT
ISTWISDGQVKDFYLYPGKYTFVETAAPDGYE
VATAITFTVNEQGQVTVNG

SpyTag002 (SEQ ID NO:7)
VPTIVMVDAYKRYK

SpyCatcher002 (SEQ ID NO:8)
VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKR
DEDGRELAGATMELRDSSGKTISTWISDGHVK
DFYLYPGKYTFVETAAPDGYEVATAITFTVNE
QGQVTVNGEATKGDAHT

SpyTag003 (SEQ ID NO:9)
RGVPHIVMVDAYKRYK

SpyCatcher003 (SEQ ID NO:10)
VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKR
DEDGRELAGATMELRDSSGKTISTWISDGHVK
DFYLYPGKYTFVETAAPDGYEVATPIEFTVNE
DGQVTVDGEATEGDAHT

Suitable SpyCatcher and SpyTag sequences can include modifications to these exemplary amino acid sequences.
SpyCatcher variants can include modifications to the amino acid residues in the SpyCatcher peptide. In some embodiments, the SpyCatcher polypeptide comprises or consists of a sequence at least 70% identical, at least 80% identical, at least 90% identical, or at least 95% identical to SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:8 or SEQ ID NO:10. In some embodiments, the lysine residue that forms the isopeptide bond is not modified. In some embodiments, the lysine residue that forms the isopeptide bond is substituted with arginine or histidine.

A variant of the SpyCatcher molecule that can be used according to the invention is the SpyLigase protein described by Fierer et al (PNAS April 1, 2014 111 (13) E1176-E1181). SpyLigase locks two peptide tags together. SpyLigase was formed by splitting CnaB2 into three parts to enable peptide–peptide ligation. Specifically, SpyTag (13 aa) was left unchanged but the β-strand of CnaB2 containing the reactive Lys was separately expressed and termed KTag (10 aa). SpyLigase (11 kDa) was derived from SpyCatcher by (i) removing residues from the β-strand containing the reactive Lys and (ii) circular permutation, replacing residues from the C terminus of CnaB2 with a Gly/Ser linker followed by N-terminal CnaB2 residues. These sequences are summarised below:

Note:
TEV Protease cleavage site: ENLYFQG
I>E and M>Y mutations in SpyCatcher compared to original CnaB2
SpyTag parent sequence: AHIVMVDA
KTag parent sequence: ATHIKFSKRD
Circular permutation underlined.

SpyTag and KTag dock with SpyLigase and the triad is arranged as in CnaB2 to direct covalent ligation of SpyTag with KTag. In some embodiments of the present invention, one protein (e.g. enzyme) is tagged with SpyTag and another protein (e.g. enzyme) is tagged with KTag. These tagged proteins can bind together using a SpyLigase immobilised to a bead.
In some embodiments, the catcher polypeptide comprises or consists of a sequence at least 70% identical, at least 80% identical, at least 90% identical, or at least 95% identical to the SpyLigase sequence shown above.

SpyTag variants can include 1, 2, 3 or more modifications to the amino acid residues in the SpyTag peptide. In some embodiments, the SpyTag peptide comprises or consists of a sequence having 1, 2, 3, 4 or 5 substitutions, additions or deletions relative to SEQ ID NO:3, SEQ ID NO:7 or SEQ ID NO:9, wherein the active Asp (e.g. at position 7 of SEQ ID NO:3) is not varied or wherein the active aspartic acid is substituted for glutamic acid. Typically, the SpyTag peptide comprises the minimal reactive sequence of SEQ ID NO:4.

Importantly, catcher/tag systems that are “orthogonal” to SpyCatcher/SpyTag (do not cross-react with SpyCatcher/SpyTag) have been described in the art. These orthogonal systems include SnoopCatcher/SnoopTag, SnoopCatcher/SnoopTagJr, and DogCatcher/DogTag. SnoopCatcher and DogCatcher were derived from the same D4 domain from the RrgA adhesion protein from Streptococcus pneumoniae. The D4 domain was differentially split at both termini to form SnoopCatcher/SnoopTag and DogCatcher/DogTag. SnoopCatcher and DogCatcher can react with each other as they contain cognate tag sequences. Systems similar to SpyLigase have been described based on the RrgA D4 domain, i.e. the SnoopLigase and SnoopLigase2 systems with respective cognate tags.
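The percent-identity thresholds and the SpyTag reactive-residue condition described above can be illustrated with a minimal Python sketch. It is an illustration only: identity is computed here by ungapped, position-by-position comparison of two equal-length sequences, which is a simplifying assumption (a full comparison would normally use a pairwise alignment), and the function names are hypothetical. The sequences are those of SEQ ID NO:3, SEQ ID NO:8 and SEQ ID NO:10 as listed above.

```python
# Sequences copied from SEQ ID NO:8, SEQ ID NO:10 and SEQ ID NO:3 above.
SPYCATCHER002 = (
    "VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKR"
    "DEDGRELAGATMELRDSSGKTISTWISDGHVK"
    "DFYLYPGKYTFVETAAPDGYEVATAITFTVNE"
    "QGQVTVNGEATKGDAHT"
)
SPYCATCHER003 = (
    "VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKR"
    "DEDGRELAGATMELRDSSGKTISTWISDGHVK"
    "DFYLYPGKYTFVETAAPDGYEVATPIEFTVNE"
    "DGQVTVDGEATEGDAHT"
)
SPYTAG = "AHIVMVDAYKPTK"


def count_mismatches(a, b):
    """Count positions that differ between two equal-length sequences."""
    if len(a) != len(b):
        # Assumption of this sketch: no gaps, so lengths must match.
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def percent_identity(a, b):
    """Ungapped percent identity (hypothetical helper, not an alignment)."""
    return 100.0 * (len(a) - count_mismatches(a, b)) / len(a)


def reactive_residue_ok(tag, position=7):
    """True if the 1-based position holds Asp (D) or the Glu (E) substitution."""
    return len(tag) >= position and tag[position - 1] in ("D", "E")


if __name__ == "__main__":
    identity = percent_identity(SPYCATCHER002, SPYCATCHER003)
    print(f"SpyCatcher002 vs SpyCatcher003: {identity:.1f}% identical")
    print("SpyTag reactive residue retained:", reactive_residue_ok(SPYTAG))
```

A candidate variant could be screened against a chosen threshold (for example 90%) by comparing the returned value with that threshold; variants that differ in length would need an alignment step that this sketch does not attempt.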
Exemplary and non-limiting catcher-tag combinations are described in the table below:

Catcher          Tag
SpyCatcher       SpyTag
SpyCatcher       SpyTag002
SpyCatcher       SpyTag003
SpyTag           SpyCatcher
SpyTag002        SpyCatcher
SpyTag003        SpyCatcher
SpyCatcher002    SpyTag002
SpyCatcher002    SpyTag
SpyCatcher002    SpyTag003
SpyTag002        SpyCatcher002
SpyTag           SpyCatcher002
SpyTag003        SpyCatcher002
SpyCatcher003    SpyTag003
SpyCatcher003    SpyTag
SpyCatcher003    SpyTag002
SpyTag003        SpyCatcher003
SpyTag           SpyCatcher003
SpyTag002        SpyCatcher003
SpyTag           K-tag            (binding mediated by SpyLigase)
K-tag            SpyTag           (binding mediated by SpyLigase)
SnoopCatcher     SnoopTag
SnoopTag         SnoopCatcher
SnoopCatcher     SnoopTagJr
SnoopTagJr       SnoopCatcher
SnoopTagJr       DogTag           (binding mediated by SnoopLigase)
DogTag           SnoopTagJr       (binding mediated by SnoopLigase)
Pilin-C          IsopepTag
IsopepTag        Pilin-C
DogCatcher       DogTag

First polypeptides

One or more domains of the first polypeptide may be connected by a linker, typically a peptide linker. Suitable peptide linkers for use in connecting a binding domain to a structural domain are typically between 1 and 150, 1 and 100, 1 and 50, 1 and 25, 1 and 20, 1 and 15, or 1 and 10 amino acids in length. Examples of linker sequences are GSGS, GGGGS, GGGGSGGGGS, and GGGGSGGGGSGGGGS.

A number of illustrative first polypeptides are provided below.

SpC-Linker1-CutA1-Linker2-SpC
SnC-Linker1-CutA1-Linker2-SnC
SnC-Linker1-CutA1-Linker2-SpC
SpC-Linker1-CutA1-Linker2-SnC
SpC3-Linker1-CutA1-Linker2-DgC

wherein SpC is SpyCatcher, SpC3 is SpyCatcher003, SnC is SnoopCatcher, DgC is DogCatcher. In each case, the Linker1 and Linker2 linkers are optional. The CutA1 sequences may be human or from Pyrococcus horikoshii, or a homologue from another species, or have at least 30%, at least 50%, at least 70% or at least 90% identity to the human or Pyrococcus horikoshii sequence.

Further illustrative constructs are:

SnC-Linker1-NC1-Linker2-SpC
SpC-Linker1-NC1-Linker2-SnC

wherein NC1 is a collagen NC1 domain from Collagen VIII or Collagen X. In each case, the Linker1 and Linker2 linkers are optional.

Any suitable structural domain, in particular any of the structural domains described herein, may be used in these constructs instead of the exemplary structural domains provided above. Accordingly, these exemplary formats are described for use with structural domains in general and with the structural domains described herein.
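The Catcher-Linker1-core-Linker2-Catcher layouts listed above, with both linkers optional, can be sketched as a small Python helper. This is a hedged illustration rather than part of the described method: the function names and the bracket notation for linkers are hypothetical, and the only rule enforced is the broadest linker length range given above (1 to 150 residues).

```python
# Broadest linker length range stated in the text (amino acid residues).
MIN_LINKER_LEN = 1
MAX_LINKER_LEN = 150

# Example linker sequences given in the text.
EXAMPLE_LINKERS = ("GSGS", "GGGGS", "GGGGSGGGGS", "GGGGSGGGGSGGGGS")


def check_linker(linker):
    """Raise ValueError if the linker length is outside the stated range."""
    if not MIN_LINKER_LEN <= len(linker) <= MAX_LINKER_LEN:
        raise ValueError(f"linker length {len(linker)} outside 1-150 residues")


def construct_layout(n_catcher, core, c_catcher, linker1=None, linker2=None):
    """Return a schematic layout string; linkers are optional (hypothetical helper)."""
    parts = [n_catcher]
    if linker1:
        check_linker(linker1)
        parts.append(f"[{linker1}]")
    parts.append(core)
    if linker2:
        check_linker(linker2)
        parts.append(f"[{linker2}]")
    parts.append(c_catcher)
    return "-".join(parts)


if __name__ == "__main__":
    # Abbreviations follow the text: SpC3 = SpyCatcher003, DgC = DogCatcher.
    print(construct_layout("SpC3", "CutA1", "DgC"))
    print(construct_layout("SpC", "CutA1", "SnC", "GGGGS", "GSGS"))
```

Keeping the linkers as optional arguments mirrors the statement that Linker1 and Linker2 may each be omitted; any other structural domain name can be passed as the core.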
Molecules of interest

A molecule of interest includes an effector molecule or a payload as sometimes referred to in the art, in particular in the context of conjugating a molecule of interest to an antibody to form an antibody-drug conjugate (ADC). Accordingly, such payloads as well as other molecules of interest are well-known in the art. Typically, the molecule of interest comprises or consists of a label, a dye molecule, or a drug, i.e. a substance that has a therapeutic or prophylactic physiological effect on a human or animal body when consumed by the body. An effect of such a drug may be potentiated or enabled when administered as a drug conjugate. A drug is typically a pharmaceutical drug.

In some typical embodiments, the molecule of interest is a drug. The drug may be of any type, as will be apparent to the skilled person. Drug types that may be conjugated to CutA1 or Fc include but are not limited to analgesic, antibiotic, anticancer, anticoagulant, antidepressant, antidiabetic, antiepileptic, antipsychotic, antispasmodic, antiviral, cardiovascular, depressant, sedative, and stimulant drugs. In some embodiments, the drug is an anticancer drug, for example a cytotoxic drug. In some embodiments, the anticancer drug is a tubulin inhibitor, for example a maytansinoid such as mertansine, or auristatin, or a taxol derivative. Examples of drugs for inhibiting tubulin polymerisation are auristatins such as the tissue factor directed monomethyl auristatin E (MMAE) and monomethyl auristatin F (MMAF), compounds derived from dolastatin 10 (e.g. TZT-1027 as described by Kobayashi et al, Jpn J Cancer Res. 1997 Mar;88(3):316-27), and tubulysins such as Tubulysin A. In some embodiments, the anticancer drug is a DNA-damaging agent, which causes cell death by cleaving the DNA and/or causing DNA alkylation. Examples of DNA-damaging molecules are duocarmycins, calicheamicins, and pyrrolobenzodiazepines.
In some embodiments, the drug is a topoisomerase I inhibitor, which binds to a complex between topoisomerase I and DNA, stabilises it and thereby prevents DNA re-ligation, leading to DNA damage. Examples of drugs inhibiting topoisomerase I are DXd and SN-38. In some embodiments, the drug is an RNA polymerase II inhibitor such as α-amanitin. In some embodiments, the drug is an immunomodulator such as a TLR agonist or a STING agonist (as described, for example, in Table A of Fu et al, Signal Transduction and Targeted Therapy volume 7, Article number: 93 (2022)).

In some embodiments, the drug is a small molecule drug. Typically, a small molecule drug has a molecular weight of 1000Da or lower, more typically 750Da or lower, or 500Da or lower. This molecular weight is of the drug molecule itself and excludes any linker.

In some embodiments the drug is a cytotoxic drug, typically a small molecule cytotoxic drug, for example DXd, mertansine, monomethyl auristatin E (MMAE), or monomethyl auristatin F (MMAF). A cytotoxic small molecule drug is often called a chemotherapy or a chemotherapy agent, in particular in the context of cancer treatment.

In some embodiments, the small molecule drug is exatecan.

Conjugation of the drug to the first polypeptide can be via any known method. Typically, the drug is conjugated to the first polypeptide at a cysteine residue, more typically a cysteine residue that is not present in the native sequence of the structural domain (e.g. CutA1 or Fc domain). When conjugating to cysteine, the molecule of interest to be conjugated may comprise a maleimide linker region. Cysteine residues react readily with maleimides to form succinimidyl thioether conjugates.

In some embodiments, the drug has a molecular weight of greater than 1000Da.
In some embodiments, the drug is or comprises an oligopeptide or a polypeptide, comprising 2 or more, for example 3, 4, 5, 6, 7, 8, 9, 10 or more, or 20 or more, or 50 or more, or 100 or more, amino acid residues covalently linked by peptide bonds. In some embodiments, the drug is an immunotoxin, for example Pseudomonas exotoxin A (PE) which is described as an anticancer agent by Wolf & Beile (Int J Med Microbiol. 2009 Mar;299(3):161-76. doi: 10.1016/j.ijmm.2008.08.003).

In some embodiments, the molecule of interest is a dye molecule. In some embodiments, the dye molecule is a small molecule dye. Typically, a small molecule dye has a molecular weight of 1000Da or lower, more typically 750Da or lower, or 500Da or lower. In some embodiments the dye is fluorescent, for example fluorescein. Conjugation can be via any known method. Typically, the dye is conjugated to the CutA1 at a cysteine residue, more typically a cysteine residue that is not present in the native sequence. When conjugating to cysteine, the molecule to be conjugated may comprise a maleimide. Cysteine residues react readily with maleimides to form succinimidyl thioether conjugates.

In some embodiments, the effector molecule is a nanoparticle, for example a gold nanoparticle, a silica nanoparticle, a lipid nanoparticle, or a lipid polydopamine hybrid nanoparticle (“LPN”) as described by Yang et al, Acta Pharmaceutica Sinica B Volume 10, Issue 11, November 2020, pages 2212-2226. Typically, the nanoparticle is conjugated to one or more cysteine residues on the CutA1, by site-specific conjugation onto a maleimide group of a lipid nanoparticle or a polydopamine (PDA) hybrid nanoparticle.

Post-assembly clean-up

In some aspects, as shown in the Examples, the assembly of conjugates (e.g. ADCs, AOCs) provided by the invention can be combined with simple post-assembly clean-up. This is particularly useful for the manufacture of drug candidates for downstream analysis.
In some embodiments, these methods can be automated. In some embodiments the clean-up uses beads, for example paramagnetic beads, that can be bound with appropriate catchers to quench unconjugated tagged proteins from an assembly of tagged binders to a Catcher-core (see e.g. Figure 17). This allows for cost-effective, rapid purification of assemblies (by removing unbound tagged binders) that is easily scalable.

The clean-up method typically comprises contacting a post-assembly reaction mixture with clean-up binding domains that can bind to unbound binders. The clean-up binding domains, bound to the previously unbound binders, can then be removed from the reaction mixture.

These clean-up methods are particularly suitable for plate-based formats that can be automated through the use of a liquid handling robot. For example, GST-Catchers can be loaded onto glutathione-conjugated beads (e.g. Figure 15) and the unbound excess removed via washing with PBS (e.g. Figure 16). The GST-Catcher bound beads can then be added to each conjugation reaction, typically in excess (for example a 2x excess of GST-Catcher) to the corresponding unconjugated tagged binders. The beads can then be removed by an appropriate means, for example when the beads are magnetic they can be removed via magnetic capture of the particles, leaving a supernatant that contains only the fully assembled molecule (e.g. Figure 18).

This method is advantageous, relative to covalent conjugation of catchers to beads, as it can be performed on-demand at a variety of scales. There are also no complex chemical processes or reducing conditions required. Furthermore, the design of the catcher constructs, the ratio of catcher proteins, or the resin can easily be changed as demand requires.
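The bead dosing described above (GST-Catcher added in excess, for example 2x, over the unconjugated tagged binders) reduces to simple arithmetic. The following Python sketch illustrates it under stated assumptions: amounts are expressed in picomoles purely for illustration, the amount of binder consumed during assembly is assumed to be known or estimated by the user, and the function name is hypothetical.

```python
def gst_catcher_needed(total_binder_pmol, consumed_binder_pmol, fold_excess=2.0):
    """Amount of GST-Catcher (pmol) for a given fold excess over leftover binder.

    total_binder_pmol: tagged binder added to the assembly reaction.
    consumed_binder_pmol: tagged binder assumed to have reacted with the Catcher-core.
    fold_excess: molar excess of GST-Catcher over unconjugated binder (2x in the text's example).
    """
    unconjugated = total_binder_pmol - consumed_binder_pmol
    if unconjugated < 0:
        raise ValueError("consumed binder cannot exceed total binder")
    return fold_excess * unconjugated


if __name__ == "__main__":
    # Hypothetical well: 120 pmol binder added, 100 pmol assumed consumed.
    print(gst_catcher_needed(120.0, 100.0), "pmol GST-Catcher on beads")
```

In a plate-based format the same calculation would simply be repeated per well, once for each type of excess tagged binder and its matching GST-Catcher.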
This clean-up technology is a general improvement to the art and can also be applicable to other modular assembly approaches, in particular the method described by Driscoll et al, September 2023, bioRxiv 2023.08.31.555700; doi: https://doi.org/10.1101/2023.08.31.555700. The clean-up technology may be adapted for approaches such as Sortase-based assembly, for example as described by Andres et al, Mol Cancer Ther (2020) 19 (4): 1080–1088 “High-Throughput Generation of Bispecific Binding Proteins by Sortase A–Mediated Coupling for Direct Functional Screening in Cell Culture”.

The clean-up step typically uses fusion proteins that are able to bind excess binders in the reaction mix after the assembly of catchers and tagged binders. These clean-up fusion proteins typically comprise a protein that is useful in the clean-up method described immediately above, fused to at least one binding domain as described herein, typically a catcher domain that is able to bind to excess binder in the reaction mix. The protein that is useful in the clean-up step is typically able to form a protein-domain based association useful in a purification process. This association may be covalent (e.g. halo-tag, which is a modified haloalkane dehalogenase designed to bind covalently to synthetic ligands that comprise a chloroalkane linker attached to a variety of useful molecules, such as fluorescent dyes, affinity handles, or solid surfaces) or non-covalent (e.g. maltose binding protein [MBP]). As noted above, this construct is typically a glutathione-S-transferase “GST” sequence that is able to bind to GSH on a bead. In some embodiments, the clean-up fusion protein is attached to a solid support such as a bead. In some embodiments, the bead may be magnetic or paramagnetic.

GST/GSH is a particularly suitable system due to commercial availability of paramagnetic beads with high protein capacity at comparably low cost.
The clean-up fusion protein can comprise a single type of binding domain, which will be able to clean up one excess binder. This single binding domain may be provided in a single copy (e.g. GST-SpyCatcher), or optionally be provided in multiple copies (for example in the format GST-SpyCatcher-SpyCatcher). When there are two or more post-assembly excess binders to clean up, then two or more clean-up fusion proteins can be used in combination, each with a different single type of binding domain.

GST-SpyCatcher and GST-DogCatcher are examples of clean-up fusion proteins. In some exemplary embodiments, for example, a first clean-up fusion protein is GST-SpyCatcher and a second clean-up fusion protein is GST-DogCatcher. Specific sequences are provided in the Examples for GST-SpyCatcher003 and GST-DogCatcher. In other exemplary embodiments, clean-up fusion proteins comprise GST-SpyCatcher002 or GST-SpyCatcher003. In other exemplary embodiments, clean-up fusion proteins comprise GST-SpyCatcher002 and GST-SpyCatcher003, used in combination. Other clean-up fusion proteins may include inactive variants of SpyCatchers such as SpyCatcher002 KA, SpyDock or SpySwitch (e.g. as described in Khairil Anuar et al, Nature Communications volume 10, Article number: 1734 (2019), in particular Figure 7).

MBP-SpyCatcher and MBP-DogCatcher are further examples of clean-up fusion proteins. In some exemplary embodiments, for example, a first clean-up fusion protein is MBP-SpyCatcher and a second clean-up fusion protein is MBP-DogCatcher. In other exemplary embodiments, clean-up fusion proteins comprise MBP-SpyCatcher002 or MBP-SpyCatcher003. In other exemplary embodiments, clean-up fusion proteins comprise MBP-SpyCatcher002 and MBP-SpyCatcher003, used in combination.

Halotag-SpyCatcher and Halotag-DogCatcher are further examples of clean-up fusion proteins.
In some exemplary embodiments, for example, a first clean-up fusion protein is Halotag-SpyCatcher and a second clean-up fusion protein is Halotag-DogCatcher. In other exemplary embodiments, clean-up fusion proteins comprise Halotag-SpyCatcher002 or Halotag-SpyCatcher003. In other exemplary embodiments, clean-up fusion proteins comprise Halotag-SpyCatcher002 and Halotag-SpyCatcher003, used in combination.

In some embodiments, two or more binding domains are attached to a single protein that is useful in the clean-up method described above. The multiple binding domains, typically multiple catcher domains, may be connected at any suitable location on the protein useful in clean-up (e.g. GST, MBP or halo-tag). The multiple binding (e.g. catcher) domains may for example be arranged in series at one terminus of the clean-up protein, or be arranged at different termini. As with other embodiments described herein, a linker peptide (e.g. 2 to 30 amino acid residues) may be included between separate components of the fusion as appropriate. Some exemplary formats, wherein each catcher is different, include Catcher1-GST-Catcher2, Catcher1-Catcher2-GST, GST-Catcher1-Catcher2, Catcher1-MBP-Catcher2, Catcher1-Catcher2-MBP, MBP-Catcher1-Catcher2, Catcher1-halotag-Catcher2, Catcher1-Catcher2-halotag, or halotag-Catcher1-Catcher2.

In some embodiments, a useful polypeptide comprises a first binding domain at the N terminus and a second binding domain at the C terminus, wherein the first and second binding domains are separated by a structural domain that is useful in the clean-up method described immediately above, and wherein the first and second binding domains are the same or are different. In these embodiments, the structural domain is typically able to form a protein-domain based association useful in a purification process. This association may be covalent (e.g. halo-tag) or non-covalent (e.g. maltose binding protein [MBP]).
As noted above, this construct is typically a glutathione-S-transferase “GST” sequence that is able to bind to GSH on a bead.

The invention is further described with reference to the following non-limiting Examples.

EXAMPLES

The inventors have devised a method for the scalable generation of bispecific molecules (or other orders of multiple specificities, such as trispecific molecules) featuring conjugation, for example small molecule dye conjugation or small molecule drug conjugation. The approach is described in Example 1 and Figure 1.

Example 1: Design of scalable process for generation of conjugated bispecific drug candidates

Using the modular assembly system based on SpyCatcher/SpyTag and DogCatcher/DogTag chemistry, multivalent, bispecific drug candidates can be generated rapidly. In addition, this system enables the generation of multivalent, bispecific, payload-conjugated drug candidates at scale by the introduction of site-specific mutations into either the core protein (CutA1, Fc or other), or into the SpyCatcher/DogCatcher domains to covalently link the Catcher-core protein to fluorescent or cytotoxic small molecules (Fig. 1). The Catcher-core protein is first conjugated in bulk to a small molecule of interest (via an attachment site on the molecule of interest), then purified, and subsequently assembled with SpyTag/DogTag-ligand proteins to generate a library of bispecific, small-molecule-conjugated drug candidates. Using the example illustrated in Figure 1 of generating a bispecific ADC library of size n with a single drug-linker compound, a traditional approach would require n reactions to produce n bispecific candidates, followed by a further n reactions to conjugate a drug-linker compound to each of the bispecific candidates, which may each require further processes such as post-conjugation dialysis and cleanup.
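The reaction counts in the comparison above can be made concrete with a short Python sketch. It assumes, for illustration only, one drug-linker compound, one assembly reaction per library member, and a single bulk conjugation and purification of the Catcher-core; the function names are hypothetical.

```python
def traditional_route(n):
    """Reactions for a library of n candidates: assemble each, then conjugate each."""
    return {"assembly": n, "conjugation": n}


def catcher_core_first_route(n):
    """Conjugate and purify the Catcher-core once in bulk, then assemble n candidates."""
    return {"conjugation": 1, "purification": 1, "assembly": n}


if __name__ == "__main__":
    library_size = 96  # hypothetical plate-sized library
    print("traditional:", traditional_route(library_size))
    print("Catcher-core first:", catcher_core_first_route(library_size))
```

Under these assumptions the number of conjugation reactions stays at one regardless of library size, while the traditional route scales it linearly with n.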
In contrast to the traditional method of ADC library generation, in our method payload conjugation and payload-protein conjugate purification need to be done only once, rather than for every individual drug candidate in the library. This is advantageous as the purification of the protein-payload conjugate can be complex and can lead to loss of about 40 to 50% of the original sample. With our method, sample loss is restricted to the Catcher-core only. The conjugation reaction can be scaled up to the volume required for a particular ADC library screen. Therefore, Catcher-core proteins conjugated to different payloads can be stockpiled at -80 °C and then thawed as necessary to assemble with binders (such as antigen-binding domains) and generate an ADC library. This enables parallel, scalable screening of bispecific drug candidates with different toxic payloads.

Example 2: Rational engineering of CutA1 core protein for termini truncation and native cysteine removal

CutA1 has multiple advantageous features for use as a core protein in a bispecific molecule. Nonetheless, it may be beneficial to modify the protein to have better compatibility with the workflow described elsewhere herein. Such modifications include truncating the termini of HsCutA1 to remove regions which are unresolved in the crystal structure (PDB ID: 2ZFH) and to enforce a similar distance of the N- and C-terminal core residues relative to the surface of the core, and removing native unpaired cysteines from the HsCutA1 sequence as they may interfere with downstream processes.

Described elsewhere herein are HsCutA1 full length including the signal peptide (SEQ ID NO: 1), removal of 43 N-terminal residues to yield HsCutA1[44-179] (SEQ ID NO: 15, 16, 17), and removal of 59 N-terminal residues and 11 C-terminal residues to yield HsCutA1[60-171] (SEQ ID NO: 18). Removing 59 N-terminal residues and 8 C-terminal residues to yield HsCutA1[60-171] (SEQ ID NO: 19) can enable similar distance of residues that project away from the HsCutA1 core (Figure 2). Protein production between the two molecules shows no drastic difference in expression, which suggests the truncations do not heavily impact HsCutA1 (Figure 2). Trends in cytotoxicity in Colo 205 cells upon assembly with tagged L7 were largely comparable between the two molecules, suggesting imparted function is not affected (Figure 2).

HsCutA1 has two native unpaired cysteines at positions 75 and 96 (Figure 3). Alanine scanning mutagenesis is a common strategy to substitute unpaired cysteine residues. Further, a multiple sequence alignment was produced using the HsCutA1 sequence to identify preferred amino acid frequencies at positions C75 and C96 across 406 homologs. Thereafter, these residues were mutated either to alanine at both positions, yielding the [C75A, C96A]HsCutA1 mutant, or to valine at position 75 and serine at position 96, yielding [C75V, C96S]HsCutA1. Protein production between the molecules shows no drastic difference in expression, which suggests that cysteine removal does not heavily impact HsCutA1 (Figure 3). Cell killing effects in Colo 205 cells upon assembly with tagged L7 were largely comparable, suggesting imparted function is not affected (Figure 3).

Example 3: Mutagenesis of CutA1 as a core protein for multivalent bispecific drug conjugation

To enable site-specific conjugation of payloads to the CutA1 Catcher-core, 6 amino acid positions in CutA1 [44-179, C75A, C96A] (SEQ ID NO: 20) were mutated to Cys (see Methods: Cysteine mutant design) based on homology modelling or surface exposure. The thiol groups of Cys residues could then be reacted with the maleimide moiety of linker-payload compounds (Fig. 4). The positions of the mutations on the CutA1 structure are shown in Fig. 5A.
A comparison of conjugation efficiency between the different mutants as CutA1 Catcher-core constructs (SEQ ID NO: 37, 38, 39, 40, 41, 42) as well as a negative control, CC041 (SEQ ID NO: 36), and the wild-type (SEQ ID NO: 15) is shown in Fig. 6A (dye/protein ratio). All newly introduced Cys mutations have a higher conjugation efficiency compared to the wild-type and the negative control. Mass spectra of CC042 (SEQ ID NO: 38) conjugated to fluorescein-5-maleimide, deruxtecan, maleimidocaproyl-Val-Cit-DM1 and maleimidocaproyl-Val-Cit-PAB-MMAF (Fig. 7) are consistent with complete conjugation of a single linker-payload compound to each CutA1 monomer. This is equivalent to a ‘drug-to-antibody’ ratio of 3:1 as CutA1 is trimeric. Fig. 8 shows both conjugation with fluorescein-5-maleimide and assembly of CC042 (SEQ ID NO: 38) with SpyTagged and DogTagged ligands. In contrast, the negative control CC041 (SEQ ID NO: 36) displays a low level of non-specific labelling but assembles with ligands. Fig. 8 suggests that maleimide conjugation does not compromise the ability of the Catcher-core protein to assemble with tagged ligands.

Example 4: Implementation of Fc as a core protein for multivalent bispecific drug conjugation

The IgG1 constant fragment (Fc) incorporating the hinge region can also be used as a core protein. SpyCatcher and DogCatcher domains can be fused to either terminus of hinge-Fc to produce a SpyCatcher-hinge-Fc-(G4S)2-DogCatcher protein, or as a tandem fusion to produce a DogCatcher-SpyCatcher-hinge-Fc protein. The hinge region natively contains three surface-exposed Cys residues, allowing payload conjugation. In addition, the DAR can be controlled by mutagenesis of the Cys residues to Ser residues. For example, CC068 (SEQ ID NO: 47) is a SpyCatcher-hinge-Fc-(G4S)2-DogCatcher construct with a C230S (Fc PDB numbering) mutation.
As there is no flexible linker between the C-terminus of the SpyCatcher3 domain and the hinge region, the C230S mutation was selected to eliminate the most N-terminal disulfide bond between two hinge-Fc monomers and allow greater conformational freedom in the most N-terminal part of the hinge region. Thus, this construct contains 2 surface-exposed Cys residues per monomer and 4 per dimer, therefore having a maximum DAR (drug-to-antibody ratio) of 4. In contrast, in CC060 (SEQ ID NO: 48), the hinge region is deleted and longer linkers are used, resulting in a SpyCatcher-(G4S)3-Fc-(G4S)3-DogCatcher construct. Fig. 9 shows conjugation with fluorescein-5-maleimide and subsequent assembly with SpyTag- and DogTag-ligands for CC068, but only assembly for CC060 as the hinge region is deleted. Therefore, Fc-based catcher cores retaining native hinge Cys residues can also be used to produce multivalent, bispecific and payload-conjugated drug candidate libraries.

Example 5: Generation of dye-conjugated bispecifics for rapid imaging without antibody-based staining

Figure 8C depicts Fluorescein-conjugated CutA1-Core, which has been assembled with one or two binders to generate multivalent and multivalent bispecific Fluorescein-labelled assemblies. This system allows the generation of a library of multivalent bispecific Fluorescein-labelled drug conjugates at scale. These assemblies can then be used for imaging without further staining using fluorescent antibodies (Figure 10C). In addition, this system can be used to screen for high binding/internalising binder clones using high-throughput imaging applications. Since Fluorescein conjugation is uniform across the library, binding and internalisation profiles can be quantified across the library.

Example 6: Scalable generation of drug-conjugated bispecifics for rapid screening of drug candidates

Drug-conjugation to a Catcher-core protein allows for subsequent generation of a library of drug-conjugated bispecifics.
These drug conjugates can then be screened for efficacy and specificity using cell viability screening approaches such as CellTiter-Glo assays or cell count based high-throughput live imaging. Figures 11-14 depict 4 libraries of drug-conjugated bispecifics comprising CC042-DM1, CC068-Deruxtecan, CC042-MMAF and CC068-MMAF respectively, all conjugated orthogonally to binders L2, L4, L5, L6. Cell viability screening at 3 concentrations using MTT assays as a readout of cell viability of drug-conjugated assemblies and assemblies without drug conjugation rapidly informs on multivalent and/or bispecific binder pairs that benefit from drug conjugation to increase efficacy. Furthermore, as the DAR is controlled for drug conjugation to Catcher-Core as well as Fc, comparison of libraries of Catcher-Core drug conjugates with varying DARs/linkers can further inform on the most promising hits for drug conjugate generation.

Example 7: Integration of modular assembly with simple post-assembly clean-up enables the manufacture of uniform drug candidates for downstream analysis.

An automation compatible method was developed using GST-SpyCatcher003 (“GST-SpC3”, SEQ ID NO: 49) and GST-DogCatcher (“GST-DgC”, SEQ ID NO: 50) bound to commercially available paramagnetic beads to quench unconjugated tagged proteins from an assembly of SpT and DgT binders to a Catcher-core (Figure 17). This allows for cost-effective, rapid purification of assemblies that is easily scalable. It is suitable for low-volume plate-based formats that can be automated through the use of a liquid handling robot. The GST-Catchers are loaded onto the beads (Figure 15) and the unbound excess removed via washing with PBS (Figure 16). The GST-Catcher bound beads are added to each conjugation reaction to give a 2x excess of GST-Catcher to the corresponding unconjugated L2-DgT and L7-SpT.
This can then be removed via magnetic capture of the particles, leaving a supernatant that contains only the fully assembled molecule (Figure 18). This method is advantageous, relative to covalent-conjugation of catchers to beads, as it can be performed on-demand at a variety of scales. There are also no complex chemical processes required or reducing conditions. Furthermore, the design of the catcher constructs, the ratio of catcher proteins, or resin can easily be changed as demand requires.

Example 8: Large-scale modular assembly can be utilised to rapidly screen a broad panel of target combinations in 384-well plate format.

To further exemplify the scalability of the approach, the inventors have demonstrated implementation of a higher-throughput robotic approach during drug candidate generation, as well as miniaturisation of drugging to 384-well plate format. Figures 19-21 depict cytotoxicity after drugging a cell line with CC068 conjugated with MMAF in two linker variants (with and without PEG) in 3 concentrations, as well as a comparison to a drug candidate panel derived from CC068 only.

In the assay depicted in Figures 19-21, to generate functionally monospecific controls for SpT binders, a DgT binder against a non-mammalian target was added to more closely match the process for bispecific drug candidates, providing an alternative set of monospecific controls to Figures 11-14; in Figure 22, this approach was expanded with an additional corresponding SpT binder against a non-mammalian target for monospecific controls derived from DgT binders. Comparison of Figures 20 and 21 to Figure 19 shows that cytotoxicity is highly payload-dependent, as well as demonstrating target-dependent payload delivery. Differences between conjugates of MMAF and PEG-MMAF, despite featuring the same payload, further highlight the suitability of the approach to test various aspects of drug design.
In addition to Figures 19-21, the inventors have conducted similar screens derived from CC042, which further emphasise the impact of core geometry on drug conjugate behaviour. Taken together, the inventors have demonstrated the capability of this approach to rapidly derive drug candidates with different linker-payload conjugation from a given pre-conjugated protein, as well as its applicability to a broad range of binder families.

Finally, Figure 22 demonstrates the suitability of this approach to conduct screening campaigns at commercially competitive scales, with a rapid overall process. To demonstrate the power of their approach, the inventors have further implemented a robotic system for high-throughput application of the drug candidates to target cell lines. Herein, the inventors were able to generate a panel of ~800 drug candidates, of which ~400 were drug conjugates conjugated to PEG-MMAE (344 bispecific drug conjugates, plus 8 monospecific tetravalent assemblies [wherein SpT and DgT binders are identical], plus 52 with second binder against a non-mammalian target as functionally monospecific drug candidates), and to proceed to drugging after <1 week. In total, a panel of 9 SpT binders and 45 DgT binders were utilised (of which 1 SpT binder and 1 DgT binder featured a target not relevant to mammalian cells as “monospecific” control), covering a broad range of different targets (5 for SpT; 28 for DgT) to facilitate the identification of novel target combinations. As with Figures 19-21, Figure 22 exemplifies the impact of payload conjugation and target selection, as well as differences between bispecific and monospecific assemblies, as would be expected from a BsADC screening panel.
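The composition of the PEG-MMAE panel can be checked by enumerating SpT x DgT pairs. The sketch below rests on two assumptions that are not stated explicitly above: that each of the 8 relevant SpT binders has an identical DgT counterpart (giving the 8 monospecific tetravalent assemblies), and that the control x control pair is not counted. Binder names are hypothetical placeholders.

```python
from collections import Counter
from itertools import product

CONTROL = "ctrl"  # hypothetical label for the binder against a non-mammalian target


def enumerate_panel(n_spt: int = 9, n_dgt: int = 45) -> Counter:
    """Classify every SpT x DgT pairing in a panel with one control per tag.

    Binders that share a name are treated as identical (assumption).
    """
    spt = [f"B{i}" for i in range(1, n_spt)] + [CONTROL]  # 8 relevant + 1 control
    dgt = [f"B{i}" for i in range(1, n_dgt)] + [CONTROL]  # 44 relevant + 1 control
    counts = Counter()
    for s, d in product(spt, dgt):
        if s == CONTROL and d == CONTROL:
            continue  # assumed not to be part of the panel
        if CONTROL in (s, d):
            counts["functionally_monospecific"] += 1
        elif s == d:
            counts["monospecific_tetravalent"] += 1
        else:
            counts["bispecific"] += 1
    return counts


panel = enumerate_panel()
print(dict(panel), "total:", sum(panel.values()))
```

Under these assumptions the enumeration gives 344 bispecific, 8 monospecific tetravalent and 52 functionally monospecific assemblies (404 in total), in line with the ~400 drug conjugates described.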
In Figure 23, these differences between bispecific and relevant monospecific assemblies are further analysed, as are changes between drug conjugates and non-conjugated assemblies together with changes in overall cytotoxicity, clustering bispecific drug conjugate assemblies to inform candidate selection.

Example 9: Catcher-core designs with varying geometry, drug-to-antibody ratio (DAR), and other modifications can be derived from the same core molecules.

In addition to catcher domains separated by a single core molecule, such as SpyCatcher003-Fc-DogCatcher in CC068 with SEQ ID NO: 47, the inventors have also derived alternative constructs from Fc, including SpyCatcher003-DogCatcher-Fc in CC076 with SEQ ID NO: 52 (as illustrated in Figure 1; Figure 24). Such platforms with alternate geometries provide a useful approach to evaluate the impact of binder arrangement or valency on drug candidate behaviour (as similarly demonstrated for comparison across different core proteins in Figures 11-14).

Figure 24 shows that CC076 is suitable for assembly with SpT and DgT proteins, providing a useful new protein platform for this and related screening approaches (e.g. bispecific screening without conjugation, as in WO-A-2022/200804). CC076 is also suitable for pre-conjugation to small molecules (such as drug conjugation, demonstrated herein via fluorophore conjugation), followed by SpT/DgT-based protein assembly, making it a suitable Catcher-core (CC) protein in the context of this invention. In Figure 25, a drug screen based on a set of bispecific drug candidates assembled on CC068 and CC076 with no payload is provided to exemplify the difference in cell cytotoxicity upon a change in geometry.

Similarly, knob-into-hole or related Fc-heterodimerisation technologies can be utilised to create SpC-Fc/DgC-Fc complexes for subsequent assembly.
Herein, payload conjugation before Fc-heterodimerisation can be utilised to create pre-conjugated Catcher-core constructs at reduced DAR (e.g. payload conjugation of a single SpC-Fc or DgC-Fc prior to hybridisation, resulting in Fc hybrids with conjugation at one half of the construct) or two different linker-payload species (e.g. conjugation of different linker-payload species to SpC-Fc and DgC-Fc prior to hybridisation). Similarly, payload conjugation after Fc-heterodimerisation is also suitable to provide pre-conjugated complexes ready for assembly. The inventors note that specific protein-protein assembly approaches may be preferred for knob-into-hole Fc heterodimers, such as SpC-Fc/SpC-TEV-Fc or related technologies for sequential assembly of SpT binders, e.g. in cases where it is desired to retain the same positioning of protein tags relative to binders and Fc.

Next to deriving alternative drug candidate geometries, a core protein can also be modified to include useful mutations or modifications, such as “silenced” Fc mutations known in the art (e.g. “LALA”, “LAGA”, “LALAPG”) or to introduce or remove cysteine residues or other residues for linker-payload attachment, allowing the generation of Catcher-core molecules with varying DAR. For instance, assessment of assemblies based on silenced Fc instead of or in comparison to assemblies based on wild-type Fc may be desired for in vitro assays to eliminate potential influences of Fcγ receptors on drug conjugate internalisation. The inventors demonstrate variations of CC068 with additional cysteines to provide Catcher-core variants with higher numbers of small molecule conjugation sites, e.g. 8 Cys residues per dimer of Fc-based Catcher-core, facilitating a drug-to-antibody ratio of up to 8. These molecules are CC080 (SEQ ID NO: 53) and CC086 (SEQ ID NO: 54).
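The maximum DAR values quoted for the different Catcher-core formats follow from the number of conjugatable Cys residues per monomer multiplied by the number of monomers in the core. A minimal sketch, in which the function name and the example labels are illustrative assumptions, and 4 Cys per monomer is inferred from the stated 8 Cys per dimer:

```python
def max_dar(cys_per_monomer: int, monomers_per_core: int) -> int:
    """Upper bound on drug-to-antibody ratio, assuming every conjugatable
    Cys residue carries exactly one linker-payload molecule."""
    if cys_per_monomer < 0 or monomers_per_core < 1:
        raise ValueError("invalid site or monomer count")
    return cys_per_monomer * monomers_per_core


examples = {
    "CutA1 trimer, 1 Cys per monomer": max_dar(1, 3),
    "Fc dimer with C230S, 2 hinge Cys per monomer": max_dar(2, 2),
    "Fc dimer with 8 Cys per dimer (4 per monomer)": max_dar(4, 2),
}
for label, dar in examples.items():
    print(f"{label}: maximum DAR {dar}")
```

The achieved DAR may be lower than this bound if conjugation is incomplete, which is why it is verified experimentally (for example by mass spectrometry or dye/protein ratio).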
In general, the “conjugation-first” approach described herein can leverage a variety of technologies, either sequentially or in addition to any single step, to generate a broad diversity of bispecificity, multispecificity, geometries, small molecule conjugation, combinations of small molecule conjugation, or other features.

Example 10: Informing drug design for bispecific antibody-drug conjugates.

The screening approach described herein is useful to identify novel targets, target combinations, and/or binder combinations for payload delivery. In Figure 22, a broad range of target combinations was tested. The results of these tests demonstrate that multiple of these target combinations are of interest in the art of bispecific ADCs.

Furthermore, Figure 22 demonstrates that multiple identified target combinations can translate across different binders against the same target (such as for target 2) as well as for different geometries (i.e. SpT Binder A + DgT Binder B compared to SpT Binder B + DgT Binder A in the context of assemblies derived from SpC-Fc-DgC; see left quadrants in Figure 22), providing evidence of internal consistency to the discovery approach. Herein, iterative screening triages and variations in drug candidate design give indications to inform the design of final drug candidates; at any stage, additional optimisation of a drug candidate can be undertaken to improve performance given an interesting binder or target combination. From this initial discovery phase, various approaches are known in the art to generate bsADCs in formats suitable to production of pharmaceutical drugs, such as DuoBody, CrossMab, FIT-Ig, DVD-Ig, Morrison format, scFv-IgG or related technologies.
In Barron et al, 2024 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10889805/) and WO-A-2022/200804, comparisons of screening stage bispecific candidates or bispecific ADC to drug-like formats are described (such as changes in DAR in translation to a final molecule, or removal of adapter binding domains). Similarly, the usefulness of “conjugate associated” approaches for the discovery of bsADC is well known in the art, as is provided for commercial use via anti-Fc Fab conjugates (which can be combined with bispecific antibodies to generate complexes mimicking bispecific ADC; e.g. “anti-HIgG(Fc)Fab-C-MMAE ADC (CAT#: ADC-1011-SHFz)” sold by Creative Biolabs [as of June 13th, 2024 https://www.creativebiolabs.net/hFc-Fab-C-MMAE-22302.htm]). Without wishing to be bound by theory, the inventors consider that the approach provided herein can be utilised to identify previously unknown target combinations at a previously difficult or impossible to achieve scale and variation; once a given combination is selected for further development, methods known in the art are suitable to derive a drug molecule replicating or exceeding the newly found effects.

In Figure 26, a bispecific ADC in drug-like format corresponding to a target combination featuring increased cytotoxicity in a screen that is part of Figure 22 is shown as an illustrative example. Figure 26 demonstrates that a bispecific ADC in drug-like format featuring preferred binding properties and high cytotoxicity as an MMAE drug conjugate can produce reduced cell viability as also demonstrated for a bispecific drug conjugate assembly against the same target combination in Figure 22 (herein ~40% at 2.5 nM dimer concentration for the same cell line described in Figure 26).

METHODS

Multiple sequence alignment

Multiple sequence alignment was used to select Cys mutations.
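The final step of the analysis described in this section, computing the frequency of each amino acid at each position of the query sequence with gaps excluded, can be sketched as follows. This is an illustrative sketch only: the analysis described here used the Biopython, NumPy and Pandas libraries, whereas this example uses only the Python standard library, and the toy alignment and function name are hypothetical.

```python
from collections import Counter


def column_frequencies(alignment, query_index=0, gap="-"):
    """Return one {amino acid: frequency} dict per query position.

    `alignment` is a list of equal-length aligned sequences. Columns where
    the query has a gap are skipped, and gap characters in other sequences
    are excluded from the frequency calculation.
    """
    query = alignment[query_index]
    if any(len(seq) != len(query) for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    frequencies = []
    for col, query_residue in enumerate(query):
        if query_residue == gap:
            continue  # column does not correspond to a query position
        counts = Counter(seq[col] for seq in alignment if seq[col] != gap)
        total = sum(counts.values())
        frequencies.append({aa: n / total for aa, n in counts.items()})
    return frequencies


# Hypothetical toy alignment; the first sequence plays the role of the query.
toy_alignment = ["MC-A", "MCGA", "MV-A", "MS-S"]
for position, freqs in enumerate(column_frequencies(toy_alignment), start=1):
    print(position, freqs)
```

In the toy example the second query position (Cys) shows C at 0.5, V at 0.25 and S at 0.25, which is the kind of readout used to pick preferred substitutions at a given cysteine.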
SEQ ID NO: 1 was used as the query for a HMMER (Eddy, 1998, Bioinformatics) search of the UniProtKB database (v.2021_04) (The UniProt Consortium, 2020, Nucleic Acids Research). An e-value cutoff of 0.0001 was used. Sequences with an alignment length shorter than 60% of SEQ ID NO: 1 were discarded. CD-HIT (Li and Godzik, 2006, Bioinformatics) was then used with a sequence identity threshold of 0.95 to remove redundant sequences. A multiple sequence alignment was then performed using MAFFT (Katoh et al, 2002, Nucleic Acids Research). The multiple sequence alignment from MAFFT was then analysed using the Biopython, NumPy and Pandas libraries in Python 3 to produce the frequency of each amino acid at each position in SEQ ID NO: 1. Gaps were excluded from this analysis.

Cysteine mutant design

Cysteine mutants K82C, E114C, F136C

The human CutA1 crystal structure (PDB ID 2ZFH) was aligned against homologous structures using FATCAT (Ye, Y. & Godzik, A. FATCAT: a web server for flexible structure comparison and structure similarity searching. Nucleic Acids Res. 32, W582-585 (2004)). Positions with corresponding surface facing Cys residues in homologues were selected, leading to the following mutations: K82C, E114C, F136C (SEQ ID NO: 31, 33, 34).

Cysteine mutants E78C, Q102C, Q166C

The human CutA1 crystal structure (PDB ID 2ZFH) was minimised using the EvoEF2 (Huang, X., Pearce, R. & Zhang, Y. EvoEF2: accurate and fast energy function for computational protein design. Bioinforma. Oxf. Engl. 36, 1135–1142 (2020)) force field with the RepairStructure command. Then, Cys 75 and Cys 96 were mutated to Ala using the BuildMutant command. Subsequently, using this model with Cys to Ala mutations, structural models with a single Cys mutation at positions V64, E78, K79, E83, K91, Q102, K110, S139, F158 and Q166 were generated. The solvent accessibility of the in silico Cys mutant models was evaluated using PDBePISA (Krissinel, E. & Henrick, K. Inference of Macromolecular Assemblies from Crystalline State. J. Mol. Biol. 372, 774–797 (2007)) by averaging the solvent-accessible surface area value for the mutated residue over the three chains of CutA1. Three mutations were selected across a range of solvent accessibility: E78C, Q102C, Q166C (SEQ ID NO: 30, 32, 35).

Mutation    Average solvent-accessible surface area (Å²)
E78C        59.96
Q102C       121.83
Q166C       80.01

All of the above Cys mutations were introduced individually into the SpC3-G4S-HsCutA1[44-179, C75A, C96A]-G4S-DgC construct (SEQ ID NO: 36).

Molecular cloning

Plasmids encoding recombinant proteins were provided by Twist Biosciences or ProteoGenix. DNA fragments and oligonucleotides were synthesised by Integrated DNA Technologies (IDT). Constructs for SpC3-hinge-Fc-DgC, SpC3-Fc-DgC and SpC3-HsCutA1-DgC were assembled through standard cloning procedures. To introduce synthesised DNA fragments into plasmid backbones, to introduce point mutations, or to make other adjustments to recombinant sequences, DNA was amplified using standard polymerase chain reaction (PCR) followed by standard cloning methods, including restriction cloning. Assembled constructs were transformed into E. coli NEB 5-alpha cells. Putative positive clones were grown overnight, and DNA was isolated from bacterial pellets via miniprep. Samples were sent for Sanger sequencing to Source Bioscience for sequence validation and alignment was performed using Benchling’s molecular biology suite (www.benchling.com).

Protein expression and purification

Expression

To obtain proteins CC7, CC041, CC038, CC042, CC044, CC056, CC057, CC058 and L2, L4, L5, L6, L7, L8, DNA encoding these genes, in the pET28 expression vector, was transformed into BL21 (DE3) or SHuffle T7 express cells. Colonies were used to inoculate LB cultures with 50 µg/mL Kanamycin at 37 °C with 160-220 rpm shaking. Overnight cultures were diluted 1:100 into LB or 2×YT media supplemented with 50 µg/mL Kanamycin.
Cultures were grown at 37 °C with 160-220 rpm shaking before induction of protein expression with 0.2-0.4 µM IPTG at OD 0.6-0.8 (LB) or 1.6-2.0 (2×YT). The cultures were incubated for a further 4 hours at 37 °C, or 16 hours at 18 °C, with 160-200 rpm shaking.

Alternatively, overnight cultures were diluted 1:100 into SB autoinduction media supplemented with 50 µg/mL Kanamycin. Cultures were grown at 37 °C with 160-220 rpm shaking for 4 hours. The cultures were then incubated for a further 16 hours at 30 °C with 160-200 rpm shaking.

Cells were collected by centrifugation at 5000 × g for 15 minutes at 4 °C and pellets stored at -80 °C prior to purification.

To obtain proteins CC060 and CC068, DNA encoding these genes, in the pTWIST expression vector, was transfected into Expi293F cells (ThermoFisher) in Expi293 expression media using the PEI MAX reagent (Polysciences) and Opti-MEM reduced serum medium. The cells were incubated at 37 °C and 8% CO2 for 16 hours with shaking at 120 rpm. Afterwards, the cells were supplemented with sterilised (using a 0.22 µm syringe filter) Phytone/sodium butyrate expression supplement to a final concentration of 0.5% Phytone (Appleton Woods) and 2 mM sodium butyrate (ThermoFisher) and the cells were further incubated for 4-6 days at 37 °C and 8% CO2 with shaking at 120 rpm. To obtain protein CC076, DNA encoding this gene in the pTWIST expression vector was transfected into Expi293F cells as above. Proteins CC080, CC086, binders in Figures 19-22, and proteins bAb001 and bAb003 were either expressed as described above and purified as described below or produced by other means well known in the art following expression from ExpiCHO, Expi293 or CHO-K1 cells, such as by purification via Protein A or Ni-NTA.

Protein A

Mammalian Expi293F supernatant for CC060, CC076 and CC068 was harvested by centrifuging the cell cultures at 300 × g for 30 min. The supernatant was supplemented with 1 mM PMSF and 1x cOmplete EDTA-free protease inhibitor cocktail and filter sterilised. The proteins from the supernatant were purified using an AKTA Pure FPLC with a HiTrap MAbSelect PrismA column (Cytiva). Samples were purified at 4 °C using 20 mM sodium phosphate, 150 mM NaCl, pH 7.2 start buffer, and 0.1 M Glycine, pH 3.0 elution buffer. 1 M Tris-HCl, pH 8.0 neutralisation buffer was added to the wells of the 96-well collection deep well plate. The elution peak fractions were analysed by SDS-PAGE and the appropriate fractions were pooled. At larger volumes, these proteins were further purified using the HiScreen Fibro PrismA column (Cytiva).

Ni-NTA

For proteins CC038, CC042, CC044, CC056, CC057, and CC058, bacterial cell pellets were resuspended in Ni-NTA equilibration buffer (50 mM Tris, pH 7.8; 300 mM NaCl, 10 mM imidazole) supplemented with 1 mM PMSF, 1x cOmplete EDTA-free protease inhibitor cocktail, benzonase (5 U/mL), 1 mg/ml lysozyme and 5 mM 2-mercaptoethanol. Samples were sonicated with an Ultrasonic Processor using a 20 mm probe and an amplitude of 20%, for 9-12 minutes pulsing on 2 seconds/off 4 seconds.
Alternatively, bacterial cell pellets were resuspended in B-PER chemical lysis buffer (ThermoFisher) supplemented with 1 mM PMSF, 1x cOmplete EDTA-free protease inhibitor cocktail, benzonase (5 U/mL), 1 mg/ml lysozyme and 5 mM 2-mercaptoethanol.

The samples were then centrifuged at 16,000 × g for 30 min. Supernatants were retained for Ni-NTA chromatography.

The supernatant was loaded onto pre-equilibrated HisPur Ni-NTA gravity-flow columns. The resin was washed with 20 resin-bed volumes of Ni-NTA Wash Buffer 1 (50 mM Tris, pH 7.8; 300 mM NaCl, 10 mM imidazole) supplemented with 5 mM 2-mercaptoethanol and subsequently with 10 resin-bed volumes of Ni-NTA Wash Buffer 2 (50 mM Tris, pH 7.8; 300 mM NaCl, 30 mM imidazole) supplemented with 5 mM 2-mercaptoethanol. His-tagged proteins were eluted from the resin with five resin-bed volumes of Ni-NTA Elution Buffer (50 mM Tris, pH 7.8; 300 mM NaCl, 200 mM imidazole) supplemented with 5 mM 2-mercaptoethanol until the absorbance of the elution fractions at 280 nm approached baseline.

For proteins CC7, CC041, L2, L4, L5, L6, L7 and L8, the same protocols were used but without supplementing buffers with 5 mM 2-mercaptoethanol.

SEC

For CC038, CC042, CC044, CC056, CC057, CC058, proteins that were above 90% purity following Ni-NTA chromatography were dialysed against PBS supplemented with 5 mM 2-mercaptoethanol using SnakeSkin™ Dialysis Tubing at 3K MWCO. The proteins were then concentrated in a Pierce™ Protein Concentrator PES. The proteins were then further purified using an AKTA Pure FPLC with a Superdex 16/600 S200 column. Samples were purified at 4 °C using PBS buffer supplemented with 2 mM TCEP.

For CC7, CC041, L2, L4, L5, L6, L7 and L8 the same protocol was used but without supplementing buffers with 2-mercaptoethanol and TCEP.

For CC060, CC076 and CC068, optional dialysis against PBS was performed using SnakeSkin™ Dialysis Tubing at 3K MWCO after protein A purification, followed by concentration and purification as described above without supplementing buffers with 2-mercaptoethanol and TCEP.

The elution peak fractions were analysed by SDS-PAGE and the appropriate fractions were pooled. Estimated protein concentrations were calculated from measurement of absorbance at 280 nm using an Implen NanoPhotometer N60 with reduced extinction coefficients predicted by ProtParam. Protein samples were adjusted to 10-300 μM concentration by monomer and stored at -80 °C.

Cysteine-maleimide conjugation

Catcher-core protein reduction

Catcher-core protein variants containing single Cys residues (by monomer) were prepared at a concentration of 50 μM (by monomer) in PBS buffer pH 7.4 (Formedium) with 50 mM EDTA (Formedium). To reduce cysteine residues, 100 molar equivalents (relative to protein thiols) of 40 mM TCEP (Melford) in PBS buffer at pH 7.0 were added to the Catcher-core protein and the mixture was incubated for either a) 16 hours at 37 °C, or b) 2 hours at 25 °C on a microplate shaker (Fisher Scientific) with shaking at 600 rpm.

Alternatively, Catcher-core protein variants with more than one Cys residue per monomer were prepared at a monomer concentration of 50 μM in PBS buffer pH 7.4 (Formedium) with 10 or 50 mM EDTA (Formedium). To reduce cysteine residues, 4 to 25 molar equivalents (relative to protein thiols) of TCEP (Melford) in PBS buffer at pH 7.0 were added to the Catcher-core protein and the mixture was incubated for 2 hours at 25 °C on a microplate shaker (Fisher Scientific) with shaking at 600 rpm.

Bispecific antibody reduction

Bispecific antibodies were prepared at a dimer concentration of 50 μM in PBS buffer pH 7.4 (Formedium) with 10 mM EDTA (Formedium).
To reduce cysteine residues, 4 or 32 molar equivalents (relative to protein thiols) of TCEP (Melford) in PBS buffer at pH 7.0 were added to the Catcher-core protein and the mixture was incubated for 2 hours at 25 °C on a microplate shaker (Fisher Scientific) with shaking at 600 rpm.

Maleimide conjugation

10 to 15 molar equivalents (relative to protein thiols) of fluorescein-5-maleimide, deruxtecan, MC-Val-Cit-PAB-DM1 or MC-Val-Cit-PAB-MMAF (MedChemExpress, dissolved in DMSO) were added and the mixture was incubated for 1 hour at 25 °C. Finally, 10 molar equivalents (relative to maleimide) of L-cysteine (Melford) (dissolved in PBS pH 7.4) were added and the mixture was incubated for 15 minutes to terminate the reaction.

Alternatively, 1, 2, 4 or 8 molar equivalents (relative to protein thiols) of Alexa488 C5 maleimide (ThermoFisher Scientific, dissolved in DMSO), MC-Val-Cit-PAB-MMAF, Mal-PEG8-Val-Cit-PAB-MMAF, Mal-PEG8-Val-Cit-PAB-MMAE, or MC-PEG8-Val-Ala-PAB-Exatecan (MedChemExpress, dissolved in DMSO) were added to the reduced protein, and the mixture was incubated for 2 hours at 25 °C. Finally, 10 molar equivalents (relative to maleimide) of L-cysteine (Melford) (dissolved in PBS pH 7.4) were added and the mixture was incubated for 15 minutes to terminate the reaction.

Dye-protein conjugate purification

The fluorescein-5-maleimide reaction mixture was centrifuged for 10 minutes at 17,000 × g to remove precipitate. Subsequently the reaction mixture was loaded on a PD-10 gravity flow desalting column (Cytiva), followed by up to 2.5 mL of PBS buffer. The conjugated protein was eluted with an additional 3.5 mL of PBS buffer. 250 μL elution fractions with a high protein concentration, as determined by UV/Vis spectrophotometry using an IMPLEN Nanophotometer at wavelengths of 280 nm and 495 nm, were selected and concentrated using a 0.5 mL Pierce™ protein concentrator (ThermoFisher) with either a 3 or 10 kDa MWCO filter to a final volume of approximately 100-400 μL. Protein concentration was adjusted for the contribution of the dye to absorbance at 280 nm using the following formula:
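The expression itself is not reproduced in this text. As a hedged sketch only, corrections of this kind commonly take the following form, where CF is an assumed, dye-specific correction factor (the fraction of the dye's peak absorbance that it contributes at 280 nm); the symbol CF and its value are not given in the original, and a 1 cm path length is assumed:

\[
C_{\mathrm{prot}} = \frac{A_{280} - \mathrm{CF} \cdot A_{495}}{\varepsilon_{\mathrm{prot}}}
\]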
A280 denotes the absorbance of the dye (payload)-protein conjugate at 280 nm, A495 denotes the absorbance at 495 nm and εprot denotes the extinction coefficient of the reduced protein at 280 nm as calculated using Expasy ProtParam.

Alternatively, the Alexa488 C5 maleimide reaction mixture was centrifuged for 10 minutes at 17,000 × g to remove precipitate. Subsequently, the reaction mixture was loaded on a 40K MWCO Zeba™ spin desalting column (ThermoFisher Scientific). The spin desalting column was subsequently centrifuged at 700 × g according to the manufacturer’s instructions and the purified sample was retrieved. The absorbance of the conjugated protein was determined by UV/Vis spectrophotometry using an IMPLEN Nanophotometer at wavelengths of 280 nm and 494 nm. Protein concentration was adjusted for the contribution of the dye to absorbance at 280 nm using the following formula:
A280 denotes the absorbance of the dye (payload)-protein conjugate at 280 nm, A494 denotes the absorbance at 494 nm and εprot denotes the extinction coefficient of the reduced protein at 280 nm as calculated using Expasy ProtParam. Protein concentration can be estimated as either monomer or multimer, depending on the extinction coefficient used for the protein.

Dye/protein ratio estimation

Final dye/protein ratio for the fluorescein-5-maleimide conjugate after concentration was estimated using the following formula:

$$\text{Dye/protein ratio} = \frac{A_{495}}{\varepsilon_{dye} \times C_{prot}}$$

A495 denotes the absorbance of the dye (payload)-protein conjugate at 495 nm, εdye denotes the extinction coefficient of the dye at 495 nm (68,000 M⁻¹cm⁻¹), Cprot denotes the adjusted protein concentration according to the above equation.

Alternatively, the final dye/protein ratio for the Alexa488 C5 maleimide conjugate was estimated using the following formula:

$$\text{Dye/protein ratio} = \frac{A_{494}}{\varepsilon_{dye} \times C_{prot}}$$

A494 denotes the absorbance of the dye (payload)-protein conjugate at 494 nm, εdye denotes the extinction coefficient of the dye at 494 nm (71,000 M⁻¹cm⁻¹), Cprot denotes the adjusted protein concentration according to the above equation. Dye/protein ratio can be estimated relative either to the monomer or multimer concentration for the protein.

Drug-protein conjugate purification

The reaction mixture was centrifuged for 10 minutes at 17,000 g to remove precipitates. Subsequently the reaction mixture was loaded on a PD-10 gravity flow desalting column (Cytiva), followed by up to 2.5 mL of PBS buffer. The conjugated protein was eluted with an additional 3.5 mL PBS buffer. Elution fractions with a high protein concentration, as determined by UV/Vis spectrophotometry using an IMPLEN Nanophotometer at wavelengths of 280 nm and a wavelength with high absorbance from the drug, λdrug, were selected and concentrated using a 0.5 mL Pierce
™ protein concentrator (ThermoFisher) with a 10 kDa MWCO filter to a final volume of approximately 100-400 μL.

| Linker-drug compound | λdrug | $\varepsilon_{drug}^{\lambda_{drug}}$ (extinction coefficient of the drug at λdrug) | $\varepsilon_{drug}^{280}$ (extinction coefficient of the drug at 280 nm) |
|---|---|---|---|
| deruxtecan | 370 nm | 20,982 M⁻¹cm⁻¹ | 10,764 M⁻¹cm⁻¹ |
| MC-Val-Cit-PAB-DM1 | 252 nm | 26,350 M⁻¹cm⁻¹ | 5,456 M⁻¹cm⁻¹ |
| MC-Val-Cit-PAB-MMAF* | 214 nm | - | - |

For deruxtecan, extinction coefficients at 280 nm (10,764 M⁻¹cm⁻¹) and 370 nm (20,982 M⁻¹cm⁻¹) for a close deruxtecan analogue (compound 21a⁶) were used. For DM1, extinction coefficients at 280 nm and 252 nm were used as determined in (Fishkin, N. Maytansinoid–BODIPY Conjugates: Application to Microscale Determination of Drug Extinction Coefficients and for Quantification of Maytansinoid Analytes. Mol. Pharmaceutics 12, 1745–1751 (2015)).

*Absorbance at 214 nm for MMAF was only used to determine the fractions at which free MC-Val-Cit-PAB-MMAF begins to elute, as MMAF absorbance overlaps with protein peptide bond absorbance.

Protein concentration was adjusted for the contribution of the drug to absorbance at 280 nm using the following formula from (Chen, Y. Drug-to-Antibody Ratio (DAR) by UV/Vis Spectroscopy. in Antibody-Drug Conjugates (ed. Ducry, L.) 267–273 (Humana Press, 2013)):
$$C_{prot} = \frac{A_{280}\,\varepsilon_{drug}^{\lambda_{drug}} - A_{\lambda_{drug}}\,\varepsilon_{drug}^{280}}{\left(\varepsilon_{prot}^{280}\,\varepsilon_{drug}^{\lambda_{drug}} - \varepsilon_{prot}^{\lambda_{drug}}\,\varepsilon_{drug}^{280}\right) l}$$

A280 denotes the absorbance of the linker-drug-protein conjugate at 280 nm, Aλdrug denotes the absorbance of the linker-drug-protein conjugate at a wavelength specific to the drug, $\varepsilon_{drug}^{\lambda_{drug}}$ denotes the extinction coefficient of the drug at λdrug, $\varepsilon_{drug}^{280}$ denotes the extinction coefficient of the drug at 280 nm, $\varepsilon_{prot}^{280}$ denotes the extinction coefficient of the reduced protein at 280 nm as calculated using Expasy ProtParam, $\varepsilon_{prot}^{\lambda_{drug}}$ denotes the extinction coefficient of the protein at λdrug as estimated by Nanophotometer measurements at 280 nm and λdrug of the unconjugated protein, and l denotes the optical path length.

Subsequently, the concentrated sample was further purified using a Slide-A-Lyzer® mini dialysis device (Thermo Scientific, 0.5 mL sample volume, 20 kDa MWCO filter) for eight rounds of dialysis into 14 mL of PBS buffer, each of at least two hours. Final linker-drug-protein conjugate concentrations were determined using the Coomassie Plus Bradford assay (ThermoFisher).

Alternatively, the reaction mixture was centrifuged for 10 minutes at 17,000 g to remove precipitate. Subsequently the reaction mixture was loaded on a 40K MWCO Zeba
™ spin desalting column (ThermoFisher Scientific). The spin desalting column was subsequently centrifuged at 700 g according to the manufacturer’s instructions and the purified sample was retrieved. Then, the purified sample was transferred to a 3 mL Slide-A-Lyzer™ 20K MWCO dialysis cassette or 0.5 mL Slide-A-Lyzer™ 20K MWCO dialysis devices in 15 mL Falcon tubes (ThermoFisher Scientific) and dialysed for at least three rounds of dialysis following the manufacturer’s instructions. Alternatively, after the initial spin desalting step, the purified sample was loaded onto another 40K MWCO Zeba™ spin desalting column, centrifuged at 700 g according to the manufacturer’s instructions and the purified sample was retrieved.

Protein concentration was initially estimated using the above formula. For conjugates with MC-Val-Cit-PAB-MMAF, Mal-PEG8-Val-Cit-PAB-MMAF and Mal-PEG8-Val-Cit-PAB-MMAE, λdrug of 248 nm, $\varepsilon_{drug}^{\lambda_{drug}}$ of 15,900 M⁻¹cm⁻¹ and $\varepsilon_{drug}^{280}$ of 1,500 M⁻¹cm⁻¹ were used. For conjugates with MC-PEG8-Val-Ala-PAB-Exatecan, the deruxtecan values described above were used. Final linker-drug-protein conjugate concentrations were determined using the Coomassie Plus Bradford assay (ThermoFisher). Protein concentration can be estimated as either monomer or multimer, depending on the extinction coefficient used for the protein.

Payload/protein ratio estimation

The final drug concentration was estimated using the following formula:
$$C_{drug} = \frac{A_{\lambda_{drug}}\,\varepsilon_{prot}^{280} - A_{280}\,\varepsilon_{prot}^{\lambda_{drug}}}{\left(\varepsilon_{prot}^{280}\,\varepsilon_{drug}^{\lambda_{drug}} - \varepsilon_{prot}^{\lambda_{drug}}\,\varepsilon_{drug}^{280}\right) l}$$

The final drug/protein molar ratio was then estimated using:

$$\text{Drug/protein ratio} = \frac{C_{drug}}{C_{prot}}$$

Drug/protein ratio can be estimated relative either to the monomer or multimer concentration for the protein.

LC-MS

The molecular weight of the dye-protein and drug-protein conjugates of SpyCatcher3-G4S-HsCutA1[44-179, C75A, C96A, K82C]-G4S-DogCatcher, or CC042 (SEQ ID NO: 38) was analysed using LC-MS. An Agilent 1290 UPLC and a 6550 ESI Q-ToF mass spectrometer were used. LC buffer A was 0.1% formic acid in water, LC buffer B was 0.1% formic acid in acetonitrile. An Agilent PLRP-S 1000 Å (5 μm, 50 × 2.1 mm) column was used at 60 °C, with a flow rate of 0.4 mL/min and 0.2 μL sample injection volume. Solvent composition was held at 20% B for 5 minutes, increased to 75% B in 1 minute then to 100% B in 4 minutes and held for 2 minutes. Positive ion spectra were acquired over the m/z range 300-3200 at the rate of 1 spectrum per second.

SEC-HPLC

The size and aggregate content of some Catcher-core and linker-payload-conjugated Catcher-core samples were analysed using SEC-HPLC. An Agilent 1100 HPLC was used with a MAb-Pac
™ SEC-1 column (ThermoFisher Scientific). LC buffer was 1x PBS pH 7.4, flow rate was 0.2 mL/min and the injection volume was 10-15 μL. Data was collected over 25 minutes.

Protein quantification by BCA assay

Prior to conjugation, samples were quantified using a BCA Protein Assay kit (ThermoFisher) according to the manufacturer’s instructions. BSA standards were diluted in 1× PBS. Purified protein was diluted 1/5, 1/10 and 1/20 in 1× PBS to ensure concentration was within the linear range of the assay. Following incubation with the BCA reagent, absorbance at 562 nm was measured on a BMG FLUOstar Omega plate reader. Concentration in mg/mL was interpolated based on the standard curve.

Catcher-based protein assembly

SpyTag/DogTag-ligand assembly test (CC041, CC042, CC060, CC068)

Reaction mixtures of fluorescein-5-maleimide-conjugated CutA1-based cores fused with SpyCatcher003 and DogCatcher, together with SpT3 and DgT binder proteins, were prepared at 10 μM Catcher-core with a 2:1 molar ratio of each binder over the Catcher-core monomer. The reactions were prepared in PBS buffer and incubated on a plate shaker at 600 rpm at 25 °C for 1 hour. Subsequently, 6x SDS loading buffer was added and the samples were heated at 95 °C for 5 minutes to terminate the reaction and prepare the samples for visualisation by SDS-PAGE.

SpyTag/DogTag ligand assembly for cell assays

Another variation of the conjugation process is as follows. Conjugation samples of SpT3 and DgT binder proteins to CutA1-based cores fused with SpyCatcher003 and DogCatcher were prepared at 20 μM with a 1:1.2:1.2 platform:ligand:ligand molar ratio (a 1.2-fold excess of each binder), based on monomer concentration. The reactions were prepared in PBS buffer supplemented with 1x cOmplete EDTA-free protease inhibitor and 50 μg/mL Penicillin-Streptomycin. Samples were incubated at 25 °C for 1 hour for SpT3 and DgT binders. Excess binders were removed by loading and washing a resin column.
(The resin column was loaded with GST-Catchers, washed, followed by loading of the assembly mixture.) The assemblies were visualised by SDS-PAGE to validate binder removal.

More specifically, a MagneGST resin column pre-loaded with GST-Catcher was utilised to capture excess ligands from an assembly mixture upon incubation with the resin. To pre-load the resin, GST-Catchers were mixed in PBS to give a final concentration of 2.5 mg of GST-SpyCatcher003 and 2.5 mg of GST-DogCatcher. An appropriate amount of MagneGST resin was added to achieve full capture of both proteins, assuming a binding capacity of 5 mg/mL of resin. The suspension was washed repeatedly with PBS until the A280 of the supernatant approached baseline. The suspension was kept on ice as a 50% slurry with PBS until used in the assembly purification. An appropriate volume of the MagneGST resin-bound GST-DogCatcher and GST-SpyCatcher003 suspension was added to the conjugation reaction to give a final concentration of 16 µM for each GST-Catcher. The suspension was then incubated at room temperature for 4 hours with 600 rpm shaking. The resin was sedimented using an Opentrons magnetic module engaged for 5 minutes. The supernatant was then transferred to a new 96-well plate. The post-purification conjugates were then quantified via Bradford assay and analysed by SDS-PAGE followed by Coomassie staining.

Automated production of Catcher-core assemblies

Large-scale assembly reactions were performed as described above using a Hamilton Microlab Star, with the exception that tagged binder proteins were added at an equimolar ratio to the Catcher-core with a final concentration of 2-4 µM per monomer. Furthermore, no GST-Catcher based purification was performed. Assemblies were prepared in 96-well plates and then transferred to 384-well low dead volume acoustic qualified source plates or 384-well polypropylene acoustic qualified source plates for cell viability/cytotoxicity assays.
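The molar ratios quoted for the assembly reactions (for example 1:1.2:1.2 platform:ligand:ligand at 20 μM Catcher-core monomer, or equimolar at 2-4 µM for the automated runs) become pipetting volumes once stock concentrations are known. The following is a minimal sketch of that arithmetic only. The function name `assembly_volumes`, the stock concentrations and the 50 µL reaction volume are illustrative assumptions and do not come from the protocol.

```python
def assembly_volumes(core_stock_uM, binder_stocks_uM, final_core_uM,
                     binder_ratios, total_volume_uL):
    """Return pipetting volumes (µL) for one assembly reaction.

    All concentrations are per monomer. `binder_ratios` holds the molar
    ratio of each binder relative to the Catcher-core (e.g. 1.2).
    Hypothetical helper, not part of the described method.
    """
    if len(binder_stocks_uM) != len(binder_ratios):
        raise ValueError("one ratio is needed per binder stock")

    # C1 * V1 = C2 * V2 for the Catcher-core
    core_uL = final_core_uM * total_volume_uL / core_stock_uM

    # Each binder is dosed at ratio x the final core concentration
    binders_uL = [
        ratio * final_core_uM * total_volume_uL / stock
        for stock, ratio in zip(binder_stocks_uM, binder_ratios)
    ]

    # Whatever volume is left is made up with buffer
    buffer_uL = total_volume_uL - core_uL - sum(binders_uL)
    if buffer_uL < 0:
        raise ValueError("stocks are too dilute for the requested final concentration")

    return {"core_uL": core_uL, "binders_uL": binders_uL, "buffer_uL": buffer_uL}


# Assumed stocks: 100 µM Catcher-core, 120 µM SpT3 binder, 80 µM DgT binder.
volumes = assembly_volumes(
    core_stock_uM=100.0,
    binder_stocks_uM=[120.0, 80.0],
    final_core_uM=20.0,
    binder_ratios=[1.2, 1.2],
    total_volume_uL=50.0,
)
print(volumes)
```

Setting `binder_ratios=[1.0, 1.0]` and `final_core_uM` to a value between 2 and 4 gives the equimolar case described for the automated production.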
Cell culture

NCI-N87 (CRL-5822) and HCT116 (CCL-247) cells were obtained from ATCC and routinely cultured in RPMI and McCoy medium, respectively, supplemented with 10% FCS and 5% Penicillin/Streptomycin. Additional cell lines for screening were obtained from ATCC or ECACC and routinely cultured in appropriate medium according to supplier recommendation.

Cell imaging

Anti-His & Anti-CutA1 imaging

8-well chamber slides (Lab-Tek) were coated with 200 μL Poly-L-lysine for 1 h at RT under mild shaking. Slides were washed 3x with 500 μL filtered MilliQ water and dried for 2 h at RT. NCI-N87 cells were seeded at 1.5 × 10⁴ cells/well into 8-well chamber slides in 300 μL RPMI-1640 medium supplemented with 10% fetal calf serum and incubated at 37 °C, 5% CO2. After 24 h cells were treated with binders or full assembly at 200 nM monomer concentrations and incubated for 2 h 45 mins at 37 °C, 5% CO2. Cells were washed 2x with DPBS and fixed with 4% paraformaldehyde in PBS for 15 mins at RT. Then, cells were washed 2x 5 mins with DPBS-T (0.1% Triton X-100) and blocked in DPBS-T, 5% FCS (blocking buffer) for 1 h at RT. For immunofluorescent staining, cells were treated with a primary antibody (a. anti-His antibody, Proteintech 66005; b. anti-CutA1 antibody, Abcam ab192236) at 1:500 dilution in blocking buffer at 4 °C overnight. Cells were washed 3x 5 mins with DPBS-T at RT and stained with a secondary antibody (a. goat anti-mouse IgG H&L DL488; b. goat anti-rabbit IgG H&L DL488, Abbexa), washed 3x 5 mins with DPBS-T at RT and mounted using Fluoroshield mounting medium with DAPI (Abcam). Cells were imaged on the Nikon Ti2 inverted confocal microscope and images were analysed using Fiji.

Fluorescein imaging

8-well chamber slides (Lab-Tek) were coated with 200 μL Poly-L-lysine for 1 h at RT under mild shaking. Slides were washed 3x with 500 μL filtered MilliQ water and dried for 2 h at RT. NCI-N87 cells were seeded at 2.5 × 10⁴ cells/well into 8-well chamber slides in 300 μL RPMI-1640 medium supplemented with 10% fetal calf serum and incubated at 37 °C, 5% CO2. After 24 h cells were treated with fluorescein-5-maleimide-labelled Catcher-core (CC*, 100 nM monomer concentration), single assemblies (CC*:L1, 100 nM monomer concentration; L2:CC*, 300 nM monomer concentration) or full assembly (L2:CC*:L1, 100 nM monomer concentration) and incubated for 4 h at 37 °C, 5% CO2. Cells were washed 2x with complete medium followed by 2 washes in DPBS and fixation with 4% paraformaldehyde in PBS for 15 mins at RT. Then, cells were washed 3x 5 mins with DPBS-T (0.05% Tween 20) and mounted using Fluoroshield mounting medium with DAPI (Abcam). Cells were imaged on the Nikon Ti2 inverted confocal microscope and images were analysed using Fiji.

Cell viability assay / cytotoxicity assay

1,000 HCT116 cells/well were seeded into 96-well plates and grown in McCoy medium supplemented with 10% FCS for 24 h. Cells were then treated or mock-treated with various concentrations (0.8-50 nM) of protein assemblies with a panel of 4 × 4 orthogonally tagged ligands (L2, L4, L5, L6; each SpT or DgT) or the corresponding drug-conjugate assemblies (CC042-DM1 assemblies or CC068-Deruxtecan assemblies). Scaffold only (CC042 or CC068) and PBS treated cells were used as controls. Cells were then grown for 4 days, and surviving fractions were measured using the MTT assay. 20 µL of 5 mg/mL 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) in PBS were added to each well containing 200 µL medium. Incubation for 3 h was followed by media aspiration, formazan dissolution in 100% DMSO and absorbance reading at 570 nm. Surviving fractions were normalised to mock-treated controls.

Alternatively, cells were seeded into 384-well black clear flat bottom imaging plates at an appropriate cell density in complete growth medium according to manufacturer’s instructions.
The following day, they were treated with various antibody concentrations (0.5-50 nM by dimer) or a single concentration of Catcher-core assemblies, positive controls (free payload) or were left untreated (negative controls). Addition of drug candidates was performed using an Echo 550 with Catcher-core assemblies in low dead volume acoustic qualified source plates. Cells were then grown for 5 days before staining with Hoechst 33342 for 1 h and analysis using a Celigo Imaging Cytometer. Cell viability was assessed by quantification of total Hoechst+ cells and results were normalised to positive and negative controls.

Brief Description of the Informal Sequence Listing

SEQ ID NO: 1 shows the amino acid sequence of a monomer of the human CutA1 protein.
SEQ ID NO: 2 shows the amino acid sequence of “SpyCatcher” with a His-Tag and optional N-terminal protease site and linker.
SEQ ID NO: 3 shows the amino acid sequence of “SpyTag”. This is also referred to herein as “SpyTag 001”.
SEQ ID NO: 4 shows the amino acid sequence of minimal reactive “SpyTag”.
SEQ ID NO: 5 shows the amino acid sequence of “SpyCatcher”. This is also referred to herein as “SpyCatcher 001”.
SEQ ID NO: 6 shows the amino acid sequence of minimal reactive SpyCatcher “ΔN1ΔC2”.
SEQ ID NO: 7 shows the amino acid sequence of “SpyTag 002”.
SEQ ID NO: 8 shows the amino acid sequence of “SpyCatcher 002”.
SEQ ID NO: 9 shows the amino acid sequence of “SpyTag 003”.
SEQ ID NO: 10 shows the amino acid sequence of “SpyCatcher 003”.
SEQ ID NO: 11 shows the amino acid sequence of “SnoopCatcher”.
SEQ ID NO: 12 shows the amino acid sequence of “SnoopTag”.
SEQ ID NO: 13 shows the amino acid sequence of “DogCatcher”.
SEQ ID NO: 14 shows the amino acid sequence of “DogTag”.
SEQ ID NO: 15 shows the amino acid sequence of a monomer of the His-tagged construct H6-SpyCatcher003-HsCutA1-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 16; also “CC1”.
SEQ ID NO: 16 shows the amino acid sequence of a monomer of the human CutA1 truncated based on the resolved structure in PDB ID: 2zfh, representing an intermediate truncation between SEQ ID NO: 1 and SEQ ID NO: 18.
SEQ ID NO: 17 shows the amino acid sequence of a monomer of the construct SpyCatcher003-HsCutA1-DogCatcher as used in structural prediction.
SEQ ID NO: 18 shows the amino acid sequence of a monomer of the human CutA1 as resolved in PDB ID: 2zfh, representing a truncation of SEQ ID NO: 1.
SEQ ID NO: 19 shows the amino acid sequence of a monomer of human CutA1 truncated at both termini to HsCutA1(60-171).
SEQ ID NO: 20 shows the amino acid sequence of a monomer of human CutA1 as in SEQ ID NO: 16 with C75A and C96A mutations.
SEQ ID NO: 21 shows the amino acid sequence of a GGGGS linker fused to HsCutA1 to create a direct fusion construct without SpyCatcher, DogCatcher, or SnoopCatcher modules.
SEQ ID NO: 22 shows the amino acid sequence of a (GGGGS)3 linker fused to HsCutA1 to create a direct fusion construct without SpyCatcher, DogCatcher, or SnoopCatcher modules.
SEQ ID NO: 23 shows the amino acid sequence of a monomer of human CutA1 as in SEQ ID NO: 16 with C75V and C96S mutations.
SEQ ID NO: 24 shows the amino acid sequence of a monomer of human CutA1 as in SEQ ID NO: 19 with C75A and C96A mutations.
SEQ ID NO: 25 shows the amino acid sequence of a monomer of human CutA1 as in SEQ ID NO: 19 with C75V and C96S mutations.
SEQ ID NO: 26 shows the amino acid sequence of a monomer of the His-tagged construct H6-SpyCatcher003-(GGGGS)3-[C75V, C96S]HsCutA1(44-179)-(GGGGS)3-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 23 and with linker as in SEQ ID NO: 22.
SEQ ID NO: 27 shows the amino acid sequence of a monomer of the His-tagged construct H6-SpyCatcher003-(GGGGS)3-HsCutA1(60-171)-(GGGGS)3-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 19 and with linker as in SEQ ID NO: 22.
SEQ ID NO: 28 shows the amino acid sequence of a monomer of the His-tagged construct H6-SpyCatcher003-(GGGGS)3-[C75A, C96A]HsCutA1(60-171)-(GGGGS)3-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 24 and with linker as in SEQ ID NO: 22.
SEQ ID NO: 29 shows the amino acid sequence of a monomer of the His-tagged construct H6-SpyCatcher003-(GGGGS)3-[C75V, C96S]HsCutA1(60-171)-(GGGGS)3-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 25 and with linker as in SEQ ID NO: 22.
SEQ ID NO: 30 shows the amino acid sequence of a monomer of human CutA1 as in SEQ ID NO: 20 with an E78C mutation.
SEQ ID NO: 31 shows the amino acid sequence of a monomer of human CutA1 as in SEQ ID NO: 20 with a K82C mutation.
SEQ ID NO: 32 shows the amino acid sequence of a monomer of human CutA1 as in SEQ ID NO: 20 with a Q102C mutation.
SEQ ID NO: 33 shows the amino acid sequence of a monomer of human CutA1 as in SEQ ID NO: 20 with an E114C mutation.
SEQ ID NO: 34 shows the amino acid sequence of a monomer of human CutA1 as in SEQ ID NO: 20 with an F136C mutation.
SEQ ID NO: 35 shows the amino acid sequence of a monomer of human CutA1 as in SEQ ID NO: 20 with a Q166C mutation.
SEQ ID NO: 36 shows the amino acid sequence of the His-tagged construct H6-SpyCatcher003-GGGGS-[C75A, C96A]HsCutA1(44-179)-GGGGS-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 20 and with linker as in SEQ ID NO: 21; also “CC041”.
SEQ ID NO: 37 shows the amino acid sequence of the His-tagged construct H6-SpyCatcher003-GGGGS-[C75A, C96A, E78C]HsCutA1(44-179)-GGGGS-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 30 and with linker as in SEQ ID NO: 21; also “CC056”.
SEQ ID NO: 38 shows the amino acid sequence of the His-tagged construct H6-SpyCatcher003-GGGGS-[C75A, C96A, K82C]HsCutA1(44-179)-GGGGS-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 31 and with linker as in SEQ ID NO: 21; also “CC042”.
SEQ ID NO: 39 shows the amino acid sequence of the His-tagged construct H6-SpyCatcher003-GGGGS-[C75A, C96A, Q102C]HsCutA1(44-179)-GGGGS-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 32 and with linker as in SEQ ID NO: 21; also “CC057”.
SEQ ID NO: 40 shows the amino acid sequence of the His-tagged construct H6-SpyCatcher003-GGGGS-[C75A, C96A, E114C]HsCutA1(44-179)-GGGGS-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 33 and with linker as in SEQ ID NO: 21; also “CC038”.
SEQ ID NO: 41 shows the amino acid sequence of the His-tagged construct H6-SpyCatcher003-GGGGS-[C75A, C96A, F136C]HsCutA1(44-179)-GGGGS-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 34 and with linker as in SEQ ID NO: 21; also “CC044”.
SEQ ID NO: 42 shows the amino acid sequence of the His-tagged construct H6-SpyCatcher003-GGGGS-[C75A, C96A, Q166C]HsCutA1(44-179)-GGGGS-DogCatcher, with HsCutA1 truncated as in SEQ ID NO: 35 and with linker as in SEQ ID NO: 21; also “CC058”.
SEQ ID NO: 43 shows the amino acid sequence of the wild-type human IgG1 hinge region.
SEQ ID NO: 44 shows the amino acid sequence of the human IgG1 hinge region with a C230S (PDB numbering) mutation.
SEQ ID NO: 45 shows the amino acid sequence of a (GGGGS)2 linker fused to HsCutA1 to create a direct fusion construct without SpyCatcher, DogCatcher, or SnoopCatcher modules.
SEQ ID NO: 46 shows the amino acid sequence of the wild-type human IgG1 Fc region without the hinge.
SEQ ID NO: 47 shows the amino acid sequence of the construct SpyCatcher003-hinge[C230S]-Fc-(GGGGS)2-DogCatcher with hinge region as in SEQ ID NO: 44 and with C-terminal linker as in SEQ ID NO: 45.
SEQ ID NO: 48 shows the amino acid sequence of the construct H6-SpyCatcher003-(GGGGS)3-Fc-(GGGGS)3-DogCatcher, with linker as in SEQ ID NO: 22.
SEQ ID NO: 49 shows the amino acid sequence of His-GST-(GGGGS)3-SpyCatcher003.
SEQ ID NO: 50 shows the amino acid sequence of His-GST-(GGGGS)3-DogCatcher.
SEQ ID NO: 51 shows the amino acid sequence of a monomer of the His-tagged construct H6-SpyCatcher003-(GGGGS)3-[C75V, C96S]HsCutA1(60-171)-(GGGGS)3-DogCatcher.
SEQ ID NO: 52 shows the amino acid sequence of a monomer of CC076 construct SpyCatcher003-(GGGGS)3-DogCatcher-hinge[C230S]-Fc.
SEQ ID NO: 53 shows the amino acid sequence of a monomer of CC080 construct SpyCatcher003[S49C]-hinge-Fc-(GGGGS)2-DogCatcher. This construct has a wild-type hinge region sequence as in SEQ ID NO: 43 and an [S49C] mutation in the SpyCatcher003 domain for a total of 4 Cys residues per monomer.
SEQ ID NO: 54 shows the amino acid sequence of a monomer of CC086 construct SpyCatcher003[S49C]-GGGGS-hinge-Fc-(GGGGS)2-DogCatcher. This construct has a wild-type hinge region sequence as in SEQ ID NO: 43 and an [S49C] mutation in the SpyCatcher003 domain for a total of 4 Cys residues per monomer.
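The mutation names in the listing above (C75A, C96A, K82C and so on) appear to follow full-length human CutA1 numbering, whereas the 44-179 truncation of SEQ ID NO: 16 starts at residue 44. The sketch below shows how such names could be mapped onto the truncated sequence. It assumes that numbering offset; the helper `apply_mutations` is hypothetical and only illustrates the bookkeeping, using the SEQ ID NO: 16 sequence as listed in this document.

```python
import re

# SEQ ID NO: 16 as listed in this document (HsCutA1 residues 44-179).
SEQ16 = (
    "MASGSPPTQPSPASDSGSGYVPGSVSAAFVTCPNEKVAKEIARAVVEKRLAACVNLIPQITSIYEWKGKIEEDSE"
    "VLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLP"
)
FIRST_RESIDUE = 44  # assumed full-length number of the first residue above


def apply_mutations(sequence, mutations, first_residue=FIRST_RESIDUE):
    """Apply point mutations such as 'K82C' given in full-length numbering.

    Raises ValueError if a name is malformed, falls outside the sequence,
    or names a wild-type residue that is not found at that position.
    """
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation name: {mutation}")
        wild_type, position, replacement = match.groups()
        index = int(position) - first_residue  # 0-based index in the truncation
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation} lies outside the truncated sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# The three substitutions named for the CutA1 core of CC042 (SEQ ID NO: 38).
k82c = apply_mutations(SEQ16, ["C75A", "C96A", "K82C"])
print(k82c)
```

The wild-type check is the useful part of this design: if the numbering offset were wrong, the named residue would not be found at the computed index and the call would fail rather than silently mutate the wrong position.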
SEQ ID NO: 1 MSGGRAPAVLLGGVASLLLSFVWMPALLPVASRLLLLPRVLLTMASGSPPTQPSPASDSGSGYVPGSVSAAFVTC PNEKVAKEIARAVVEKRLAACVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIA LPVEQGNFPYLQWVRQVTESVSDSITVLP SEQ ID NO: 2 MSYYHHHHHHDYDIPTTENLYFQGAMVDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDEDGKELAGATMELRDS SGKTISTWISDGQVKDFYLYPGKYTFVETAAPDGYEVATAITFTVNEQGQVTVNGKATKGDAHI SEQ ID NO: 3 AHIVMVDAYKPTK SEQ ID NO: 4 AHIVMVDA SEQ ID NO: 5 VDTLSGLSSEQGQSGDMTIEEDSATHIKFSKRDEDGKELAGATMELRDSSGKTISTWISDGQVKDFYLYPGKYTF VETAAPDGYEVATAITFTVNEQGQVTVNGKATKGDAHI SEQ ID NO: 6 DSATHIKFSKRDEDGKELAGATMELRDSSGKTISTWISDGQVKDFYLYPGKYTFVETAAPDGYEVATAITFTVNE QGQVTVNG SEQ ID NO: 7 VPTIVMVDAYKRYK SEQ ID NO: 8 VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGHVKDFYLYPGKYTF VETAAPDGYEVATAITFTVNEQGQVTVNGEATKGDAHT SEQ ID NO: 9 RGVPHIVMVDAYKRYK SEQ ID NO: 10 VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGHVKDFYLYPGKYTF VETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHT SEQ ID NO: 11 KPLRGAVFSLQKQHPDYPDIYGAIDQNGTYQNVRTGEDGKLTFKNLSDGKYRLFENSEPAGYKPVQNKPIVAFQI VNGEVRDVTSIVPQDIPATYEFTNGKHYITNEPIPPK SEQ ID NO: 12 KLGDIEFIKVNK SEQ ID NO: 13 KLGEIEFIKVDKTDKKPLRGAVFSLQKQHPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPP GYKPVQNKPIVSFRIVDGEVRDVTSIVPQ SEQ ID NO: 14 DIPATYEFTDGKHYITNEPIPPK SEQ ID NO: 15 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSMASGSPPTQPSPASDSGSG YVPGSVSAAFVTCPNEKVAKEIARAVVEKRLAACVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFV RSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVP Q SEQ ID NO: 16 MASGSPPTQPSPASDSGSGYVPGSVSAAFVTCPNEKVAKEIARAVVEKRLAACVNLIPQITSIYEWKGKIEEDSE VLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLP SEQ ID NO: 17 VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGHVKDFYLYPGKYTF VETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGSGSMASGSPPTQPSPASDSGSGYVPGSVSAAFVTCP 
NEKVAKEIARAVVEKRLAACVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIAL PVEQGNFPYLQWVRQVTESVSDSITVLPGSGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQHPDYPDIYGAIDQNG TYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVPQ SEQ ID NO: 18 SGYVPGSVSAAFVTCPNEKVAKEIARAVVEKRLAACVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTD FVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVT SEQ ID NO: 19 GSGYVPGSVSAAFVTCPNEKVAKEIARAVVEKRLAACVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALT DFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESV SEQ ID NO: 20 MASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSE VLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLP SEQ ID NO: 21 GGGGS SEQ ID NO: 22 GGGGSGGGGSGGGGS SEQ ID NO: 23 MASGSPPTQPSPASDSGSGYVPGSVSAAFVTVPNEKVAKEIARAVVEKRLAASVNLIPQITSIYEWKGKIEEDSE VLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLP SEQ ID NO: 24 GSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALT DFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESV SEQ ID NO: 25 GSGYVPGSVSAAFVTVPNEKVAKEIARAVVEKRLAASVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALT DFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESV SEQ ID NO: 26 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSGGGGSGGGGSMASGSPPTQ PSPASDSGSGYVPGSVSAAFVTVPNEKVAKEIARAVVEKRLAASVNLIPQITSIYEWKGKIEEDSEVLMMIKTQS SLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLPGGGGSGGGGSGGGGSKLGEIEFI KVDKTDKKPLRGAVFSLQKQHPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNK PIVSFRIVDGEVRDVTSIVPQ SEQ ID NO: 27 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSGGGGSGGGGSGSGYVPGSV SAAFVTCPNEKVAKEIARAVVEKRLAACVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPY EVAEVIALPVEQGNFPYLQWVRQVTESVGGGGSGGGGSGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQHPDY PDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVPQ SEQ ID NO: 28 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH 
VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSGGGGSGGGGSGSGYVPGSV SAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPY EVAEVIALPVEQGNFPYLQWVRQVTESVGGGGSGGGGSGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQHPDY PDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVPQ SEQ ID NO: 29 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSGGGGSGGGGSGSGYVPGSV SAAFVTVPNEKVAKEIARAVVEKRLAASVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPY EVAEVIALPVEQGNFPYLQWVRQVTESVGGGGSGGGGSGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQHPDY PDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVPQ SEQ ID NO: 30 MASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNCKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSE VLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLP SEQ ID NO: 31 MASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVACEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSE VLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLP SEQ ID NO: 32 MASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPCITSIYEWKGKIEEDSE VLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLP SEQ ID NO: 33 MASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKICEDSE VLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLP SEQ ID NO: 34 MASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSE VLMMIKTQSSLVPALTDCVRSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLP SEQ ID NO: 35 MASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSE VLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALPVEQGNFPYLQWVRCVTESVSDSITVLP SEQ ID NO: 36 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSMASGSPPTQPSPASDSGSG YVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFV RSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ Q SEQ ID NO: 37 
MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSMASGSPPTQPSPASDSGSG YVPGSVSAAFVTAPNCKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFV RSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVP Q SEQ ID NO: 38 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSMASGSPPTQPSPASDSGSG YVPGSVSAAFVTAPNEKVACEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFV RSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVP Q SEQ ID NO: 39 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSMASGSPPTQPSPASDSGSG YVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPCITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFV RSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVP Q ID NO: 40 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSMASGSPPTQPSPASDSGSG YVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKICEDSEVLMMIKTQSSLVPALTDFV RSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVP Q SEQ ID NO: 41 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSMASGSPPTQPSPASDSGSG YVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDCV RSVHPYEVAEVIALPVEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVP Q SEQ ID NO: 42 
MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSMASGSPPTQPSPASDSGSG YVPGSVSAAFVTAPNEKVAKEIARAVVEKRLAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFV RSVHPYEVAEVIALPVEQGNFPYLQWVRCVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVP Q SEQ ID NO: 43 EPKSCDKTHTCPPCPAPELLG SEQ ID NO: 44 EPKSSDKTHTCPPCPAPELLG SEQ ID NO: 45 GGGGSGGGGS SEQ ID NO: 46 GPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQ DWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQ PENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK SEQ ID NO: 47 VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGHVKDFYLYPGKYTF VETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTEPKSSDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLM ISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKA LPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDG SFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGKGGGGSGGGGSKLGEIEFIKVDKTDKKPLRG AVFSLQKQHPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEV RDVTSIVPQ SEQ ID NO: 48 GSSHHHHHHGSSVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGHV KDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSGGGGSGGGGSGPSVFLFPPK PKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKC KVSNKALPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPP VLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGKGGGGSGGGGSGGGGSKLGEIEFIK VDKTDKKPLRGAVFSLQKQHPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKP IVSFRIVDGEVRDVTSIVPQ SEQ ID NO: 49 MGSSHHHHHHSSGMSPILGYWKIKGLVQPTRLLLEYLEEKYEEHLYERDEGDKWRNKKFELGLEFPNLPYYIDGD VKLTQSMAIIRYIADKHNMLGGCPKERAEISMLEGAVLDIRYGVSRIAYSKDFETLKVDFLSKLPEMLKMFEDRL CHKTYLNGDHVTHPDFMLYDALDVVLYMDPMCLDAFPKLVCFKKRIEAIPQIDKYLKSSKYIAWPLQGWQATFGG GDHPPKGGGGSGGGGSGGGGSVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTI 
STWISDGHVKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHT SEQ ID NO: 50 MGSSHHHHHHSSGMSPILGYWKIKGLVQPTRLLLEYLEEKYEEHLYERDEGDKWRNKKFELGLEFPNLPYYIDGD VKLTQSMAIIRYIADKHNMLGGCPKERAEISMLEGAVLDIRYGVSRIAYSKDFETLKVDFLSKLPEMLKMFEDRL CHKTYLNGDHVTHPDFMLYDALDVVLYMDPMCLDAFPKLVCFKKRIEAIPQIDKYLKSSKYIAWPLQGWQATFGG GDHPPKGGGGSGGGGSGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQHPDYPDIYGAIDQNGTYQDVRTGEDG KLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVPQ SEQ ID NO: 51 MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGH VKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSGGGGSGGGGSGSGYVPGSV SAAFVTVPNEKVAKEIARAVVEKRLAASVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPY EVAEVIALPVEQGNFPYLQWVRQVTESVGGGGSGGGGSGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQHPDY PDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRDVTSIVPQ SEQ ID NO: 52 VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDSSGKTISTWISDGHVKDFYLYPGKYTF VETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSGGGGSGGGGSKLGEIEFIKVDKTDKKPLRGAV FSLQKQHPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEVRD VTSIVPQEPKSSDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVE VHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQVYTLPPSRDEL TKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALH NHYTQKSLSLSPGK SEQ ID NO: 53 VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDCSGKTISTWISDGHVKDFYLYPGKYTF VETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLM ISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKA LPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDG SFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGKGGGGSGGGGSKLGEIEFIKVDKTDKKPLRG AVFSLQKQHPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRIVDGEV RDVTSIVPQ SEQ ID NO: 54 VTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELRDCSGKTISTWISDGHVKDFYLYPGKYTF VETAAPDGYEVATPIEFTVNEDGQVTVDGEATEGDAHTGGGGSEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKP 
KDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCK VSNKALPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPV LDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGKGGGGSGGGGSKLGEIEFIKVDKTDK KPLRGAVFSLQKQHPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSFRI VDGEVRDVTSIVPQ

Alignment showing sequence variation:

CC1     MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELR 60
CC058   MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELR 60
CC057   MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELR 60
CC056   MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELR 60
CC044   MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELR 60
CC042   MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELR 60
CC038   MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELR 60
CC041   MGSSHHHHHHSSGVTTLSGLSGEQGPSGDMTTEEDSATHIKFSKRDEDGRELAGATMELR 60
        ************************************************************

CC1     DSSGKTISTWISDGHVKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEAT 120
CC058   DSSGKTISTWISDGHVKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEAT 120
CC057   DSSGKTISTWISDGHVKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEAT 120
CC056   DSSGKTISTWISDGHVKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEAT 120
CC044   DSSGKTISTWISDGHVKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEAT 120
CC042   DSSGKTISTWISDGHVKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEAT 120
CC038   DSSGKTISTWISDGHVKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEAT 120
CC041   DSSGKTISTWISDGHVKDFYLYPGKYTFVETAAPDGYEVATPIEFTVNEDGQVTVDGEAT 120
        ************************************************************

CC1     EGDAHTGGGGSMASGSPPTQPSPASDSGSGYVPGSVSAAFVTCPNEKVAKEIARAVVEKR 180
CC058   EGDAHTGGGGSMASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKR 180
CC057   EGDAHTGGGGSMASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKR 180
CC056   EGDAHTGGGGSMASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNCKVAKEIARAVVEKR 180
CC044   EGDAHTGGGGSMASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKR 180
CC042   EGDAHTGGGGSMASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVACEIARAVVEKR 180
CC038   EGDAHTGGGGSMASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKR 180
CC041   EGDAHTGGGGSMASGSPPTQPSPASDSGSGYVPGSVSAAFVTAPNEKVAKEIARAVVEKR 180
        ******************************************.** *** **********

CC1     LAACVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALP 240
CC058   LAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALP 240
CC057   LAAAVNLIPCITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALP 240
CC056   LAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALP 240
CC044   LAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDCVRSVHPYEVAEVIALP 240
CC042   LAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALP 240
CC038   LAAAVNLIPQITSIYEWKGKICEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALP 240
CC041   LAAAVNLIPQITSIYEWKGKIEEDSEVLMMIKTQSSLVPALTDFVRSVHPYEVAEVIALP 240
        ***.***** *********** ********************* ****************

CC1     VEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ 300
CC058   VEQGNFPYLQWVRCVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ 300
CC057   VEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ 300
CC056   VEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ 300
CC044   VEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ 300
CC042   VEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ 300
CC038   VEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ 300
CC041   VEQGNFPYLQWVRQVTESVSDSITVLPGGGGSKLGEIEFIKVDKTDKKPLRGAVFSLQKQ 300
        ************* **********************************************

CC1     HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSF 360
CC058   HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSF 360
CC057   HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSF 360
CC056   HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSF 360
CC044   HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSF 360
CC042   HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSF 360
CC038   HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSF 360
CC041   HPDYPDIYGAIDQNGTYQDVRTGEDGKLTFTNLSDGKYRLIENSEPPGYKPVQNKPIVSF 360
        ************************************************************

CC1     RIVDGEVRDVTSIVPQ 376
CC058   RIVDGEVRDVTSIVPQ 376
CC057   RIVDGEVRDVTSIVPQ 376
CC056   RIVDGEVRDVTSIVPQ 376
CC044   RIVDGEVRDVTSIVPQ 376
CC042   RIVDGEVRDVTSIVPQ 376
CC038   RIVDGEVRDVTSIVPQ 376
CC041   RIVDGEVRDVTSIVPQ 376
        ****************

References for the Examples

1. Ye, Y. & Godzik, A. FATCAT: a web server for flexible structure comparison and structure similarity searching. Nucleic Acids Res. 32, W582–585 (2004).
2. Huang, X., Pearce, R. & Zhang, Y. EvoEF2: accurate and fast energy function for computational protein design. Bioinforma. Oxf. Engl. 36, 1135–1142 (2020).
3. Krissinel, E. & Henrick, K. Inference of Macromolecular Assemblies from Crystalline State. J. Mol. Biol. 372, 774–797 (2007).
4. Henkel, M., Röckendorf, N. & Frey, A. Selective and Efficient Cysteine Conjugation by Maleimides in the Presence of Phosphine Reductants. Bioconjug. Chem. 27, 2260–2265 (2016).
5. Kantner, T., Alkhawaja, B. & Watts, A. G. In Situ Quenching of Trialkylphosphine Reducing Agents Using Water-Soluble PEG-Azides Improves Maleimide Conjugation to Proteins. ACS Omega 2, 5785–5791 (2017).
6. Li, W. et al. Synthesis and Evaluation of Camptothecin Antibody–Drug Conjugates. ACS Med. Chem. Lett. 10, 1386–1392 (2019).

It is understood that the Examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, sequence accession numbers, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.