Peptide linker with reduced post-translational modifications
The present invention is in the field of recombinant polypeptide production. Reported herein is a novel peptide linker that reduces post-translational modifications, resulting in less product heterogeneity, in particular in the recombinant production of fusion polypeptides.
Background
The recombinant production of multispecific antibodies in eukaryotic cells has made many different molecular formats accessible. In some of these formats, domain architectures are used in which different, non-naturally associated domains are fused to each other by means of linkers. If the fusion polypeptide is to be produced recombinantly, such a linker must be an encodable linker, such as a peptide linker.
Peptide linkers are synthetic amino acid sequences used to link different polypeptide domains. They are typically composed of a linear chain of amino acids joined by peptide bonds, with the 20 naturally occurring amino acids serving as the monomeric building blocks. A peptide linker may have a length of 1 to 50 amino acid residues; in most cases a length of 3 to 25 amino acid residues is chosen. The peptide linker may comprise a repeating amino acid sequence. Peptide linkers ensure that the components they join remain connected while allowing the domains to fold and to be presented correctly, so that their biological activity is retained. Typically, the peptide linker is a synthetic peptide linker that is rich in glycine, glutamine and/or serine residues. These residues are arranged in small repeat units of up to five amino acids, such as GGGS or GGGGS (SEQ ID NO:16 and SEQ ID NO:17, respectively). The small repeat unit can be repeated two to five times to form a multimeric unit, e.g. (GGGS)2 or (GGGGS)2.
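The multimeric units mentioned above can be written out explicitly. The following is a minimal sketch, not part of the reported method; the function name `repeat_linker` is hypothetical, and the limits it enforces simply mirror the figures given in the preceding paragraph (repeat units of up to five residues, two to five repeats).

```python
def repeat_linker(unit: str, copies: int) -> str:
    """Concatenate a small repeat unit, e.g. (GGGGS)2 -> GGGGSGGGGS."""
    if not 1 <= len(unit) <= 5:
        raise ValueError("repeat unit is expected to have up to five residues")
    if not 2 <= copies <= 5:
        raise ValueError("the text describes two to five repeats")
    return unit * copies


print(repeat_linker("GGGS", 2))   # GGGSGGGS
print(repeat_linker("GGGGS", 2))  # GGGGSGGGGS
```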
Unfortunately, when fusion polypeptides comprising a peptide linker are recombinantly produced in eukaryotic host cells, the amino acid residues within the peptide linker may serve as substrates for in vivo post-translational modifications. The addition of post-translational modifications increases the heterogeneity of the recombinantly produced fusion polypeptide. Thus, the provision of novel peptide linkers that reduce or even eliminate post-translational modifications will enable the recombinant production of more homogeneous fusion polypeptide preparations.
Hydroxyproline in the GGGGP (SEQ ID NO:18) linker was reported in the publication by Spahr et al (MAbs 9(2017) 812-819).
Information on biochemical and structural studies that have advanced the understanding of the mechanisms of the three major classes of protein serine/threonine phosphatases, with emphasis on PP2A, is reported by Shi, Y. (Cell 139 (2009) 468-).
Basle, E. et al. report information on the chemical modification of proteins at endogenous amino acids (Chem. Biol. 17 (2010) 213-227).
Polypeptide variants comprising the ligand binding domain of a cytokine linked via a flexible polypeptide linker molecule are disclosed in WO 2003/062276.
WO 2011/161260 discloses anti-cancer fusion proteins.
WO 2012/088461 discloses that a linker peptide lacking the amino acid sequence GSG reduces or eliminates the addition of post-translational modifications to a polypeptide comprising the linker peptide.
WO 2014/085621 discloses therapeutic fusion proteins useful in the treatment of lysosomal storage diseases and methods for treating such diseases.
WO 2015/091130 discloses a method for the recombinant production of a soluble form of a polypeptide, the method comprising the steps of: transfecting a eukaryotic cell with a nucleic acid encoding the polypeptide, whereby the polypeptide has been modified (as compared to the wild-type polypeptide) by introducing one or more artificial glycosylation sites; culturing the transfected cells in a culture medium; and recovering the polypeptide from the culture medium, whereby the yield of monomeric polypeptide (determined after one purification step) is increased by at least 100% compared to the wild-type polypeptide.
WO 2016/115511 discloses VEGF variant polypeptide components.
WO 2011/161260 discloses a fusion protein comprising: domain (a) which is a functional fragment of a soluble hTRAIL protein sequence; and domain (b), which is a sequence of a pro-apoptotic effector peptide.
WO 2016/120216 discloses a polypeptide encoding a chimeric antigen receptor comprising at least one extracellular binding domain comprising at least a scFv formed by a VH chain and a VL chain specific for an antigen, wherein said extracellular binding domain comprises at least one mAb-specific epitope.
Spencer, D. et al. disclose that O-xylosylation in recombinant proteins is directed at a common motif of glycine-serine linkers (J. Pharm. Sci. 102 (2013) 3920-).
Wen, D. et al. disclose studies on O-xylosylation in engineered proteins containing (GGGGS)n (SEQ ID NO:17) linkers (Anal. Chem. 85 (2013) 4805-4812).
J. Cell surface and extracellular proteins are disclosed to be O-glycosylated, with the most abundant type of O-glycosylation in proteins being the attachment of GalNAc to serine or threonine in the protein chain via a glycosidic bond (Meth. Enzymol. 405 (2005) 139-).
Plomp, R. et al. disclose the hinge region O-glycosylation of human immunoglobulin G3 (Mol. Cell. Proteomics 14 (2015) 1373-1384).
US 9,409,960 discloses linker peptides and polypeptides comprising the linker peptides, wherein the linker peptide lacking the amino acid sequence GSG reduces or eliminates the addition of post-translational modifications to the polypeptide comprising the linker peptide.
Disclosure of Invention
The present invention is based, at least in part, on the following finding: a glycine-serine peptide linker that lacks at least the C-terminal serine residue, or even all serine residues (thereby producing a pure glycine linker), reduces or even eliminates the addition of post-translational modifications to the fusion polypeptide comprising the peptide linker. To achieve this, the polypeptide fused to the C-terminus of the peptide linker should not contain a serine, threonine or proline residue at its N-terminus, i.e. the first amino acid residue after the peptide linker should not be a serine, threonine or proline residue.
More specifically, a peptide linker as reported herein reduces or even eliminates the ability of an enzyme to add post-translational modifications (such as phosphate groups or carbohydrate moieties) to a fusion polypeptide comprising such a peptide linker, e.g. it reduces the ability of a xylosyltransferase to attach xylose to serine residues.
Thus, by including a peptide linker as reported herein in the fusion polypeptide, the homogeneity of recombinantly (in eukaryotic cells) produced fusion polypeptide compositions and preparations may be increased.
One aspect of the present invention is a fusion polypeptide comprising the following amino acid sequence
GyX1 (SEQ ID NO:04),
wherein X1 can be any amino acid residue other than serine, threonine, and proline, and
wherein y is an integer of 3 to 25 inclusive.
In one embodiment, the fusion polypeptide comprises the following amino acid sequence
GyX1X2X3 (SEQ ID NO:05),
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 may be any amino acid residue independent of each other, and
wherein y is an integer of 3 to 25 inclusive.
In one embodiment, y is an integer from 4 to 20 inclusive.
In one embodiment, y is an integer from 5 to 15 inclusive.
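To illustrate how a candidate sequence could be screened for the glycine-only motif, the following sketch locates glycine runs and inspects the residue that follows each run. It rests on assumptions that are not stated in the text: the function names are hypothetical, sequences are given in one-letter code, and each glycine run is taken at its maximal length, so that X1 is read as the first non-glycine residue after the run.

```python
import re

EXCLUDED_X1 = frozenset("STP")  # serine, threonine, proline


def glycine_runs(seq, y_min=3, y_max=25):
    """Return (start, y, x1) for every maximal glycine run of y_min..y_max
    residues that is followed by at least one further residue."""
    hits = []
    for match in re.finditer(r"G+", seq):
        y = match.end() - match.start()
        if y_min <= y <= y_max and match.end() < len(seq):
            hits.append((match.start(), y, seq[match.end()]))
    return hits


def has_gy_x1(seq, y_min=3, y_max=25):
    """True if at least one glycine run is followed by a permitted X1."""
    return any(x1 not in EXCLUDED_X1
               for _, _, x1 in glycine_runs(seq, y_min, y_max))


print(glycine_runs("AAAGGGGGDKT"))  # [(3, 5, 'D')]
print(has_gy_x1("AAAGGGGGDKT"))     # True
print(has_gy_x1("AAAGGGGGSKT"))     # False
```

The narrower embodiments can be checked by passing `y_min=4, y_max=20` or `y_min=5, y_max=15`.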
One aspect of the present invention is a fusion polypeptide comprising the following amino acid sequence
GnSGmX1 (SEQ ID NO:01),
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein n is 1, 2, 3 or 4, and
wherein m is 3, 4 or 5.
In one embodiment of all aspects of the invention, the fusion polypeptide comprises the following amino acid sequence
GnSGmX1X2X3 (SEQ ID NO:02),
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 can be any amino acid residue independently of each other,
wherein n is 1, 2, 3 or 4, and
wherein m is 3, 4 or 5.
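The constraints on n, m and X1 can be made concrete with a small constructor. This is an illustrative sketch only: the function name `build_seq_id_02` is hypothetical, and restricting X1, X2 and X3 to the 20 standard one-letter codes is an assumption made for the example.

```python
STANDARD_RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWY")
EXCLUDED_X1 = frozenset("STP")  # serine, threonine, proline


def build_seq_id_02(n, m, x1, x2, x3):
    """Assemble GnSGmX1X2X3 and reject combinations outside the stated ranges."""
    if n not in (1, 2, 3, 4):
        raise ValueError("n must be 1, 2, 3 or 4")
    if m not in (3, 4, 5):
        raise ValueError("m must be 3, 4 or 5")
    for residue in (x1, x2, x3):
        if residue not in STANDARD_RESIDUES:
            raise ValueError(f"unknown residue: {residue!r}")
    if x1 in EXCLUDED_X1:
        raise ValueError("X1 must not be serine, threonine or proline")
    return "G" * n + "S" + "G" * m + x1 + x2 + x3


print(build_seq_id_02(3, 4, "D", "K", "T"))  # GGGSGGGGDKT
```

Calling the function with X1 set to "S", "T" or "P" raises a `ValueError`, which mirrors the exclusion stated for X1.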
In one embodiment of all aspects of the invention, the fusion polypeptide comprises at least three domains
Wherein each of the three domains is a polypeptide of at least 10 amino acid residues in length, independently of the other two domains, and
wherein the domains are conjugated to each other via peptide bonds.
In one embodiment of all aspects of the invention, the C-terminus of the first domain is conjugated to the N-terminus of the second domain via a peptide bond, and the C-terminus of the second domain is conjugated to the N-terminus of the third domain via a peptide bond.
In one embodiment of all aspects of the invention, the fusion polypeptide is a recombinant fusion polypeptide.
In one embodiment of all aspects of the invention, the fusion polypeptide is produced in a eukaryotic cell.
In one embodiment of all aspects of the invention, the fusion polypeptide has no post-translational modification at S or X1 of SEQ ID NO: 01.
In one embodiment of all aspects of the invention, the linear fusion polypeptide has no post-translational modification at X1 of SEQ ID NO: 01.
In one embodiment of all aspects of the invention, the post-translational modification is phosphorylation (addition of a phosphate group) and/or glycosylation (addition of a carbohydrate moiety). In one embodiment of all aspects of the invention, the glycosylation is xylosylation (the carbohydrate moiety is xylose). In one embodiment of all aspects of the invention, the glycosylation is glucosylation (the carbohydrate moiety is glucose).
In one embodiment of all aspects of the invention, the fusion polypeptide has reduced post-translational modifications at residues S and/or X1 compared to a fusion polypeptide comprising the amino acid sequence:
GnSGmX1X2X3 (SEQ ID NO:06),
wherein X1 is serine, threonine or proline,
wherein X2 and X3 can be any amino acid residue independently of each other,
wherein n is 1, 2, 3 or 4, and
wherein m is 3, 4 or 5.
In one embodiment of all aspects of the invention, n is 3 and m is 3 or 4. In a preferred embodiment of all aspects of the invention, n is 3 and m is 4.
In one embodiment of all aspects of the invention, n is 4 and m is 4 or 5. In a preferred embodiment of all aspects of the invention, n is 4 and m is 5.
In one embodiment of all aspects of the invention, the fusion polypeptide comprises the following amino acid sequence
GpSGnSGmX1X2X3 (SEQ ID NO:03),
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 can be any amino acid residue independently of each other,
wherein n is 3 or 4,
wherein m is 3, 4 or 5, and
wherein p is 3 or 4.
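The ranges given for p, n and m in this embodiment define a small, finite set of glycine-serine cores preceding X1X2X3. The sketch below enumerates them; the function name is hypothetical and the example is only meant to make the combinatorics explicit.

```python
from itertools import product


def seq_id_03_cores():
    """List every GpSGnSGm core for p in {3, 4}, n in {3, 4}, m in {3, 4, 5}."""
    return ["G" * p + "S" + "G" * n + "S" + "G" * m
            for p, n, m in product((3, 4), (3, 4), (3, 4, 5))]


cores = seq_id_03_cores()
print(len(cores))  # 12
print(cores[0])    # GGGSGGGSGGG
print(cores[-1])   # GGGGSGGGGSGGGGG
```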
In one embodiment of all aspects of the invention, the fusion polypeptide further comprises an antibody Fc region polypeptide, or at least one domain of the fusion polypeptide is an antibody Fc region polypeptide.
In one embodiment of all aspects of the invention, the first domain and/or the third domain is an antibody Fc region polypeptide.
In one embodiment of all aspects of the invention, the fusion polypeptide further comprises a VH domain, a VL domain, a scFv, a scFab, a VH-CH1 pair, a VL-CL pair, a VH-CL pair, a VL-CH1 pair, a receptor or extracellular domain thereof, a receptor binding portion of a ligand, an enzyme, a growth factor, an interleukin, a cytokine or a chemokine.
In one embodiment of all aspects of the invention, the fusion polypeptide further comprises at least one domain selected from the group consisting of: a VH domain, a VL domain, a scFv, a scFab, a VH-CH1 pair, a VL-CL pair, a VH-CL pair, a VL-CH1 pair, a receptor or extracellular domain thereof, a receptor binding portion of a ligand, an enzyme, a growth factor, an interleukin, a cytokine, and a chemokine.
In one embodiment of all aspects of the invention, the fusion polypeptide is monomeric.
In one embodiment of all aspects of the invention, the fusion polypeptide is a linear fusion polypeptide.
Another aspect of the present invention is a multimeric molecule comprising at least two polypeptides, at least one of which is a fusion polypeptide as reported herein.
In one embodiment of all aspects of the invention, the at least two polypeptides are conjugated to each other by one or more disulfide bonds.
In one embodiment of all aspects of the invention, the multimeric molecule is an antibody.
In one embodiment of all aspects of the invention, the multimeric molecule is a bispecific antibody.
In one embodiment of all aspects of the invention, the multimeric molecule further comprises at least one antibody light chain.
In one embodiment of all aspects of the invention, the multimeric molecule further comprises at least one antibody heavy chain.
Another aspect of the present invention is a pharmaceutical preparation comprising at least one fusion polypeptide as reported herein or a multimeric molecule as reported herein and optionally a pharmaceutically acceptable carrier.
Yet another aspect of the invention is a nucleic acid molecule encoding a fusion polypeptide as reported herein.
In one embodiment of all aspects of the invention, the nucleic acid molecule is in an expression cassette.
In one embodiment of all aspects of the invention, the nucleic acid molecule is in a vector.
Yet another aspect of the invention is a set of nucleic acid molecules encoding the polypeptides of the multimeric molecules as reported herein.
In one embodiment of all aspects of the invention, each of the nucleic acid molecules is in an expression cassette.
In one embodiment of all aspects of the invention, each of the nucleic acid molecules is in a vector.
In one embodiment of all aspects of the invention, the nucleic acid molecules are on the same vector.
In one embodiment of all aspects of the invention, the nucleic acid molecules are on different vectors.
In one embodiment of all aspects of the invention, the nucleic acid molecules are on two vectors.
Yet another aspect as reported herein is a eukaryotic cell comprising a nucleic acid molecule as reported herein or a set of nucleic acid molecules as reported herein.
Another aspect of the invention is a peptide linker comprising the amino acid sequence
GyX1 (SEQ ID NO:04),
wherein X1 can be any amino acid residue other than serine, threonine, and proline, and
wherein y is an integer of 3 to 25 inclusive.
Another aspect of the invention is a peptide linker comprising the amino acid sequence
GyX1X2X3 (SEQ ID NO:05),
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 may be any amino acid residue independent of each other, and
wherein y is an integer of 3 to 25 inclusive.
In one embodiment, y is an integer from 4 to 20 inclusive.
In one embodiment, y is an integer from 5 to 15 inclusive.
Another aspect of the invention is a peptide linker comprising the amino acid sequence
GnSGmX1 (SEQ ID NO:01),
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein n is 1, 2, 3 or 4, and
wherein m is 3, 4 or 5.
Another aspect of the invention is a peptide linker comprising the amino acid sequence
GnSGmX1X2X3 (SEQ ID NO:02),
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 can be any amino acid residue independently of each other,
wherein n is 1, 2, 3 or 4, and
wherein m is 3, 4 or 5.
Another aspect of the invention is a peptide linker comprising the amino acid sequence
GpSGnSGmX1X2X3 (SEQ ID NO:03),
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 can be any amino acid residue independently of each other,
wherein n is 1, 2, 3 or 4,
wherein m is 3, 4 or 5, and
wherein p is 3 or 4.
Another aspect of the invention is a method for producing a fusion polypeptide comprising the steps of:
- culturing a eukaryotic cell according to the invention under conditions in which the fusion polypeptide is expressed, and
- recovering the fusion polypeptide from the eukaryotic cell or the culture medium,
thereby producing the fusion polypeptide.
Another aspect of the invention is a method of producing a fusion polypeptide having a reduced level of post-translational modification comprising:
- culturing the host cell according to claim 38 under conditions for expression of the fusion polypeptide, and
- recovering the fusion polypeptide from the eukaryotic cell or the culture medium,
thereby producing a fusion polypeptide with a reduced level of post-translational modification.
Yet another aspect of the invention is a method of producing a stable fusion polypeptide, the method comprising genetically engineering a fusion protein to comprise a peptide linker of the invention, and allowing the fusion polypeptide to be expressed by a eukaryotic cell, thereby producing a stable fusion polypeptide.
Yet another aspect of the invention is a composition prepared by the method according to the invention.
A further aspect of the invention is a method of treating a subject who would benefit from treatment with a fusion polypeptide according to the invention or a multimeric molecule according to the invention, the method comprising administering to the subject a composition according to the invention.
A further aspect of the invention is the use of a composition according to the invention for the treatment of a disease or disorder.
A further aspect of the invention is the use of a composition according to the invention for the manufacture of a medicament.
A further aspect of the invention is a fusion polypeptide according to the invention or a multimeric molecule according to the invention for use as a medicament.
A further aspect of the invention is the use of a fusion polypeptide according to the invention or a multimeric molecule according to the invention in the manufacture of a medicament.
A final aspect of the invention is a method of treating an individual in need of treatment, said method comprising administering to the individual an effective amount of a fusion polypeptide according to the invention or a multimeric molecule according to the invention.
Detailed Description
The present invention is based, at least in part, on the following finding: the use of a glycine-serine peptide linker lacking a C-terminal serine residue reduces or even eliminates the addition of post-translational modifications to the peptide linker, particularly when the peptide linker is included in a fusion polypeptide. To achieve this, the polypeptide fused to the C-terminus of the peptide linker should also not contain a serine, threonine or proline residue at its N-terminus.
More specifically, the peptide linkers reported herein reduce the ability of enzymes to attach secondary modifications (such as phosphate groups or carbohydrate moieties) to fusion polypeptides comprising such peptide linkers, e.g. reduce the ability of xylosyltransferases to attach xylose to the polypeptide.
Thus, by using a peptide linker as reported herein in the fusion polypeptide, the homogeneity of recombinantly (in eukaryotic cells) produced fusion polypeptide compositions and preparations may be increased.
General information on the nucleotide sequences of human immunoglobulin light and heavy chains is given in: Kabat, E.A. et al., Sequences of Proteins of Immunological Interest, 5th edition, Public Health Service, National Institutes of Health, Bethesda, MD (1991). As used herein, the amino acid positions of all constant regions and domains of the heavy and light chains are numbered according to the Kabat numbering system described in Kabat et al., Sequences of Proteins of Immunological Interest, 5th edition, Public Health Service, National Institutes of Health, Bethesda, MD (1991), and this is referred to herein as "numbering according to Kabat". Specifically, the Kabat numbering system of Kabat et al., Sequences of Proteins of Immunological Interest, 5th edition, Public Health Service, National Institutes of Health, Bethesda, MD (1991) (see page 647-.
It must be noted that, as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a cell" includes a plurality of such cells and equivalents thereof known to those skilled in the art, and so forth. Also, the terms "a" (or "an"), "one or more" and "at least one" may be used interchangeably herein.
It should also be noted that the terms "comprising," "including," and "having" may be used interchangeably.
Procedures and methods for converting amino acid sequences (e.g., peptide linkers or fusion polypeptides) into corresponding encoding nucleic acid sequences are well known to those skilled in the art. Thus, a nucleic acid is characterized by its nucleic acid sequence consisting of individual nucleotides and, as such, by the amino acid sequence of the peptide linker or fusion polypeptide it encodes.
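As an illustration of such a conversion, the sketch below reverse-translates a peptide linker into one possible encoding nucleic acid sequence using the standard genetic code. The choice of a single codon per residue is an arbitrary assumption made for this example and does not reflect any codon usage reported herein; the names `CODON_FOR` and `reverse_translate` are hypothetical.

```python
# One codon per residue from the standard genetic code (arbitrary choice).
CODON_FOR = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "AGA",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
}
RESIDUE_FOR = {codon: residue for residue, codon in CODON_FOR.items()}


def reverse_translate(peptide):
    """Return one DNA sequence that encodes the given peptide."""
    return "".join(CODON_FOR[residue] for residue in peptide)


def translate(dna):
    """Translate DNA built from the codons above back into a peptide."""
    return "".join(RESIDUE_FOR[dna[i:i + 3]] for i in range(0, len(dna), 3))


dna = reverse_translate("GGGSGGGGD")
print(dna)             # GGCGGCGGCAGCGGCGGCGGCGGCGAC
print(translate(dna))  # GGGSGGGGD
```

Because the genetic code is degenerate, many other nucleic acid sequences encode the same linker; the round trip shown here only confirms that the chosen sequence is one of them.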
The use of recombinant DNA technology enables the generation of derivatives of nucleic acids. Such derivatives may be modified at single or several nucleotide positions, for example, by substitution, alteration, exchange, deletion or insertion. Modification or derivatization can be carried out, for example, by means of site-directed mutagenesis. Such modifications can be readily carried out by those skilled in the art (see, e.g., Sambrook, J. et al, Molecular Cloning: A Laboratory Manual (1999) Cold Spring Harbor Laboratory Press, New York, USA; Hames, B.D. and Higgins, S.G., Nucleic acid hybridization-a practical approach (1985) IRL Press, Oxford, England).
Useful methods and techniques for practicing the invention are described, for example, in Ausubel, F.M. (ed.), Current Protocols in Molecular Biology, Volumes I to III (1997); Glover, N.D., and Hames, B.D. (eds.), DNA Cloning: A Practical Approach, Volumes I and II (1985), Oxford University Press; Freshney, R.I. (ed.), Animal Cell Culture - a practical approach, IRL Press Limited (1986); Watson, J.D. et al., Recombinant DNA, second edition, CHSL Press (1992); Winnacker, E.L., From Genes to Clones, N.Y., VCH Publishers (1987); Celis, J. (ed.), Cell Biology, second edition, Academic Press (1998); Freshney, R.I., Culture of Animal Cells: A Manual of Basic Technique, second edition, Alan R. Liss, Inc., N.Y. (1987).
The term "about" denotes a range of +/-20% of the numerical value that follows it. In one embodiment, the term "about" denotes a range of +/-10% of the numerical value that follows it. In one embodiment, the term "about" denotes a range of +/-5% of the numerical value that follows it.
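The three tolerance levels can be expressed as a simple interval calculation. This is a minimal sketch; the function name `about` and its percentage parameter are hypothetical.

```python
def about(value, tolerance_percent=20):
    """Return the (lower, upper) bounds covered by "about <value>"."""
    delta = abs(value) * tolerance_percent / 100
    return (value - delta, value + delta)


print(about(100))      # (80.0, 120.0)
print(about(100, 10))  # (90.0, 110.0)
print(about(100, 5))   # (95.0, 105.0)
```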
The term "antibody" as used herein is used in the broadest sense and encompasses a variety of antibody structures, including but not limited to monoclonal, polyclonal and multispecific antibodies (e.g., bispecific antibodies), so long as they exhibit the desired antigen-binding activity. For example, a naturally occurring IgG antibody is a heterotetrameric glycoprotein of about 150,000 daltons, consisting of two identical light chains and two identical heavy chains that are disulfide bonded. From N-terminus to C-terminus, each heavy chain has a variable domain (VH), also known as a variable heavy domain or heavy chain variable region, followed by three constant heavy domains (CH1, CH2, and CH3). Similarly, from N-terminus to C-terminus, each light chain has a variable domain (VL), also known as a variable light domain or light chain variable region, followed by a constant light chain (CL) domain.
In certain embodiments, at least one domain of the fusion polypeptide is an antibody fragment. The term "antibody fragment" refers to a molecule other than an intact antibody that comprises a portion of an intact antibody and retains the ability to specifically bind to an antigen. Antibody fragments include, but are not limited to, Fab'-SH, F(ab')2, Fv, single chain Fab (scFab), single chain variable fragment (scFv), and single domain antibodies (dAb). For a review of certain antibody fragments, see Holliger and Hudson, Nature Biotechnology 23 (2005) 1126-1136.
In one embodiment, the antibody fragment is a Fab, Fab'-SH or F(ab')2 fragment, particularly a Fab fragment. Papain digestion of intact antibodies produces two identical antigen-binding fragments, referred to as "Fab" fragments, each comprising the heavy and light chain variable domains (VH and VL, respectively) as well as the constant domain of the light chain (CL) and the first constant domain of the heavy chain (CH1). Thus, the term "Fab fragment" refers to an antibody fragment that includes a light chain comprising a VL domain and a CL domain, and a heavy chain fragment comprising a VH domain and a CH1 domain. Fab' fragments differ from Fab fragments by the addition of residues at the carboxy terminus of the CH1 domain, including one or more cysteines from the antibody hinge region. Fab'-SH is a Fab' fragment in which the cysteine residue(s) of the constant domains bear a free thiol group. Pepsin treatment produces an F(ab')2 fragment that has two antigen binding sites (two Fab fragments) and a part of the Fc region. For a discussion of Fab and F(ab')2 fragments that comprise salvage receptor binding epitope residues and have an increased in vivo half-life, see U.S. Pat. No. 5,869,046.
In another embodiment, the antibody fragment is a diabody, a triabody, or a tetrabody. Diabodies are antibody fragments with two antigen binding sites, which may be bivalent or bispecific. See, for example, EP 404,097; WO 1993/01161; Hudson et al., Nat. Med. 9: 129-; and Holliger et al., Proc. Natl. Acad. Sci. USA 90: 6444-. Triabodies and tetrabodies are also described in Hudson et al., Nat. Med. 9: 129-134 (2003).
In yet another embodiment, the antibody fragment is a single chain Fab fragment. A "single chain Fab fragment" or "scFab" is a polypeptide consisting of an antibody heavy chain variable domain (VH), an antibody heavy chain constant domain 1 (CH1), an antibody light chain variable domain (VL), an antibody light chain constant domain (CL) and a peptide linker, wherein the antibody domains and the linker have one of the following orders in the N-terminal to C-terminal direction: a) VH-CH1-linker-VL-CL, b) VL-CL-linker-VH-CH1, c) VH-CL-linker-VL-CH1, or d) VL-CH1-linker-VH-CL. In particular, the peptide linker is a polypeptide of at least 30 amino acids, preferably between 32 and 50 amino acids. The single chain Fab fragment is stabilized via the native disulfide bond between the CL domain and the CH1 domain. In addition, these single chain Fab fragments may be further stabilized by creating interchain disulfide bonds via insertion of cysteine residues (e.g., at position 44 in the variable heavy chain and position 100 in the variable light chain according to Kabat numbering).
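The four permitted scFab arrangements and the minimum linker length can be captured in a small lookup, sketched below. The names `SCFAB_ORDERS` and `is_valid_scfab` are hypothetical, and the check covers only the domain order and the "at least 30 amino acids" condition stated above.

```python
# Domain orders a) to d), N-terminal to C-terminal.
SCFAB_ORDERS = {
    "a": ("VH", "CH1", "linker", "VL", "CL"),
    "b": ("VL", "CL", "linker", "VH", "CH1"),
    "c": ("VH", "CL", "linker", "VL", "CH1"),
    "d": ("VL", "CH1", "linker", "VH", "CL"),
}


def is_valid_scfab(domains, linker_length):
    """Check the domain order and the minimum linker length of 30 residues."""
    return tuple(domains) in SCFAB_ORDERS.values() and linker_length >= 30


print(is_valid_scfab(["VH", "CH1", "linker", "VL", "CL"], 32))  # True
print(is_valid_scfab(["VH", "CH1", "linker", "VL", "CL"], 20))  # False
print(is_valid_scfab(["VH", "VL", "linker", "CH1", "CL"], 32))  # False
```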
In one embodiment, the antibody fragment is a Fab fragment with a domain crossover. As used herein, the term "domain crossover" means that in a pair of an antibody heavy chain VH-CH1 fragment and its corresponding cognate antibody light chain, i.e. in an antibody binding arm (i.e. in the Fab fragment), the domain sequence deviates from the native sequence in that at least one heavy chain domain is replaced by its corresponding light chain domain, and vice versa. There are three general types of domain crossover: (i) the crossover of the CH1 and CL domains, which results in a domain crossover light chain with a VL-CH1 domain sequence and a domain crossover heavy chain fragment with a VH-CL domain sequence (or a full-length antibody heavy chain with a VH-CL-hinge-CH2-CH3 domain sequence); (ii) the domain crossover of the VH and VL domains, which results in a domain crossover light chain with a VH-CL domain sequence and a domain crossover heavy chain fragment with a VL-CH1 domain sequence; and (iii) the domain crossover of the complete light chain (VL-CL) and the complete VH-CH1 heavy chain fragment ("Fab crossover"), which results in a domain crossover light chain with a VH-CH1 domain sequence and a domain crossover heavy chain fragment with a VL-CL domain sequence (all aforementioned domain sequences are indicated in the N-terminal to C-terminal direction).
As used herein, the term "substituted for one another" with respect to corresponding heavy and light chain domains refers to the aforementioned domain crossovers. Thus, when the CH1 and CL domains are "substituted for one another", this refers to the domain crossover mentioned under item (i) and to the resulting heavy and light chain domain sequences. Accordingly, when VH and VL are "substituted for one another", this refers to the domain crossover mentioned under item (ii); and when the CH1 and CL domains are "substituted for one another" and the VH and VL domains are "substituted for one another", this refers to the domain crossover mentioned under item (iii). Bispecific antibodies comprising domain crossovers are reported, for example, in WO 2009/080251, WO 2009/080252, WO 2009/080253, WO 2009/080254 and Schaefer, W. et al., Proc. Natl. Acad. Sci. USA 108 (2011) 11187-.
The fusion polypeptide according to the present invention may also include a Fab fragment comprising a domain crossover of the CH1 and CL domains as described in item (i) above, or a domain crossover of the VH and VL domains as described in item (ii) above. Fab fragments that specifically bind to the same antigen are constructed with the same domain sequence. Thus, where more than one Fab fragment with a domain crossover is included in a multispecific antibody, these Fab fragments specifically bind to the same antigen.
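The three types (i) to (iii) described above differ only in whether the variable domains, the constant domains, or both are exchanged between the light chain and the heavy chain fragment. The following sketch derives the resulting domain sequences from two flags; the function name and its parameters are hypothetical and serve only to make the combinations explicit.

```python
def crossed_fab(swap_variable, swap_constant):
    """Return the N- to C-terminal domain sequence of each Fab chain."""
    light_v, heavy_v = ("VH", "VL") if swap_variable else ("VL", "VH")
    light_c, heavy_c = ("CH1", "CL") if swap_constant else ("CL", "CH1")
    return {"light": f"{light_v}-{light_c}", "heavy": f"{heavy_v}-{heavy_c}"}


print(crossed_fab(False, False))  # native: light VL-CL, heavy VH-CH1
print(crossed_fab(False, True))   # (i):    light VL-CH1, heavy VH-CL
print(crossed_fab(True, False))   # (ii):   light VH-CL, heavy VL-CH1
print(crossed_fab(True, True))    # (iii):  light VH-CH1, heavy VL-CL
```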
An "isolated" fusion polypeptide is one that has been separated from components of its natural environment. In some embodiments, the fusion polypeptide is purified to greater than 95% or 99% purity as determined by, for example, electrophoresis (e.g., SDS-PAGE, isoelectric focusing (IEF), capillary electrophoresis), or chromatography (e.g., ion exchange or reverse phase HPLC). For a review of methods for assessing antibody purity, see, e.g., Flatman, S. et al., J. Chromatogr. B 848 (2007) 79-87.
An "isolated" nucleic acid is a nucleic acid molecule that has been separated from components of its natural environment. An isolated nucleic acid includes a nucleic acid molecule that is contained in a cell that normally contains the nucleic acid molecule, but which is present extrachromosomally or at a chromosomal location that is different from its natural chromosomal location.
The term "monoclonal antibody" as used herein refers to an antibody obtained from a substantially homogeneous population of antibodies, i.e., each antibody comprising the population is identical and/or binds the same epitope, except for possible variant antibodies (e.g., comprising naturally occurring mutations or produced during the production of a monoclonal antibody preparation, such variants typically being present in minute amounts). In contrast to polyclonal antibody preparations, which typically include different antibodies directed against different determinants (epitopes), each monoclonal antibody in a monoclonal antibody preparation is directed against a single determinant on the antigen. Thus, the modifier "monoclonal" indicates that the characteristics of the antibody are obtained from a substantially homogeneous population of antibodies, and is not to be construed as requiring production of the antibody by any particular method. For example, monoclonal antibodies for use in accordance with the present invention can be prepared by a variety of techniques, including but not limited to hybridoma methods, recombinant DNA methods, phage display methods, and methods that utilize transgenic animals comprising all or part of a human immunoglobulin locus.
The "class" of an antibody refers to the type of constant domain or constant region possessed by its heavy chain. There are five major classes of antibodies: IgA, IgD, IgE, IgG, and IgM, and several of these may be further divided into subclasses (isotypes), e.g., IgG1, IgG2, IgG3, IgG4, IgA1, and IgA2. The heavy chain constant domains that correspond to the different classes of immunoglobulins are called α, δ, ε, γ, and μ, respectively.
The term "N-linked oligosaccharide" refers to an oligosaccharide that is linked to the peptide backbone at an asparagine amino acid residue by way of an asparagine-N-acetylglucosamine linkage. N-linked oligosaccharides are also known as "N-glycans". All N-linked oligosaccharides have a common pentasaccharide core, Man3GlcNAc2. They differ in the presence of peripheral sugars (such as N-acetylglucosamine, galactose, N-acetylgalactosamine, fucose and sialic acid) and in the number of their branches (also called antennae). Optionally, the structure may further comprise a core fucose molecule and/or a xylose molecule. The N-linked oligosaccharide is attached to the nitrogen of the asparagine or arginine side chain. The N-glycosylation motif, i.e., the N-glycosylation site, includes the Asn-X-Ser/Thr consensus sequence, where X is any amino acid except proline. Thus, the amino acid residue in the N-glycosylation site can be any amino acid residue in the Asn-X-Ser/Thr consensus sequence. In one embodiment, the amino acid residue in an N-glycosylation site is Asn, Ser, or Thr.
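The Asn-X-Ser/Thr consensus sequence described above can be located in a one-letter amino acid sequence with a simple pattern search. The following is a minimal sketch only; the function name and the example sequences are hypothetical and are not part of the invention. A matching motif indicates a potential site and does not show that the site is actually glycosylated.

```python
import re

# Asn-X-Ser/Thr, where X is any residue except proline.
# The lookahead lets overlapping motifs (e.g. "NNSS") all be reported.
N_GLYC_MOTIF = re.compile(r"N(?=[^P][ST])")


def find_n_glycosylation_sites(sequence):
    """Return the 1-based positions of Asn residues that start an
    Asn-X-Ser/Thr motif in a one-letter amino acid sequence."""
    return [m.start() + 1 for m in N_GLYC_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical test sequences, chosen only to exercise the pattern.
    print(find_n_glycosylation_sites("GGNGSGG"))  # motif N-G-S present
    print(find_n_glycosylation_sites("GGNPSGG"))  # proline at X: no motif
```

Such a scan may be applied, for example, across the junction between a peptide linker and the following domain to check that the fusion does not create a new motif.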
The term "O-linked oligosaccharide" means an oligosaccharide linked to a peptide backbone at a threonine or serine amino acid residue. In one embodiment, the amino acid residue in the O-glycosylation site is Ser or Thr.
The term "variable region" or "variable domain" refers to the domain of an antibody heavy or light chain that is involved in binding of the antibody to an antigen. The variable domains of the heavy and light chains of natural antibodies (VH and VL, respectively) generally have similar structures, with each domain comprising four conserved framework regions (FRs) and three hypervariable regions (HVRs) (see, e.g., Kindt, T.J. et al., Kuby Immunology, 6th edition, W.H. Freeman and Co., N.Y. (2007), page 91). A single VH or VL domain may be sufficient to confer antigen binding specificity. Furthermore, antibodies that bind a particular antigen can be isolated by using a VH or VL domain from an antibody that binds the antigen to screen a library of complementary VL or VH domains, respectively. See, e.g., Portolano, S. et al., J. Immunol. 150 (1993) 880-887; Clackson, T. et al., Nature 352 (1991) 624-.
As used herein, the term "post-translational modification" generally refers to a modification of a polypeptide that takes place after its translation. Non-limiting examples of post-translational modifications are the attachment of functional groups (such as acetate, phosphate, lipid, or carbohydrate groups) to polypeptides in cells (i.e., in vivo).
In more detail, the term "post-translational modification" denotes covalent modification of amino acid residues within a polypeptide after biosynthesis. Post-translational modifications can be made on the amino acid side chain by modifying existing functional groups or introducing new functional groups. Known post-translational modifications of amino acids of different proteins are, for example,
Ala: N-acetylation
Arg: deimination, methylation
Asn: deamidation, N-linked glycosylation
Asp: isomerization
Cys: disulfide bond formation, oxidation, N-acetylation
Gln: cyclization
Glu: cyclization, gamma-carboxylation
Gly: N-myristoylation, N-acetylation
His: phosphorylation
Lys: acetylation, ubiquitination, methylation, hydroxylation
Met: N-acetylation, oxidation
Pro: hydroxylation and subsequent further modification
Ser: phosphorylation, O-linked glycosylation
Thr: phosphorylation, O-linked glycosylation, N-acetylation
Trp: oxidation
Tyr: phosphorylation
Val: N-acetylation.
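For illustration, the list above can be written as a lookup table keyed by one-letter amino acid code and used to enumerate the modifications that are known for the residue types present in a given sequence. This is a sketch only; the function name is hypothetical, and the lookup reports modifications known for a residue type in general, not a prediction that a particular position is modified.

```python
# Known post-translational modifications per residue type,
# transcribed from the list above (one-letter code as key).
KNOWN_PTMS = {
    "A": ["N-acetylation"],
    "R": ["deimination", "methylation"],
    "N": ["deamidation", "N-linked glycosylation"],
    "D": ["isomerization"],
    "C": ["disulfide bond formation", "oxidation", "N-acetylation"],
    "Q": ["cyclization"],
    "E": ["cyclization", "gamma-carboxylation"],
    "G": ["N-myristoylation", "N-acetylation"],
    "H": ["phosphorylation"],
    "K": ["acetylation", "ubiquitination", "methylation", "hydroxylation"],
    "M": ["N-acetylation", "oxidation"],
    "P": ["hydroxylation and subsequent further modification"],
    "S": ["phosphorylation", "O-linked glycosylation"],
    "T": ["phosphorylation", "O-linked glycosylation", "N-acetylation"],
    "W": ["oxidation"],
    "Y": ["phosphorylation"],
    "V": ["N-acetylation"],
}


def candidate_ptms(sequence):
    """Map each residue type occurring in `sequence` to the
    modifications listed for that residue type."""
    found = {}
    for residue in sequence.upper():
        if residue in KNOWN_PTMS and residue not in found:
            found[residue] = KNOWN_PTMS[residue]
    return found


if __name__ == "__main__":
    # A conventional Gly/Ser repeat unit contains two listed residue types.
    for residue, ptms in candidate_ptms("GGGGS").items():
        print(residue, "->", ", ".join(ptms))
```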
The term "glycosylation" as used herein denotes the covalent attachment of one or more carbohydrate moieties to amino acid residues within a polypeptide. Generally, glycosylation is a post-translational event that can occur in the intracellular environment of a cell or in a cell extract. The term glycosylation includes, for example, the addition of one or more carbohydrate moieties at a common site of glycosylation. One example of glycosylation involves the addition of one or more xylose residues to the polypeptide. For example, a consensus sequence for xylose addition includes the sequence [ D/E GSG D/E ]. The term "O-glycosylation" means that one or more carbohydrate moieties are covalently linked to an oxygen atom in the side chain of an amino acid residue in a polypeptide, such as the oxygen of a serine or threonine.
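The xylose consensus sequence given above can be searched for in the same way as other sequence motifs. The sketch below assumes that the bracketed notation is to be read as Asp or Glu, followed by Gly-Ser-Gly, followed by Asp or Glu; this reading, the function name and the example sequences are assumptions made for illustration.

```python
import re

# Assumed reading of the consensus: (Asp or Glu)-Gly-Ser-Gly-(Asp or Glu).
# The capturing group sits inside a lookahead so overlapping hits are found.
XYLOSE_MOTIF = re.compile(r"(?=([DE]GSG[DE]))")


def find_xylose_motifs(sequence):
    """Return (1-based start position, matched motif) for every
    occurrence of the assumed xylose consensus sequence."""
    return [(m.start() + 1, m.group(1))
            for m in XYLOSE_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequences chosen only to exercise the pattern.
    print(find_xylose_motifs("AAEGSGDAA"))
    print(find_xylose_motifs("GGGGSGGGGS"))
```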
The term "phosphorylation" denotes the covalent attachment of a phosphate group (PO4) to an amino acid residue within a polypeptide. In general, phosphorylation is a post-translational event, which may occur in the intracellular environment of a cell or in a cell extract. The term phosphorylation includes, for example, the addition of a phosphate group to the free hydroxyl group of serine.
The homogeneity of these polypeptides is important if the polypeptides are to be used in therapy. Surprisingly, as demonstrated herein, it has been found that changes in the amino acid sequence of a peptide linker in a fusion polypeptide reduce or eliminate post-translational modifications at the C-terminus of the peptide linker. Reducing or even eliminating post-translationally added modifications improves polypeptide homogeneity.
As used herein, the term "polypeptide" means a polymer comprising ten or more (up to 650) naturally occurring amino acid residues conjugated to each other by peptide bonds.
The term "amino acid" as used herein includes alanine (Ala (three letter code) or A (one letter code)), arginine (Arg or R), asparagine (Asn or N), aspartic acid (Asp or D), cysteine (Cys or C), glutamine (Gln or Q), glutamic acid (Glu or E), glycine (Gly or G), histidine (His or H), isoleucine (Ile or I), leucine (Leu or L), lysine (Lys or K), methionine (Met or M), phenylalanine (Phe or F), proline (Pro or P), serine (Ser or S), threonine (Thr or T), tryptophan (Trp or W), tyrosine (Tyr or Y), and valine (Val or V).
As used herein, the term "peptide linker" means a synthetic amino acid sequence that connects two polypeptide sequences, e.g., fuses two polypeptide domains together. The peptide linker itself then represents a further domain located between them. Accordingly, a fusion polypeptide comprising a peptide linker according to the invention comprises at least three domains: a first (first polypeptide) domain, a second (peptide linker) domain, and a third (second polypeptide) domain. The term "synthetic" as used herein denotes non-naturally occurring amino acid sequences. Thus, in one embodiment, the peptide linker according to the invention is a synthetic peptide linker.
The peptide linker of the invention links two amino acid sequences via a peptide bond. In one embodiment of an aspect of the invention, the peptide linker of the invention links the first biologically active polypeptide (first domain) to the second polypeptide (third domain) in a linear sequence. In another embodiment of an aspect of the invention, a peptide linker connects two biologically active polypeptides.
In the context of fusion polypeptides, a "linear sequence" or "sequence" refers to the order of amino acids in a fusion polypeptide in the amino-terminal to carboxy-terminal direction (N-terminal to C-terminal direction), wherein residues adjacent to each other in the sequence are contiguous in the primary structure of the polypeptide.
As used herein, the terms "conjugated," "linked," "fused," or "fusion" may be used interchangeably. These terms refer to the joining together of two or more polypeptides or domains by recombinant means. As used herein, the term "genetically fused," "genetically linked," or "genetic fusion" means that two or more polypeptides are co-linearly and covalently linked or attached via their respective peptide backbones by recombinant expression, in a eukaryotic cell, of a single nucleic acid molecule encoding the fusion polypeptide. Such a genetic fusion results in the expression of a single contiguous genetic sequence. Preferably, the genetic fusions are in frame, i.e., two or more open reading frames (ORFs) are fused to form a continuous longer ORF in a manner that maintains the correct reading frame of the original ORFs. Thus, the resulting recombinant fusion polypeptide is a single polypeptide comprising two or more polypeptide domains corresponding to the polypeptides encoded by the original ORFs (which segments are not naturally conjugated to each other).
The peptide linker of the present invention differs from the traditional Gly/Ser (GS) peptide linkers of the art in that the presently claimed peptide linker at least lacks the C-terminal serine amino acid residue and in that the amino acid residue immediately following the linker is not a serine, threonine or proline amino acid residue.
As used herein, the term "Gly/Ser linker" or "GS-peptide linker" refers to a peptide consisting of glycine and serine residues. Exemplary GS-peptide linkers include the amino acid sequence (Gly4Ser)n. In one embodiment of all aspects of the invention, the peptide linker comprises or consists of a GS-peptide linker that has one or more amino acid substitutions, deletions and/or additions and that lacks the C-terminal serine residue.
In one embodiment of an aspect of the invention, the fusion polypeptide of the invention is a "chimeric" polypeptide. Such a chimeric fusion polypeptide comprises a first amino acid sequence (first domain) that is linked, by means of a peptide linker (second domain) according to the invention, to a second amino acid sequence (third domain) to which it is not linked in nature. The amino acid sequences of the first and third domains, which may be present in separate proteins or may be present in the same protein but separated from each other, are joined together in a new arrangement in the fusion polypeptide. For example, a chimeric fusion polypeptide can be produced by creating and translating a polynucleotide in which the domains are encoded in the desired relationship. Exemplary chimeric fusion polypeptides include fusion polypeptides comprising a peptide linker of the invention.
Fusion polypeptides comprising a peptide linker of the invention may be monomeric or multimeric. For example, in one embodiment, the fusion polypeptide of the invention is a dimer or tetramer.
In one embodiment of an aspect of the invention, the dimer of the fusion polypeptide of the invention is a homodimer comprising two identical monomeric fusion polypeptides of the invention.
In another embodiment of an aspect of the invention, the dimer of the fusion polypeptide of the invention is a heterodimer comprising two different monomeric subunits, wherein at least one monomeric subunit is a fusion polypeptide according to the invention.
In another embodiment of an aspect of the invention, the tetramer of the fusion polypeptide of the invention is a heterotetramer comprising at least three different monomer subunits, wherein at least one monomer subunit is a fusion polypeptide according to the invention.
In certain embodiments, in addition to including one or more peptide linkers according to the present invention, the fusion polypeptide may include one or more traditional GS-peptide linkers at other locations within the fusion polypeptide.
Fusion polypeptides according to aspects of the invention include at least one biologically active moiety. A biologically active moiety refers to a moiety that, in a biological context, has the ability to do one or more of the following: localize or target a molecule to a desired site or cell, perform a function, perform an action, or react. For example, the term "biologically active moiety" refers to a biologically active molecule or portion thereof that binds to a component of a biological system (e.g., a protein in serum or on the surface of a cell or in the matrix of a cell) and that produces a biological effect (e.g., as measured by a change in the active moiety and/or in the component to which it binds, such as lysis, transmission of a signal, or enhancement or inhibition of a biological response in a cell or in a subject).
Exemplary biologically active moieties include, for example, antigen-binding fragments (e.g., F(ab), scFv, VH domain, or VL domain) of antibody molecules or portions thereof (e.g., for use as targeting moieties or to confer, induce, or block a biological response), ligand-binding portions of receptors or receptor-binding portions of ligands, as well as Fc region polypeptides, intact Fc regions, scFc domains, enzymes, and the like. Furthermore, as used herein, the term "biologically active moiety" includes, for example, a moiety that may not be active when present alone in monomeric form, but is biologically active when paired with a second moiety in the context of a dimeric molecule.
In one embodiment of an aspect of the invention, the fusion polypeptide of the invention comprises a binding site in at least one of its fusion domains, or the binding site is obtained by fusion of domains in the fusion polypeptide. The term "binding domain" or "binding site" as used herein denotes a portion, region or site of a polypeptide that mediates specific interaction with a target molecule (e.g., an antigen, ligand, receptor, substrate or inhibitor). Exemplary binding domains include an antigen binding site (e.g., a VH or VL domain or a pair thereof) or a molecule (e.g., an antibody) comprising such a binding site, a receptor binding domain of a ligand, a ligand binding domain of a receptor, or a catalytic domain.
In one embodiment of an aspect of the invention, the fusion polypeptide comprises at least one binding domain that specifically binds to a target. In one embodiment of an aspect of the invention, the binding domain comprises or consists of an antigen binding site (e.g., comprising a variable heavy chain domain and a variable light chain domain, or at least six CDRs from an antibody, or at least a functional portion thereof comprising only an isolated VH or VL or three CDRs).
In one embodiment of an aspect of the invention, the fusion polypeptide is a modified antibody chain or a modified antibody. As used herein, the terms "modified antibody chain" and "modified antibody" include synthetic forms of antibody chains or antibodies that have been altered, e.g., by altering the native domain structure, sequence, or number, so that they do not occur naturally. For example, heavy chain molecules conjugated to scFv molecules are included, and the like. In addition, the term "modified antibody" includes multivalent forms of antibodies (e.g., trivalent antibodies, tetravalent antibodies, etc., that bind to three or more copies of the same antigen).
In one embodiment of all aspects of the invention, the fusion polypeptide comprises an Fc region, or a domain thereof or an Fc region polypeptide.
The term "antibody-dependent cellular cytotoxicity (ADCC)" denotes a function mediated by Fc receptor binding and refers to the lysis of target cells by an antibody as reported herein in the presence of effector cells. In one embodiment, ADCC is measured by treating a preparation of CD19-expressing erythroid cells (e.g., K562 cells expressing recombinant human CD19) with an antibody comprising a fusion polypeptide as reported herein in the presence of effector cells, such as freshly isolated PBMCs (peripheral blood mononuclear cells) or effector cells purified from buffy coats, such as monocytes or NK (natural killer) cells. The target cells are labeled with 51Cr and subsequently incubated with the antibody. The labeled cells are incubated with effector cells and the supernatant is analyzed for released 51Cr. Controls include incubating the target endothelial cells with effector cells, but without the antibody comprising the fusion polypeptide. The capacity of the antibody to induce the initial steps mediating ADCC is investigated by measuring its binding to cells expressing Fcγ receptors, such as cells recombinantly expressing FcγRI and/or FcγRIIA, or NK cells (expressing essentially FcγRIIIA). In one embodiment, binding to FcγR on NK cells is measured.
The term "effector function" refers to those biological activities attributable to the Fc region of an antibody, which vary with the antibody class. Examples of antibody effector functions include: C1q binding and complement dependent cytotoxicity (CDC); Fc receptor binding; antibody-dependent cell-mediated cytotoxicity (ADCC); phagocytosis; down-regulation of cell surface receptors (e.g., B cell receptors); and B cell activation.
Fc receptor binding-dependent effector functions can be mediated by the interaction of the Fc region of an antibody with Fc receptors (FcRs), which are specialized cell surface receptors on hematopoietic cells. Fc receptors belong to the immunoglobulin superfamily and have been shown to mediate both the removal of antibody-coated pathogens by phagocytosis of immune complexes and the lysis of red blood cells and various other cellular targets (e.g., tumor cells) coated with the corresponding antibody via antibody-dependent cell-mediated cytotoxicity (ADCC) (see, e.g., Van de Winkel, J.G., and Anderson, C.L., J. Leukoc. Biol. 49 (1991) 511-). FcRs are defined by their specificity for immunoglobulin isotypes: the Fc receptors for IgG antibodies are called FcγR. Fc receptor binding is described, e.g., in Ravetch, J.V. and Kinet, J.P., Annu. Rev. Immunol. 9 (1991) 457-492; Capel, P.J. et al., Immunomethods 4 (1994) 25-34; de Haas, M. et al., J. Lab. Clin. Med. 126 (1995) 330-; and Gessner, J.E. et al., Ann. Hematol. 76 (1998) 231-.
Cross-linking of receptors for the Fc region of IgG antibodies (FcγR) triggers a wide variety of effector functions, including phagocytosis, antibody-dependent cellular cytotoxicity, and release of inflammatory mediators, as well as immune complex clearance and regulation of antibody production. In humans, three classes of FcγR have been characterized, which are:
FcγRI (CD64) binds monomeric IgG with high affinity and is expressed on macrophages, monocytes, neutrophils and eosinophils. Modification in the IgG Fc region of at least one of the amino acid residues E233-G236, P238, D265, N297, A327 and P329 (numbering according to the EU index of Kabat) reduces binding to FcγRI. Substituting the IgG2 residues at positions 233-236 into IgG1 and IgG4 reduced binding to FcγRI by 10^3-fold and abrogated the response of human monocytes to antibody-sensitized erythrocytes (Armour, K.L. et al., Eur. J. Immunol. 29 (1999) 2613-2624),
FcγRII (CD32) binds complexed IgG with medium to low affinity and is widely expressed. This receptor can be divided into two subtypes, FcγRIIA and FcγRIIB. FcγRIIA is found on many cells involved in killing (e.g., macrophages, monocytes, neutrophils) and appears to be able to activate the killing process. FcγRIIB appears to play a role in inhibitory processes and is found on B cells, macrophages, mast cells and eosinophils. On B cells, FcγRIIB appears to suppress further immunoglobulin production and isotype switching to, for example, the IgE class. On macrophages, FcγRIIB inhibits phagocytosis mediated by FcγRIIA. On eosinophils and mast cells, the B form may help to suppress the activation of these cells that results from IgE binding to its separate receptor. Reduced binding to FcγRIIA is found, for example, for antibodies comprising an IgG Fc region with a mutation at one or more of amino acid residues E233-G236, P238, D265, N297, A327, P329, D270, Q295, A327, R292 and K414 (numbering according to the EU index of Kabat),
FcγRIII (CD16) binds IgG with medium to low affinity and exists as two types. FcγRIIIA is found on NK cells, macrophages, eosinophils, and some monocytes and T cells, and mediates ADCC. FcγRIIIB is highly expressed on neutrophils. Reduced binding to FcγRIIIA is found, for example, for antibodies comprising an IgG Fc region with a mutation at one or more of amino acid residues E233-G236, P238, D265, N297, A327, P329, D270, Q295, A327, S239, E269, E293, Y296, V303, A327, K338 and D376 (numbering according to the EU index of Kabat).
The mapping of the binding sites for Fc receptors on human IgG1, the above-mentioned mutation sites, and methods for measuring binding to FcγRI and FcγRIIA are described in Shields, R.L. et al., J. Biol. Chem. 276 (2001) 6591-6604.
A "therapeutically effective amount" of a pharmaceutical agent (e.g., a pharmaceutical formulation) is an amount effective to achieve the desired therapeutic or prophylactic result at the necessary dosage and time period.
As used herein, the term "Fc-receptor" refers to an activating receptor characterized by the presence of a cytoplasmic ITAM sequence associated with the receptor (see, e.g., Ravetch, J.V. and Bolland, S., Annu. Rev. Immunol. 19 (2001) 275-). Such receptors are FcγRI, FcγRIIA and FcγRIIIA. The term "does not bind FcγR" means that, at an antibody concentration of 10 μg/ml, the binding of an antibody as reported herein to NK cells is 10% or less of the binding found for the anti-OX40L antibody LC.001 as reported in WO 2006/029879.
The term "Fc region" is used herein to define the C-terminal region of an immunoglobulin heavy chain, which comprises at least a portion of a constant region. The term includes native sequence Fc regions and variant Fc regions. In one embodiment, the human IgG heavy chain Fc region extends from Cys226 or from Pro230 to the carboxy-terminus of the heavy chain. However, the C-terminal lysine (Lys447) of the Fc region may or may not be present. The Fc region is a dimer of two Fc region polypeptides.
The Fc region of an antibody is directly involved in complement activation, C1q binding, C3 activation, and Fc receptor binding. Although the effect of an antibody on the complement system depends on certain conditions, binding to C1q is caused by defined binding sites in the Fc region. Such binding sites are known in the art and are described, for example, by Lukas, T.J. et al., J. Immunol. 127 (1981) 2555-2560; Brunhouse, R. and Cebra, J.J., Mol. Immunol. 16 (1979) 907-; Burton, D.R. et al., Nature 288 (1980) 338-344; Thommesen, J.E. et al., Mol. Immunol. 37 (2000) 995-1004; Idusogie, E.E. et al., J. Immunol. 164 (2000) 4178-; Hezareh, M. et al., J. Virol. 75 (2001) 12161-; Morgan, A. et al., Immunology 86 (1995) 319-324; and EP 0307434. Such binding sites are characterized, for example, by the amino acid residues L234, L235, D270, N297, E318, K320, K322, P331 and P329 (numbering according to the EU index of Kabat). Antibodies of subclasses IgG1, IgG2, and IgG3 generally show complement activation, C1q binding, and C3 activation, whereas IgG4 does not activate the complement system, does not bind C1q, and does not activate C3.
The "Fc region of an antibody" is a term well known to the skilled artisan and is defined on the basis of the papain cleavage of antibodies. In one embodiment, the Fc region is a human Fc region. In one embodiment, the Fc region is of the human IgG4 subclass and comprises the mutations S228P and/or L235E (numbering according to the EU index of Kabat). In one embodiment, the Fc region is of the human IgG1 subclass and comprises the mutations L234A, L235A, and optionally P329G (numbering according to the EU index of Kabat).
As used herein, the term "Fc region polypeptide" means the portion of a single immunoglobulin heavy chain that begins in the hinge region just upstream of the papain cleavage site (i.e., residue 216 in IgG, taking the first residue of the heavy chain constant region to be 114) and ends at the C-terminus of the antibody. Thus, a complete Fc region polypeptide includes at least a hinge domain, a CH2 domain, and a CH3 domain. As used herein, the term "Fc region" refers to dimerized Fc region polypeptides (whether prepared in the traditional two-polypeptide-chain form or as a single chain Fc region) that resemble the Fc region of a native antibody.
As used herein, the term "Fc region polypeptide portion" includes an amino acid sequence of, or derived from, an Fc region polypeptide. In certain embodiments, the Fc region polypeptide portion includes at least one of: a hinge (e.g., upper, middle, and/or lower hinge region) domain, a CH2 domain, a CH3 domain, a CH4 domain, or a variant, portion, or fragment thereof. In other embodiments, the Fc region polypeptide portion includes a complete Fc region polypeptide (i.e., the hinge domain, CH2 domain, and CH3 domain). In one embodiment, the Fc region polypeptide portion includes a hinge domain (or portion thereof) fused to a CH3 domain (or portion thereof). In another embodiment, the Fc region polypeptide portion comprises a CH2 domain (or portion thereof) fused to a CH3 domain (or portion thereof). In another embodiment, the Fc region polypeptide portion consists of a CH3 domain or portion thereof. In another embodiment, the Fc region polypeptide portion consists of a hinge domain (or portion thereof) and a CH3 domain (or portion thereof). In another embodiment, the Fc region polypeptide portion consists of a CH2 domain (or portion thereof) and a CH3 domain. In another embodiment, the Fc region polypeptide portion consists of a hinge domain (or portion thereof) and a CH2 domain (or portion thereof). In one embodiment, the Fc region polypeptide portion lacks at least a portion of a CH2 domain (e.g., all or part of a CH2 domain).
In one embodiment, the Fc region polypeptide comprises at least the portion of an Fc region known in the art to be required for FcRn binding, referred to herein as a neonatal Fc receptor (FcRn) binding partner. An FcRn binding partner is a molecule, or a portion thereof, that can be specifically bound by the FcRn receptor with consequent active transport of the FcRn binding partner by the FcRn receptor. Specific binding refers to two molecules forming a complex that is relatively stable under physiological conditions. Specific binding is characterized by high affinity and low to intermediate capacity, as opposed to non-specific binding, which typically has low affinity and intermediate to high capacity.
Generally, binding is considered specific when the affinity constant KA is higher than 10^6 M^-1, or more preferably higher than 10^8 M^-1.
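As a worked illustration of these thresholds, the affinity (association) constant KA can be compared with the stated limits and converted into the corresponding dissociation constant KD = 1/KA. The sketch below uses hypothetical function names and example values; it is not an assay protocol.

```python
# Thresholds from the text: KA > 1e6 per molar (specific),
# KA > 1e8 per molar (more preferred).
SPECIFIC_KA = 1e6
PREFERRED_KA = 1e8


def dissociation_constant(ka_per_molar):
    """Convert an association constant KA (1/M) into KD (M)."""
    return 1.0 / ka_per_molar


def is_specific(ka_per_molar, threshold=SPECIFIC_KA):
    """True if KA is strictly higher than the chosen threshold."""
    return ka_per_molar > threshold


if __name__ == "__main__":
    ka = 5e7  # hypothetical measured value, in 1/M
    print(is_specific(ka))                # compared with 1e6 per molar
    print(is_specific(ka, PREFERRED_KA))  # compared with 1e8 per molar
    print(dissociation_constant(ka))      # KD in mol/L
```

Expressed as dissociation constants, the two thresholds correspond to KD values below 1 micromolar and below 10 nanomolar, respectively.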
The FcRn receptor has been isolated from several mammalian species, including humans. The sequences of human, monkey, rat and mouse FcRn are known (Story et al., J. Exp. Med. 180 (1994) 2377). The FcRn receptor binds IgG (but not other immunoglobulin classes such as IgA, IgM, IgD and IgE) at relatively low pH, actively transports the IgG transcellularly in a luminal-to-serosal direction, and then releases the IgG at the relatively high pH found in the interstitial fluid.
FcRn binding partners encompass molecules which can be specifically bound by the FcRn receptor, including intact IgG, the Fc region of IgG and other fragments which include the intact binding region of the FcRn receptor. The portion of the Fc region of IgG that binds to the FcRn receptor has been described based on X-ray crystallography (Burmeister et al, Nature 372(1994) 379). The main contact region of Fc to FcRn is near the junction of the CH2 and CH3 domains. The Fc-FcRn contacts are all within a single Ig heavy chain polypeptide. FcRn binding partners include intact IgG, the Fc region of IgG, single Fc region polypeptides, and other fragments of IgG that include the intact binding region of FcRn. The primary contact sites include amino acid residues 248, 250, 257, 272, 285, 288, 290, 291, 308, 311 and 314 of the CH2 domain and amino acid residues 385, 387, 428 and 433, 436 of the CH3 domain.
The fusion polypeptide of the invention comprises at least one peptide linker of the invention. In one embodiment, the fusion polypeptide includes between 1 and 10 peptide linkers. In one embodiment, two or more peptide linkers are present in the fusion polypeptide of the invention. In another embodiment, a fusion polypeptide of the invention comprises 1, 2, 3 or 4 peptide linkers of the invention.
The peptide linker of the invention may occur once at a given position, or may occur multiple times at different positions within the same fusion polypeptide.
A peptide linker of the invention can be obtained by the person skilled in the art by modifying a peptide linker such that the C-terminal serine amino acid residue is removed or substituted with a glycine residue, provided that the next amino acid residue of the fusion polypeptide is not a serine, threonine, or proline amino acid residue.
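The modification described above can be sketched as a small routine that builds a conventional (Gly4Ser)n linker, removes or replaces its C-terminal serine, and rejects junctions where the next residue of the fusion polypeptide is serine, threonine, or proline. The function names, the `mode` parameter and the example residues are hypothetical and serve only to illustrate the rule.

```python
# Residues that must not immediately follow the modified linker.
FORBIDDEN_NEXT_RESIDUES = {"S", "T", "P"}


def gs_linker(repeats, unit="GGGGS"):
    """Build a conventional Gly/Ser linker, e.g. (GGGGS)n."""
    return unit * repeats


def modify_linker(linker, next_residue, mode="remove"):
    """Remove the C-terminal serine of `linker` (mode="remove") or
    substitute it with glycine (mode="substitute").

    `next_residue` is the first residue of the domain following the linker.
    """
    if next_residue.upper() in FORBIDDEN_NEXT_RESIDUES:
        raise ValueError(
            "residue following the linker must not be Ser, Thr or Pro")
    if mode not in ("remove", "substitute"):
        raise ValueError("mode must be 'remove' or 'substitute'")
    if not linker.endswith("S"):
        return linker  # no C-terminal serine to modify
    if mode == "remove":
        return linker[:-1]
    return linker[:-1] + "G"


if __name__ == "__main__":
    conventional = gs_linker(3)  # GGGGSGGGGSGGGGS
    print(modify_linker(conventional, "E"))
    print(modify_linker(conventional, "E", mode="substitute"))
```

Raising an error for a forbidden following residue, rather than silently returning the linker, makes an unsuitable junction visible at design time.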
The peptide linkers of the present invention may have different lengths. In one embodiment, the peptide linker of the invention is about 6 to about 75 amino acids in length. In another embodiment, the peptide linker of the invention is about 6 to about 50 amino acids in length. In another embodiment, the peptide linker of the invention is about 10 to about 40 amino acids in length. In another embodiment, the peptide linker of the invention is about 15 to about 35 amino acids in length. In another embodiment, the peptide linker of the invention is about 15 to about 20 amino acids in length. In another embodiment, the peptide linker of the invention is about 15 amino acids in length.
The position of the peptide linker of the invention may vary depending on the domains of the fusion polypeptide that it connects. Although various specific examples of fusion polypeptides comprising a peptide linker are disclosed herein, it is understood that the peptide linker of the invention can be located at least at any position in a recombinant fusion polypeptide at which peptide linkers are currently used. Peptide linkers are used so frequently in protein engineering that they have become standard building blocks in synthetic biology.
Some examples of the use of peptide linkers currently recognized in the art include use in: scFv molecules (free et al, FEBS 320(1993)97), single chain immunoglobulin molecules (Shun et al, Proc. Natl. Acad. Sci. USA 90(1993)7995), miniantibodies (Hun et al, Cancer Res.56(1996)3055), CH2 domain deleted antibodies (Mueller et al, Proc. Natl. Acad. Sci. USA.87(1990)5702), single chain bispecific antibodies (Schertz et al, Cancer Res.65(2005)2882), full length IgG class bispecific antibodies (Marvin et al, Acta pharm. sin.26(2005)649, Michelson et al, MAbs 1(2009)128, Routt et al, Prot. Eng.Des.137 (137), scFv fusion proteins (Deee et al, Brit. J.811), Brit. 811. 86(2002) protein, Biotech (3642) complementary to the Fc (3642) and complementary Fc (Biotech) molecules (2008/143954).
Peptide linkers can be attached to the N-terminus or C-terminus (or both) of the polypeptide where they are used for fusion with other polypeptides.
In another embodiment, the peptide linker of the invention can be used to genetically fuse two biologically active polypeptides (each biologically active polypeptide being a domain) wherein each polypeptide is individually biologically active.
In another embodiment, the peptide linker of the invention is used to fuse two polypeptides to each other, wherein neither polypeptide is biologically active alone, but the polypeptides are biologically active when genetically fused.
For example, in one embodiment, the peptide linker of the invention may be used to genetically fuse VH and VL: A-L-B, wherein A is VH or VL, B is VH or VL, and L is a peptide linker according to the invention or A-L-B-L, wherein A is VH or VL, B is VH or VL, and L is a peptide linker according to the invention.
In another embodiment, a peptide linker can be used to genetically fuse a biologically active polypeptide to an intact Fc region, an Fc region polypeptide portion, or an scFc region: C-L-Fc, wherein C is a biologically active polypeptide, L is a peptide linker according to the invention, and Fc is an Fc region (e.g., a single chain or a traditional double polypeptide chain), an Fc region polypeptide portion, or an scFc region. For example, in one embodiment, C comprises an scFv molecule (e.g., comprising VH-L-VL or VL-L-VH, where L is a peptide linker) and Fc consists of an Fc region polypeptide (hinge-CH2-CH3 domains) or an scFc region, thereby forming an scFv-Fc fusion protein or an scFv-scFc fusion protein. In another embodiment, C comprises an scFv molecule (e.g., comprising VH-L-VL or VL-L-VH, wherein L is a peptide linker) and Fc is a CH3 domain, thereby forming a minibody. In another embodiment, C comprises two scFv molecules in tandem and the Fc region polypeptide portion is a CH3 domain, thereby forming a tetravalent minibody.
Tetravalent minibodies can also be formed using the following format: A-L-B-L-Fc-L-A-L-B, wherein A and B are each one of a VH or VL domain, L is a peptide linker according to the invention, and Fc is a CH3 domain or scFc region.
In another embodiment, the fusion polypeptide of the invention may have the following form: D-L-A-L-B, wherein D is a complete antibody molecule, L is a peptide linker according to the invention, and A and B are each a VH or VL domain. Such constructs produce a C-terminal tetravalent antibody molecule.
In another embodiment, the fusion polypeptide of the invention may have the following form: A-L-B-L-D, wherein D is a complete antibody molecule, L is a peptide linker according to the invention, and A and B are each a VH or VL domain. Such constructs produce N-terminal tetravalent antibody molecules. In such constructs, the A-L-B (scFv) portion of the molecule may be genetically fused to the light or heavy chain variable region.
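The dash-separated formats above (A-L-B, C-L-Fc, A-L-B-L-Fc-L-A-L-B and so on) read as N-terminal to C-terminal concatenation orders. The following is a minimal Python sketch of that reading. The function name `assemble`, the dictionary `PARTS` and all placeholder sequences are hypothetical and serve illustration only; real domains are far longer, and the linker shown is merely one glycine-serine linker ending in glycine.

```python
from typing import Dict


def assemble(layout: str, parts: Dict[str, str]) -> str:
    """Join the sequences named in a dash-separated layout, N- to C-terminus."""
    symbols = layout.split("-")
    missing = [s for s in symbols if s not in parts]
    if missing:
        raise KeyError(f"no sequence given for: {missing}")
    return "".join(parts[s] for s in symbols)


# Hypothetical placeholder sequences, much shorter than real domains.
PARTS = {
    "A": "EVQLVES",     # stands in for a VH domain
    "B": "DIQMTQS",     # stands in for a VL domain
    "Fc": "EPQVYTL",    # stands in for a CH3 domain
    "L": "GGGGSGGGGG",  # illustrative glycine-serine linker ending in glycine
}

scfv = assemble("A-L-B", PARTS)
minibody_chain = assemble("A-L-B-L-Fc-L-A-L-B", PARTS)
print(scfv)
print(len(minibody_chain))
```

In practice the concatenation would be carried out at the nucleic acid level, with the encoded polypeptide expressed as a single chain.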
In another embodiment, the peptide linker of the invention may be used to fuse the CH3 domain to the hinge region. In another embodiment, the peptide linker of the invention may be used to fuse the CH3 domain to the CH1 domain. In yet another embodiment, a peptide linker according to the invention may be used as a spacer between the hinge region and the CH2 domain. Preferred positions for the peptide linker according to the invention are between the Fc region or Fc region polypeptide and the scFv or Fab.
Where more than one binding site is included in a polypeptide, it will be understood that such molecules may be monospecific or multispecific, i.e. the binding sites may be the same or may be different.
Peptide linkers can be introduced into the polypeptide sequence using techniques known in the art. Modification can be confirmed by DNA sequence analysis. Plasmid DNA may be used to transform host cells for stable production of the polypeptide.
The fusion polypeptides of the invention comprise at least one biologically active polypeptide (domain). Such polypeptides may have biological activity as a single molecule, or may need to be associated with another polypeptide (e.g., when linked to a second polypeptide via a peptide linker or when present in a polypeptide dimer).
In one embodiment, the fusion polypeptide of the invention comprises only one biologically active polypeptide (domain) (creating a molecule that is monomeric with respect to the biologically active polypeptide, but can be monomeric or dimeric with respect to the number of polypeptide chains). In another embodiment, the fusion polypeptide of the invention comprises more than one biologically active polypeptide (domain), e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10 or more biologically active polypeptides.
As used herein, the term "biologically active polypeptide" is not intended to include chemical effector moieties (e.g., toxic moieties, detectable moieties, etc.) that can be added to the polypeptide by chemical means.
In one embodiment of the invention, the biologically active polypeptide is operably linked to the N-terminus of the Fc region polypeptide or a portion thereof via a peptide linker according to the invention. In another embodiment, the biologically active polypeptide is operably linked to the C-terminus of the Fc region polypeptide via a peptide linker according to the invention.
In other embodiments, two or more biologically active polypeptides are linked to each other consecutively (e.g., via a peptide linker according to the invention). In one embodiment, the tandem biologically active polypeptide arrangement is operably linked to the C-terminus or N-terminus of the Fc region polypeptide, or a portion thereof, via a peptide linker according to the present invention.
In one embodiment, the fusion polypeptide of the invention includes at least one of an antigen binding site (e.g., of an antibody, antibody variant, or antibody fragment), a receptor binding portion of a ligand, or a ligand binding portion of a receptor.
In one embodiment, the biologically active polypeptide comprises an antigen binding site.
In certain embodiments, the fusion polypeptides of the invention have at least one binding site specific for a target molecule that mediates a biological effect. In one embodiment, the binding site modulates activation or inhibition of a cell (e.g., by binding to a cell surface receptor and causing transmission of an activation or inhibition signal). In one embodiment, the binding site is capable of initiating signal transduction, which results in cell death (e.g., by a cell signal-induced pathway, by complement fixation, or by exposure to a payload present on the binding molecule (e.g., a toxic payload)), or which modulates a disease or disorder in the subject (e.g., by mediating or promoting cell killing, by promoting fibrin clot lysis or promoting clot formation, or by modulating the amount of a bioavailable substance in the subject (e.g., by increasing or decreasing the amount of a ligand, such as TNF)). In another embodiment, the fusion polypeptide of the invention has at least one binding site specific for an antigen to be reduced or eliminated (e.g., a cell surface antigen or a soluble antigen).
In another embodiment, binding of a fusion polypeptide of the invention to a target molecule (e.g., an antigen) results in the reduction or elimination of the target molecule, e.g., from the tissue or circulation.
In another embodiment, the fusion polypeptide has at least one binding site specific for a target molecule and can be used to detect the presence of the target molecule (e.g., detect a contaminant or diagnose a condition or disorder).
In yet another embodiment, the fusion polypeptide of the invention includes at least one binding site that targets the molecule to a specific site (e.g., tumor cell, immune cell, or blood clot) in the subject.
In certain embodiments, a fusion polypeptide of the invention can include two or more biologically active polypeptides. In one embodiment, the biologically active polypeptides are the same. In another embodiment, the biologically active polypeptides are different.
In certain particular embodiments, the fusion polypeptides of the invention are multispecific, e.g., having at least one binding site that binds to a first molecule or epitope of a molecule and at least one second binding site that binds to a second molecule or to a second epitope of the first molecule. The multispecific binding molecules of the present invention may comprise at least two binding sites. In certain embodiments, at least one binding site of a multispecific binding molecule of the invention is an antigen-binding region of an antibody or antigen-binding fragment thereof (e.g., an antibody or antigen-binding fragment).
The following examples and figures are provided to aid the understanding of the present invention, the true scope of which is set forth in the appended claims. It will be appreciated that modifications may be made to the procedures set forth without departing from the spirit of the invention.
Summary of examples:
Fusion protein 1 (HC1-F):
Fusion protein 1 comprises two peptide linkers connecting, in N-terminal to C-terminal direction, a first non-IgG protein to a second non-IgG protein and to an IgG heavy chain domain. Fusion protein 1 was found to be phosphorylated at the C-terminal serine residue of the linker GGGGSGGGGSRE (SEQ ID NO:09) (see FIGS. 1-8). In addition to phosphorylation, xylosylation is also present. Phosphorylation results in a mass shift of +79 Da, and xylosylation in a mass shift of +132 Da. FIG. 1 shows a deconvoluted total mass spectrum of deglycosylated and reduced HC1-F. In addition to the expected HC1-F, a further peak indicating the presence of a strong +79 Da HC1-F product variant was seen. The modification was confirmed to be present in the linker of SEQ ID NO:09 by LC-MS analysis of a tryptic digest of HC1-F. FIG. 3 shows the extracted ion current (EIC) chromatograms (z = 2 and 3) of the (A) unmodified (elution time 36.5 minutes) and (B) +79.97 Da modified (elution time 37.6 minutes) tryptic glycine-serine linker peptide of HC1-F (X9 GGGGSGGGGSR; SEQ ID NO:07). The additional 79.97 Da was localized to the C-terminal serine residue of the peptide linker by ion trap MS/MS analysis using collision-induced dissociation of the triply protonated unmodified and +79.97 Da modified tryptic glycine-serine linker peptides of HC1-F, and by MS/MS analysis using electron transfer/high energy collision dissociation of the triply protonated modified peptide (see FIG. 4). This was further confirmed by spiking with a synthetic phosphopeptide and by enzymatic dephosphorylation (FIGS. 5 and 6). Using thermolysin digestion and LC-MS/MS, the phosphorylation was specifically localized to glycine-serine linker I of fusion protein 1 (see FIGS. 7 and 8).
To reduce the heterogeneity of the HC1-F fusion protein, the post-translational modifications are removed by substituting the terminal serine residue of the peptide linker with a residue other than proline or threonine, preferably with glycine, or by simply deleting the residue, provided that the next residue, i.e. the first residue of the polypeptide fused to the peptide linker, is also not a serine, threonine or proline residue.
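The substitution-or-deletion rule described for HC1-F can be illustrated with a short Python sketch. It assumes that sequences are given as one-letter strings and that the linker and the downstream polypeptide are passed separately. The function name `redesign_linker` and its parameters are hypothetical. Only the preferred replacement (glycine) is implemented.

```python
# Residues named in the text as unsuitable at the linker junction.
EXCLUDED = frozenset("STP")


def redesign_linker(linker: str, downstream: str, delete: bool = False) -> str:
    """Replace a terminal serine of the linker with glycine, or delete it.

    The proviso from the text is enforced: the first residue of the
    polypeptide fused to the linker must not be serine, threonine or proline.
    """
    if not linker.endswith("S"):
        return linker  # nothing to change in this sketch
    if downstream and downstream[0] in EXCLUDED:
        raise ValueError("first downstream residue is S, T or P; proviso not met")
    trimmed = linker[:-1]
    return trimmed if delete else trimmed + "G"


# The HC1-F linker region GGGGSGGGGS is followed by the residues RE.
print(redesign_linker("GGGGSGGGGS", "RE"))
print(redesign_linker("GGGGSGGGGS", "RE", delete=True))
```

When the proviso fails, as with a downstream polypeptide starting with threonine, the sketch raises an error rather than guessing which downstream residue to change.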
Fusion protein 2 (HC2-F):
Fusion protein 2 comprises a peptide linker linking the C-terminus of the antibody heavy chain to the shortened N-terminus of the antibody light chain. HC2-F was found to be O-glycosylated at serine residues in the linker SLSLPGGGGSGGGGSGGGGSGGGGSIQM (SEQ ID NO:10) (see FIGS. 9-13). FIG. 9 shows O-xylosylation and O-glycosylation on HC2-F after reduction and analysis by UHR-QTOF-ESI-MS. In the extracted ion chromatogram obtained for HC2-F in FIG. 10, O-glycans were identified in the peptide SLSLSLSLSLSLSSPGGGGGSGGGGSGGSGGGGSIQMTX 13 (SEQ ID NO:10), which includes the glycine-serine linker. Using EThcD/HCD MS2, the localization of GalNAc (+203 Da) and GalNAc-Gal-Neu5Ac (+656 Da) at the C-terminal serine residue of the GS peptide linker was confirmed (FIGS. 11 and 12). FIG. 13 shows the relative quantification of O-glycans of the HC peptide fragment of SEQ ID NO:10 relative to the sum of unmodified and O-xylosylated peptides for HC2-F.
To reduce the heterogeneity of the HC2-F fusion protein, the post-translational modifications are removed by substituting the terminal serine residue of the peptide linker with a residue other than proline or threonine, preferably with glycine, or by simply deleting the residue, provided that the next residue, i.e. the first residue of the polypeptide fused to the peptide linker, is also not a serine, threonine or proline residue.
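Relative quantification of modified linker peptides from extracted ion chromatograms can be sketched as follows. The sketch assumes one common convention, namely that each species is expressed as its integrated peak area divided by the summed area of all species considered; whether exactly this convention underlies FIG. 13 is not stated here. The function name and the peak areas are hypothetical and are not measured data.

```python
from typing import Dict


def relative_abundance(peak_areas: Dict[str, float]) -> Dict[str, float]:
    """Return each species' share of the summed peak area, in percent."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("summed peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical integrated peak areas (arbitrary units), not measured data.
areas = {
    "unmodified": 8000.0,
    "O-xylose": 1000.0,
    "GalNAc": 600.0,
    "GalNAc-Gal-Neu5Ac": 400.0,
}
shares = relative_abundance(areas)
for name, percent in shares.items():
    print(f"{name}: {percent:.1f}%")
```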
Fusion protein 3 (LC3-F):
Fusion protein 3 includes a peptide linker that links the antibody light chain to the Fc region of the antibody. LC3-F was found to be O-glycosylated at a threonine residue in the linker GGGSGGGGSGGGGSGGGGSGGGGTCPPCPAPEAAGGPSVFLFPPKPK (SEQ ID NO:11) (see FIGS. 14-16). FIG. 14B shows O-glycosylation on LC3-F generated in HEK cells after N-deglycosylation and analysis of the intact protein by UHR-QTOF-ESI-MS. The GalNAc-Gal-2NeuAc (+948 Da) modification was quantified at a level of about 20%. FIG. 14C shows intact LC3-F after N-deglycosylation and desialylation and analysis of the intact protein by UHR-QTOF-ESI-MS. FIG. 14D demonstrates that O-glycosylation is localized to the A chain of LC3-F, after N-deglycosylation, desialylation and analysis of the reduced fusion protein 3 by UHR-QTOF-ESI-MS.
FIG. 15 shows a deconvoluted mass spectrum of LC3-F digested with an endoprotease derived from Akkermansia muciniphila (OpeRATOR; Genovis). FIG. 16 shows XICs of LC3-F after desialylation and tryptic digestion with (upper XIC) or without (lower XIC) OpeRATOR protease. The mass of the fragment corresponds to O-glycosylation of a threonine residue of the hinge region. MS/MS of the desialylated tryptic peptides digested with OpeRATOR protease localized the O-glycosylation to the N-terminal threonine residue (FIG. 17).
To reduce the heterogeneity of the LC3-F fusion protein, the post-translational modifications were removed by substituting the first residue of the polypeptide fused to the peptide linker, i.e. the threonine residue, with a residue other than proline or serine, preferably with glycine.
In the variant of LC3-F, the threonine residue at the terminus of the linker was deleted and replaced with a glycine residue such that the linker included the amino acid sequence GGGGSGGGGSGGGGSGGGGSGGGGSGGGGGCPPC SEQ ID NO: 23. FIG. 21 shows the LC3-F variants after N-deglycosylation of the intact protein and analysis by UHR-QTOF-ESI-MS. FIG. 22 shows LC3-F variants after N-deglycosylation, reduction and analysis by UHR-QTOF-ESI-MS. There was no O-glycosylation in the LC3-F variant.
Fusion protein 4 (HC4-F):
Fusion protein 4 includes a peptide linker linking the non-IgG protein to the Fc region of the antibody. HC4-F was found to be O-fucosylated at a serine residue in the linker LGGGGSGGGGSRT (SEQ ID NO:14) (see FIGS. 18-19). FIG. 18 shows that O-fucosylation (+146 Da) is present on HC4-F after thermolysin digestion, and that this O-fucosylation can be localized to the peptide LGGGGSGGGGSRT (SEQ ID NO:14) by peptide mapping (XIC). FIG. 19 shows the results of spiking the modified synthetic peptide (27-310 nM) X8LGGGGSGGGGS(+fucose)RT (SEQ ID NO:15) into the tryptic digest of HC4-F. The modified synthetic peptide co-eluted with the modified HC4-F tryptic peptide X8LGGGGSGGGGSRT +146 Da and increased the area under the curve. Shown are the XICs of the unspiked sample (top panel) and of samples with increasing spike levels (bottom panel). LC-MS/MS tryptic peptide mapping localized the O-fucosylation to the terminal serine residue (FIG. 20). To reduce the heterogeneity of the HC4-F fusion protein, the post-translational modifications are removed by substituting the terminal serine residue of the peptide linker with a residue other than proline or threonine, preferably with glycine, or by simply deleting the residue, provided that the next residue, i.e. the first residue of the polypeptide fused to the peptide linker, is also not a serine, threonine or proline residue.
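The mass shifts reported for the four fusion proteins can be collected into a small lookup that assigns an observed mass difference to a candidate modification. The following Python sketch uses only the nominal shifts quoted in this description. The tolerance of 1.0 Da and the function name `annotate_shift` are assumptions chosen for illustration, and a matching mass alone does not prove the identity or the site of a modification.

```python
from typing import Optional

# Mass shifts (Da) as quoted in this description.
MASS_SHIFTS = {
    "phosphorylation": 79.97,
    "O-xylose": 132.0,
    "O-fucose": 146.0,
    "GalNAc": 203.0,
    "GalNAc-Gal": 365.13,
    "GalNAc-Gal-Neu5Ac": 656.0,
    "GalNAc-Gal-2Neu5Ac": 948.0,
}


def annotate_shift(delta_da: float, tolerance_da: float = 1.0) -> Optional[str]:
    """Return the closest listed modification within the tolerance, else None."""
    name, reference = min(MASS_SHIFTS.items(), key=lambda kv: abs(kv[1] - delta_da))
    return name if abs(reference - delta_da) <= tolerance_da else None


print(annotate_shift(79.97))
print(annotate_shift(146.2))
print(annotate_shift(500.0))
```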
Specific embodiments are as follows:
1. A fusion polypeptide comprising an amino acid sequence
GyX1                    (SEQ ID NO:04)
wherein X1 can be any amino acid residue other than serine, threonine, and proline, and
wherein y is an integer of 3 to 25 inclusive.
2. The fusion polypeptide according to embodiment 1, wherein the fusion polypeptide comprises an amino acid sequence
GyX1X2X3               (SEQ ID NO:05)
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 may be any amino acid residue independent of each other, and
wherein y is an integer of 3 to 25 inclusive.
3. The fusion polypeptide according to any one of embodiments 1 to 2, wherein y is an integer of 4 to 20 inclusive.
4. The fusion polypeptide according to any one of embodiments 1 to 3, wherein y is an integer of 5 to 15 inclusive.
5. A fusion polypeptide comprising an amino acid sequence
GnSGmX1X2X3           (SEQ ID NO:02)
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 can be any amino acid residue independently of each other,
wherein n is 1, 2, 3 or 4, and
wherein m is 3, 4 or 5.
6. The fusion polypeptide according to any one of embodiments 1 to 5, wherein the fusion polypeptide comprises at least three domains,
wherein each of the three domains is a polypeptide of at least 10 amino acid residues in length, independently of the other two domains, and
wherein the domains are conjugated to each other via peptide bonds.
7. The fusion polypeptide according to embodiment 6, wherein the C-terminus of the first domain is conjugated to the N-terminus of the second domain via a peptide bond and the C-terminus of the second domain is conjugated to the N-terminus of the third domain via a peptide bond.
8. The fusion polypeptide according to any one of embodiments 1 to 7, wherein the fusion polypeptide is a recombinant fusion polypeptide.
9. The fusion polypeptide according to any one of embodiments 1 to 8, wherein the fusion polypeptide is produced in a eukaryotic cell.
10. The fusion polypeptide according to any one of embodiments 1 to 9, wherein the fusion polypeptide has no post-translational modification at S or X1 of SEQ ID NO:01 or SEQ ID NO:02 or SEQ ID NO:04, respectively.
11. The fusion polypeptide according to any one of embodiments 1 to 10, wherein the linear fusion polypeptide has no post-translational modification at X1 of SEQ ID NO:01 or SEQ ID NO:02 or SEQ ID NO:04.
12. The fusion polypeptide according to any one of embodiments 10 to 11, wherein the post-translational modification is phosphorylation (addition of phosphate groups) and/or glycosylation (addition of carbohydrate moieties).
13. The fusion polypeptide according to embodiment 12, wherein the glycosylation is xylosylation (the carbohydrate moiety is xylose).
14. The fusion polypeptide according to embodiment 12, wherein the glycosylation is glucosylation (the carbohydrate moiety is glucose).
15. The fusion polypeptide according to any one of embodiments 5 to 14, wherein the fusion polypeptide has reduced post-translational modifications at residue S and/or X1 as compared to a fusion polypeptide comprising:
GpSGnSGmX1X2X3        (SEQ ID NO:06)
wherein X1 is serine, threonine or proline,
wherein X2 and X3 can be any amino acid residue independently of each other,
wherein n is 1, 2, 3 or 4,
wherein p = n or p = 1, 2, 3 or 4, and
wherein m is 3, 4 or 5.
16. The fusion polypeptide according to any one of embodiments 5 to 15, wherein n = 3 and m = 3 or 4.
17. The fusion polypeptide according to any one of embodiments 5 to 15, wherein n = 3 and m = 4.
18. The fusion polypeptide according to any one of embodiments 5 to 15, wherein n = 4 and m = 4 or 5.
19. The fusion polypeptide according to any one of embodiments 5 to 15, wherein n = 4 and m = 5.
20. The fusion polypeptide according to any one of embodiments 1 to 19, wherein the fusion polypeptide comprises an amino acid sequence
GpSGnSGmX1X2X3        (SEQ ID NO:03)
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 can be any amino acid residue independently of each other,
wherein n is 1, 2, 3 or 4,
wherein m is 3, 4 or 5, and
wherein p is 3 or 4.
21. The fusion polypeptide according to any one of embodiments 1 to 20, wherein the fusion polypeptide further comprises an antibody Fc-region polypeptide or wherein at least one domain is an antibody Fc-region polypeptide.
22. The fusion polypeptide according to any one of embodiments 6 to 21, wherein the first and/or third domain is an antibody Fc-region polypeptide.
23. A fusion polypeptide according to any one of embodiments 1 to 22, wherein the fusion polypeptide further comprises a VH domain, a VL domain, a scFv, a scFab, a VH-CH1 pair, a VL-CL pair, a VH-CL pair, a VL-CH1 pair, a receptor or extracellular domain thereof, a receptor binding portion of a ligand, an enzyme, a growth factor, an interleukin, a cytokine, or a chemokine/a fusion polypeptide according to any one of embodiments 2 to 18, wherein at least one domain of the fusion polypeptide is selected from the group consisting of: a VH domain, a VL domain, a scFv, a scFab, a VH-CH1 pair, a VL-CL pair, a VH-CL pair, a VL-CH1 pair, a receptor or extracellular domain thereof, a receptor binding portion of a ligand, an enzyme, a growth factor, an interleukin, a cytokine, and a chemokine.
24. The fusion polypeptide according to any one of embodiments 1 to 23, wherein the fusion polypeptide is monomeric.
25. The fusion polypeptide according to any one of embodiments 1 to 24, wherein the fusion polypeptide is a linear fusion polypeptide.
26. A multimeric molecule comprising at least two polypeptides, wherein at least one polypeptide is a fusion polypeptide according to any one of embodiments 1 to 25.
27. The multimeric molecule according to embodiment 26, wherein at least two polypeptides are conjugated to each other via one or more disulfide bonds.
28. The multimeric molecule according to any one of embodiments 26 to 27, wherein the multimeric molecule is an antibody.
29. The multimeric molecule according to any one of embodiments 26 to 28, wherein the multimeric molecule is a bispecific antibody.
30. The multimeric molecule according to any one of embodiments 26 to 29, further comprising at least one antibody light chain.
31. The multimeric molecule according to any one of embodiments 26 to 30, further comprising at least one antibody heavy chain.
32. A pharmaceutical composition comprising at least one fusion polypeptide according to any one of embodiments 1 to 25 or a multimeric molecule according to any one of embodiments 26 to 31 and optionally a pharmaceutically acceptable carrier.
33. A nucleic acid molecule encoding a fusion polypeptide according to any one of embodiments 1 to 26.
34. The nucleic acid molecule according to embodiment 33, wherein the nucleic acid molecule is in an expression cassette.
35. The nucleic acid molecule according to any one of embodiments 33 to 34, wherein the nucleic acid molecule is in a vector.
36. A set of nucleic acid molecules encoding the polypeptides of the multimeric molecule according to any one of embodiments 26 to 31.
37. A set of nucleic acid molecules according to embodiment 36, wherein each of the nucleic acid molecules is in an expression cassette.
38. The set of nucleic acid molecules according to any one of embodiments 36 to 37, wherein each of the nucleic acid molecules is in a vector.
39. The set of nucleic acid molecules according to any one of embodiments 36 to 37, wherein the nucleic acid molecules are on the same vector.
40. The set of nucleic acid molecules according to any one of embodiments 36 to 37, wherein the nucleic acid molecules are on different vectors.
41. The set of nucleic acid molecules according to any one of embodiments 36 to 37, wherein the nucleic acid molecules are on two vectors.
42. A eukaryotic cell comprising a nucleic acid molecule according to embodiment 33 or a set of nucleic acid molecules according to embodiment 36.
43. A peptide linker comprising an amino acid sequence
GyX1                    (SEQ ID NO:04)
wherein X1 can be any amino acid residue other than serine, threonine, and proline, and
wherein y is an integer of 3 to 25 inclusive.
44. A peptide linker comprising an amino acid sequence
GyX1X2X3                (SEQ ID NO:05)
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 may be any amino acid residue independent of each other, and
wherein y is an integer of 3 to 25 inclusive.
45. The peptide linker according to any one of embodiments 43 to 44, wherein y is an integer of 4 to 20 inclusive.
46. The peptide linker according to any one of embodiments 43 to 45, wherein y is an integer of 5 to 15 inclusive.
47. A peptide linker comprising an amino acid sequence
GnSGmX1X2X3            (SEQ ID NO:01)
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 can be any amino acid residue independently of each other,
wherein n is 1, 2, 3 or 4, and
wherein m is 3, 4 or 5.
48. A peptide linker comprising an amino acid sequence
GpSGnSGmX1X2X3        (SEQ ID NO:03)
wherein X1 can be any amino acid residue other than serine, threonine, and proline,
wherein X2 and X3 can be any amino acid residue independently of each other,
wherein n is 1, 2, 3 or 4,
wherein m is 3, 4 or 5, and
wherein p is 3 or 4.
49. The peptide linker according to any one of embodiments 47 to 48, wherein n = 3 and m = 3 or 4.
50. The peptide linker according to any one of embodiments 47 to 49, wherein n = 3 and m = 4.
51. The peptide linker according to any one of embodiments 47 to 48, wherein n = 4 and m = 4 or 5.
52. The peptide linker according to any one of embodiments 47 to 48 and 51, wherein n = 4 and m = 5.
53. A method of producing a fusion polypeptide comprising the steps of:
- culturing the eukaryotic cell according to embodiment 42 under conditions in which the fusion polypeptide is expressed, and
- recovering the fusion polypeptide from the eukaryotic cell or the culture medium,
thereby producing a fusion polypeptide.
54. A method of producing a fusion polypeptide having a reduced level of post-translational modification, comprising:
- culturing the eukaryotic cell according to embodiment 42 under conditions in which the fusion polypeptide is expressed, and
- recovering the fusion polypeptide from the eukaryotic cell or the culture medium,
thereby producing a fusion polypeptide with a reduced level of post-translational modification.
55. A method of producing a stable fusion polypeptide, the method comprising genetically engineering a fusion protein to comprise the peptide linker of any one of embodiments 43 to 52, and allowing the fusion polypeptide to be expressed by a eukaryotic cell, thereby producing a stable fusion polypeptide.
56. A composition prepared according to the method of any one of embodiments 53 to 55.
57. A method of treating a subject who would benefit from treatment with a fusion polypeptide according to any one of embodiments 1 to 25 or a multimeric molecule according to any one of embodiments 26 to 31, comprising administering to the subject the composition of any one of embodiments 32 or 56.
58. Use of a composition according to any one of embodiments 32 or 56 for treating a disease or disorder.
59. Use of a composition according to any one of embodiments 32 or 56 for the manufacture of a medicament.
60. A fusion polypeptide according to any one of embodiments 1 to 25 or a multimeric molecule according to any one of embodiments 26 to 31 for use as a medicament.
61. Use of a fusion polypeptide according to any one of embodiments 1 to 25 or a multimeric molecule according to any one of embodiments 26 to 31 in the manufacture of a medicament.
62. A method of treating an individual in need of treatment comprising administering to the individual an effective amount of a fusion polypeptide according to any one of embodiments 1 to 25 or a multimeric molecule according to any one of embodiments 26 to 31.
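The linker pattern of embodiments 5 and 47, GnSGmX1 with n being 1, 2, 3 or 4 and m being 3, 4 or 5, spans a small, enumerable family of sequences once X1 is fixed. The following Python sketch builds that family for the preferred choice X1 = glycine. The names `linker_core` and `ALL_CORES` are hypothetical, and X2 and X3, which may be any residue, are omitted.

```python
from itertools import product

# X1 may be any residue other than serine, threonine and proline.
EXCLUDED_X1 = frozenset("STP")


def linker_core(n: int, m: int, x1: str = "G") -> str:
    """Build G(n)-S-G(m)-X1 for the n, m and X1 ranges of embodiment 5."""
    if n not in (1, 2, 3, 4):
        raise ValueError("n must be 1, 2, 3 or 4")
    if m not in (3, 4, 5):
        raise ValueError("m must be 3, 4 or 5")
    if len(x1) != 1 or x1 in EXCLUDED_X1:
        raise ValueError("X1 must be a single residue other than S, T or P")
    return "G" * n + "S" + "G" * m + x1


# All twelve n/m combinations with X1 = glycine.
ALL_CORES = [linker_core(n, m) for n, m in product((1, 2, 3, 4), (3, 4, 5))]
print(len(ALL_CORES))
print(linker_core(4, 4))
```

The combinations singled out in embodiments 16 to 19 (for example n = 3 with m = 4, or n = 4 with m = 5) are members of this list.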
Drawings
FIG. 1 shows the total mass determination of a fusion protein comprising an antibody heavy chain constant domain and a non-antibody polypeptide (HC1-F) transiently expressed in human embryonic kidney cells. The deconvoluted mass spectrum of deglycosylated and reduced HC1-F is shown. In addition to the expected HC1-F, a further peak indicating the presence of a strong +79 Da HC1-F product variant was seen. A lower-intensity signal from the +132 Da xylose product variant was also present.
FIG. 2 shows a schematic of HC 1-F. HC1-F consists of two >10kDa non-IgG proteins fused to an IgG heavy chain constant domain by two standard glycine-serine [ (G4S)2] linkers I and II.
FIG. 3 shows the extracted ion current (EIC) chromatograms (z = 2 and 3) of the (A) unmodified (elution time 36.5 minutes) and (B) +79.97 Da modified (elution time 37.6 minutes) trypsin-digested glycine-serine linker peptide of HC1-F (X9 GGGGSGGGGSR; SEQ ID NO:07). Relative comparison of the integrated EIC chromatograms quantified the amount of modification at 5.5% (including 1.1% containing O-xylose). MA, manually integrated peak; NL, normalized intensity level.
FIG. 4 shows ion trap MS/MS data obtained by collision-induced dissociation of the triply protonated (A) unmodified and (B) +79.97 Da modified trypsin-digested glycine-serine linker peptides of HC1-F, and (C) an MS/MS spectrum obtained by electron transfer/high energy collision dissociation of the triply protonated modified peptide. The additional 79.97 Da is located at the C-terminal serine residue.
FIG. 5 shows the effect of spiking a synthetic phosphopeptide into the tryptic digest. Extracted ion current chromatograms (z = 2 and 3) of the unmodified and +79.97 Da modified trypsin-digested linker peptide (X9 GGGGSGGGGSR; SEQ ID NO:07) of HC1-F (A) without, (B) with 0.5 µM and (C) with 1.0 µM of the spiked synthetic phosphopeptide (X9 GGGGSGGGGpSR; SEQ ID NO:08). The spiked samples show an increase in the peak area of the phosphorylated peptide, which demonstrates the correct identification of the modified tryptic peptide. NL, normalized intensity level.
FIG. 6 shows the results of enzymatic dephosphorylation of the modified trypsin-digested linker peptide of HC1-F with alkaline phosphatase. Extracted ion current chromatograms (z = 2 and 3) of the unmodified and +79.97 Da modified tryptic linker peptide (X9 GGGGSGGGGSR; SEQ ID NO:07) after (A) incubation with alkaline phosphatase and (B) incubation without alkaline phosphatase (control reaction). After enzymatic dephosphorylation, the modified tryptic peptide could no longer be detected. NL, normalized intensity level.
FIG. 7 shows that phosphorylation occurs specifically at glycine-serine linker I. (A) Extracted ion current (EIC) chromatograms (z = 2) of the unmodified (elution time 12.5 min) and phosphorylated (elution time 13.5 min) thermolysin-digested glycine-serine linker I peptide (XGGGSGGGGSREX3; SEQ ID NO:09) of HC1-F. Relative comparison of the integrated EIC chromatograms quantified the amount of modification at 11.3% (including 1.4% containing O-xylose at the GSG motif (data not shown)). Orbitrap HCD-MS/MS data for the doubly protonated unmodified and modified thermolysin-digested glycine-serine linker I peptide are shown in (B) and (C), respectively. The +79.97 Da modification is located at the C-terminal serine residue. MA, manually integrated peak; NL, normalized intensity level.
FIG. 8 shows linker phosphorylation in HC1-F stably expressed in Chinese hamster ovary cells. (A) Extracted ion current (EIC) chromatograms (z = 2) of the unmodified (elution time 15.6 min) and phosphorylated (elution time 16.85 min) thermolysin-digested glycine-serine linker I peptide (XGGGSGGGGSREX3; SEQ ID NO:09). The amount of phosphorylated peptide was quantified at 0.4%. (B) HCD-MS/MS data for the doubly protonated modified thermolysin-digested peptide. MA, manually integrated peak; NL, normalized intensity level.
FIG. 9 shows O-xylosylation and O-glycosylation on fusion protein 2 (produced in CHO cells; HC2-F) after reduction and analysis by UHR-QTOF-ESI-MS.
FIG. 10 shows the extracted ion chromatogram obtained for HC2-F. The major signals are caused by Fc N-glycans (RT 16 min, 62 to 66 min). O-glycans were identified (RT 43 to 46 min) for the peptide SLSLSLSLSLSLSPGGGGSGGGGSGGGGSGGGGSIQMTX 13 (SEQ ID NO:10), which includes the glycine-serine linker.
FIG. 11 shows MS/MS for HC2-F, including the O-glycosylated peptide fragment of FIG. 10 with O-glycan and peptide identification.
FIG. 12 shows the results of MS/MS of the peptide of SEQ ID NO:10 for HC2-F. The use of EThcD/HCD MS2 allowed the localization of GalNAc (+203 Da) and GalNAc-Gal-Neu5Ac (+656 Da) to the C-terminal serine residue of the GS peptide linker. In addition, several O-xylose sites (+132 Da) could be localized to other serine residues in the peptide linker.
FIG. 13 shows the relative quantification of O-glycans of the HC peptide fragment of SEQ ID NO:10 relative to the sum of unmodified and O-xylosylated peptides for HC 2-F.
FIG. 14A shows O-glycosylation on fusion protein 3(LC 3-F; produced in HEK cells) after UHR-QTOF-ESI-MS analysis of the intact protein. Fig. 14B shows a deconvolution mass spectrum of an N-deglycosylated intact protein including LC3-F produced in HEK cells, fig. 14C shows a deconvolution mass spectrum of an N-deglycosylated and desialylated intact protein including LC-3F, and fig. 14D shows a deconvolution mass spectrum of an N-deglycosylated and reduced LC 3-F. The determined molecular weights of the N-deglycosylated intact protein or the annotated variants of LC-3F are listed (annotated: B: +364.8Da (GalNAc-Gal), +656.8Da (GalNAc-Gal-Neu5Ac), +948.2Da (GalNAc-Gal-2Neu5Ac), +1313.1Da (2xGalNAc-Gal-Neu5Ac), +1602.2Da (GalNAc-Gal-Neu5Ac and GalNAc-Gal-2Neu5Ac), +1896.7(2xGalNAc-Gal-2Neu5Ac),. C: +365.7Da (GalNAc-Gal), +731.0Da (2xGalNAc-Gal) and the chain A (C: +365.5Da (Gal-Gal), +729.8Da (2 xGal-Gal) Gal-Gal) and the GalNAc-Gal content of the annotated variants of N-3F are approximately equal to a circle (+.365.5 Da (approximately 8).
FIG. 15 shows a deconvoluted mass spectrum of the OpeRATOR digest for LC3-F. The mass of the fragment corresponds to O-glycosylation of threonine residues of the hinge region. OpeRATOR is derived from Akkermansia muciniphila and is expressed in E. coli. The enzyme contains a His-tag and has a molecular weight of 42 kDa.
FIG. 16 shows XICs of LC3-F after desialylation and tryptic digestion with (upper XIC) or without (lower XIC) OpeRATOR protease. Peptide mapping analysis (LC-MS/MS) shows that the O-glycosylation (+365.13 Da = GalNAc-Gal) is located at the threonine residue of the sequence GGGSGGGGSGGGGSGGGGSGGGGTCPPCPAPEAAGGPSVFLFPPKPK (SEQ ID NO:11), i.e., at the C-terminus of the GS peptide linker. Shown are the XICs of the O-glycosylated tryptic peptide cleaved at the O-glycosylated threonine residue (T(+365.13)C(+59.01)PPC(+59.01)PAPEAAGGPSVFLFPPKP; SEQ ID NO:12) and of the unmodified tryptic peptide (XX(+59.01)GGGSGGGGSGGGGSGGGGSGGGGTC(+59.01)PPC(+59.01)PAPEAAGGPSVFLFPPKPK; SEQ ID NO:13). +59.01 = carboxymethylation.
FIG. 17 shows MS/MS of desialylated tryptic peptides digested with OpeRATOR protease for LC3-F, which localized the O-glycosylation (+365.13 Da = GalNAc-Gal) to the N-terminal threonine residue. +59.01 = carboxymethylation.
FIG. 18 shows that O-fucosylation (+146 Da) is present for fusion protein 4 (HC4-F) after thermolysin digestion and can be mapped to peptide LGGGGSGGGGSRT (SEQ ID NO:14) by peptide mapping analysis (LC-MS/MS). Shown are the XICs of the modified (lower XIC, 2.06%) and unmodified (upper XIC) thermolysin-digested peptide fragments.
FIG. 19 shows the results of spiking the modified synthetic peptide (27-310 nM) X8LGGGGSGGGGS(+fucose)RT (SEQ ID NO:15) into the tryptic digest for HC4-F. The modified synthetic peptide co-eluted with the modified HC4-F tryptic peptide X8LGGGGSGGGGSRT +146 Da and increased the area under the curve. Shown are the XIC without spike (top panel) and the XICs at increasing spike levels (bottom panel).
FIG. 20 shows the MS/MS results for the HC4-F tryptic peptide, which localize the O-fucosylation (+146 Da) to the C-terminal serine residue.
FIG. 21 shows the mass spectra after N-deglycosylation of the intact protein and analysis by UHR-QTOF-ESI-MS for the variant of LC3-F (upper panel: full scale; lower panel: magnified). In the variant of LC3-F with a linker comprising the amino acid sequence GGGGSGGGGSGGGGSGGGGSGGGGSGGGGGCPPC (SEQ ID NO:23), no O-glycosylation was present.
FIG. 22 shows the mass spectra after N-deglycosylation of the reduced protein and analysis by UHR-QTOF-ESI-MS for the variant of LC3-F (upper panel: full scale; lower panel: magnified). In the variant of LC3-F with a linker comprising the amino acid sequence GGGGSGGGGSGGGGSGGGGSGGGGSGGGGGCPPC (SEQ ID NO:23), no O-glycosylation was present.
Examples of the invention
Summary of the samples:
Fusion protein 1 (HC1-F): phosphorylation of a serine residue in the linker GGGGSGGGGSRE (SEQ ID NO:09) (see FIGS. 1-8)
Fusion protein 2 (HC2-F): O-glycosylation of a serine residue in the linker SLSLPGGGGSGGGGSGGGGSGGGGSIQM (SEQ ID NO:10) (see FIGS. 9-13)
Fusion protein 3 (LC3-F): O-glycosylation of a threonine residue in the linker GGGSGGGGSGGGGSGGGGSGGGGTCPPCPAPEAAGGPSVFLFPPKPK (SEQ ID NO:11) (see FIGS. 14-17)
Fusion protein 4 (HC4-F): O-fucosylation of a serine residue in the linker LGGGGSGGGGSRT (SEQ ID NO:14) (see FIGS. 18-20)
Variant of fusion protein 3 (LC3-F): no O-glycosylation in the linker comprising the amino acid sequence GGGGSGGGGSGGGGSGGGGSGGGGSGGGGGCPPC (SEQ ID NO:23) (see FIGS. 21-22).
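The modifications summarized above correspond to characteristic mass shifts. The following minimal Python sketch illustrates how an observed mass shift might be matched to a candidate modification. The monoisotopic residue masses are standard values; the candidate list, the function names and the default tolerance are illustrative assumptions and are not part of the described method. Because the values are monoisotopic, shifts read from intact-protein (average-mass) spectra would need a wider tolerance.

```python
# Monoisotopic residue masses (Da) of modification building blocks.
RESIDUE_MASS = {
    "HexNAc": 203.0794,   # e.g. GalNAc
    "Hex": 162.0528,      # e.g. Gal
    "Neu5Ac": 291.0954,
    "dHex": 146.0579,     # e.g. fucose
    "Pent": 132.0423,     # e.g. xylose
    "Phospho": 79.9663,   # HPO3
}

# Illustrative candidate list (an assumption, not an exhaustive search space).
CANDIDATES = {
    "GalNAc": {"HexNAc": 1},
    "GalNAc-Gal": {"HexNAc": 1, "Hex": 1},
    "GalNAc-Gal-Neu5Ac": {"HexNAc": 1, "Hex": 1, "Neu5Ac": 1},
    "GalNAc-Gal-2Neu5Ac": {"HexNAc": 1, "Hex": 1, "Neu5Ac": 2},
    "Fuc": {"dHex": 1},
    "Xyl": {"Pent": 1},
    "Phosphate": {"Phospho": 1},
}


def composition_mass(composition):
    """Sum the residue masses of a composition given as {residue: count}."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


def annotate(delta_da, tol_da=0.05):
    """Return the names of all candidates whose mass lies within tol_da of delta_da."""
    return sorted(
        name
        for name, composition in CANDIDATES.items()
        if abs(composition_mass(composition) - delta_da) <= tol_da
    )


if __name__ == "__main__":
    for shift in (365.13, 146.06, 132.04, 79.97):
        print(shift, annotate(shift))
```

With nominal shifts such as +656 Da, a wider tolerance (for example `annotate(656.0, tol_da=0.5)`) would be required.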
Trypsin, thermolysin, neuraminidase and alkaline phosphatase from bovine intestinal mucosa were purchased from Sigma-Aldrich.
Synthetic peptides (HPLC purity 98%) were synthesized at Biosyntan GmbH.
PNGase F was obtained from Roche Diagnostics GmbH, Custom Biotech.
The OpeRATOR protease was obtained from Genovis AB (Lund, Sweden). OpeRATOR is an O-protease that digests O-glycosylated proteins N-terminal to the S/T glycosylation site; it requires the presence of O-glycans for its enzymatic activity.
Example 1:
Enzymatic digestion
UHR-ESI-QTOF-MS
The fusion protein was deglycosylated using PNGase F (with or without neuraminidase present), reduced in 100 mM TCEP and desalted by HPLC on a Sephadex G25 5 x 250 mm chromatography column (Amersham Biosciences) using 40% acetonitrile and 2% formic acid (v/v) as mobile phase. Total mass was determined by ESI-QTOF MS on a maXis 4G UHR-QTOF MS system (Bruker Daltonik) equipped with a TriVersa NanoMate source (Advion). Calibration was performed using ESI-L Low Concentration Tuning Mix (Agilent Technologies). To visualize the results, the m/z spectra were converted to deconvoluted mass spectra using a software tool.
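The conversion of m/z spectra to deconvoluted mass spectra relies on the relationship between m/z, charge state and neutral mass. The following minimal Python sketch shows only this arithmetic for positive-mode ESI with protonated ions; it is not the software tool used above, and the function names are illustrative assumptions.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz_of(neutral_mass, z):
    """m/z of an ion carrying z protons."""
    return neutral_mass / z + PROTON


def neutral_mass(mz, z):
    """Neutral mass from an observed m/z and a known charge state z."""
    return z * (mz - PROTON)


def charge_from_adjacent(mz_low, mz_high):
    """Charge of the peak at mz_high, assuming the peak at mz_low belongs to
    the same species and carries exactly one proton more."""
    return round((mz_low - PROTON) / (mz_high - mz_low))


if __name__ == "__main__":
    # Two neighbouring charge states of a hypothetical 50 kDa species.
    high, low = mz_of(50000.0, 25), mz_of(50000.0, 26)
    z = charge_from_adjacent(low, high)
    print(z, neutral_mass(high, z))
```

Deconvolution software applies this relationship across the complete charge-state envelope; the two-peak case is shown only for clarity.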
OpeRATOR digestion
Prior to reduction, denaturation and carboxymethylation, the fusion protein was N-deglycosylated using PNGase F and desialylated using neuraminidase, and finally digested with O-glycan specific endoprotease as per the instructions of the supplier (OpeRATOR, Genovis AB, Sweden). Digests were analyzed by UHR-ESI-QTOF-MS.
Example 2:
UPLC-MS/MS peptide mapping analysis
The fusion protein was denatured and reduced in 0.3 M Tris-HCl (pH 8) containing 6 M guanidine hydrochloride and 20 mM dithiothreitol (DTT) at 37 °C for 1 hour. Thereafter, the fusion protein was alkylated by the addition of 40 mM iodoacetic acid (C13: 99%) (Sigma-Aldrich) and incubated for 15 minutes at room temperature in the dark. Excess iodoacetic acid was inactivated by adding a further 20 mM DTT to the reaction mixture. The alkylated fusion protein was buffer exchanged using a NAP5 gel filtration column. The fusion protein was then digested with trypsin in 50 mM Tris-HCl (pH 7.5) under proteolytic conditions for 16 h at 37 °C (with or without OpeRATOR protease). The reaction was stopped by adding formic acid to 0.4% (v/v). Thermolysin digestion was performed in 25 mM Tris-HCl, 1 mM CaCl2 (pH 8.3) at 25 °C for 30 minutes and was stopped by adding EDTA to 8 mM. The digested samples were stored at -80 °C and analyzed by UPLC-MS/MS using a TriVersa NanoMate (Advion) coupled to an Orbitrap Elite mass spectrometer (Thermo Fisher Scientific). Approximately 2.4 μg of digested fusion protein was injected in a volume of 5 μL. Chromatographic separation was performed by reversed-phase chromatography on an Acquity BEH300 C18 column (1 x 150 mm, 1.7 μm; Waters). Mobile phases A and B contained 0.1% (v/v) formic acid in UPLC grade water and acetonitrile, respectively. The column temperature was 50 °C. The gradient ran from 1% to 40% mobile phase B over 90 minutes, then increased to 99% mobile phase B for 2 minutes, followed by a 6-minute re-equilibration step at 1% mobile phase B. Between sample injections, two injections of mobile phase A with a 50-minute gradient were performed to prevent carryover between samples. The effluent was split post-column using the TriVersa NanoMate and the nanoliter flow fraction was introduced into the mass spectrometer.
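The gradient described above can be written as a simple time program. The following Python sketch returns the percentage of mobile phase B at a given time; it assumes that the changes to 99% and back to 1% mobile phase B are instantaneous steps, which the description does not specify, and the function name is illustrative.

```python
def percent_b(t_min):
    """Percentage of mobile phase B at time t_min (minutes) of the 98-minute program.

    Assumption: the transitions at 90 min (to 99% B) and 92 min (to 1% B)
    are treated as instantaneous steps.
    """
    if not 0 <= t_min <= 98:
        raise ValueError("time outside the 0-98 min gradient program")
    if t_min <= 90:
        # Linear ramp from 1% to 40% B over 90 minutes.
        return 1.0 + (40.0 - 1.0) * t_min / 90.0
    if t_min <= 92:
        # Wash at 99% B for 2 minutes.
        return 99.0
    # Re-equilibration at 1% B for 6 minutes.
    return 1.0


if __name__ == "__main__":
    for t in (0, 45, 90, 91, 95):
        print(t, percent_b(t))
```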
High-resolution MS spectra were acquired using the Orbitrap mass analyser, and CID MS/MS fragment ion spectra were detected in parallel in the ion trap with dynamic exclusion enabled (repeat count 1, exclusion time 15 s (±10 ppm)). The Orbitrap Fusion was used in data-dependent mode.
The MS settings were as follows: full MS (AGC: 2 x 10^5, resolution: 6 x 10^4, m/z range: 300-2000, maximum injection time: 100 ms); MS/MS (AGC: 1 x 10^4, maximum injection time: 100 ms, isolation width: 2 Da). Normalized collision energy was set to 35%, activation Q: 0.25, isolation width: 2 Da.
For the method acquiring only HCD MS/MS spectra, HCD Orbitrap MS/MS spectra were acquired for up to 20 of the most abundant ions after the full MS scan in the Orbitrap. The AGC setting for MS/MS experiments was 5 x 10^4 and the maximum injection time was 500 ms. The normalized collision energy was set to 20%, and HCD fragment ions were detected in the Orbitrap at a resolution setting of 15 x 10^3. All other settings were the same as described for the method using only CID fragmentation.
The complementary EThcD method, based on HCD and ETD as data-dependent fragmentation techniques, involved a full MS scan acquired using the Orbitrap mass analyser and parallel detection of ETD and HCD fragment ion spectra in the ion trap and the Orbitrap mass analyser, respectively. In order to acquire as many data-dependent MS/MS scans as possible per full scan, a fixed cycle time was set.
Full MS: the same settings as for CID and HCD.
For HCD, the MS/MS settings were the same as listed above.
For ETD, the MS/MS settings were as follows: reaction time: 50 ms, ETD reagent target: 1 x 10^6, maximum injection time: 200 ms. Supplemental activation was enabled for ETD, with the supplemental activation collision energy set to 25%. The AGC target was set to 1 x 10^4, the precursor isolation width was 2 Da and the maximum injection time was set to 250 ms.
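The dynamic exclusion setting mentioned above (repeat count 1, exclusion time 15 s, ±10 ppm) can be illustrated with the following minimal Python sketch. It models only the selection logic under these three parameters; the class and method names are illustrative assumptions and do not reflect the instrument control software.

```python
class DynamicExclusionList:
    """Exclude a precursor m/z for a fixed time after it has been selected once."""

    def __init__(self, exclusion_s=15.0, tol_ppm=10.0):
        self.exclusion_s = exclusion_s
        self.tol_ppm = tol_ppm
        self._entries = []  # list of (mz, time_selected_s)

    def _matches(self, mz, ref_mz):
        # Relative deviation in ppm with respect to the stored m/z.
        return abs(mz - ref_mz) / ref_mz * 1e6 <= self.tol_ppm

    def try_select(self, mz, t_s):
        """Return True and register the precursor if it is not currently excluded."""
        # Drop entries whose exclusion time has elapsed.
        self._entries = [
            (m, t0) for m, t0 in self._entries if t_s - t0 < self.exclusion_s
        ]
        if any(self._matches(mz, m) for m, _ in self._entries):
            return False
        self._entries.append((mz, t_s))
        return True


if __name__ == "__main__":
    dx = DynamicExclusionList()
    print(dx.try_select(800.4000, 0.0))   # first selection
    print(dx.try_select(800.4040, 5.0))   # about 5 ppm away, still excluded
    print(dx.try_select(800.4000, 16.0))  # exclusion time elapsed
```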
Analysis of LC-MS/MS data and identification of post-translational modifications (PTMs) were performed using pre-treatment options and PepFinder software (Thermo Fisher Scientific) as well as PEAKS Studio 6.0 and 7.5 software (Bioinformatics Solutions Inc.). Manual data interpretation and quantification were performed using XCalibur software (Thermo Fisher Scientific). Theoretical masses were calculated using GPMAW (Lighthouse data), and XICs were generated with the most intense isotope mass using a mass tolerance of 8 ppm.
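The generation of an XIC with a mass tolerance of 8 ppm, as mentioned above, can be sketched as follows. This minimal Python example assumes centroided scans given as lists of (m/z, intensity) pairs; the data layout and function names are illustrative assumptions and do not represent the XCalibur implementation.

```python
PROTON = 1.007276  # mass of a proton in Da


def peptide_mz(monoisotopic_mass, z):
    """m/z of a peptide of the given neutral mass carrying z protons."""
    return (monoisotopic_mass + z * PROTON) / z


def xic_window(target_mz, tol_ppm=8.0):
    """Lower and upper m/z bounds for the given ppm tolerance."""
    delta = target_mz * tol_ppm * 1e-6
    return target_mz - delta, target_mz + delta


def extract_xic(scans, target_mz, tol_ppm=8.0):
    """Sum the intensity inside the tolerance window for every scan.

    scans: list of (retention_time, [(mz, intensity), ...])
    returns: list of (retention_time, summed_intensity)
    """
    lo, hi = xic_window(target_mz, tol_ppm)
    return [
        (rt, sum(intensity for mz, intensity in peaks if lo <= mz <= hi))
        for rt, peaks in scans
    ]


if __name__ == "__main__":
    demo = [(1.0, [(500.000, 10.0), (500.003, 5.0), (500.010, 7.0)])]
    print(extract_xic(demo, 500.0))
```

At m/z 500 the 8 ppm tolerance corresponds to a window of ±0.004, so the peak at m/z 500.010 in the demo scan is excluded.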
Example 3:
Spiking of synthetic peptides
Based on the calculated amount of modified peptide, a corresponding amount of synthetic peptide was spiked into the tryptic digest. 2.4 μg of tryptic digest with or without spiked synthetic peptide was analyzed by LC-MS/MS as described above.
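One way in which the amount of modified peptide might be estimated is from the relative XIC areas and the injected amount of digest. The following Python sketch is an illustration only: the protein molecular weight of 100,000 g/mol is a hypothetical placeholder, the modified fraction of 2.06% is used as an illustrative input, and one linker peptide per protein chain with complete digestion is assumed. The function names are likewise assumptions.

```python
def modified_fraction(area_modified, area_unmodified):
    """Relative abundance of the modified peptide from XIC peak areas."""
    return area_modified / (area_modified + area_unmodified)


def modified_peptide_nM(injected_ug, volume_uL, protein_mw_g_per_mol, fraction):
    """Estimated concentration (nM) of the modified peptide in the injected sample.

    Assumes one linker peptide per protein chain and complete digestion.
    """
    mol_protein = injected_ug * 1e-6 / protein_mw_g_per_mol
    molar = mol_protein / (volume_uL * 1e-6)
    return molar * 1e9 * fraction


if __name__ == "__main__":
    HYPOTHETICAL_MW = 100_000.0  # g/mol, placeholder value, not from the source
    print(modified_peptide_nM(2.4, 5.0, HYPOTHETICAL_MW, 0.0206))
```

Under these assumptions the estimate is approximately 99 nM; a different molecular weight would scale the result inversely.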
Example 4:
Enzymatic dephosphorylation
For enzymatic dephosphorylation of the tryptic linker peptide carrying the +79.97 Da modification, approximately 62 μg of the tryptic digest was lyophilized. The peptides were resuspended in 25 μL of 100 mM Tris-HCl, 5 mM MnCl2 (pH 8.0) and incubated with 250 units of alkaline phosphatase at 37 °C for 1 h. The digested samples were stored at -80 °C. MS analysis was performed as described above.
It will be readily appreciated that the embodiments as generally described herein are exemplary. The detailed description of some embodiments presented and the examples presented are not intended to limit the scope of the disclosure, but are merely representative of the embodiments. Moreover, the order of the steps or actions of the methods disclosed herein may be altered by those skilled in the art without departing from the scope of the disclosure. In other words, unless a specific order of steps or actions is required for proper operation of the embodiment, the order or use of specific steps or actions may be modified.