Missense variants in the TNFA epitopes and their effects on interaction with therapeutic antibodies—in silico analysis

Background Tumor necrosis factor alpha (TNFA) is an important cytokine that influences multiple biological processes. It is one of the key mediators of acute and chronic systemic inflammatory reactions and plays a central role in several autoimmune diseases. A number of approved monoclonal antibodies (mAbs) are widely used to subside these autoimmune diseases. However, there is a high rate of non-responsiveness to treatments with these mAbs. Therefore, it is important to be able to predict responses of the patients in an individualistic manner to these therapeutic antibodies before administration. In the present study, we used in silico tools to explore the effects of missense variants in the respective epitopes of four therapeutic anti-TNFA mAbs—adalimumab (ADA), certolizumab pegol (CZP), golimumab (GLM), and infliximab (IFX)—on their interactions with TNFA. Results The binding affinities of CZP and ADA to corresponding epitopes appear to be reduced by four (TNFAR131Q, TNFAE135G, TNFAR138Q, and TNFAR138W) and two (TNFAG66C and TNFAG66S) variants, respectively. The binding of GLM and IFX appears to be affected by TNFAR141S and TNFAR138W, respectively. TNFAG66C and TNFAG66S may be associated with autoimmune diseases, whereas TNFAE135G, TNFAR138W, and TNFAR141S may be pathogenic per se. Conclusion These variants may contribute to the observed inter-individual variability in response to anti-TNFA mAbs treatments and be used as markers to predict responses, and thus optimize therapeutic benefits to the patients. Supplementary Information The online version contains supplementary material available at 10.1186/s43141-021-00288-y.

TNFA is one of the key mediators of acute and chronic systemic inflammatory reactions. TNFA induces its own secretion as well as the production of other inflammatory cytokines and chemokines, and thus plays a central role in several autoimmune diseases such as rheumatoid arthritis (RA), inflammatory bowel disease (IBD) including Crohn's disease (CD), ulcerative colitis (UC), multiple sclerosis (MS), systemic lupus erythematosus (SLE), systemic sclerosis, psoriasis (PS), psoriatic arthritis (PsA), and ankylosing spondylitis (AS) [11][12][13]. Modulation of inflammation by antibody-based inhibitors of receptor-TNFA interactions leads to dramatic improvements in patients suffering from these autoimmune diseases [14]. Several FDA-approved anti-TNFA monoclonal antibodies (mAbs) such as adalimumab (ADA), certolizumab pegol (CZP), golimumab (GLM), and infliximab (IFX) are commercially available and widely used to treat such diseases [15].
ADA is a fully human anti-TNFA mAb [16], whereas CZP is composed of a humanized F ab fragment (50 kDa) of IgG1 isotype conjugated to a 40-kDa polyethylene glycol (PEG) moiety [17]. GLM is a human IgG1 mAb, which was originally produced in transgenic mice immunized with human TNFA, but is commercially produced in a recombinant cell line [18]. IFX is a chimeric monoclonal IgG1 antibody with human constant and murine variable regions [19]. ADA, CZP, GLM, and IFX bind to sites on TNFA that overlap with its sites of interaction with the receptor [20][21][22][23].
ADA, CZP, GLM, and IFX are widely prescribed to treat RA, CD, UC, PS, PsA, and AS. In fact, ADA and IFX are among the top-selling therapeutic mAbs [24]. These protein-based therapeutics are quite expensive [25,26]. However, their clinical efficacy is sometimes limited due to a high rate (30-40%) of non-responsiveness [27,28]. Hence, it is important to be able to predict responses of the patients in an individualistic manner to these therapeutic antibodies before administering them. To the best of our knowledge, no single-nucleotide polymorphism (SNP) or variant in TNFA epitopes has been associated with responses to corresponding mAbs. In this study, we investigated the interactions of ADA, CZP, GLM, and IFX mAbs with their respective epitopic variants in TNFA using in silico tools.

Missense variants in the epitopes for anti-TNFA monoclonal antibodies
The amino acid residues of TNFA that constitute the epitopes for ADA, CZP, GLM, and IFX were retrieved through literature search [20][21][22][23]. Missense variants within the TNFA epitopes were obtained using the Ensembl Genome Browser [29]. X-ray crystallographic structures of TNFA-ADA (PDB  ID: 3WD5), TNFA-CZP (PDB ID: 5WUX), TNFA-GLM  (PDB ID: 5YOY), and TNFA-IFX (PDB ID: 4G3Y) complexes were retrieved from the Protein Data Bank (PDB) [30]. Since 3WD5 and 4G3Y contained a single light and a heavy chain of the antibody with a single epitope, these complexes were not customized any further. 5WUX and 5YOY contain complex structures with multiple epitopes and heavy and light chains, and therefore required customization for analysis. H (CZP heavy chain), L (CZP light chain), and E (TNFA molecule) chains from TNFA-CZP, and I (GLM heavy chain), F (GLM light chain), and A (TNFA molecule) chains from TNFA-GLM X-ray crystallographic structures were retained for further analysis. The rest of the chains were removed using UCSF Chimera 1.14 [31] to retain interactions between a single antibody molecule and a single TNFA protomer. These four structures of antigen-antibody complexes were used as inputs to predict the effects of the epitopic missense variants on interaction with the relevant antibodies using mCSM-PPI2 [32], SAAMBE-3D [33], and MutaBind [34].

3D modeling and stabilizing energy calculation of TNFA variant-mAb complexes
The amino acid sequence of TNFA (UniProt accession number: P01375) was retrieved from UniProt [35]. Deletion of the signal peptide (first 76 residues) and substitution of residues at the variant sites was performed manually. Amino acid sequences of the heavy and light chains of the mAbs were obtained from the Therapeutic Structural Antibody Database (Thera-SAbDab) [36]. 3D models of TNFA-mAb complexes were generated using template-based modeling in SWISS-MODEL [37]. X-ray crystallographic structures of the four TNFA-mAb complexes (previously mentioned) were used as templates. Chains in the models were renamed (wherever applicable) to H (mAb heavy chain), L (mAb light chain), and A (TNFA molecule) using UCSF Chimera 1.14 [31]. Stabilizing energies of TNFA-mAb complexes were calculated using the PPCheck web server [38].

Analyses of structures, interfaces, and interactions
Changes in the TNFA structure due to the variants were predicted using Missense3D [39]. 2D maps of interactions between TNFA and H and L chains of mAbs were generated with iCn3D using the default parameters [40]. Distances between selected atoms and areas of interacting surfaces between TNFA and mAb chains were measured using UCSF Chimera 1.14 [31] and iCn3D [40], respectively. The number of bonds and interactions (van der Waals interactions, H-bonds, weak H-bonds, and ionic interactions) between TNFA and mAb chains was calculated with Arpeggio-a web server for calculating interatomic interactions in protein structures [41]. Antigen-antibody interfaces were visualized using PyMOL [42]. In all cases, 3D models of TNFA-mAb complexes were used as the inputs.

Prediction of the variants' effects on TNFA-receptor interactions
To assess whether the variants predicted to significantly affect the TNFA-mAb interactions would also affect interactions between TNFA and its receptors, X-ray crystallographic structure of the TNFA-TNFR2 complex (PDB ID: 3ALQ) was retrieved from the PDB [30]. The PDB file contained multiple TNFA and TNFR2 chains. Therefore, the structure was customized by keeping only chain A and removing all the other chains using UCSF Chimera 1.14 [31]. TNFA-TNFR2 interactions were checked using iCn3D [40].
Only two chains (R and T) of TNFR2 were found to form bonds with the selected TNFA chain. So, these two chains were retained and the other chains were deleted. The PDB file thus prepared was used as an input in mCSM-PPI2 [32], SAAMBE-3D [33], and MutaBind2 [43] to assess the effects of selected missense variants on TNFA-TNFR2 interactions. We could not retrieve any X-ray crystallographic structure of TNFA-TNFR1 complex.

Missense variants in the epitopes of anti-TNFA monoclonal antibodies
Amino acid residues that constitute the epitopes of TNFA for ADA [20], CZP [21], GLM [22], and IFX [23] were retrieved through literature survey (Table 1). Based on the data collected via Ensembl, eleven missense variants were found in each of the epitopes for ADA and CZP (Tables 2 and 3). Seven and ten missense variants were found in the epitopes for GLM and IFX, respectively (Tables 4 and 5).

Binding affinities of TNFA epitopic variants to anti-TNFA therapeutic antibodies
The effects of the identified missense variants on TNFA-mAb interactions were predicted in two different ways. First, changes in the binding affinity (∆∆G = ∆G mutant − ∆G wild-type ) were predicted using the crystallographic structures with mCSM-PPI2 [32], SAAMBE-3D [33], and MutaBind [34]. Second, the 3D models of TNFA-mAb complexes (both wild-type and epitopic variants of TNFA) were generated and total stabilizing energies of the complexes were determined using PPCheck [38]. In all cases, ∆∆G values of > 1.0 kcal/mol (4.18 kJ/mol) were considered to be a significant reduction in binding affinity [49]. In case of antigen-antibody interactions, |∆∆G| > 1.0 kcal/mol may be considered as an indication of significant changes in antibody binding affinity [50]. None of the variants was predicted by every tool to reduce the binding affinity of ADA to TNFA by more than 1.0 kcal/ mol ( Table 2). Only the mCSM-PPI2 and MutaBind predicted values of ∆∆G due for TNFA G66C and TNFA G66S (rs1454071630) variants were > 1.0 kcal/mol. Since ∆∆G values predicted by SAAMBE-3D were very close to 0.53 kcal/mol [33] in both cases, TNFA G66C and TNFA G66S might actually destabilize the TNFA-ADA complex. PPcheck predicted interactions between these epitopic variants of TNFA with the L chain of ADA were weaker as well. Although MutaBind predicted weaker interactions of these variants with both L and H chains of ADA, binding to the L chain was more severely affected (∆∆G > 2.3 kcal/mol) than that of the H chain (∆∆G ≈ 1.6 kcal/mol). Table 1 Epitopes of TNFA for adalimumab, certolizumab pegol, golimumab and infliximab Epitopes with TNFA R131Q (rs376368223), TNFA E135G (rs777545571), TNFA R138Q (rs770509340), and TNFA R138W (rs141307820) variants were predicted to have weaker interactions with CZP, although none of the variants was predicted to significantly decrease (∆∆G > 1.0 kcal/mol) by all the tools (Table 3). Although mCSM-PPI2, SAAMBE-3D, and MutaBind predicted destabilizing effect of TNFA E135k (rs758189183) on interaction with CZP, this finding was not corroborated with the 3D model. TNFA R131Q weakened the interactions between TNFA and the H chain of CZP. On the other hand, TNFA E135G , TNFA R138Q , and TNFA R138W weakened interactions between TNFA and the L chain of CZP.
Significant reduction in affinity (∆∆G > 1.0 kcal/mol) of GLM for its TNFA epitope was convincingly predicted to be caused due to the TNFA Y141S (rs763621932) variant by all the tools used in this study (Table 4). Both MutaBind and PPCheck predicted weaker interactions between TNFA Y141S and the H chain of GLM. mCSM-PPI2, SAAMBE-3D, and MutaBind predicted significantly weakened interaction (∆∆G > 1.0 kcal/mol) between TNFA R138W (rs141307820) variant and IFX (Table 5). But the effect could not be replicated with 3D modeling.

Effects of epitopic variants on TNFA structure and interaction with anti-TNFA antibodies and TNFR2
In TNFA WT , the Glu23 residue was not in contact (cutoff value 4) with Asp1 residue in the L chain of ADA (Fig. 1A). The distance between Glu23 in TNFA (OE1) and Asp1 residue in the L chain was 4.065 Å (Fig. 1D). But both TNFA G66C and TNFA G66S variants changed the conformation and brought those two residues in contact with one another by reducing the distance to < 3.9 Å (Fig. 1B, 1C, 1E and 1F). TNFA G66C and TNFA G66S variants also caused structural changes in TNFA by expanding the cavity (surface pocket) by > 70 Å 3 (Table 6). Furthermore, these two variants increased the buried surface area between TNFA and the L chain of ADA by  > 100 Å 2 compared to the wild-type complex, although the rest of the surface area of these two chains remained similar (Table 6). Additionally, alterations in the antigen-antibody interface region were observed in the case of these two variants (Fig. 2). These structural changes affected interactions between TNFA and the L chain of ADA mainly through the loss or weakening of van der Waals interactions and H-bonds. Change in TNFA structure or interaction at the TNFA-CZP interface was not caused by any of the four variants (TNFA R131Q , TNFA E135G , TNFA R138Q , and TNFA R138W ) that were predicted by mCSM-PPI2, SAAMBE-3D, and MutaBind to destabilize the complex ( Table 6). The ionic interaction between Arg131 in TNFA and Asp31 in the H chain of CZP was lost due to TNFA R131Q (Fig. 3A and B). This loss of interaction caused structural changes in the antigen-antibody interface region (Fig. 4B). The H-bond between Glu135 of TNFA and Tyr49 in the L chain of CZP was lost in the presence of TNFA E135G (Fig. 3C and D). Both TNFA R138Q and TNFA R138W caused the loss of H-bond between Arg138 of TNFA and Tyr60 in the L chain of CZP ( Fig. 3E and F). These H-bond losses changed patterns of interacting residues in the CZP L chain (Fig. 4E and F).
None of the TNFA Y141S and TNFA R138W variants caused any changes in TNFA structure or the interfaces of TNFA-GLM H chain and TNFA-IFX L chain, respectively (Table 6). TNFA Y141S caused the loss of the H-bond between Tyr141 of TNFA and Tyr111 in the H chain of GLM ( Fig. 5A and B). TNFA R138W caused the loss of cation-pi interaction between Arg138 of TNFA and Phe96 in the L chain of IFX ( Fig. 5C and D). These alterations however did not cause any noticeable changes at antigenantibody interfaces (Fig. 6).

Pathogenicity of the variants
Based on the data retrieved from DisGeNET and Phe-noScanner databases, rs1454071630 (TNFA G66C and TNFA G66S ) is associated with generalized pustular psoriasis [51]. No other variant among the seven selected ones has been associated with any disease yet. But along with TNFA G66C and TNFA G66S , TNFA E135G was also predicted to have deleterious effects by all of the predictive tools (Table 7). Multiple tools predicted the TNFA R138W and TNFA Y141S variants to be disease-causing, whereas all tools used in this study predicted TNFA R131Q and TNFA R138Q to be non-pathogenic.

Discussion
The aim of this study was to investigate the effects of epitopic missense variants in TNFA on interaction with corresponding therapeutic anti-TNFA antibodies and identify the candidate variants that may be an important predictor of responsiveness to these antibodies. In order to do so, the effects of those variants on the binding affinity of ADA, CZP, GLM, and IFX to TNFA were predicted using mCSM-PPI2, SAAMBE-3D, and MutaBind. These tools use different methods to predict ∆∆G. mCSM-PPI2 is a machine learning method that utilizes graph-based structural signatures known as mCSM [32]. These signatures encode distance patterns between atoms and can represent the protein residue environment and are used to train predictive models [52]. SAAMBE-3D is a development of the SAAMBE method, which uses the molecular mechanics/Poisson Boltzmann surface area (MM/ PBSA) and knowledge-based terms to predict ∆∆G and amino acid-specific dielectric constant to mimic mutation-induced conformational flexibility [33]. MutaBind method uses molecular mechanics force fields, statistical potentials, and fast side-chain optimization algorithms [34]. Tools that use different methods may yield conflicting results, but their combined usage can help clarify the impacts of mutations [53]. The predicted destabilizing effects in our study were also assessed using 3D models of TNFA-mAb complexes. Variants that were predicted to substantially destabilize the TNFA-mAb interactions were selected for further analyses such as prediction of changes in TNFA structures and TNFA-mAb interfaces and calculation of van der Waals interactions, H-bonds, weak H-bonds, and ionic interactions because of their importance in antigen-antibody interactions [54]. Although none of these tools is supposed to be 100% accurate, we minimized the chances of erroneous predictions by using multiple tools and making consensus decisions. Direct experimental methods may provide more accurate results, but the use of in silicotools to prioritize the candidate variants may save time and effort [55].   The binding affinity of ADA may be affected by TNFA G66C and TNFA G66S variants ( Table 2). Although Gly66 in TNFA does not directly interact with any residue of ADA, it is a part of the epitope [20]. Both TNFA G66C and TNFA G66S variants brought Glu23 of TNFA in contact with (within 4 Å) Asp1 of ADA in the L chain (Fig. 1C-F). Glu23 of TNFA contributes considerably to ADA binding [20]. Besides, two flanking residues of Gly66 in TNFA-Lys65 and Gln67-are essential for ADA binding [20]. Hence, the substitution of Gly66 in TNFA is likely to interfere with the binding affinity of ADA. Additionally, TNFA G66C and TNFA G66S significantly altered the TNFA structure in complex with ADA (cavity was expanded by > 70 Å 3 ), but not in its trimeric state (PDB ID: 1TNF). Such changes could affect the stability of TNFA in complex with ADA [39]. These variants also increased the buried surface between TNFA and the L chain of ADA by > 100 Å 2 and introduced observable changes in the antigen-antibody interface ( Fig. 2B and  C). Although the buried surface area between antigen and antibody may not reveal much information regarding the strength of their interactions [56], changes in such area may provide further evidence of structural alterations by TNFA G66C and TNFA G66S . As a result of these alterations, there was weakening and/or losses of van der Waals interactions and H-bonds between TNFA G66C and TNFA G66S with the L chain of ADA.
Among the other epitopic missense variant positions, substitution of Glu135 in TNFA with an alanine caused dramatic reduction in TNFA-ADA affinity [20]. But such drastic reduction in presence of TNFA E135G and TNFA E135K variants was not observed, which might be due to interaction of ADA with Glu135 in an adjacent TNFA protomer [20].
TNFA R131Q , TNFA E135G , TNFA R138Q , and TNFA R138W were predicted to affect TNFA-CZP interactions ( Table 3). Substitution of Arg138 in TNFA with an Ala drastically decreases binding affinity to CZP [21]. In agreement with this finding, we found that both TNFA R138Q and TNFA R138W missense variants at this position may substantially reduce interaction with CZP. We also observed changes in the patterns of interacting residues in the CZP L chain in the case of these two variants ( Fig. 4E and F). H-bond between Glu135 of TNFA and Tyr49 of CZP in the L chain plays a supplementary role in the energetics of TNFA-CZP interactions [21]. Although both TNFA E135G and TNFA E135K were predicted to destabilize the TNFA-CZP interactions with majority of the tools, these predictions could not be replicated with the 3D models of the TNFA E135K variant. Substitution of both Arg138 and Glu135 in TNFA interrupted the formation of H-bonds between these residues and two Tyr residues of CZP (Tyr60 and Tyr49, respectively) in the L chain (Fig. 2C-E). Interactions between residues in epitopes and Tyr residues in paratopes are extremely important in the context of antigen-antibody affinity [57]. TNFA R131A does not affect CZP binding affinity [21]. But in our study, TNFA R131Q was predicted and determined to reduce the affinity of the H chain of CZP to TNFA in most of the cases as it caused the loss of salt bridge and/ or ionic interaction between Arg131 of TNFA and Asp31 of CZP in the H chain ( Fig. 3A and B). Side-chain atoms (NH1) of Arg131 in TNFA participate in this interaction (detected with iCn3D). So, the substitution of this residue is supposed to eliminate this salt bridge/ionic interaction with CZP, which may cause structural changes in the antigen-antibody interface (Fig. 4B).
In the present study, TNFA Y141S was predicted to significantly destabilize (∆∆G > 1.0 kcal/mol) TNFA-GLM interactions by all the tools used (Table 4). This variant reduced the binding affinity of the H chain of GLM to TNFA. Tyr141 of TNFA forms the H-bond with Tyr110 of GLM in the H chain (Fig. 3A). TNFA Y141S resulted in the loss of this H-bond (Fig. 3B). TNFA R138W was predicted to drastically reduce (∆∆G > 1.0 Kcal/mol) interaction strength between TNFA and IFX by all predictive tools, but such reductions could not be detected in the 3D models of TNFA-IFX complex (Table 5). TNFA R138W caused the loss of cation-pi interaction between Arg131 of TNFA and Phe96 of IFX in the L chain ( Fig. 5C and  D). Cation-pi interactions play a crucial role in stabilizing antigen-antibody complexes [58].
None of the variants, except TNFA G66C and TNFA G66S , caused alteration in the TNFA structure in the complex with the corresponding mAbs or changed the buried surface area between TNFA and mAbs (Table 6). Besides, calculations by Arpeggio [41] hardly detected changes in the number of interactions in any other variant except TNFA G66C and TNFA G66S . So, in the case of TNFA G66C and TNFA G66S , structural alterations and eventual reductions in TNFA-ADA L chain interactions caused a decrease in TNFA-ADA affinity. In the case of the other three mAbs, the losses of some specific interactions (mostly H-bonds) caused reductions in the TNFA-mAb affinity. Another important finding of this study is that TNFA R138W may interrupt both TNFA-CZP and TNFA-IFX interactions. So, individuals with this variant may be poor responders to both CZP and IFX. Further studies are needed to confirm this prediction. None of these variants affect the distribution of the TNFA residues that play important role in TNFA-TNFR2 interactions [20,59]. So, these variants are less likely to affect interactions between TNFA and its receptor. This assumption was confirmed using predictive tools (Supplementary Table 1). The seven missense variants (TNFA G66C , TNFA G66S , TNFA R131Q , TNFA E135G , TNFA R138Q , TNFA R138W , and   TNFA Y141S ) are present at low frequencies in the global population. TNFA G66C and TNFA G66S , however, may be associated with autoimmune disease [51]. TNFA E135G was predicted to be pathogenic by all the tools used in this analysis (Table 7). Among these tools, SIFT uses sequence homology to predict the effect of an amino acid substitution on protein function [60]. PolyPhen2 uses multiple protein sequence alignment pipeline and machine learning classification method to predict a missense variant's pathogenicity [61]. PMut also uses the machine learning method [46]. Meta-SNP and Predict SNP 1.0 are consensus classifiers that utilize results from multiple high-performing classifier tools [47,48]. So, it is highly probable that TNFA E135G may be associated with disease. Besides, TNFA R138W and TNFA Y141S were predicted to be disease-causing by multiple tools. Further studies are needed to clarify if these three variants are associated with autoimmune diseases that are treated with anti-TNFA mAbs. If so, then at least these three variants along with TNFA G66C and TNFA G66S can be more frequent among patients who are treated with anti-TNFA mAbs compared with the rest of the population.
In order to contextualize our findings, we searched for ADA, CZP, GLM, and IFX response-associated SNPs in the Pharmacogenomics Knowledge Base (PharmGKB) [62]. We found twenty-one ADA, CZP, and IFX responserelated SNPs. No GLM response-associated SNP was found. All these variant-drug combinations have level 3 clinical annotation, i.e., evidence for those associations comes from single significant studies and the findings are yet to be replicated [62]. Variant alleles of these SNPs may be associated with poor [63] or better response [64]. The response may also depend on the disease being treated [65]. Only two (rs1799724 and rs361525) of these 21 SNPs were located in the TNF gene. rs1799724 and rs361525 are both located in the non-coding region and associated with responses to ADA [66] and IFX [67] in rheumatoid arthritis patients, respectively. So, there is a huge gap in our knowledge regarding the roles played by anti-TNFA mAb response-associated SNPs in their therapeutic outcome. Our study identifies seven missense variants in the epitopic region of TNFA, which may be predictive of the clinical outcomes of anti-TNFA mAb therapy regardless of the disease being treated. Moreover, this is the first study to our knowledge that has identified an association between a probable disease-causing variant (TNFA Y141S ) and response to GLM. Therefore, our findings may contribute to the further clarification of the roles played by patients' genotypes in their responses to ADA, CZP, GLM, and IFX.

Conclusion
This study was inspired by the high rate (30-40%) of non-responsiveness to therapeutic anti-TNFA antibodies, which are used to treat multiple debilitating autoimmune diseases like rheumatoid arthritis, Crohn's disease, ulcerative colitis, psoriasis, psoriatic arthritis, and ankylosing spondylitis. This in silico study predicts the candidate genetic variants that may underlie such variable responses to anti-TNFA antibodies. Based on these predictions, further studies may be designed and conducted using in vitro models to assess the effects of candidate variants on TNFA-mAb interactions. In addition, studies with patients undergoing treatments with anti-TNFA antibodies may shed more light on this heterogeneous responsiveness. Determining the genotypes at the candidate variant loci may improve clinical outcomes in patients. Also, population-specific studies might highlight the general genetic architecture at these loci, and thus help decide the more appropriate therapeutic antibodies for treatment.