In silico analysis of phylogeny, structure, and function of arsenite oxidase from unculturable microbiome of arsenic contaminated soil

Background Arsenite oxidase (EC 1.20.2.1) is a metalloenzyme that catalyzes the oxidation of arsenite into the less toxic arsenate. In this study, 78 amino acid sequences of arsenite oxidase from unculturable bacteria, available in metagenomic data of arsenic-contaminated soil, were characterized using standard bioinformatics tools to investigate their phylogenetic relationships, three-dimensional structure and functional parameters. Results Phylogenetic analysis of all arsenite oxidases from unculturable microorganisms revealed their closeness to the bacterial order Rhizobiales. The higher aliphatic content suggested that these enzymes are thermostable and could be used for in situ bioremediation. A representative protein from each phylogenetic cluster was analysed for secondary structure arrangements, which indicated the presence of α-helices (~63%), β-sheets (57–60%) and turns (13–15%). The validated 3D models suggested that these proteins are heterodimeric with two chains, of which the alpha chain is the main catalytic subunit that binds arsenic oxides. Three representative protein models were deposited in the Protein Model Database. The query enzymes were predicted to contain two conserved motifs, one a Rieske 3Fe-4S motif and the other a molybdopterin protein motif. Conclusions Computational analysis of the protein interactome revealed protein partners that might be involved in the whole process of arsenic detoxification by Rhizobiales. To the best of our knowledge this report is unique, and its importance lies in understanding the theoretical aspects of the structure and function of arsenite oxidase in unculturable bacteria residing in arsenic-contaminated sites. Supplementary Information The online version contains supplementary material available at 10.1186/s43141-021-00146-x.


Results
The SWISS-MODEL template library (SMTL version 2020-05-27, PDB release 2020-05-22) was searched with BLAST (Camacho et al.) and HHblits (Remmert et al.) for evolutionarily related structures matching the target sequence in Table T1. For details on the template search, see Materials and Methods. Overall 162 templates were found (Table T2).
The target sequence was searched with BLAST against the primary amino acid sequences contained in the SMTL. A total of 13 templates were found.
An initial HHblits profile was built using the procedure outlined in (Remmert et al.), followed by 1 iteration of HHblits against NR20. The obtained profile was then searched against all profiles of the SMTL. A total of 150 templates were found.

Model Building Report
This document lists the results for the homology modelling project "Untitled Project" submitted to SWISS-MODEL workspace on May 28, 2020, 1:35 p.m. The submitted primary amino acid sequence is given in Table T1.

Results
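The SWISS-MODEL template library was searched with BLAST (Camacho et al.) and HHblits (Remmert et al.) for evolutionarily related structures matching the target sequence in Table T1.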
For details on the template search, see Materials and Methods. Overall 162 templates were found (Table T2).

Models
[Model listing not recovered from the report. Surviving target sequence fragment: RRTVTVNACEVEAGKDRVMHLAINSGSDLALFNAWMTYIAEKGWVDKALIAAST]
The target sequence was searched with BLAST against the primary amino acid sequence contained in the SMTL. A total of 14 templates were found.
An initial HHblits profile was built using the procedure outlined in (Remmert et al.), followed by 1 iteration of HHblits against NR20. The obtained profile was then searched against all profiles of the SMTL. A total of 148 templates were found.

Model Building
Models are built based on the target-template alignment using ProMod3. Coordinates which are conserved between the target and the template are copied from the template to the model. Insertions and deletions are remodelled using a fragment library. Side chains are then rebuilt. Finally, the geometry of the resulting model is regularized by using a force field. In case loop modelling with ProMod3 fails, an alternative model is built with PROMOD-II (Guex et al.).
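As a rough illustration of the first step only (copying coordinates from aligned positions), the sketch below walks a gapped target-template alignment and copies one template coordinate per aligned residue. It is not ProMod3 code: the function name `copy_aligned_coordinates`, the toy alignment and the coordinates are hypothetical, and using a single coordinate per residue is a simplifying assumption.

```python
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def copy_aligned_coordinates(
    target_aln: str,
    template_aln: str,
    template_coords: List[Coord],
) -> Tuple[Dict[int, Coord], List[int]]:
    """Copy template coordinates onto aligned target residues.

    Returns (copied, to_remodel):
      copied     maps 0-based target residue index -> template coordinate
      to_remodel lists target residue indices that face a gap in the
                 template (insertions), which would need loop modelling
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    if len(template_aln.replace("-", "")) != len(template_coords):
        raise ValueError("one coordinate is required per template residue")

    copied: Dict[int, Coord] = {}
    to_remodel: List[int] = []
    t_idx = 0  # index into the ungapped target sequence
    p_idx = 0  # index into the ungapped template sequence

    for t_char, p_char in zip(target_aln, template_aln):
        t_gap = t_char == "-"
        p_gap = p_char == "-"
        if not t_gap and not p_gap:
            copied[t_idx] = template_coords[p_idx]  # aligned: copy
        elif not t_gap and p_gap:
            to_remodel.append(t_idx)  # insertion in the target
        # a gap in the target is a deletion: the template residue is skipped
        if not t_gap:
            t_idx += 1
        if not p_gap:
            p_idx += 1
    return copied, to_remodel


# Toy alignment (hypothetical): one insertion and one deletion.
toy_coords = [(float(i), 0.0, 0.0) for i in range(7)]
copied, to_remodel = copy_aligned_coordinates("RRTV-TVN", "RR-VATVN", toy_coords)
print(sorted(copied), to_remodel)
```

Side-chain rebuilding and force-field regularization are deliberately left out; they depend on modelling machinery that a short sketch cannot reproduce.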

Model Quality Estimation
The global and per-residue model quality has been assessed using the QMEAN scoring function (Studer et al.).

Ligand Modelling
Ligands present in the template structure are transferred by homology to the model when the following criteria are met: (a) The ligands are annotated as biologically relevant in the template library, (b) the ligand is in contact with the model, (c) the ligand is not clashing with the protein, (d) the residues in contact with the ligand are conserved between the target and the template. If any of these four criteria is not satisfied, a certain ligand will not be included in the model. The model summary includes information on why and which ligand has not been included.
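The four criteria can be read as a simple all-or-nothing filter. The sketch below is a minimal illustration under that reading, not SWISS-MODEL code: the `LigandCheck` record, its field names and the ligand names are hypothetical, and each criterion is assumed to have been evaluated elsewhere and reduced to a boolean.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LigandCheck:
    """Pre-computed answers to criteria (a)-(d) for one template ligand."""
    name: str
    biologically_relevant: bool       # (a) annotated as relevant in the library
    in_contact_with_model: bool       # (b) ligand touches the model
    clashes_with_protein: bool        # (c) must be False for transfer
    contact_residues_conserved: bool  # (d) binding residues are conserved


def decide_ligand_transfer(lig: LigandCheck) -> Tuple[bool, List[str]]:
    """Return (include, reasons); reasons is empty when the ligand is kept."""
    reasons: List[str] = []
    if not lig.biologically_relevant:
        reasons.append("not annotated as biologically relevant")
    if not lig.in_contact_with_model:
        reasons.append("not in contact with the model")
    if lig.clashes_with_protein:
        reasons.append("clashes with the protein")
    if not lig.contact_residues_conserved:
        reasons.append("contact residues not conserved")
    return (not reasons, reasons)


# Hypothetical ligands: one passes every criterion, one fails (c) and (d).
kept = decide_ligand_transfer(LigandCheck("LIG1", True, True, False, True))
dropped = decide_ligand_transfer(LigandCheck("LIG2", True, True, True, False))
print(kept)
print(dropped)
```

Collecting every failed criterion, rather than stopping at the first, mirrors the statement that the model summary reports why a ligand was not included.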

Results
The SWISS-MODEL template library (SMTL version 2020-05-27, PDB release 2020-05-22) was searched with BLAST (Camacho et al.) and HHblits (Remmert et al.) for evolutionarily related structures matching the target sequence in Table T1. For details on the template search, see Materials and Methods. Overall 185 templates were found (Table T2).
The target sequence was searched with BLAST against the primary amino acid sequences contained in the SMTL. A total of 12 templates were found.
An initial HHblits profile was built using the procedure outlined in (Remmert et al.), followed by 1 iteration of HHblits against NR20. The obtained profile was then searched against all profiles of the SMTL. A total of 173 templates were found.

Oligomeric State Conservation
The quaternary structure annotation of the template is used to model the target sequence in its oligomeric form. The method (Bertoni et al.) is based on a supervised machine learning algorithm, Support Vector Machines (SVM), which combines interface conservation, structural clustering, and other template features to provide a quaternary structure quality estimate (QSQE). The QSQE score is a number between 0 and 1, reflecting the expected accuracy of the interchain contacts for a model built based on a given alignment and template. Higher numbers indicate higher reliability. This complements the GMQE score, which estimates the accuracy of the tertiary structure of the resulting model.