Like all methods on Genome3D, the VIVACE pipeline is based on homology modelling, which at its core depends on detecting and using structural similarity to derive information about our systems. What kind of information is most valuable to us will depend on our research, but, structurally, it will invariably come from the homologues we find.
While in some cases we are lucky to find even one, in others we are spoilt for choice. Depending on our goals, the selection might not make much difference, but it may also lead to the disagreements we can observe on Genome3D. In any case, the variety of options itself can often give us valuable insights about the system.
Through its use of the TOCCATA database, the VIVACE pipeline was designed from its conception to be aware of the variety of templates, and to present and use that variety efficiently.
Go to the Genome3D page for uniprot:Q8N427 and click on one of the FUGUE links. It doesn't matter whether it is SCOP- or CATH-based; both lead to the same place. That page displays the detailed FUGUE results against the profiles of the TOCCATA database. The first column indicates the profile hit, and it shows why the choice of resource made no difference: internally, TOCCATA uses its own SCOP/CATH consensus system to categorise domains, which can be seen in how some profiles are labelled with a nomenclature from SCOP, CATH or both. Unlike Genome3D's own consensus endeavour, the objective was not so much the analysis and comparison of the resources as the clustering of domains from both systems into as few sensible and consistent groups as possible (which is why SCOP was used at the family level instead of the superfamily level, as you may notice). The length column gives the length of the listed profile. The Z-score column shows the significance of the profile for a given region of the sequence, and that region is specified in the last column. The Z-score is colour-coded for guidance, but as a rule of thumb anything over 6 can safely be assumed to indicate similarity, and anything over 4 is likely to.
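The rule of thumb for reading the Z-score column can be written down as a small helper. This is only an illustrative sketch: the function name, the label strings and the example values are hypothetical, and the only facts taken from the text are the two guidance thresholds (over 6 and over 4). It is not how the website itself assigns its colours.

```python
# Illustrative only: thresholds come from the guidance above,
# everything else (names, labels, example values) is hypothetical.

def interpret_fugue_zscore(z_score: float) -> str:
    """Map a FUGUE Z-score to the rule-of-thumb reading used in the text."""
    if z_score > 6:
        return "similar"          # safe to assume similarity
    if z_score > 4:
        return "likely similar"   # probably similar, worth a closer look
    return "below guidance thresholds"


# Made-up Z-scores, not real FUGUE output.
for z in (12.3, 5.1, 2.0):
    print(z, "->", interpret_fugue_zscore(z))
```

Values of exactly 6 or 4 fall into the lower category here, because the text says "over" each threshold.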
Click on the name of a profile (a good example would be d.58.6.1-220.127.116.11). It will take you to the primary page of a TOCCATA profile. It consists of three sections:
- A JOY formatted alignment of representative sequences: JOY uses typography and formatting to encode structural information into a sequence alignment that can be viewed at a glance to facilitate inspection and comparison of structures, often without needing to visualise 3D structures. The basic original format includes information such as secondary structure, solvent exposure and main chain hydrogen bonding, and its key can be seen here. However, the XSuLT extension potentially adds many other features, such as residue depth, inter-residue contacts (not currently displayed), and interface and ligand binding residues, as well as data from further analyses such as sequence conservation (entropy) and RMSD from superposition, which highlights areas of spatial variability in a fold.
- A list of related profiles: There are a few ways in which a profile can be related to others. For instance, in this case there is 18.104.22.168. You might notice that the 22.214.171.124 superfamily is also part of the current consensus profile. However, the structures on that profile have not been characterised under SCOP, and TOCCATA, rather than make assumptions about whether they should belong in the consensus, prefers to create an individual profile for them (incidentally, this CATH/SCOP pair happens to be a "Bronze" mapping on Genome3D, since there are some SCOP assignments to the parent superfamily that do not match the CATH one). Another way profiles can be related, not exemplified in this case, is when one of the elements can be found in any multi-domain patterns present on TOCCATA.
- A list of all domains or chains belonging to this profile: This presents all PDB entries in this profile clustered at various identity thresholds and annotated according to their conformational status, in addition to experimental data. Chains with the same cluster number/colour belong to the same cluster at the given threshold.
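To make the idea of "the same cluster at the given threshold" concrete, the sketch below groups chains by pairwise sequence identity at different thresholds. It rests on assumptions: the text does not say which clustering method TOCCATA uses, so single-linkage clustering is swapped in here purely for illustration, and the chain names and identity values are invented.

```python
from itertools import combinations

# Assumption: single-linkage clustering stands in for whatever TOCCATA does.
# Chain names and identity percentages below are made up for illustration.

def cluster_by_identity(chains, identity, threshold):
    """Give chains the same cluster number when they are linked by
    pairs whose sequence identity is at or above the threshold."""
    parent = {chain: chain for chain in chains}

    def find(chain):
        # Follow parent links to the cluster representative.
        while parent[chain] != chain:
            parent[chain] = parent[parent[chain]]
            chain = parent[chain]
        return chain

    for a, b in combinations(chains, 2):
        pair_identity = identity.get((a, b), identity.get((b, a), 0.0))
        if pair_identity >= threshold:
            parent[find(a)] = find(b)

    # Number clusters in order of first appearance in the chain list.
    numbers, result = {}, {}
    for chain in chains:
        root = find(chain)
        numbers.setdefault(root, len(numbers) + 1)
        result[chain] = numbers[root]
    return result


chains = ["chain_A", "chain_B", "chain_C", "chain_D"]
identity = {
    ("chain_A", "chain_B"): 100.0,
    ("chain_A", "chain_C"): 72.0,
    ("chain_B", "chain_C"): 72.0,
    ("chain_A", "chain_D"): 35.0,
    ("chain_B", "chain_D"): 35.0,
    ("chain_C", "chain_D"): 33.0,
}

for threshold in (90, 70, 30):
    print(threshold, cluster_by_identity(chains, identity, threshold))
```

In this toy data, lowering the threshold merges clusters: three clusters at 90%, two at 70% and one at 30%. This is the same pattern to look for when comparing the cluster numbers across the threshold columns on the profile page.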
The button at the bottom of the page allows you to align any sequence to that particular profile using FUGUE. The resulting page will provide the estimated Z-score for the alignment as well as give the option of customising the template selection (including those from related profiles, if so desired).
Let's go back to the result page for the VIVACE prediction and open the "Model and alignments" tab. There we can view an interactive model of each predicted domain along with an XSuLT representation of the alignment used to generate the model. When an alignment contains sequences or models in addition to structures, XSuLT shows more than the features seen on the TOCCATA website: it also includes secondary structure and disorder prediction for the sequence, represented as a coloured line (red for helix, blue for strand, green for disorder) on top of the modelled sequence. This can help to assess the quality of the alignment, which is critical to the resulting model.