Aditi Shenoy

PhD Candidate in Protein Bioinformatics and Machine Learning

CASP 15 Day-4

Posted on Dec 13, 2022

The assessment of predicted models is almost as important as the prediction itself: when the native structure is not known, it is hard to trust predictions without knowing how good they are. This year, new criteria were needed to assess quaternary structures, as opposed to the tertiary structures of previous years. This is where the definition of an interface needs to be clarified, which in turn poses the challenge of how best to evaluate interfaces.

The targets were assessed on a per-residue basis, and these assessments performed on par with the results from the consensus method. The Q-score is an interface-based contact score, where residue pairs are weighted by the difference in contact distance between target and model. DockQ evaluates single interfaces and becomes weighted for higher-order complexes, i.e. large interfaces contribute more than smaller ones.

The assessment highlighted the problem of chain mapping, where the chains in the model and target do not have a one-to-one correspondence. In these cases, a combinatorial comparison needs to be conducted, or the chains need to be aligned using a global score such as MM/US-align. A global evaluation (based on superimposition) was also conducted, which checks whether domains are in the right position. The oligo-GDT_TS score was also used; it likewise requires a one-to-one mapping but is in fact a stricter score than the TM-score. Local scores like lDDT and CAD were also used; lDDT, however, does not penalize wrong contacts.

An additional problem when comparing interfaces is (in Kliment's words) "Michelangelo contacts", where just a single pair of residues may be in contact and perhaps should not be considered an interface. An idea was introduced to use a PatchQS score or Patch DockQ to compare performance over a patch at the interface instead of single residue pairs. The analysis today again showed that nanobodies are difficult targets, and that flexibility in targets also makes assessment hard.
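The chain-mapping problem above can be sketched in a few lines. This is a minimal illustration, not the assessors' actual procedure: it brute-forces every permutation of model chains against the target chains and keeps the best-scoring assignment. The `score_fn` and the similarity table are hypothetical stand-ins for a real structural score such as a per-chain TM-score from MM/US-align.

```python
from itertools import permutations

def best_chain_mapping(target_chains, model_chains, score_fn):
    """Brute-force chain mapping: try every permutation of the model
    chains against the fixed order of target chains and keep the
    mapping that maximizes the summed per-chain score. Feasible only
    for small complexes, since the permutation count grows factorially."""
    best_map, best_score = None, float("-inf")
    for perm in permutations(model_chains):
        total = sum(score_fn(t, m) for t, m in zip(target_chains, perm))
        if total > best_score:
            best_score, best_map = total, dict(zip(target_chains, perm))
    return best_map, best_score

# Toy example: a hypothetical similarity table standing in for a real
# structural comparison score between target and model chains.
sim = {("A", "x"): 0.2, ("A", "y"): 0.9,
       ("B", "x"): 0.8, ("B", "y"): 0.1}
mapping, score = best_chain_mapping(["A", "B"], ["x", "y"],
                                    lambda t, m: sim[(t, m)])
# mapping == {"A": "y", "B": "x"}, the assignment with total score 1.7
```

For complexes with many identical chains, real tools replace this exhaustive search with heuristics or assignment algorithms, since the factorial blow-up quickly becomes intractable.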

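The idea of weighting DockQ for higher-order complexes can be illustrated with a simple size-weighted average. This is a sketch of the weighting principle only, not the official CASP formula: each interface's score is weighted by its number of contacts, so large interfaces contribute more than small ones.

```python
def weighted_complex_score(interfaces):
    """Size-weighted combination of per-interface scores for a
    higher-order complex. `interfaces` is a list of
    (interface_score, n_contacts) pairs; each score is weighted by
    its interface size so large interfaces dominate the total."""
    total_contacts = sum(n for _, n in interfaces)
    return sum(s * n for s, n in interfaces) / total_contacts

# Hypothetical complex: one large, well-predicted interface and one
# small, poorly predicted one. The small interface barely moves the
# combined score.
score = weighted_complex_score([(0.9, 200), (0.2, 20)])
```

Under this scheme a single tiny "Michelangelo contact" interface, even if scored zero, would hardly affect the combined value, which is one motivation for patch-based alternatives like PatchQS.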
All top-performing quality-assessment methods were either consensus measures (using all-vs-all pairwise MM-align or a variety of scores) or deep-learning methods (using e.g. graphs or Voronoi tessellations). These methods performed well in most cases, except for targets with multi-state conformations.
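The consensus idea can be sketched as follows, assuming a pairwise similarity function (e.g. a TM-score from all-vs-all MM-align): each model is scored by its mean similarity to every other model in the pool, so models close to the "crowd" rank high. The similarity table here is a toy stand-in, and this also hints at why consensus fails for multi-state targets, where the crowd splits between conformations.

```python
def consensus_scores(models, pairwise_sim):
    """Consensus quality estimate: score each model by its mean
    pairwise similarity to all other models in the pool."""
    scores = {}
    for i, m in enumerate(models):
        others = [pairwise_sim(m, o)
                  for j, o in enumerate(models) if j != i]
        scores[m] = sum(others) / len(others)
    return scores

# Toy pool of three models with a symmetric similarity table standing
# in for a real all-vs-all structural comparison.
table = {frozenset(("m1", "m2")): 0.8,
         frozenset(("m1", "m3")): 0.7,
         frozenset(("m2", "m3")): 0.3}
scores = consensus_scores(["m1", "m2", "m3"],
                          lambda a, b: table[frozenset((a, b))])
# m1 ranks highest: it is most similar to the rest of the pool
```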

The session ended with discussions about how to integrate the various SIG groups (via journal clubs and monthly Zoom meetings). An interesting idea was introduced: selecting targets based on a certain disease or system, in order to study the underlying biology. A white paper to democratize the use of these prediction methods (in addition to the AlphaFold updates) was suggested. It would also be of interest to develop well-curated training, validation, and test datasets (taking well-defined homology criteria into consideration) and make them accessible, enabling more reliable predictors from the machine-learning community.
