[chimerax-users] Re: Questions about alphafold interfaces output discrepancies

19 May 2025

Hi Bruno,
First I should say that the developer of this feature is away for a few days so you may need to wait a while for a better answer.

I will try to offer some thoughts but I don't understand what you are doing.  Firstly, the "alphafold interfaces" command is for batch processing of multiple protein-protein dimer predictions (i.e. a bunch of different pairs of protein sequences to help figure out which proteins might really bind each other).  It seems that you are only predicting the structure of one particular pair of proteins.   Are you indeed trying to figure out if they actually bind to each other or not?

The table you are getting may not make much sense if you are not giving it the type of data it expects (multiple dimer predictions of different sequences in the same folder). My original understanding was that #Res1 and #Res2 are not a count of residues (how many residues) but an actual residue number in the structure, i.e. residue 12, but I agree that doesn't really make sense either.

I don't see anything in the "alphafold interfaces" help about it making any selection so I have no idea what is selecting the 24 residues.  I would guess it was some other action you performed that was not part of the "alphafold interfaces" command.
<https://rbvi.ucsf.edu/chimerax/docs/user/commands/alphafold.html#batch>

These batch commands (link above) are meant for advanced users with typically hundreds of predictions of different complexes.

For a single prediction (you are only looking at a specific pair of protein chains) it may make sense to use a different command for analysis, say "alphafold contacts"
<https://rbvi.ucsf.edu/chimerax/docs/user/commands/alphafold.html#contacts>

As the help for "alphafold interfaces" says, it treats as possible real binders the dimers with at least N pairs of residues across the interface that have PAE below some value and atoms within some distance (defaults 10, 5.0, and 4.0, respectively, but these are user-adjustable).  So you could just use "alphafold contacts" between the two chains, with whatever PAE and distance cutoffs you feel are appropriate, and see how many lines you get for your particular dimer.
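
To make that criterion concrete, here is a toy Python sketch of the counting logic as I read it from the help text. It is not ChimeraX code: the function names, the nested-list inputs, and the example numbers are all made up for illustration, and I am assuming one distance and one PAE value per residue pair and "<=" comparisons (as in the Log wording). It also shows that the number of qualifying pairs and the number of distinct residues taking part in them are different quantities.

```python
# Toy illustration only; names and inputs are hypothetical, not ChimeraX API.

def confident_pairs(dist, pae, max_dist=4.0, max_pae=5.0):
    """Return (i, j) index pairs that satisfy both cutoffs.

    dist[i][j]: assumed closest atom-atom distance between residue i of
                chain A and residue j of chain B
    pae[i][j]:  assumed PAE value for the same residue pair
    """
    return [
        (i, j)
        for i, row in enumerate(dist)
        for j, d in enumerate(row)
        if d <= max_dist and pae[i][j] <= max_pae
    ]


def looks_like_binder(pairs, min_pairs=10):
    """Apply the 'at least N pairs' rule (N = 10 is the stated default)."""
    return len(pairs) >= min_pairs


# Made-up data: 3 residues in chain A (rows), 2 residues in chain B (columns).
dist = [[3.5, 3.9],
        [3.8, 6.0],
        [9.0, 3.2]]
pae = [[2.0, 4.5],
       [7.0, 1.0],
       [1.0, 3.0]]

pairs = confident_pairs(dist, pae)
res_a = {i for i, _ in pairs}  # distinct chain A residues in any pair
res_b = {j for _, j in pairs}  # distinct chain B residues in any pair

print(len(pairs), "pairs;", len(res_a), "residues in A;", len(res_b), "residues in B")
print("passes default threshold:", looks_like_binder(pairs))
```

One residue can take part in several pairs, so a pair count need not equal the sum (or any simple function) of the per-chain residue counts.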

If I do this with your data I get 20 pseudobonds showing the pairs that meet the criteria.  From the Log:

open /Users/meng/Desktop/o15305_a108e/fold_o15305_a108e_model_0.cif
open /Users/meng/Desktop/o15305_a108e/fold_o15305_a108e_full_data_0.json
alphafold contacts /A toAtoms /B distance 4 maxPae 5
Found 20 residue or atom pairs within distance 4 with pae <= 5

I hope this helps,
Elaine
-----
Elaine C. Meng, Ph.D.                       
UCSF Chimera(X) team
Resource for Biocomputing, Visualization, and Informatics
Department of Pharmaceutical Chemistry
University of California, San Francisco
...
On May 19, 2025, at 7:56 AM, Bruno Hay Mele via ChimeraX-users <chimerax-users@cgl.ucsf.edu> wrote:
Hi!
I am dabbling with the alphafold interfaces command, and I cannot make 
sense of the data.
In my use case, the summary table produced by the command says:
Models  Confident pairs #Res1 #Res2
     4               23    12    12
opening the best model produces these relevant lines in the log
200 atoms, 188 bonds, 24 residues, 1 model selected
alphafold contacts last-opened & /A toAtoms last-opened & /B distance 4.0 maxPae 5.0
Found 14 residue or atom pairs within distance 4 with pae <= 5
in the csv, the same model has these relevant columns:
column                     value
distance                   4
max_pae                    5
num_res1                   246
num_res2                   246
num_interface_res1         20
num_interface_res2         20
num_interface_res_pairs    42
num_confident_pairs        23
interface_res_num1         12 values
interface_res_num2         12 values
I am struggling a bit to reconcile all these numbers.
The 24 residues selected by ChimeraX after opening do not appear
anywhere else. I suppose these are #Res1 + #Res2 of the summary table
(and interface_res_num1 + interface_res_num2 in the csv)?
Both the log summary and the csv report 23 as the number of confident
pairs, but the log message after opening the best model says 14 residue
or atom pairs (I assume these are those within the distance/pae
constraint?)
Why is Confident pairs not #Res1 + #Res2?
I am also a little bit puzzled by the csv, where I read
num_interface_res_pairs = 42 and num_interface_res1 =
num_interface_res2 = 20 (I was expecting either 21, 21 for
num_interface_res or 40 for num_interface_res_pairs).
I attach the prediction I was using for diagnosing the problem.
cheers,
--
Bruno Hay Mele, PhD
2D-20, Biology Dept., University of Naples Federico II
https://github.com/bhym/ | +39 081 67 9118
<o15305_a108e.zip>
