Getting average of PAE values for a given region / interaction
Greetings and Happy New Year to All!

My question is broadly about protein-protein interactions (PPIs). Last year several manuscripts came out proposing new metrics for assessing PPIs. This was a logical response to the growing understanding that ipTM (alone) is a fairly flawed metric, particularly for detecting interactions that involve a small number/fraction of the total residues. For those interested, I've pasted links to three such papers below (coincidentally, one is from a former roommate's lab, which I didn't notice until recently!). Some may be easier to use than others (have GUIs). Tom Goddard and I emailed back and forth a bit about the relative merits of some of these metrics - nothing is perfect obviously (thank you again Tom).

With respect to ChimeraX, I was wondering if there was a way to extract average PAE values after employing a cutoff. For example, "alphafold contacts /A distance 5 maxPae 5" will tell me that there are "54 residue or atom pairs within distance 5 with pae <= 5". But what if I want to know the average of those 54? Along those lines, for some PPIs there may be several spatially separated interfaces, with each one contributing to the total number. Is there an easy way to obtain the average and other metrics for each one?

Some of this information I could probably find in my clunky (borderline boomer) way, by filtering, adding, and averaging values from the pairwise data (perhaps made easier by Predictome).

Lastly, drawing rectangles on the PAE plot does a great job of giving coordinates and the average PAE, but it doesn't give the range or other potentially useful data, such as the number of PAE<5 values etc. I guess I'm thinking that it could be very useful to be able to get metrics for a prescribed region of the PAE plot simply by drawing the rectangle. (And maybe there is a way, but I missed it.)

Those are my questions and thoughts. But I also wanted to express my gratitude for the ChimeraX software, and even more, the people working at ChimeraX who are always exceptionally responsive and helpful.

David

https://www.biorxiv.org/content/10.1101/2024.02.19.580970v1
https://www.biorxiv.org/content/10.1101/2024.10.23.619601v1
https://www.biorxiv.org/content/10.1101/2024.04.09.588596v1

David S. Fay Ph.D.
Professor, Department of Molecular Biology
Associate Director, National Institutes of Health Wyoming INBRE
University of Wyoming
email: davidfay@uwyo.edu
Hi David,

Happy New Year. Possibly the best feature of ChimeraX is that it is written in Python and you can modify any of those Python files to output more info or change what the built-in features do. Of course, it requires some knowledge of Python and some skill at figuring out what other people's code is doing. But it extends the range of what you can do with ChimeraX ten-fold.

The things you are asking about are for the most part too specialized for our limited UCSF ChimeraX development team to build into the program. But let me give you examples of how I took a few minutes to change my installed ChimeraX to do some of what you asked.

To make the alphafold contacts command output the average PAE value I edited the Python file in my ChimeraX distribution (on Mac)

    ChimeraX.app/Contents/lib/python3.11/site-packages/chimerax/alphafold/contacts.py

where it prints out the "54 residue or atom pairs within distance 5 with pae <= 5" in these lines

    msg = f'Found {len(rapairs)} residue or atom pairs within distance %.3g' % distance
    if max_pae is not None:
        msg += ' with pae <= %.3g' % max_pae

I added a couple more lines that print the average PAE

    from numpy import mean
    msg += ' with average pae %.3g' % mean(pae_values)

Then when I restart ChimeraX and use the alphafold contacts command I get output

    Found 5 residue or atom pairs within distance 3 with average pae 4.1

You also asked about making a mouse drag on the PAE plot report the average PAE value within the rectangle dragged. I'm not sure why you said the PAE plot does a great job reporting average PAE; it doesn't do that unless you add lines of code like what I show here.
In the ChimeraX distribution file (on Mac)

    ChimeraX.app/Contents/lib/python3.11/site-packages/chimerax/alphafold/pae.py

at the bottom of the routine _rectangle_select() I added these lines of code

    pae = self._pae
    ave = pae.pae_matrix[r3:r4+1,r1:r2+1].mean()
    rra = pae.row_residues_or_atoms()
    yres = _residue_and_atom_spec(rra[r3:r4+1])
    xres = _residue_and_atom_spec(rra[r1:r2+1])
    print(f'Mean PAE {"%.3g" % ave} in dragged box {xres} aligned to {yres}')

Restarting ChimeraX and dragging a rectangle on the PAE plot now reports

    Mean PAE 5.97 in dragged box /A:139-176 aligned to /A:2-19

I'm not saying it is easy to figure out where the code you need to modify is or what lines you need to add. I can do it easily because I wrote the PAE code. It took me about 10 minutes to add code for these two examples and test them. I'd guess someone familiar with Python who knows a bit about the organization of ChimeraX files could do it in 30 minutes. Lacking that knowledge, you will have to do things in harder ways or not at all. For instance, if you wanted the average PAE value from the "alphafold contacts" command you could output all the contacts to a file, as described in the documentation, using the outputFile option

https://www.rbvi.ucsf.edu/chimerax/docs/user/commands/alphafold.html#contact...

and then use other software like a spreadsheet program to extract the column of PAE values from the file and average it. Painful.

Tom
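
The extra numbers David asks about for a dragged region (range, count of values under a cutoff) can also be worked out outside ChimeraX once the PAE matrix is available as a nested list or 2D array. What follows is only a minimal sketch under that assumption. The function name region_stats, its arguments, and the toy matrix are hypothetical and are not part of ChimeraX. The inclusive index ranges mirror the [r3:r4+1, r1:r2+1] slicing in the lines above.

```python
from statistics import fmean

def region_stats(pae, row_range, col_range, cutoff=5.0):
    """Summarize a rectangular block of a PAE matrix.

    pae is any 2D structure indexable as pae[row][col], for example
    a list of lists or a 2D array. row_range and col_range are
    inclusive, 0-based (first, last) index pairs.
    This is an illustrative helper, not ChimeraX API.
    """
    r0, r1 = row_range
    c0, c1 = col_range
    # Flatten the selected block into a plain list of floats.
    values = [float(v) for row in pae[r0:r1 + 1] for v in row[c0:c1 + 1]]
    if not values:
        raise ValueError("selected region is empty")
    return {
        "n": len(values),
        "mean": fmean(values),
        "min": min(values),
        "max": max(values),
        "n_below_cutoff": sum(1 for v in values if v < cutoff),
    }

# Toy 4x4 matrix standing in for real PAE data (made-up numbers).
pae = [
    [0.5, 2.0, 9.0, 12.0],
    [2.0, 0.5, 4.0, 6.0],
    [9.0, 4.0, 0.5, 3.0],
    [12.0, 6.0, 3.0, 0.5],
]

# Rows 0-1 against columns 2-3, i.e. one off-diagonal block.
stats = region_stats(pae, (0, 1), (2, 3), cutoff=5.0)
print(stats)
```

Keeping the helper free of ChimeraX imports means it can be tried on exported data first. The same few statistics could then be added next to the mean in _rectangle_select() if that proves useful.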
On Jan 1, 2025, at 11:34 AM, David S. Fay via ChimeraX-users <chimerax-users@cgl.ucsf.edu> wrote:
> [original message, shown in full above]