Guidance on aligning models with overlapping aa subsequences

Hi everyone,
I'm hoping to get some help from the community with visualizing structures from multiple AlphaFold (AF) runs with overlapping subsequences.
We have been using AlphaFold monomer to try to identify the structure and domains of a large (3000+ aa) human protein. Our AF runs fail to model the full structure (possibly because of insufficient resources), so we have tried breaking the protein up into overlapping subsequences.
Is there a sensible way to combine these substructures based on their overlapping regions? I have tried matchmaker, which I believe constructs an MSA and attempts to align the structures, but a significant region of the overlap does not align correctly (image below, with the overlapping subsequences highlighted).
Any guidance is greatly appreciated.
Thank you,
Martin
[image: image.png]

Hi Martin,
You would use "align" (not "matchmaker") to specify which residues/atoms to use in each pairwise fit. The tricky part is specifying the residues/atoms of the overlap portions correctly, so that you get exactly the same number of atoms from each structure as well as the pairing you want. Several options of the command may help with this; see:
<https://rbvi.ucsf.edu/chimerax/docs/user/commands/align.html>
I hope this helps,
Elaine
-----
Elaine C. Meng, Ph.D.
UCSF Chimera(X) team
Resource for Biocomputing, Visualization, and Informatics
Department of Pharmaceutical Chemistry
University of California, San Francisco
(In reply to Martin Gordon's message to ChimeraX-users <chimerax-users@cgl.ucsf.edu> of May 7, 2024, 11:52 AM.)
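As a rough illustration of the bookkeeping Elaine describes, the Python sketch below builds an "align" command for one pair of overlapping fragments. It assumes (this is not stated in the thread) that each AlphaFold model is numbered from residue 1 on chain A and that only CA atoms are paired. `Fragment` and `overlap_align_command` are hypothetical helper names, not part of ChimeraX. Because the same full-length residue range is selected in both models, the two atom sets should have the same size and the same sequence order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """One predicted piece (hypothetical helper, not a ChimeraX class)."""
    model_id: int      # ChimeraX model number, e.g. 2 for model #2
    start: int         # first residue covered, in full-length numbering
    end: int           # last residue covered (inclusive)
    chain: str = "A"   # assumption: single chain named A


def overlap_align_command(moving, reference, atom_name="CA"):
    """Build an 'align' command that fits `moving` onto `reference`
    using only the residues both fragments cover."""
    lo = max(moving.start, reference.start)
    hi = min(moving.end, reference.end)
    if lo > hi:
        raise ValueError("fragments do not overlap")

    def spec(frag):
        # Assumption: each model is renumbered from 1, so convert
        # full-length positions to the fragment's own numbering.
        first = lo - frag.start + 1
        last = hi - frag.start + 1
        return f"#{frag.model_id}/{frag.chain}:{first}-{last}@{atom_name}"

    return f"align {spec(moving)} toAtoms {spec(reference)}"


if __name__ == "__main__":
    piece1 = Fragment(model_id=1, start=1, end=1000)
    piece2 = Fragment(model_id=2, start=801, end=1800)
    print(overlap_align_command(piece2, piece1))
```

For fragments covering residues 1-1000 (model #1) and 801-1800 (model #2), this prints `align #2/A:1-200@CA toAtoms #1/A:801-1000@CA`, which can be pasted into the ChimeraX command line. If the models keep full-length numbering instead, the conversion in `spec` is unnecessary and the same range can be used on both sides.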

You might be interested in this ChimeraX Python code that was used to assemble large AlphaFold models from many predicted pieces:
https://rbvi.github.io/chimerax-recipes/big_alphafold/bigalpha.html
Tom
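For the splitting step itself, a minimal Python sketch of choosing overlapping residue windows is given here. It is not taken from the linked recipe: `overlapping_windows` is a hypothetical helper, and the window of 1000 and overlap of 200 residues are arbitrary illustration values rather than recommendations.

```python
def overlapping_windows(length, window, overlap):
    """Return 1-based inclusive (start, end) residue ranges that cover
    1..length, with consecutive ranges sharing `overlap` residues
    (the final range may be shorter than `window`)."""
    if length < 1 or window < 1:
        raise ValueError("length and window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must be >= 0 and smaller than window")

    step = window - overlap
    ranges = []
    start = 1
    while True:
        end = min(start + window - 1, length)
        ranges.append((start, end))
        if end == length:
            break
        start += step
    return ranges


if __name__ == "__main__":
    # Illustration values only: a 3000-residue protein, 1000-residue pieces,
    # 200 residues shared between neighbouring pieces.
    for start, end in overlapping_windows(3000, 1000, 200):
        print(f"{start}-{end}")
```

Each neighbouring pair of ranges then defines the shared residues that a pairwise fit (for example with "align") would use.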

Will check out this tool, thank you Elaine!
(In reply to Elaine Meng's message of Tue, May 7, 2024, 12:02 PM.)
Participants (3): Elaine Meng, Martin Gordon, Tom Goddard