Clustering docking results using "headless" Chimera

Hi there,
I want to use Chimera on a server (or in an automated fashion by command line; "-nogui") to read in a bunch (1000s) of PDB structures (each having an identical number of atoms) and then do clustering using the NMRCLUST code that you've so conveniently adapted.
I found this old thread: http://www.cgl.ucsf.edu/pipermail/chimera-users/2008-October/003193.html
Is this still the preferred method for doing what I'm hoping?
Also, is it possible to do a MatchMaker alignment before the clustering?
Thanks, Darrell
-- Darrell Hurt, Ph.D. Section Head, Computational Biology Bioinformatics and Computational Biosciences Branch (BCBB) OCICB/OSMO/OD/NIAID/NIH
31 Center Drive, Room 3B62B, MSC 2135 Bethesda, MD 20892-2135 Office 301-402-0095 Mobile 301-758-3559
http://bioinformatics.niaid.nih.gov (Within NIH) http://exon.niaid.nih.gov (Public)
Disclaimer: The information in this e-mail and any of its attachments is confidential and may contain sensitive information. It should not be used by anyone who is not the original intended recipient. If you have received this e-mail in error please inform the sender and delete it from your mailbox or any other storage devices. National Institute of Allergy and Infectious Diseases shall not accept liability for any statements made that are sender's own and not expressly made on behalf of the NIAID by one of its representatives.

Yes, a script is still probably the easiest way to do what you want. However, there is a call in the old script that no longer exists in the new one. I'm attaching a new script that will do a couple things differently than before:
- Instead of opening files in the script, it will now use models that have already been opened.
- It will call matchmaker to align all models to the model with the lowest model/submodel number (usually the first model opened).
To run it, save the script as "cluster.py" and run the command:
chimera --nogui --silent your_model_files cluster.py
where cluster.py is the name of the script. The first model opened will be the reference structure to which everything else is aligned.
In the script, Matchmaker is invoked by the line:
chimera.runCommand("mm %s %s" % (ref.oslIdent(), m.oslIdent()))
If you want to use more options (see http://www.cgl.ucsf.edu/chimera/docs/UsersGuide/midas/mmaker.html for details), just add them after the second "%s" on the line.
Please let me know if this does what you need, or if you'd like it to do more stuff.
Conrad
On 6/26/2012 7:57 PM, Hurt, Darrell (NIH/NIAID) [E] wrote:
Hi there,
I want to use Chimera on a server (or in an automated fashion by command line; "-nogui") to read in a bunch (1000s) of PDB structures (each having an identical number of atoms) and then do clustering using the NMRCLUST code that you've so conveniently adapted.
I found this old thread: http://www.cgl.ucsf.edu/pipermail/chimera-users/2008-October/003193.html
Is this still the preferred method for doing what I'm hoping?
Also, is it possible to do a MatchMaker alignment before the clustering?
Thanks, Darrell
_______________________________________________ Chimera-users mailing list Chimera-users@cgl.ucsf.edu http://plato.cgl.ucsf.edu/mailman/listinfo/chimera-users
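Conrad's note about adding Matchmaker options after the second "%s" can be sketched as a small string helper. This is only an illustration: the function name matchmaker_command is hypothetical, the "#0" and "#1" identifiers stand in for whatever oslIdent() returns, and the "ss false" option is assumed from the mmaker documentation linked above, so check the exact spelling there before relying on it.

```python
def matchmaker_command(ref_id, model_id, options=""):
    """Build the command string that the script hands to chimera.runCommand().

    ref_id and model_id stand in for ref.oslIdent() and m.oslIdent();
    options is any extra text appended after the second "%s".
    """
    command = "mm %s %s" % (ref_id, model_id)
    if options.strip():
        command += " " + options.strip()
    return command


# Inside Chimera, the line in cluster.py could then read (assumption):
#   chimera.runCommand(matchmaker_command(ref.oslIdent(), m.oslIdent(), "ss false"))
print(matchmaker_command("#0", "#1", "ss false"))
```

Keeping the string construction separate from the runCommand call makes it possible to check the command text outside Chimera before running a long batch.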

Hi Conrad,
This is fantastic. I'll give it a try and let you know how things go.
Best wishes, Darrell
On 6/27/12 2:03 PM, "Conrad Huang" <conrad@cgl.ucsf.edu> wrote:
Yes, a script is still probably the easiest way to do what you want. However, there is a call in the old script that no longer exists in the new one. I'm attaching a new script that will do a couple things differently than before:
- Instead of opening files in the script, it will now use models that have already been opened.
- It will call matchmaker to align all models to the model with the lowest model/submodel number (usually the first model opened).
To run it, save the script as "cluster.py" and run the command:
chimera --nogui --silent your_model_files cluster.py
where cluster.py is the name of the script. The first model opened will be the reference structure to which everything else is aligned.
In the script, Matchmaker is invoked by the line:
chimera.runCommand("mm %s %s" % (ref.oslIdent(), m.oslIdent()))
If you want to use more options (see http://www.cgl.ucsf.edu/chimera/docs/UsersGuide/midas/mmaker.html for details), just add them after the second "%s" on the line.
Please let me know if this does what you need, or if you'd like it to do more stuff.
Conrad

Hi Conrad,
Thanks again for your hard work. This works very well for me. I really like the clustering algorithm in Chimera because it adjusts the clustering threshold automatically and does its thing very quickly.
Speed is very important for my purposes, so it was necessary to comment out the MatchMaker part of the script (thanks for including that; I'm glad it seems like that wasn't the majority of your work!). Another thing slowing down the clustering was the invocation of "ksdssp" upon opening of the individual PDBs. Is there an option to skip that?
Finally, I want to run this on at least 1000 PDBs. After about 50, Python had gobbled up more than 1 GB of memory and my laptop slowed to a crawl. I was able to kill it, but this may be a concern for production work. We intend to run this on a high-memory node of a cluster, but is there some way to reduce the memory used?
Thanks for everything!
Darrell
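For a rough sense of scale, the memory figures Darrell reports (more than 1 GB after about 50 structures, with a target of at least 1000) can be extrapolated as below. This is back-of-the-envelope arithmetic only: it assumes memory grows linearly with the number of open models, which the thread does not establish, and the function name is hypothetical.

```python
def projected_memory_gb(observed_gb, observed_models, target_models):
    """Linearly extrapolate memory use from an observed run (assumption: linear growth)."""
    return observed_gb * float(target_models) / observed_models


# About 1 GB at 50 structures works out to roughly 20 MB per structure,
# so 1000 structures would need on the order of 20 GB under this assumption.
print(projected_memory_gb(1.0, 50, 1000))  # prints 20.0
```

Since the reported 1 GB is a lower bound ("more than 1 GB"), the projected figure is a lower bound as well.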

Assigning secondary structure is pretty ingrained, but if you _really_ want to skip it, you can open the attached "noksdssp.py" file before your models and it will prevent ksdssp from being called. Just remember that it's a HACK :-) and may have side effects if you do anything else that depends on secondary structure type.
The size problem, on the other hand, needs a little more analysis. Would it be possible for you to send me a large dataset so that I can try out a few things? Thanks.
Conrad
On 6/28/12 6:23 AM, Hurt, Darrell (NIH/NIAID) [E] wrote:
Hi Conrad,
Thanks again for your hard work. This works very well for me. I really like the clustering algorithm in Chimera because it adjusts the clustering threshold automatically and does its thing very quickly.
Speed is very important for my purposes, so it was necessary to comment out the MatchMaker part of the script (thanks for including that; I'm glad it seems like that wasn't the majority of your work!). Another thing slowing down the clustering was the invocation of "ksdssp" upon opening of the individual PDBs. Is there an option to skip that?
Finally, I want to run this on at least 1000 PDBs. After about 50, Python had gobbled up more than 1 GB of memory and my laptop slowed to a crawl. I was able to kill it, but this may be a concern for production work. We intend to run this on a high-memory node of a cluster, but is there some way to reduce the memory used?
Thanks for everything!
Darrell
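Putting the pieces of the thread together, the argument order matters: "noksdssp.py" goes before the model files, and "cluster.py" goes last. A minimal sketch of assembling that command line follows. The helper name build_chimera_command, the "poses/*.pdb" directory, and the choice to sort files by name are all assumptions made for illustration; only the "chimera --nogui --silent ... cluster.py" form and the "noksdssp.py before your models" ordering come from Conrad's messages.

```python
import glob


def build_chimera_command(model_files, cluster_script="cluster.py",
                          pre_scripts=(), chimera_exe="chimera"):
    """Return the argument list: chimera --nogui --silent [pre_scripts] models script."""
    return ([chimera_exe, "--nogui", "--silent"]
            + list(pre_scripts)      # e.g. noksdssp.py, opened before any model
            + list(model_files)      # the first file listed becomes the reference model
            + [cluster_script])      # the clustering script runs after all models are open


# Hypothetical layout: docking poses stored as poses/*.pdb.
# Sorting makes the choice of reference (first file) reproducible.
models = sorted(glob.glob("poses/*.pdb"))
command = build_chimera_command(models, pre_scripts=["noksdssp.py"])
print(" ".join(command))
# To actually launch Chimera: import subprocess; subprocess.call(command)
```

With thousands of files the command line can get long, so it may be worth checking the shell's argument-length limit on the target machine.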
participants (2): Conrad Huang; Hurt, Darrell (NIH/NIAID) [E]