
Dear all,
I would like to write out both the sequence alignment and the RMSD after running “mmaker”, but I do NOT want to do this interactively because I have a batch of structures to deal with. I found Eric’s Python script, structAlign.py, which is GUI-only, but I do not know how to extract the RMSD from the Reply Log in GUI mode. Any help would be appreciated. Thanks a lot!
Btw, may I know how to run structAlign.py? Is there a command like “chimera structAlign.py”, and where should I put the input file structList? I cannot run this script right now; here is the error message I got after running “chimera structAlign.py”:
Traceback (most recent call last):
  File "/home/sheng/local/bin/chimera/share/__main__.py", line 59, in ?
    value = chimeraInit.init(sys.argv)
  File "CHIMERA/share/chimeraInit.py", line 298, in init
    chimera.openModels.open(a, prefixableType=1)
  File "CHIMERA/share/chimera/__init__.py", line 1253, in open
  File "CHIMERA/share/chimera/__init__.py", line 746, in _openPython
  File "/home/sheng/local/bin/chimera/from_mailinglist/structAlign.py", line 18, in ?
    pdb1, pdb2, output = line.strip().split()
ValueError: need more than 2 values to unpack
Error while processing structAlign.py:
ValueError: need more than 2 values to unpack
(see reply log for Python traceback info)
Best wishes,
Zhiya

Hi Zhiya,
That error indicates that your "structList" file has a line with only two fields, but three are required: pdb1, pdb2, output_file. For example, a line like this:
/mol/pdb/mn/pdb2mnr.ent /mol/pdb/en/pdb4enl.ent aligned.msf
would align the two named PDB files and write the alignment to aligned.msf.
Now, there have been some improvements/changes in Chimera since the structAlign.py script was written (2005). I've appended a revised version of the script that not only writes out the alignment but also writes a second file with the RMSD (same name as the alignment file, with ".rmsd" appended). Since it seems that you didn't care about the Match->Align part of the original script, I cut that part out, so the alignment is just what MatchMaker generates.
--Eric
In reply to the message from <shengzhiya@nibs.ac.cn> of Dec 7, 2007, 1:55 AM.
_______________________________________________ Chimera-users mailing list Chimera-users@cgl.ucsf.edu http://www.cgl.ucsf.edu/mailman/listinfo/chimera-users
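The three-field requirement Eric describes can be checked before any structures are opened, so that a short line is reported with its line number instead of surfacing as an unpacking error. The following is only a minimal sketch in plain Python, not part of structAlign.py; the function name `parse_struct_list` and the decision to skip blank lines are assumptions made for illustration.

```python
def parse_struct_list(lines):
    """Yield (pdb1, pdb2, output) triples from structList-style lines.

    Hypothetical helper: each non-blank line must hold exactly three
    whitespace-separated fields, as described in the reply above.
    """
    for index, line in enumerate(lines):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored rather than rejected
        if len(fields) != 3:
            raise ValueError(
                "line %d: expected 3 fields (pdb1 pdb2 output), got %d: %r"
                % (index + 1, len(fields), line.strip()))
        yield tuple(fields)


# Example input: one valid line (taken from the reply above) and one blank line.
example_lines = [
    "/mol/pdb/mn/pdb2mnr.ent /mol/pdb/en/pdb4enl.ent aligned.msf",
    "",
]
jobs = list(parse_struct_list(example_lines))
```

With a real file, the same function could be fed `open("structList")`, since iterating over a file object yields its lines.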

Dear Eric,
Thank you very much for your reply, and I apologize for this late “thank you”! I was out of town and could not access the internet during the weekend.
I still have several questions. Would you please help me?
1. The significance of an RMSD varies with the number of atom pairs, so it may make more sense to write out the number of atom pairs at the same time. Is this possible?
2. I think I need the alignment from Match->Align, which better represents the structure alignment. Sorry, I got stuck on the output problem and forgot what I needed in the first place. I am also curious about which residues are included in the “core” regions and used for the RMSD calculation after turning on the “iterate” option. Can I see this from the Match->Align output? It seems that even if I choose the same cutoff in Match->Align as in the iteration, the number of aligned residues is still not the same as the number of atom pairs in the RMSD calculation.
3. Also, I am really sorry for not mentioning this in my first letter: can I write the transformed PDB file out at the same time? I can only do this with a “.com” file, not with a Python script.
Thank you very much for your help!
Have a nice day!
Best wishes,
Zhiya

Hi Zhiya,
I've attached an enhanced version of the script that covers your additional needs. Changes are:
1) writes out the number of residue pairs involved in the RMSD
2) appends a list of those residue pairs
3) does the Match->Align step and writes that alignment instead of the MatchMaker alignment
4) writes out the transformed version of the second PDB
5) if the alignment file is named xxx.msf, then the RMSD file is xxx.rmsd (instead of xxx.msf.rmsd) and the PDB file is xxx.pdb
Note that in a Python script, if you know how to do something with a Chimera command, you can do the same thing in the script by using the "runCommand" function. For instance, if you look in the script you can see that I write the PDB file by using runCommand with a string argument that is the same as the "write" command you would use at the Chimera command line to write the PDB file.
Also, you may want to change the script to use different values in the calls to the MatchMaker and Match->Align functions than the ones I used. For instance, the call to Match->Align uses a distance cutoff of 4.0 (same as the 2005 script), whereas the default cutoff nowadays is 5.0.
In regards to Match->Align producing a different number of aligned columns than the MatchMaker "core" even with the same cutoff value, that is totally believable. You will get situations where loop residues happen to cross each other within the cutoff distance, but those residues were not in the same column of the MatchMaker alignment [nor should they be, really] and therefore could not be in its final "core".
--Eric
In reply to the message from <shengzhiya@nibs.ac.cn> of Dec 10, 2007, 2:54 AM.
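Two points from Eric's reply can be sketched in a few lines: the output naming rule (xxx.msf gives xxx.rmsd and xxx.pdb) and handing a command string to "runCommand". This is not the attached script. The helper names are hypothetical, the fallback for names that do not end in ".msf" is an assumption, and the exact "write relative" form should be checked against the "write" command documentation for your Chimera version.

```python
import os.path


def output_names(alignment_path):
    """Return (rmsd_path, pdb_path) derived from the alignment file name.

    xxx.msf -> xxx.rmsd and xxx.pdb, as described in the reply above.
    Assumption: for any other name the suffix is simply appended.
    """
    base, ext = os.path.splitext(alignment_path)
    stem = base if ext.lower() == ".msf" else alignment_path
    return stem + ".rmsd", stem + ".pdb"


def write_command(ref_model, moved_model, pdb_path):
    """Build a Chimera command string that saves the moved model.

    Assumed syntax: "write relative <ref> <model> <file>", i.e. save
    <model> with coordinates relative to the reference model.
    """
    return "write relative %d %d %s" % (ref_model, moved_model, pdb_path)


def run_in_chimera(command):
    """Execute a command string; only works inside Chimera's Python."""
    from chimera import runCommand
    runCommand(command)


rmsd_path, pdb_path = output_names("aligned.msf")
command = write_command(0, 1, pdb_path)
# Inside Chimera, with the two structures open as models 0 and 1:
#     run_in_chimera(command)
```

Keeping the command as a plain string makes it easy to try the same text at the Chimera command line first and only then move it into the script.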

On Dec 10, 2007, at 2:54 PM, Eric Pettersen wrote:
In regards to Match->Align producing a different number of aligned columns than the MatchMaker "core" even with the same cutoff value, that is totally believable. You will get situations where loop residues happen to cross each other within the cutoff distance but those residues were not in the same column of the MatchMaker alignment [nor should they be really] and therefore could not be in its final "core".
Just wanted to add: I've found that with more difficult-to-align sequences (worst case is very low sequence identity and all-beta or all-alpha secondary structure), there may be incorrect segments in the initial MatchMaker alignment that are corrected in the alignment from a subsequent Match->Align step. The purpose of MatchMaker is to generate a correct superposition, and this is still successful in nearly all cases because the incorrect areas are pruned during fit iteration (only the correct columns are used to generate the final superposition). Match->Align will then generate columns for all the superimposed parts, some of which were not used in the prior fitting step.
Which alignment is better depends on the situation. For example, there could be different structures of the same protein where one loop moves a lot. The MatchMaker alignment will simply align the entire identical or nearly identical sequences, whereas Match->Align will not align the loops that are poorly superimposed in space.
If you are comparing different structural matches, especially hard-to-align distantly related cases, it may be more appropriate to use the number of pairs in the Match->Align alignment rather than those values from MatchMaker. In our paper, for example, we reported the number of pairs and RMSDs from the Match->Align alignment, not the MatchMaker one:
Tools for integrated sequence-structure analysis with UCSF Chimera. E.C. Meng, E.F. Pettersen, G.S. Couch, C.C. Huang, and T.E. Ferrin, BMC Bioinformatics 7, 339 (2006). http://www.biomedcentral.com/1471-2105/7/339
The drawback is that getting that RMSD value requires another round of fitting, this time on all positions in the Match->Align alignment (without iteration).
This may have been more detail than you wanted!
Elaine
-----
Elaine C. Meng, Ph.D.
meng@cgl.ucsf.edu
UCSF Computer Graphics Lab and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco
http://www.cgl.ucsf.edu/home/meng/index.html
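To make the point about pair counts concrete, here is a small plain-Python sketch that reports an RMSD together with the number of pairs it was computed over, once for all pairs and once for the pairs within a distance cutoff. It is not how MatchMaker or Match->Align work internally: it assumes the coordinates are already superimposed and paired, it does no refitting, and the function names and example coordinates are made up for illustration.

```python
import math


def pair_distances(coords_a, coords_b):
    """Distances between corresponding points of two paired coordinate lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    return [math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
            for a, b in zip(coords_a, coords_b)]


def rmsd(distances):
    """Root-mean-square of a non-empty list of distances."""
    if not distances:
        raise ValueError("need at least one pair")
    return math.sqrt(sum(d * d for d in distances) / float(len(distances)))


def summarize(coords_a, coords_b, cutoff):
    """Return (n_all, rmsd_all, n_close, rmsd_close).

    "close" means pairs no farther apart than `cutoff`; rmsd_close is
    None when no pair qualifies.
    """
    dists = pair_distances(coords_a, coords_b)
    close = [d for d in dists if d <= cutoff]
    rmsd_close = rmsd(close) if close else None
    return len(dists), rmsd(dists), len(close), rmsd_close


# Made-up, already superimposed coordinates: two well-matched pairs and
# one pair 7 units apart, such as a loop that has moved.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 7.0)]
n_all, rmsd_all, n_close, rmsd_close = summarize(structure_a, structure_b, 4.0)
```

Reporting the two numbers side by side shows why an RMSD is hard to interpret without its pair count: dropping the single distant pair changes both values at once.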

Dear Eric and Elaine, Thank you so much for your help and advice! Best wishes, Zhiya