
Hello,
I'm solving a crystal structure, and in preparing to submit the final structure to the PDB, I used the PDB_Extract tool to convert my final refined .pdb file to an mmCIF file, as the PDB requests. After doing this, however, I noticed that when I open the CIF file in Chimera, there is no connectivity between the residues. This does not appear to be a problem in PyMOL.
Less importantly, I also have a question about the Sequence view in Chimera. I have a break in my chain due to unmodelled residues. If I hover my cursor over a residue in the sequence, the correct residue number is shown at the bottom of the window; however, the numbering at the side of the sequence window is wrong after the break because it does not take the chain break into account.
I appreciate any advice/help people can offer.
Thanks, Mike

Hi Michael,
Re the Sequence issue: the structure residue numbering and the sequence-window numbering can easily diverge (as you observed), because the sequence-window numbering is simply the position in the sequence or sequence alignment. In the alignment case, a given column could contain quite different residue numbers from multiple associated structures. Another common reason is that the structure numbering doesn’t start with 1. Gap characters in an alignment are also counted in the sequence numbering but not in the structure numbering.
However, a difference in numbering doesn’t pose a problem for the correct sequence-structure association, as reflected in the structure residue information shown at the bottom of the sequence window upon mouseover. In the sequence-window “Numberings” menu you can hide the numbering entirely or adjust the start number, but only the start number, which won’t help if you have an internal missing segment. There isn’t an option to use the residue numbers from the input coordinates file, but since the associations are correct, the difference is mainly cosmetic. Do you want the two kinds of numbering to match for convenience in interactive work, or to make a figure?
If your sequence information contained all the residues and the numbering of the structure residues started with 1, the two types of numbering would match up. The Sequence window would just show a red outline box (by default) around the residues that were missing from the coordinates, as is the case for deposited PDB entries. It sounds like you just started the Sequence tool, which would try to extract the sequence from the structure input file.
There are two general ways to get the full sequence information if it’s not in the coordinates:
(1) the PDB input includes the full sequence in a SEQRES section
(2) you open the full sequence from a text file in some standard format like FASTA, or fetch the sequence if you know its UniProt ID, and then associate your structure with that sequence. Association may happen automatically, but if it doesn’t, you can force it with “Structure… Associations” in the sequence window menu. In that case you don’t call the Sequence tool; you just use File… Open (to open a FASTA file) or File… Fetch by ID (to fetch the sequence from UniProt), and the sequence will automatically show up in a separate window.
Sequence window menus, associations, etc. are described in more detail here:
<http://www.rbvi.ucsf.edu/chimera/docs/ContributedSoftware/multalignviewer/framemav.html>
I hope this helps,
Elaine
-----
Elaine C. Meng, Ph.D.
UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco
In reply to Michael Blaisse <mblaisse@berkeley.edu>, Apr 8, 2016, at 10:49 AM
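
To make the numbering point above concrete, here is a minimal Python sketch, not Chimera code. It assumes that the sequence-window number is just the 1-based position (or a user-chosen start number plus an offset), as described above. The residue lists and the function names `sequence_positions` and `first_divergence` are made up for illustration.

```python
def sequence_positions(residue_numbers, start=1):
    """Pair each sequence-window position with the structure residue number."""
    return list(enumerate(residue_numbers, start=start))


def first_divergence(residue_numbers, start=1):
    """Return the first (position, residue number) pair that differs, or None."""
    for pos, resnum in enumerate(residue_numbers, start=start):
        if pos != resnum:
            return pos, resnum
    return None


# Hypothetical chain in which residues 4-6 were not modelled.
modelled = [1, 2, 3, 7, 8, 9]
# Hypothetical full sequence (e.g. from SEQRES or a FASTA file), numbered from 1.
full = list(range(1, 10))
# Hypothetical chain that starts at 101 and has an internal gap.
offset_with_gap = [101, 102, 106, 107]

print(sequence_positions(modelled))
print(first_divergence(modelled))                    # numbering drifts after the break
print(first_divergence(full))                        # full sequence: no drift
print(first_divergence(offset_with_gap, start=101))  # a start offset alone cannot fix a gap
```

With the full sequence the two numberings agree, which is why supplying SEQRES or a FASTA file removes the mismatch, whereas changing only the start number leaves the drift after an internal gap.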

Hi Mike,
I believe PyMOL draws bonds where atoms are within a certain distance of each other (I think it is in the PyMOL source, /pymol/layer2/ Atominfo.cpp) or where there is an explicit record in the CIF file, hence you see connectivity.
According to the Chimera website (https://www.cgl.ucsf.edu/chimera/data/mmcif-oct2013/mmcif.html), Chimera has a dictionary of standard bond templates for 265 residues and uses it to create your bonds. Perhaps your converted mmCIF has non-standard residues?
In any case, I think RCSB requires only the coordinates and structure factor file (but this could have changed in recent years). My personal opinion is that I wouldn’t worry too much about bond connectivity, but rather about whether the R statistics for your structure are reasonable for your resolution.
Hope this helps, Dave
In reply to Michael Blaisse <mblaisse@berkeley.edu>, Apr 8, 2016, at 10:49 AM
_______________________________________________
Chimera-users mailing list: Chimera-users@cgl.ucsf.edu
Manage subscription: http://plato.cgl.ucsf.edu/mailman/listinfo/chimera-users
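
The two bonding strategies contrasted above (distance-based versus template-based) can be sketched roughly as follows. This is an illustrative Python sketch only, not the actual PyMOL or Chimera implementation: the single 1.9 Å cutoff, the tiny `TEMPLATES` dictionary, the residue name `XYZ`, and the coordinates are all assumptions made up for the example.

```python
import math
from itertools import combinations

# Assumed single cutoff in angstroms; real programs use more refined criteria.
CUTOFF = 1.9

# Assumed miniature template dictionary (backbone bonds of one residue type only).
TEMPLATES = {"GLY": [("N", "CA"), ("CA", "C"), ("C", "O")]}


def distance_bonds(atoms, cutoff=CUTOFF):
    """Bond every pair of atoms closer than the cutoff. atoms: [(name, (x, y, z)), ...]."""
    bonds = []
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(atoms, 2):
        if math.dist(xyz_a, xyz_b) <= cutoff:
            bonds.append((name_a, name_b))
    return bonds


def template_bonds(residue_name, atom_names):
    """Look up bonds by residue name; an unknown residue yields no bonds at all."""
    present = set(atom_names)
    return [
        (a, b)
        for a, b in TEMPLATES.get(residue_name, [])
        if a in present and b in present
    ]


# Made-up planar coordinates with roughly realistic backbone bond lengths.
atoms = [
    ("N", (0.0, 0.0, 0.0)),
    ("CA", (1.46, 0.0, 0.0)),
    ("C", (2.0, 1.42, 0.0)),
    ("O", (3.22, 1.55, 0.0)),
]
names = [name for name, _ in atoms]

print(distance_bonds(atoms))         # works without knowing the residue type
print(template_bonds("GLY", names))  # works because GLY has a template
print(template_bonds("XYZ", names))  # hypothetical unknown residue: empty list
```

The distance approach needs no prior knowledge of the residue, which would be consistent with bonds showing up in one program but not in another that relies on lookup tables or on records in the file.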

Hi Mike,
I was waiting for someone who knows more about our mmCIF-reading code to give a more explanatory answer, but in the meantime, a solution may be to add the bonds using the Build Structure tool.
I assumed you meant the C-N peptide bonds between successive amino acids within a chain. Since I didn’t have an example, I tried to generate one by deleting those bonds. Commands:
  open 2gbp
  sel
  [… that reports 2396 bonds ]
  ~sel
  ~ribbon; display
  ~bond @n,c
If I select again, it now reports 2088 bonds. OK, now to test the solution...
In Build Structure (in the menu under Tools… Structure Editing) I went to the Adjust Bonds section and, with all atoms selected, clicked “Add” to add reasonable bonds between selected atoms, then refreshed the selection to check how many bonds were added. However, I got 2397 instead of 2396, so there was apparently one more than in the original structure. That is the main danger/difficulty: if you get more or fewer bonds than expected, figuring out which they are. You can delete any extra with ~bond, specifying only the offending pair(s) of atoms.
Simply adding the bonds doesn’t change the bond length. I see there is a separate option to change bond length right below, which might be a little confusing.
There is also a “bond” command to add a bond, but it takes only two atoms. Only the GUI can be used to add bonds en masse.
Docs for Adjust Bonds, including criteria for what is a reasonable bond:
<http://www.rbvi.ucsf.edu/chimera/docs/ContributedSoftware/editing/editing.html#bond>
… for commands bond and ~bond
<http://www.rbvi.ucsf.edu/chimera/docs/UsersGuide/midas/bond.html>
I hope this helps,
Elaine
-----
Elaine C. Meng, Ph.D.
UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco
In reply to david gae <ddgae@ucdavis.edu>, Apr 9, 2016, at 11:58 AM
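
For the bookkeeping problem described above (ending up with 2397 bonds where 2396 were expected and needing to find the odd one out), one generic approach is a set difference over unordered atom pairs. The Python sketch below is independent of Chimera: it assumes you have somehow exported the bond lists before and after as pairs of atom labels, and the labels and the function name `diff_bonds` are hypothetical.

```python
def diff_bonds(before, after):
    """Return (extra, missing) bonds, treating each bond as an unordered atom pair."""
    before_keys = {frozenset(pair) for pair in before}
    after_keys = {frozenset(pair) for pair in after}
    extra = sorted(tuple(sorted(key)) for key in after_keys - before_keys)
    missing = sorted(tuple(sorted(key)) for key in before_keys - after_keys)
    return extra, missing


# Hypothetical atom labels in "residue@atom" form.
original = [("1@C", "2@N"), ("2@C", "3@N")]
# Same bonds listed in the opposite atom order, plus one spurious bond.
rebuilt = [("2@N", "1@C"), ("3@N", "2@C"), ("1@O", "3@N")]

extra, missing = diff_bonds(original, rebuilt)
print("extra:", extra)
print("missing:", missing)
```

Using `frozenset` makes the comparison insensitive to the order in which the two atoms of a bond are listed, so only genuinely new or lost bonds are reported; those pairs are then the ones to pass to ~bond or bond.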

Silly me: if I select only the N,C atoms before adding all reasonable bonds, I get the desired 2396 bonds total for 2gbp. Thus you can minimize the “danger” by selecting only the relevant atoms for bond addition.
Elaine
In reply to Elaine Meng <meng@cgl.ucsf.edu>, Apr 9, 2016, at 12:29 PM

Hi, Mike.
The problem is that Chimera depends on the presence of the "entity_poly_seq" table for inter-residue connectivity, and your mmCIF file does not have that table. I've modified Chimera so that, instead of just giving up when entity_poly_seq is missing, it uses a simple-minded heuristic to connect adjacent amino acids with the same chain identifier. That should be in the daily builds (http://www.cgl.ucsf.edu/chimera/download.html#daily) dated April 14 or later. Can you please give that a spin tomorrow, or when you have a chance?
Thanks.
Conrad
In reply to Elaine Meng, 4/9/2016 12:44 PM
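
As a rough illustration of the diagnosis above, the Python sketch below does two things: it scans mmCIF text for any data name in the entity_poly_seq category, and it applies a naive "connect neighbours with the same chain identifier" rule. It is not Chimera's code. The details of the real heuristic are not given in this thread, so `connect_adjacent`, the line-based scan in `has_category`, and the sample data are all assumptions.

```python
def has_category(mmcif_text, category="entity_poly_seq"):
    """Crude line scan: True if some line starts with a data name of the category."""
    prefix = "_" + category.lower() + "."
    return any(
        line.strip().lower().startswith(prefix)
        for line in mmcif_text.splitlines()
    )


def connect_adjacent(residues):
    """Link consecutive residues (in file order) that share a chain identifier.

    residues: list of (chain_id, residue_number) tuples.
    Note: this naive version also bridges unmodelled gaps within a chain;
    a distance test between C and N atoms would be one way to avoid that.
    """
    links = []
    for prev, curr in zip(residues, residues[1:]):
        if prev[0] == curr[0]:
            links.append((prev, curr))
    return links


# Made-up mmCIF fragments for illustration.
with_table = "data_x\nloop_\n_entity_poly_seq.entity_id\n_entity_poly_seq.num\n_entity_poly_seq.mon_id\n1 1 MET\n"
without_table = "data_x\nloop_\n_atom_site.group_PDB\n_atom_site.id\nATOM 1\n"

print(has_category(with_table))
print(has_category(without_table))

# Hypothetical residues: chain A has a gap between 2 and 5, then chain B starts.
residues = [("A", 1), ("A", 2), ("A", 5), ("B", 1), ("B", 2)]
links = connect_adjacent(residues)
print(links)
```

A quick check like `has_category` on a converted file would show whether the table is present before opening it in a Chimera build that predates the fix.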
participants (4): Conrad Huang, david gae, Elaine Meng, Michael Blaisse