
Hi Sam,

To supplement Elaine's answer: obviously there is no approach to this issue that doesn't have its problems, or the PDB would have adopted it instead of its current awkward single-entry-split-across-multiple-files/IDs approach. We thought about various options and decided to adopt AMBER's approach of bleeding the hundred-thousands digit into the sixth column (essentially adding "ATOM 1" through "ATOM 9" records).

The problem with resetting serial numbers is that it breaks CONECT records. Of course, the digit-bleed approach does too (CONECT records only accommodate 5-digit serial numbers), but at least for a fairly common type of large-atom-number system, namely highly solvated proteins from MD (where CONECT records are typically only needed in the first 99999 atoms), the approach works without any problems. We're probably going to stick with our wrong approach instead of adopting your wrong approach. :-)

Perhaps the forthcoming revised PDB format that Michael Zimmermann mentioned will solve all our problems...

--Eric

Eric Pettersen
UCSF Computer Graphics Lab

On Feb 16, 2010, at 10:14 AM, Elaine Meng wrote:
Hi Sam,

The spillover into column 6 was intentional, but it is understandable that it could cause problems for other programs. The original PDB format (as you must be painfully aware) was not designed to handle such large structures as we have today, and different programs have taken different liberties with the format to try to accommodate these things. The PDB itself has taken the approach of splitting large structures into multiple entries. Your approach breaks the rule of unique serial numbers, and Chimera's approach messes with column 6.

Chimera can use the serial numbers as unique identifiers, and if I remember correctly, we started using column 6 after noting that another program (maybe it was AMBER trajectories?) expanded large serial numbers there. Conversely, the programs you are using apparently tolerate duplicate serial numbers but not numbers in column 6.
I'm not sure what the upshot will be, but it is useful to know about these problems.

Thanks for letting us know,
Elaine

-----
Elaine C. Meng, Ph.D.
UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco
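For illustration, the column-6 spillover discussed in the replies above can be sketched in Python. This is only a sketch under stated assumptions: the function names are hypothetical, and it is not Chimera's or AMBER's actual code. It assumes the usual fixed-column layout in which columns 1-6 hold the record name and columns 7-11 hold the serial number.

```python
# Sketch only: hypothetical helper names, not Chimera's or AMBER's actual code.

def atom_prefix(serial: int) -> str:
    """Build columns 1-11 of an ATOM record.

    Serials up to 99999 give the standard layout ("ATOM", two blanks,
    5-digit serial). Larger serials, up to 999999, spill their leading
    digit into column 6.
    """
    if not 1 <= serial <= 999_999:
        raise ValueError("serial must fit in six digits")
    return f"ATOM {serial:6d}"


def read_serial(line: str) -> int:
    """Read the serial from an ATOM record, tolerating column-6 spillover."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return int(line[5:11])  # columns 6-11; column 6 is blank for small serials


def strict_record_name(line: str) -> str:
    """What a strict reader sees when it takes columns 1-6 as the record name."""
    return line[:6].strip()
```

With these helpers, `atom_prefix(100000)` yields a record that `read_serial` parses back to 100000, while `strict_record_name` reports "ATOM 1" instead of "ATOM". That mismatch is presumably the kind of failure a strict downstream program runs into.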
On Feb 14, 2010, at 1:00 PM, Samuel Coulbourn Flores wrote:
Hi Guys,

I thought I'd point out a problem with Chimera's atom numbering. I am working on a ribosome model which has about 50 chains. The ribosome pushes the limits of the PDB format; the five columns reserved for atom numbers are not sufficient to give each atom a unique atom number. This is not typically a problem for me; I just start the atom numbering at 1 for each chain. However, Chimera doesn't do this. It tries to assign sequential, unique numbers to each atom and ends up spilling over into column 6, which is in the record name field. This wreaks havoc with other programs which I use to process the PDB files that Chimera puts out.

I am manually renumbering to deal with the issue, but the developers should think about a more permanent solution.

Sam
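The manual renumbering described in this message could be scripted roughly as follows. This is a minimal sketch, not a tool from the thread: `renumber_per_chain` is a hypothetical name, the chain ID is assumed to sit in column 22, and TER and CONECT records are left untouched, so any CONECT records in the file will no longer match the new serial numbers.

```python
# Sketch only: hypothetical function, assumes standard PDB column positions.

def renumber_per_chain(lines):
    """Restart ATOM/HETATM serial numbers at 1 for each chain.

    Accepts records whose serial has spilled into column 6 and rewrites
    them in the standard layout (record name in columns 1-6, serial in 7-11).
    """
    out = []
    serial = 0
    prev_chain = None
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            chain = line[21:22]  # column 22: chain identifier
            if chain != prev_chain:
                serial = 0  # new chain: restart numbering
                prev_chain = chain
            serial += 1
            if serial > 99_999:
                raise ValueError("chain has more atoms than five columns allow")
            name = "HETATM" if line.startswith("HETATM") else "ATOM  "
            line = f"{name}{serial:5d}{line[11:]}"
        out.append(line)
    return out
```

The result has duplicate serial numbers across chains, which is the trade-off of this approach: programs that need unique serials as identifiers cannot rely on them.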
_______________________________________________ Chimera-users mailing list Chimera-users@cgl.ucsf.edu http://www.cgl.ucsf.edu/mailman/listinfo/chimera-users