Re: [Chimera-users] problem with huge PDB atom numbers

16 Feb 2010


Hi Sam,
	To supplement Elaine's answer: obviously no approach to this  
issue is free of problems, or the PDB would have adopted it  
instead of its current awkward approach of splitting a single entry  
across multiple files/IDs.  We thought about various options and decided to  
adopt AMBER's approach of bleeding the hundred-thousands digit into  
the sixth column (essentially adding "ATOM 1" through "ATOM 9"  
records).  The problem with resetting serial numbers is that it breaks  
CONECT records.  Of course, the digit-bleed approach does too (CONECT  
records only accommodate 5-digit serial numbers), but at least for a  
fairly common type of large-atom-number system, namely highly solvated  
proteins from MD (where CONECT records are typically only needed in  
the first 99999 atoms), the approach works without any problems.
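
For readers who want to see the column layout concretely, here is a minimal Python sketch of the digit-bleed convention described above. It is an illustration only, not Chimera's or AMBER's actual code; the function names `atom_record_prefix` and `fits_in_conect` are made up for this example.

```python
def atom_record_prefix(serial: int) -> str:
    """Return columns 1-11 of an ATOM record using the digit-bleed convention.

    Serials up to 99999 give the standard layout ("ATOM  " plus a 5-column
    serial).  Serials 100000-999999 let the leading digit spill into column 6,
    producing "ATOM 1" through "ATOM 9" as the apparent record name.
    """
    if not 1 <= serial <= 999_999:
        raise ValueError("digit-bleed only covers serials 1..999999")
    # "ATOM" occupies columns 1-4; the serial is right-justified in columns 5-11.
    return "ATOM" + f"{serial:7d}"


def fits_in_conect(serial: int) -> bool:
    """CONECT serial fields are 5 columns wide, so larger serials cannot be referenced."""
    return serial <= 99_999


print(repr(atom_record_prefix(99_999)))   # 'ATOM  99999'
print(repr(atom_record_prefix(100_000)))  # 'ATOM 100000'
```

Note that HETATM already fills all six record-name columns, so this trick only has room to work for ATOM records.
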
	We're probably going to stick with our wrong approach instead of  
adopting your wrong approach. :-)  Perhaps the forthcoming revised PDB  
format that Michael Zimmermann mentioned will solve all our problems...

--Eric

	Eric Pettersen
	UCSF Computer Graphics Lab

On Feb 16, 2010, at 10:14 AM, Elaine Meng wrote:
...
Hi Sam,
The spillover into column 6 was intentional, but it is  
understandable that it could cause problems for other programs.  The  
original PDB format (as you must be painfully aware) was not  
designed to handle such large structures as we have today, and  
different programs have taken different liberties with the format to  
try to accommodate these things.  PDB itself has taken the approach  
of splitting large structures into multiple entries.  Your approach  
breaks the rule of unique serial numbers, and Chimera's approach  
messes with column 6.  Chimera can use the serial numbers as unique  
identifiers, and if I remember correctly, we started using column 6  
after noting that another program (maybe it was AMBER trajectories?)  
expanded large serial numbers there.  Conversely, the programs you  
are using apparently tolerate duplicate serial numbers but not  
numbers in column 6.
I'm not sure what the upshot will be, but it is useful to know about  
these problems.  Thanks for letting us know,
Elaine
-----
Elaine C. Meng, Ph.D.
UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco
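
The difference between a reader that tolerates digits in column 6 and one that does not can be sketched roughly as follows. This is a hypothetical illustration in Python (the names `strict_record_name` and `read_atom_serial` are invented here), not the logic of Chimera or of any specific downstream program.

```python
def strict_record_name(line: str) -> str:
    """A strict reader treats columns 1-6 as the record name, nothing else."""
    return line[:6]


def read_atom_serial(line: str) -> int:
    """Read the serial from an ATOM line, accepting a digit spilled into column 6."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    # Columns 5-11 (slice 4:11) cover the optional bleed digit in column 6
    # plus the standard serial field in columns 7-11.
    return int(line[4:11])


line = "ATOM 100000  O   HOH W   1"
print(strict_record_name(line) == "ATOM  ")  # False: a strict reader sees "ATOM 1"
print(read_atom_serial(line))                # 100000
```

A strict reader that dispatches on the exact string "ATOM  " would skip or reject the spilled line, which matches the kind of breakage reported for other programs.
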
On Feb 14, 2010, at 1:00 PM, Samuel Coulbourn Flores wrote:
...
Hi Guys,
I thought I'd point out a problem with Chimera's atom numbering.  I  
am working on a ribosome model which has about 50 chains.  The  
ribosome pushes the limits of the PDB format;  the five columns  
reserved for atom numbers are not sufficient to give each atom a  
unique atom number.  This is not typically a problem for me; I just  
start the atom numbering at 1 for each chain.  However, Chimera  
doesn't do this; it tries to assign sequential, unique numbers to  
each atom and ends up spilling over into column 6, which is in the  
record name field.  This wreaks havoc with other programs which I  
use to process the PDB files that Chimera puts out.  I am manually  
renumbering to deal with the issue, but the developers should think  
about a more permanent solution.
Sam
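
The per-chain renumbering described in this message could be scripted along the following lines. This is a rough Python sketch under stated assumptions, not the procedure actually used: it assumes well-formed fixed-column lines with the chain ID in column 22, the function name `renumber_per_chain` is invented, and it leaves TER and CONECT records untouched (so any CONECT records would no longer match).

```python
def renumber_per_chain(lines):
    """Restart ATOM/HETATM serial numbers at 1 whenever the chain ID changes."""
    out = []
    current_chain = None
    serial = 0
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            chain = line[21]  # chain ID lives in column 22
            if chain != current_chain:
                current_chain, serial = chain, 0
            serial += 1
            if serial > 99_999:
                raise ValueError(f"chain {chain!r} has more than 99999 atoms")
            # Rewrite columns 1-11 with a clean record name and 5-column serial;
            # this also undoes any digit that had spilled into column 6.
            record = "HETATM" if line.startswith("HETATM") else "ATOM  "
            line = f"{record}{serial:5d}{line[11:]}"
        out.append(line)
    return out
```

As noted elsewhere in the thread, the result no longer has unique serial numbers across the file, so it only suits programs that tolerate duplicates.
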