
Hi Greg,
Adding an _atom_site.label_seq_id isn't different from supplying a residue number in a PDB file. When there are adjacent residues of the same type, does the PDB reader see a duplicate atom and generate a new residue? Merge the residues? Generate an error? I haven't tested the PDB reader, but a residue number helps it too.
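To make the question concrete, here is a minimal Python sketch of the two situations. It is not the actual PDB reader; the function names and the tuple layout are made up for illustration. With residue numbers, grouping atoms is trivial. Without them, a reader can only guess, for example by starting a new residue when the residue name changes or an atom name repeats.

```python
def group_by_seq_id(atoms):
    """Group (resname, seq_id, atom_name) records into residues by number."""
    residues = {}
    for resname, seq_id, atom_name in atoms:
        residues.setdefault((resname, seq_id), []).append(atom_name)
    return residues


def group_by_duplicate_atoms(atoms):
    """Guess residues from (resname, atom_name) records without numbering.

    Hypothetical fallback: start a new residue when the residue name
    changes or when an atom name is seen again in the current residue.
    """
    residues = []
    for resname, atom_name in atoms:
        last = residues[-1] if residues else None
        if last is None or last[0] != resname or atom_name in last[1]:
            residues.append((resname, [atom_name]))
        else:
            last[1].append(atom_name)
    return residues


# Two adjacent glycines, both incomplete: the first has only N and CA,
# the second has only C and O.
numbered = [("GLY", 5, "N"), ("GLY", 5, "CA"), ("GLY", 6, "C"), ("GLY", 6, "O")]
unnumbered = [(resname, atom) for resname, _, atom in numbered]

with_numbers = group_by_seq_id(numbered)
without_numbers = group_by_duplicate_atoms(unnumbered)
```

With numbers the two glycines stay separate; without numbers this fallback merges them into one residue, because no atom name repeats.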
The sequence number/ID from the PDB format tells which atoms are in the same residue, but it doesn't imply connectivity between residues, because the numbers don't need to be consecutive. In the mmCIF format it is stored as _atom_site.auth_seq_id (plus pdbx_PDB_ins_code for the full ID). This makes conversion from PDB to mmCIF problematic. If SEQRES is present, I do a sequence alignment to determine label_seq_id. If SEQRES is missing, I could ask the user to supply the full sequence, but since this was not obligatory when working with PDB files, the user may think that moving to mmCIF is a step backward. I could infer gaps and increase label_seq_id by 2 where there is a gap, but the resulting mmCIF file can be used for any purpose, not only by Chimera. The apparent gap may actually be caused by misplaced atoms, and the resulting gap in numbering could cause different problems later. It's not clear to me what's better here: leaving label_seq_id null or filling it with best guesses (which will sometimes be wrong).

Marcin
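
P.S. For illustration, the gap heuristic mentioned above (increase label_seq_id by 2 at a gap) could look roughly like the Python sketch below. This is only a sketch under simplifying assumptions, not the converter's actual code: the function name is made up, the input is a plain list of PDB residue numbers for one chain, and insertion codes are ignored.

```python
def guess_label_seq_ids(auth_seq_ids):
    """Guess label_seq_id values from PDB residue numbers of one chain.

    Assumption: a jump of more than 1 in the PDB numbering is treated
    as a gap, and the guessed label_seq_id then advances by 2 instead of 1.
    """
    label_ids = []
    prev_auth = None
    current = 0
    for auth in auth_seq_ids:
        is_gap = prev_auth is not None and auth > prev_auth + 1
        current += 2 if is_gap else 1
        label_ids.append(current)
        prev_auth = auth
    return label_ids


guessed = guess_label_seq_ids([10, 11, 12, 15, 16])
print(guessed)  # [1, 2, 3, 5, 6]
```

The example also shows the weakness: residues 13 and 14 are missing, so the guessed values after the gap are only placeholders. The true label_seq_id cannot be recovered without the full sequence.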