
Hi Marcin,

In the mmCIF format atom_site.label_seq_id is required, as described in the mmCIF documentation:

http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_atom_site.labe...

That documentation says it is used by the 60 mmCIF tables I copied below -- if you don't include it, then none of those tables can refer to a specific residue. A minimal mmCIF file can of course omit all of those 60 tables, so in principle it could work to omit label_seq_id or specify it as ".". But you are really asking for software not to work. It is like not including any atom names because you don't happen to need them -- good luck getting software to work correctly when you omit basic information.

Tom

_atom_site_anisotrop.pdbx_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_atom_site_anis...>
_geom_angle.atom_site_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_angle.ato...>
_geom_angle.atom_site_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_angle.ato...>
_geom_angle.atom_site_label_seq_id_3 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_angle.ato...>
_geom_bond.atom_site_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_bond.atom...>
_geom_bond.atom_site_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_bond.atom...>
_geom_contact.atom_site_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_contact.a...>
_geom_contact.atom_site_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_contact.a...>
_geom_hbond.atom_site_label_seq_id_A <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_hbond.ato...>
_geom_hbond.atom_site_label_seq_id_D <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_hbond.ato...>
_geom_hbond.atom_site_label_seq_id_H <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_hbond.ato...>
_geom_torsion.atom_site_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_torsion.atom_site_label_seq_id_1.html>
_geom_torsion.atom_site_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_torsion.a...>
_geom_torsion.atom_site_label_seq_id_3 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_torsion.a...>
_geom_torsion.atom_site_label_seq_id_4 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_torsion.a...>
_ndb_struct_na_base_pair.i_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_base_pair.i_label_seq_id.html>
_ndb_struct_na_base_pair.j_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_...>
_ndb_struct_na_base_pair_step.i_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_...>
_ndb_struct_na_base_pair_step.i_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_...>
_ndb_struct_na_base_pair_step.j_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_...>
_ndb_struct_na_base_pair_step.j_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_...>
_pdbx_atom_site_aniso_tls.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_atom_site...>
_pdbx_domain_range.beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_domain_ra...>
_pdbx_domain_range.end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_domain_ra...>
_pdbx_feature_monomer.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_feature_m...>
_pdbx_refine_component.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_refine_co...>
_pdbx_remediation_atom_site_mapping.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_remediati...>
_pdbx_sequence_range.beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_sequence_...>
_pdbx_sequence_range.end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_sequence_...>
_pdbx_struct_chem_comp_diagnostics.seq_num <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_ch...>
_pdbx_struct_chem_comp_feature.seq_num <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_ch...>
_pdbx_struct_conn_angle.ptnr1_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_co...>
_pdbx_struct_conn_angle.ptnr2_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_co...>
_pdbx_struct_conn_angle.ptnr3_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_co...>
_pdbx_struct_group_component_range.end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_gr...>
_pdbx_struct_group_components.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_gr...>
_pdbx_struct_mod_residue.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_mo...>
_pdbx_struct_sheet_hbond.range_1_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_sh...>
_pdbx_struct_sheet_hbond.range_2_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_sh...>
_struct_conf.beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conf.be...>
_struct_conf.end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conf.en...>
_struct_conn.pdbx_ptnr3_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conn.pd...>
_struct_conn.ptnr1_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conn.pt...>
_struct_conn.ptnr2_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conn.pt...>
_struct_mon_nucl.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_mon_nuc...>
_struct_mon_prot.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_mon_pro...>
_struct_mon_prot_cis.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_mon_pro...>
_struct_mon_prot_cis.pdbx_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_mon_pro...>
_struct_sheet_hbond.range_1_beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_h...>
_struct_sheet_hbond.range_1_end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_h...>
_struct_sheet_hbond.range_2_beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_h...>
_struct_sheet_hbond.range_2_end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_h...>
_struct_sheet_range.beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_r...>
_struct_sheet_range.end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_r...>
_struct_site_gen.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_site_ge...>
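
To illustrate the cross-reference point, here is a minimal sketch in plain Python of how a range given by _struct_conf.beg_label_seq_id / end_label_seq_id might be resolved against atom_site rows. The rows are hand-written, hypothetical data (not from any real entry), and atoms_in_range is an illustrative helper, not part of any library or of ChimeraX.

```python
# Hypothetical atom_site rows, one CA atom per residue (illustration only).
atom_site = [
    {"label_asym_id": "A", "label_seq_id": "1", "label_comp_id": "GLY", "label_atom_id": "CA"},
    {"label_asym_id": "A", "label_seq_id": "2", "label_comp_id": "ALA", "label_atom_id": "CA"},
    {"label_asym_id": "A", "label_seq_id": "3", "label_comp_id": "SER", "label_atom_id": "CA"},
    {"label_asym_id": "A", "label_seq_id": "4", "label_comp_id": "LEU", "label_atom_id": "CA"},
]

# Hypothetical _struct_conf row: a conformation spanning residues 2..3 of chain A.
struct_conf = {
    "beg_label_asym_id": "A", "beg_label_seq_id": "2",
    "end_label_asym_id": "A", "end_label_seq_id": "3",
}


def atoms_in_range(atoms, asym_id, beg, end):
    """Return the atoms of chain asym_id whose label_seq_id lies in [beg, end]."""
    selected = []
    for atom in atoms:
        seq = atom["label_seq_id"]
        if seq in (".", "?"):
            # No label_seq_id: a residue range cannot refer to this atom.
            continue
        if atom["label_asym_id"] == asym_id and beg <= int(seq) <= end:
            selected.append(atom)
    return selected


beg = int(struct_conf["beg_label_seq_id"])
end = int(struct_conf["end_label_seq_id"])

# With label_seq_id present, the range resolves to ALA 2 and SER 3.
print([a["label_comp_id"] for a in atoms_in_range(atom_site, "A", beg, end)])

# With label_seq_id written as ".", the same range matches nothing.
stripped = [dict(a, label_seq_id=".") for a in atom_site]
print(atoms_in_range(stripped, "A", beg, end))
```

In this sketch the second lookup returns an empty list, which is the failure mode described above: the table is present but can no longer point at any residue.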
On Jul 8, 2020, at 4:27 AM, Marcin Wojdyr <wojdyr@gmail.com> wrote:
Hi Greg,
Adding an atom_site.label_seq_id isn't different from supplying a residue number in a PDB file. When there are adjacent residues of the same type, does the PDB reader see a duplicate atom and generate a new residue? Merge the residues? Generate an error? I haven't tested the PDB reader, but a residue number helps it too.
The sequence number/id from the PDB format tells which atoms are in the same residue, but it doesn't imply connectivity between residues, because the numbers don't need to be consecutive. In the mmCIF format it is stored as _atom_site.auth_seq_id (plus pdbx_PDB_ins_code for the full id). This makes conversion from PDB to mmCIF problematic.

If SEQRES is present, I do sequence alignment to determine label_seq_id. If SEQRES is missing, I could ask the user to supply the full sequence, but then the user may think that moving to mmCIF is a step backward, since this was not obligatory when working with PDB files. I could infer gaps and increase label_seq_id by 2 where there is a gap, but the resulting mmCIF file can be used for any purpose, not only by Chimera. The apparent gap may actually be caused by misplaced atoms, and a gap in the numbering could then cause different problems later. It's not clear to me what's better here: leaving label_seq_id null or filling it with the best guesses (which will sometimes be wrong).
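
For concreteness, the "increase by 2 if there is a gap" idea could be sketched roughly as below. This is only an illustration of the heuristic as described in this message, under simplifying assumptions: guess_label_seq_ids is a hypothetical helper (not code from any existing converter), it takes the auth_seq_id values of one chain in file order, and it ignores insertion codes and SEQRES entirely.

```python
def guess_label_seq_ids(auth_seq_ids):
    """Guess label_seq_id values for one chain from its auth_seq_id values.

    Residues numbered consecutively get consecutive label_seq_id values.
    Any other jump in numbering is treated as a gap of unknown length and
    advances label_seq_id by 2, leaving exactly one unused number.
    Assumption: one integer per residue, in file order, no insertion codes.
    """
    labels = []
    previous_auth = None
    current = 0
    for auth in auth_seq_ids:
        if previous_auth is None or auth == previous_auth + 1:
            current += 1  # first residue, or directly follows the previous one
        else:
            current += 2  # apparent gap: skip one label_seq_id
        labels.append(current)
        previous_auth = auth
    return labels


# Residues 10-12 and 15-16 of a chain; 13 and 14 are not modelled.
print(guess_label_seq_ids([10, 11, 12, 15, 16]))
```

Note that the two missing residues collapse into a single skipped number here, so the guessed values mark where a gap is but not how long it is. That is one way such guesses can be wrong.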
Marcin

_______________________________________________
ChimeraX-users mailing list
ChimeraX-users@cgl.ucsf.edu
Manage subscription: https://www.rbvi.ucsf.edu/mailman/listinfo/chimerax-users