UniProt ID detection and display from mmCIF files
Hello,

I’m adding metadata to a large set of mmCIF files coming from a large-scale AlphaFold prediction effort. The goal is to have descriptions for each polypeptide chain visible when opening the file in ChimeraX. Adding descriptions is easy via the _entity.pdbx_description parameter. However, I can’t figure out how to add UniProt IDs so they show up in their own column, which is usually the case for files I’ve downloaded from the AlphaFold database or PDB. I’ve tried editing struct_ref and struct_ref_seq, but ChimeraX still won’t display the UniProt IDs. How do I do this?

Version information:
mmCIF dictionary: mmcif_ma.dic 1.4.5
ChimeraX: version 1.10 (2025-06-26)

Thanks for your help!

Sam Nitz
PhD candidate, Rock lab
Rockefeller University
Hi Sam,

ChimeraX gets the UniProt IDs from the struct_ref and struct_ref_seq mmCIF tables. Here is the code it uses:

https://github.com/RBVI/ChimeraX/blob/develop/src/bundles/mmcif/src/uniprot_...

That code shows it uses the fields id, db_name, db_code, and pdbx_db_accession in the struct_ref table, and ref_id, pdbx_strand_id, db_align_beg, db_align_end, pdbx_auth_seq_align_beg, and pdbx_auth_seq_align_end in the struct_ref_seq table. It requires some of those fields but not all of them.

There is probably something wrong in your tables that prevents ChimeraX from recognizing the IDs. Comparing your tables to those of any working PDB entry, for example 1a0m below, should give a clue where your problem lies.

Tom

#
_struct_ref.id                         1
_struct_ref.db_name                    UNP
_struct_ref.db_code                    CXA1_CONEP
_struct_ref.entity_id                  1
_struct_ref.pdbx_db_accession          P56638
_struct_ref.pdbx_align_begin           1
_struct_ref.pdbx_seq_one_letter_code   GCCSDPRCNMNNPDYC
_struct_ref.pdbx_db_isoform            ?
#
loop_
_struct_ref_seq.align_id
_struct_ref_seq.ref_id
_struct_ref_seq.pdbx_PDB_id_code
_struct_ref_seq.pdbx_strand_id
_struct_ref_seq.seq_align_beg
_struct_ref_seq.pdbx_seq_align_beg_ins_code
_struct_ref_seq.seq_align_end
_struct_ref_seq.pdbx_seq_align_end_ins_code
_struct_ref_seq.pdbx_db_accession
_struct_ref_seq.db_align_beg
_struct_ref_seq.pdbx_db_align_beg_ins_code
_struct_ref_seq.db_align_end
_struct_ref_seq.pdbx_db_align_end_ins_code
_struct_ref_seq.pdbx_auth_seq_align_beg
_struct_ref_seq.pdbx_auth_seq_align_end
1 1 1A0M A 1 ? 16 ? P56638 1 ? 16 ? 1 16
2 1 1A0M B 1 ? 16 ? P56638 1 ? 16 ? 1 16
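For a large batch of predicted models, tables like the 1a0m ones could be generated by a script instead of edited by hand. Below is a minimal Python sketch that emits struct_ref and struct_ref_seq as loop_ blocks. The ChainRef class and uniprot_tables function are hypothetical names made up for illustration. The field subset is an assumption: it covers the fields named in the reply plus entity_id, align_id and the struct_ref_seq accession from the 1a0m example. The reply does not say which fields ChimeraX strictly requires, so the output should still be checked against a working entry.

```python
from dataclasses import dataclass


@dataclass
class ChainRef:
    """UniProt reference for one chain (hypothetical helper type)."""
    chain_id: str    # chain ID as used in the model's atom records
    entity_id: int   # entity the chain belongs to
    accession: str   # UniProt accession, e.g. "P56638"
    entry_name: str  # UniProt entry name, e.g. "CXA1_CONEP"
    db_beg: int      # first UniProt residue covered by the chain
    db_end: int      # last UniProt residue covered by the chain
    auth_beg: int    # first residue number of the chain in the model
    auth_end: int    # last residue number of the chain in the model


# Assumed field subset; see the lead-in for where these names come from.
STRUCT_REF_FIELDS = ["id", "db_name", "db_code", "entity_id", "pdbx_db_accession"]
STRUCT_REF_SEQ_FIELDS = [
    "align_id", "ref_id", "pdbx_strand_id", "pdbx_db_accession",
    "db_align_beg", "db_align_end",
    "pdbx_auth_seq_align_beg", "pdbx_auth_seq_align_end",
]


def _loop(category, fields, rows):
    """Format one mmCIF loop_ block. Values must not contain whitespace."""
    lines = ["#", "loop_"]
    lines += [f"_{category}.{name}" for name in fields]
    lines += [" ".join(str(value) for value in row) for row in rows]
    return lines


def uniprot_tables(chains):
    """Return struct_ref and struct_ref_seq text for a list of ChainRef."""
    ref_ids = {}   # (entity_id, accession) -> struct_ref.id
    ref_rows = []
    seq_rows = []
    for align_id, c in enumerate(chains, start=1):
        key = (c.entity_id, c.accession)
        if key not in ref_ids:
            # One struct_ref row per distinct entity/accession pair.
            ref_ids[key] = len(ref_ids) + 1
            ref_rows.append(
                [ref_ids[key], "UNP", c.entry_name, c.entity_id, c.accession]
            )
        # One struct_ref_seq row per chain, pointing back at its struct_ref row.
        seq_rows.append([
            align_id, ref_ids[key], c.chain_id, c.accession,
            c.db_beg, c.db_end, c.auth_beg, c.auth_end,
        ])
    lines = _loop("struct_ref", STRUCT_REF_FIELDS, ref_rows)
    lines += _loop("struct_ref_seq", STRUCT_REF_SEQ_FIELDS, seq_rows)
    lines.append("#")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The two chains of 1a0m, using the values from the example tables.
    chains = [
        ChainRef("A", 1, "P56638", "CXA1_CONEP", 1, 16, 1, 16),
        ChainRef("B", 1, "P56638", "CXA1_CONEP", 1, 16, 1, 16),
    ]
    print(uniprot_tables(chains), end="")
```

The returned text would replace any existing struct_ref and struct_ref_seq categories in the file rather than be added alongside them, since a category should appear only once per data block. The chain IDs passed in need to match the ones actually used in the model.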
On Jun 10, 2026, at 10:58 AM, Samuel Nitz via ChimeraX-users <chimerax-users@cgl.ucsf.edu> wrote:
participants (2): Samuel Nitz, Tom Goddard