Hi Guillaume,
You can get the full sequence into your refined mmCIF file with the following cumbersome procedure. If you open your original full-length AlphaFold .pdb file and save it as .cif in ChimeraX, ChimeraX won't put the sequence info in the file, even if you haven't deleted any residues, because the original AlphaFold .pdb file does not have SEQRES records. But the ChimeraX .pdb file writer works better: if you open your original full AlphaFold .pdb model and save it as PDB in ChimeraX, it will add the SEQRES records. You can then copy those SEQRES records (a few lines at the top of the file) with a text editor into your refined .pdb file, which does not have all the residues. Since you are using mmCIF for your refined model, you would first open it and save it in .pdb format in order to add the SEQRES records, then resave that .pdb as .cif.
Here are the ChimeraX commands, assuming fullalphafold.pdb is the original AlphaFold model, and refined.cif is your refined model which lacks the full sequence.
open refined.cif
save refined_noseq.pdb
close
open fullalphafold.pdb
save alphafold_withseq.pdb
close
# In a text editor copy the SEQRES lines from alphafold_withseq.pdb to refined_noseq.pdb to make refined_withseq.pdb
# The SEQRES lines look like:
# SEQRES 1 A 125 GLY HIS MET HIS ASP CYS HIS GLN VAL THR VAL SER ARG
# SEQRES 2 A 125 ASP VAL THR LEU GLN ASN LYS GLU ARG HIS ASP CYS ASN
# SEQRES 3 A 125 GLN VAL CYS ALA SER ILE ASP LYS GLU THR GLU ASN LYS
# ...
open refined_withseq.pdb
save refined_withseq.cif
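If you would rather not do the copy step by hand, it can be scripted. The following is only a minimal Python sketch, not part of ChimeraX: the function name copy_seqres is made up for illustration, and the file names are the ones used in the commands above. It assumes the chain IDs in the SEQRES records match the chain IDs in the refined model, and it places the SEQRES lines ahead of the first record that normally follows them in a PDB header.

```python
from pathlib import Path

# Record types that come after SEQRES in the usual PDB record order.
# The copied SEQRES lines are inserted just before the first of these.
RECORDS_AFTER_SEQRES = (
    "MODRES", "HET", "FORMUL", "HELIX", "SHEET", "SSBOND", "LINK",
    "CISPEP", "SITE", "CRYST1", "ORIGX", "SCALE", "MTRIX", "MODEL", "ATOM",
)


def copy_seqres(source_pdb, target_pdb, output_pdb):
    """Copy SEQRES records from source_pdb into target_pdb, writing output_pdb.

    Hypothetical helper for illustration. Returns the number of SEQRES
    lines copied.
    """
    source_lines = Path(source_pdb).read_text().splitlines()
    seqres = [line for line in source_lines if line.startswith("SEQRES")]
    if not seqres:
        raise ValueError(f"no SEQRES records found in {source_pdb}")

    # Drop any SEQRES lines already in the target so only one set remains.
    target_lines = [
        line
        for line in Path(target_pdb).read_text().splitlines()
        if not line.startswith("SEQRES")
    ]

    # Find the first record that should follow SEQRES (end of file if none).
    insert_at = next(
        (
            index
            for index, line in enumerate(target_lines)
            if line.startswith(RECORDS_AFTER_SEQRES)
        ),
        len(target_lines),
    )
    merged = target_lines[:insert_at] + seqres + target_lines[insert_at:]
    Path(output_pdb).write_text("\n".join(merged) + "\n")
    return len(seqres)


if __name__ == "__main__":
    import sys

    # Usage: python copy_seqres.py alphafold_withseq.pdb refined_noseq.pdb refined_withseq.pdb
    if len(sys.argv) == 4:
        print(copy_seqres(sys.argv[1], sys.argv[2], sys.argv[3]), "SEQRES lines copied")
```

The resulting refined_withseq.pdb can then be opened in ChimeraX and saved as .cif exactly as in the last two commands above.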
I'm kind of surprised that the "bestGuess" option Greg described does not actually put the entity_poly_seq table in the mmCIF. If it did, you could avoid all the monkey-business of converting to .pdb format: you would just output the original AlphaFold model with bestGuess true as mmCIF and copy its entity_poly_seq table to your refined .cif file. But at least you can get the job done using the old .pdb format.
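To check whether a saved .cif file actually contains the table, a plain text scan is enough. This is a rough Python sketch, not a ChimeraX feature: the function name has_entity_poly_seq is made up for illustration, and it only looks for data names beginning with _entity_poly_seq. rather than parsing the mmCIF file properly.

```python
def has_entity_poly_seq(cif_path):
    """Return True if the file contains any _entity_poly_seq data name.

    Hypothetical helper for illustration; a simple line scan, not a real
    mmCIF parser. mmCIF data names are case-insensitive, so lines are
    lower-cased before the comparison.
    """
    with open(cif_path, encoding="utf-8") as handle:
        return any(
            line.lstrip().lower().startswith("_entity_poly_seq.")
            for line in handle
        )


if __name__ == "__main__":
    import sys

    # Usage: python check_seq.py refined_withseq.cif
    for path in sys.argv[1:]:
        status = "present" if has_entity_poly_seq(path) else "missing"
        print(f"{path}: entity_poly_seq {status}")
```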
Tom
Thank you, good to know!
I think in this case I can't get my full-length (FL) sequence with ChimeraX without losing a lot of my refinement work, because I saved my progress (without using these options...) while working in ISOLDE, re-opening the work-in-progress model every time I resumed working on it. But I will try the program in Phenix.
Cheers,
From: Greg Couch <gregc@cgl.ucsf.edu>
Sent: Friday, December 1, 2023 1:22:34 AM
To: ChimeraX Users Help
Cc: Guillaume Gaullier
Subject: Re: [chimerax-users] How to save a sequence in an mmCIF file?
To answer the general question, ChimeraX doesn't yet have a way to set the full sequence for an mmCIF entity. See https://www.wwpdb.org/deposition/preparing-pdbx-mmcif-files for how to fix the mmCIF output from various refinement packages. For example, Phenix has the "mmtbx.prepare_pdb_deposition" program to create an mmCIF file with the sequence.
In this particular case, where the starting structure is an AlphaFold prediction with atoms for every residue in the full sequence, you can get the correct sequence into the mmCIF output with the "bestGuess" option. See https://www.cgl.ucsf.edu/chimerax/docs/user/commands/save.html#mmcif. I'd also recommend, in your case, using the computedSheets option. In older ChimeraX versions, you need to run dssp before using computedSheets to get the helix information. In recent ChimeraX versions (daily build and 1.7 release candidate), computedSheets will also output the helix information if it wasn't present in the input.
Adding a sequence with bestGuess can be deceiving because of missing leading or trailing residues, or gaps of indeterminate length. But in this case, you should be fine.
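To see where a trimmed model has such gaps, one can list the jumps in residue numbering. The following is a minimal Python sketch, independent of ChimeraX: the function name numbering_gaps is made up for illustration, and it assumes a PDB-format file with standard fixed columns (chain ID in column 22, residue number in columns 23-26). It only reports jumps between residues that are present, so it cannot tell how many residues are missing before the first or after the last modeled residue, which is exactly the information a best guess lacks.

```python
def numbering_gaps(pdb_path):
    """List jumps in residue numbering for each chain of a PDB file.

    Hypothetical helper for illustration. Returns a list of
    (chain_id, last_residue_before_gap, first_residue_after_gap) tuples.
    """
    last_seen = {}  # chain ID -> most recent residue number
    gaps = []
    with open(pdb_path) as handle:
        for line in handle:
            if not line.startswith("ATOM"):
                continue
            chain_id = line[21]          # column 22
            number = int(line[22:26])    # columns 23-26
            previous = last_seen.get(chain_id)
            if previous is not None and number > previous + 1:
                gaps.append((chain_id, previous, number))
            last_seen[chain_id] = number
    return gaps


if __name__ == "__main__":
    import sys

    # Usage: python gaps.py refined_noseq.pdb
    for path in sys.argv[1:]:
        for chain_id, before, after in numbering_gaps(path):
            print(f"{path}: chain {chain_id} jumps from {before} to {after}")
```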
-- Greg
On 11/29/23 02:45, Guillaume Gaullier wrote:
From: Guillaume Gaullier via ChimeraX-users <chimerax-users@cgl.ucsf.edu>
Subject: [chimerax-users] How to save a sequence in an mmCIF file?
Date: November 29, 2023 at 2:45:31 AM PST
Hello,
Starting from an AlphaFold prediction, I refined a model against a map with ISOLDE. I trimmed the segments not supported by any density. The resulting mmCIF file that I saved now opens with this warning:
Unknown polymer entity '1' near line 187
Missing or incomplete entity_poly_seq table. Inferred polymer connectivity.
Displaying the sequence of this chain shows the correct numbering (with jumps in numbering according to missing segments in the structure), but the sequence of the missing structure segments is not displayed.
When I open the fasta file containing the full-length sequence, the sequence gets automatically associated to the structure, and the sequence viewer annotates the segments with missing structure correctly. I would like to save this full-length
sequence in my mmCIF file so the full-length sequence with annotated missing structure segments shows up next time I open this file. But when I try to save at this point, I get the following notice:
Not saving entity_poly_seq for non-authoritative sequences
The documentation for "save" and "sequence" didn't help. How can I make this sequence "authoritative" and save it into my mmCIF file?
Thank you in advance,
ChimeraX-users mailing list -- chimerax-users@cgl.ucsf.edu
To unsubscribe send an email to chimerax-users-leave@cgl.ucsf.edu
Archives: https://mail.cgl.ucsf.edu/mailman/archives/list/chimerax-users@cgl.ucsf.edu/