Suggestion on biological assembly load

Dear all,
I have a small suggestion that I would find helpful if implemented in Chimera: it would be great if the "biological assembly" could be fetched by ID from the File menu, just as the PDB coordinates file can now be fetched from the RCSB site. Such assemblies are kept in the PQS database at the EBI (Cambridge). The links there follow this syntax:
http://pqs.ebi.ac.uk/pqs-doc/macmol/<ID>.mmol
where <ID> is the ID of the PDB structure. Despite the extension, it is a plain PDB file. The CRYST1 record (for crystallographic structures) of such coordinate files has been modified to space group P1, since the crystallographic operations of the original structure no longer hold. A potential problem with these files is that they don't seem to care about having duplicated atom numbers...
These assemblies are most often correct and validated by experiments. However, you might want to add a caveat or, even better, an intermediate window with a link to the page where the PQS explains how that particular oligomer was generated and why it is considered biologically relevant:
http://pqs.ebi.ac.uk/pqs-bin/macmol.pl?filename=<ID>
Best regards,
Miguel
--
Miguel Ortiz Lombardía
Centro de Investigaciones Oncológicas
C/ Melchor Fernández Almagro, 3
28029 Madrid, Spain
Tel. +34 912 246 900
Fax. +34 912 246 976
e-mail: molatwork@yahoo.es
And so, unable to make what is just strong, we have made what is strong just. (Blaise Pascal, Pensées)
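The two link patterns quoted in the message above can be turned into a small helper. This is only a sketch: the function names are hypothetical and are not part of Chimera, the four-character check on the ID is an assumption, and nothing here verifies that the PQS server still answers at these addresses.

```python
PQS_BASE = "http://pqs.ebi.ac.uk"  # host taken from the links quoted in the message


def pqs_coordinates_url(pdb_id):
    """Return the URL of the PQS assembly file (<ID>.mmol) for a PDB ID."""
    pdb_id = pdb_id.strip()
    # Assumption: a PDB ID is exactly four alphanumeric characters.
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError("not a 4-character PDB ID: %r" % pdb_id)
    return "%s/pqs-doc/macmol/%s.mmol" % (PQS_BASE, pdb_id)


def pqs_explanation_url(pdb_id):
    """Return the URL of the page describing how the assembly was generated."""
    return "%s/pqs-bin/macmol.pl?filename=%s" % (PQS_BASE, pdb_id.strip())


if __name__ == "__main__":
    print(pqs_coordinates_url("1a34"))
    print(pqs_explanation_url("1a34"))
```

A fetch dialog could show the second URL next to the loaded model, which would cover the caveat suggested in the message.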

Hi Miguel,
Adding a fetch-by-ID option for PQS biological assembly PDB files looks like a good idea. I'll put it on the requested features list.
Some of those files reuse atom serial numbers and reuse chain identifiers. For example, satellite tobacco mosaic virus 1a34, which involves 60 copies of the asymmetric unit (4 chains), has more than 200,000 atoms and hundreds of chains. The PDB file format can't handle that correctly. The file still displays in Chimera, although it can be hard to select just a single capsid protein since it shares a chain identifier with other subunits. This is a very small virus. Bigger viruses just don't have PQS entries; they probably set a size limit. So this isn't a good way to handle icosahedral viruses. But most other PDB entries will probably not have size problems.
Another issue is that PQS has more than one biological assembly for some PDB entries. For example, 1a6v has 3 assemblies, with files called 1a6v_1.mmol, 1a6v_2.mmol and 1a6v_3.mmol. I guess it would make sense to load all of the assemblies as separate Chimera models in those cases.
Thanks for the suggestion.
Tom
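To make the two issues in this reply concrete, here is a minimal sketch that reports reused atom serial numbers in PDB-format text and lists the file names for an entry with several assemblies. It relies on the fixed-column PDB layout (serial number in columns 7-11, chain identifier in column 22). The function names are hypothetical, and how the number of assemblies for an entry would be discovered is not covered in the thread, so it is passed in as an argument.

```python
from collections import Counter


def _atom_lines(pdb_text):
    """Yield the ATOM/HETATM records of PDB-format text."""
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            yield line


def reused_atom_serials(pdb_text):
    """Map each atom serial number that occurs more than once to its count."""
    # Serials are kept as strings: very large files may overflow the
    # five-character field, so the text is not assumed to be an integer.
    counts = Counter(line[6:11].strip() for line in _atom_lines(pdb_text))
    return {serial: n for serial, n in counts.items() if n > 1}


def chain_ids(pdb_text):
    """Return the set of chain identifiers (column 22) used in the file."""
    return {line[21] for line in _atom_lines(pdb_text) if len(line) > 21}


def assembly_filenames(pdb_id, count):
    """File names for an entry with several assemblies, e.g. 1a6v_1.mmol."""
    # Assumption: 'count' is known in advance; the thread does not say
    # how a client would find out how many assemblies exist.
    return ["%s_%d.mmol" % (pdb_id, i) for i in range(1, count + 1)]
```

For 1a6v with three assemblies, `assembly_filenames("1a6v", 3)` gives the three names mentioned above, each of which could be opened as its own model.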

Hi Tom,
Thanks for your interest! I think that a collection of models for the cases where there is more than one assembly is an excellent idea.
On a related note, it's a pity that mmCIF files are not more widely used instead of the very limited PDB format. I like the fact that Chimera can handle them.
Best regards,
Miguel
--
Miguel Ortiz Lombardía
Centro de Investigaciones Oncológicas
C/ Melchor Fernández Almagro, 3
28029 Madrid, Spain
Tel. +34 912 246 900
Fax. +34 912 246 976
email: molatwork@yahoo.es
www: http://www.pangea.org/mol/spip.php?rubrique2
Work is the best thing man has found for doing nothing with his life. (Raoul Vaneigem)

Hi Miguel,
The remediated mmCIF virus files contain matrices to specify the full capsid, a single pentamer, a 23 hexamer (I think 6 monomers about a 3-fold axis), and the crystal asymmetric unit. I plan to allow the Chimera multiscale tool to list and display all the biological assemblies listed in mmCIF files.
Tom
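As an illustration of how such matrices describe an assembly, the sketch below applies a list of 3x4 operators (a rotation plus a translation column) to the coordinates of one asymmetric unit, producing one copy per operator. This is a generic example and not how Chimera or the multiscale tool does it; the function names and the in-memory layout of the matrices are assumptions, and reading them out of an mmCIF file is not shown.

```python
def apply_operator(matrix, coords):
    """Apply one 3x4 operator (rotation | translation) to a list of xyz tuples."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + row[3]
            for row in matrix
        ))
    return moved


def build_assembly(matrices, asym_unit_coords):
    """Return one transformed copy of the asymmetric unit per operator."""
    return [apply_operator(m, asym_unit_coords) for m in matrices]


if __name__ == "__main__":
    identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    # 180-degree rotation about the z axis, no translation.
    twofold_z = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0]]
    copies = build_assembly([identity, twofold_z], [(1.0, 2.0, 3.0)])
    for copy in copies:
        print(copy)
```

With 60 operators and the asymmetric unit as input, the same loop would give the full capsid; a subset of the operators would give a pentamer or hexamer.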
participants (4)
- Miguel Ortiz Lombardia
- Miguel Ortiz-Lombardia
- Thomas Goddard
- Tom Goddard