Hi Ingvar,

  Glad to hear that the EMDB is collecting the map symmetry information.  This will make the maps more valuable for further research.

  Before creating weird heuristics for guessing the symmetry in ambiguous cases, it is important to understand why we don’t get correlation values equal to 1 for a map that has been symmetrized.  The map we have been talking about, EMDB 2463, is said to have been created by imposing 12-fold symmetry as described in the article

Structure, adsorption to host, and infection mechanism of virulent lactococcal phage p2.
Bebeacua C, Tremblay D, Farenc C, Chapot-Chartier MP, Sadovskaya I, van Heel M, Veesler D, Moineau S, Cambillau C.
J Virol. 2013 Nov;87(22):12302-12. doi: 10.1128/JVI.02033-13. Epub 2013 Sep 11.

Chimera reports the center of symmetry of the 128 by 128 by 128 grid point map to be at grid point 64,64,any-z-index.  Is this right?  The “measure symmetry” command only considers symmetry axes passing through a grid point, or midway between grid points.  In other words it assumes the grid index i,j,k that the symmetry axis passes through has integer or half-integer values.  That doesn’t appear to be the case for this map.  The first sign that raises some suspicion is that the C4 correlation value is not 1.0.  If the map has C12 symmetry then it also has C4 symmetry, and C4 symmetry is just rotation by 90 degrees.  If the center is at a grid point (i=j=64) then the 90 degree rotation maps each grid point exactly to another grid point.  So the map values should be identical at symmetry equivalent grid points (i,j,k), (-j,i,k), (-i,-j,k), (j,-i,k), with i and j measured relative to the center.  If I make two copies of the map and rotate one by 90 degrees and subtract, the result should be exactly zero at each grid point.  So why does Chimera report the C4 correlation as 0.99995 instead of 1.0?
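
To make the grid argument concrete, here is a minimal Python sketch (not Chimera code; the function names are made up for illustration). It rotates a grid index by 90 degrees about a chosen center and checks whether the result is again a grid point. That happens for integer and half-integer centers, but not for a center such as 64.08.

```python
def rotate90_about(i, j, cx, cy):
    """Rotate grid index (i, j) by 90 degrees about the z axis through (cx, cy).

    Relative to the center this is the map (i, j) -> (-j, i).
    """
    return (cx - (j - cy), cy + (i - cx))


def lands_on_grid(x, y):
    """True if both coordinates are whole numbers, i.e. an actual grid point."""
    return float(x).is_integer() and float(y).is_integer()


# Integer center: every grid point maps exactly onto another grid point.
on_integer_center = lands_on_grid(*rotate90_about(10, 20, 64, 64))
# Half-integer center (midway between grid points): still exact.
on_half_center = lands_on_grid(*rotate90_about(10, 20, 63.5, 63.5))
# Off-grid center: the rotated point falls between grid points,
# so comparing map values requires interpolation.
on_offgrid_center = lands_on_grid(*rotate90_about(10, 20, 64.08, 64))
```

With an off-grid center the rotated copy has to be interpolated, so the correlation cannot be expected to be exactly 1.0 even for a perfectly symmetrized map.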

In these cases I like to look at two copies of the map, with one copy rotated about the presumed symmetry axis.  I’ve attached an image emdb_2463_r180_64.jpg showing the result.  The yellow map was rotated by 180 degrees about the z-axis through grid point 64,64,64.  It doesn’t align with the gray copy of the map; it appears shifted to the left, indicating that the x grid index for the center is wrong.  The second image emdb_2463_r180_64.2.jpg shows the result if I rotate about grid point 64.2,64,64: it is misaligned in the opposite direction.  Using rotation center 64.08,64,64 gives a reasonable superposition, image emdb_2463_r180_64.08.jpg.  The surfaces don’t match exactly because rotation by 180 degrees does not map a grid point to a grid point when the center is not an integer grid index.  The surfaces are computed using grids that are shifted relative to one another and will not be identical since the surface triangulation is based on the grid (using the marching cubes algorithm).
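
The same search for the center can be sketched numerically instead of by eye. The Python below is a one-dimensional toy under stated assumptions, not what Chimera does: the "map" is a synthetic Gaussian profile along x whose true center is placed at 64.08, a 180 degree rotation reduces to mirroring x about the candidate center, and values between grid points come from linear interpolation. All names are hypothetical.

```python
import math


def reflect_about(profile, c):
    """Sample the profile at the mirrored positions 2c - x (linear interpolation).

    Positions outside the grid are treated as zero.
    """
    n = len(profile)
    mirrored = []
    for x in range(n):
        p = 2.0 * c - x
        i = math.floor(p)
        t = p - i
        left = profile[i] if 0 <= i < n else 0.0
        right = profile[i + 1] if 0 <= i + 1 < n else 0.0
        mirrored.append((1.0 - t) * left + t * right)
    return mirrored


def correlation(a, b):
    """Normalized dot product of two equally long value lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def best_center(profile, candidates):
    """Candidate center whose mirrored copy correlates best with the original."""
    return max(candidates, key=lambda c: correlation(profile, reflect_about(profile, c)))


# Synthetic test profile: symmetric about x = 64.08 on a 128-point grid.
TRUE_CENTER = 64.08
profile = [math.exp(-0.5 * ((x - TRUE_CENTER) / 10.0) ** 2) for x in range(128)]
candidates = [63.90 + 0.02 * k for k in range(21)]  # 63.90 .. 64.30
best = best_center(profile, candidates)
```

Running the same scan separately on slices at low z and high z would be one way to test for a tilted axis, since a tilt should show up as a best center that drifts with z.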

Unfortunately the problems with the symmetry of this map are not as simple as the above explanation that the center is not at a grid point.  The previous images used the contour level 0.05.  But if I now switch to contour level 0.08 then 64.08,64,64 does not appear to be the correct center of rotation, as shown in image emdb_2463_r180_64.08_lev0.08.jpg.  For level 0.08 the correct center seems close to 64.02,64,64.  But then the base (low z) and tube (high z) portions of the map appear to be shifted in opposite directions, as if the symmetry axis is not really exactly along z, but tilted slightly in the xz plane.

My conclusion is that something very weird was done in the symmetrization of this map.  It is perhaps not surprising since the biological structure is a 12-fold symmetric portal crammed into a 5-fold symmetric virus capsid vertex.  It is difficult to handle the symmetry mismatch in such reconstructions and that probably plays a role in why this map does not appear to have C12 symmetry.  Its deviations from C12 symmetry are more than just limited floating point precision (32-bit float values have about 7 digits of precision).
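
As a rough check on the precision remark, the Python sketch below rounds a number to 32-bit float and back, then compares the rounding error with the C4 shortfall 1 - 0.99995. It only looks at the rounding of a single value, not at how errors accumulate in a correlation sum, so treat it as an order-of-magnitude comparison.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


value = 0.999904810469             # the C12 correlation reported in this thread
stored = to_float32(value)
relative_error = abs(stored - value) / value

c4_shortfall = 1.0 - 0.99995       # how far the reported C4 correlation is from 1
unit_roundoff = 2.0 ** -24         # about 6e-8, i.e. roughly 7 decimal digits
```

The shortfall of about 5e-5 is several hundred times larger than the single-value rounding error of about 6e-8.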

Although we could cook up some “rules” to make Chimera choose C12 symmetry for this map, it would not make sense.  Your efforts to guess the map symmetry are bound to run into hard cases like this.  Obviously the right solution is to have the depositor tell you the answer, and then you should verify it when the map is deposited.  I don’t think days of work on this one map would lead to an understanding of what was done to “impose 12-fold symmetry”.

I think one improvement that should be made to Chimera “measure symmetry” is it should not just give you the “best guess”.  It should report the other high correlation guesses so you can see when the assignment is ambiguous.
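
A sketch of what such a report could look like, in Python. This is hypothetical, not an existing Chimera option. The correlation values are a subset of those in the earlier message quoted below; the C4 entry is the rounded 0.99994 mentioned in the text, since the table omits it.

```python
# Subset of the reported correlations (n-fold order -> correlation).
correlations = {
    2: 0.999913062795,
    3: 0.999864979563,
    4: 0.99994,          # rounded value from the text; missing in the table
    6: 0.999886369722,
    12: 0.999904810469,
    13: 0.999346674071,
    24: 0.991401690757,
}


def ranked_guesses(corr, top=5):
    """Return the `top` (n, correlation) pairs, highest correlation first."""
    return sorted(corr.items(), key=lambda item: item[1], reverse=True)[:top]


for n, cc in ranked_guesses(correlations):
    print("C%-3d %.6f" % (n, cc))
```

For this map the five highest entries are C4, C2, C12, C6 and C3, all divisors of 12, which makes the ambiguity visible at a glance.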

Tom



On Apr 6, 2014, at 6:26 AM, ingvar  wrote:

Hi Tom,

Thank you for the thorough answer.  It was really useful to see all the correlation coefficients.

EMDB has started to collect any imposed symmetry in the image processing, but most entries in the archive do not have that info.  One idea is to try to remediate the older entries with the symmetries reported by Chimera, and I picked this entry as a spot check.  For this volume the authors reported that they imposed C12 symmetry, so I wanted to see if that was what Chimera reported.  When I viewed the volume it looked to me as if there were some features that were C4 rather than C12, so it was nice to see that the cc for C4 was higher than for C12, but not massively so.

I agree that just taking the highest cc is likely to not be the best choice.  One could devise a significance score, maybe log(1-cc), and say that the lower symmetry is the relevant one if the difference is larger than log(2).  (There may be a better value than log(2), possibly based on the symmetry multiples between the two cases.)  With these rules C12 would be favoured over C24 and also just pip C4, but a larger sampling set would be needed to see if this actually works.
This type of score would only apply in a relatively narrow band: we are only interested if the correlation is high enough, > 0.99, and there will be an upper limit where the values of a handful of voxels would be all decisive.
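
A minimal Python sketch of this scoring idea, using the cc values from the table quoted below (C4 taken as the rounded 0.99994 from the text). The function names and the exact comparison are my reading of the rule, not an established method.

```python
import math


def significance(cc):
    """Score log(1 - cc); more negative means closer to a perfect correlation."""
    return math.log(1.0 - cc)


def prefer_lower_symmetry(cc_low, cc_high, margin=math.log(2.0)):
    """True if the lower-order symmetry scores better than the higher-order
    one by more than `margin`, so the lower order is taken as the relevant one."""
    return significance(cc_high) - significance(cc_low) > margin


CC4, CC12, CC24 = 0.99994, 0.999904810469, 0.991401690757

c12_over_c24 = prefer_lower_symmetry(CC12, CC24)  # C12 (lower) versus C24
c4_over_c12 = prefer_lower_symmetry(CC4, CC12)    # C4 (lower) versus C12
```

Here C12 beats C24 by a wide margin (the ratio of the 1-cc values is about 90), while C4 is ahead of C12 by a ratio of only about 1.6, below the factor of 2, so C12 is kept.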

Another observation is that the cc is quite large for both C11 and C13. I think that is probably the case for any large n: if the cc for Cn is high then the cc's for both Cn-1 and Cn+1 are likely to also be quite high.

All the best,
Ingvar

On 2014-04-04 19:13, Tom Goddard wrote:
Hi Ingvar,
 The basic trouble is that this map is so close to being
cylindrically symmetric, i.e. identical at any rotation angle about z,
that the “measure symmetry” command makes the wrong choice.  I put a
print statement in the measure symmetry code and here are the
correlation values it found for the map with a rotated copy of itself
for cyclic n-fold symmetry with n = 2, 3, …, 24:
measure sym #0 nmax 24
Symmetry emd_2463.map: C13, center 64 64 37
2 0.999913062795
3 0.999864979563
4
5 0.991690288767
6 0.999886369722
7 0.994690864511
8 0.991400840587
9 0.992535268321
10 0.995822490928
11 0.999002374601
12 0.999904810469
13 0.999346674071
14 0.998135913568
15 0.996806980709
16 0.99559228476
17 0.994463541105
18 0.993546515874
19 0.992755122912
20 0.992182861668
21 0.99181103327
22 0.991574006772
23 0.991460398423
24 0.991401690757
Now the default correlation threshold for “measure symmetry” to
recognize a symmetry is 0.99.  So the above table shows that with that
criterion this map could be cyclic n-fold symmetric for any n = 2, …,
24.
 So you might say let’s take the highest correlation value.  Looking
at the table that is for n = 4, 4-fold symmetry.  The measure symmetry
command doesn’t choose that wrong answer.  The map is C4 symmetric.
But a C12 map is automatically C6, C4, C3 and C2 symmetric since 6, 4,
3, and 2 divide evenly into 12.  So the measure symmetry command
eliminates the choices for n that are divisors of another choice of n
that has correlation > 0.99.  This is where C12 gets eliminated:
C24 has a correlation of 0.991, so C12 is dropped because
measure symmetry prefers to say the map is C24.  Once you
eliminate lower order symmetries the choices left are C13 through C24
and C13 has the highest correlation of those choices!
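
The selection logic described above can be sketched in a few lines of Python.
This is a reconstruction from the description in this message, not the actual
Chimera source. The correlation values are copied from the table; the C4
entry is missing there, so the rounded 0.99994 quoted later in the text is
used instead.

```python
CORRELATIONS = {
    2: 0.999913062795, 3: 0.999864979563, 4: 0.99994,  # C4: rounded, from text
    5: 0.991690288767, 6: 0.999886369722, 7: 0.994690864511,
    8: 0.991400840587, 9: 0.992535268321, 10: 0.995822490928,
    11: 0.999002374601, 12: 0.999904810469, 13: 0.999346674071,
    14: 0.998135913568, 15: 0.996806980709, 16: 0.99559228476,
    17: 0.994463541105, 18: 0.993546515874, 19: 0.992755122912,
    20: 0.992182861668, 21: 0.99181103327, 22: 0.991574006772,
    23: 0.991460398423, 24: 0.991401690757,
}


def pick_symmetry(corr, threshold=0.99, nmax=24):
    """Pick n for Cn using the threshold-then-eliminate-divisors rule."""
    # Keep every n up to nmax whose correlation meets the threshold.
    candidates = {n: c for n, c in corr.items() if n <= nmax and c >= threshold}
    # Drop any n that is a proper divisor of another surviving candidate.
    survivors = {
        n: c for n, c in candidates.items()
        if not any(m != n and m % n == 0 for m in candidates)
    }
    if not survivors:
        return None
    # Among what is left, take the highest correlation.
    return max(survivors, key=survivors.get)


default_choice = pick_symmetry(CORRELATIONS)
```

With the default threshold every n from 2 to 12 has a multiple in the
candidate list, so only 13 through 24 survive and 13 wins. The `threshold`
and `nmax` arguments can be varied to see how the choice changes.
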
 There are various ways you can get the right answer C12.  Using
“measure sym #0 nmax 23” to exclude 24-fold symmetry, which is causing
C12 to be knocked out of the competition, works.  Or setting a higher
correlation threshold works since again C24 then gets eliminated
because it doesn’t meet the correlation threshold.  Or changing the
contour levels changes all the correlation values since it only
considers grid points within the contour level when computing
correlation.
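
To illustrate why the contour level matters, here is a toy Python sketch of
a correlation restricted to points at or above a level. The masking rule and
the normalized dot product are assumptions for illustration; Chimera's exact
formula may differ.

```python
import math


def masked_correlation(a, b, level):
    """Correlate a with b using only positions where a is at or above `level`."""
    pairs = [(x, y) for x, y in zip(a, b) if x >= level]
    dot = sum(x * y for x, y in pairs)
    norm = math.sqrt(sum(x * x for x, _ in pairs) * sum(y * y for _, y in pairs))
    return dot / norm if norm > 0.0 else 0.0


# Tiny made-up example: the two "maps" differ only at a low-density point.
original = [0.0, 0.1, 0.5, 1.0]
rotated = [0.2, 0.1, 0.5, 1.0]

cc_all_points = masked_correlation(original, rotated, 0.0)
cc_above_level = masked_correlation(original, rotated, 0.05)
```

Raising the level excludes the low-density point where the copies disagree,
so the correlation moves from about 0.98 to 1.
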
 The fundamental difficulty is that every choice of n for n-fold
symmetry gives a correlation value greater than 0.99.  Maybe smarter
code would not use a fixed cutoff value.  Instead it would look at the
correlation values it finds, and choose a cutoff based on those.  But
there is no easy answer.  If you look at the table of numbers above C4
symmetry has correlation 0.99994 while C12 has only 0.99990.  Why
don’t I say the map has C4 symmetry and just happens to be very close
to cylindrical so C12 is also high?  I guess you could look at the
variance of the correlation values.  I see C8 is only 0.991, so
clearly the map isn’t cylindrically symmetric at the 0.9999 level.  So
some very nuanced code could probably get the right answer.
 I have no sound suggestion as to how to get the right answer
consistently.  These EM maps were computed with an imposed symmetry,
and the fundamental problem is that EMDB didn’t collect that
information from the person who deposited the map.  And now it is very
tricky to deduce in an automated way what symmetry the author
used.  The solution is to collect this information from the author when
they deposit the map — as this is very important information about the
map.
Tom
On Apr 4, 2014, at 2:53 AM, ingvar <ingvar@ebi.ac.uk> wrote:
Hello,
I fetched entry EMD-2463 from EMDB and issued:
measure symmetry #0 nMax 24
and got the result "Symmetry emd_2463.map: C13, center 64 64 37"
Increasing the number of sampling points does not change the result, but
if I increase the correlation criterion a little, to say 0.992, I get the expected C12 symmetry.
Also if I change the contour level to 0.05, the author recommended level, I get C12 again.
Looking at the volume at several different contour levels, it is difficult to see how it could be considered C13, and also how it would go from C13 to C12.
Kind Regards,
Ingvar Lagerstedt
--
Ingvar Lagerstedt
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
Tel: +44 (0)1223 492533
_______________________________________________
Chimera-users mailing list
Chimera-users@cgl.ucsf.edu
http://plato.cgl.ucsf.edu/mailman/listinfo/chimera-users
