How to find the cutoff used in Ensemble Cluster?
I am using MD Movie to cluster my molecular dynamics results. How can I find out the cutoff that Chimera uses to do this? I know that it depends on the data itself, but how can I find this cutoff for a specific data set? Thanks
As you mentioned, the cutoff is specific to each data set. As described in the NMRCLUST paper, the cutoff is based on a "penalty value" computed as a function of the number of clusters. The penalty value is the sum of the average "spread" (the average distance between two samples within a cluster) and the number of clusters. We simply choose the number of clusters with the lowest penalty value.

I did not bother reporting the actual penalty value because it does not really correspond to a physical quantity, and the chosen penalty value is not useful without the context of all the other penalty values. If I were doing it over, I would probably include a plot of penalty vs. number of clusters and allow users to change the cutoff, but we have moved on to ChimeraX development and, sadly, there is not enough time to do everything we want.

Conrad

On 6/1/2020 10:16 AM, Ibrahim Mohamed wrote:
> i am using MD movie to cluster my Molecular Dynamics results. how can i know the cut-off used by chimera to do this? i know that it depends on the data itself, but how can i know this cut-off for specific data? Thanks
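To make the selection rule described above concrete, here is a minimal Python sketch of a penalty-vs-number-of-clusters table. It is not Chimera's code. It assumes average-linkage merging, and it assumes the average spread is rescaled to the range 1 to N-1 before the number of clusters is added (my reading of the NMRCLUST paper; the reply above only says "sum"). The function names and the toy 1-D data are hypothetical.

```python
from itertools import combinations


def spread(cluster, dist):
    """Average pairwise distance within one cluster (needs >= 2 members)."""
    pairs = list(combinations(cluster, 2))
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


def average_linkage(a, b, dist):
    """Average distance between members of cluster a and cluster b."""
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))


def merge_stages(dist):
    """Merge the two closest clusters repeatedly; record each stage as
    (n_clusters, average spread over multi-member clusters, clusters)."""
    clusters = [[i] for i in range(len(dist))]
    stages = []
    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], dist),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        multi = [c for c in clusters if len(c) > 1]  # singletons have no spread
        avg_spread = sum(spread(c, dist) for c in multi) / len(multi)
        stages.append((len(clusters), avg_spread, [sorted(c) for c in clusters]))
    return stages


def penalties(dist):
    """Return (n_clusters, penalty, clusters) for every merge stage.

    Assumption: the average spread is rescaled to [1, N-1] so that it is
    comparable in size to the cluster count before the two are summed.
    """
    stages = merge_stages(dist)
    n = len(dist)
    spreads = [s for _, s, _ in stages]
    lo, hi = min(spreads), max(spreads)
    scale = (n - 2) / (hi - lo) if hi > lo else 0.0
    return [(k, scale * (s - lo) + 1 + k, c) for k, s, c in stages]


# Toy data: six 1-D "conformations" forming two obvious groups.
coords = [0, 1, 2, 50, 51, 52]
dist = [[abs(a - b) for b in coords] for a in coords]

table = penalties(dist)
best = min(table, key=lambda row: row[1])  # lowest penalty wins

for n_clusters, penalty, _ in table:
    print(n_clusters, round(penalty, 3))
print("chosen:", best[0], "clusters ->", best[2])
```

On this toy data the lowest penalty falls at two clusters. Printing the whole table rather than only the minimum reflects the point made in the reply: a single penalty value means little without the others to compare against.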
_______________________________________________
Chimera-users mailing list: Chimera-users@cgl.ucsf.edu
Manage subscription: https://plato.cgl.ucsf.edu/mailman/listinfo/chimera-users
Participants (2): Conrad Huang, Ibrahim Mohamed