
Hi list,
I clustered a bunch of structures using the MD clustering tool. I am very happy with the result, but I can't find how to download or save it.
Is there a way, for instance, to copy and paste the columns showing the number of models and the index of the representative for each cluster? Alternatively, is it possible to sort the frames by cluster?
If there are command shortcuts to manipulate the clustering, I'd be happy to get them, or a link to the relevant part of the manual. Chimera is really a great tool, but the learning curve appears really steep ;)
Finally, I wonder about the type of clustering used here. I suspect complete linkage, but I couldn't find confirmation in the manual. Any info on that?
Thanks a lot for your extremely good work, and for the fast answers on this mailing list.
--Ben

Hi Ben,
I can answer parts of the question. The clustering is a reimplementation of what is described in this paper: An automated approach for clustering an ensemble of NMR-derived protein structures into conformationally related subfamilies. Kelley LA, Gardner SP, Sutcliffe MJ. Protein Eng. 1996 Nov;9(11):1063-5. <http://www.ncbi.nlm.nih.gov/pubmed/8961360>
This reference is given in the Ensemble Cluster docs: <http://www.cgl.ucsf.edu/chimera/docs/ContributedSoftware/ensemblecluster/ensemblecluster.html> (The MD Movie clustering docs link to this page: from the MD Movie tool, click the Help button and go to the clustering section of the resulting page.)
Clustering is not available as a Chimera command.
Chimera capabilities can be accessed with Python scripts, but I will have to leave any details on that, as well as how to save results (other than saving your Chimera session), for others to provide.
I hope this helps,
Elaine
-----
Elaine C. Meng, Ph.D.
UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco
On Nov 2, 2012, at 7:46 AM, Benjamin SCHWARZ wrote: [original message, quoted in full above]

Thanks Elaine,
According to the paper, the clustering is performed with average linkage, with a somewhat tricky method for determining the number of clusters.
A few suggestions for enhancing the clustering functionality in Chimera:
- Allow the user to save the cluster information, for instance by saving the clustered frames in separate files, or by saving each frame index together with its cluster index in a comma-separated file.
- Since the clustering scheme is a linkage, it would be good to show the dendrogram and let users explore it to determine their own cutoff. The most expensive part is computing the pairwise distance matrix; once that is done, it should be possible to let the user interactively choose a cutoff distance or a desired number of clusters and see what happens.
--Ben
--- Benjamin SCHWARZ Email : schwarz@igbmc.fr Voice : +33 (0)3 68 85 47 30 FAX : +33 (0)3 68 85 47 18 Biocomputing group -- Integrated Structural Biology -- IGBMC 1 rue Laurent Fries, BP 10142 F - 67404 Illkirch CEDEX FRANCE
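The second suggestion above (compute the pairwise distances once, then try different cutoffs or cluster counts) can be sketched in plain Python. This is only an illustration under stated assumptions: it is not Chimera's implementation, it does not attempt the automatic cutoff selection described in the Kelley et al. paper, and the names `average_linkage_merges` and `cut_tree` are made up for this example. The toy distances come from five points on a line rather than from real RMSD values.

```python
def average_linkage_merges(dist):
    """Run average-linkage agglomeration once and record every merge.

    dist is a full symmetric matrix (list of lists). Returns a list of
    (distance, kept_label, absorbed_label) tuples in merge order.
    This naive version is slow and meant for small inputs only.
    """
    clusters = {i: [i] for i in range(len(dist))}
    merges = []
    while len(clusters) > 1:
        best = None
        labels = sorted(clusters)
        for pos, a in enumerate(labels):
            for b in labels[pos + 1:]:
                # Average linkage: mean of all pairwise member distances.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((d, a, b))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


def cut_tree(n, merges, cutoff=None, n_clusters=None):
    """Replay recorded merges until a distance cutoff or cluster count is hit."""
    clusters = {i: [i] for i in range(n)}
    for d, a, b in merges:
        if cutoff is not None and d > cutoff:
            break
        if n_clusters is not None and len(clusters) <= n_clusters:
            break
        clusters[a] = clusters[a] + clusters.pop(b)
    return sorted(sorted(members) for members in clusters.values())


# Toy data: five "frames" placed on a line (hypothetical, not real RMSDs).
points = [0, 2, 3, 10, 14]
dist = [[abs(p - q) for q in points] for p in points]

merges = average_linkage_merges(dist)      # the expensive step, done once
print(cut_tree(5, merges, cutoff=3))       # [[0, 1, 2], [3], [4]]
print(cut_tree(5, merges, n_clusters=2))   # [[0, 1, 2], [3, 4]]
```

Because the merge list is stored, trying another cutoff or cluster count only replays the merges and never touches the distance matrix again, which is what would make an interactive dendrogram cheap.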

On Nov 4, 2012, at 7:26 AM, Benjamin SCHWARZ wrote:
> A few suggestions for enhancing the clustering functionality in Chimera:
> - Allow the user to save the cluster information, for instance by saving the clustered frames in separate files, or by saving each frame index together with its cluster index in a comma-separated file.
Hi Ben,
Tonight's daily build will have a "Save" button on the clustering dialog. The saved file will have one cluster per line, with the representative frame number listed first, followed by the other frame numbers of that cluster.
> - Since the clustering scheme is a linkage, it would be good to show the dendrogram and let users explore it to determine their own cutoff. The most expensive part is computing the pairwise distance matrix; once that is done, it should be possible to let the user interactively choose a cutoff distance or a desired number of clusters and see what happens.
That would be nice. I'll open an enhancement-request ticket in our bug database with you on the recipient list, so you'll be notified if/when we get to it.
--Eric
Eric Pettersen
UCSF Computer Graphics Lab
http://www.cgl.ucsf.edu
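A file in the layout Eric describes (one cluster per line, representative frame first) could be turned into the frame/cluster table Ben asked for with a few lines of standalone Python 3, run outside Chimera. This is a sketch only: the message does not say how the frame numbers are separated, so the code assumes whitespace (and also tolerates commas), and the function names and the sample text are hypothetical.

```python
import csv
import io


def parse_cluster_file(text):
    """Parse one-cluster-per-line text into (representative, frames) tuples.

    Assumes the first number on each line is the representative frame and
    that numbers are separated by whitespace or commas (an assumption).
    """
    clusters = []
    for line in text.splitlines():
        fields = line.replace(",", " ").split()
        if not fields:
            continue  # skip blank lines
        frames = [int(field) for field in fields]
        clusters.append((frames[0], frames))
    return clusters


def to_frame_cluster_csv(clusters):
    """Return CSV text with one row per frame: frame, cluster, representative."""
    rows = []
    for cluster_index, (representative, frames) in enumerate(clusters, start=1):
        for frame in frames:
            rows.append((frame, cluster_index, representative))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["frame", "cluster", "representative"])
    for row in sorted(rows):  # sorted by frame number
        writer.writerow(row)
    return out.getvalue()


# Made-up sample in the described layout, not real Chimera output.
example = "12 3 7 15\n40 38 41\n"
clusters = parse_cluster_file(example)
print(to_frame_cluster_csv(clusters))
```

To read a real saved file, replace `example` with the contents of that file, for instance via `open(path).read()`.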
_______________________________________________ Chimera-users mailing list Chimera-users@cgl.ucsf.edu http://plato.cgl.ucsf.edu/mailman/listinfo/chimera-users

On 11/06/2012 09:50 AM, Eric Pettersen wrote:
> On Nov 4, 2012, at 7:26 AM, Benjamin SCHWARZ wrote:
>> - Since the clustering scheme is a linkage, it would be good to show the dendrogram and let users explore it to determine their own cutoff. The most expensive part is computing the pairwise distance matrix; once that is done, it should be possible to let the user interactively choose a cutoff distance or a desired number of clusters and see what happens.
Hi Ben, ;)
There are even ways to accelerate the initialisation of this distance matrix, if the distance being used is a metric: http://bioinformatics.oxfordjournals.org/content/27/7/939
However, the applicability of this kind of method depends on the clustering algorithm being used. Some algorithms allow the matrix to be initialised "lazily"; some do not.
My intuition is that in hierarchical clustering, complete and single linkage can be accelerated by a quite similar technique, whereas average linkage cannot.
I have played quite a lot with software for this; usually, beyond about 20k PDBs, a desktop workstation does not have enough memory to hold the distance matrix.
Regards,
Francois.
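Two of the points above can be made concrete with a little arithmetic. The sketch below estimates how much memory a pairwise distance matrix needs, assuming 8-byte floating-point values (an assumption; the thread does not say how distances are stored), and shows the bounds that the triangle inequality gives for any metric. The bounds are a general property of metrics and are not claimed to be the specific technique of the linked paper; both function names are made up for this example.

```python
def distance_matrix_bytes(n, bytes_per_value=8, condensed=True):
    """Memory needed for the pairwise distances of n structures.

    condensed=True stores each unordered pair once (n*(n-1)/2 values);
    condensed=False stores the full n-by-n matrix.
    """
    entries = n * (n - 1) // 2 if condensed else n * n
    return entries * bytes_per_value


def triangle_bounds(d_ap, d_bp):
    """Bounds on d(a, b) from the distances of a and b to a pivot p.

    For any metric: |d(a,p) - d(b,p)| <= d(a,b) <= d(a,p) + d(b,p).
    """
    return abs(d_ap - d_bp), d_ap + d_bp


for n in (1000, 20000, 100000):
    gib = distance_matrix_bytes(n) / 2**30
    print("%6d structures: %.2f GiB (condensed)" % (n, gib))

# 20000 structures, condensed: 1,599,920,000 bytes, about 1.5 GiB.
# The full square matrix takes 3,200,000,000 bytes, about 3 GiB.

low, high = triangle_bounds(3.0, 5.0)
print(low, high)  # 2.0 8.0
```

If the lower bound for a pair already exceeds the cutoff of interest, the exact distance for that pair never needs to be computed, which is one way an entry can be filled in lazily.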
Participants (4):
- Benjamin SCHWARZ
- Elaine Meng
- Eric Pettersen
- Francois Berenger