findclash "batch" calculations
On Feb 4, 2011, at 2:51 AM, Julien wrote:
Hi Elaine, I would like to find clashes between a ligand and a protein. I have around 100,000 poses of this ligand. The information needed to generate the poses is currently saved in a PDB file: three pseudo-atoms are used to place the ligand in the correct position with the match command.
The workflow would be the following:
    open ligand
    open protein
    open pseudo.pdb
    match ligand position1
    findclash
    match ligand position2
    ...
First, this is very slow, even in batch mode. Maybe you know a better way to do it. Second, Chimera does not open more than 9999 residues, and the pseudo PDB has nearly 100,000 residues. Is there a way to get around that?
I guess the findclash routine calculates all distances by brute force and compares them to a threshold.
Many greetings,
Julien
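As an illustration of the workflow above, the following is a minimal sketch in standalone Python (run outside Chimera) that writes the repeated match/findclash pairs into a list of command lines instead of typing them by hand. The function name build_command_lines, the default file names, and the LIGAND_ATOMS placeholder are all hypothetical; the match line is passed in as a template because the exact atom specifications depend on your files. The restricted findclash form and the #2.N submodel numbering are taken from the replies further down this thread, and they assume the ligand opens as #0, the protein as #1, and the pseudo-atom file as #2.

```python
def build_command_lines(n_poses, match_template,
                        ligand_file="ligand.pdb",
                        protein_file="protein.pdb",
                        pseudo_file="pseudo.pdb"):
    """Return the command lines for n_poses match/findclash cycles.

    match_template is a user-supplied string containing "{pose}",
    which is replaced by the 1-based pose number. The file names are
    hypothetical defaults, not names used in the original thread.
    """
    # Files are opened in this order so that, by assumption, the ligand
    # becomes model #0, the protein #1, and the pseudo-atoms #2.
    lines = [
        "open " + ligand_file,
        "open " + protein_file,
        "open " + pseudo_file,
    ]
    for pose in range(1, n_poses + 1):
        # Move the ligand onto the pseudo-atoms of this pose.
        lines.append(match_template.format(pose=pose))
        # Check only ligand-versus-protein clashes (form quoted in the reply).
        lines.append("findclash #0 test #1 log true")
    return lines


if __name__ == "__main__":
    # LIGAND_ATOMS is a placeholder, not a real atom specification.
    template = "match LIGAND_ATOMS #2.{pose}"
    print("\n".join(build_command_lines(3, template)))
```

The printed lines could be saved to a file and adapted before use; the match template in particular must be replaced with the command that already works for a single pose.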
Hi Julien,
Findclash does calculate distances between atom centers, but the cutoff parameter is compared to "overlap" between the VDW spheres -- in other words, smaller atoms must be closer together than larger atoms to be considered overlapping or clashing. The output information (optional) includes both the center-center distances and the overlaps, however.
You can limit the calculation to only certain atoms. There are also options to control cutoff and other settings. Details are here:
<http://www.cgl.ucsf.edu/chimera/docs/UsersGuide/midas/findclash.html>
If you are really just using "findclash" without arguments, that would look at the whole structure and take more time than necessary. For example, if your ligand is model #0 and the protein model #1, one way to limit the calculation to only clashes between them and send info to the Reply Log is:
    findclash #0 test #1 log true
Otherwise, I'm not an expert on any large-scale calculations or how you might optimize the procedure for large sets of atoms, so I'm CC-ing this question to the chimera-users list. (It is generally better to send Chimera questions to this list rather than to me personally.) The other users or developers may have good ideas.
Best,
Elaine
-----
Elaine C. Meng, Ph.D.
UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco
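To make the overlap idea concrete, here is a minimal sketch in standalone Python of a VDW-overlap test between two atoms. It is not Chimera's implementation: the function names, the radii, and the 0.6 cutoff are illustrative assumptions, and the sketch ignores any additional allowances that findclash may apply, so the documentation linked above remains the reference for actual defaults.

```python
import math


def vdw_overlap(center_a, radius_a, center_b, radius_b):
    """Overlap of two VDW spheres: sum of radii minus center distance.

    Positive values mean the spheres interpenetrate; negative values
    mean there is a gap between them.
    """
    distance = math.dist(center_a, center_b)
    return radius_a + radius_b - distance


def is_clash(center_a, radius_a, center_b, radius_b, cutoff=0.6):
    """True if the overlap reaches the cutoff (0.6 is an assumed value)."""
    return vdw_overlap(center_a, radius_a, center_b, radius_b) >= cutoff


if __name__ == "__main__":
    origin = (0.0, 0.0, 0.0)
    other = (3.0, 0.0, 0.0)
    # Same center-center distance, different (illustrative) radii:
    print(vdw_overlap(origin, 1.7, other, 1.7))  # larger atoms
    print(vdw_overlap(origin, 1.2, other, 1.2))  # smaller atoms
```

At the same center-to-center distance the pair with smaller radii has a smaller overlap, which is why smaller atoms must be closer together before they count as clashing.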
On Feb 4, 2011, at 9:54 AM, Elaine Meng wrote:
Second, Chimera does not open more than 9999 residues, and the pseudo PDB has nearly 100,000 residues. Is there a way to get around that?
This is a limitation of PDB format -- 4 characters allocated to the residue-number columns. I think your options are:
1) Use another format that has no limit for the pseudo structure, such as Mol2.
2) Use MODEL records. MODEL records also have a 4-digit "limit" but Chimera ignores the number given in the MODEL record and just numbers them sequentially from 1. Therefore you can have 100,000 "MODEL 1" records in your file (with corresponding ENDMDL records) and Chimera will open the file as 100,000 separate models (e.g. #2.1 through #2.100000).
--Eric
Eric Pettersen
UCSF Computer Graphics Lab
http://www.cgl.ucsf.edu
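Option 2 could be sketched as follows in standalone Python (run outside Chimera): each pose's three pseudo-atoms are written between a "MODEL 1" record and an ENDMDL record, so residue and atom numbers restart in every model and never exceed the fixed-width columns. The function names, the atom names C1 to C3, the residue name PSE, the chain ID, and the coordinates are all hypothetical choices for illustration; only the MODEL/ENDMDL idea comes from the reply above.

```python
def hetatm_line(serial, name, res_name, chain, res_seq, x, y, z, element):
    """Format one fixed-column PDB HETATM record (78 characters)."""
    return ("HETATM%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f          %2s"
            % (serial, name, res_name, chain, res_seq,
               x, y, z, 1.00, 0.00, element))


def poses_to_pdb(poses):
    """Return PDB text with one MODEL/ENDMDL block per pose.

    poses is a list of poses; each pose is a list of three (x, y, z)
    tuples giving the pseudo-atom positions.
    """
    lines = []
    for pose in poses:
        # The model number is always written as 1; per the reply above,
        # Chimera ignores it and numbers the models sequentially.
        lines.append("MODEL     %4d" % 1)
        for i, (x, y, z) in enumerate(pose, start=1):
            # Serial and residue numbers restart in every model.
            lines.append(hetatm_line(i, " C%d" % i, "PSE", "A", 1,
                                     x, y, z, "C"))
        lines.append("ENDMDL")
    lines.append("END")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two made-up poses, three pseudo-atoms each.
    demo = [
        [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)],
        [(5.0, 5.0, 5.0), (6.5, 5.0, 5.0), (5.0, 6.5, 5.0)],
    ]
    print(poses_to_pdb(demo), end="")
```

Writing the returned text to pseudo.pdb would give one submodel per pose when the file is opened, which is what makes the #2.N numbering available to later commands.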